I was experimenting with wasmtime-py today and found the current documentation didn't quite give me the information that I needed. …
I'm still working towards adding multi-modal support to my LLM tool. In the meantime, here are notes on running prompts against images and PDFs and audio and video files from the command-line using the Google Gemini family of models. …
I decided to have a poke around and see if I could figure out how the HTTP streaming APIs from the various hosted LLM providers actually worked. Here are my notes so far. …
Here's a trick I've used a couple of times in the past few days. …
I've been trying to get my head around ColBERT. …
llama.cpp recently added the ability to control the output of any model using a grammar. …
I've been experimenting with the combination of Claude and my LLM CLI tool to give me quick summaries of long discussions on Hacker News. …
Xeophon suggested E5-large-v2 as an embedding model that was worth a closer look. …
I decided to upgrade the related articles feature on my TILs site. Previously I calculated these using full-text search, but I wanted to try out a new trick using OpenAI embeddings for document similarity instead. …
OpenAI announced new models today. Of particular interest to me is the new gpt-3.5-turbo-16k model, which provides GPT 3.5 with a 16,000 token context window (up from 4,000) priced at 1/10th of GPT-4 - $0.003 per 1K input tokens and $0.004 per 1K output tokens. …
MLC (Machine Learning Compilation) on May 22nd 2023: Bringing Open Large Language Models to Consumer Devices …
The ChatGPT Code Interpreter alpha remains incredibly interesting. I wrote about how I was using it for Python and SQLite benchmarking a few weeks ago. Today I found a neat pattern for expanding its capabilities with custom binaries. …
Dolly 2.0 looks to be a big deal. It calls itself "the first open source, instruction-following LLM, fine-tuned on a human-generated instruction dataset licensed for research and commercial use." …
A popular nightmare scenario for AI is giving it access to tools, so it can make API calls and execute its own code and generally break free of the constraints of its initial environment. …
See also: Large language models are having their Stable Diffusion moment right now. …
This is a follow-up to Running nanoGPT on a MacBook M2 to generate terrible Shakespeare. …
nanoGPT is Andrej Karpathy's "simplest, fastest repository for training/finetuning medium-sized GPTs". …