Simon Willison: TILs on llms

llms Exploring OpenAI's deep research API model o4-mini-deep-research - 2025-10-18

I was reviewing some older PRs and landed this one by Manuel Solorzano adding pricing for o4-mini-deep-research and o3-deep-research to my llm-prices.com site. I realized I hadn't tried those models yet so I decided to give one of them a go. …

llms Running a gpt-oss eval suite against LM Studio on a Mac - 2025-08-16

OpenAI's gpt-oss models come with an eval suite, which is described in their Verifying gpt-oss implementations cookbook. I figured out how to run it on my Mac against their gpt-oss-20b model hosted locally using LM Studio, using uv. …

llms Named Entity Resolution with dslim/distilbert-NER - 2024-12-23

I was exploring the original BERT model from 2018, which is mainly useful if you fine-tune a model on top of it for a specific task. …

llms Generating documentation from tests using files-to-prompt and LLM - 2024-11-05

I was experimenting with wasmtime-py today and found the current documentation didn't quite give me the information that I needed. …

llms Running prompts against images, PDFs, audio and video with Google Gemini - 2024-10-23

I'm still working towards adding multi-modal support to my LLM tool. In the meantime, here are notes on running prompts against images and PDFs and audio and video files from the command-line using the Google Gemini family of models. …

llms How streaming LLM APIs work - 2024-09-21

I decided to have a poke around and see if I could figure out how the HTTP streaming APIs from the various hosted LLM providers actually worked. Here are my notes so far. …

llms Piping from rg to llm to answer questions about code - 2024-02-11

Here's a trick I've used a couple of times in the past few days. …

llms Exploring ColBERT with RAGatouille - 2024-01-27

I've been trying to get my head around ColBERT. …

llms Using llama-cpp-python grammars to generate JSON - 2023-09-12

llama.cpp recently added the ability to control the output of any model using a grammar. …

llms Summarizing Hacker News discussion themes with Claude and LLM - 2023-09-09

I've been experimenting with the combination of Claude and my LLM CLI tool to give me quick summaries of long discussions on Hacker News. …

llms Embedding paragraphs from my blog with E5-large-v2 - 2023-09-08

Xeophon suggested E5-large-v2 as an embedding model that was worth a closer look. …

llms Storing and serving related documents with openai-to-sqlite and embeddings - 2023-08-14

I decided to upgrade the related articles feature on my TILs site. Previously I calculated these using full-text search, but I wanted to try out a new trick using OpenAI embeddings for document similarity instead. …

llms Running OpenAI's large context models using llm - 2023-06-13

OpenAI announced new models today. Of particular interest to me is the new gpt-3.5-turbo-16k model, which provides GPT 3.5 with a 16,000 token context window (up from 4,000) priced at 1/10th of GPT-4 - $0.003 per 1K input tokens and $0.004 per 1K output tokens. …

llms mlc-chat - RedPajama-INCITE-Chat-3B on macOS - 2023-05-22

MLC (Machine Learning Compilation) on May 22nd 2023: Bringing Open Large Language Models to Consumer Devices …

llms Expanding ChatGPT Code Interpreter with Python packages, Deno and Lua - 2023-04-30

The ChatGPT Code Interpreter alpha remains incredibly interesting. I wrote about how I was using it for Python and SQLite benchmarking a few weeks ago. Today I found a neat pattern for expanding its capabilities with custom binaries. …

llms Running Dolly 2.0 on Paperspace - 2023-04-12

Dolly 2.0 looks to be a big deal. It calls itself "the first open source, instruction-following LLM, fine-tuned on a human-generated instruction dataset licensed for research and commercial use." …

llms A simple Python implementation of the ReAct pattern for LLMs - 2023-03-17

A popular nightmare scenario for AI is giving it access to tools, so it can make API calls and execute its own code and generally break free of the constraints of its initial environment. …

llms Running LLaMA 7B and 13B on a 64GB M2 MacBook Pro with llama.cpp - 2023-03-10

llms Training nanoGPT entirely on content from my blog - 2023-02-09

This is a follow-up to Running nanoGPT on a MacBook M2 to generate terrible Shakespeare. …

llms Running nanoGPT on a MacBook M2 to generate terrible Shakespeare - 2023-02-01

nanoGPT is Andrej Karpathy's "simplest, fastest repository for training/finetuning medium-sized GPTs". …