Running OpenAI's large context models using llm

OpenAI announced new models today. Of particular interest to me is the new gpt-3.5-turbo-16k model, which gives GPT-3.5 a 16,000 token context window (up from 4,000) and is priced at 1/10th of GPT-4: $0.003 per 1K input tokens and $0.004 per 1K output tokens.
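To get a feel for those prices, here is a rough sketch of the arithmetic as a shell function. The name estimate_cost and its two arguments are made up for illustration and are not part of llm; the rates are the per-1K prices quoted above.

```shell
# Hypothetical helper (not part of llm): estimate the cost in USD
# of a single gpt-3.5-turbo-16k call.
# Usage: estimate_cost INPUT_TOKENS OUTPUT_TOKENS
estimate_cost() {
  awk -v in_tokens="$1" -v out_tokens="$2" 'BEGIN {
    # $0.003 per 1K input tokens, $0.004 per 1K output tokens
    printf "%.4f\n", (in_tokens / 1000) * 0.003 + (out_tokens / 1000) * 0.004
  }'
}

# Example: a 15,000 token prompt that produces a 500 token reply
estimate_cost 15000 500
```

So filling most of the 16K window costs less than five cents per call at these rates.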

I couldn't see that model listed in the GPT Playground interface, but it turns out I did have access to it via the API. This worked for me using my llm tool (see llm, ttok and strip-tags—CLI tools for working with ChatGPT and other LLMs).

curl -s 'https://simonwillison.net/' | strip-tags -m | llm --system 'Summarize' -s

This returns an error:

This model's maximum context length is 4097 tokens. However, your messages resulted in 15038 tokens. Please reduce the length of the messages.

But... specifying -m gpt-3.5-turbo-16k fixes that:

curl -s 'https://simonwillison.net/' | strip-tags -m | llm --system 'Summarize' -s -m gpt-3.5-turbo-16k

Simon Willison’s Weblog is a website where Simon Willison, a software engineer and entrepreneur, shares his thoughts and experiences on various topics related to web development, data management, and technology. In his recent entries, he discusses topics such as understanding GPT tokenizers, the challenges of working with closed models' training data, and the use of command-line tools for working with ChatGPT and other language models. Simon also shares updates on his own projects, including building tools for working with tokens, stripping HTML tags, and sending prompts to the OpenAI API. Overall, Simon's weblog provides valuable insights and resources for anyone interested in web development and working with language models.
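One way to avoid the context length error is to count tokens first and pick a model to match. This is only a sketch, not something llm does for you: pick_model is a hypothetical helper, it assumes the default model is gpt-3.5-turbo, and its thresholds come straight from the 4097 in the error message and the 16,000 figure in the announcement, with no headroom reserved for the reply.

```shell
# Hypothetical helper (not part of llm): choose a model name
# based on how many tokens the prompt contains.
# Usage: pick_model TOKEN_COUNT
pick_model() {
  if [ "$1" -le 4097 ]; then
    echo "gpt-3.5-turbo"        # fits the limit quoted in the error message
  elif [ "$1" -le 16000 ]; then
    echo "gpt-3.5-turbo-16k"    # assumed threshold: the announced 16,000 tokens
  else
    echo "gpt-4-32k-0613"       # fall back to the largest model mentioned here
  fi
}

# The prompt that failed above came to 15038 tokens:
pick_model 15038

# With real input, the count could come from ttok, which prints the
# number of tokens in the text piped to it:
#   tokens=$(curl -s 'https://simonwillison.net/' | strip-tags -m | ttok)
#   pick_model "$tokens"
```

The count from ttok covers only the piped text, so it will come out slightly lower than the figure in the API error, which also includes the system prompt and message overhead.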

And that works for gpt-4-32k-0613 too

I thought I didn't have access to GPT-4 with the 32,000 token context (a much more expensive model) because it never showed up for me in the Playground. But it turns out I do have access to that, and this works:

curl -s 'https://railsatscale.com/2023-06-12-rewriting-the-ruby-parser/' | strip-tags -m | \
  llm --system 'summarize as bullets' -m 'gpt-4-32k-0613' -s

Output (which cost me about 6 cents):

Created 2023-06-13T22:25:48+01:00