Simon Willison: TILs on python

python Programmatically comparing Python version strings - 2024-03-17

I found myself wanting to compare the version numbers 0.63.1, 1.0 and 1.0a13 in Python code, in order to mark a pytest test as skipped if the installed version of Datasette was pre-1.0. …

python Getting Python MD5 to work with FIPS systems - 2024-02-13

This issue by Parand Darugar pointed out that Datasette doesn't currently run on Linux systems with FIPS enabled, due to the way it uses MD5 hashes. …

python Using pprint() to print dictionaries while preserving their key order - 2024-01-14

While parsing a CSV file using csv.DictReader today I noticed the following surprising result: …

python A simple pattern for inlining binary content in a Python script - 2023-08-19

For simonw/til issue #82 I needed to embed some binary content directly in a Python script. …

python Checking if something is callable or async callable in Python - 2023-08-04

I wanted a mechanism to check if a given Python object was "callable" - could be called like a function - or "async callable" - could be called using await obj(). …
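
As a rough illustration of the distinction described here (not the code from the linked post), the sketch below classifies an object using only the standard library. The helper name `describe_callable` and the example objects are hypothetical.

```python
import inspect


def describe_callable(obj):
    """Classify obj as 'not callable', 'callable' or 'async callable'."""
    if not callable(obj):
        return "not callable"
    # Covers "async def" functions and methods directly.
    if inspect.iscoroutinefunction(obj):
        return "async callable"
    # Covers instances of classes that define "async def __call__".
    if inspect.iscoroutinefunction(getattr(obj, "__call__", None)):
        return "async callable"
    return "callable"


async def fetch():
    return 1


class AsyncThing:
    async def __call__(self):
        return 1


results = [
    describe_callable(42),
    describe_callable(len),
    describe_callable(fetch),
    describe_callable(AsyncThing()),
]
```

This sketch does not try to handle every case: a plain function that happens to return an awaitable is reported as "callable".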

python Protocols in Python - 2023-07-26

Datasette currently has a few API internals that return sqlite3.Row objects. I was thinking about how this might work in the future - if Datasette ever expands beyond SQLite (plugin-provided backends for PostgreSQL and DuckDB for example) I'd want a way to return data from other stores using objects that behave like sqlite3.Row but are not exactly that class. …

python Using tree-sitter with Python - 2023-07-13

tree-sitter is a "parser generator tool and an incremental parsing library". It has a very good reputation these days. …

python Quickly testing code in a different Python version using pyenv - 2023-07-10

I had a bug that was only showing up in CI against Python 3.8. …

python Python packages with pyproject.toml and nothing else - 2023-07-07

I've been using setuptools and setup.py for my Python packages for a long time: I like that it works without me having to think about installing and learning any additional tools such as Flit or pip-tools or Poetry or Hatch. …

python CLI tools hidden in the Python standard library - 2023-06-28

Seth Michael Larson pointed out that the Python gzip module can be used as a CLI tool like this: …

python TOML in Python - 2023-06-26

I finally got around to fully learning TOML. Some notes, including how to read and write it from Python. …

python The location of the pip cache directory - 2023-04-28

pip uses a cache to avoid downloading packages again: …

python A few notes on Rye - 2023-04-26

Rye is Armin Ronacher's new experimental Python packaging tool. I decided to take it for a test-run. …

python Calculating embeddings with gtr-t5-large in Python - 2023-01-31

I've long wanted to run some kind of large language model on my own computer. Now that I have an M2 MacBook Pro I'm even more keen to find interesting ways to keep all of those CPU cores busy. …

python Installing lxml for Python on an M1/M2 Mac - 2023-01-27

I ran into this error while trying to run pip install lxml on an M2 Mac, inside a virtual environment I had initially created using pipenv shell: …

python Upgrading a pipx application to an alpha version - 2023-01-11

I wanted to upgrade my git-history installation to a new alpha version. …

python The pdb interact command - 2022-10-31

Today Carlton told me about the interact command in the Python debugger. …

python os.remove() on Windows fails if the file is already open - 2022-10-25

I puzzled over this one for quite a while this morning. I had this test that was failing with Windows on Python 3.11: …

python Simple load testing with Locust - 2022-10-22

I've been using Locust recently to run some load tests - most significantly these tests against SQLite running with Django and this test exercising Datasette and Gunicorn. …

python Using psutil to investigate "Too many open files" - 2022-10-13

I was getting this intermittent error running my Datasette test suite: …

python Running PyPy on macOS using Homebrew - 2022-09-14

Towards Inserting One Billion Rows in SQLite Under A Minute includes this snippet: …

python Defining setup.py dependencies using a URL - 2022-08-13

For sqlite-utils issue 464 I implemented a fix to a tiny bug in a dependency in my own fork on GitHub. …

python struct endianness in Python - 2022-07-28

TIL the Python standard library struct module defaults to interpreting binary strings using the endianness of your machine. …
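
A small sketch of that behaviour, assuming a platform where an unsigned int is 4 bytes: with no prefix character the format string uses the machine's native byte order, while `<` and `>` force little-endian and big-endian respectively.

```python
import struct
import sys

value = 1

native = struct.pack("I", value)   # no prefix: native byte order
little = struct.pack("<I", value)  # explicitly little-endian
big = struct.pack(">I", value)     # explicitly big-endian

# The native result matches whichever explicit form agrees with this machine.
expected_native = little if sys.byteorder == "little" else big
matches_machine = native == expected_native
```

Using an explicit prefix keeps binary data portable between machines with different byte orders.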

python Freezing requirements with pip-tools - 2022-07-14

I tried pip-tools for the first time today to pin the requirements for the natbat/pillarpointstewards Django app. …

python Efficiently copying a file - 2022-05-13

TLDR: Use shutil.copyfileobj(fsrc, fdst)
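
A minimal sketch of that call, using in-memory streams so it runs anywhere. The helper name `copy_stream` and the chunk size are assumptions made for illustration; any pair of readable and writable file objects opened in binary mode works the same way.

```python
import io
import shutil


def copy_stream(fsrc, fdst, chunk_size=64 * 1024):
    """Copy fsrc to fdst in chunks rather than reading it all into memory."""
    shutil.copyfileobj(fsrc, fdst, chunk_size)


payload = b"example data " * 1000
source = io.BytesIO(payload)
destination = io.BytesIO()

copy_stream(source, destination)
copied = destination.getvalue()
```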

python Generating a calendar week grid with the Python Calendar module - 2022-03-31

I needed to generate a grid calendar that looks like this (design by Natalie Downe): …

python Streaming indented output of a JSON array - 2022-01-17

I wanted to produce the following output from a command-line tool: …

python Annotated explanation of David Beazley's dataklasses - 2021-12-19

David Beazley on Twitter: …

python Safely outputting JSON - 2021-12-17

Carelessly including the output of json.dumps() in an HTML page can lead to an XSS hole, thanks to the following: …

python Using C_INCLUDE_PATH to install Python packages - 2021-12-09

I tried to install my datasette-bplist plugin today in a fresh Python 3.10 virtual environment on macOS and got this error: …

python __init_subclass__ - 2021-12-03

David Beazley on Twitter said: …

python Ignoring a line in both flake8 and mypy - 2021-11-30

I needed to tell both flake8 and mypy to ignore the same line of code. …

python Using cog to update --help in a Markdown README file - 2021-11-18

My csvs-to-sqlite README includes a section that shows the output of the csvs-to-sqlite --help command (relevant issue). …

python Planning parallel downloads with TopologicalSorter - 2021-11-16

For complicated reasons I found myself wanting to write Python code to resolve a graph of dependencies and produce a plan for efficiently executing them, in parallel where possible. …
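
The sketch below shows one way to get such a plan from `graphlib.TopologicalSorter` (Python 3.9+). The dependency graph is invented for illustration and is not taken from the linked post; each batch holds items whose dependencies are already complete, so they could run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical graph: each key depends on the items in its set.
graph = {
    "c": {"a", "b"},
    "d": {"c"},
}

sorter = TopologicalSorter(graph)
sorter.prepare()

plan = []
while sorter.is_active():
    # Everything returned by get_ready() can be processed concurrently.
    batch = sorted(sorter.get_ready())
    plan.append(batch)
    # Marking items done unlocks whatever depends on them.
    sorter.done(*batch)
```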

python Using the sqlite3 Python module in Pyodide - Python WebAssembly - 2021-10-18

Pyodide provides "Python with the scientific stack, compiled to WebAssembly" - it's an incredible project which lets you run a full working Jupyter notebook, complete with complex packages such as numpy and pandas, entirely in your browser without any server-side Python component running at all. …

python Using Fabric with an SSH public key - 2021-10-06

Inspired by this tweet by Mike Driscoll I decided to try using Fabric to run commands over SSH from a Python script, using a public key for authentication. …

python Find local variables in the traceback for an exception - 2021-08-09

For sqlite-utils issue #309 I had an error that looked like this: …

python Check spelling using codespell - 2021-08-03

Today I discovered codespell via this Rich commit. codespell is a really simple spell checker that can be run locally or incorporated into a CI flow. …

python Tracing every executed Python statement - 2021-03-21

Today I learned how to use the Python trace module to output every single executed line of Python code in a program - useful for figuring out exactly when a crash or infinite loop happens. …

python Using io.BufferedReader to peek against a non-peekable stream - 2021-02-15

When building the --sniff option for sqlite-utils insert (which attempts to detect the correct CSV delimiter and quote character by looking at the first 2048 bytes of a CSV file) I had the need to peek ahead in an incoming stream of data. …

python Handling CSV files with wide columns in Python - 2021-02-15

Users were reporting the following error using sqlite-utils to import some CSV files: …

python Packaging a Python app as a standalone binary with PyInstaller - 2021-01-04

PyInstaller can take a Python script and bundle it up as a standalone executable for macOS, Linux and apparently Windows too (I've not tried it on Windows yet). …

python Relinquishing control in Python asyncio - 2020-12-29

asyncio in Python is a form of co-operative multitasking, where everything runs in a single thread but asynchronous tasks can yield to other tasks to allow them to execute. …
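
One common way for a task to yield is `await asyncio.sleep(0)`. The sketch below, with hypothetical worker names, shows two tasks taking turns because each one hands control back to the event loop after every step.

```python
import asyncio


async def worker(name, steps, log):
    for i in range(steps):
        log.append(f"{name}{i}")
        # Yield to the event loop so other ready tasks get a turn.
        await asyncio.sleep(0)


async def main():
    log = []
    await asyncio.gather(worker("a", 2, log), worker("b", 2, log))
    return log


log = asyncio.run(main())
```

Without the `await` inside the loop, the first worker would run to completion before the second one started.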

python Controlling the style of dumped YAML using PyYAML - 2020-12-07

I had a list of Python dictionaries I wanted to output as YAML, but I wanted to control the style of the output. …

python Running Python code in a subprocess with a time limit - 2020-12-06

I figured out how to run a subprocess with a time limit for datasette-ripgrep, using the asyncio.create_subprocess_exec() method. The pattern looks like this: …

python Decorators with optional arguments - 2020-10-28

sqlite-utils provides a decorator for registering custom Python functions that looks like this: …

python Explicit file encodings using click.File - 2020-10-16

I wanted to add a --encoding option to sqlite-utils insert which could be used to change the file encoding used to read the incoming CSV or TSV file - see sqlite-utils #182. …

python Understanding option names in Click - 2020-09-22

I hit a bug today where I had defined a Click option called open but in doing so I replaced the Python built-in open() function: …

python Debugging a Click application using pdb - 2020-09-03

This tip is for when you are working on a Python command-line application that runs using that program's name, as opposed to typing python my_script.py. I usually need this when I'm working on applications built using Click, e.g. projects I start using my click-app cookiecutter template. …

python Outputting JSON with reduced floating point precision - 2020-08-21

datasette-leaflet-geojson outputs GeoJSON geometries in HTML pages in a way that can be picked up by JavaScript and used to plot a Leaflet map. …

python How to call pip programmatically from Python - 2020-08-11

I needed this for the datasette install and datasette uninstall commands, see issue #925. …

python Password hashing in Python with pbkdf2 - 2020-07-13

I was researching password hashing for datasette-auth-passwords. I wanted very secure defaults that would work using the Python standard library without any extra dependencies. …
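
As a sketch of what standard-library-only password hashing can look like, the example below uses `hashlib.pbkdf2_hmac` with a random salt and a constant-time comparison. The helper names, the iteration count and the salt length are assumptions for illustration, not the values chosen in the linked post.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 260_000  # assumed value; tune for your own hardware and policy


def hash_password(password, salt=None):
    """Return (salt, digest) for the password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, digest = hash_password("correct horse")
ok = verify_password("correct horse", salt, digest)
rejected = verify_password("wrong guess", salt, digest)
```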

python Introspecting Python function parameters - 2020-05-27

For https://github.com/simonw/datasette/issues/581 I want to be able to inspect a Python function to determine which named parameters it accepts and send only those arguments. …
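
A rough sketch of the idea using `inspect.signature`. The helper `call_with_accepted_kwargs` and the `greet` function are hypothetical names for illustration, not the implementation from the linked issue.

```python
import inspect


def call_with_accepted_kwargs(fn, **kwargs):
    """Call fn with only the keyword arguments its signature names."""
    parameters = inspect.signature(fn).parameters
    accepted = {
        name: value for name, value in kwargs.items() if name in parameters
    }
    return fn(**accepted)


def greet(name, greeting="Hello"):
    return f"{greeting}, {name}"


# "request" is not a parameter of greet, so it is silently dropped.
result = call_with_accepted_kwargs(greet, name="Cleo", request=None)
```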

python Build the official Python documentation locally - 2020-05-08

First, check out the cpython repo: …

python Use setup.py to install platform-specific dependencies - 2020-05-05

For photos-to-sqlite I needed to install osxphotos as a dependency, but only if the platform is macOS - it's not available for Linux. …

python Installing and upgrading Datasette plugins with pipx - 2020-05-04

If you installed datasette using pipx install datasette you can install additional plugins with pipx inject like so: …

python Generated a summary of nested JSON data - 2020-04-28

I was trying to figure out the shape of the JSON object from https://github.com/simonw/coronavirus-data-gov-archive/blob/master/data_latest.json?raw=true - which is 3.2MB and heavily nested, so it's difficult to get a good feel for the shape. …

python macOS Catalina sort-of includes Python 3 - 2020-04-21

Once you have installed the "command line tools" for Catalina using the following terminal command: …

python Convert a datetime object to UTC without using pytz - 2020-04-19

I wanted to convert a datetime object (from GitPython) to UTC without adding the pytz dependency. …
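
For a timezone-aware datetime, one standard-library approach is `astimezone()` with `datetime.timezone.utc`. The sketch below builds its own example value with a fixed UTC-7 offset rather than reading one from GitPython.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical timezone-aware value, seven hours behind UTC.
local = datetime(2020, 4, 19, 12, 0, tzinfo=timezone(timedelta(hours=-7)))

# Same instant in time, expressed in UTC.
as_utc = local.astimezone(timezone.utc)
```

Calling `astimezone()` on a naive datetime treats it as system local time, so this sketch assumes the input already carries a tzinfo.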