cr-sqlite is fascinating. It's a loadable SQLite extension by Matt Wonlaw that "allows merging different SQLite databases together that have taken independent writes".
It does that by implementing CRDTs, via a set of SQL functions you can use to upgrade a table to a CRDT-backed version.
Since it's a SQLite extension it's available to any language that can use SQLite - so Python, Rust, Go etc are all covered for free.
And yes, there's a WASM build as well, demonstrated in this Observable notebook.
It took me a bit of fiddling (see this issue) to get it up and running on macOS. Here's what worked for me.
brew install sqlite3
for this.cd
to the same folder as the download, then run:
/opt/homebrew/opt/sqlite/bin/sqlite3
.load crsqlite-darwin-aarch64 sqlite3_crsqlite_init
That second part is essential, it tells SQLite the entrypoint to use. Without that (if you just try .load crsqlite-darwin-aarch64
) you get a symbol not found
error..load
command again... and this time it should work!Once I had done all of that I ran through the example in the README and it worked as advertised. Broken up here into copy-pastable snippets:
.mode qbox
This turned on a new-to-me SQLite output mode which is very pleasant.
-- create tables as normal
create table foo (a primary key, b);
create table baz (a primary key, b, c, d);
-- update those tables to be crrs / crdts
select crsql_as_crr('foo');
select crsql_as_crr('baz');
-- insert some data / interact with tables as normal
insert into foo (a,b) values (1,2);
insert into baz (a,b,c,d) values ('a', 'woo', 'doo', 'daa');
-- ask for a record of what has changed
select "table", "pk", "cid", "val", "col_version", "db_version", COALESCE("site_id", crsql_site_id()), "cl", "seq" from crsql_changes;
Without .mode qbox
this returned:
foo| |b|2|1|1|ӓ?P??N??5I??|1|0
baz|
a|c|doo|1|2|ӓ?P??N??5I??|1|1
baz|
a|d|daa|1|2|ӓ?P??N??5I??|1|2
foo| |b|thing|5|5|p???1F????????|1|0
baz|
a|b|123|101|233|p???1F????????|1|0
But with .mode qbox
those binary values were shown as much more readable hex:
┌───────┬─────────────┬─────┬─────────┬─────────────┬────────────┬──────────────────────────────────────┬────┬─────┐
│ table │ pk │ cid │ val │ col_version │ db_version │ COALESCE("site_id", crsql_site_id()) │ cl │ seq │
├───────┼─────────────┼─────┼─────────┼─────────────┼────────────┼──────────────────────────────────────┼────┼─────┤
│ 'foo' │ x'010901' │ 'b' │ 2 │ 1 │ 1 │ x'd393ef50a8a84e17be1b32cc354985be' │ 1 │ 0 │
│ 'baz' │ x'010b0161' │ 'c' │ 'doo' │ 1 │ 2 │ x'd393ef50a8a84e17be1b32cc354985be' │ 1 │ 1 │
│ 'baz' │ x'010b0161' │ 'd' │ 'daa' │ 1 │ 2 │ x'd393ef50a8a84e17be1b32cc354985be' │ 1 │ 2 │
│ 'foo' │ x'010905' │ 'b' │ 'thing' │ 5 │ 5 │ x'7096e2d505314699a59c95faba14abb5' │ 1 │ 0 │
│ 'baz' │ x'010b0161' │ 'b' │ 123 │ 101 │ 233 │ x'7096e2d505314699a59c95faba14abb5' │ 1 │ 0 │
└───────┴─────────────┴─────┴─────────┴─────────────┴────────────┴──────────────────────────────────────┴────┴─────┘
-- simulate merging changes from a peer
insert into crsql_changes
("table", "pk", "cid", "val", "col_version", "db_version", "site_id", "cl", "seq")
values
('foo', x'010905', 'b', 'thing', 5, 5, X'7096E2D505314699A59C95FABA14ABB5', 1, 0);
insert into crsql_changes ("table", "pk", "cid", "val", "col_version", "db_version", "site_id", "cl", "seq")
values
('baz', x'010b0161', 'b', 123, 101, 233, X'7096E2D505314699A59C95FABA14ABB5', 1, 0);
-- check that peer's changes were applied
select * from foo;
select * from baz;
-- tear down the extension before closing the connection
select crsql_finalize();
I ran the above sequence again against a database file on disk called /tmp/temp.db
and then did this:
sqlite3 /tmp/temp.db .schema
Here's the resulting schema:
CREATE TABLE foo (a primary key, b);
CREATE TABLE baz (a primary key, b, c, d);
CREATE TABLE crsql_tracked_peers ("site_id" BLOB NOT NULL, "version" INTEGER NOT NULL, "seq" INTEGER DEFAULT 0, "tag" INTEGER, "event" INTEGER, PRIMARY KEY ("site_id", "tag", "event")) STRICT;
CREATE TABLE IF NOT EXISTS "crsql_master" ("key" TEXT PRIMARY KEY, "value" ANY);
CREATE TABLE IF NOT EXISTS "crsql_site_id" (site_id BLOB NOT NULL, ordinal INTEGER PRIMARY KEY AUTOINCREMENT);
CREATE TABLE sqlite_sequence(name,seq);
CREATE UNIQUE INDEX crsql_site_id_site_id ON "crsql_site_id" (site_id);
CREATE TABLE IF NOT EXISTS "foo__crsql_clock" (
"a",
__crsql_col_name TEXT NOT NULL,
__crsql_col_version INT NOT NULL,
__crsql_db_version INT NOT NULL,
__crsql_site_id INT,
__crsql_seq INT NOT NULL,
PRIMARY KEY ("a", __crsql_col_name)
);
CREATE INDEX "foo__crsql_clock_dbv_idx" ON "foo__crsql_clock" ("__crsql_db_version");
CREATE TRIGGER "foo__crsql_itrig"
AFTER INSERT ON "foo" WHEN crsql_internal_sync_bit() = 0
BEGIN
UPDATE "foo__crsql_clock" SET
__crsql_col_version = CASE __crsql_col_version % 2 WHEN 0 THEN __crsql_col_version + 1 ELSE __crsql_col_version + 2 END,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_increment_and_get_seq(),
__crsql_site_id = NULL
WHERE "a" IS NEW."a" AND __crsql_col_name = '-1';
INSERT INTO "foo__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
NEW."a",
'b',
1,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL
ON CONFLICT DO UPDATE SET
__crsql_col_version = __crsql_col_version + 1,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
END;
CREATE TRIGGER "foo_a__crsql_utrig"
AFTER UPDATE OF "a" ON "foo"
WHEN crsql_internal_sync_bit() = 0 AND NEW."a" IS NOT OLD."a"
BEGIN
INSERT INTO "foo__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
OLD."a",
'-1',
2,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL WHERE true
ON CONFLICT DO UPDATE SET
__crsql_col_version = 1 + __crsql_col_version,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
DELETE FROM "foo__crsql_clock"
WHERE "a" IS OLD."a" AND __crsql_col_name != '-1';
END;
CREATE TRIGGER "foo__crsql_utrig"
AFTER UPDATE ON "foo" WHEN crsql_internal_sync_bit() = 0
BEGIN
INSERT INTO "foo__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
NEW."a",
'-1',
1,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL
WHERE NEW."a" IS NOT OLD."a"
ON CONFLICT DO UPDATE SET
__crsql_col_version = CASE __crsql_col_version % 2 WHEN 0 THEN __crsql_col_version + 1 ELSE __crsql_col_version + 2 END,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
INSERT INTO "foo__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
NEW."a",
'b',
1,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL
WHERE NEW."b" IS NOT OLD."b"
ON CONFLICT DO UPDATE SET
__crsql_col_version = __crsql_col_version + 1,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
END;
CREATE TRIGGER "foo__crsql_dtrig"
AFTER DELETE ON "foo" WHEN crsql_internal_sync_bit() = 0
BEGIN
INSERT INTO "foo__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
OLD."a",
'-1',
2,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL WHERE true
ON CONFLICT DO UPDATE SET
__crsql_col_version = 1 + __crsql_col_version,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
DELETE FROM "foo__crsql_clock"
WHERE "a" IS OLD."a" AND __crsql_col_name != '-1';
END;
CREATE TABLE IF NOT EXISTS "baz__crsql_clock" (
"a",
__crsql_col_name TEXT NOT NULL,
__crsql_col_version INT NOT NULL,
__crsql_db_version INT NOT NULL,
__crsql_site_id INT,
__crsql_seq INT NOT NULL,
PRIMARY KEY ("a", __crsql_col_name)
);
CREATE INDEX "baz__crsql_clock_dbv_idx" ON "baz__crsql_clock" ("__crsql_db_version");
CREATE TRIGGER "baz__crsql_itrig"
AFTER INSERT ON "baz" WHEN crsql_internal_sync_bit() = 0
BEGIN
UPDATE "baz__crsql_clock" SET
__crsql_col_version = CASE __crsql_col_version % 2 WHEN 0 THEN __crsql_col_version + 1 ELSE __crsql_col_version + 2 END,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_increment_and_get_seq(),
__crsql_site_id = NULL
WHERE "a" IS NEW."a" AND __crsql_col_name = '-1';
INSERT INTO "baz__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
NEW."a",
'b',
1,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL
ON CONFLICT DO UPDATE SET
__crsql_col_version = __crsql_col_version + 1,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
INSERT INTO "baz__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
NEW."a",
'c',
1,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL
ON CONFLICT DO UPDATE SET
__crsql_col_version = __crsql_col_version + 1,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
INSERT INTO "baz__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
NEW."a",
'd',
1,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL
ON CONFLICT DO UPDATE SET
__crsql_col_version = __crsql_col_version + 1,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
END;
CREATE TRIGGER "baz_a__crsql_utrig"
AFTER UPDATE OF "a" ON "baz"
WHEN crsql_internal_sync_bit() = 0 AND NEW."a" IS NOT OLD."a"
BEGIN
INSERT INTO "baz__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
OLD."a",
'-1',
2,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL WHERE true
ON CONFLICT DO UPDATE SET
__crsql_col_version = 1 + __crsql_col_version,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
DELETE FROM "baz__crsql_clock"
WHERE "a" IS OLD."a" AND __crsql_col_name != '-1';
END;
CREATE TRIGGER "baz__crsql_utrig"
AFTER UPDATE ON "baz" WHEN crsql_internal_sync_bit() = 0
BEGIN
INSERT INTO "baz__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
NEW."a",
'-1',
1,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL
WHERE NEW."a" IS NOT OLD."a"
ON CONFLICT DO UPDATE SET
__crsql_col_version = CASE __crsql_col_version % 2 WHEN 0 THEN __crsql_col_version + 1 ELSE __crsql_col_version + 2 END,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
INSERT INTO "baz__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
NEW."a",
'b',
1,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL
WHERE NEW."b" IS NOT OLD."b"
ON CONFLICT DO UPDATE SET
__crsql_col_version = __crsql_col_version + 1,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
INSERT INTO "baz__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
NEW."a",
'c',
1,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL
WHERE NEW."c" IS NOT OLD."c"
ON CONFLICT DO UPDATE SET
__crsql_col_version = __crsql_col_version + 1,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
INSERT INTO "baz__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
NEW."a",
'd',
1,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL
WHERE NEW."d" IS NOT OLD."d"
ON CONFLICT DO UPDATE SET
__crsql_col_version = __crsql_col_version + 1,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
END;
CREATE TRIGGER "baz__crsql_dtrig"
AFTER DELETE ON "baz" WHEN crsql_internal_sync_bit() = 0
BEGIN
INSERT INTO "baz__crsql_clock" (
"a",
__crsql_col_name,
__crsql_col_version,
__crsql_db_version,
__crsql_seq,
__crsql_site_id
) SELECT
OLD."a",
'-1',
2,
crsql_next_db_version(),
crsql_increment_and_get_seq(),
NULL WHERE true
ON CONFLICT DO UPDATE SET
__crsql_col_version = 1 + __crsql_col_version,
__crsql_db_version = crsql_next_db_version(),
__crsql_seq = crsql_get_seq() - 1,
__crsql_site_id = NULL;
DELETE FROM "baz__crsql_clock"
WHERE "a" IS OLD."a" AND __crsql_col_name != '-1';
END;
And just because I can, using LLM:
sqlite3 /tmp/temp.db .schema | llm -m gpt-4 -s 'explain this SQL, with a villainous twist'
Outputs:
This SQL script is establishing database tables and setting up triggers for tracking changes to the database. The villainous twist here is that it's designed to covertly track every change made to certain database tables ("foo" and "baz"), including inserts, updates, and deletions. The changes are logged in system tables ("crsql_tracked_peers", "crsql_master", "crsql_site_id", "foo__crsql_clock", "baz__crsql_clock") and tracked with various specific criteria ("__crsql_col_version", "__crsql_db_version", "__crsql_seq", "__crsql_site_id").
The "foo" and "baz" tables represent databases the villain would like to monitor. The script will create these tables if they don't exist and is also adding triggers to them.
Triggers are procedures that are automatically executed in response to certain events (e.g., insertions, updates, or deletions) on a particular table.
So the villainous twist is that after every INSERT, UPDATE, or DELETE operation on "foo" or "baz" tables, these changes are tracked and stored in "__crsql_clock" tables.
When a row is inserted into "foo" or "baz", the "foo__crsql_itrig" or "baz__crsql_itrig" triggers fire. They update certain fields in the "foo__crsql_clock" or "baz__crsql_clock" tables and insert new rows if necessary.
When a row is updated in "foo" or "baz", "foo__crsql_utrig" or "baz__crsql_utrig" triggers fire, doing the same thing but additionally checking if value of "a" was changed.
When a row is deleted from "foo" or "baz", "foo__crsql_dtrig" or "baz__crsql_dtrig" triggers run. They insert info about deleted row into "foo__crsql_clock" or "baz__crsql_clock" tables, and delete rows associated with that object.
The villain's script is making sure no change goes unnoticed. If you don't know the triggers are there, you may never realize what's happening!
Created 2023-09-12T17:01:41-07:00, updated 2023-09-12T17:10:56-07:00 · History · Edit