CRDT

Planted 17 May 2024 Tended 18 May 2026

641 words ~4 min read

A Conflict-free Replicated Data Type is a data structure that can be replicated across multiple peers, edited concurrently without coordination, and merged deterministically -- no consensus protocol required. The key insight is that if your merge function forms a join-semilattice (i.e. it's associative, commutative, and idempotent), then replicas converge regardless of message ordering or delivery.

CRDTs are one of the core enabling technologies for local-first software -- they let you keep working offline and sync up later without conflict resolution dialogs.

Two Flavours

	State-based (CvRDT)	Operation-based (CmRDT)
What's sent	Full state	Individual operations
Merge	Join on lattice	Apply op (must be commutative)
Delivery	At least once	Exactly once, causal order
Bandwidth	Higher (sends whole state)	Lower (sends deltas)
Typical use	Simpler protocols, gossip	Real-time collaboration

In practice the line blurs -- Automerge is technically op-based but ships a columnar binary format that looks a lot like state, and delta-state CRDTs split the difference.

Common CRDT Types

Type	What it models	Merge strategy
G-Counter	Grow-only counter	Per-replica max
PN-Counter	Increment/decrement counter	Two G-Counters
G-Set	Grow-only set	Union
OR-Set (Observed-Remove)	Add/remove set	Tag elements with unique IDs; union of adds minus observed removes
LWW-Register	Last-writer-wins single value	Timestamp comparison
MV-Register	Multi-value register	Keep all concurrent values (let the app decide)
RGA / LSEQ / Fugue	Ordered sequences (text)	Interleave with unique position IDs
Map / JSON	Nested documents	Recursive merge per key

Why CRDTs Are Hard

The theory is elegant; the engineering is not. A few recurring pain points:

Tombstones -- deleting something from a set means you have to remember that you deleted it (forever, or at least until you can prove all replicas saw the delete). This is the "metadata grows without bound" problem.
Large documents -- naive CRDTs store the full operation history. For a text document, that means every keystroke. Compaction and garbage collection are non-trivial -- see Automerge Binary Document Format for how Automerge handles this with columnar compression.
Access control -- CRDTs assume all replicas are trusted. If you want to revoke someone's access or enforce permissions, you need something on top -- this is exactly the problem Keyhive solves.
Encryption -- syncing encrypted CRDTs means reconciling state you can't inspect. Subduction addresses this by operating entirely on content hashes and fingerprints. WNFS took a different approach -- an encrypted file system built on CRDTs with a Skip Ratchet for key management.
Rich text -- merging concurrent edits to formatted text (bold, headings, lists) is significantly harder than plain text. See PeriText for the state of the art.

The CALM Connection

The CALM theorem (Consistency And Logical Monotonicity) gives a formal foundation: any monotone program is eventually consistent without coordination. CRDTs are the data structure instantiation of this -- their merge functions are monotone by construction. Keep CALM and CRDT On extends this further with a monotone query model that tells you which reads are safe without coordination.

My Work in This Space

Most of my current research at Ink & Switch is about making CRDTs practical for real-world applications:

Project	What it addresses
Keyhive	Authorisation for local-first CRDTs (who can read/write?)
Subduction	Encrypted peer-to-peer sync (how do you reconcile what you can't decrypt?)
Authomerge	The earlier design exploration behind Keyhive
UCAN	Decentralised auth that doesn't need a server to check permissions

CRDT

Two Flavours

Common CRDT Types

Why CRDTs Are Hard

The CALM Connection

My Work in This Space

Further Reading

Papers

Talks

Notes

Linked from