A Conflict-free Replicated Data Type is a data structure that can be replicated across multiple peers, edited concurrently without coordination, and merged deterministically — no consensus protocol required. The key insight is that if your merge function forms a join-semilattice (i.e. it’s associative, commutative, and idempotent), then replicas converge regardless of message ordering or delivery.

CRDTs are one of the core enabling technologies for local-first software — they let you keep working offline and sync up later without conflict resolution dialogs.

Two Flavours

State-based (CvRDT)Operation-based (CmRDT)
What’s sentFull stateIndividual operations
MergeJoin on latticeApply op (must be commutative)
DeliveryAt least onceExactly once, causal order
BandwidthHigher (sends whole state)Lower (sends deltas)
Typical useSimpler protocols, gossipReal-time collaboration

In practice the line blurs — Automerge is technically op-based but ships a columnar binary format that looks a lot like state, and delta-state CRDTs split the difference.

Common CRDT Types

TypeWhat it modelsMerge strategy
G-CounterGrow-only counterPer-replica max
PN-CounterIncrement/decrement counterTwo G-Counters
G-SetGrow-only setUnion
OR-Set (Observed-Remove)Add/remove setTag elements with unique IDs; union of adds minus observed removes
LWW-RegisterLast-writer-wins single valueTimestamp comparison
MV-RegisterMulti-value registerKeep all concurrent values (let the app decide)
RGA / LSEQ / FugueOrdered sequences (text)Interleave with unique position IDs
Map / JSONNested documentsRecursive merge per key

Why CRDTs Are Hard

The theory is elegant; the engineering is not. A few recurring pain points:

  • Tombstones — deleting something from a set means you have to remember that you deleted it (forever, or at least until you can prove all replicas saw the delete). This is the “metadata grows without bound” problem.
  • Large documents — naive CRDTs store the full operation history. For a text document, that means every keystroke. Compaction and garbage collection are non-trivial — see Automerge Binary Document Format for how Automerge handles this with columnar compression.
  • Access control — CRDTs assume all replicas are trusted. If you want to revoke someone’s access or enforce permissions, you need something on top — this is exactly the problem Keyhive solves.
  • Encryption — syncing encrypted CRDTs means reconciling state you can’t inspect. Subduction addresses this by operating entirely on content hashes and fingerprints. WNFS took a different approach — an encrypted file system built on CRDTs with a Skip Ratchet for key management.
  • Rich text — merging concurrent edits to formatted text (bold, headings, lists) is significantly harder than plain text. See PeriText for the state of the art.

The CALM Connection

The CALM theorem (Consistency And Logical Monotonicity) gives a formal foundation: any monotone program is eventually consistent without coordination. CRDTs are the data structure instantiation of this — their merge functions are monotone by construction. Keep CALM and CRDT On extends this further with a monotone query model that tells you which reads are safe without coordination.

My Work in This Space

Most of my current research at Ink & Switch is about making CRDTs practical for real-world applications:

ProjectWhat it addresses
KeyhiveAuthorisation for local-first CRDTs (who can read/write?)
SubductionEncrypted peer-to-peer sync (how do you reconcile what you can’t decrypt?)
AuthomergeThe earlier design exploration behind Keyhive
UCANDecentralised auth that doesn’t need a server to check permissions

Further Reading

Papers

Talks

Notes

  • Automerge — the CRDT I work with most
  • Automerge Sync — how Automerge’s sync protocol works under the hood
  • Local-first — the broader movement CRDTs enable
  • CALM — the theoretical foundation