Below is a snapshot of my understanding of Automerge and related projects as of 2024-07-25.
Preliminaries
Automerge Itself
- Op-based CRDT
- Requires the entire history in order to materialize a view
- On concurrent change, it performs a topsort first to linearise the history (see the sketch after this list)
- This probably lends itself nicely to compression
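As a rough mental model (not necessarily Automerge’s actual internals): materializing a view amounts to topologically sorting the op DAG and then applying the ops in that order. A minimal sketch with hypothetical Op/OpId types:
// Rough sketch: each op carries the ids of its causal parents (deps).
type OpId = string;
interface Op { id: OpId; deps: OpId[]; payload?: unknown }

// Kahn's algorithm: linearise the op DAG so every op comes after its deps.
// Assumes the entire history is present (per the bullet above).
function topsort(ops: Op[]): Op[] {
  const byId = new Map<OpId, Op>();
  const indegree = new Map<OpId, number>();
  const children = new Map<OpId, OpId[]>();
  for (const op of ops) {
    byId.set(op.id, op);
    indegree.set(op.id, op.deps.length);
    for (const dep of op.deps) children.set(dep, [...(children.get(dep) ?? []), op.id]);
  }
  // Ops with no pending deps are ready; sort them so concurrent ops are
  // tie-broken deterministically and every replica linearises the same way.
  const ready = ops.filter((op) => op.deps.length === 0).map((op) => op.id).sort();
  const linear: Op[] = [];
  while (ready.length > 0) {
    const id = ready.shift()!;
    linear.push(byId.get(id)!);
    for (const child of children.get(id) ?? []) {
      const remaining = indegree.get(child)! - 1;
      indegree.set(child, remaining);
      if (remaining === 0) ready.push(child);
    }
    ready.sort();
  }
  return linear;
}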
Nested Docs
- Documents can be linked to each other via an automerge: URL (a small example follows this list)
- This is similar to a WNFS soft link
- Each doc acts as a “unit of change”, but I’m still a bit fuzzy on what that means
- In WNFS we used this distinction as an access control boundary
- I’m increasingly of the opinion that we should do the same here for the access control project
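A minimal sketch of what the linking looks like, assuming the automerge-repo API (Repo, DocHandle.change, DocHandle.url, Repo.find); the exact calls may differ by version, so treat this as illustrative:
import { Repo } from "@automerge/automerge-repo";

// In-memory repo; a real app would also pass network adapters and storage.
const repo = new Repo({ network: [] });

// The child doc is its own unit of change, with its own history.
const child = repo.create<{ title: string }>();
child.change((doc) => { doc.title = "child doc"; });

// The parent only stores the child's automerge: URL (a soft link),
// not the child's content or history.
const parent = repo.create<{ links: string[] }>();
parent.change((doc) => { doc.links = [child.url]; }); // e.g. "automerge:..."

// Following the link is a separate lookup, like resolving a WNFS soft link.
const resolved = repo.find(child.url);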
Compression
- Compression on Automerge docs is performed by run-length encoding (RLE) over a columnar layout
- RLE is not always more efficient(!), and there is no hard rule on how much to chunk by (a toy example follows this list)
- i.e. longer runs are not always preferred
- The Automerge Binary Document Format doesn’t have runs of changes today, but it’s expected to be added
- Maybe this is the Compressed Change Chunk, but I’m not 100% certain
- Runs are not generally directly appendable
- Discovery of the content requires unpacking the document
- This means that two compressed chunks with overlapping content can be discovered to be concurrent if you expose the parent metadata, but you won’t know if one is contained in the other (which may be helpful for subsumption, see also the Leveled Sync Proposal below)
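A toy illustration of the “longer runs are not always preferred” point; this is plain RLE over a single column, not Automerge’s actual column encoding:
// Toy run-length encoder: encodes a column as [value, runLength] pairs.
function rle<T>(column: T[]): Array<[T, number]> {
  const runs: Array<[T, number]> = [];
  for (const value of column) {
    const last = runs[runs.length - 1];
    if (last && last[0] === value) last[1] += 1;
    else runs.push([value, 1]);
  }
  return runs;
}

rle(["a", "a", "a", "a"]); // [["a", 4]]: one pair for four values, a clear win
rle(["a", "b", "a", "b"]); // [["a",1],["b",1],["a",1],["b",1]]: four pairs, larger than the raw column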
Sync System
- Order-of-magnitude (OOM) estimate of the scale:
- 100k docs
- Some 10ks of ops/doc
- Total: ~ 1B ops to keep in sync
- If kept in a tree, tuning the branching factor changes the number of round trips pretty significantly (especially since latency is usually the bottleneck), but the wider the tree, the more redundant metadata needs to be synced (quick arithmetic after this list)
- Binary tree ~ 30 RT
- Hex tree ~ 8 RT
- Practical Rateless IBLT scales with the delta between replica states:
- This is near optimal, often ~1 RT for reconciliation
- …but is much more complex, depending on concepts like invertible Bloom filters, erasure coding, and peel decoding
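Quick arithmetic behind the round-trip numbers, assuming one tree level per round trip over ~1B leaves:
// Depth of a b-ary tree over n leaves, i.e. round trips if each level costs one RT.
const roundTrips = (n: number, branching: number) =>
  Math.ceil(Math.log(n) / Math.log(branching));

roundTrips(1e9, 2);  // 30 (binary tree)
roundTrips(1e9, 16); // 8  (hex tree)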
Primary Sync Proposal
- Builds a Merkle Search Tree on top of the linear history
- When syncing multiple documents:
- Each document is first topsorted
- They are all laid end-to-end
- A Merkle Search Tree is built out of the elements
These diagrams would be better with Graphviz, but we have Mermaid, so imagine that there are blank nodes and whatnot. Also ignore that the doc numbers are backwards :P (Mermaid is not the best at layout.)
flowchart
subgraph doc1
op_1a
op_1b
end
subgraph doc2
op_2a
op_2b
op_2c
end
subgraph doc3
op_3c
op_3b
op_3a
op_3d
end
op_2b --> op_2c
op_2b --> op_3d
op_2b --> op_3b
op_3b --> op_3c
op_3b --> op_3a
op_3d --> op_2a
op_2c --> op_1a
op_2c --> op_1b
doc1 ~~~ doc2 ~~~ doc3
op_3b ~~~ op_3d
- The intention above is that this remains in sequence if you perform an in-order traversal.
- My understanding is that this does not break cleanly at document boundaries.
- I think that the lack of document awareness means that you have to explore all child branches, despite the fact that we know that if you don’t have a parent you definitely don’t have the child node (see the sketch below).
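To make the document-boundary point concrete: in an MST, the layer an element lands in is a function of its hash alone, so ops from different docs interleave freely in the same subtrees. A sketch of the layer assignment, assuming the usual SHA-256 / base-16 construction (which may not be exactly what the proposal uses):
import { createHash } from "node:crypto";

// MST layer = number of leading zero hex digits of the element's hash.
// Nothing here depends on which document the op came from, so tree
// boundaries and document boundaries are unrelated.
function mstLayer(opId: string): number {
  const hex = createHash("sha256").update(opId).digest("hex");
  let layer = 0;
  while (hex[layer] === "0") layer++;
  return layer;
}

mstLayer("doc1/op_1a"); // independent of the "doc1/" prefix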
Idea for Taking Advantage of History
Essentially: use an MST for doc heads, and skip lists for the histories. The above example doesn’t have enough history to really show off the skip lists, but here it is translated (a sketch of the pointer structure follows the diagram).
flowchart
subgraph doc1
op_1b --> op_1a
end
subgraph doc2
op_2a
op_2b --> op_2a
op_2c --> op_2a
end
subgraph doc3
op_3a
op_3b
op_3c
op_3d
op_3d --> op_3c --> op_3b --> op_3a
op_3d -.->|skip| op_3b
end
doc2node --> doc1node
doc2node --> doc3node
doc1node -.-> op_1b
doc2node -.-> op_2b
doc2node -.-> op_2c
doc3node -.-> op_3d
Here’s the relevant video section1 about how we did the skip list part for searching history.
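A minimal sketch of the skip-list shape, with hypothetical HistoryNode fields (the real structure would live in the doc’s change graph): each op keeps a dense pointer to its predecessor plus occasional longer skip pointers, so seeking back through history is roughly logarithmic rather than a linear walk.
// Hypothetical history node: a dense "prev" link plus a sparse "skip" link.
interface HistoryNode {
  seq: number;        // position in the topsorted history
  prev?: HistoryNode; // immediate predecessor
  skip?: HistoryNode; // jumps further back, e.g. to seq - 2^k
}

// Walk back from the head to a target position, taking a skip whenever it
// does not overshoot the target; otherwise fall back to the dense link.
function seek(head: HistoryNode, target: number): HistoryNode | undefined {
  let node: HistoryNode | undefined = head;
  while (node && node.seq > target) {
    node = node.skip && node.skip.seq >= target ? node.skip : node.prev;
  }
  return node && node.seq === target ? node : undefined;
}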
Idea for Efficient Document Store Merge
This is just a stray thought:
Much like (my assumptions about the) Leveled Sync Proposal, it may make sense to add blank nodes to make documents break at natural tree boundaries (i.e. so that documents don’t share inner nodes up to some depth). This could make merging docs very efficient because you would only need to build the top portion of the structure, and not change the structure in relation to the sibling document changes (a sketch of the padding follows the diagram).
flowchart
subgraph doc1
op_1a
op_1b
x11[blank]
x12[blank]
x13[blank]
end
subgraph doc2
op_2a
op_2b
op_2c
end
subgraph doc3
op_3b
op_3a
op_3c
op_3d
x31[blank]
end
op_2b --> op_3b
op_2b --> op_2a
op_2b --> op_2c
op_3b --> op_3a
op_3b --> op_3d
op_3b --> op_3c
op_2b --> op_1a --> op_1b
op_1a -.-> x11
op_1a -.-> x12
op_1a -.-> x13
op_3b -.-> x31
I don’t know if this is really a huge improvement, but it may make it more efficient to maintain doc sync for users that want nonidentical-but-overlapping collections of docs. A downside of this approach is that filling the remaining space with blank nodes artificially inflates the size of the set, making it deeper, but that overhead “should” become smaller as the doc size increases.
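A sketch of the padding idea, assuming a fixed branching factor: round each document’s leaf count up to the next power of the fan-out with blank nodes, so no two documents share an inner node below the collection level.
// Pad a document's leaves with blanks until it fills a subtree of its own,
// so inner nodes are not shared with sibling documents below the collection level.
function padToSubtree<T>(leaves: T[], branching: number, blank: T): T[] {
  let size = 1;
  while (size < leaves.length) size *= branching; // next power of the fan-out
  return [...leaves, ...Array(size - leaves.length).fill(blank)];
}

padToSubtree(["op_1a", "op_1b"], 4, "blank").length;        // 4: two blanks added
padToSubtree(["a", "b", "c", "d", "e"], 4, "blank").length; // 16: the inflation mentioned above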
Leveled Sync Proposal
I think that this was the original idea. There is also a proposal to use leveled compression inside a Merkle structure that’s related to (but not exactly) an MST.
flowchart
subgraph doc1
op_1a
op_1b
op_1c
op_1d
op_1e
run1_abc --> op_1a
run1_abc --> op_1b
run1_abc --> op_1c
run1_de --> op_1d
run1_de --> op_1e
run1_abcde --> run1_abc
run1_abcde --> run1_de
end
subgraph doc2
op_2a
op_2b
op_2c
run2_abc --> op_2a
run2_abc --> op_2b
run2_abc --> op_2c
end
subgraph doc3
op_3a
op_3b
op_3c
op_3d
run3_ab --> op_3a
run3_ab --> op_3b
run3_cd --> op_3c
run3_cd --> op_3d
run3_abcd --> run3_ab
run3_abcd --> run3_cd
end
run12 --> run1_abcde
run12 --> run2_abc
run123 --> run12
run123 --> run3_abcd
I think that in this version, it doesn’t make sense to have runs end on the start of the next document (it’s unlikely that you’ll need to grab the very end of one doc and the very start of the next). In the diagram above, we have the doc runs isolated until we get to the collection level. It may actually make sense for collection sync (if you know the collection that the user wants in advance) to keep generations of runs around so that runs of changes of related documents can get synced together, but that’s fairly half-baked at this stage.
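A sketch of how the run hierarchy above could be built per document, assuming fixed-size chunking at each level and SHA-256 node hashes; the real proposal presumably picks run boundaries by compression level or content rather than a fixed count:
import { createHash } from "node:crypto";

const hashRun = (children: string[]): string =>
  createHash("sha256").update(children.join("|")).digest("hex");

// Build levels of run nodes over a doc's (already compressed) op hashes,
// `span` children per run, until a single doc-level run remains.
function buildRuns(opHashes: string[], span: number): string[][] {
  const levels: string[][] = [opHashes];
  while (levels[levels.length - 1].length > 1) {
    const below = levels[levels.length - 1];
    const above: string[] = [];
    for (let i = 0; i < below.length; i += span) {
      above.push(hashRun(below.slice(i, i + span))); // e.g. run1_abc over op_1a..op_1c
    }
    levels.push(above);
  }
  return levels; // ops at the bottom, the doc-level run at the top
}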
There’s certainly a space/time tradeoff in this version. It’s very efficient to get (at least close to) the level of granularity that you need at runtime on static data, but you need to store the same changes multiple times. Thanks to RLE, this will be less than holding onto full copies of the changes, but there’s certainly some amplification.
We did something vaguely similar in WNFS, which we called a “skeleton”: a materialized view of the subdirectory structure. This made many things MUCH faster at runtime, but came at the cost of being brittle and (aside from SNARKing) impossible to use in a trustless setting (it would need to be BFT).