Notes
During the Q&A following my PLF 2023 keyote, someone asked about “MRDTs”. I was unfamiliar and followed up after.
There are some similarities to things we explored as part of the Rhizome Database.
The basic idea is as follows: what if state-based RDTs could automatically derive a merge function? The most general case on any causal graph is a three-way merge. The downside is that you must keep the entire history around, since you need a least common ancestor (LCA) between the nodes that are merging. If you’re using a hash-linked causal history (such as a BFT-CRDT), then you have this already. In fact, the paper calls out content addressed storage as being a great way to implement these ideas. The “hash hardness” skip list from WNFS could easily be adapted here for performance, as well.
General Strategy
For example, if there’s graph along there lines…
flowchart TD Root([Root]):::irrel --> History([More History]):::irrel --> LCA(Least Common Ancestor) LCA --> MoreHistoryA([More History]):::irrel --> MoreHistoryB([More History]):::irrel --> Left --> Merged(Merged) LCA --> MoreHistoryC([More History]):::irrel --> Right --> Merged linkStyle 0,1,2,3,4,6,7 stroke:grey,stroke-width:0.5px,color:red; classDef irrel stroke-width:1px,color:grey;
…then you ignore all but the causal heads (Left
and Right
) and the LCA…
flowchart TD LCA --> Left -.-> Merged(Merged) LCA --> Right -.-> Merged
…and perform a merge on the states with the following strategy:
-- Invertable Monoid
-- vvvvvvvvvvvvvvvv
merge :: (Monoid a, Diff a) => a -> a -> a
merge lca left right = lca <> diff left lca <> diff right lca
Counter Example
A concrete example form the paper is merging two counters by their state:
flowchart TD lca(5) -->|"x2 = +5"| left(10) -.-> merged(9) lca(5) -->|"-1"| right(4) -.-> merged(9)
Note that this is not op-based. We didn’t record two streams ([x2], [-1])
, since doing so would require inspecting the entire history (all of those intermediate nodes that we dropped for the 3-way merge).
instance Monoid Int where
(<>) = (+)
instance Diff Int where -- Not in Prelude, just made this up
diff a b = (+ (a - b))
-- Concrete
merge :: Int -> Int -> Int
merge 5 20 4 = 5 + diff 10 5 + diff 4 5
= 5 + 5 + (-1) -- Reduction steps / "show your work"
= 10 - 1
= 9
IMO this is a much better result than a LWW register, and it’s derived from a simple mechanism.
Extending to ADTs
They go further for arbitrary ADTs. Their example is how to extend the above to a pair of counters:
flowchart TD lca("(1, 2)") --> left("(3, 4)") -.-> merged("(?, ?)") lca --> right("(5, 6)") -.-> merged
In a typical LWW, we’d drop one side and get either (3,4)
or (5,6)
. This paper proposes essentially mapping the merge function over both sides. This is a very familiar exercise for anyone with experience in ML-family languages (they use OCaml, I’m using Haskell). And product type case be reduced to pairs so IMO this gives a great intuition (coproducts are less satisfying because you’re forced to pick a branch).
instance (Monoid a, Monoid b) => Monoid (a, b) where
(a, b) <> (a', b') = (a <> a', b <> b')
instance (Diff a, Diff b) = Diff (a, b) where
diff (a, b) (a', b') = (diff a a', diff b b')
The above example would reduce as follows:
merge (1, 2) (3, 4) (5, 6) = (1, 2) <> diff (3, 4) (1, 2) <> diff (5, 6) (1, 2)
= (1, 2) <> (+2, +2) <> (+4, +4)
= (1 + 2 + 4, 2 + 2 + 4)
= (7, 8)
Or pictorially:
flowchart TD lca("(1, 2)") -->|"(+2, +2)"| left("(3, 4)") -.-> merged("(1 + 2 + 4, 2 + 2 + 4)") -.-> final("(7, 8)") lca -->|"(+4, +4)"| right("(5, 6)") -.-> merged
Linearlization
The other way of thinking about this (from the paper) is to treat the causal history as something that can be linearlized. This is great (and again: not unlike Rhizome Database), because you only have to define ordering and reducing functions. Take this causal history:
flowchart LR a --> b a --> c
We can think of this as the following merge graph:
flowchart TD a(a) --> left a --> right subgraph left direction LR al(a) --> bl(b) end subgraph right direction LR ar(a) --> cl(c) end subgraph linearize subgraph connect direction LR am(a) --> bm(b) am --> cm(c) end subgraph compare direction LR ac(a) --> bc(b) ac --> cc(c) bc -.-> cc end subgraph flatten direction LR at(a) --> bt(b) --> ct(c) end end left --> connect right --> connect --> compare --> flatten
This process produces a linear stream. To perform the generalised merge, you write a standard reducer from the LCA (again, something that we had played with in Rhizome).