• PDF
  • Published 2019
  • Authors
    • Gowtham Kaki
    • Swarn Priya
    • K.C. Sivaramakrishnan
    • Suresh Jagannathan

Notes

During the Q&A following my PLF 2023 keyote, someone asked about “MRDTs”. I was unfamiliar and followed up after.

There are some similarities to things we explored as part of the Rhizome Database.

The basic idea is as follows: what if state-based RDTs could automatically derive a merge function? The most general case on any causal graph is a three-way merge. The downside is that you must keep the entire history around, since you need a least common ancestor (LCA) between the nodes that are merging. If you’re using a hash-linked causal history (such as a BFT-CRDT), then you have this already. In fact, the paper calls out content addressed storage as being a great way to implement these ideas. The “hash hardness” skip list from WNFS could easily be adapted here for performance, as well.

General Strategy

For example, if there’s graph along there lines…

flowchart TD
  Root([Root]):::irrel --> History([More History]):::irrel --> LCA(Least Common Ancestor)
  LCA --> MoreHistoryA([More History]):::irrel --> MoreHistoryB([More History]):::irrel --> Left --> Merged(Merged)
  LCA --> MoreHistoryC([More History]):::irrel --> Right --> Merged

  linkStyle 0,1,2,3,4,6,7 stroke:grey,stroke-width:0.5px,color:red;
  classDef irrel stroke-width:1px,color:grey;

…then you ignore all but the causal heads (Left and Right) and the LCA…

flowchart TD
  LCA --> Left -.-> Merged(Merged)
  LCA --> Right -.-> Merged

…and perform a merge on the states with the following strategy:

       -- Invertable Monoid
       -- vvvvvvvvvvvvvvvv
merge :: (Monoid a, Diff a) => a -> a -> a
merge lca left right = lca <> diff left lca <> diff right lca

Counter Example

A concrete example form the paper is merging two counters by their state:

flowchart TD
  lca(5) -->|"x2 = +5"| left(10) -.-> merged(9)
  lca(5) -->|"-1"| right(4) -.-> merged(9)

Note that this is not op-based. We didn’t record two streams ([x2], [-1]), since doing so would require inspecting the entire history (all of those intermediate nodes that we dropped for the 3-way merge).

instance Monoid Int where
  (<>) = (+)
 
instance Diff Int where -- Not in Prelude, just made this up
  diff a b = (+ (a - b))
 
-- Concrete
merge :: Int -> Int -> Int
merge 5 20 4 = 5 + diff 10 5 + diff 4 5
             = 5 + 5         + (-1)  -- Reduction steps / "show your work"
             = 10            - 1 
             = 9

IMO this is a much better result than a LWW register, and it’s derived from a simple mechanism.

Extending to ADTs

They go further for arbitrary ADTs. Their example is how to extend the above to a pair of counters:

flowchart TD
  lca("(1, 2)") --> left("(3, 4)") -.-> merged("(?, ?)")
  lca --> right("(5, 6)") -.-> merged

In a typical LWW, we’d drop one side and get either (3,4) or (5,6). This paper proposes essentially mapping the merge function over both sides. This is a very familiar exercise for anyone with experience in ML-family languages (they use OCaml, I’m using Haskell). And product type case be reduced to pairs so IMO this gives a great intuition (coproducts are less satisfying because you’re forced to pick a branch).

instance (Monoid a, Monoid b) => Monoid (a, b) where
  (a, b) <> (a', b') = (a <> a', b <> b')
 
instance (Diff a, Diff b) = Diff (a, b) where
  diff (a, b) (a', b') = (diff a a', diff b b')

The above example would reduce as follows:

merge (1, 2) (3, 4) (5, 6) = (1, 2) <> diff (3, 4) (1, 2) <> diff (5, 6) (1, 2)
                           = (1, 2) <> (+2, +2)             <> (+4, +4)
                           = (1 + 2 + 4, 2 + 2 + 4)
                           = (7, 8)

Or pictorially:

flowchart TD
  lca("(1, 2)") -->|"(+2, +2)"| left("(3, 4)") -.-> merged("(1 + 2 + 4, 2 + 2 + 4)") -.-> final("(7, 8)")
  lca -->|"(+4, +4)"| right("(5, 6)") -.-> merged

Linearlization

The other way of thinking about this (from the paper) is to treat the causal history as something that can be linearlized. This is great (and again: not unlike Rhizome Database), because you only have to define ordering and reducing functions. Take this causal history:

flowchart LR
  a --> b
  a --> c

We can think of this as the following merge graph:

flowchart TD
  a(a) --> left
  a --> right

  subgraph left
    direction LR
    al(a) --> bl(b)
  end

  subgraph right
    direction LR
    ar(a) --> cl(c)
  end

  subgraph linearize
	  subgraph connect
	    direction LR
	    am(a) --> bm(b)
	    am --> cm(c)
	  end
	
	  subgraph compare
	    direction LR
	    ac(a) --> bc(b)
	    ac --> cc(c)
	    bc -.-> cc
	  end
	
	  subgraph flatten
		direction LR
		at(a) --> bt(b) --> ct(c)
	  end
	end

  left --> connect
  right --> connect --> compare --> flatten

This process produces a linear stream. To perform the generalised merge, you write a standard reducer from the LCA (again, something that we had played with in Rhizome).