• PDF
  • Published 2019
  • Authors
    • Gowtham Kaki
    • Swarn Priya
    • K.C. Sivaramakrishnan
    • Suresh Jagannathan

Notes

During the Q&A following my PLF 2023 keyote, someone asked about β€œMRDTs”. I was unfamiliar and followed up after.

There are some similarities to things we explored as part of the Rhizome Database.

The basic idea is as follows: what if state-based RDTs could automatically derive a merge function? The most general case on any causal graph is a three-way merge. The downside is that you must keep the entire history around, since you need a least common ancestor (LCA) between the nodes that are merging. If you’re using a hash-linked causal history (such as a BFT-CRDT), then you have this already. In fact, the paper calls out content addressed storage as being a great way to implement these ideas. The β€œhash hardness” skip list from WNFS could easily be adapted here for performance, as well.

General Strategy

For example, if there’s graph along there lines…

flowchart TD
  Root([Root]):::irrel --> History([More History]):::irrel --> LCA(Least Common Ancestor)
  LCA --> MoreHistoryA([More History]):::irrel --> MoreHistoryB([More History]):::irrel --> Left --> Merged(Merged)
  LCA --> MoreHistoryC([More History]):::irrel --> Right --> Merged

  linkStyle 0,1,2,3,4,6,7 stroke:grey,stroke-width:0.5px,color:red;
  classDef irrel stroke-width:1px,color:grey;

…then you ignore all but the causal heads (Left and Right) and the LCA…

flowchart TD
  LCA --> Left -.-> Merged(Merged)
  LCA --> Right -.-> Merged

…and perform a merge on the states with the following strategy:

       -- Invertable Monoid
       -- vvvvvvvvvvvvvvvv
merge :: (Monoid a, Diff a) => a -> a -> a
merge lca left right = lca <> diff left lca <> diff right lca

Counter Example

A concrete example form the paper is merging two counters by their state:

flowchart TD
  lca(5) -->|"x2 = +5"| left(10) -.-> merged(9)
  lca(5) -->|"-1"| right(4) -.-> merged(9)

Note that this is not op-based. We didn’t record two streams ([x2], [-1]), since doing so would require inspecting the entire history (all of those intermediate nodes that we dropped for the 3-way merge).

instance Monoid Int where
  (<>) = (+)
 
instance Diff Int where -- Not in Prelude, just made this up
  diff a b = (+ (a - b))
 
-- Concrete
merge :: Int -> Int -> Int
merge 5 20 4 = 5 + diff 10 5 + diff 4 5
             = 5 + 5         + (-1)  -- Reduction steps / "show your work"
             = 10            - 1 
             = 9

IMO this is a much better result than a LWW register, and it’s derived from a simple mechanism.

Extending to ADTs

They go further for arbitrary ADTs. Their example is how to extend the above to a pair of counters:

flowchart TD
  lca("(1, 2)") --> left("(3, 4)") -.-> merged("(?, ?)")
  lca --> right("(5, 6)") -.-> merged

In a typical LWW, we’d drop one side and get either (3,4) or (5,6). This paper proposes essentially mapping the merge function over both sides. This is a very familiar exercise for anyone with experience in ML-family languages (they use OCaml, I’m using Haskell). And product type case be reduced to pairs so IMO this gives a great intuition (coproducts are less satisfying because you’re forced to pick a branch).

instance (Monoid a, Monoid b) => Monoid (a, b) where
  (a, b) <> (a', b') = (a <> a', b <> b')
 
instance (Diff a, Diff b) = Diff (a, b) where
  diff (a, b) (a', b') = (diff a a', diff b b')

The above example would reduce as follows:

merge (1, 2) (3, 4) (5, 6) = (1, 2) <> diff (3, 4) (1, 2) <> diff (5, 6) (1, 2)
                           = (1, 2) <> (+2, +2)             <> (+4, +4)
                           = (1 + 2 + 4, 2 + 2 + 4)
                           = (7, 8)

Or pictorially:

flowchart TD
  lca("(1, 2)") -->|"(+2, +2)"| left("(3, 4)") -.-> merged("(1 + 2 + 4, 2 + 2 + 4)") -.-> final("(7, 8)")
  lca -->|"(+4, +4)"| right("(5, 6)") -.-> merged

Linearlization

The other way of thinking about this (from the paper) is to treat the causal history as something that can be linearlized. This is great (and again: not unlike Rhizome Database), because you only have to define ordering and reducing functions. Take this causal history:

flowchart LR
  a --> b
  a --> c

We can think of this as the following merge graph:

flowchart TD
  a(a) --> left
  a --> right

  subgraph left
    direction LR
    al(a) --> bl(b)
  end

  subgraph right
    direction LR
    ar(a) --> cl(c)
  end

  subgraph linearize
	  subgraph connect
	    direction LR
	    am(a) --> bm(b)
	    am --> cm(c)
	  end
	
	  subgraph compare
	    direction LR
	    ac(a) --> bc(b)
	    ac --> cc(c)
	    bc -.-> cc
	  end
	
	  subgraph flatten
		direction LR
		at(a) --> bt(b) --> ct(c)
	  end
	end

  left --> connect
  right --> connect --> compare --> flatten

This process produces a linear stream. To perform the generalised merge, you write a standard reducer from the LCA (again, something that we had played with in Rhizome).