Algorithms¶
Replication Model¶
Collabs implements hybrid op-based/state-based Conflict-free Replicated Data Types (CRDTs). This means that you can share updates using both messages (op-based CRDT usage) and saved states (state-based CRDT usage), as described in Updates and Sync.
In a Collab implementation, send and receive correspond to op-based CRDT usage, while save and load correspond to state-based usage.
Built-in Collabs¶
Our built-in Collabs use CRDT algorithms from (or inspired by) various sources, shown in the table below.
Collab |
Algorithm |
References |
Comments |
---|---|---|---|
Enable-Wins Flag, or Disable-Wins Flag with option |
Pure Operation-based Replicated Data Types, Carlos Baquero, Paulo Sérgio Almeida, and Ali Shoker |
||
State-based PN-Counter |
A Comprehensive Study of Convergent and Commutative Replicated Data Types, Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski |
Op-based messages work similarly to the delta-CRDT Counter from Delta State Replicated Data Types, Paulo Sérgio Almeida, Ali Shoker, and Carlos Baquero. |
|
Riak Map without deletions |
Riak Data Types: Maps. Similar CRDT-valued maps appear in Logic and Lattices for Distributed Programming, Neil Conway, William R. Marczak, Peter Alvaro, Joseph M. Hellerstein, and David Maier; LVars: Lattice-based Data Structures for Deterministic Parallelism, Lindsey Kuper and Ryan R. Newton; and Composition of State-based CRDTs, Carlos Baquero, Paulo Sérgio Almeida, Alcino Cunha and Carla Ferreira. |
Like the Riak Map, there is a single value CRDT per key that implicitly always exists, all operations on that value are merged, and a key is present if and only if its value CRDT is in a nontrivial state (determined by Collab.canGC). Unlike the Riak map, we do not allow delete-key operations, which in Riak would perform an observed-reset operation on the value. |
|
List-with-Move |
Moving Elements in List CRDTs, Martin Kleppmann |
In addition to position, the register has an |
|
Inspired by how Yjs’s Y.Map handles values that are themselves shared types. |
|||
Multi-Value Register per key |
A Comprehensive Study of Convergent and Commutative Replicated Data Types, Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski |
A key is present whenever its Multi-Value Register has nonzero conflicting values; the delete operation just clears the (causally prior) conflicting values. For state-based merging of a saved state, we delete conflicts based on a vector clock instead of tombstones, following An Optimized Conflict-free Replicated Set, Annette Bieniusa, Marek Zawirski, Nuno Preguiça, Marc Shapiro, Carlos Baquero, Valter Balegas, and Sérgio Duarte. |
|
Similar to CLazyMap, but with a fixed set of keys (the child names). |
See CLazyMap references. |
We have not seen an identical “CRDT object” idea in the literature, but the idea and semantics are obvious (multiple independent CRDTs side-by-side, keyed by name). |
|
CTotalOrder and LocalList (used in all list types) |
Tree-Fugue List CRDT |
The Art of the Fugue: Minimizing Interleaving in Collaborative Text Editing, Matthew Weidner, Joseph Gentle, and Martin Kleppmann. |
Blog post: Fugue: A Basic List CRDT. As an optimization, we use “waypoints” as described in CTotalOrder’s doc header; those are inspired by Yjs’s items. |
Peritext |
Peritext: A CRDT for Collaborative Rich Text Editing, Geoffrey Litt, Sarah Lim, Martin Kleppmann, and Peter van Hardenberg. |
The semantics are substantially similar to Peritext’s, but we use somewhat different internal representations. For example, we store formatting span starts and ends in a separate LocalList instead of inline with the text (which is stored in a CValueList). |
|
Unique Set of CRDTs |
Designing Data Structures for Collaborative Apps, Matthew Weidner. The Unique Set (U-Set) with non-CRDT values is from A Comprehensive Study of Convergent and Commutative Replicated Data Types, Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. |
If a value is deleted concurrently to an operation on that value, the delete “wins” and the operation is ignored. That “deleting” semantics comes from Yjs’s Y.Array shared type. |
|
Aggregator of a Multi-Value Register per value |
Eventually Consistent Register Revisited, Marek Zawirski, Carlos Baquero, Annette Bieniusa, Nuno Preguiça, and Marc Shapiro |
The default aggregator returns the conflicting value with least senderID. That is similar in spirit to a Last-Writer-Wins (LWW) Map - pick an arbitrary causally-maximal value - but it does not use an actual timestamp. |
|
Add-Wins Set / Observed-Remove Set |
An Optimized Conflict-free Replicated Set, Annette Bieniusa, Marek Zawirski, Nuno Preguiça, Marc Shapiro, Carlos Baquero, Valter Balegas, and Sérgio Duarte |
||
Aggregator of a Multi-Value Register |
Eventually Consistent Register Revisited, Marek Zawirski, Carlos Baquero, Annette Bieniusa, Nuno Preguiça, and Marc Shapiro |
The default aggregator returns the conflicting value with least senderID. That is similar in spirit to a Last-Writer-Wins (LWW) Register - pick an arbitrary causally-maximal value - but it does not use an actual timestamp. |