Algorithms

Replication Model

Collabs implements hybrid op-based/state-based Conflict-free Replicated Data Types (CRDTs). This means that you can share updates using both messages (op-based CRDT usage) and saved states (state-based CRDT usage), as described in Updates and Sync.

In a Collab implementation, send and receive correspond to op-based CRDT usage, while save and load correspond to state-based usage.

Built-in Collabs

Our built-in Collabs use CRDT algorithms from (or inspired by) various sources, shown in the table below.

Collab

Algorithm

References

Comments

CBoolean

Enable-Wins Flag, or Disable-Wins Flag with option { winner: false }

Pure Operation-based Replicated Data Types, Carlos Baquero, Paulo Sérgio Almeida, and Ali Shoker

CCounter

State-based PN-Counter

A Comprehensive Study of Convergent and Commutative Replicated Data Types, Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski

Op-based messages work similarly to the delta-CRDT Counter from Delta State Replicated Data Types, Paulo Sérgio Almeida, Ali Shoker, and Carlos Baquero.

CLazyMap

Riak Map without deletions

Riak Data Types: Maps. Similar CRDT-valued maps appear in Logic and Lattices for Distributed Programming, Neil Conway, William R. Marczak, Peter Alvaro, Joseph M. Hellerstein, and David Maier; LVars: Lattice-based Data Structures for Deterministic Parallelism, Lindsey Kuper and Ryan R. Newton; and Composition of State-based CRDTs, Carlos Baquero, Paulo Sérgio Almeida, Alcino Cunha and Carla Ferreira.

Like the Riak Map, there is a single value CRDT per key that implicitly always exists, all operations on that value are merged, and a key is present if and only if its value CRDT is in a nontrivial state (determined by Collab.canGC). Unlike the Riak map, we do not allow delete-key operations, which in Riak would perform an observed-reset operation on the value.

CList

List-with-Move

Moving Elements in List CRDTs, Martin Kleppmann

In addition to position, the register has an isPresent field, used for archive/restore. Also, we use our CSet instead of an Add-Wins Set.

CMap

CValueMap with values from a CSet

Inspired by how Yjs’s Y.Map handles values that are themselves shared types.

CMultiValueMap

Multi-Value Register per key

A Comprehensive Study of Convergent and Commutative Replicated Data Types, Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski

A key is present whenever its Multi-Value Register has nonzero conflicting values; the delete operation just clears the (causally prior) conflicting values. For state-based merging of a saved state, we delete conflicts based on a vector clock instead of tombstones, following An Optimized Conflict-free Replicated Set, Annette Bieniusa, Marek Zawirski, Nuno Preguiça, Marc Shapiro, Carlos Baquero, Valter Balegas, and Sérgio Duarte.

CObject

Similar to CLazyMap, but with a fixed set of keys (the child names).

See CLazyMap references.

We have not seen an identical “CRDT object” idea in the literature, but the idea and semantics are obvious (multiple independent CRDTs side-by-side, keyed by name).

CTotalOrder and LocalList (used in all list types)

Tree-Fugue List CRDT

The Art of the Fugue: Minimizing Interleaving in Collaborative Text Editing, Matthew Weidner, Joseph Gentle, and Martin Kleppmann.

Blog post: Fugue: A Basic List CRDT. As an optimization, we use “waypoints” as described in CTotalOrder’s doc header; those are inspired by Yjs’s items.

CRichText

Peritext

Peritext: A CRDT for Collaborative Rich Text Editing, Geoffrey Litt, Sarah Lim, Martin Kleppmann, and Peter van Hardenberg.

The semantics are substantially similar to Peritext’s, but we use somewhat different internal representations. For example, we store formatting span starts and ends in a separate LocalList instead of inline with the text (which is stored in a CValueList).

CSet

Unique Set of CRDTs

Designing Data Structures for Collaborative Apps, Matthew Weidner. The Unique Set (U-Set) with non-CRDT values is from A Comprehensive Study of Convergent and Commutative Replicated Data Types, Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski.

If a value is deleted concurrently to an operation on that value, the delete “wins” and the operation is ignored. That “deleting” semantics comes from Yjs’s Y.Array shared type.

CValueMap

Aggregator of a Multi-Value Register per value

Eventually Consistent Register Revisited, Marek Zawirski, Carlos Baquero, Annette Bieniusa, Nuno Preguiça, and Marc Shapiro

The default aggregator returns the conflicting value with least senderID. That is similar in spirit to a Last-Writer-Wins (LWW) Map - pick an arbitrary causally-maximal value - but it does not use an actual timestamp.

CValueSet

Add-Wins Set / Observed-Remove Set

An Optimized Conflict-free Replicated Set, Annette Bieniusa, Marek Zawirski, Nuno Preguiça, Marc Shapiro, Carlos Baquero, Valter Balegas, and Sérgio Duarte

CVar

Aggregator of a Multi-Value Register

Eventually Consistent Register Revisited, Marek Zawirski, Carlos Baquero, Annette Bieniusa, Nuno Preguiça, and Marc Shapiro

The default aggregator returns the conflicting value with least senderID. That is similar in spirit to a Last-Writer-Wins (LWW) Register - pick an arbitrary causally-maximal value - but it does not use an actual timestamp.