Beyond the Latency: Building Scalable Collaborative Apps with Yjs and WebSockets
Stop fighting merge conflicts in your database. Learn how to implement Conflict-free Replicated Data Types (CRDTs) with WebSockets to build seamless, Figma-like collaboration in your React apps.

The Death of Last-Write-Wins
You've just spent three weeks building a "real-time" dashboard, only to realize that when two users edit the same text field simultaneously, one person's changes simply vanish. Last-write-wins is not a strategy; it's a bug that erodes user trust. In the past, we tried to solve this with Operational Transformation (OT), the tech behind Google Docs, but OT is notoriously difficult to implement and, in most deployments, relies on a central server to order every single keystroke. If the server goes down or the network jitters, the whole experience falls apart.
In 2026, we don't do that anymore. We use Conflict-free Replicated Data Types (CRDTs). CRDTs allow state to diverge on multiple clients and automatically converge to the exact same state without a central arbiter. They are the foundation of "local-first" software. I've spent the last two years moving our production collaborative tools from basic WebSocket events to a robust CRDT-based architecture using Yjs and Hocuspocus, and the difference in reliability is night and day.
Why CRDTs Matter Now
The shift from 'cloud-first' to 'local-first' is driven by user expectation. Users want Figma-level responsiveness. They want to work offline on a train and have their changes merge perfectly when they hit Wi-Fi at the station. Standard REST APIs or simple WebSocket message-passing (e.g., socket.emit('update', data)) cannot handle this. If User A deletes a paragraph while User B is fixing a typo in it, a simple message-passing system will likely crash or leave clients with corrupted, diverging state. CRDTs treat data as a mathematical structure whose merge operation is commutative, associative, and idempotent. In plain English: updates can arrive in any order, or even more than once, and everyone still ends up with the same result.
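To make those three properties concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter. It is a teaching example only and not how Yjs represents documents internally; the function names and replica IDs are made up for illustration.

```javascript
// A grow-only counter (G-Counter): each replica only ever increments its own slot.
function increment(counter, replicaId, amount = 1) {
  return { ...counter, [replicaId]: (counter[replicaId] ?? 0) + amount };
}

// Merge keeps the per-replica maximum. Taking a maximum is commutative,
// associative, and idempotent, so the merge is too.
function merge(a, b) {
  const merged = { ...a };
  for (const [replicaId, count] of Object.entries(b)) {
    merged[replicaId] = Math.max(merged[replicaId] ?? 0, count);
  }
  return merged;
}

// The counter's value is the sum of all replica slots.
function value(counter) {
  return Object.values(counter).reduce((sum, n) => sum + n, 0);
}

// Two replicas diverge while offline...
const alice = increment(increment({}, 'alice'), 'alice'); // alice's slot: 2
const bob = increment({}, 'bob');                         // bob's slot: 1

// ...and converge regardless of merge order or repeated delivery.
const ab = merge(alice, bob);
const ba = merge(bob, alice);
const twice = merge(ab, bob); // bob's state delivered a second time

console.log(value(ab), value(ba), value(twice)); // 3 3 3
```

Text and list CRDTs such as the ones in Yjs are far more involved, but they offer the same guarantee: merging is safe in any order and any number of times.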
The Architecture: Yjs, Hocuspocus, and React
For production systems, Yjs is the gold standard for CRDT libraries in the JavaScript ecosystem. It's incredibly fast, has a tiny footprint, and supports shared types like Maps, Arrays, and Text. However, Yjs is just the logic layer. You still need a way to transport those updates and persist them. This is where Hocuspocus comes in. Hocuspocus is a suite of tools built by überdosis, the team behind Tiptap, that acts as a backend for Yjs, handling WebSocket connections, authentication, and database persistence out of the box.
1. The Backend: Setting up Hocuspocus
You don't want to manage raw WebSocket heartbeats. Hocuspocus wraps the ws library and provides a hook-based API for intercepting document lifecycle events. In a production environment, you'll want to persist your Yjs binary blobs to a database (like Postgres or SQLite) so that when the last user leaves a room, the data isn't lost.
import { Hocuspocus } from '@hocuspocus/server';
import { SQLite } from '@hocuspocus/extension-sqlite';
import { Logger } from '@hocuspocus/extension-logger';

const server = new Hocuspocus({
  port: 1234,
  extensions: [
    new Logger(),
    new SQLite({
      database: 'db.sqlite',
    }),
  ],
  async onAuthenticate(data) {
    const { token } = data;
    // Implement your JWT check here
    if (token !== 'valid-token') {
      throw new Error('Unauthorized');
    }
  },
  async onStoreDocument(data) {
    // This is where you can trigger webhooks or side effects
    console.log(`Document ${data.documentName} saved.`);
  },
});

server.listen();
2. The Frontend: Integrating with React
On the client side, you need to connect your local Yjs document to the WebSocket provider. The beauty of Yjs is that it doesn't care about your UI framework, but we can use hooks to make it reactive. Here is a simplified implementation using HocuspocusProvider from @hocuspocus/provider.
import React, { useEffect, useState, useMemo } from 'react';
import * as Y from 'yjs';
import { HocuspocusProvider } from '@hocuspocus/provider';

export const CollaborativeEditor = ({ docId, token }) => {
  const [status, setStatus] = useState('connecting');
  const [notes, setNotes] = useState([]);

  // 1. Initialize the Yjs Document.
  // A fresh Y.Doc per docId ensures switching rooms never merges two documents.
  const ydoc = useMemo(() => new Y.Doc(), [docId]);

  useEffect(() => {
    // 2. Setup the Provider
    const provider = new HocuspocusProvider({
      url: 'ws://localhost:1234',
      name: docId,
      document: ydoc,
      token,
      onStatus: ({ status }) => setStatus(status),
    });

    // 3. Mirror the shared array into React state whenever it changes
    const sharedArray = ydoc.getArray('notes');
    const syncNotes = () => setNotes(sharedArray.toArray());
    sharedArray.observe(syncNotes);
    syncNotes();

    return () => {
      sharedArray.unobserve(syncNotes);
      provider.destroy();
    };
  }, [ydoc, docId, token]);

  const addNote = () => {
    const sharedArray = ydoc.getArray('notes');
    sharedArray.push([{ content: 'New Note', timestamp: Date.now() }]);
  };

  return (
    <div>
      <p>Status: {status}</p>
      <button onClick={addNote}>Add Note</button>
      <ul>
        {notes.map((note, index) => (
          <li key={index}>{note.content}</li>
        ))}
      </ul>
    </div>
  );
};
Scaling to Thousands of Users
Once your concurrent connections and active documents outgrow a single Node.js process, you have to scale horizontally. WebSockets are stateful and memory-intensive, so you can't simply add stateless replicas behind a load balancer. In 2026, the standard approach is to use a Redis-backed pub/sub system to synchronize multiple Hocuspocus nodes.
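The fan-out pattern behind that approach can be sketched without Redis at all. In the illustration below, FakeBus is an in-memory stand-in for a Redis channel and ServerNode is a hypothetical server process; neither is part of Hocuspocus, and a real deployment would rely on its Redis extension rather than code like this.

```javascript
// In-memory stand-in for a Redis pub/sub channel (hypothetical, for illustration).
class FakeBus {
  constructor() {
    this.subscribers = new Set();
  }
  subscribe(handler) {
    this.subscribers.add(handler);
  }
  publish(message) {
    for (const handler of this.subscribers) handler(message);
  }
}

// Hypothetical server process that tracks its locally connected clients.
class ServerNode {
  constructor(id, bus) {
    this.id = id;
    this.bus = bus;
    this.clients = [];
    // Relay updates that originated on OTHER nodes to local clients.
    bus.subscribe(({ origin, update }) => {
      if (origin !== this.id) this.deliver(update);
    });
  }
  connect(client) {
    this.clients.push(client);
  }
  deliver(update, except = null) {
    for (const client of this.clients) {
      if (client !== except) client.received.push(update);
    }
  }
  // Called when a locally connected client sends an update.
  handleClientUpdate(sender, update) {
    this.deliver(update, sender);                  // local fan-out
    this.bus.publish({ origin: this.id, update }); // cross-node fan-out
  }
}

const bus = new FakeBus();
const server1 = new ServerNode('server-1', bus);
const server2 = new ServerNode('server-2', bus);

const userA = { name: 'A', received: [] };
const userB = { name: 'B', received: [] };
server1.connect(userA);
server2.connect(userB);

// User A edits on server 1; user B, on server 2, still receives the update.
server1.handleClientUpdate(userA, 'update-1');
console.log(userB.received.length, userA.received.length);
```

Because CRDT updates are idempotent, a relay like this does not need exactly-once delivery; a duplicated update is harmless.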
When User A connects to Server 1 and User B connects to Server 2, they both need to see each other's updates. Hocuspocus provides a Redis extension that handles this automatically. However, watch out for memory usage. Every Yjs document is stored in memory while it's active. If you have 10,000 active documents, you'll need significant RAM. We found that for documents with ~5,000 nodes (elements), the memory overhead was roughly 15 MB per document, which works out to about 150 GB across 10,000 active documents. Plan your Kubernetes clusters accordingly.
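A quick back-of-the-envelope calculation helps turn that figure into a cluster size. The sketch below uses the 15 MB per document measurement from above; the 30% headroom factor and the 8 GB of usable memory per pod are illustrative assumptions, not recommendations.

```javascript
// Rough RAM estimate for active Yjs documents.
// headroom and podMemoryGb are assumed values chosen only for illustration.
function estimateCluster({ activeDocs, mbPerDoc, headroom, podMemoryGb }) {
  const totalGb = (activeDocs * mbPerDoc * headroom) / 1024;
  return { totalGb, pods: Math.ceil(totalGb / podMemoryGb) };
}

const plan = estimateCluster({
  activeDocs: 10_000,
  mbPerDoc: 15,      // measured for documents with ~5,000 nodes
  headroom: 1.3,     // assumption: 30% spare capacity for spikes
  podMemoryGb: 8,    // assumption: usable memory per pod
});

console.log(`${plan.totalGb.toFixed(0)} GB total, ${plan.pods} pods`);
```

With these inputs the estimate comes to roughly 190 GB, or 24 pods. Swap in your own measurements, since per-document overhead varies a lot with document size and edit history.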
The Gotchas: What the Docs Don't Tell You
- History Bloat: CRDTs store the history of all changes to ensure they can merge. If you have a document that lives for years with millions of edits, the binary blob will grow. You need a strategy for "snapshotting" or clearing history periodically, though Yjs is surprisingly efficient at garbage collecting deleted items.
- Binary Encoding: Yjs updates are Uint8Array blobs. If you try to send these as JSON over a standard REST endpoint, they will be corrupted unless you base64 encode them. Always use binary-aware transports.
- Awareness Latency: "Presence" features (showing where someone's cursor is) shouldn't be persisted to the database. Use the Yjs Awareness protocol which keeps this data in-memory and volatile. If a user disconnects, their cursor should simply vanish, not wait for a database timeout.
- Security: Hocuspocus's onAuthenticate is vital. Never allow clients to name their own documents without server-side validation, or a malicious user could join any document by guessing its ID.
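The binary encoding gotcha is easy to reproduce. The sketch below uses a hand-written Uint8Array as a stand-in for a real Yjs update (which you would normally get from Y.encodeStateAsUpdate) and relies on Node's Buffer for base64; in the browser you would need a different base64 helper.

```javascript
// Stand-in bytes for a Yjs update; real ones come from Y.encodeStateAsUpdate(doc).
const update = new Uint8Array([0, 1, 128, 200, 255]);

// Naive JSON serialization turns the typed array into a plain object keyed by
// index, so the receiver no longer gets a Uint8Array back.
const naive = JSON.parse(JSON.stringify({ update }));
console.log(naive.update instanceof Uint8Array); // false

// Base64 keeps the bytes intact inside a JSON string.
const payload = JSON.stringify({
  update: Buffer.from(update).toString('base64'),
});

// On the receiving side, decode back into a Uint8Array before applying it.
const restored = new Uint8Array(Buffer.from(JSON.parse(payload).update, 'base64'));
const identical =
  restored.length === update.length &&
  restored.every((byte, i) => byte === update[i]);
console.log(identical); // true
```

Base64 inflates the payload by about a third, which is one more reason to prefer a binary-aware transport such as WebSockets when you can.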
Takeaway
Stop thinking about 'sending messages' between clients. Instead, think about 'synchronizing a shared data structure'. If you're building any feature where two people might touch the same pixel, switch to Yjs and Hocuspocus today. Start by migrating one small shared state (like a list of active users or a single text field) before moving your entire application state to a CRDT.