Explore Grammarly's Real-Time Collaborative Text Editor

This article was co-written by Front-End Software Engineers Oleksii Levzhynskyi and Anton Pets.

In this series of engineering blog posts, Grammarly’s Front-End team will share the core pieces of functionality that combine to make a powerful document editor in the cloud: the Grammarly Editor. With the Editor as an example, we’ll explain the approaches and protocols our client applications use to work with texts of different sizes and manage complex data flows under the hood, all while ensuring that our users get a seamless writing experience.

The Grammarly Editor is a web app where users can work on documents with quick access to view and accept Grammarly’s writing suggestions.

To tell the story, we’ll focus on the example of a hypothetical user, Harper, an economics student writing her essay in the Grammarly Editor. Editing text and accepting Grammarly suggestions may seem simple on the surface, but behind the scenes, propagating changes from the client to our servers and back can be complex. In this article, we will describe how text changes are communicated to the server through our protocol for collaborative editing. Because this protocol is flexible, it has remained the same even as the Editor (which was one of Grammarly’s first product offerings!) has undergone many changes over the years to use new front-end libraries and support new kinds of Grammarly suggestions.

How we handle text changes

Whenever Harper writes a part of her essay, we want to save it immediately on the server, keeping in mind issues like loss of internet connection or unexpected power outages. But we don’t want to overwhelm the server by sending the entire text whenever something changes—especially because Harper’s short essay may well become a massive thesis. So to lower the impact on bandwidth, we communicate only edit operations to the backend. This approach increases the complexity of the Grammarly Editor, the Grammarly browser extensions, and mobile keyboards, but minimizes the possibility of losing the writer’s text if there’s a network problem.

Client-server communication

For example, above, Harper is appending the word “shock” to “A supply”. Rather than sending the entire updated text to the server, the client will tell the server that “shock” was inserted at position 10.
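As a rough illustration of why sending an operation is cheaper than sending the text, the sketch below models a single insert and applies it to a plain string. The `InsertOp` shape and the `applyInsert` helper are hypothetical names for this example, not Grammarly's wire format, and the position is computed from the current text rather than hard-coded.

```typescript
// Hypothetical shape of an edit operation (an assumption for illustration).
interface InsertOp {
  position: number; // zero-based index into the text
  text: string;
}

// Apply an insert to a plain string and return the new text.
function applyInsert(doc: string, op: InsertOp): string {
  if (op.position < 0 || op.position > doc.length) {
    throw new RangeError(`position ${op.position} is outside the document`);
  }
  return doc.slice(0, op.position) + op.text + doc.slice(op.position);
}

const textBefore = "A supply ";
const insertOp: InsertOp = { position: textBefore.length, text: "shock" };
const textAfter = applyInsert(textBefore, insertOp);

// The operation stays small no matter how long the document gets,
// while the full text grows with the essay.
console.log(JSON.stringify(insertOp), textAfter);
```

The server only needs the operation and enough context to know which version of the text it applies to; it can then reproduce the same result on its own copy.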

Why do we need collaborative editing?

Collaborative editing allows multiple Grammarly Editor clients to make changes to the same document at once. This functionality lets Harper pull the essay up on her personal computer and her phone at the same time, and edits on either device will be reflected instantly across both clients. Or, as often happens (no one’s browser tab management is perfect!), she might open a few tabs containing the essay and switch between them throughout the day. With collaborative editing, Harper won’t have to wait for the page to reload, as changes are instantly synced across instances of the Grammarly client in each tab.

When several Grammarly Editor clients are making edits at once, the flow described in the previous section becomes trickier due to the possibility that clients might send changes that conflict with each other.

Two clients send conflicting texts to the server

To create a smooth experience for writers, we silently merge text from different clients without reloading the page. We also support optimistic text updates: the user sees their own changes immediately as they type, while the client-server synchronization happens in the background.

To satisfy all the requirements mentioned, we needed an algorithm to resolve conflicts between different Grammarly Editor clients and synchronize the changes so that every client ends up with the same text.

Different approaches to conflict resolution

Conflict happens when two or more clients change the same text in real time. There are two common protocols for collaborative editing: operational transformation (OT) and conflict-free replicated data type (CRDT).

Operational transformation defines edits as a sequence of granular operations like “insert” or “delete.” We have already shown examples of them in the previous sections. When there’s a conflict, the idea is to adjust (transform) the operations according to the changes that happened first. Since it’s essential to maintain a consistent order of operations, OT requires a central server to determine which changes should be applied first.

Conflict-free replicated data type (CRDT) was designed to resolve conflicts without a central server in order to support peer-to-peer communication among clients. Therefore, clients must be able to get to the same state by applying each other’s changes in any order. Information about changes is stored in the data itself. For example, one approach treats each character as having a state, and instead of deleting a character, marks its state as “deleted.” Xi-editor is an excellent example of a real-life implementation, and the team behind it wrote a detailed article about the usage and caveats of CRDT.
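The "mark as deleted" idea can be sketched in a few lines. This is a deliberately simplified illustration with hypothetical names (`CrdtChar`, `markDeleted`, `visibleText`). It leaves out how identifiers are generated and how concurrent inserts are ordered, which is where real CRDTs get complicated, and it does not describe how xi-editor is implemented.

```typescript
interface CrdtChar {
  id: string;       // globally unique; the "<client>:<counter>" scheme is an assumption
  value: string;
  deleted: boolean; // tombstone flag: the character stays in the data
}

// Deleting never removes an element; it only flips the flag.
function markDeleted(chars: CrdtChar[], id: string): CrdtChar[] {
  return chars.map((c) =>
    c.id === id ? { id: c.id, value: c.value, deleted: true } : c
  );
}

// What the user sees: every character that has not been tombstoned.
function visibleText(chars: CrdtChar[]): string {
  return chars.filter((c) => !c.deleted).map((c) => c.value).join("");
}

const word: CrdtChar[] = [
  { id: "a:1", value: "c", deleted: false },
  { id: "a:2", value: "a", deleted: false },
  { id: "a:3", value: "t", deleted: false },
];

// Two peers delete different characters; the order of application does not matter.
const oneOrder = markDeleted(markDeleted(word, "a:1"), "a:3");
const otherOrder = markDeleted(markDeleted(word, "a:3"), "a:1");
console.log(visibleText(oneOrder) === visibleText(otherOrder));
```

Because a delete refers to a stable identifier rather than a position, it commutes with other deletes, which is the property that lets peers apply changes in any order.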

The idea behind OT is simple and can be implemented with very little code. And unlike CRDT’s complex data structures, OT operations are easy to extend, for example with rich text formatting. Furthermore, we didn’t require peer-to-peer communication; we already needed a central server to store our data. For these reasons, we chose OT as our approach.

How operational transformation resolves conflicts

In some respects, conflict resolution with the OT protocol is analogous to version control with Git. The main difference is that OT resolves conflicts continuously in real time. Each client has its own “branch” with changes, and the server is the “main” branch. When a client communicates a change to the server, it’s similar to rebasing onto the main branch in Git. The server, in turn, communicates this change to other clients, essentially telling them to rebase and apply the change to their local content.

When we need to resolve conflicts in the Grammarly Editor, it’s often because a client is lagging behind changes that have happened elsewhere, due to a slow connection for instance. But to keep things simple, let’s look at what happens when text is changed at the same time across several clients. Though it’s an unlikely scenario, we’ll be able to walk through the OT algorithm without introducing additional complexity.

Conflict resolution when two clients modify the same text simultaneously

So, for the sake of our example, imagine that Harper is sitting at her desk with her iPhone and her computer in front of her, and she has her economics paper open in the Grammarly Editor on both devices. She’s editing a paragraph that begins with the phrase “a sudden”. On the mobile client, she enters “a sudden price”. Meanwhile, on the desktop client, she completes the phrase with “a sudden change”.

Both clients will send a message to the server describing their changes and wait for an acknowledgment. Until then, all new changes are buffered in the client’s local queue. The server controls the operation sequence and ensures that operations are applied one after another. It then broadcasts changes to all the clients. Such a “blocking” model guarantees that the order of operations is preserved.
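The "one change in flight, everything else buffered" behavior might look roughly like the sketch below. `OutgoingQueue`, `push`, and `acknowledge` are hypothetical names, and `Change` is a placeholder string. A real client would probably also merge buffered changes into a single one before sending, which this sketch does not do.

```typescript
type Change = string; // placeholder for a real edit operation

class OutgoingQueue {
  private inFlight: Change | null = null; // sent, waiting for acknowledgment
  private buffered: Change[] = [];        // made locally while waiting
  private send: (change: Change) => void;

  constructor(send: (change: Change) => void) {
    this.send = send;
  }

  // Called for every local edit.
  push(change: Change): void {
    if (this.inFlight === null) {
      this.inFlight = change;
      this.send(change);
    } else {
      this.buffered.push(change);
    }
  }

  // Called when the server acknowledges the in-flight change.
  acknowledge(): void {
    this.inFlight = null;
    const next = this.buffered.shift();
    if (next !== undefined) {
      this.inFlight = next;
      this.send(next);
    }
  }

  pendingCount(): number {
    return (this.inFlight === null ? 0 : 1) + this.buffered.length;
  }
}

const sentToServer: Change[] = [];
const outgoing = new OutgoingQueue((c) => { sentToServer.push(c); });

outgoing.push("first");  // sent immediately
outgoing.push("second"); // buffered: "first" is not acknowledged yet
outgoing.push("third");  // buffered
outgoing.acknowledge();  // "first" confirmed, "second" goes out
console.log(sentToServer, outgoing.pendingCount());
```

Since each client has at most one unacknowledged change at any time, the server never has to guess how a client's later edits relate to its earlier ones.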

Operational transformation resolves conflicts so that both clients end up with the same text. Here is a step-by-step explanation, following the diagram above:

1. The server adds “price” to the original text and broadcasts it to the desktop client.

2. The desktop client has “change” in its queue. It applies the incoming change “price” by doing a bidirectional rebase (rebasing the queue onto the incoming change, and vice versa) and assuming it happened before its own changes. The result is “a sudden price change”. This is probably the most complex step, and the actual rebase will be explained in Part Two of our series of articles about the Grammarly Editor.

3. The server applies changes from the desktop client. The “change” initially inserted at position 10 cannot be applied as is because the mobile client already changed the text. Instead, it’s rebased onto the changes from the mobile client. Then, the resulting change is broadcast to the mobile client.

4. Now the mobile client receives the already rebased insert operation “change” and applies it as is (the queue is empty, and there are no conflicts).

Since changes that come from the server are always applied with higher priority than local changes, and each client is allowed to send only one change at a time, Harper is guaranteed to see identical text—“a sudden price change”—reflected on her phone and desktop.
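The walkthrough above can be reproduced with a small sketch that handles only insert operations. The names `Insert`, `applyInsert`, and `transformInsert` are hypothetical, the trailing spaces in the texts are an assumption made so that the words come out separated, and the positions are computed from the text rather than hard-coded. This is not the rebase that Part Two describes, only the simplest case of the same idea.

```typescript
interface Insert {
  position: number;
  text: string;
}

function applyInsert(doc: string, op: Insert): string {
  return doc.slice(0, op.position) + op.text + doc.slice(op.position);
}

// Rebase `op` so that it can be applied after `applied` has already changed the text.
// `appliedWins` breaks the tie when both inserts target the same position.
function transformInsert(op: Insert, applied: Insert, appliedWins: boolean): Insert {
  const shifts =
    applied.position < op.position ||
    (applied.position === op.position && appliedWins);
  return shifts
    ? { position: op.position + applied.text.length, text: op.text }
    : op;
}

const base = "a sudden ";
const mobile: Insert = { position: base.length, text: "price " };
const desktop: Insert = { position: base.length, text: "change" };

// Steps 1 and 3: the server applies mobile's change first, then rebases desktop's onto it.
const desktopRebased = transformInsert(desktop, mobile, true);
const serverText = applyInsert(applyInsert(base, mobile), desktopRebased);

// Step 2: the desktop already shows its own change; the incoming server change
// has priority, so the local insert does not push it to the right.
const desktopText = applyInsert(
  applyInsert(base, desktop),
  transformInsert(mobile, desktop, false)
);

// Step 4: the mobile client applies the already rebased change as is.
const mobileText = applyInsert(applyInsert(base, mobile), desktopRebased);

console.log(serverText, desktopText === serverText, mobileText === serverText);
```

The tie-break flag is what encodes "server changes win": without it, two inserts at the same position could be ordered differently on different clients, and the texts would diverge.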

Architecture

Below, you can see a simplified diagram of our front-end. To process text changes, we need the following:

The text editor itself
A module that rebases incoming and outgoing changes. It uses a buffer so that we can avoid sending changes on each keystroke.
A queue that keeps track of the changes. Those changes are “blocking” until we receive the required acknowledgments from the server. However, we still optimistically show those changes to the user even before the acknowledgments are received.

A simplified diagram of the modules used in text synchronization

Server features

Our client-server communication happens via WebSocket. The crucial role of the server is to synchronize text among clients to ensure that everyone ends up with the same content. Moreover, by having a central server in place, we benefit from some additional features.

First, the server can be used as a source of truth. The client will be restored to the server’s state if something goes awry, eliminating the possibility of losing or corrupting the text.

Secondly, if we want to show a document revision history to the user, our work is mostly already done for us. Each change sent to the server is a new revision, and we just need to store these revisions on the backend and mirror them to the front-end.

Each change is a revision.
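To illustrate why revision history comes almost for free, here is a minimal sketch that rebuilds the text at any revision by replaying stored changes. The `Revision` shape (a position, a delete count, and inserted text) and the `textAtRevision` helper are assumptions for this example and are simpler than what a real backend would store.

```typescript
interface Revision {
  rev: number;         // revision number assigned by the server (assumed 1-based)
  position: number;    // where the change starts
  deleteCount: number; // characters removed at that position
  insertText: string;  // characters added at that position
}

function applyRevision(doc: string, r: Revision): string {
  return (
    doc.slice(0, r.position) + r.insertText + doc.slice(r.position + r.deleteCount)
  );
}

// Rebuild the text as it looked at `targetRev` by replaying the log from an empty document.
function textAtRevision(log: Revision[], targetRev: number): string {
  return log
    .filter((r) => r.rev <= targetRev)
    .sort((a, b) => a.rev - b.rev)
    .reduce(applyRevision, "");
}

const history: Revision[] = [
  { rev: 1, position: 0, deleteCount: 0, insertText: "A supply shock" },
  { rev: 2, position: 9, deleteCount: 5, insertText: "disruption" },
];

console.log(textAtRevision(history, 1));
console.log(textAtRevision(history, 2));
```

Replaying from the start gets slower as the log grows, so a production system would likely keep periodic snapshots; that optimization is outside the scope of this sketch.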

Thanks to the revision system, clients can resend the same data many times and are guaranteed to have the correct text. Consequently, we can provide a good user experience even when the writer is on an unstable network. Whenever a connection is lost and then restored, the content will be synced with the server.

How we express text changes

Now, let’s see how all text changes are actually represented in our client-server protocol. We use QuillJS as a basis for the Grammarly Editor. It utilizes the Delta format for text changes, which is expressive, readable, and extensible enough to fit most of our needs.

Three fundamental operations can represent any type of text change:

insert for adding new text (possibly with formatting)
delete for removing text
retain for stating that some operation should be applied starting at a specific position (by default, operations start at the beginning of the text). It is also used to add rich text formatting.

For example, to express removing boldface from “supply shock” and changing the phrase to “supply disruption” (non-boldface), the following Delta could be used:

```
[{retain: 7, attributes: { bold: false }}, {insert: 'disruption'}, {delete: 5}]
```
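To see how the three operations interact, the sketch below applies such a list of operations to a plain string. It is a hand-written, minimal interpreter that ignores attributes and does not use the Quill or quill-delta libraries; `DeltaOp` and `applyDeltaToText` are hypothetical names.

```typescript
// Minimal subset of the operation shapes; attributes are accepted but ignored here.
type DeltaOp =
  | { retain: number; attributes?: Record<string, unknown> }
  | { insert: string; attributes?: Record<string, unknown> }
  | { delete: number };

function applyDeltaToText(text: string, ops: DeltaOp[]): string {
  let cursor = 0; // position in the original text
  let result = "";
  for (const op of ops) {
    if ("retain" in op) {
      // Keep the next `retain` characters unchanged.
      result += text.slice(cursor, cursor + op.retain);
      cursor += op.retain;
    } else if ("insert" in op) {
      // New text does not consume anything from the original.
      result += op.insert;
    } else {
      // Skip over the deleted characters.
      cursor += op.delete;
    }
  }
  // Anything after the last operation is implicitly retained.
  return result + text.slice(cursor);
}

const exampleDelta: DeltaOp[] = [
  { retain: 7, attributes: { bold: false } },
  { insert: "disruption" },
  { delete: 5 },
];

console.log(applyDeltaToText("supply shock", exampleDelta));
```

Reading the operations left to right: the first seven characters ("supply ") are kept, "disruption" is inserted, and the five characters of "shock" are skipped.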

Delta expresses what should be changed in the text. But to understand the order in which these changes should be applied, each operation that we send to the server is tagged with the revision number. We know we need to resolve conflicts when several clients modify text with the same revision number.
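A minimal sketch of how revision tags reveal a conflict is shown below. The `ChangeMessage` shape, the `baseRevision` field name, and the `RevisionLog` class are assumptions for illustration; the sketch only detects which accepted changes an incoming one has not seen and leaves the actual rebase out.

```typescript
interface ChangeMessage {
  baseRevision: number; // revision of the text the client was editing (assumed field name)
  delta: object[];      // Delta operations, treated as opaque here
}

class RevisionLog {
  private accepted: ChangeMessage[] = [];

  currentRevision(): number {
    return this.accepted.length;
  }

  // Changes the server accepted after the revision the client was working from.
  // An empty result means there is nothing to rebase onto.
  concurrentWith(msg: ChangeMessage): ChangeMessage[] {
    return this.accepted.slice(msg.baseRevision);
  }

  // Store the change and return the new revision number.
  accept(msg: ChangeMessage): number {
    this.accepted.push(msg);
    return this.currentRevision();
  }
}

const revisionLog = new RevisionLog();
const baseLength = "a sudden ".length;

// Both clients edited the same revision of the text.
const mobileMsg: ChangeMessage = {
  baseRevision: 0,
  delta: [{ retain: baseLength }, { insert: "price " }],
};
const desktopMsg: ChangeMessage = {
  baseRevision: 0,
  delta: [{ retain: baseLength }, { insert: "change" }],
};

const conflictsForMobile = revisionLog.concurrentWith(mobileMsg).length;
const revisionAfterMobile = revisionLog.accept(mobileMsg);
const conflictsForDesktop = revisionLog.concurrentWith(desktopMsg).length;

console.log(conflictsForMobile, revisionAfterMobile, conflictsForDesktop);
```

In this model, the desktop change arrives tagged with a revision the server has already moved past, so the server knows it must be rebased onto the mobile change before it can be accepted.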

If you want to know more about Delta, you can check out this great article: Designing the Delta Format

Conclusion

The OT protocol and Delta format are our basis for real-time collaboration in the Grammarly Editor. Together, they ensure that even when users are working on text across multiple devices or browser windows, their changes stay synchronized and the text is not lost or corrupted.

We’ve talked about how we support collaborative text editing in general. But how do we express and apply changes that come from Grammarly’s writing suggestions? In Part Two of our series, we’ll explain this, as well as show you how some advanced features of the Grammarly Editor work.

If you’re interested in working on a text editor and many other user-facing features that impact 30 million people and 30,000 professional teams every day, Grammarly’s Front-End team is hiring! Check out our open roles here.

Under the Hood of the Grammarly Editor, Part One: Real-Time Collaborative Text Editing