JSON-delta is a multi-language software suite for computing deltas between JSON-serialized data structures, and applying those deltas as patches. It enables separate programs at either end of a communications channel (e.g. client and server over HTTP, or two processes talking to one another using bidirectional IPC) to manipulate a data structure while minimizing communications overhead.
Consider the example JSON-LD entry for John Lennon from http://json-ld.org/:
{
"@context": "http://json-ld.org/contexts/person.jsonld",
"@id": "http://dbpedia.org/resource/John_Lennon",
"name": "John Lennon",
"born": "1940-10-09",
"spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
}
Suppose we have a piece of software that updates this record to show his date of death, like so:
{
"@context": "http://json-ld.org/contexts/person.jsonld",
"@id": "http://dbpedia.org/resource/John_Lennon",
"name": "John Lennon",
"born": "1940-10-09",
"died": "1980-12-07",
"spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
}
Further suppose that we wish to communicate this update to another piece of software whose only job is to store information about John Lennon in JSON-LD format. (Yes, I know this is getting unlikely, but stay with me.) If this Lennon-record-keeper accepts updates in json-delta format, all we have to do is send the following over the wire:
[[["died"], "1980-12-07"]]
This is a complete diff in json-delta format. It is itself a JSON-serializable data structure: specifically, it is a sequence of what I refer to as diff stanzas for some reason. The format for a diff stanza is [<key path>, (<update>)]. (The parentheses mean that the <update> part is optional; I’ll get to that in a minute.) A key path is a sequence of keys specifying where in the data structure the node you want to alter is found, much like those emitted by JSON.sh. The stanza may be thought of as an instruction to update the node found at that path so that its content is equal to <update>.
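To make the idea of "follow the key path, then set the node" concrete, here is a minimal Python sketch. It is not the code shipped in the json-delta package: the function name `apply_update` is made up for illustration, and it only covers stanzas that carry an update.

```python
import copy


def apply_update(struc, stanza):
    """Return a copy of struc with the node at the stanza's key path set to its update.

    Hypothetical helper for illustration; handles [<key path>, <update>] stanzas only.
    """
    key_path, update = stanza
    if not key_path:
        # An empty key path addresses the whole structure.
        return update
    result = copy.deepcopy(struc)
    node = result
    # Walk down to the parent of the node named by the key path.
    for key in key_path[:-1]:
        node = node[key]
    node[key_path[-1]] = update
    return result


record = {"name": "John Lennon", "born": "1940-10-09"}
diff = [[["died"], "1980-12-07"]]
for stanza in diff:
    record = apply_update(record, stanza)
print(record)
```

The original record is copied rather than modified in place, which is a choice made for this sketch and not a requirement of the format.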
Now, let’s do some more supposing. Suppose the software we’re communicating with is dedicated to storing information about the Beatles in general. Also, suppose we’ve remembered that it was actually on the 8th of December 1980 that John Lennon died, not the 7th. Finally, suppose we live in an Orwellian dystopia, and Cynthia Lennon has been declared a non-person who must be expunged from all records. Unfortunately, json-delta is incapable of overthrowing corrupt and despotic governments, so let’s make one last supposition, that what we’re interested in is updating the record kept by the software on the other end of the wire, which looks like this:
[
{
"@context": "http://json-ld.org/contexts/person.jsonld",
"@id": "http://dbpedia.org/resource/John_Lennon",
"name": "John Lennon",
"born": "1940-10-09",
"died": "1980-12-07",
"spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
},
{"name": "Paul McCartney"},
{"name": "George Harrison"},
{"name": "Ringo Starr"}
]
(Allegations of bias in favor of specific Beatles on the part of the maintainer of this record are punished by the aforementioned despotic government. All glory to Arstotzka!)
To make the changes we’ve decided on (correcting John’s date of death, and expunging Cynthia Lennon from the record), we need to send the following sequence:
[
[[0, "died"], "1980-12-08"],
[[0, "spouse"]]
]
Now, of course, you see what I meant when I said I’d get to why <update> is optional. If a stanza includes no update material, it is interpreted as an instruction to delete the node the key path points to.
Note also that there is no difference between a stanza that adds a node, and one that changes one.
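The stanza semantics described so far (set the node when an update is present, delete it when there is none) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the library's own implementation: the name `patch_sketch` is hypothetical, and the sketch does not attempt special cases such as appending past the end of an array.

```python
import copy


def patch_sketch(struc, diff):
    """Apply a sequence of diff stanzas to a copy of struc and return the result.

    Hypothetical helper: a one-element stanza deletes, a two-element stanza sets.
    """
    result = copy.deepcopy(struc)
    for stanza in diff:
        key_path = stanza[0]
        if not key_path:
            # An empty key path means "replace the whole structure".
            result = stanza[1]
            continue
        parent = result
        for key in key_path[:-1]:
            parent = parent[key]
        last = key_path[-1]
        if len(stanza) == 1:
            del parent[last]          # no update material: delete the node
        else:
            parent[last] = stanza[1]  # adding and changing look the same
    return result


beatles = [
    {
        "name": "John Lennon",
        "died": "1980-12-07",
        "spouse": "http://dbpedia.org/resource/Cynthia_Lennon",
    },
    {"name": "Paul McCartney"},
    {"name": "George Harrison"},
    {"name": "Ringo Starr"},
]
diff = [
    [[0, "died"], "1980-12-08"],
    [[0, "spouse"]],
]
patched = patch_sketch(beatles, diff)
print(patched[0])
```

Because setting a key on a Python dict works whether or not the key already exists, the same branch covers both adding and changing a node.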
The intention is to save as much communications bandwidth as possible without sacrificing the ability to communicate arbitrary modifications to the data structure (this format can be used to describe a change from any JSON-serialized object into any other). The worst-case scenario, where there is no commonality between the two structures, is that the protocol adds seven octets of overhead, because a diff can always be expressed as [[[],<target>]], meaning “substitute <target> for the data structure that is to be modified”.
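The seven-octet figure is easy to check by hand: the wrapper contributes `[[[],` in front of the target and `]]` after it. The Python sketch below does the same count, assuming both the target and the diff are serialized without optional whitespace; `wrap_as_diff` and `compact` are names invented for this example.

```python
import json


def wrap_as_diff(target):
    """Build the worst-case diff: replace the whole structure with target."""
    return [[[], target]]


def compact(value):
    """Serialize to JSON with no optional whitespace."""
    return json.dumps(value, separators=(",", ":"))


target = {"name": "Ringo Starr"}
plain = compact(target).encode("utf-8")
wrapped = compact(wrap_as_diff(target)).encode("utf-8")
overhead = len(wrapped) - len(plain)
print(overhead)  # 7
```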
JSON-delta is my language learning project: whenever I decide to learn a new programming language, I do so by implementing JSON-delta in it. As such, there are five implementations, of varying degrees of fullness, available:
Language | Patch | Diff | Compact | U-patch | U-diff |
---|---|---|---|---|---|
Python 2 | ✓ | ✓ | ✓ | ✓ | ✓ |
Python 3 | ✓ | ✓ | ✓ | ✓ | ✓ |
Javascript | ✓ | ✓ | ✗ | ✗ | ✗ |
Racket | ✓ | ✓ | ✗ | ✗ | ✗ |
Perl | (sort of) | ✗ | ✗ | ✗ | ✗ |
As you may be able to guess, I’m most comfortable programming in Python… Anyway, here’s what those enigmatic column headers mean:
I initially developed JSON-delta to work with a Django web-app, allowing client-side Javascript to manipulate a data structure exposed by the server, so the Javascript and Python implementations are the most robust.
Development of the Perl version stalled when I discovered that, due to Perl not having as ramified a type ontology as Javascript, numeric values do not round-trip cleanly. This meant that there was nothing I could do to make my test suite pass. This made me sad.
The Python implementation (compatible with version 2.7 and later, including 3) is available from PyPI.
You can download the Javascript versions here (full, minified).
(Please, for the sake of my bandwidth bills, serve your own copy rather than hot-linking mine!)
The Racket and Perl implementations are very alpha (remember, they were written by a beginner!). If you want to check them out, I recommend looking at the source repo (no pun intended): development of json-delta takes place against a master repository containing all implementations.