rumor-oriented programming
Kragen Sitaker
kragen@pobox.com
Sun, 25 Jan 2004 18:15:59 -0500 (EST)
On rumor-based programming
==========================
Suppose we want to build a distributed application with automatic
change synchronization. Here's a persistence system with coordination
functions somewhat similar to mod_pubsub or Linda, but specifically
designed for replicating the state of an application.
rumorsets
---------
A rumorset is kind of like a newsgroup, a KnowNow topic, a SQL table,
a filesystem directory, or the first item of a Linda tuple. Different
application instances (say, on your server, your desktop, your laptop,
and your cellphone) are members of the same rumorset; an
application-independent protocol (like NNTP) propagates rumors on that
rumorset back and forth, until each application instance has all the
rumors in that rumorset. Whether connected or disconnected, your
application can post rumors to its local copy of the rumorset, and
those rumors get propagated back to other instances of the application
when it reconnects.
The order of rumors in a rumorset is not well-defined.
So the synchronization primitive is application-independent and very,
very simple.
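A minimal sketch of that primitive, assuming each local copy is just a mapping from rumor ID to rumor and that synchronization is a two-way set union. The class and method names (`Rumorset`, `post`, `sync_with`) and the use of UUIDs for IDs are illustrative assumptions, not part of any existing protocol.

```python
import uuid


class Rumorset:
    """A local copy of a rumorset: an unordered, append-only set of rumors."""

    def __init__(self):
        # rumor ID -> rumor (a dict of name-value pairs); no ordering is implied.
        self.rumors = {}

    def post(self, fields):
        """Post a rumor locally; this works whether or not we are connected."""
        rumor = dict(fields)
        # Assumption: a random UUID is unique enough to serve as the rumor ID.
        rumor.setdefault("rumorid", uuid.uuid4().hex)
        # Rumors are never changed once posted, so an existing ID wins.
        self.rumors.setdefault(rumor["rumorid"], rumor)
        return rumor["rumorid"]

    def sync_with(self, other):
        """Exchange rumors until both copies hold the union of the two."""
        for rumorid, rumor in other.rumors.items():
            self.rumors.setdefault(rumorid, rumor)
        for rumorid, rumor in self.rumors.items():
            other.rumors.setdefault(rumorid, rumor)


laptop, server = Rumorset(), Rumorset()
a = laptop.post({"text": "written while offline"})
b = server.post({"text": "written on the server"})
laptop.sync_with(server)  # after reconnecting
```

Because the merge is a plain union keyed by rumor ID, it does not matter who syncs with whom, or in what order.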
You can't change or delete rumors once they're posted, just like in
real life. But you can program your application to ignore some
rumors, depending on other rumors that may have been posted.
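One way an application might do that, as a sketch: a later rumor names an earlier one it wants ignored. The field name `retracts` is a hypothetical convention chosen for this example; nothing above prescribes it.

```python
def visible_rumors(rumors):
    """Return the rumors the application should pay attention to.

    A rumor is ignored if some other rumor carries a (hypothetical)
    "retracts" field naming its rumor ID. Retraction rumors themselves
    are bookkeeping and are not shown either.
    """
    rumors = list(rumors)
    retracted = {r["retracts"] for r in rumors if "retracts" in r}
    return [
        r for r in rumors
        if "retracts" not in r and r["rumorid"] not in retracted
    ]


rumors = [
    {"rumorid": "a", "text": "lunch is at noon"},
    {"rumorid": "b", "text": "lunch is at one"},
    {"rumorid": "c", "retracts": "a"},
]
current = visible_rumors(rumors)
```

Nothing is deleted: rumor "a" is still in the rumorset, and the result is the same whatever order the rumors arrived in.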
An application might use several rumorsets. This provides a way for a
second application to read data originally created by a first
application, and store its own data as well, without introducing
foreign data into the stream seen by the first application. (Sorry
for the patentese.)
It's important that the application only use rumorsets the user tells
it to use, but it would be burdensome for the user to have to specify
several rumorsets when they correspond to application-internal
concepts. So I think it desirable for the rumorset namespace to be
hierarchical, rather than flat.
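As a sketch of how a hierarchical namespace could serve both goals: the user names one root rumorset, and the application derives names for its internal rumorsets beneath that root. The "/" separator and the names `ScopedNamespace`, `child`, and `is_within` are assumptions made for illustration.

```python
def is_within(granted_root, name):
    """True if rumorset `name` is the granted root or lies beneath it."""
    root = granted_root.rstrip("/")
    return name == root or name.startswith(root + "/")


class ScopedNamespace:
    """Derives application-internal rumorset names under a user-chosen root."""

    def __init__(self, granted_root):
        self.root = granted_root.rstrip("/")

    def child(self, *parts):
        for part in parts:
            # Reject components that could escape or confuse the hierarchy.
            if not part or "/" in part or part in (".", ".."):
                raise ValueError("bad rumorset name component: %r" % (part,))
        return "/".join((self.root,) + parts)


ns = ScopedNamespace("kragen/addressbook")  # the one name the user supplied
contacts = ns.child("contacts")
groups = ns.child("groups")
```

The user specifies a single name, and the application can still be checked against it with `is_within` before it touches any rumorset.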
The primary reason for having multiple rumorsets is to keep different
applications from interfering with each other.
rumors
------
A rumor is a set of name-value pairs, where both names and values are
flat strings. I'd make rumors just be flat strings if I could, but I
think upward compatibility requires name-value pairs.
Each rumor has a globally unique rumor ID, the value attached to the
name "rumorid", which the application instances use in the
synchronization protocol. It can also be used in
I'm not sure whether I should allow multiple values with the same name
in a rumor. It would be a more natural representation of some data,
and in particular, it would allow join queries to represent some
things more naturally, but it also means a more complicated
programming interface.
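To make the trade-off concrete, here is a sketch of both representations, reading the open question as whether a name may appear more than once in a rumor. The helper names `make_rumor` and `values_of`, and the use of a random UUID for "rumorid", are assumptions for the example.

```python
import uuid


def make_rumor(fields):
    """Build a single-valued rumor: a dict mapping flat strings to flat strings."""
    for name, value in fields.items():
        if not isinstance(name, str) or not isinstance(value, str):
            raise TypeError("rumor names and values must be flat strings")
    rumor = dict(fields)
    # Assumption: a random UUID stands in for a globally unique rumor ID.
    rumor.setdefault("rumorid", uuid.uuid4().hex)
    return rumor


def values_of(pairs, name):
    """Look up a name in a multi-valued rumor, stored as a list of pairs."""
    return [v for n, v in pairs if n == name]


# Single-valued: lookups are simple, r["subject"] is one string.
r = make_rumor({"subject": "lunch", "to": "alice"})

# Multi-valued: more natural for some data, but every lookup returns a list.
multi = [("rumorid", "x1"), ("to", "alice"), ("to", "bob"), ("subject", "lunch")]
recipients = values_of(multi, "to")
```

The second form is where the interface gets more complicated: callers always have to handle zero, one, or many values per name.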
field names and upgradability
-----------------------------
Since you can't change or delete existing rumors, the meaning of a
particular field name within a particular rumorset can't change. New
data either has to be wedged into existing fields in a
backwards-compatible way, or added in new fields. Applications in
general should either pass through or ignore fields they don't know
about.
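A sketch of the "pass through or ignore" rule, assuming a hypothetical address-book application that knows only the fields "name" and "email". The function names are illustrative.

```python
KNOWN_FIELDS = ("name", "email")  # hypothetical fields this version understands


def read_contact(rumor):
    """Ignore: pick out only the fields this version of the application knows."""
    return {k: rumor[k] for k in KNOWN_FIELDS if k in rumor}


def derive_rumor(original, changes, new_id):
    """Pass through: build a new rumor from an old one, keeping unknown fields.

    The original rumor is left untouched, since rumors can't be changed.
    """
    derived = {k: v for k, v in original.items() if k != "rumorid"}
    derived.update(changes)
    derived["rumorid"] = new_id
    return derived


# "phone" was written by a newer version of the application.
old = {"rumorid": "r1", "name": "Alice", "email": "a@example.com",
       "phone": "555-0100"}
contact = read_contact(old)
new = derive_rumor(old, {"email": "alice@example.com"}, "r2")
```

An older instance that follows either rule never destroys data written by a newer one.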
This means that you can't treat a rumorset as an object-state store,
at least not if you ever want to change the state of your objects.
distributed upgrades (todo)
queries (todo)
queries as rumorsets (todo)