lazy evaluation in a distributed system
Kragen Sitaker
kragen at pobox.com
Mon May 16 03:37:01 EDT 2005
Mark S. Miller was talking on Friday about a clever idea he came up
with to support distributed a Observer pattern without clogging the
network.
The essential problem is that polling a resource for its current state
over the network requires resources even when the resource isn't
changing, while asynchronously sending update notifications when the
state changes requires resources even for outdated update
notifications when the resource is changing quickly. So when the
resource is changing very frequently, polling is more efficient.
So you'd like your Observer implementation to use asynchronous
notifications (as per the canonical Observer) except when they're
happening too fast, in which case you'd like it to fall back to
polling.
This is exactly analogous to the tension between having a query method
that computes some value of an object, and having a redundant instance
variable that gets updated whenever the underlying properties from
which it's computed. To make this concrete in Python, with the tired
old Point, first with a query method:
import math
class MutablePointQuery(object):
def __init__(self, x, y): (self._x, self._y) = (x, y)
def get_theta(self): return math.atan2(self._y, self._x)
theta = property(get_theta)
def set_x(self, new_x): self._x = new_x
def get_x(self): return self._x
x = property(get_x, set_x)
def set_y(self, new_y): self._y = new_y
def get_y(self): return self._y
y = property(get_y, set_y)
That performs atan2 each time we access the theta property, which is
clearly redundant. Suppose instead we calculate theta each time its
value changes, and then store that version:
class MutablePointVar(MutablePointQuery):
theta = 0
def invalidate_theta(self):
self.theta = self.get_theta()
def __init__(self, x, y):
(self._x, self._y) = (x, y)
self.invalidate_theta()
def set_x(self, new_x):
self._x = new_x
self.invalidate_theta()
x = property(MutablePointQuery.get_x, set_x)
def set_y(self, new_y):
self._y = new_y
self.invalidate_theta()
y = property(MutablePointQuery.get_y, set_y)
This version implements a superset of the same interface --- it has
mutable x and y properties and a theta property --- but now we
calculate theta every time we change x or y. That's a performance
improvement if we change x or y a lot less frequently than we read
theta, but makes performance worse if we read theta less often.
(This is probably overkill in this case anyway, because calculating
atan2 is pretty snappy, and it isn't worth it to cache it. I'm
writing this late at night, so my mind isn't up to coming up with a
better example at the moment.)
So, if we change x N times, and read theta M times, we have one
solution where we calculate theta N times, and one where we calculate
it M times. In general, we can do better than that with the standard
trick of lazy evaluation; in the worst case, it calculates theta
min(N, M) times, and usually does better than that:
class MutablePointLazy(MutablePointVar):
_theta = None
def invalidate_theta(self): self._theta = None
def _get_theta(self):
if self._theta is None: self._theta = self.get_theta()
return self._theta
theta = property(_get_theta)
Here we use "None" to represent "unknown", but in the general case,
you'd use a separate variable to distinguish known from unknown
values.
Generalizing this, we would invalidate theta not only when x or y
changed to a new known value, but also when x or y themselves became
unknown. When theta was queried, we would query x and y, which might
cause their values to be calculated at that time, and then calculate
its own value.
So Mark had come up with a very clever way of sending Observer
messages in a distributed object system which reduces, essentially, to
the lazy-evaluation interaction pattern above. Each Observable object
maintains a list of Observers; when its state changes, it notifies all
of them that its state has changed, then starts with a fresh list.
(This notification contains its new state and a generation number, but
these details don't seem relevant at the moment.) Observers who have
died now cease to consume any resources, just as if they had been
referenced by weak references managed by a distributed garbage
collector, but without requiring the construction of a distributed
garbage collector.
Observers who want to know about the current value and future changes
to it must request the current value and resubscribe to the
Observable, respectively. In the case where updates are infrequent,
this reduces to asynchronous event notification; in the case where
updates are very frequent, this reduces to polling. Standard lazy
evaluation works in exactly the same way.
(Andreas Raab mentioned that he had implemented exactly this system in
another system, but unfortunately I have forgotten which one.)
It seems to me that handling partial failures in a distributed system
also requires polling; the Observable may have run out of memory for
subscribers, or the invalidation message may have been lost due to a
temporary network failure. (The Observable might retry sending the
message, but they can't be expected to retry forever, or their memory
will eventually fill up with messages for long-dead Observers, being
retried from months or years ago.) So an Observer who wants some
finite bound on the time before they are notified of changes must
request the current value and resubscribe again, each time that bound
has passed.
Part of the idea, also, is that only Observables who currently need to
update something external need to continue receiving notifications.
Norm Hardy mentioned a system he'd seen in which some the contents of
some fields initially displayed as blurred, but waving the mouse over
them would trigger a cascade of lazy evaluation to find their real
value. Mark spoke of trying to tell whether a UI widget was visible
on the screen in order to see whether to spend time updating it.
(In lazy-evaluation systems in which the dependency graph may
accidentally contain cycles, I think it's also standard to have a
third state for a value, in addition to "known" and "unknown":
"working". This allows detection of the dependency loop, which can
invoke some kind of special handling.)
More information about the Kragen-tol
mailing list