How, exactly, X sucks

Kragen Sitaker kragen@pobox.com
Thu, 16 Mar 2000 12:42:13 -0500 (EST)


This will be unbelievably dull for most people, and depressing for the
rest.  I'm writing it under the assumption that X will not last
forever, and it will be helpful for posterity to know what was wrong
with it so they can design a better replacement.

It's also based on a very limited knowledge of Xlib.

First, I'm disgusted at the naming conventions for events.  If you want
to handle mouse motion, you need to look for events with a type of
MotionNotify and cast them, not to XMotionNotifyEvents, but to
XMotionEvents.  But before you do that, you need to indicate that
you're interested in XMotionEvents by modifying your window's event
mask.  The incantation used for this is not documented in the various
man pages, but apparently you use PointerMotionMask.  Not MotionMask,
or MotionEventMask, or MotionNotifyMask, but PointerMotionMask.

(I grepped.  PointerMotionMask appears nowhere in the section 3 X man
pages.)

It gets worse.  Some masks have more than one associated event type;
some structures (like XButtonEvent) have more than one associated event
type.  I do not think the two sets of event type categories coincide.

All of this provides exactly zero value to the client programmer.  We
would surely be much happier to say

	XListenForEvents(w, XMotionEventMask | XButtonEventMask);
	. . .
	if (ev.type == XMotionEventType) {
		XMotionEvent *mev = (XMotionEvent*)&ev;
	}

(Actually, you can just use ev.xmotion, assuming ev is an XEvent.  The
members of the XEvent union are also inconsistently named; the
XCrossingEvent is xcrossing, the XFocusChangeEvent is xfocus, etc.)
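
To make the mismatch concrete, here is a rough cheat sheet for a few
core events, written as a lookup table.  It is only a sketch: the Xlib
names are kept as strings so that it compiles without the X headers,
and struct event_naming and find_event_naming() are hypothetical
helpers invented for this illustration, not part of Xlib.  Only one
selecting mask is listed per type; MotionNotify, for instance, can also
be selected through the ButtonMotionMask family.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cheat-sheet row; none of this is an Xlib type. */
struct event_naming {
	const char *type_name;    /* value found in ev.type */
	const char *mask_name;    /* one mask that selects this type */
	const char *struct_name;  /* struct the event arrives in */
	const char *member_name;  /* member of the XEvent union */
};

static const struct event_naming event_namings[] = {
	{ "KeyPress",      "KeyPressMask",      "XKeyEvent",         "xkey" },
	{ "KeyRelease",    "KeyReleaseMask",    "XKeyEvent",         "xkey" },
	{ "ButtonPress",   "ButtonPressMask",   "XButtonEvent",      "xbutton" },
	{ "ButtonRelease", "ButtonReleaseMask", "XButtonEvent",      "xbutton" },
	{ "MotionNotify",  "PointerMotionMask", "XMotionEvent",      "xmotion" },
	{ "EnterNotify",   "EnterWindowMask",   "XCrossingEvent",    "xcrossing" },
	{ "LeaveNotify",   "LeaveWindowMask",   "XCrossingEvent",    "xcrossing" },
	{ "FocusIn",       "FocusChangeMask",   "XFocusChangeEvent", "xfocus" },
	{ "FocusOut",      "FocusChangeMask",   "XFocusChangeEvent", "xfocus" },
};

/* Return the row for an event type name, or NULL if it is not listed. */
const struct event_naming *find_event_naming(const char *type_name)
{
	size_t i;

	assert(type_name != NULL);
	for (i = 0; i < sizeof event_namings / sizeof event_namings[0]; i++)
		if (strcmp(event_namings[i].type_name, type_name) == 0)
			return &event_namings[i];
	return NULL;
}
```

Note that the button events share a struct but not a mask, while the
focus events share both, so grouping by mask and grouping by struct
give different categories.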

And speaking of masks, there are masks everywhere.  

Xlib has a lot of cases where you have some struct that is maintained
on the X server which users might want to change some items in; they
include the Window and the GC, although I think there are more.  It has
a relatively uniform method for handling this, which is as follows.

First, there is a 'create' call.  The create call has a few arguments to
specify what you want to create, and a pair of arguments called
'valuemask' and 'attributes'.  The 'attributes' argument points to a
client-side duplicate of the struct you're creating, which has a
randomly inconsistent name.  The 'valuemask' indicates which of the
values in 'attributes' are intended to actually be used; the rest are
intended to be ignored, set from defaults instead of from the passed
values.

Then, there is a 'change' call, which has a few arguments to indicate
which instance of this struct you want to change, followed by the same
'valuemask' and 'attributes' arguments as above.

Then, there is a 'get' call, which has roughly the same arguments as
the 'change' call.  Except, for the Window structure, for some random
reason, the client-side struct is different for the 'get' call, and
there's no valuemask.
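
The create/change convention described above can be modelled in a few
lines of plain C.  This is only a sketch of the pattern, not Xlib code:
struct widget_attrs, the WA_* bits, widget_defaults, widget_change()
and widget_create() are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object with three settable fields. */
struct widget_attrs {
	int width;
	int border_width;
	unsigned long background;
};

/* One bit per field, in the style of Xlib's valuemask constants. */
#define WA_WIDTH        (1UL << 0)
#define WA_BORDER_WIDTH (1UL << 1)
#define WA_BACKGROUND   (1UL << 2)

/* Assumed defaults, used for fields the caller does not name. */
static const struct widget_attrs widget_defaults = { 100, 0, 0xffffffUL };

/* 'change': copy only the fields named in valuemask; ignore the rest. */
void widget_change(struct widget_attrs *w, unsigned long valuemask,
                   const struct widget_attrs *attributes)
{
	assert(w != NULL);
	assert(valuemask == 0 || attributes != NULL);
	if (valuemask & WA_WIDTH)
		w->width = attributes->width;
	if (valuemask & WA_BORDER_WIDTH)
		w->border_width = attributes->border_width;
	if (valuemask & WA_BACKGROUND)
		w->background = attributes->background;
}

/* 'create': start from the defaults, then apply the named fields. */
struct widget_attrs widget_create(unsigned long valuemask,
                                  const struct widget_attrs *attributes)
{
	struct widget_attrs w = widget_defaults;

	widget_change(&w, valuemask, attributes);
	return w;
}
```

The fields of 'attributes' that the mask leaves out are never read, so
the caller may leave them uninitialized, which is what makes the
convention workable and also what makes it easy to get wrong.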

This is impossibly clumsy.  To change the event mask on a window to
listen to new events, one must do the following:

	XWindowAttributes xwa;
	XSetWindowAttributes xswa;  /* it sounds like a verb, but it's a noun */

	/* the return value of this function is not documented in its man page*/
	XGetWindowAttributes(display, window, &xwa);
	xswa.event_mask = xwa.your_event_mask | TheEventsIWant;
	/* this function isn't called XSetWindowAttributes because that
	 * name is already used for its argument */
	XChangeWindowAttributes(display, window, CWEventMask, &xswa);

Although (nice change!) the 'valuemask' manifest constants like
CWEventMask are documented in a man page (not
XChangeWindowAttributes's, though), they cannot be deduced from the
member names of the struct that they correspond to.  The valuemask
value for the background_pixmap member is CWBackPixmap, not
CWBackgroundPixmap, to take the first example.  So you *have* to look
them up in the man page, or memorize them.

Writing programs on an API like this feels like filling out forms in
triplicate.

Speaking of filling things out in triplicate, X's multiheading support
stinks to high heaven.

First, in the vast majority of cases, X clients don't care which screen
they're on.  They are happy to display themselves on the screen they
were told to display themselves on by the user; they never try to
display some windows on one screen and other windows on another
screen.  They don't even need to know.

Furthermore, the vast majority of X displays don't even *have* multiple
screens.  And X doesn't do a good job when it *does* have multiple
screens; you can't move windows from one screen to the other, for
example, even if they are identical in every way.  You almost might as
well have one X server per display.  You'd just have to figure out how
to make cut-and-paste work.  (Xinerama is correcting the worst
deficiencies of X's multiscreen support.)

In this environment, what did the X designers decide to do?  That's
right.  They require a screen number as an argument to any function or
macro that needs to know what screen it's on and doesn't have a Window
argument.  You have to explicitly say DefaultScreen(display) to get the
default screen.

And speaking of unnecessary arguments, all the functions that have
Windows as arguments also have a Display* argument, just in case you
want to do something with a Window on a display it doesn't exist on.

[Actually, a Window is just an integer, which is the real reason for
this brain damage.  This was probably excusable in 1985, when some
widely-used C compilers probably still didn't support passing structs
as arguments by value (although I wouldn't know), and making Windows
into pointers to structs would be an extra thing for the programmer to
remember to deallocate.  It isn't excusable today.]

Everything is a lot of work with Xlib.  Even getting a window up on the
display requires an XOpenDisplay (which is reasonable), a RootWindow()
and a DefaultScreen() (or a DefaultRootWindow()), some fetching of a
color (probably with WhitePixel(display, DefaultScreen(display))), an
XCreateSimpleWindow (with nine arguments!), and an XMapWindow
(reasonable).

Then you have to XCloseDisplay later, which is reasonable.

If you want to select() on an X connection --- so you can handle X
events along with other events in an event loop --- you'll need to read
the header files to discover that what you want is the
ConnectionNumber(display) macro to get the file descriptor.  The
documentation just says:

     The ConnectionNumber macro returns a connection number for
     the specified display.
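
For what it's worth, the select() half of this is ordinary POSIX.  A
minimal sketch, assuming you pass ConnectionNumber(display) as the fd:
wait_readable() is a hypothetical helper name, not an Xlib function.
Since Xlib may already have read events into its own queue, it is
probably wise to drain whatever XPending(display) reports before
blocking here.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Wait until fd is readable or timeout_ms milliseconds have passed.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 * In an Xlib program, fd would be ConnectionNumber(display).
 */
int wait_readable(int fd, int timeout_ms)
{
	fd_set readfds;
	struct timeval tv;
	int n;

	assert(fd >= 0 && fd < FD_SETSIZE);
	assert(timeout_ms >= 0);

	FD_ZERO(&readfds);
	FD_SET(fd, &readfds);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;

	n = select(fd + 1, &readfds, NULL, NULL, &tv);
	if (n < 0)
		return -1;
	return n > 0 && FD_ISSET(fd, &readfds);
}
```

To watch other descriptors in the same loop, add them to the same
fd_set and pass the largest descriptor plus one as the first argument.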

I'm not certain that all of this event-mask stuff actually does any
good at all.  

Jim Gettys explains that, due to X's highly efficient wire protocol and
focus on reducing round trips, it takes about 100 instructions per X
protocol request to handle the protocol.  Which means that on the
1-MIPS machines that were the low end when X was being originally
developed, you could handle ten thousand null messages per second.
Roughly.

Let's derate that by a factor of ten and assume that we can handle
about a thousand.  And let's assume that X protocol events are roughly
as cheap to handle, and also roughly as cheap to generate, as
requests.  (The factor of ten derating should cover our ass here.)

Only up to about 20 X protocol events happen per second.   (That's if
you strap a vibrator to your mouse and hold down a key so it
autorepeats.)  If each of these events gets handled by two clients on
average, that's about 4% of the capacity of this 1-MIPS workstation.
It's about 0.004% --- well, to be safe, 0.04% --- of the capacity of
today's 1000-MIPS workstations.
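
The arithmetic above can be checked mechanically.  A small sketch:
event_load_percent() and its parameter names are invented for this
illustration, and the inputs are just the figures quoted above (100
instructions per message, a derating factor of ten, 20 events per
second, two clients per event).

```c
#include <assert.h>

/*
 * Percentage of a machine's message-handling capacity consumed.
 *   mips           - millions of instructions per second
 *   insns_per_msg  - instructions needed to handle one message
 *   derate         - safety factor dividing the theoretical rate
 *   events_per_sec - events generated per second
 *   clients        - average number of clients receiving each event
 */
double event_load_percent(double mips, double insns_per_msg, double derate,
                          double events_per_sec, double clients)
{
	double capacity;  /* messages per second after derating */

	assert(mips > 0 && insns_per_msg > 0 && derate > 0);
	capacity = mips * 1e6 / insns_per_msg / derate;
	return 100.0 * events_per_sec * clients / capacity;
}
```

With those inputs, event_load_percent(1, 100, 10, 20, 2) comes to 4 and
event_load_percent(1000, 100, 10, 20, 2) comes to 0.004.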

In other words, if every event were delivered to the appropriate client
whether or not it cared, we would lose an insignificant amount of
performance.  And a great deal of complexity.

I'll close with a rant about borders.

In X, every window has a border.  It is a fixed width all the way
around the outside of the window, and displays in a solid color or
pixmap.  Leaving aside that this is not the optimal kind of border for
screens with nonsquare pixels, it is completely useless for drop
shadows, 3-D bevels, dashed lines, etc., so it rarely gets used, except
by interface designers without a sense of aesthetics.

But every time you create a window, you have to specify a border
width.  Every window on the display knows its border width; every time
the X server draws the window background, it consults the border
width.  Information about setting border widths, colors, and pixmaps
(incidentally, you can't set the border pixmap in XCreateSimpleWindow,
but you're required to set the border width and pixel color) is spread
throughout the man pages that describe Windows, making it incrementally
harder to find information that might actually be useful to a sane
person.  From looking at the XWindowAttributes structure, I estimate
that the X server's Window structure has about 32 32-bit fields, of
which information about the border constitutes three --- increasing the
size of the structure by about 10%, thus decreasing the number of
windows it is reasonable to create by about 9%.
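
Working that estimate through, assuming 32 equal-sized fields of which
three are for the border: the border fields are 3/32 of the structure,
and the structure is 3/29 bigger than it would be without them.  The
two helper functions below are hypothetical names for this sketch, and
the field counts are only the rough estimate given above.

```c
#include <assert.h>

/* Percent by which extra_fields enlarge a struct of total_fields fields. */
double size_increase_percent(int total_fields, int extra_fields)
{
	assert(extra_fields >= 0 && extra_fields < total_fields);
	return 100.0 * extra_fields / (total_fields - extra_fields);
}

/* Percent fewer such structs that fit in a fixed amount of memory. */
double capacity_loss_percent(int total_fields, int extra_fields)
{
	assert(extra_fields >= 0 && extra_fields < total_fields);
	return 100.0 * extra_fields / total_fields;
}
```

For 32 fields with three border fields, the size increase is a little
over 10% and the capacity loss is a little over 9%.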

Window borders are the $500 toilet seat of X.  By themselves, they
don't have much of an impact on the complexity and performance of X.
But they are only one of many such misguided features.

-- 
<kragen@pobox.com>       Kragen Sitaker     <http://www.pobox.com/~kragen/>
The Internet stock bubble didn't burst on 1999-11-08.  Hurrah!
<URL:http://www.pobox.com/~kragen/bubble.html>
The power didn't go out on 2000-01-01 either.  :)