two-dimensional syntax for Lisp

Kragen Sitaker kragen@pobox.com
Tue, 29 Jan 2002 03:38:02 -0500 (EST)


(Apologies for the idiosyncratic markup.)

On two-dimensional Lisp syntax
==============================

These are notes on some ideas from
\href(http://www.paulgraham.com/arcll1 Paul
Graham's language Arc.)  He suggests that Lisp's
syntax is hard to read and write.

Inferred parens
---------------

Paul Graham notes that you can infer some parens
in Common Lisp from indentation and newlines.

In particular, if a line contains more than one
s-expression, those s-expressions must be the
leading elements of a single list.  If a sequence
of lines is indented relative to the previous
nonblank line, it must continue that line's
s-expression.

So 

    a b c
    d e f

must be (a b c) followed by a list beginning d, e,
f.

    a b
      c
    d e 
      f

can be deduced to be the same.
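
The two rules above can be sketched as a small
reader.  This is only an illustration in Python,
not anything from Arc: the names read_line and
read_indented are made up here, atoms are kept as
plain strings, indentation is assumed to be spaces
only, and explicit parens are assumed to close on
the line where they open.

```python
import re

# Tokens are "(", ")", or a run of non-space, non-paren characters.
TOKEN = re.compile(r"[()]|[^\s()]+")

def read_line(text):
    """Read the s-expressions written on one line; explicit parens nest."""
    stack = [[]]
    for tok in TOKEN.findall(text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise SyntaxError("unmatched ')'")
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise SyntaxError("paren still open at end of line")
    return stack[0]

def read_indented(source):
    """Turn indented text into nested lists using the two inference rules."""
    lines = [(len(ln) - len(ln.lstrip(" ")), ln.strip())
             for ln in source.splitlines() if ln.strip()]
    pos = 0

    def block(indent):
        nonlocal pos
        items = read_line(lines[pos][1])
        pos += 1
        # Rule 2: more deeply indented lines continue this line's list.
        while pos < len(lines) and lines[pos][0] > indent:
            items.append(block(lines[pos][0]))
        # Rule 1: several s-expressions form a list; a lone one stands alone.
        return items if len(items) > 1 else items[0]

    forms = []
    while pos < len(lines):
        forms.append(block(lines[pos][0]))
    return forms

# Both spellings of the example read as the same two lists:
# [['a', 'b', 'c'], ['d', 'e', 'f']]
print(read_indented("a b c\nd e f"))
print(read_indented("a b\n  c\nd e\n  f"))
```

In this sketch a lone foo on a line comes back as
the atom foo, while (foo) comes back as a
one-element list.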

It might be nice for some keywords to require a
single expression following them on that line, in
which case parens could be inferred around the
second and following sexps on that line whenever
there are more than two.  But I'm inclined not to
allow this, because it makes the transformation
much more complicated.

All of this means that invocations of
parameterless methods still need explicit parens,
since a lone symbol on a line is read as the
symbol itself rather than as a call.  Oh well.

Infix
-----

In general, infix-to-prefix transformation should
be relatively simple, reversible, and allow mixing
of infix and prefix expressions; infix operators
can't be infix if they're the first element of a
list anyway.  However, infix expressions could be
valid prefix expressions; how about (reduce +
mylist), for example?  OCaml solves this problem
by adopting the infix interpretation where there
is ambiguity, requiring parens around infix
operators used as values.  This probably isn't the
best solution for a Lisp, but you could probably
require (function +) instead of (+) and get away
with it.

I'd like to be able to use at least (+ - * ** /
mod) infix, with their usual precedence; and a
right-associative : for cons would shorten a heck
of a lot of Lisp programs.
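
Here is a rough sketch of such a transformation,
in Python, working on lists that have already been
read.  Everything in it is an assumption made for
illustration: the names OPS, looks_infix and
to_prefix, the binding-power numbers, and the
choice to make : and ** right-associative (so that
a : b : nil nests as a proper list).  It adopts
the infix interpretation wherever there is
ambiguity, and it makes no attempt at being
reversible.

```python
# (binding power, associativity) for each infix operator.
# The numbers are assumptions; only their relative order matters.
OPS = {
    ":":   (1, "right"),   # cons, so a : b : nil nests to the right
    "+":   (2, "left"),
    "-":   (2, "left"),
    "*":   (3, "left"),
    "/":   (3, "left"),
    "mod": (3, "left"),
    "**":  (4, "right"),
}

def is_op(x):
    return isinstance(x, str) and x in OPS

def looks_infix(items):
    """Operand, operator, operand, ...: operators only in odd positions."""
    return (len(items) >= 3 and len(items) % 2 == 1
            and all(is_op(x) for x in items[1::2])
            and not any(is_op(x) for x in items[0::2]))

def to_prefix(expr):
    """Rewrite infix lists as prefix lists; leave everything else alone."""
    if not isinstance(expr, list):
        return expr
    items = [to_prefix(e) for e in expr]
    if not looks_infix(items):
        return items
    pos = 0

    def climb(min_power):
        # Precedence climbing over the flat operand/operator sequence.
        nonlocal pos
        left = items[pos]
        pos += 1
        while pos < len(items):
            op = items[pos]
            power, assoc = OPS[op]
            if power < min_power:
                break
            pos += 1
            right = climb(power + 1 if assoc == "left" else power)
            left = [op, left, right]
        return left

    return climb(0)

print(to_prefix(["a", "+", "b", "*", "c"]))       # ['+', 'a', ['*', 'b', 'c']]
print(to_prefix(["a", ":", "b", ":", "nil"]))     # [':', 'a', [':', 'b', 'nil']]
print(to_prefix(["reduce", "+", "mylist"]))       # ['+', 'reduce', 'mylist']
print(to_prefix(["reduce", ["function", "+"], "mylist"]))  # left unchanged
```

A list that starts with an operator, such as
(+ a b), is left as a prefix expression, while
(reduce + mylist) gets the infix reading unless
the operator is wrapped as (function +).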

Multiline strings
-----------------

On multiline string syntax: it's ridiculous to
include the leading spaces on successive lines.
The spaces that bring subsequent lines of the
string in to the column where its first line
started should not be included in the string;
their absence should simply be a syntax error.
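
A minimal sketch of that rule in Python.  The
function name strip_string_indent and its three
arguments are hypothetical; a real reader would
track the starting column itself.  The sketch is
strict about every continuation line, including
blank ones, which is one possible reading of the
rule.

```python
def strip_string_indent(first, rest, col):
    """Join a multiline string literal, dropping the alignment spaces.

    first: text of the string's first line (after the opening quote)
    rest:  the following source lines, unmodified
    col:   column at which the text of the first line started
    """
    pad = " " * col
    out = [first]
    for offset, line in enumerate(rest, start=1):
        if not line.startswith(pad):
            raise SyntaxError(
                f"string line {offset + 1} must be indented {col} spaces")
        out.append(line[col:])
    return "\n".join(out)

# The string in
#     (print "Dear sir,
#               I write to complain.
#             Yours, etc.")
# starts at column 8, so only spaces beyond column 8 are kept.
text = strip_string_indent(
    "Dear sir,",
    [" " * 10 + "I write to complain.", " " * 8 + "Yours, etc."],
    8)
print(text)
```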

Inferred parens and cond
------------------------

These syntax rules still don't help cond much,
either the Arc variant of cond or the traditional
one.

Traditional:
    (cond ((> x 1) (foo bar baz))
          ((> x -1) (baz buz quux))
          ((< x -10) (bur blar baz)))
becomes:
    cond ((> x 1) (foo bar baz))
         (> x -1) (baz buz quux)
         (< x -10) (bur blar baz)

and Arc's:
    (cond (> x 1) (foo bar baz)
          (> x -1) (baz buz quux)
          (< x -10) (bur blar baz))
becomes:
    cond (> x 1) (foo bar baz)
         > x -1
         baz buz quux
         < x -10
         bur blar baz

This contains fewer parens, but I'm not confident
that it's at all more readable.

McCarthy's 1960 paper uses a different syntax:
condition => consequent, condition => consequent,
condition => consequent.  If => were a special
token in the grammar, you could write

    cond
        > x 1 => foo bar baz
        > x -1 => baz buz quux
        < x -10 => bur blar baz
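
To make the idea concrete, here is a sketch in
Python of how a reader might split such lines at
the => token and build a traditional cond form.
The names unwrap, split_clause and read_cond are
invented for this note, and the sketch assumes the
clause lines contain no parens, so that splitting
on whitespace is enough.

```python
def unwrap(part):
    """A lone s-expression stands for itself; several form a list."""
    return part[0] if len(part) == 1 else part

def split_clause(tokens):
    """Split one 'condition => consequent' line at the => token."""
    if tokens.count("=>") != 1:
        raise SyntaxError("each clause needs exactly one =>")
    arrow = tokens.index("=>")
    test, body = tokens[:arrow], tokens[arrow + 1:]
    if not test or not body:
        raise SyntaxError("=> needs something on both sides")
    return [unwrap(test), unwrap(body)]

def read_cond(source):
    """Read a 'cond' line followed by one clause per line."""
    head, *clauses = [ln.split() for ln in source.splitlines() if ln.strip()]
    if head != ["cond"]:
        raise SyntaxError("expected a line containing only cond")
    return ["cond"] + [split_clause(c) for c in clauses]

example = """\
cond
    > x 1 => foo bar baz
    > x -1 => baz buz quux
    < x -10 => bur blar baz
"""
# Each clause comes back as [condition, consequent], i.e. the
# traditional (cond ((> x 1) (foo bar baz)) ...) shape.
print(read_cond(example))
```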

Here are some examples using an expression that was bandied about on
ll1-discuss recently.

; is the root (-b + sqrt(b^2 - 4ac)) / 2a positive?
(> (/ (+ (- b) (sqrt (- (* b b) (* 4 a c)))) (* 2 a)) 0)

; standardly indented version
(> (/ (+ (- b)
         (sqrt (- (* b b)
                  (* 4 a c))))
      (* 2 a))
   0)

; a paren-free expression using only indentation
; note that this is possible for any expressions that have no niladic
; calls
>
  /
    +
      - b
      sqrt
         -
           * b b
           * 4 a c
    * 2 a
  0
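
The paren-free form above can be produced
mechanically.  The following Python sketch prints
a nested list that way; to_indented is a made-up
name, atoms are plain strings, and two-space
indentation is an arbitrary choice.  It refuses
niladic calls, as noted above, and as an extra
restriction of this sketch it also refuses a list
in head position.

```python
def to_indented(expr, depth=0):
    """Return the lines of a paren-free, indentation-only rendering."""
    pad = "  " * depth
    if not isinstance(expr, list):
        return [pad + expr]                   # an atom on its own line
    if len(expr) < 2:
        raise ValueError("empty lists and niladic calls need explicit parens")
    head, *args = expr
    if isinstance(head, list):
        raise ValueError("a list in head position needs explicit parens")
    if not any(isinstance(a, list) for a in args):
        return [pad + " ".join(expr)]         # all atoms: one line, e.g. "* b b"
    lines = [pad + head]                      # head alone, arguments indented
    for arg in args:
        lines.extend(to_indented(arg, depth + 1))
    return lines

quadratic = [">",
             ["/",
              ["+", ["-", "b"],
                    ["sqrt", ["-", ["*", "b", "b"], ["*", "4", "a", "c"]]]],
              ["*", "2", "a"]],
             "0"]
print("\n".join(to_indented(quadratic)))
```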

; an attempt to add some parens to make that more readable
> (/ (+
        - b
        sqrt (-
                * b b
                * 4 a c))
     * 2 a)
  0

; if parens don't have to match and indentation length is significant, as
; it normally is:
> (/ (+ (- b)
        sqrt (- (* b b)
                * 4 a c
     * 2 a
  0

; a version of the above with matching parens; perhaps this should be 
; required.  It would answer Erann Gat's complaint about Python, that
; there is no redundancy in the block syntax, and it also works better
; (though not perfectly) with existing Lisp editors.
> (/ (+ (- b)
        sqrt (- (* b b)
                * 4 a c))
     * 2 a)
  0