finding sets of collinear points in Common Lisp
Kragen Javier Sitaker
kragen at pobox.com
Sat Mar 8 16:58:36 EST 2008
I tested this in SBCL. See the end for instructions for running it. On
a randomly generated file of 3000 points, it finds 2712 segments on my
machine in a little under 2 CPU minutes, which is about 23 lines or
points per second. Not that fast, but hey, it's O(N²).
;;; Solution to fun pattern-recognition exercise
;; Briefly, the idea is that you can find sets of (exactly) collinear
;; points in O(N² lg N) time instead of O(N³) or worse, by sorting
;; the points by their angle from some starting point.
;; I think you can do O(N²) in most cases if you use a base-256 radix
;;; Notable features
;; I'm just using cons cells ("pairs") to hold points --- X in the
;; car, Y in the cdr.
;; The code is written in a purely functional style, with no side
;; effects except for input, output, and a couple of calls to "sort".
;; This version has an additional step at the end for removing
;; duplicate line segments. This step is actually O(S²) where S is
;; the number of segments, multiplied by an additional O(M²) step of
;; looking for common points between the line segments if they are
;; parallel. In the normal case, each point participates in zero,
;; one, or a few line segments, and the line segments are short and
;; not parallel, so this part is quite fast.
;; However, in a bad case, all of the points might be collinear. This
;; will provoke O(N³) behavior. I'm also not sure what the maximum
;; number of distinct 4-or-more-point line segments for N points is;
;; I'm sure it's not less than O(N) or more than O(N²), but if it's
;; more than O(N sqrt N), that could actually be even worse than the
;; all-collinear case. That case would be easier to fix than the
;; all-collinear case.
;; It would be straightforward to maintain a hash of already-recorded
;; (slope, x, y) tuples, and query that before adding a new line
;; segment to the list, inside of "fast". This would avoid the need
;; for the seven lines of code that currently eliminate duplicates in
;; O(S²) time, multiplying the asymptotic performance at O(N²) only by
;; a constant factor. But I haven't done this.
;; A second bug is that multiple copies of the same point will be
;; considered to be collinear with some third point under some
;; circumstances (if the third point comes earlier in the file, or if
;; the line with the third point is vertical) but not under other
;; And it's possible that "brute" is actually worse than the O(N^4)
;; performance you'd naïvely expect, due to appending a bunch of
;; possibly long lists together.
;; file format: one int (on first line) giving number of points N,
;; followed by N lines each containing two ints that are the
;; coordinates of a point. We don't bother to pay attention to the
;; line boundaries.
;; Here's a sample data file (commented out):
;; 16384 19200
;; 16384 18666
;; 16384 32000
;; 16384 21761
;; 10000 10000
;; 20000 10000
;; 30000 10000
;; 40000 10000
;; 1 1
;; 2 2
;; 3 3
;; 4 4
;; 5 5
;; It contains a horizontal line at y=10000, a vertical line at
;; x=16384, and a diagonal line at y=x.
(defun get-point (stream)
(cons (read stream) (read stream)))
(defun get-points (stream)
(let ((npoints (read stream)))
(loop for i from 1 to npoints
collect (get-point stream))))
(defun get-points-from-fname (fname)
(with-open-file (input fname :direction :input)
;;; Brute-force version.
;; Return the slope of the line between points a and b, or i (the
;; imaginary number) for vertical lines. Positive infinity would be
;; better, but I don't think Common Lisp has a way to express it.
(defun slope (a b)
(if (= (cdr a) (cdr b)) #C(0 1) ; return i for vertical lines
(/ (- (car b) (car a))
(- (cdr b) (cdr a)))))
;; Return true if three points are collinear.
(defun collinear-p (a b c) (= (slope a b) (slope a c)))
;; Return true if all points passed as arguments are collinear.
(defun all-collinear-p (a b &rest points)
(loop for c in points
always (collinear-p a b c)))
;; Brute-force main entry point: test all sets of four points.
(defun brute (points)
(loop for p1s on points
append (loop with p1 = (car p1s)
for p2s on (cdr p1s)
append (loop with p2 = (car p2s)
for p3s on (cdr p2s)
;; Try to be slightly efficient by
;; not looping over fourth points
;; when the first three points are
;; not collinear.
when (collinear-p p1 p2 (car p3s))
append (loop with p3 = (car p3s)
for p4 in (cdr p3s)
(all-collinear-p p1 p2 p3 p4)
collect (list p1 p2 p3 p4))))))
;;; "Fast" O(N² lg N) version: sort points by slope from a starting point
;; Return true if the slope of segment AB is less than the slope of segment AC.
(defun slope-<-p (a b c)
(let ((slope1 (slope a b)) (slope2 (slope a c)))
(and (not (complexp slope1)) ; return false if both or just first vertical
(or (complexp slope2) ; true if just second vertical
(< slope1 slope2)))))
;; Partition "others", a list of points, into a list of
;; (variable-length) lines starting from "start" (appending "start" to
;; each one).
(defun fast-helper (start others)
(if (null others) nil
;; "iterate" loops and adds points onto "currentline" until it
;; finds a point with a different slope, at which point it
;; recurses back to the top level of "fast-helper" to handle the
;; rest of the points.
(labels ((iterate (a currentline others)
(cond ((null others)
((= (slope a (car currentline))
(slope a (car others)))
(cons (car others) currentline)
(cons currentline (fast-helper a others))))))
(list (car others) start)
;; Sort a list of points by their slope from a certain starting point.
(defun sort-by-slope (start points)
;; copy-list because "sort" is destructive. That was a frustrating
(sort (copy-list points) (lambda (b c) (slope-<-p start b c))))
;; "Fast" main entry point.
;; For each point, finds all the line segments starting from it, and
;; discards the ones whose length is less than 4.
(defun fast (points)
(loop for p1s on points
append (let* ((p1 (car p1s))
(p2s (sort-by-slope p1 (cdr p1s))))
(remove-if (lambda (line) (< (length line) 4))
(fast-helper p1 p2s)))))
;;; Eliminating duplicate segments.
;; We assume that segments will have their last point in common.
;; This is valid with "fast" because all of the duplicate segments
;; will include the last point in the list.
(defun segments-equal (a b)
(and (= (slope (car a) (cadr a))
(slope (car b) (cadr b)))
(intersection a b)))
;; Return the first member of each equivalence class of segments.
;; This does the right thing with "fast" because "fast" always
;; generates the longest segment first. It doesn't do the correct
;; thing with "brute".
(defun distinct-segments (lines)
(remove-duplicates lines :test #'segments-equal :from-end t))
(defun distinct-fast (points) (distinct-segments (fast points)))
;; Comparator to provide a sorting order for points.
;; The only correctness constraint here is that the endpoints of a
;; line segment must sort outside of points inside that segment.
(defun point-<-p (a b)
(or (< (car a) (car b))
(and (= (car a) (car b)) (< (cdr a) (cdr b)))))
;; Destructively sort the points within lines to bring their endpoints
;; to the ends.
(defun normalize-lines (lines)
(mapcar (lambda (points) (sort points #'point-<-p)) lines))
;; List the line segments in a nice format, according to their endpoints.
(defun print-lines (lines)
(loop for line in (normalize-lines lines)
do (let ((a (car line)) (b (car (last line))))
(format t "(~S, ~S) -> (~S, ~S)~%"
(car a) (cdr a) (car b) (cdr b)))))
;;; Main program
(defun print-lines-from-fname (linefinder fname)
;;; Main program (SBCL version)
(defun sbcl-main ()
(print-lines-from-fname #'distinct-fast (second *posix-argv*))
;; Run as:
;; sbcl --noinform --load collinear-points.lisp --eval '(sbcl-main)' pointsfile3
;; Alternatively, compile as:
;; (load "collinear-points.lisp")
;; (sb-ext:save-lisp-and-die "collinear-points" :toplevel #'sbcl-main
;; :executable t)
More information about the Kragen-hacks