comparative study of iteration and ordered collections
dave.long at bluewin.ch
Mon Apr 9 10:27:37 EDT 2007
> random_shuffle looks pretty hairy.
random_shuffle, as usually implemented, is based on swaps (establishing a
random permutation) -- but one can also think of it as sorting with a
"broken" comparison function that returns random results, so a clever
alternative may be to do a real sort on an appended random key?
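A minimal Python sketch of the sort-with-random-key idea (function name my own): each element gets one independently drawn key, and a real sort on those keys yields the permutation. Note this is better behaved than literally passing a randomly-answering comparator, whose output distribution depends on the sort algorithm and is generally biased.

```python
import random

def shuffle_by_sort(items):
    """Shuffle by sorting on an appended random key.

    Each element is decorated with one random key drawn up front;
    a real (deterministic) sort on those keys then produces a
    uniformly random permutation, since ties between real-valued
    keys occur with probability zero.
    """
    decorated = [(random.random(), x) for x in items]
    decorated.sort(key=lambda pair: pair[0])
    return [x for _, x in decorated]

print(shuffle_by_sort(list(range(10))))
```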
> There's a non-clever way to make any sort stable, which is to add a
> secondary sort key that is the original position in the list. There
> may be a clever way to do a stable mergesort without knowing the list
> lengths in advance, but I don't know it.
The idea of radix-sort is that one can reduce the problem of sorting on a
wide key to several sorts on narrow keys -- adding a secondary sort key is
a bit self-defeating for this use.
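A sketch of that reduction, assuming non-negative integers in base 10: each pass is a stable sort on one narrow digit, and it is the stability of each pass (not any secondary key) that carries the work of earlier passes forward.

```python
def radix_sort(nums, base=10):
    """LSD radix sort: reduce one sort on a wide key to several
    stable sorts on narrow digits.  Assumes non-negative ints."""
    if not nums:
        return []
    exp = 1
    while max(nums) // exp > 0:
        buckets = [[] for _ in range(base)]
        for n in nums:                           # stable distribution pass
            buckets[(n // exp) % base].append(n)
        nums = [n for b in buckets for n in b]   # stable collection pass
        exp *= base
    return nums
```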
The brute-force way to preserve stability is to reverse the reversed list
in between passes, as used here with integers providing lists (and a pair
of integers providing a tape):
The clever way (as used with actual tape drives) was to reverse the sense
of comparison on each pass.
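The brute-force trick can be sketched by treating each bucket as a tape: written by appending, read back by popping (i.e. in reverse). A distribute-then-pop pass reverses within-bucket order, so reversing the whole list before each pass cancels that reversal and keeps the sort stable. (This is my own illustration of the idea, not the integer-encoded lists mentioned above.)

```python
def radix_sort_on_tapes(nums, base=10):
    """LSD radix sort where each bucket acts like a tape: we "write"
    by appending and "read" by popping, which reverses within-bucket
    order.  Reversing the list between passes restores stability --
    the brute-force alternative to reversing the comparison sense."""
    if not nums:
        return []
    exp = 1
    while max(nums) // exp > 0:
        nums = nums[::-1]                    # reverse between passes
        buckets = [[] for _ in range(base)]
        for n in nums:                       # write to tapes
            buckets[(n // exp) % base].append(n)
        nums = []
        for b in buckets:                    # read each tape backwards
            while b:
                nums.append(b.pop())
        exp *= base
    return nums
```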
> That leaves only the heap and permutation operations. I really don't
> know about them.
I'd guess heaps and permutations are essentially swap-based, and that
swaps imply random access.
(permutations are closely related to insertion/selection-sort, which are
not so far away from heap sort)
Knuth Vol 3, "Sorting and Searching", might be a good reference. Anything
that was good for external sorting with tapes ought to be good with
lists. (Indeed, being good with tapes implies that one should be able to
find more cache-friendly implementations than the standard ones.)
> A remarkably large share of the uses of array indexing by numerical
> index is for some variant of the string.join problem --- given an array of N
> strings, connect them with N-1 commas, except when N=0, in which case
> we should connect them with N=0 commas instead of N-1 = -1 commas.
Squint at it properly, and tail-call optimization becomes an instance of
string.join: instead of "a,b,c," (where each letter is a jump and ','
the corresponding return) we want to generate "a,b,c", eliding the final
return. The dual (context preservation when building argument lists)
follows the same pattern: we only need to save/restore contexts if there
are at least two overlapping evaluations.
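For the string.join problem itself, the N-1 separators can be handled without numeric indexing at all: peel the first element off an iterator, then prefix every remaining element with the separator. The N=0 case falls out naturally. A minimal sketch (function name my own):

```python
def join(sep, strings):
    """Join N strings with N-1 separators, no index arithmetic.

    The first element is taken bare; every subsequent element is
    prefixed by the separator.  An empty sequence yields ""."""
    it = iter(strings)
    try:
        first = next(it)
    except StopIteration:
        return ""                 # N = 0: zero separators, naturally
    parts = [first]
    for s in it:                  # N-1 remaining elements, each prefixed
        parts.append(sep)
        parts.append(s)
    return "".join(parts)
```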
This may be a practical reason for the last couple of decades' interest in
monads: we prefer to program in an algebraic, tree-like form (code as well
as data), while the iron prefers to handle linear sequences of code
operating on contiguous chunks of data (op x state -> state') -- and
monads provide machinery to go from tree inputs to flat results in a
theoretically nice manner.