[jrbl@jrbl.org: Re: reducing charset size for compressibility with
case-shift characters (in Python)]
Kragen Javier Sitaker
kragen at canonical.org
Mon Apr 18 16:32:59 EDT 2011
----- Forwarded message from Joe Blaylock <jrbl at jrbl.org> -----
Subject: Re: reducing charset size for compressibility with case-shift
characters (in Python)
From: Joe Blaylock <jrbl at jrbl.org>
To: Kragen Javier Sitaker <kragen at canonical.org>
On Sat, 2011-04-16 at 03:37 -0400, Kragen Javier Sitaker wrote:
> lowercase = 'abcdefghijklmnopqrstuvwxyz'
> numbers = '0123456789'
>
> else:
>     yield current_state[lowercase.index(char)]
> elif char == DC3:
>     current_state = numbers
Couldn't you achieve a modest increase in compressibility, at the expense of
calculation time, by representing all numeric sequences as base-26 encoded
strings? You'd have to keep a buffer large enough for any numeric run you
process, but the transformation itself is easy. You couldn't do that nice
direct-indexing thing anymore, though. Well, not without adding more
abstraction.
Joe
----- End forwarded message -----
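
Joe's base-26 suggestion could be sketched roughly as follows. This is only
an illustration, not code from the thread: the function names
digits_to_base26 and base26_to_digits are made up here, and prepending a '1'
so that leading zeros survive the round trip is an assumption the message
does not mention. The whole digit run has to be collected before it can be
converted, which is the buffering Joe refers to.

```python
lowercase = 'abcdefghijklmnopqrstuvwxyz'


def digits_to_base26(digits):
    """Encode a buffered run of decimal digits as lowercase letters.

    Hypothetical helper. A '1' is prepended before conversion so that
    runs such as '007' keep their leading zeros (an assumption, not
    part of the original message).
    """
    n = int('1' + digits)
    letters = []
    while n:
        n, remainder = divmod(n, 26)
        letters.append(lowercase[remainder])
    # Remainders come out least-significant first, so reverse them.
    return ''.join(reversed(letters))


def base26_to_digits(letters):
    """Invert digits_to_base26 (hypothetical helper)."""
    n = 0
    for char in letters:
        n = n * 26 + lowercase.index(char)
    # Drop the '1' that the encoder prepended.
    return str(n)[1:]


if __name__ == '__main__':
    encoded = digits_to_base26('2011')
    print(encoded)                    # rtz
    print(base26_to_digits(encoded))  # 2011
```

Here the four digits of 2011 become three letters, so the gain is modest, as
Joe says, and very short runs gain nothing. The per-character
lowercase.index(char) lookup no longer maps one input character to one output
character, since the decoder has to see the whole run first.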