[A83] Re: Is compression possible?
Patrick Davidson wrote:
> You can look at the source code to see what it is doing. I don't know the
> precise name for the algorithm, but it uses simple offset-length
> enconding. The decompression code certainly does look very well optimized
> in this case. As for the compressor, that certainly would be larger, but
> not much larger since the compression is not terribly complex. Probably a
> larger concern should be that the compression might be very slow on-calc,
> since you have to compare data at each byte boundary with the data at each
> previous boundary back 2048 bytes. So for a 10K program, 20 million
> comparisons, or 400 million clock cycles (about a minute) if each takes 20
> clock cycles (which would seem fairly good).
A window size of 256 bytes would probably suffice for calc programs though. With
an average program size of around 10k, I doubt a 2k window would yield that much
better compression. A 256-byte window also allows the offset to fit neatly into
a byte (if you read 0 as 256), and reduces the size of the phrase tokens...
say, 1 bit for the token type, 8 bits for the window offset and 5 bits for the
length.
So... for a 10k program with a 256-byte window... that's still ~51 million cycles :P
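The scheme described above (256-byte window, 1-bit flag, 8-bit offset, 5-bit
length, brute-force window search) can be sketched like this. This is just an
illustrative token-level version in Python, not the actual on-calc Z80 code; the
final bit-packing of tokens is left out, and the names are my own:

```python
WINDOW = 256           # 8-bit offset field; 0 can be read as 256
MIN_LEN = 2            # a 14-bit token beats two 9-bit literals
MAX_LEN = MIN_LEN + 31 # 5-bit length field

def compress(data: bytes) -> list:
    """Greedy LZSS: emit ('lit', byte) or ('match', dist, length) tokens."""
    out, i, n = [], 0, len(data)
    while i < n:
        best_len, best_dist = 0, 0
        # brute-force comparison against every previous position in the window
        for j in range(max(0, i - WINDOW), i):
            k = 0
            while k < MAX_LEN and i + k < n and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_dist = k, i - j
        if best_len >= MIN_LEN:
            out.append(('match', best_dist, best_len))
            i += best_len
        else:
            out.append(('lit', data[i]))
            i += 1
    return out

def decompress(tokens) -> bytes:
    out = bytearray()
    for t in tokens:
        if t[0] == 'lit':
            out.append(t[1])
        else:
            _, dist, length = t
            for _ in range(length):   # byte-by-byte copy handles overlapping matches
                out.append(out[-dist])
    return bytes(out)
```

The inner comparison loop is exactly the part that makes on-calc compression
slow; decompression only ever copies bytes, which is why it stays fast.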
Also... memory need not be as big an issue as first thought. Under LZSS, the
maximum bloat is 112.5% (i.e. no matching phrases at all, so 9 bits to store
each byte). Phrase tokens aren't output unless the data they represent is longer
than the token itself.
So, for a 10k program, that's 10240 * 1.125 = 11520 bytes maximum (not including
header info). During compression, that means we'd need an extra 1280 bytes for a
10k program in order to be 'safe'. This assumes it is possible to directly
modify a program variable (which I've never tried before).
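As a quick sanity check on that arithmetic (a throwaway Python sketch):

```python
n = 10 * 1024              # a "10k" program
# worst case: no matches at all, so every byte costs 9 bits (1 flag + 8 data)
worst = (n * 9 + 7) // 8   # round up to whole bytes
print(worst)               # 11520 bytes, i.e. 112.5% of the original
print(worst - n)           # 1280 extra scratch bytes needed to be 'safe'
```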
It should be possible... slow, but possible ;)
At least the decompression will be fast.