| |
ChristopherJam
Registered: Aug 2004 Posts: 1380 |
On the relative pros and cons of various RLE schemes
So over in Native crunch/decrunch code. I just suggested encoding
abbcdddeffffg as
abb{0}cdd{1}eff{2}g
Raistlin kindly liked the idea, Krill quite sensibly didn't want us to hijack the post and suggested a new post for the new topic. so... Have at it! |
|
... 26 posts hidden. Click here to view all posts.... |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1380 |
Quoting CyberBrainThat looks like the encoding scheme from the codebase64 article? (https://codebase64.org/doku.php?id=base:rle_pack_unpack)
Oh! Yes, it's almost exactly that, except I didn't think to reserve one count value as an end-of-data marker.
Quote:Yes, that way of encoding it, wastes one byte for each 2-byte repeat (starts saving bytes for 4-byte repeats and longer), so could be bad if there are too many of those in the input data
Yeah it's a size loss if there are more 2-byte repeats than occurrences of a potential escape byte value (assuming counts are stored as whole bytes of course). But the implementation is nice and simple.
Quote:(I actually used that one in my latest demo to get room for more pics in memory, and this ultra simple RLE-based packing, together with delta-encoding, worked surprisingly (to me) well - it saved multiple $100s of bytes off some pics (they did have large background areas, but still), so the 2-repeats doesn't have to be a problem)
Nice!
Quote:It could be interesting to hear about the other ways of representing the bytes of RLE-encoding, which are used on the C64.
For example, what is this "the usual escape byte based scheme" you guys have mentioned in this and the other thread? And the counter representation?
Oh, things like (esc)(byte)(repeat count), where (esc) is the least commonly occurring byte in the source data. It does mean than any occurrences of the escape byte are inflated.
As for counter representation, most RLE implementations use a whole byte for the repeat count, but you can do things like read a smaller number of bits to make the repeat counts more compact, much as that slows decrunch a bit (you still keep the literals byte aligned of course). I do like that idea Fungus mentioned of having a few different escape codes for common repeat counts. |
| |
Fungus
Registered: Sep 2002 Posts: 624 |
In the case of my packer I used esc,esc to represent a literal. It still caused expansion, but it also was a common sequence that would be crunched by whatever lz compressor I was using at the time, either byteboiler/level ab2, or level crusher.
The esc,esc worked for the other encoding way too.
Lightemizer and Oneway's packers would be interesting to look into these days. Back then I was trying to understand algos and make my own rather than using other peoples code. But you don't get accused of ripping anymore so looking at stuff is OK these days. |
| |
Krill
Registered: Apr 2002 Posts: 2852 |
Quoting FungusBut you don't get accused of ripping anymore so looking at stuff is OK these days. Ripping and looking at other people's code are different things, though. =) |
| |
The Human Code Machine
Registered: Sep 2005 Posts: 110 |
Here's an exmaple of a byte packer using different pack codes: https://codebase64.org/doku.php?id=base:mdg_bytesmasher_0.04&s[.. |
| |
JackAsser
Registered: Jun 2002 Posts: 1989 |
Quote: That depends how expensive your encoding for a zero additional repeat counts is. But sure, every compression scheme is vulnerable to certain inputs getting bigger instead of smaller, otherwise you could just keep recrunching to get arbitrarily small outputs.
Like Dinosours? |
| |
Krill
Registered: Apr 2002 Posts: 2852 |
Quoting JackAsserLike Dinosours? Dinasours? Which magical infinite cruncher are you referring to? =) |
| |
chatGPZ
Registered: Dec 2001 Posts: 11136 |
did someone mention packers.txt yet? |
| |
tlr
Registered: Sep 2003 Posts: 1721 |
Quoting RaistlinIsn’t it more common to use a bit to define “constant or not”?
$00-7F .. count of non-repeats ($01-$80)
$80-FF .. count of repeats ($04-$83) Do you have any other examples of packers implementing this? I haven't seen that many.
I think the reason it's uncommon (on the C64) because it doesn't pack as well, and size was probably the main driver for the development of packers/crunchers. Decompression speed came much later. |
| |
Krill
Registered: Apr 2002 Posts: 2852 |
Quoting tlrI think the reason it's uncommon (on the C64) because it doesn't pack as well, and size was probably the main driver for the development of packers/crunchers. Decompression speed came much later. I think it's uncommon because everybody just copied everybody else's approach. :)
RLE used to be LZ-crunched anyways, so RLE compression ratio wasn't much of a concern.
How are you so sure this approach (generally?) packs worse than the escape-byte approach? |
| |
tlr
Registered: Sep 2003 Posts: 1721 |
Quoting KrillRLE used to be LZ-crunched anyways, so RLE compression ratio wasn't much of a concern. Initially RLE seemed to have been used by itself.
Quoting KrillHow are you so sure this approach (generally?) packs worse than the escape-byte approach? I'm thinking because only 128-ish runs are allowed + only 128-ish literal blocks. For a typical C64 program (demo?) there would be longish runs of empty space, interweaved with longish runs of code and data.
But maybe I'm assuming too much? Would be interesting to seem some practical measurements. |
Previous - 1 | 2 | 3 | 4 | 5 - Next |