[CSDb] - User Forums - Native crunch/decrunch code.

You are not logged in - nap

CSDb User Forums

Forums > C64 Coding > Native crunch/decrunch code.

2024-04-20 05:56

Flavioweb

Registered: Nov 2011
Posts: 447

Native crunch/decrunch code.

Do you know if the source code of a native crunch/decrunch routine for C64 exists somewhere?
Not a complete utility with interface etc...
just the compression/decompression code.

... 13 posts hidden. Click here to view all posts....

2024-04-21 12:47

Krill

Registered: Apr 2002
Posts: 2845

Quoting Flavioweb

The "pure" RLE compression didn't convince me very much because, in practice, it doubled the space required for the code to be compressed, but it works well for the data part.

It seems like you've misunderstood how the algorithm/encoding works.

Actual RLE emits a counter/type byte to encode a run, with up to 128 bytes of repeating or literal bytes. (Ignore the inferior escape byte approach that was so popular on C-64.)

So, worst case would be one extra byte for every 128 input bytes.

2024-04-21 13:26

tlr

Registered: Sep 2003
Posts: 1714

Quoting Krill

Actual RLE emits a counter/type byte to encode a run, with up to 128 bytes of repeating or literal bytes. (Ignore the inferior escape byte approach that was so popular on C-64.)

Not sure I agree the escape byte approach is inferior, but it's more messy.

Though as you point out, some way to differentiate literals vs runs is required for RLE. There should be little or no expansion of uncompressible sections.

2024-04-21 13:30

Krill

Registered: Apr 2002
Posts: 2845

Quoting tlr

Not sure I agree the escape byte approach is inferior, but it's more messy.

It's inferior in the way the decompressor needs to look at every input byte to decide what to do,
while the 1.7bits control byte approach allows for loops or nice Duff's devices that don't need to branch for every input byte, as the length of both types of run is known beforehand.

(It's a bit like C-strings vs Pascal-strings. =D)

2024-04-21 14:21

tlr

Registered: Sep 2003
Posts: 1714

Quote: Quoting tlr
Not sure I agree the escape byte approach is inferior, but it's more messy.
It's inferior in the way the decompressor needs to look at every input byte to decide what to do,
while the 1.7bits control byte approach allows for loops or nice Duff's devices that don't need to branch for every input byte, as the length of both types of run is known beforehand.

(It's a bit like C-strings vs Pascal-strings. =D)

Yeah, but you have to count instead. In the RLE case it's also usually less efficient, although RLE is very inefficient to begin with.

2024-04-21 14:49

ChristopherJam

Registered: Aug 2004
Posts: 1378

So far as RLE is concerned, alternately you can just output a count of "how many more repeats" after each adjacent pair of identical values, and potentially huffman encode the counts to make them smaller.

(eg abbcdddeffffg becomes abb{0}cdd{1}eff{2}g )

But anyway, I'm kind of tempted to push a simple MTF+uneven symbol size codec out the door - a quick draft crunches zorro from 54105 bytes to 45892, and would be fairly simple to implement in 6502. It's not as good a ratio as even tinycrunch (38962 bytes for the same test file), but it'd compress in a matter of seconds, and would outperform RLE unless you're crunching quite a bit of of empty space. Of interest?

2024-04-21 15:51

Martin Piper

Registered: Nov 2007
Posts: 634

(number of literals)(output literals)(number of repeated bytes, repeat the previous byte)...

Since the number of repeats always follows a number of literals, there is no need for a control flag.
The "number of" can use some form of Elias gamma coding, for up to 16 bit values for the whole memory range, to keep things short.

2024-04-21 16:06

Krill

Registered: Apr 2002
Posts: 2845

Let's not let this degrade to Code Golf, shall we.

2024-04-21 16:40

Raistlin

Registered: Mar 2007
Posts: 562

Quote: Let's not let this degrade to Code Golf, shall we.

Why? I think the thread becomes more interesting that way… if “Code Golf” means what I think it means..? Or did you mean “Code Tennis”?

I like CJam’s idea, I never thought of doing RLE in that way and, annoyingly, it’s so damn obvious now that he’s suggested it.

2024-04-21 16:44

Krill

Registered: Apr 2002
Posts: 2845

Quoting Raistlin

Why? I think the thread becomes more interesting that way… if “Code Golf” means what I think it means..? Or did you mean “Code Tennis”?

No idea where the difference is, but i suggest another thread to discuss the relative pros and cons of various RLE schemes. =)

2024-04-22 11:18

ChristopherJam

Registered: Aug 2004
Posts: 1378

(RLE discussion continued at On the relative pros and cons of various RLE schemes)

Previous - 1 | 2 | 3 - Next

Refresh

Subscribe to this thread:

You need to be logged in to post in the forum.

Search the forum:
Search for in
All times are CET.

Search CSDb

Advanced

Users Online

Didi/Laxity
Guests online: 137

Top Demos

1 Next Level  (9.8)
2 Mojo  (9.7)
3 Coma Light 13  (9.7)
4 Edge of Disgrace  (9.6)
5 Comaland 100%  (9.6)
6 No Bounds  (9.6)
7 Uncensored  (9.6)
8 Wonderland XIV  (9.6)
9 Memento Mori  (9.6)
10 Bromance  (9.5)

Top onefile Demos

1 It's More Fun to Com..  (9.7)
2 Party Elk 2  (9.7)
3 Cubic Dream  (9.6)
4 Copper Booze  (9.5)
5 TRSAC, Gabber & Pebe..  (9.5)
6 Rainbow Connection  (9.5)
7 Wafer Demo  (9.5)
8 Dawnfall V1.1  (9.5)
9 Quadrants  (9.5)
10 Daah, Those Acid Pil..  (9.5)

Top Groups

1 Nostalgia  (9.3)
2 Oxyron  (9.3)
3 Booze Design  (9.3)
4 Censor Design  (9.3)
5 Crest  (9.3)

Top Fullscreen Graphicians

1 Carrion  (9.8)
2 Joe  (9.8)
3 Duce  (9.8)
4 Mirage  (9.7)
5 Facet  (9.7)

Page generated in: 0.048 sec.