[CSDb] - User Forums - Monochrome hires compression

You are not logged in - nap

CSDb User Forums

Forums > C64 Pixeling > Monochrome hires compression

2024-04-06 16:16

Frostbyte

Registered: Aug 2003
Posts: 169

Monochrome hires compression

I've been following the discussion about koala optimising (and compression) under the other topic, and this is somewhat related but not quite there so decided to start a new topic.

What would be the best methods of compressing monochrome hires bitmap? Particularly some graphics fitted into a charset, and maybe a dozen single colour sprites. Speed is not important, compressed size (incl size of decompressor) is everything.

I read ChristopherJam's description about his koala compressor, but tbh it is way too advanced stuff for me to understand and thus I don't know if it would be suitable, with modifications, for compressing monochrome images too. :)

... 21 posts hidden. Click here to view all posts....

2024-04-10 17:05

Oswald

Registered: Apr 2002
Posts: 5020

Quote: Quoting Oswald
thanks, bit too abstract for me, c64ised explanation would have been nice, but its not your job I get it :)
Anything on C-64 follows the theoretical groundwork, and it would probably be exceptionally hard to make sense of the C-64 decoder code alone. :)
(Also encoder part must match with same predictors etc., and would be written in something like Python or C or whatever.)

I always find that I understand better if there is an example, abstract math notation I hardly even have learned doesnt do it for me. like the derivative of wtf is the ideal predictor, anyway its not important, just an interesting topic.

2024-04-10 17:22

Krill

Registered: Apr 2002
Posts: 2847

Quoting Oswald

I always find that I understand better if there is an example, abstract math notation I hardly even have learned doesnt do it for me. like the derivative of wtf is the ideal predictor, anyway its not important, just an interesting topic.

The 3rd link is the most advanced. Start with the top-down prose Wikipedia article. =)

2024-04-10 18:01

Bansai

Registered: Feb 2023
Posts: 34

Quote: Hmm... come to think of of it...

CJam: What if the hires-bitmap-that-neatly-fits-into-charset to be compressed is in native tile-wise format, and the predictors would look at "adjacent" 8x8 blocks for the current char's top and left edge pixels, but for the "inner" pixels also look at previous 8x8 blocks (64-pixel runs at 8-byte aligned spots) for likely matches? Wouldn't that make the LZ layer superfluous again?

In other words: as long as all known pixels in an incomplete 8x8 block match a previous 8x8 block, expect this one to match it/them (and predict next pixel accordingly).

In some repects, character-based (or character-oriented) graphics can be viewed as a limited form of LZ with a static dictionary using fixed match lengths. If you look at the bitmap mode on TI-99, IIRC, you have to init a chunk of memory to 0..255 three consecutive times in order to instruct the display processor as to what would be three character sets for the 32x24 screen. That is, it's still really text mode, but after 8 rows of characters it flips to another character set giving the illusion of a bitmap mode.

Your approach is interesting as it makes what would be an implicit 0..255 style of mapping dynamic. I wonder if it would be advantageous to initialize the predictor with some number of most useful characters. Also, does a zigzag pixel scanning pattern similar to MPEG work better as it would seem to provide more locality from pixel to pixel as compared to when one skips to the next y row.

As far as LZ being superfluous, it's probably one of those best determined by experimentation issues.

2024-04-10 20:56

Pararaum

Registered: Sep 2018
Posts: 11

For monochrome pictures for the Amstrad CPC I used a variant of the encodings used in Fax machines. Alternating run-lengths of black and white pixels were Elias-Gamma encoded. This works well as the CPCs graphics memory is line based, see https://gitlab.com/pararaum/amstrad-cpc-graphics-converter.
The graphics-memory organization in the C64 makes it much harder.

2024-04-10 21:15

Krill

Registered: Apr 2002
Posts: 2847

Quoting Bansai

As far as LZ being superfluous, it's probably one of those best determined by experimentation issues.

I meant superfluous (or not) on top of/intermingled with PAQ-like compression. And as far as i've understood CJam, the LZ encoding cost function becomes very messy.

2024-04-11 09:27

ChristopherJam

Registered: Aug 2004
Posts: 1378

Biggest issue with combining lz with paq style compression is that the repeated strings start becoming quite cheap to just include as literals, as every time they recur the predictive encoder gets better at expecting subsequent characters within the string. Eventually most short runs of repeated bytes become cheaper as predictions than as backreferences. The savings produced by the optional LZ layer in bitpickler while compressing Onscreen 5k were barely enough to justify the extra code to interpret the LZ tokens.

Regarding improving locality with zigzag patterns, that's actually something we get for free to some extent just from the c64's (frequently otherwise annoying) bitmap layout. Every eight bytes defining a square area of bitmaps gives much better locality than we would have had from a linear 40 bytes per raster line.

KoalaCanner uses as predictors x(n-1) and x(n-4) (where x is the sequence of bitpairs in the raw bitmap), which is a pretty decent approximation to "the pixel above and the pixel to the left," while just using a single 8 bit value to store recent history :)

Krill - interesting idea about differing predictions for char interior vs borders. I suspect that would depend a lot on the art style and how it was authored - like, did the artist originally align things on eight pixel boundaries or not? I've been vaguely pondering getting an encoder to see if the next 8x8 area is an exact or near match to any 8x8 areas decoded so far, potentially without aligning the source to char boundaries (fine if speed is not an issue), and if there's only an approximate match then just encode the match location along with the pixels that need flipping (conceptually just an 8x8 bitmap to xor with the copied data, but there are various ways to encode that cheaply if it's mostly zeros).

Oswald and Krill - I think fundamentals of this prediction stuff probably should go in a seperate topic, and yes, codebase would be a good place for an article. I'll bump up the priority on that particular todo list item..

Bansai - very cool about the TI-99. An oft used technique on c64 as well of course - building a bitmap out of four or five character sets for ease of scrolling or plotting, especially for things like 80 column text mode terminals.

Almost time for a test corpus - Frostbyte, any thoughts on good representative images for the sort of thing you're after?

2024-04-11 09:46

Krill

Registered: Apr 2002
Posts: 2847

Quoting ChristopherJam

interesting idea about differing predictions for char interior vs borders. I suspect that would depend a lot on the art style and how it was authored - like, did the artist originally align things on eight pixel boundaries or not?

Assumption is that a bitmap intended to go into a single charset has many recurring 8x8 tiles.

I guess you need not look at borders vs interior, but more like mix probabilities of 2 contexts: pixel-wise neighbours and maybe-match to previous 8x8 tiles - that way, it'd be rather adaptive and alignment to tile borders or not may not be so important for compression ratio.

Quoting ChristopherJam

KoalaCanner uses as predictors x(n-1) and x(n-4) (where x is the sequence of bitpairs in the raw bitmap), which is a pretty decent approximation to "the pixel above and the pixel to the left," while just using a single 8 bit value to store recent history :)

For hires, i guess you can easily look up the already-decoded pixel predecessors in the partially decoded bitmap. =)

2024-04-11 09:58

ChristopherJam

Registered: Aug 2004
Posts: 1378

Quoting Krill

Assumption is that a bitmap intended to go into a single charset has many recurring 8x8 tiles.

This also works for "artist attempted to make something to fit in a charset, but only got it down to 280 characters" :)

Quote:

I guess you need not look at borders vs interior, but more like mix probabilities of 2 contexts: pixel-wise neighbours and maybe-match to previous 8x8 tiles - that way, it'd be rather adaptive and alignment to tile borders or not may not be so important for compression ratio.

True. There are various possibilities for mixing there - including doing things like using the first 8 pixels to limit the likely potential source characters to copy the remaining 8x7 from

Quote:

Quoting ChristopherJam
KoalaCanner uses as predictors x(n-1) and x(n-4) (where x is the sequence of bitpairs in the raw bitmap), which is a pretty decent approximation to "the pixel above and the pixel to the left," while just using a single 8 bit value to store recent history :)
For hires, i guess you can easily look up the already-decoded pixel predecessors in the partially decoded bitmap. =)

Indeed - and BitPickler already has a predictor of "previous byte and all the bits so far of the current byte" that would probably be a pretty good start for most bitmap images, as that's already looking at an 8x2 tile for seven bytes out of eight.

2024-04-11 10:14

Krill

Registered: Apr 2002
Posts: 2847

Quoting ChristopherJam

True. There are various possibilities for mixing there - including doing things like using the first 8 pixels to limit the likely potential source characters to copy the remaining 8x7 from

Could you also encode literal symbols every 64 pixels that would tell the decoder "here comes a copy" (or not), and in the copy case emit the backref pointer?
Or would that likely impair pack ratio, since no prediction (or little based on ratio of previous copies) is done at that point?

2024-04-11 10:20

ChristopherJam

Registered: Aug 2004
Posts: 1378

Yes :)

And, that's only 1000 bits that'd probably compress to well under the already paltry 125 bytes they'd occupy raw.

Previous - 1 | 2 | 3 | 4 - Next

Refresh

Subscribe to this thread: