CSDb User Forums


Forums > C64 Coding > Stacking multicolour layers in assembly
2024-02-15 22:20
Krill

Registered: Apr 2002
Posts: 2854
Stacking multicolour layers in assembly

Consider 3 single-coloured multicolour layers, such that, e.g.,

00 or 01 - layer 1
00 or 10 - layer 2
00 or 11 - layer 3 (with 00 being background or transparent).

Now, how to merge them, rendering one on top of the other, using only binary arithmetic or other primitives, but no lookup tables? There should be no "glenz"-like colour blending, the particular layer ordering isn't important as long as some kind of priority regime is preserved, and background/transparent need not be 00.

With the above example, it's some kind of max operation on bitpairs, with something like
  |00 01 10 11
--------------
00|00 01 10 11
01|01 01 10 11
10|10 10 10 11
11|11 11 11 11
but this doesn't seem to map very well to the 6502's operations. =)
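
For reference, the table above can be written down as a per-bitpair maximum over whole bytes. This is only a minimal sketch of the wanted result, not a 6502 routine, and the name `stack_reference` is made up for illustration:

```python
def stack_reference(layer1, layer2, layer3):
    """Per-bitpair maximum of three multicolour bytes (four bitpairs each)."""
    out = 0
    for shift in (6, 4, 2, 0):
        pairs = [(byte >> shift) & 0b11 for byte in (layer1, layer2, layer3)]
        out |= max(pairs) << shift
    return out


if __name__ == "__main__":
    # Leftmost bitpair: layer 1 (01) under layer 2 (10); rightmost: only layer 3 (11).
    print(format(stack_reference(0b01010000, 0b10001000, 0b00000011), "08b"))
```

Any bit-fiddling candidate can then be compared against this function for all legal input bytes.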
2024-02-15 22:40
Oswald

Registered: Apr 2002
Posts: 5026
doesnt seem to make sense to do this "brute force"

table or animate or somehow cheat it.

2 layers can only merge in 256 combinations and 16 more for the final one.


ldx table_4pixels_from_layer1_4pixels_from_layer2
lda table_4pixelsfromlayer3,x
sta

if you dont use all possible combinations of pixels then it can be cheated into a single 8 bit table 3+3+2bits for example.
2024-02-15 22:44
Krill

Registered: Apr 2002
Posts: 2854
If it's not possible, i'd like to see some kind of elegant formal proof in a few sentences. =)
2024-02-15 23:00
Oswald

Registered: Apr 2002
Posts: 5026
Quote: If it's not possible, i'd like to see some kind of elegant formal proof in a few sentences. =)

first please notice you didnt say what exactly you are looking for. if its turing complete anything is possible.

edit: oh god damn, okay now I see it my bad.
edit2: seems like a good candidate for xy problem
2024-02-15 23:12
Krill

Registered: Apr 2002
Posts: 2854
Quoting Oswald
edit2: seems like a good candidate for xy problem
It's formulated as an academic question in this thread, but the origin is..., well, just having but 2 index registers. =)
2024-02-15 23:17
chatGPZ

Registered: Dec 2001
Posts: 11145
That has to be possible with some bitfiddling...maybe perhaps :) Why no lookup table though? :)
2024-02-15 23:17
Oswald

Registered: Apr 2002
Posts: 5026
Quote: Quoting Oswald
edit2: seems like a good candidate for xy problem
It's formulated as an academic question in this thread, but the origin is..., well, just having but 2 index registers. =)


then it really IS an XY problem :D dont think you will find a nice way, cheat it, or go around it. but lets see what the experts have to say :)
2024-02-15 23:20
Krill

Registered: Apr 2002
Posts: 2854
Quoting chatGPZ
That has to be possible with some bitfiddling...maybe perhaps :) Why no lookup table though? :)
The question is precisely about that bitfiddling! :)

Quoting Oswald
dont think you will find a nice way, cheat it, or go around it. but lets see what the experts have to say :)
Yes, that's what i thought when i created this thread. =)

(XY problem is irrelevant in this context.)
2024-02-15 23:21
Krill

Registered: Apr 2002
Posts: 2854
<i misclicked something>
2024-02-15 23:23
Krill

Registered: Apr 2002
Posts: 2854
Quoting Krill
Quoting chatGPZ
That has to be possible with some bitfiddling...maybe perhaps :) Why no lookup table though? :)
The question is precisely about that bitfiddling! :)
And no tables, yeah, i hate swapping registers in and out in tight unrolled inner loops. Maybe the bitfiddling solution, if it exists, is surprisingly terse and elegant? Who knows!
2024-02-15 23:27
chatGPZ

Registered: Dec 2001
Posts: 11145
It has to be bytes composed of 4 2bit pairs, right?
2024-02-16 00:22
Krill

Registered: Apr 2002
Posts: 2854
Quoting chatGPZ
It has to be bytes composed of 4 2bit pairs, right?
Yes. :) Ready to be displayed by VIC.

(But if you have something on your mind that works but ignores this constraint, go ahead and say. =D)
2024-02-16 01:05
Martin Piper

Registered: Nov 2007
Posts: 645
1) Large 256x256 table in cartridge. Do this lookup once per byte.
2) Isolate the two colour bits, use a 256x4x4 byte table (based on four colours and four shifted pixel positions), do this 4 times for each byte. Unroll for each pixel position optimises the table usage.
3) Use tiny tables and small code, but loop each pixel like below:
.sprColMaskTab
	!by %00000011
	!by %00001100
	!by %00110000
	!by %11000000

; Merge from: SpriteWorkingByte
; Into: SpriteFinalByte
; Only when the final pixel is clear.
MergeTwoColourBytes
	ldy #3
.l2
	; Front to back ordering
	lda SpriteFinalByte
	and .sprColMaskTab,y
	bne .no0

	; Final, destination, is empty so merge in the working source pixel
	lda SpriteWorkingByte
	and .sprColMaskTab,y
	ora SpriteFinalByte
	sta SpriteFinalByte

.no0
	dey
	bpl .l2
	rts


4) Unroll the above, remove index registers for the sprite mask lookups use constants and immediate mode, add index registers for the sprite/char/bitmap data access.

5) Use bit and keep the current pixel byte in A to avoid loading it back again.
2024-02-16 02:01
chatGPZ

Registered: Dec 2001
Posts: 11145
what about this
; 00 or 01 - layer 1
; 00 or 10 - layer 2
; 00 or 11 - layer 3 (with 00 being background or transparent).
; 
; dst = ((layer1 | layer2) & (~(layer2 >> 1))) | layer3

lda layer2
lsr
eor #$ff
sta tmp

lda layer1
ora layer2
and tmp

ora layer3
sta dst

(There is probably a clever way to make this faster, with illegals perhaps)
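
A quick way to convince yourself that the formula holds for whole bytes (the `lsr` moves bits across bitpair boundaries, so this is worth checking) is to model it and compare against a per-bitpair max. This is a sketch only; `legal_bytes`, `bitpair_max`, `merge` and `check` are hypothetical helper names, and the model assumes the input encodings from the comment block above:

```python
from itertools import product


def legal_bytes(fg):
    """All 16 bytes whose four bitpairs are each 00 or the given foreground pair."""
    return [sum(pair << shift for pair, shift in zip(pairs, (6, 4, 2, 0)))
            for pairs in product((0, fg), repeat=4)]


def bitpair_max(*layers):
    """Reference result: the highest bitpair value wins at each pixel."""
    return sum(max((b >> s) & 3 for b in layers) << s for s in (0, 2, 4, 6))


def merge(layer1, layer2, layer3):
    tmp = (layer2 >> 1) ^ 0xFF                  # lda layer2 / lsr / eor #$ff / sta tmp
    return ((layer1 | layer2) & tmp) | layer3   # lda layer1 / ora layer2 / and tmp / ora layer3


def check():
    """True if merge() matches the reference for every legal input combination."""
    return all(merge(a, b, c) == bitpair_max(a, b, c)
               for a in legal_bytes(0b01)
               for b in legal_bytes(0b10)
               for c in legal_bytes(0b11))


if __name__ == "__main__":
    print(check())
```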
2024-02-16 02:27
chatGPZ

Registered: Dec 2001
Posts: 11145
... and if you allow selfmod and one page table this becomes
    lda layer2
    sta lut_selfmod
 
    ora layer1
lut_selfmod = *+1
    and lut            ; (n >> 1) ^ 0xff
 
    ora layer3
    sta dst

(suggested by Noobtracker)

edit: another suggestion by Noobtracker - that table can be located in the zeropage ($00 needs to be $ff, $01 is not used. Many other locations are unused too, so still possible to use zp variables)

Now i expect some cool routine using this from you :)
2024-02-16 03:07
chatGPZ

Registered: Dec 2001
Posts: 11145
... and if you can afford to use X:

    lax layer2
 
    ora layer1
    and lut,x          ; (n >> 1) ^ 0xff
 
    ora layer3
    sta dst

... :)
2024-02-16 04:21
ChristopherJam

Registered: Aug 2004
Posts: 1381
My ten minute take, without having read the replies.

layer 3 trivially ORs over the others, so the tricky part is layering layer 2 over layer 1
We need high bits on l2 to mask low bits on l1, so a trivial implementation is
    lda l2
    lsr
    eor#$55   ; low bits are only set if high bits were clear
    and l1    ; bring in l1, but only if l2 was transparent
    ora l2
    ora l3
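
The routine above can be modelled step by step to see what each instruction does to a byte. A minimal sketch, assuming the original encodings (layer 1: 00/01, layer 2: 00/10, layer 3: 00/11); `merge_lsr_eor` and the other helper names are made up here:

```python
from itertools import product


def legal_bytes(fg):
    """All 16 bytes whose four bitpairs are each 00 or the given foreground pair."""
    return [sum(pair << shift for pair, shift in zip(pairs, (6, 4, 2, 0)))
            for pairs in product((0, fg), repeat=4)]


def bitpair_max(*layers):
    return sum(max((b >> s) & 3 for b in layers) << s for s in (0, 2, 4, 6))


def merge_lsr_eor(l1, l2, l3):
    a = l2 >> 1          # lsr: a set layer-2 pair 10 becomes 01
    a ^= 0x55            # eor #$55: low bit is 1 only where layer 2 is transparent
    a &= l1              # and l1: layer 1 survives only under transparent layer 2
    return a | l2 | l3   # ora l2 / ora l3


def check():
    return all(merge_lsr_eor(a, b, c) == bitpair_max(a, b, c)
               for a in legal_bytes(0b01)
               for b in legal_bytes(0b10)
               for c in legal_bytes(0b11))


if __name__ == "__main__":
    print(check())
```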


If we can change the inputs to

01 or 10 - layer 1
00 or 11 - layer 2
00 or 11 - layer 3 (with 00 being background or transparent).

and ensure carry is set on entry, then this is slightly shorter, and also preserves carry

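    lda l1
    ora l2    ; every bitpair is now 01, 10 or 11, never 00
    sbc#$55   ; subtracts 01 from each bitpair, no borrow, so carry stays set
    ora l3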
2024-02-16 04:36
ChristopherJam

Registered: Aug 2004
Posts: 1381
Quoting Krill
using only binary arithmetic or other primitives, but no lookup tables?

^^ (emphasis mine)
2024-02-16 04:57
CyberBrain
Administrator

Posts: 392
Very cool solutions!

Here is a solution if you don't want to use a lookup table or need the X-reg for something else, *BUT* it is allowed to change the bit-patterns that are used in layer1 and layer2 (the result will still have the correct patterns).
Then you could combine layer 1 and 2 with ADD and AND, in the same number of cycles as Groepaz's, something like this: (i hope. i'm tired)

If we use these bitpattern replacements for layer 1 and layer 2:
	layer1_00 = %01		// <- in layer 1: instead of %00 we use %01
	layer2_00 = %01         // <- in layer 2: --||--
	layer1_01 = %00		// <- in layer 1: instead of %01 we use %00
	layer2_10 = %10         // <- in layer 2: we still use %10 as %10 :'(
Then we can do: (Note that the bitpairs can never overflow, since we at most add 01 and 10)
	dst = ((layer1 + layer2) & layer2) | layer3
Why this stupid crap? Because if we try it for all possible (legal) bitpair-values we get the right result:
	(layer2_00 + layer1_00) & layer2_00 = (01 + 01) & 01 = 10 & 01 = 00
	(layer2_00 + layer1_01) & layer2_00 = (01 + 00) & 01 = 01 & 01 = 01
	(layer2_10 + layer1_00) & layer2_10 = (10 + 01) & 10 = 11 & 10 = 10
	(layer2_10 + layer1_01) & layer2_10 = (10 + 00) & 10 = 10 & 10 = 10
Code:
	lda layer1
	adc layer2 // assume C=0. C=0 afterwards always, since no overflow
        and layer2
	ora layer3
	sta dst
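
To double-check the ADC trick on whole bytes, here is a small model that first converts standard-encoded layers into the replacement patterns and then compares the result with a per-bitpair max. It is a sketch under the assumptions stated above (C=0 on entry); the converters `encode_layer1` and `encode_layer2` are hypothetical and only exist for this test, in a real effect the layers would be generated in the replacement encoding directly:

```python
from itertools import product


def legal_bytes(fg):
    """All 16 bytes whose four bitpairs are each 00 or the given foreground pair."""
    return [sum(pair << shift for pair, shift in zip(pairs, (6, 4, 2, 0)))
            for pairs in product((0, fg), repeat=4)]


def bitpair_max(*layers):
    return sum(max((b >> s) & 3 for b in layers) << s for s in (0, 2, 4, 6))


def encode_layer1(std):
    return std ^ 0x55                    # 00 -> 01 (transparent), 01 -> 00 (set)


def encode_layer2(std):
    return std | ((std >> 1) ^ 0x55)     # 00 -> 01 (transparent), 10 stays 10


def merge_adc(layer1, layer2, layer3):
    total = layer1 + layer2              # lda layer1 / adc layer2 with C=0
    assert total <= 0xFF                 # no bitpair overflows, so C stays 0
    return (total & layer2) | layer3     # and layer2 / ora layer3


def check():
    return all(merge_adc(encode_layer1(a), encode_layer2(b), c) == bitpair_max(a, b, c)
               for a in legal_bytes(0b01)
               for b in legal_bytes(0b10)
               for c in legal_bytes(0b11))


if __name__ == "__main__":
    print(check())
```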
2024-02-16 09:31
Krill

Registered: Apr 2002
Posts: 2854
Very nice!

Ignoring all the approaches using tables as per OP premise (thank you, CJam), CyberBrain's solution seems to be the fastest so far.
             ; fg bg
 4 lda layer1; 00:01 00 00 01 01 <- will be bg 00:01 fg
 4 adc layer2; 10:01 01 10 01 10 <- will be bg 00:10 fg
             ;       01 10 10 11
 4 and layer2;       01 10 01 10
             ;       01 10 00 10
 4 ora layer3
16
Can it get any more concise and elegant? :)
2024-02-16 10:32
Oswald

Registered: Apr 2002
Posts: 5026
one could just brute force check all logical / add commands and order of layer loads if any of the combination works :) fantastic solution, never thought it can come down to 4 instructions.
2024-02-16 10:41
Krill

Registered: Apr 2002
Posts: 2854
Quoting Oswald
fantastic solution, never thought it can come down to 4 instructions.
Yes!

Only way to speed it up would be to somehow get rid of the second access to layer2.
Either completely or by replacing it with some immediate operation.
2024-02-16 10:59
Oswald

Registered: Apr 2002
Posts: 5026
how about:

lax layer1
axs #$ff-layer2 ;X:=A&X-#{imm}
lda table_+layer3,x

edit: ok final step not gonna fly, but food for thoughts

edit2:

lax layer1
axs #$ff-layer2 ;X:=A&X-#{imm}
txa
ora layer3
2024-02-16 11:17
Krill

Registered: Apr 2002
Posts: 2854
No tables and no index registers, please. =)
2024-02-16 11:26
Oswald

Registered: Apr 2002
Posts: 5026
Quote: No tables and no index registers, please. =)

2024-02-16 13:58
Martin Piper

Registered: Nov 2007
Posts: 645
Hmm, are you sure you want to process one pixel, across three layers, at a time? Not process the whole byte and utilise optimisations processing 4 pixels in one go?
2024-02-16 14:18
ChristopherJam

Registered: Aug 2004
Posts: 1381
Yes, excellent work CyberBrain!

lol @ Oswald

Martin - my and CyberBrain's solutions do process the whole byte and generate 4 pixels in one go.
2024-02-16 15:02
CyberBrain
Administrator

Posts: 392
Thx - same to you! My solution was based on Groepaz's/Noobtrackers very nice solutions and Groepaz's excellent breakdown of the problem, and those solutions also process whole bytes (4 pixels/bitpairs) at a time. Fun little riddle, btw - i'm sure it has no practical use whatsoever and is just a little brain teaser? :)
2024-02-16 15:19
Oswald

Registered: Apr 2002
Posts: 5026
I can imagine 3 layers additive like in many miggy vector fx, can be static scrolling texture, or even scroller.. and as Gunnar wants registers free maybe 3 zoomscrollers ? :P :)
2024-02-16 15:23
chatGPZ

Registered: Dec 2001
Posts: 11145
Ha! Cool stuff. Cjam and Cyberbrain kinda picked up where NT and me stopped, because tired :)

Now i really want to see what you make with it, Krill :=)
2024-02-16 15:26
Martin Piper

Registered: Nov 2007
Posts: 645
CyberBrain...

lda layer1
adc layer2 ; Assume C = 0
and layer2


If:
layer1 = 01
layer2 = 00 (transparent)
layer3 = 00 transparent

Doesn't that produce 0, which forgets that layer 1 already has a colour?
2024-02-16 15:27
chatGPZ

Registered: Dec 2001
Posts: 11145
Quote:
I can imagine 3 layers additive like in many miggy vector fx, can be static scrolling texture, or even scroller.. and as Gunnar wants registers free maybe 3 zoomscrollers ? :P :)

Layered chessboard zoomers - but with Z rotator!

GOGOGO! :D
2024-02-16 15:35
Krill

Registered: Apr 2002
Posts: 2854
Quoting chatGPZ
Quote:
I can imagine 3 layers additive like in many miggy vector fx, can be static scrolling texture, or even scroller.. and as Gunnar wants registers free maybe 3 zoomscrollers ? :P :)

Layered chessboard zoomers - but with Z rotator!

GOGOGO! :D
That was EXACTLY what i had in mind before i stumbled over this problem. :)

(But it was a sidetrack so i just asked the experts rather than spending too much time on this one.)
2024-02-16 15:39
The Syndrom

Registered: Aug 2005
Posts: 56
Quote: CyberBrain...

lda layer1
adc layer2 ; Assume C = 0
and layer2


If:
layer1 = 01
layer2 = 00 (transparent)
layer3 = 00 transparent

Doesn't that produce 0, which forgets that layer 1 already has a colour?


@martin piper
you probably overlooked the twisted input bits of layer 1&2:

>If we use these bitpattern replacements for layer 1 and layer 2:
>layer1_00 = %01 // <- in layer 1: instead of %00 we use %01
>layer2_00 = %01 // <- in layer 2: --||--
>layer1_01 = %00 // <- in layer 1: instead of %01 we use %00
>layer2_10 = %10 // <- in layer 2: we still use %10 as %10 :'(
2024-02-16 16:48
chatGPZ

Registered: Dec 2001
Posts: 11145
Quote:
Can it get any more concise and elegant? :)


Noobtracker to the rescue :)

lda layer1 ; 10/11 (becomes 00/01)
and layer2 ; 01/10 (becomes 00/10)
ora layer3 ; 00/11
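
A small model of the three instructions, for anyone who wants to see it work on whole bytes. This is only a sketch: the converters `to_layer1` and `to_layer2` are made-up helpers that turn standard-encoded layers (00/01 and 00/10) into the bg/fg patterns from the comments above, so the result can be compared with a per-bitpair max:

```python
from itertools import product


def legal_bytes(fg):
    """All 16 bytes whose four bitpairs are each 00 or the given foreground pair."""
    return [sum(pair << shift for pair, shift in zip(pairs, (6, 4, 2, 0)))
            for pairs in product((0, fg), repeat=4)]


def bitpair_max(*layers):
    return sum(max((b >> s) & 3 for b in layers) << s for s in (0, 2, 4, 6))


def to_layer1(std):
    return std | 0xAA                    # 00 -> 10 (transparent), 01 -> 11 (set)


def to_layer2(std):
    return std | ((std >> 1) ^ 0x55)     # 00 -> 01 (transparent), 10 stays 10


def merge_and_ora(layer1, layer2, layer3):
    return (layer1 & layer2) | layer3    # lda layer1 / and layer2 / ora layer3


def check():
    return all(merge_and_ora(to_layer1(a), to_layer2(b), c) == bitpair_max(a, b, c)
               for a in legal_bytes(0b01)
               for b in legal_bytes(0b10)
               for c in legal_bytes(0b11))


if __name__ == "__main__":
    print(check())
```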
2024-02-16 17:03
ChristopherJam

Registered: Aug 2004
Posts: 1381
Gorgeous!
2024-02-16 17:21
CyberBrain
Administrator

Posts: 392
Damn, that's cool! And elegant! This is one of those things, that makes you go "why th did i not see that?!" after you see it :) Well done, the quest has been solved!
2024-02-16 17:36
Oswald

Registered: Apr 2002
Posts: 5026
lda layer1 ; 10/11 (becomes 00/01)
and layer2 ; 01/10 (becomes 00/10)
ora layer3 ; 00/11

I dont see it layer1 and layer2 will make everything 0 where there is a 0, while what is needed a kind of max function per bitpair. and how can an lda perform an operation ?

edit: unless it is using the bitpair scramble offered
2024-02-16 18:22
chatGPZ

Registered: Dec 2001
Posts: 11145
Now that Y is solved, i want to see X. AT X :D
2024-02-16 19:29
JackAsser

Registered: Jun 2002
Posts:
Quote: lda layer1 ; 10/11 (becomes 00/01)
and layer2 ; 01/10 (becomes 00/10)
ora layer3 ; 00/11

I dont see it layer1 and layer2 will make everything 0 where there is a 0, while what is needed a kind of max function per bitpair. and how can an lda perform an operation ?

edit: unless it is using the bitpair scramble offered


This works fine Oswald, very neat solution which I definitely will use at some point.
L3	L2	L1	(L1&L2|L3)
00	01	10	00
00	01	11	01
00	10	10	10
00	10	11	10
11	01	10	11
11	01	11	11
11	10	10	11
11	10	11	11


@Oswald: In layer 1 on=10, off=11, in layer 2 on=01, off=10, in layer 3 on=11, off=00
2024-02-16 20:26
Copyfault

Registered: Dec 2001
Posts: 467
First of all: nice question raised by Krill, and what a brilliant solution found by Noobtracker. I'd opt for granting him access to csdb, did never really understand why he got banned...

In order to unconfuse Oswald, the meaning of the bitpairs should be put right in your explanation, JA.

In Layer 1, a set (or foreground) pixel is encoded by %11, while a transparent one is encoded by %10.
For Layer 2, it will be %10 for a set pixel, while %01 does the job for a transparent one here.
Finally, Layer 3 is just like described: %11 is the bitpair for a set pixel, %00 for a transparent pixel.

The table given by JA is correct. It shows that Noobtracker's golden opcode trio always gives bitpairs that can be used to set the pixels in the standard encoding.


Now let's try to shorten it even more *evilgrinplusshallowlaughter*

CF
2024-02-16 20:30
JackAsser

Registered: Jun 2002
Posts:
Quote: First of all: nice question raised by Krill, and what a brilliant solution found by Noobtracker. I'd opt for granting him access to csdb, did never really understand why he got banned...

In order to unconfuse Oswald, the meaning of the bitpairs should be put right in your explanation, JA.

In Layer 1, a set (or foreground) pixel is encoded by %11, while a transparent one is encoded by %10.
For Layer 2, it will be %10 for a set pixel, while %01 does the job for a transparent one here.
Finally, Layer 3 is just like described: %11 is the bitpair for a set pixel, %00 for a transparent pixel.

The table given by JA is correct. It shows that Noobtracker's golden opcode trio always gives bitpairs that can be used to set the pixels in the standard encoding.


Now let's try to shorten it even more *evilgrinplusshallowlaughter*

CF


Oops! Sorry for the typo!
2024-02-16 23:21
HCL

Registered: Feb 2003
Posts: 717
Are you guys trying to do what Graham did in Dawnfall (1995).. ;)
2024-02-16 23:28
Oswald

Registered: Apr 2002
Posts: 5026
that just shows how brilliant graham was 1995 goddamn
2024-02-16 23:43
Krill

Registered: Apr 2002
Posts: 2854
Quoting HCL
Are you guys trying to do what Graham did in Dawnfall (1995).. ;)
No.

Afaict, Dawnfall had 2 layers, either with one of them having 2 colours (with what appears to be some temporal blur from the previous frame), or both of them blended together.

Unless i missed something, there aren't 3 independent solid stacked layers.
2024-02-16 23:57
Krill

Registered: Apr 2002
Posts: 2854
Quoting chatGPZ
Quote:
Can it get any more concise and elegant? :)


Noobtracker to the rescue :)

lda layer1 ; 10/11 (becomes 00/01)
and layer2 ; 01/10 (becomes 00/10)
ora layer3 ; 00/11
Brilliant! \=D/

Wonder if there are more suitable combinations than these bitpairs...
2024-02-17 04:57
Martin Piper

Registered: Nov 2007
Posts: 645
Applying a bit twiddle before the logical operations is rather similar to how hardware design makes some logical operations simpler, use fewer gates or use gates of a particular type, by introducing not gates before or after.

Interesting.

If the layer values were coming from three routines that calculate bytes at a time for each layer, this would be quite quick for a nice effect.
2024-02-17 19:58
chatGPZ

Registered: Dec 2001
Posts: 11145
Quote:
Wonder if there are more suitable combinations than these bitpairs...

The most obvious thing would be to invert all values, and swap AND with OR (that always works) :)
lda layer1 ; 01/00 (becomes 11/10)
ora layer2 ; 10/01 (becomes 11/01)
and layer3 ; 11/00 (becomes 11/00)

(this opens the door for storing with fixed layer3 in X and SAX)
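
That inverting everything and swapping AND with OR "always works" is just De Morgan's law applied per bit. A minimal sketch to illustrate it (function names are made up for this example):

```python
def stack_and_or(layer1, layer2, layer3):
    return (layer1 & layer2) | layer3    # lda layer1 / and layer2 / ora layer3


def stack_or_and(layer1, layer2, layer3):
    return (layer1 | layer2) & layer3    # lda layer1 / ora layer2 / and layer3


def invert(byte):
    return byte ^ 0xFF                   # eor #$ff


def inverted_variant_matches(layer1, layer2, layer3):
    """Inverted inputs through OR/AND give the inverted AND/OR result."""
    return (stack_or_and(invert(layer1), invert(layer2), invert(layer3))
            == invert(stack_and_or(layer1, layer2, layer3)))


if __name__ == "__main__":
    print(all(inverted_variant_matches(a, b, 0x3C)
              for a in range(256) for b in range(256)))
```

With a constant layer 3 held in X, the final `and` and the store could then be a single SAX, as the post suggests.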
2024-02-17 20:44
chatGPZ

Registered: Dec 2001
Posts: 11145
and noobtracker was busy, so for the records:
; bg   l1   l2   l3
; 00 < 01 < 10 < 11
           ;bg/fg
lda layer1 ;10/11
and layer2 ;01/10
ora layer3 ;00/11
 
; bg   l1   l2   l3
; 00 < 10 < 01 < 11
           ;bg/fg
lda layer1 ;01/11
and layer2 ;10/01
ora layer3 ;00/11
 
 
; bg   l1   l2   l3
; 01 < 00 < 10 < 11
           ;bg/fg
lda layer1 ;11/10
and layer2 ;01/10
ora layer3 ;00/11
 
; bg   l1   l2   l3
; 10 < 00 < 01 < 11
           ;bg/fg
lda layer1 ;11/01
and layer2 ;10/01
ora layer3 ;00/11
 
 
; bg   l1   l2   l3
; 01 < 10 < 00 < 11
           ;bg/fg
lda layer1 ;01/10
and layer2 ;11/00
ora layer3 ;00/11
 
; bg   l1   l2   l3
; 10 < 01 < 00 < 11
           ;bg/fg
lda layer1 ;10/01
and layer2 ;11/00
ora layer3 ;00/11


(plus all the inverted versions, as said above)
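
The list above can be reproduced with a small brute-force search over all bg/fg bitpair choices for layers 1 and 2, keeping layer 3 fixed at 00/11. This is a sketch, not necessarily how the list was actually found; `find_encodings` is a hypothetical name:

```python
from itertools import product


def find_encodings():
    """Return (bg, l1, l2, (bg1, fg1), (bg2, fg2)) tuples that work with AND then ORA."""
    found = []
    for bg1, fg1, bg2, fg2 in product(range(4), repeat=4):
        if bg1 == fg1 or bg2 == fg2:
            continue
        background = bg1 & bg2     # neither layer set
        colour1 = fg1 & bg2        # only layer 1 set
        colour2 = bg1 & fg2        # only layer 2 set
        both = fg1 & fg2           # both set: layer 2 has to win
        # Layer 3 ORs in 11, so the other three results must be 00, 01, 10 in some order.
        if both == colour2 and sorted((background, colour1, colour2)) == [0, 1, 2]:
            found.append((background, colour1, colour2, (bg1, fg1), (bg2, fg2)))
    return found


if __name__ == "__main__":
    for bg, c1, c2, l1, l2 in find_encodings():
        print(f"bg={bg:02b} l1={c1:02b} l2={c2:02b}  "
              f"layer1 {l1[0]:02b}/{l1[1]:02b}  layer2 {l2[0]:02b}/{l2[1]:02b}")
```

Under these constraints the search finds exactly the six variants listed above.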
2024-02-18 12:25
Krill

Registered: Apr 2002
Posts: 2854
Would be cool if the bottom layers 1 and 2 could be EORed.. =)
2024-02-18 16:23
chatGPZ

Registered: Dec 2001
Posts: 11145
Are you crowdsourcing your coding now? :D
2024-02-18 23:42
Krill

Registered: Apr 2002
Posts: 2854
Quoting chatGPZ
Are you crowdsourcing your coding now? :D
Still beats AI when it comes to coding! :)

But seriously, was more like "bummer that it won't work with EOR".
(No formal proof but strong guts feeling.)
2024-02-19 02:38
chatGPZ

Registered: Dec 2001
Posts: 11145
Please define what exactly you want it to do (making the second op an OR and then just use EOR is trivial...)
2024-02-19 08:29
Mixer

Registered: Apr 2008
Posts: 422
This reminds me of the eor fill.
2024-02-19 21:42
JackAsser

Registered: Jun 2002
Posts:
Just had to check how I did my 4-layer chesszoomer in Super Larsson Bros back in 2008. Totally forgot how I did it and only remembered that I used 3 layers in the chars (Stacking MC layers) and one in sprites.

In that code I can only scale down to char-sized checkers and I only move in MC resolution, hence a checker-char for one layer can be one of 8 different chars. I use 8 512-byte tables to figure out the final pixel-values indexed by A+(B<<3)+(C<<6) which combines into ((A&~B)&0x55) | (B&0xaa) | C. I have two sets of these 8 tables, one for opaque rendering and one for additive blending. It's 8 tables because of 8 combinations of odd/even checkers for each of the three layers.
2024-02-21 18:08
Krill

Registered: Apr 2002
Posts: 2854
Quoting JackAsser
... tables to figure out ...
You lost me there. :)
2024-02-21 18:22
Oswald

Registered: Apr 2002
Posts: 5026
"a checker-char for one layer can be one of 8 different chars."

I am lost already here. chars?
2024-02-21 21:08
JackAsser

Registered: Jun 2002
Posts:
Quote: "a checker-char for one layer can be one of 8 different chars."

I am lost already here. chars?


<Off-topic since this is a table based approach>

Checker-char, as the possible values for a byte in a scaled checker board line.
0: 11111111
1: 11111100
2: 11110000
3: 11000000
4: 00000000
5: 00000011
6: 00001111
7: 00111111

If you only scale down to char-size checkers with a motion in x of 2 pixels you have only these 8 combinations.

Three of those layers yields 8*8*8 = 512 combinations to stack them, or blend, or whatever operation you wanna do in those tables.

However, using the lda/and/or trick here is probably faster but requires different scalers to produce the correct bitpairs for each of the layers.
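
For illustration, one such 512-byte opaque table could be generated like this. It is only a sketch: it assumes the eight checker bytes listed above are the layer patterns and uses the A+(B<<3)+(C<<6) indexing and the combine formula from the earlier post; `CHECKER_BYTES` and `build_opaque_table` are made-up names, and the odd/even variants and the additive tables are not covered:

```python
from itertools import product

# The eight possible bytes of a scaled checkerboard line, indexed 0..7 as listed above.
CHECKER_BYTES = [0b11111111, 0b11111100, 0b11110000, 0b11000000,
                 0b00000000, 0b00000011, 0b00001111, 0b00111111]


def build_opaque_table():
    """512 entries, indexed by a + (b << 3) + (c << 6)."""
    table = [0] * 512
    for c, b, a in product(range(8), repeat=3):
        pa, pb, pc = CHECKER_BYTES[a], CHECKER_BYTES[b], CHECKER_BYTES[c]
        # Layer A shows as 01 where B is clear, layer B as 10, layer C as 11.
        table[a + (b << 3) + (c << 6)] = ((pa & ~pb) & 0x55) | (pb & 0xAA) | pc
    return table


if __name__ == "__main__":
    table = build_opaque_table()
    print(len(table), format(table[0 + (4 << 3) + (4 << 6)], "08b"))
```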
2024-02-22 10:50
WVL

Registered: Mar 2002
Posts: 886
But dear Jackasser, with those same chars you can zoom down to 6 pixel wide chars.
2024-02-22 11:33
Oswald

Registered: Apr 2002
Posts: 5026
can we stop calling it chars ? maybe a stride ?
2024-02-22 15:01
JackAsser

Registered: Jun 2002
Posts:
Quote: But dear Jackasser, with those same chars you can zoom down to 6 pixel wide chars.

Yes yes I know but didn't bother to update the text. :)
2024-02-22 15:01
JackAsser

Registered: Jun 2002
Posts:
Quote: can we stop calling it chars ? maybe a stride ?

Technically they ARE chars, but only line 7 visible. :P
2024-02-23 09:09
Oswald

Registered: Apr 2002
Posts: 5026
Quote: Technically they ARE chars, but only line 7 visible. :P

technically they are bytes in a table, which you read to update the "chars".
2024-02-23 15:38
chatGPZ

Registered: Dec 2001
Posts: 11145
Jackasser might be referring to FPP though :)
2024-03-08 22:19
HCL

Registered: Feb 2003
Posts: 717
Again.. Dawnfall surely has 3 independent layers.. I sneaked into the code but haven't quite figured out how it is done.. except that he is *not* drawing lines and eor-filling it :)..
2024-03-09 09:17
Oswald

Registered: Apr 2002
Posts: 5026
I find it hard to believe you dont know how it works, esp since Jackasser (your teammate) has released various effects showcasing the same tech and in my thinking this is common knowledge amongst the top coders :)
2024-03-09 10:29
Krill

Registered: Apr 2002
Posts: 2854
Quoting HCL
Again.. Dawnfall surely has 3 independent layers..
If it has, why is there not a single effect that makes it very clear? :)
2024-03-09 11:59
chatGPZ

Registered: Dec 2001
Posts: 11145
One variant of the rotzoomer has the bars rotating "over" each other (requiring layers) and not just "temporal blur".
2024-03-09 12:56
Krill

Registered: Apr 2002
Posts: 2854
Quoting chatGPZ
One variant of the rotzoomer has the bars rotating "over" each other (requiring layers) and not just "temporal blur".
Have you checked the code? Layers, yes, but only two. Two of the three bars are rather strongly tied together, thus not independent.
2024-03-09 12:58
chatGPZ

Registered: Dec 2001
Posts: 11145
Na, can't be bothered :)

I expect you to implement it for X though :=)
2024-03-09 16:01
Krill

Registered: Apr 2002
Posts: 2854
Quoting chatGPZ
I expect you to implement it for X though :=)
It's on my TODO list, but not placed very prominently. :)
2024-03-09 23:48
HCL

Registered: Feb 2003
Posts: 717
Oh @Oswald, you're doing that trick on me again.. Ok, i will check the code again, and understand it, and then i will tell you exactly how it is done :D
2024-03-09 23:52
Krill

Registered: Apr 2002
Posts: 2854
Quoting HCL
Oh @Oswald, you're doing that trick on me again.. Ok, i will check the code again, and understand it, and then i will tell you exactly how it is done :D
I may still have the "source" to E2E4K somewhere. =)
2024-03-10 16:56
HCL

Registered: Feb 2003
Posts: 717
So, i've checked the code in Dawnfall again, and it turns out Krill was right!! First there is precalced "graphics" for different slopes, seems to be two pages ($200 byte) for each.. Perhaps 16 different versions of it for different zoom-factors i would guess..

Then the actual copying of gfx is different for all five versions of the effect, but *none* of them calculate more than two layers of "gfx" per iteration, and many of them reuse gfx from the last iteration.. like this:

First version (in hires):
lda gfx,x
sta ->
..
lda gfx,x
eor # <-
sta VisualBuffer,y

Second version (multicolor):
lda VisualBuffer,y
asr #$aa ;            <- Effectively clears color 1 and then turns color 2 into color 1
ora gfx,x
sta VisualBuffer,y

Third version:
ldx VisualBuffer,y
lda TransferTable,x ; <- turns colors [0,1,2,3] into [0,0,1,2]
tsx
ora gfx,x
sta VisualBuffer,y

Fourth version:
ldx VisualBuffer+$780,y
lda TransferAndMirrorTable,x ; <- turns colors [0,1,2,3] into [0,0,1,2] and mirrors the byte
tsx
ora gfx,x
sta VisualBuffer,y

Fifth version:
..just like First version but multicolor.. and two versions of the gfx for color 1 and 2.

So.. sorry for interrupting this thread with something that was unrelated. Funny that i didn't figure this out earlier since that demo is almost 30 years old :P. But still, with the knowledge from this thread, we can now do it better with three independent layers!!
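
To make the colour "ageing" in the second and third versions concrete, here is a small model of both steps. It is a sketch only: `asr` models the unintended opcode as A = (A AND #imm) shifted right once, and `transfer` is a guess at what TransferTable holds, based purely on the [0,1,2,3] into [0,0,1,2] comment above:

```python
def asr(a, imm):
    """Model of ASR/ALR #imm: AND with the immediate, then shift right once."""
    return (a & imm) >> 1


AGE_MAP = [0, 0, 1, 2]   # colours [0,1,2,3] become [0,0,1,2]


def transfer(byte):
    """Apply AGE_MAP to each of the four bitpairs of a byte."""
    return sum(AGE_MAP[(byte >> s) & 3] << s for s in (0, 2, 4, 6))


if __name__ == "__main__":
    sample = 0b00011011   # bitpairs 00, 01, 10, 11
    print(format(asr(sample, 0xAA), "08b"), format(transfer(sample), "08b"))
```

In this model `asr #$aa` maps colour 3 to colour 1 as well, whereas the table variant keeps it distinct as colour 2.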
2024-03-11 08:06
Oswald

Registered: Apr 2002
Posts: 5026
lda gfx,x
sta ->
..
lda gfx,x
eor # <-
sta VisualBuffer,y

this is done because no 3 index registers the two lda's need different offset, and also graham is trading speed for memory.

seeing the amount cycles wasted on this it should be no problem doing 3 or even 4 layers.
2024-03-11 08:36
Krill

Registered: Apr 2002
Posts: 2854
Quoting Oswald
seeing the amount cycles wasted on this it should be no problem doing 3 or even 4 layers.
Every additional layer adds considerable cost, eating into the framerate.

It's basically just another 4.change cycles per layer and output byte, but that's the per-output-byte hot path. =)

Edit: And how would you render a 4th layer? ANDing out brick pixels again? Use some kind of dithering?
2024-03-11 11:26
Oswald

Registered: Apr 2002
Posts: 5026
sure I'm not saying extra layer has no cost, I think doing it without much slowdown is possible.

lda layer1,x
ora layer2,y
sta temp

lda temp
ora layer3,x
sta visual,y

+4 cycles, less than 1/3 of a frame, most c64 sceners wouldnt notice such slowdown on a ~ 25 fps effekt, which is where dawnfall chessrot is in its fastest form.
2024-03-11 13:24
Krill

Registered: Apr 2002
Posts: 2854
Quoting Oswald
sure I'm not saying extra layer has no cost, I think doing it without much slowdown is possible.
The original single hires checkerboard effect uses approx. 50% CPU on each of the two stripe layers, so a third one would make it go from, e.g., 25 FPS to 16 FPS. Quite noticeable. :)

And for 3 stacked checkerboards you can expect a third of the original speed, around 8 FPS.

The question about the 4th layer was how you'd render it, given that the 3 colours plus background are already taken.
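
The numbers above follow from a simple model where the frame time grows linearly with the number of stripe layers. A rough sketch of that estimate, where the linear scaling is an assumption and `estimated_fps` is a made-up helper:

```python
def estimated_fps(base_fps, base_layers, layers):
    """Scale the frame rate inversely with the number of stripe layers."""
    return base_fps * base_layers / layers


if __name__ == "__main__":
    # Baseline: 2 stripe layers (one checkerboard) at 25 FPS.
    print(int(estimated_fps(25, 2, 3)))   # one extra stripe layer
    print(int(estimated_fps(25, 2, 6)))   # 3 checkerboards = 6 stripe layers
```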
2024-03-11 13:40
Oswald

Registered: Apr 2002
Posts: 5026
Quote: Quoting Oswald
sure I'm not saying extra layer has no cost, I think doing it without much slowdown is possible.
The original single hires checkerboard effect uses approx. 50% CPU on each of the two stripe layers, so a third one would make it go from, e.g., 25 FPS to 16 FPS. Quite noticeable. :)

And for 3 stacked checkerboards you can expect a third of the original speed, around 8 FPS.

The question about the 4th layer was how you'd render it, given that the 3 colours plus background are already taken.


ok you win, so it would be horribly slow, so for god's sake please nobody code it.

already with 3 layers it doesnt really add much visually, 4th can still eor ora or whatever despite the screen having just 2 bit depth.
2024-03-11 13:49
Krill

Registered: Apr 2002
Posts: 2854
Quoting Oswald
so it would be horribly slow, so for god's sake please nobody code it.
Doesn't need to be in the same size and resolution as Dawnfall did, does it? :)

And 3 stacked rotating zooming checkerboards do look quite good on other platforms (where speed isn't much of an issue).
2024-03-11 14:20
Oswald

Registered: Apr 2002
Posts: 5026
Quote: Quoting Oswald
so it would be horribly slow, so for god's sake please nobody code it.
Doesn't need to be in the same size and resolution as Dawnfall did, does it? :)

And 3 stacked rotating zooming checkerboards do look quite good on other platforms (where speed isn't much of an issue).


you mean flying through 3 level deep rotating chessboards? 3 chessboards means 6 layers, doesnt feel its gonna fly not even in 4x4.
2024-03-11 14:42
chatGPZ

Registered: Dec 2001
Posts: 11145
Quote:
3 chessboards means 6 layers

?
2024-03-11 15:26
Oswald

Registered: Apr 2002
Posts: 5026
Quote: Quote:
3 chessboards means 6 layers

?


dawnfall chessboard is made of 2 rotating stripe layers, which are perpendicular to eachother. you need 6 load operations to make 3 chessboards this way. Krill called 1 stripe layer a layer :)
2024-03-11 15:35
Krill

Registered: Apr 2002
Posts: 2854
Quoting Oswald
dawnfall chessboard is made of 2 rotating stripe layers, which are perpendicular to eachother. you need 6 load operations to make 3 chessboards this way. Krill called 1 stripe layer a layer :)
I differentiate between "stripe layers" (half checkerboards) and "pixel layers" (full checkerboards), admittedly somewhat confusingly.

Anyways, in 4x4 there'd be no speed problem, and with a Dawnfall-like 16x16 multicolour tiles square, there are also techniques to make it somewhat smoother despite a low overall framerate.
2024-03-11 16:39
Oswald

Registered: Apr 2002
Posts: 5026
"3 stacked rotating zooming checkerboards "

so what does this mean? 3 chessboard or 3 stripes ?

looks good on other system? you mean smth like 2nd reality? could emulate amiga bitplane motion "blur", yeah that would look ace, but needs high fps.

4x4 is also $0800 bytes fullscreen like a 16x16 char matrix.
2024-03-11 17:08
Krill

Registered: Apr 2002
Posts: 2854
Quoting Oswald
"3 stacked rotating zooming checkerboards "

so what does this mean? 3 chessboard or 3 stripes ?
Which part of "checker""board" do you not understand? Forget the silly stripes for once, okay? :)

Quoting Oswald
looks good on other system? you mean smth like 2nd reality? could emulate amiga bitplane motion "blur", yeah that would look ace, but needs high fps.
Like the classic 2.5-D flight through the holes of checkerboards, but with added rotation about the depth axis.

Quoting Oswald
4x4 is also $0800 bytes fullscreen like a 16x16 char matrix.
$03e8 = 1000 bytes for 3+1 colours.
2024-03-11 18:26
Oswald

Registered: Apr 2002
Posts: 5026
and which part of this question you did not understand?

"you mean flying through 3 level deep rotating chessboards?"

because answering "you are using confusingly stripes and chessboards" is not an answer to this yes/no question.

frankly its totally pointless and tiresome to conversate with you, you are just looking for argumentative victory points, instead of exchanging information.

I must admit Its a thing I am guilty of myself aswell, maybe I am just looking at a mirror here.
2024-03-11 21:01
chatGPZ

Registered: Dec 2001
Posts: 11145
lol
2024-03-11 23:54
Krill

Registered: Apr 2002
Posts: 2854
Quoting Oswald
frankly its totally pointless and tiresome to conversate with you, you are just looking for argumentative victory points, instead of exchanging information.
You might confuse me with somebody else, and i wasn't aware it's a competition, even after you brought up "you win" a couple of posts above. :)

Anyways, as for some information (or hints thereof): Some back-of-the-envelope calculations i made a while ago seem to indicate (if i interpret them right) that it's quite possible to get a decent frame-rate, by rolling out quite a bit more code and data than Graham could afford in a one-filer, and some eye-fooling partial update techniques.
2024-03-12 01:47
chatGPZ

Registered: Dec 2001
Posts: 11145
That only increases my expectations to see this implemented in your 4k for X :o)
2024-03-12 08:33
Oswald

Registered: Apr 2002
Posts: 5026
Quote: Quoting Oswald
frankly its totally pointless and tiresome to conversate with you, you are just looking for argumentative victory points, instead of exchanging information.
You might confuse me with somebody else, and i wasn't aware it's a competition, even after you brought up "you win" a couple of posts above. :)

Anyways, as for some information (or hints thereof): Some back-of-the-envelope calculations i made a while ago seem to indicate (if i interpret them right) that it's quite possible to get a decent frame-rate, by rolling out quite a bit more code and data than Graham could afford in a one-filer, and some eye-fooling partial update techniques.


I also noticed in other threads you switched into this mode, you are only looking for your argument victory points and not for a meaningful conversation.

And for those points you are right no matter what.

Just look at our last dozen posts, I proposed the doability of 3 layers you say that would be too slow, but your 6 layers, now thats perfectly doable.

LOL
2024-03-12 10:34
Krill

Registered: Apr 2002
Posts: 2854
Quoting Oswald
I also noticed in other threads you switched into this mode, you are only looking for your argument victory points and not for a meaningful conversation.
Please message me next time you notice this (and keep the fuss out of public threads), thanks.
All times are CET.