Re: A89: TIGCCLIB 2.0 released!
Hi!
> Considering that all the HW2 grayscale routines are branched off of the
> one I wrote, I felt inclined to look at this one.
Nearly all the grayscale routines I have seen so far are exactly the same
as mine.
> Since I have a special version of VTI meant to emulate HW2 to some degree
> AND a HW2 calc (for those of you who don't know, my calc was stolen and
> my new one is HW2), I have to point out that it's gonna produce some
> pretty shitty grayscale.
My previous grayscale routine (the one from TIGCCLIB 1.1) produces quite a
fine picture on HW2-VTI, but a shitty one on a real HW2. The new grayscale
routine produces shitty (flickering) grayscale on HW2-VTI, but my beta
testers told me that the picture is absolutely stable on a real HW2 (I also
checked the grayscale demo on one HW2 calc, and it was stable).
> I don't mean to insult you or your code; technically, it's somewhat
> functional. But, as you know, we're copying a buffer to LCD_MEM every
> fourth time int1 is called. That's incredible processor intensive, even
> with our extra 2 MHz on HW2.
Yes.
> [on a side note: I see __gmax is being used as the value to reset the
> counter, __gcnt, to. Why?
You may have noticed that __gmax is 2 on HW2 and 3 on HW1. Look at the
instruction move.l %d1,__gcnt. It also sets __gmax...
> a higher number could be used to intentionally flip between two
> images), then it's understandable - but you don't do that.
I am sorry, but I can't understand what you are talking about in this
sentence...
> So why not make it a constant 3
Because it is not a constant. Btw, when it WAS a simple constant 3, it
flickered enormously on a real HW2 calc (but not on HW2-VTI). I concluded
that it MUST be 2 on HW2 and 3 on HW1. Julien Muchembled told me the same
thing. I don't know why.
> and save the few precious clock cycles necessary to read
> from a relative address?
A loss of 10 clock cycles 87.5 times per second is a degradation of
only 0.0087%, which is really not too much...
> While that routine is optimal size-wise, it's just too slow to be
> anything but extremely ugly. First of all, you can't afford to have the
> interrupt determine which routine to use. Instead, have two separate
> routines, and have the grayscale manager determine which one to install.
Again, the speed degradation is about 0.013%, by my calculation.
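The two designs being compared here can be sketched in plain C. This is only an illustration under assumptions: gray_int_hw1, gray_int_hw2, gray_install and the call counters are hypothetical names invented for this sketch, not the actual TIGCCLIB routines, and the handler bodies are empty stand-ins.

```c
#include <assert.h>

/* Hypothetical handler type; the real routines are interrupt handlers. */
typedef void (*gray_handler_t)(void);

static int hw1_calls = 0;
static int hw2_calls = 0;

/* Stand-in for the HW1-specific plane switch. */
static void gray_int_hw1(void) { hw1_calls++; }

/* Stand-in for the HW2-specific buffer copy to LCD_MEM. */
static void gray_int_hw2(void) { hw2_calls++; }

/* Variant 1: a single handler that tests the hardware version
   every time it runs (a few extra cycles per interrupt). */
static int hw_version = 1;

static void gray_int_branching(void)
{
    if (hw_version == 2)
        gray_int_hw2();
    else
        gray_int_hw1();
}

/* Variant 2: the test is done once, when grayscale is turned on,
   and only the matching handler is installed. */
static gray_handler_t installed_handler = gray_int_hw1;

static void gray_install(int hw)
{
    assert(hw == 1 || hw == 2);
    installed_handler = (hw == 2) ? gray_int_hw2 : gray_int_hw1;
}

/* Simulates one interrupt arriving for variant 2. */
static void simulate_interrupt(void)
{
    installed_handler();
}
```

Variant 2 removes the per-interrupt test at the cost of keeping two separate routines; whether that matters depends on how small the saving is compared with the rest of the handler.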
> Second, the loop that I quoted above MUST be unrolled.
Universal OS and DoorsOS do not unroll this loop either!
> I recommend fully unrolling it, or, if you insist on maintaining some
> kind of balance between speed and size, limit it to only 8 loops at
> most.
I had an idea for keeping the size small with an unrolled loop: allocate a
buffer and construct the unrolled loop in it in the initialization routine
:-) But the authors of the shells (Julien and Xavier) told me that, in
their experience, unrolling is not necessary.
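For readers unfamiliar with the term, here is a minimal C sketch of what unrolling a copy loop by 8 means. It is not the library's assembly routine. The plane size of 3840 bytes (960 longwords) is an assumption, and copy_plane_rolled and copy_plane_unrolled8 are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed plane size: 3840 bytes, copied as 960 longwords. */
#define PLANE_LONGS 960

/* Rolled version: one loop test and branch per longword copied. */
static void copy_plane_rolled(uint32_t *dst, const uint32_t *src)
{
    for (size_t i = 0; i < PLANE_LONGS; i++)
        dst[i] = src[i];
}

/* Unrolled by 8: one loop test and branch per eight longwords,
   paid for with a larger loop body. */
static void copy_plane_unrolled8(uint32_t *dst, const uint32_t *src)
{
    assert(PLANE_LONGS % 8 == 0);   /* no remainder loop is needed */
    for (size_t block = 0; block < PLANE_LONGS / 8; block++) {
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
    }
}
```

Both functions produce the same result; the unrolled one only trades code size for fewer loop iterations, which is the balance between speed and size discussed above.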
> That loop is currently using up 1,442
> clock cycles and being called 87.5 times per second. That's 126,179
> clock cycles per second - MASSIVE speed loss.
It seems massive, but when you turn it into a percentage, it is 1.26% :-)
Not noticeable in practice.
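The percentages in this thread can be reproduced with a small helper. This is a sketch, assuming a 10 MHz clock (the figure that matches the numbers quoted here); overhead_percent is a hypothetical name, not part of any library.

```c
#include <assert.h>

/* Share of CPU time, in percent, spent in code that costs
   cycles_per_call clock cycles and runs calls_per_second times
   per second on a CPU clocked at clock_hz. */
static double overhead_percent(double cycles_per_call,
                               double calls_per_second,
                               double clock_hz)
{
    assert(clock_hz > 0.0);
    return 100.0 * cycles_per_call * calls_per_second / clock_hz;
}
```

With these assumptions, overhead_percent(10, 87.5, 10e6) gives 0.00875 and overhead_percent(1442, 87.5, 10e6) gives about 1.26. On a 12 MHz clock the second figure drops to about 1.05.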
> If I'm reading your code correctly, there is one MAJOR BUG in your
> grayscale code. On HW2, LCD_MEM is constantly being written to and
> therefore anything drawn directly there fails. However, on HW1, LCD_MEM
> is plane 0. Therefore, many programmers assume that anything written to
> LCD_MEM will draw to plane 0. TIOS functions, by default, draw to
> LCD_MEM. Thus many programmers assume they can call drawstrxy or any
> other TIOS function right after turning on grayscale and it will draw
> to the screen, which is NOT true on HW2!
This is not a bug but a feature; it is noted in the documentation. To quote:
"In a grayscale mode, don't assume that any plane is on LCD_MEM due to
HW2 support..."
> I suggest that you call PortSet and set the drawing area to plane 0
> when grayscale is turned on, and PortRestore() when it is turned off.
Good idea. Btw, you can see that PortRestore is already called when
grayscale is turned off.
> You may want to leave out those port changes and put them in the
> documentation, as a few programs won't need them, but that number is so
> few and the number of bad programmers who will ignore documentation is
> so many that I highly recommend that you have the routines do it.
I accept your idea. Btw, the common rule is: "if really nothing else helps,
then read the documentation" :-)
> And now, a question or two for the master: Is there any way I can use
> constants/defines from the C code in inline ASM? Since the ASM code
> is in a string, it's not going to be modified by the compiler in any
> way, so anything in there will be interpreted literally, correct?
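There IS a method, but it is quite awkward (see, for example, my definition
of the _ram_call macro in compat.h). Define the following two macros (the
extra level is needed so that the argument is expanded before it is turned
into a string; with a single #x macro you would get "LCD_MEM" instead of
its value):
#define _xstr_(x) #x
#define _str_(x) _xstr_(x)
Suppose that you have the following define:
#define LCD_MEM 0x4C00
Then, the following statement:
asm("lea "_str_(LCD_MEM)",%a0")
will expand to
asm("lea 0x4C00,%a0")
because adjacent string literals are joined by the compiler (no ## is
needed; that operator only works inside macro definitions).
I hope that this is what you wanted to ask.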
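The stringification trick can be checked on any C compiler without writing real inline assembly. This is a sketch: it builds only the instruction text, it uses a two-level macro so that LCD_MEM is replaced by its value before being turned into a string, and _xstr_ and lea_lcd_mem are hypothetical names introduced for the example.

```c
/* Inner macro: turns its argument into a string exactly as written. */
#define _xstr_(x) #x

/* Outer macro: expands the argument first, then stringifies it. */
#define _str_(x) _xstr_(x)

#define LCD_MEM 0x4C00

/* Adjacent string literals are concatenated at compile time, so this
   is the text an asm(...) statement built the same way would receive. */
static const char *lea_lcd_mem(void)
{
    return "lea " _str_(LCD_MEM) ",%a0";
}
```

Here _str_(LCD_MEM) yields "0x4C00", while _xstr_(LCD_MEM) used directly yields "LCD_MEM", because the operand of # is not macro-expanded.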
> Also, I see you don't use volatile anywhere - when exactly will
> the compiler attempt to optimize the ASM code itself (and possibly
> screw it up)?
The compiler will not touch any ASM code which is defined outside of
any function. Only code which is embedded into functions may possibly
be optimized.
> And that's my lengthy discourse for the day - Zeljko, I mean no offence
> to your code; I'm just providing some constructive criticism in an
> attempt to prove it
No problem, I am open to any criticism, especially the constructive
kind :-) And I think that you meant to say "improve" instead of "prove"...
> (and get it compatible with my calc! =)
Have you really tried it on your real calc or only on HW2-VTI? As I
said, HW2-VTI behaves wrongly!!!
Cheers,
Zeljko Juric