Re: A89: Invalid opcodes (was Re: Addressing)
> Correction:
> lsl.w #3,d2
> move.w d2,d3
> add.w d2,d2
> add.w d2,d3
> lea Pic0(pc,d3.w),a0
> is the fastest(?) possible sequence (if allowed). (I haven't looked at using
> a LUT though.)
Fastest, but I believe a byte larger - I suppose it just depends on what you
want.
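For anyone checking the arithmetic, here is a minimal C sketch that models the quoted sequence on 16-bit values. It only mirrors what each instruction does to d2/d3; it is not TI-89 code. The function names are made up for illustration, and the +12 for plane 1 follows the "12 bytes each, in pairs" layout described further down.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the quoted sequence: returns the value d3
   holds just before the lea, given the picture index n in d2. */
uint16_t offset_shift_add(uint16_t d2)
{
    uint16_t d3;
    d2 = (uint16_t)(d2 << 3);   /* lsl.w #3,d2  : d2 = 8*n  */
    d3 = d2;                    /* move.w d2,d3 : d3 = 8*n  */
    d2 = (uint16_t)(d2 + d2);   /* add.w d2,d2  : d2 = 16*n */
    d3 = (uint16_t)(d3 + d2);   /* add.w d2,d3  : d3 = 24*n */
    return d3;
}

/* Plane 1 of the same picture sits 12 bytes after plane 0
   (assumes the 12+12 byte pair layout from the thread). */
uint16_t plane1_offset(uint16_t n)
{
    return (uint16_t)(offset_shift_add(n) + 12);
}
```

So the sequence computes 24*n as 8*n + 16*n, which matches one 24-byte pair per picture.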
> The pictures are 12 bytes each but come in pairs (plane 0/plane 1).
>
> Both 'lsl.w #1,d2' and 'add.w d2,d2' are single word instructions, but
> 'add.w d2,d2' is twice as fast. I see no reason for using 'lsl.w #1,d2'
> here, or anywhere else! In fact, 'add.w d2,d2 / add.w d2,d2' is faster than
> 'lsl.w #2,d2'! :)
Yup, it's supposed to be faster because the 89's 68000 lacks a barrel shifter -
but again, I was thinking size-wise (I was writing some PJ VAT/menu routines at
the time, which obviously don't require speed).
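To put numbers on the trade-off, here is a rough C sketch of the cycle costs. It assumes the usual 68000 timing-table figures as I recall them: 4 clocks for 'add.w Dn,Dn' and 6+2n clocks for 'lsl.w #n,Dn' with an immediate count of 1 to 8. The function names are invented for the example, and the figures are worth checking against the Motorola manual before relying on them.

```c
#include <assert.h>

/* Assumed 68000 timing: lsl.w #n,Dn costs 6 + 2n clocks (n = 1..8)
   and always occupies one instruction word. */
int lsl_w_cycles(int n)
{
    return 6 + 2 * n;
}

/* Assumed 68000 timing: each add.w Dn,Dn costs 4 clocks and one word,
   so shifting left by n with adds costs 4n clocks and n words. */
int add_chain_cycles(int n)
{
    return 4 * n;
}

/* Nonzero when the chain of adds is strictly faster than the shift. */
int adds_are_faster(int n)
{
    return add_chain_cycles(n) < lsl_w_cycles(n);
}
```

Under those assumptions the adds win for shifts of 1 and 2, tie at 3 (12 clocks each, but three words against one), and lose from 4 up, so the size-versus-speed choice only really matters for small shift counts.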
> All the "a few clocks here and a few clocks there" optimizations are pretty
> useless, unless KeysDezez inlines and optimizes put_sprite_mask for his
> special sprite size.
And since he's using a mask of $FF, there's no point in using
put_sprite_mask when he can just remove masking from the routine altogether.
If you want any speed at all, avoid graphlib putsprites.
> AFAIK, the Motorola syntax for absolute memory addressing is 'abs', not
> '(abs)'. The parentheses are not required, not even recommended.
Really? I thought that absolute addressing was always used as '(abs)' -
then I stand corrected again. I still used '(abs)' to avoid screwing up my
z80, though, and to make the source less confusing.
> > > jsr (a2)
But I suppose that you need parentheses for register indirection?
> Compared to the original two 'jsr graphlib::put_sprite_mask', it saves a few
> words, but I noticed later that it's slower.
The usual speed/size trade-off. I imagine that he does want speed here.
> Note that the optimization 'jsr/rts' => 'jmp' (and 'bsr/rts' => 'bra') saves
> 16 clocks!
Yup, I love that one - save clocks and bytes. Use it!
> I believe you mean 24*(d2+1). But +12 is correct. The routine gets the
> second plane of the "same" sprite.
Thanks for the comments - but then why doesn't it work? =)
-Scott