Re: A89: matrix
That's awesome! But I think I found one error that (hehe) everyone can
assume is a typo, right?
> dbra d1,loop ; Do the loop if needed
Isn't d1 being used as one of the transfer registers? I think you meant to
put d0 in here. Correct me if I'm wrong, but I think d1 would change its
value every loop, and you'd end up moving too much or too little of the
memory you wanted to move.
On a side note: why not use the register a7?
-Miles Raymond EML: m_rayman@bigfoot.com
ICQ: 13217756 IRC: Killer2 AIM: KilIer2 (kilier2)
http://www.bigfoot.com/~m_rayman/
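
A minimal sketch of the quoted loop with the counter moved to d0, as suggested above. It assumes d0.w holds the number of 96-byte chunks minus one and that a0 and a1 are word-aligned source and destination pointers; those conventions are assumptions, not something the original post states. The stores also differ from the quoted code: on the 68000 the register-to-memory form of movem does not accept (An)+, so this sketch stores at fixed displacements and advances a1 with lea, which costs a few extra instruction words per pass.

```asm
; Sketch only: copies (d0.w + 1) * 96 bytes from (a0) to (a1).
; Assumed on entry: d0.w = chunk count - 1, a0 = source, a1 = destination.
        movem.l d1-d7/a2-a6,-(sp)       ; save the 12 transfer registers
loop:
        movem.l (a0)+,d1-d7/a2-a6       ; read 48 bytes, a0 advances by 48
        movem.l d1-d7/a2-a6,(a1)        ; store them at the destination
        movem.l (a0)+,d1-d7/a2-a6       ; read the next 48 bytes
        movem.l d1-d7/a2-a6,48(a1)      ; store them 48 bytes further on
        lea     96(a1),a1               ; advance the destination pointer
        dbra    d0,loop                 ; d0 is outside the register list
        movem.l (sp)+,d1-d7/a2-a6       ; restore the saved registers
```

Because d0 is not in the movem register list, its count survives each pass, and dbra runs the body d0.w + 1 times.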
----- Original Message -----
From: Zoltan Kocsi <zoltan@bendor.com.au>
To: <assembly-89@lists.ticalc.org>
Sent: Thursday, July 08, 1999 5:24 AM
Subject: Re: A89: matrix
> Olle Hedman writes:
> > you don't have much of a choice other than move.l (Ax)+,(Ay)+
>
> Actually, you do. If you have lots of stuff to move then a little more
> cost on the preparation/cleanup side doesn't matter if you can save
> heaps on the actual transfer. In that case this might help:
>
> movem.l d1-d7/a2-a6,-(sp) ; Save all registers
> loop:
> movem.l (a0)+,d1-d7/a2-a6 ; Suck in 48 bytes at once
> movem.l d1-d7/a2-a6,(a1)+ ; Store them at destination
> movem.l (a0)+,d1-d7/a2-a6 ; Suck next 48 bytes in
> movem.l d1-d7/a2-a6,(a1)+ ; Store at destination
> ...
> dbra d1,loop ; Do the loop if needed
> movem.l (sp)+,d1-d7/a2-a6 ; Restore registers
>
> With the move.l method you waste 1 bus cycle for the insn fetch for
> every 4 bytes moved. With the movem.l method you waste 4 cycles for
> every 48 bytes moved, that is, your bus bandwidth loss goes from 20%
> to only 7.7%. Of course if your block size is known a priori and it
> is small enough to warrant a loop unroll, then your d0 becomes free
> so you can move up to 52 bytes per 2 insns, which further decreases
> the bandwidth waste to 7.1%. If you need absolutely everything that
> is possible, you can disable the interrupts, save a7 to some known
> location and include a7 in the transfer too - your wasted bandwidth
> will reach an all-time low of 6.7%.
>
> It's an old trick which was worth doing on a 68000. With the advent of
> the 68010 it went out of fashion, because the 68010 and all later CPUs
> have a loop mode (or equivalent) where data blocks can be moved
> without insn fetches slowing down the transfer (i.e. your copy speed
> is only limited by the actual transfer speed of the bus). On the old
> 68000, however, the above method was quite popular when you needed
> those few extra bus cycles.
>
> Regards,
>
> Zoltan
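
The percentages in the quoted message can be reproduced by counting 16-bit bus cycles. This is a rough model under stated assumptions, not a cycle-exact timing: every long word moved costs two read cycles and two write cycles, move.l (Ax)+,(Ay)+ is one instruction word, each movem.l is two instruction words (opcode plus register mask), and loop overhead is ignored. The symbols F (instruction-fetch cycles) and D (data cycles) are introduced here for illustration.

```latex
\text{waste} = \frac{F}{F + D}
\qquad
\begin{aligned}
\texttt{move.l}\ (4\ \text{bytes}):             &\quad \tfrac{1}{1 + 4}  = 20\% \\
\texttt{movem.l},\ 12\ \text{regs}\ (48\ \text{bytes}): &\quad \tfrac{4}{4 + 48} \approx 7.7\% \\
\texttt{movem.l},\ 13\ \text{regs}\ (52\ \text{bytes}): &\quad \tfrac{4}{4 + 52} \approx 7.1\% \\
\texttt{movem.l},\ 14\ \text{regs}\ (56\ \text{bytes}): &\quad \tfrac{4}{4 + 56} \approx 6.7\%
\end{aligned}
```

Here D equals the byte count because each byte is carried once on a read and once on a write, at two bytes per bus cycle.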