[futurebasic] Re: [FB] [FB^3] Intel Words and Longs to mac Words and Longs


From: Robert Purves <robert.purves@...>
Date: Mon, 27 Sep 1999 17:29:29 +1200
>>>I need to parse some image files from the dark side where words (2 bytes)
>>>and longs (4 bytes) are the reverse of their Mac counterparts. What I've
>>>come up with is below. Is there a better or faster method?

<reply and examples snipped>

>>Do you think that in FB^3 there would be any speed gains using PPC
>>assembler?

>Very little. A for-next loop with a bunch of POKEs & PEEKs compiles to
>pretty efficient code. At a guess, Andy's code to do that is going to be
>about as good as you'll be able to do in "raw" assembler.

As an old PPC assembly programmer, let me be the first to disagree with
Bill! FB^3 provides a nice tool for experimentation by allowing a mix of
assembler and high-level statements. Furthermore, to put it tactfully,
the PPC code produced by the compiler leaves considerable scope for
optimization.

To motivate others to explore PPC assembler: the code in the ready-to-run
FB^3 program below turns out to be 20-40 times faster than Michael's
originally posted method. On an iMac, it does 25-30 million byte-order
reversals per second.

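Some tricks of the trade are not easily available to a compiler writer. The
loads are done with the "load with update indexed" instructions lwzux and
lhzux, which advance the source pointer as part of the load itself. For the
LONG case, a special PPC instruction, stwbrx (store word byte-reverse
indexed), does the reversal: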

` lwzux r3,r5,r6; load long int
` stwbrx r3,0,r4; store byte-reversed
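
For readers who do not speak PPC, here is a rough portable C sketch of what
that load/store pair accomplishes inside the loop. It is not the FB^3 code
and will not match its speed; the names swap32 and long_byte_rev are made up
for illustration and are not part of FB^3 or the Toolbox. The shifts and
masks do in software what stwbrx does in a single store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reverse the four bytes of one 32-bit value.
   This is the effect stwbrx achieves as part of the store. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) |
           ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) |
           (v << 24);
}

/* Hypothetical C counterpart of the LONG routine: read n 32-bit values
   from src and write them, byte-reversed, to dst.
   src may equal dst, in which case the reversal happens in place. */
static void long_byte_rev(const uint32_t *src, uint32_t *dst, size_t n)
{
    assert(n == 0 || (src != NULL && dst != NULL));
    for (size_t i = 0; i < n; i++) {
        dst[i] = swap32(src[i]);   /* load, reverse, store */
    }
}
```

Each element is read completely before it is written, which is why passing
the same address for source and destination is safe.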

The corresponding SHORT case has to be done less elegantly, because the
FB^3 assembler currently assembles sthbrx wrongly (reported to Staz). The
halfword is stored as two separate bytes instead.
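
The two-byte workaround for the SHORT case can be pictured with the C sketch
below. Again this is only an illustration under my own assumptions: the name
short_byte_rev is invented, and the big-endian load is emulated byte by byte
so that the sketch behaves the same on any host. The variable r3 mirrors the
register used in the assembler version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C counterpart of the SHORT routine. src and dst are raw
   byte addresses; n_words is the number of 2-byte values to reverse.
   src may equal dst for in-place reversal. */
static void short_byte_rev(const uint8_t *src, uint8_t *dst, size_t n_words)
{
    assert(n_words == 0 || (src != NULL && dst != NULL));
    for (size_t i = 0; i < n_words; i++) {
        /* Like lhzux: big-endian 16-bit load, zero-extended to 32 bits. */
        uint32_t r3 = ((uint32_t)src[2 * i] << 8) | src[2 * i + 1];

        dst[2 * i] = (uint8_t)r3;       /* stb r3,(r4): low byte goes first */
        r3 >>= 8;                       /* srawi r3,r3,8 */
        dst[2 * i + 1] = (uint8_t)r3;   /* stb r3,1(r4): old high byte second */
    }
}
```

Because the load zero-extends the halfword, the upper 16 bits of the register
are clear, so the arithmetic shift in the assembler version gives the same
result as a logical shift would.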

Robert Purves

'---------A complete FB^3 program-----------------
_nMax = 3000000' for BIG arrays. Reduce if out-of-memory
DIM gMacLongs&(_nMax), gMacWords%(_nMax)
END GLOBALS

#IF cpuPPC' 68K not allowed
REGISTER OFF
LOCAL FN LONGByteRev(srcPtr&, destPtr&, numLongs&)
'Reads numLongs& 4-byte chunks starting at address srcPtr&.
'Stores them, byte-reversed, starting at address destPtr&.
'If srcPtr&=destPtr& the reversal occurs in place,
' overwriting the original.
LONG IF numLongs&>0
` lwz r6,^numLongs&
` mtspr ctr,r6; loop count
` addi r6,0,4; li r6,4
` lwz r5,^srcPtr&
` addi r5,r5,$FFFC; subi r5,r5,4
` lwz r4,^destPtr&
`lLoop
` lwzux r3,r5,r6; load long int
` stwbrx r3,0,r4; store byte-reversed
` addi r4,r4,4
` bc 16,0,lLoop; bdnz lLoop
END IF
END FN

REGISTER OFF
LOCAL FN SHORTByteRev(srcPtr&, destPtr&, numWords&)
'Reads numWords& 2-byte chunks starting at address srcPtr&.
'Stores them, byte-reversed, starting at address destPtr&.
'If srcPtr&=destPtr& the reversal occurs in place,
' overwriting the original.
LONG IF numWords&>0
` lwz r6,^numWords&
` mtspr ctr,r6; loop count
` addi r6,0,2; li r6,2
` lwz r5,^srcPtr&
` addi r5,r5,$FFFE; subi r5,r5,2
` lwz r4,^destPtr&
`wLoop
` lhzux r3,r5,r6; load half (i.e. SHORT)
//` sthbrx r3,0,r4  ;<--BUG   assembles wrongly as sthbrx r0,r3,r4
` stb r3,(r4); therefore store..
` srawi r3,r3,8; ..as two..
` stb r3,1(r4); ..reversed bytes

` addi r4,r4,2
` bc 16,0,wLoop; bdnz wLoop
END IF
END FN
#ENDIF

DIM j&,now&
WINDOW 1
FOR j& = 0 to _nMax - 1' make _nMax example values
gMacLongs&(j&) = &H01020304
gMacWords%(j&) = &H0102
NEXT

now& = FN TICKCOUNT
FN LONGByteRev(@gMacLongs&(0), @gMacLongs&(0), _nMax)
now& = FN TICKCOUNT - now&
PRINT int(1e9*now&/60.0/_nMax) " ns each 4-byte variable"

now& = FN TICKCOUNT
FN SHORTByteRev(@gMacWords%(0), @gMacWords%(0), _nMax)
now& = FN TICKCOUNT - now&
PRINT int(1e9*now&/60.0/_nMax) " ns each 2-byte variable"

DO
HANDLEEVENTS
UNTIL FN BUTTON
'----------------------------------------------------
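
A note on reading the printed timings, as a small C sketch of the same
arithmetic the PRINT lines perform. It assumes, as the program does, that
one tick is 1/60 second; the function names are made up for illustration.

```c
#include <assert.h>

/* Hypothetical helper: nanoseconds per item, given elapsed ticks
   (assumed to be 1/60 s each) and the number of items processed.
   Same formula as the PRINT statements: 1e9 * ticks / 60 / n. */
static double ns_per_item(long ticks, long n_items)
{
    assert(n_items > 0);
    return 1e9 * (double)ticks / 60.0 / (double)n_items;
}

/* Hypothetical helper: items processed per second for the same inputs. */
static double items_per_second(long ticks, long n_items)
{
    assert(ticks > 0);
    return 60.0 * (double)n_items / (double)ticks;
}
```

With _nMax = 3000000, an elapsed time of 6 ticks works out to about 33 ns
per item, or 30 million items per second. A single tick corresponds to about
5.6 ns per item, so the printed figures are only that fine-grained; a larger
_nMax or repeated runs would be needed for a finer measurement.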