[futurebasic] [FB^3] Superfast counting


From: Robert Purves <robert.purves@...>
Date: Fri, 30 Jun 2000 19:23:24 +1200
Aficionados of fast counting, and other crazed speed addicts like myself,
may enjoy this comparison between counting with FB^3's FOR/NEXT and
counting with optimised PPC assembler.

Optimisations in the assembly code include:
• register operations only, in the loop (elementary, of course)
• loop unrolling by a factor of 20 minimises slowing by the loop counter,
  comparison and branch instructions.
• use of two counters. Processors with two integer arithmetic units (G3,
  G4, don't know about 604) can execute two integer instructions
  simultaneously.
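
For readers who do not read PPC assembler, the unrolling and the two-counter trick listed above can be sketched in C. This is only an illustrative analogue, not the code that was timed: the name count_unrolled and the STEP2 macro are made up for this sketch, and an optimising compiler is free to collapse the whole loop into a constant, so it shows the structure rather than the speed.

```c
#include <stdint.h>

/* One pair of increments. The two counters do not depend on each other,
   so a processor with two integer units can execute both together. */
#define STEP2 do { c1 += 1; c2 += 1; } while (0)

/* Hypothetical C analogue of the assembler loop: counts in blocks of 20,
   so nLoop is effectively rounded down to a multiple of 20. */
uint32_t count_unrolled(uint32_t nLoop)
{
    uint32_t iters = nLoop / 20;   /* loop body runs nLoop/20 times */
    uint32_t c1 = 0, c2 = 0;       /* two independent counters */

    while (iters > 0) {
        iters--;                               /* one decrement and one branch... */
        STEP2; STEP2; STEP2; STEP2; STEP2;     /* ...per 20 increments */
        STEP2; STEP2; STEP2; STEP2; STEP2;
    }
    return c1 + c2;                /* total count */
}
```

For example, count_unrolled(45) returns 40, because only whole blocks of 20 are counted.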

The assembler loop counts at more than 600 MHz on my 350 MHz iMac. It seems a long time
since I was amazed to find that an Apple II could count at a couple of
hundred kHz in 6502 assembler. The iMac is a mere 3000 times faster.

Remember to set _myClockMHz to the clock frequency of your Mac, and to run
this in Console (fast) mode.

Robert P.



'--- FB^3 program for Console (Fast) mode------
_myClockMHz = 350 // change to correspond to your machine
_nLoops     = 800000000
register on

local fn ASMLoops(nLoop as UInt32)
dim t as UInt32
t = fn tickcount
` mr         r31,^nLoop
` addi       r30,0,20 ; loop unrolled 20x
` divwu      r31,r31,r30 ; nLoop / 20
` addi       r3,0,0 ; init count1
` addi       r4,0,0 ; init count2
`loop
` addic.     r31,r31,-0x0001 ; decr loop counter
` addi       r3,r3,1
` addi       r4,r4,1
` addi       r3,r3,1
` addi       r4,r4,1
` addi       r3,r3,1
` addi       r4,r4,1
` addi       r3,r3,1
` addi       r4,r4,1
` addi       r3,r3,1
` addi       r4,r4,1
` addi       r3,r3,1
` addi       r4,r4,1
` addi       r3,r3,1
` addi       r4,r4,1
` addi       r3,r3,1
` addi       r4,r4,1
` addi       r3,r3,1
` addi       r4,r4,1
` addi       r3,r3,1
` addi       r4,r4,1
` bc         12,1,loop ; bgt
` add        r3,r3,r4 ; total count in r3
end fn = fn tickcount - t // elapsed time

local fn FBLoops(nLoop as UInt32)
dim j as UInt32, t as UInt32
t = fn tickcount
for j = 1 to nLoop
next j
end fn = fn tickcount - t  // elapsed time


local fn Report(theMethod as str15, nLoop as UInt32, ticks as UInt32)
dim nanosec as double
nanosec = ticks/60e-9/nLoop // ticks (1/60 s each) -> nanoseconds per count
print theMethod
print using "###.##"; nanosec " ns   ";
print using "###"; 1e3/nanosec " MHz   ";
print using "##.##"; 1e3/nanosec/_myClockMHz " * clock speed"
print
end fn

fn Report( "FBLoops", _nLoops/2, fn FBLoops(_nLoops/2) )
fn Report( "ASMLoops", _nLoops, fn ASMLoops(_nLoops) )
'---------------------------------------------------------
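
The arithmetic in fn Report can be checked by hand. The sketch below repeats it in C, assuming 60 ticks per second as the program does; the struct and function names are invented for illustration, and the figure of 80 ticks is a made-up input, not a measured result.

```c
#include <stdint.h>

/* Hypothetical container for the three figures that fn Report prints. */
typedef struct {
    double ns_per_count;   /* nanoseconds per count */
    double mhz;            /* counting rate in MHz */
    double clock_ratio;    /* counting rate as a multiple of the clock */
} CountRate;

/* Same conversion as fn Report: ticks are taken as 1/60 s each. */
CountRate count_rate(uint32_t ticks, uint32_t nLoop, double clockMHz)
{
    CountRate r;
    r.ns_per_count = (double)ticks / 60.0 * 1e9 / (double)nLoop;
    r.mhz          = 1e3 / r.ns_per_count;   /* 1 count per ns = 1000 MHz */
    r.clock_ratio  = r.mhz / clockMHz;
    return r;
}
```

With the made-up input of 80 ticks for 800000000 counts on a 350 MHz machine, this gives about 1.67 ns per count, 600 MHz, and about 1.71 times the clock speed.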