[futurebasic] 0-based and 1-based arrays


From: Robert Purves <robert.purves@...>
Date: Wed, 27 Apr 2005 18:45:49 +1200
An occasional essay, meant to inflame interesting discussion.

0-based and 1-based arrays
--------------------------
In a typical 0-based array of n elements, the allowable index range is 
from 0 to n - 1.
In a typical 1-based array of n elements, the allowable index range is 
from 1 to n.

0-based arrays are compulsory in C, and hence in Mac Toolbox calls.
1-based arrays are the default in Fortran (which also lets a declaration 
specify a different lower bound), and are compulsory in MATLAB and in 
some scripting languages.
Some languages, such as Visual Basic, allow a choice of 0- or 1-based.
Some languages, such as Pascal, allow upper and lower array bounds to 
be any integer or (almost) any scalar.

Arrays in FB and C
------------------
FB's arrays are 0-based, but with an extra element. Storage is reserved 
for 1 more element than the stated size. The declaration
    dim as long  myArray(10)
reserves memory for the 11 elements myArray(0) through myArray(10).
The allowable index range is from 0 to n, where n is the declared size. 
This unusual definition may have been chosen -- many years ago -- so 
that simple programs could be written with either 0-based code or 
1-based code.


Arrays in C are typical 0-based arrays. The declaration
    int  myArray[10];
reserves memory for the 10 elements myArray[0] through myArray[9].
The allowable index range is from 0 to n - 1.
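
The two rules can be written out as arithmetic. The following is a 
minimal sketch in C; the helper names are hypothetical, invented for 
this illustration, and are not part of FB or of any C library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, named for this sketch only. */

/* FB: "dim myArray(n)" reserves n + 1 elements, indices 0..n. */
size_t fb_element_count(size_t declared) { return declared + 1; }
size_t fb_last_index(size_t declared)    { return declared; }

/* C: "int myArray[n];" reserves n elements, indices 0..n-1. */
size_t c_element_count(size_t declared)  { return declared; }
size_t c_last_index(size_t declared)
{
    assert(declared >= 1);  /* a C array has at least one element */
    return declared - 1;
}
```

For the declared size 10 used above, the FB helpers give 11 elements 
with last index 10, and the C helpers give 10 elements with last index 
9.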


Failure to understand this difference between FB and C is responsible 
for many FB bugs. Commonly they arise in the course of
  - passing an array to a Toolbox function, or
  - translating a C struct or Toolbox function declaration into FB.
The outcome is often a nasty kind of "off-by-one" error, in this case 
reading or writing an element that lies just outside the array's 
declared bounds.
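
One way to see the translation hazard is to model it in C itself. The 
sketch below assumes that an array field inside an FB record follows 
the same "n + 1 elements" rule as an ordinary FB array; the struct and 
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C record, as a header might declare it. */
typedef struct {
    int32_t values[10];  /* 10 elements: values[0]..values[9] */
    int32_t flags;
} CRecord;

/* Models a naive FB translation "dim as long values(10)": under the
   stated assumption it reserves 11 elements, so every later field
   sits 4 bytes further along than the C side expects. */
typedef struct {
    int32_t values[11];
    int32_t flags;
} NaiveFBRecord;

/* Size to write in the FB dim statement for a C array of c_count elements. */
size_t fb_dim_for_c_count(size_t c_count)
{
    assert(c_count >= 1);
    return c_count - 1;
}
```

Here offsetof(CRecord, flags) is 40 while offsetof(NaiveFBRecord, 
flags) is 44, so the two sides would disagree about where flags lives. 
Writing the FB size as fb_dim_for_c_count(10), that is 9, keeps the 
layouts in step.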

The effects of accessing an out-of-bounds array element are undefined. 
This kind of bug may cause an immediate crash (if you're lucky) or 
random bad behaviour (if you're not). It can be quite a challenge to 
debug random crashes of a program that otherwise seems to work 
properly.

1-based and 0-based code in FB
------------------------------
Some programmers write 1-based code in FB. In so doing they ignore the 
0th element and take advantage of FB's "extra" array element.

  _maxSize = 100
  dim as long  myArray(_maxSize)
  dim as long  j, n

  n = something  // 0 ≤ something ≤ _maxSize
  long if ( n != 0 )
    for j = 1 to n
      // do something with myArray(j)
    next
  end if

Every sensible programmer turns on Array bounds checking (at least 
during development), because it can catch so many easily-fixed 
blunders. With the 1-based convention above, it would in general be a 
bug if the program accessed myArray(0), but the bounds-checker cannot 
flag that error, because index 0 lies within the declared bounds.
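
The blind spot can be sketched in C. This only illustrates the idea of 
a range check; it makes no claim about how FB's compiler implements its 
bounds checking, and all three function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when index lies within [first, last]. */
bool index_in_bounds(long index, long first, long last)
{
    assert(first <= last);
    return index >= first && index <= last;
}

/* What a checker sees if it knows only the declaration
   "dim myArray(100)": legal indices are 0..100. */
bool declared_ok(long index) { return index_in_bounds(index, 0, 100); }

/* What the 1-based convention actually intends: 1..100. */
bool one_based_ok(long index) { return index_in_bounds(index, 1, 100); }
```

Index 0 passes declared_ok but fails one_based_ok, which is exactly the 
access that a declaration-based checker lets through.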


I have found that overall I make fewer programming mistakes if I 
consistently treat FB's arrays as 0-based.

  _maxSize = 100
  dim as long  myArray(_maxSize)
  dim as long  j, n

  n = something  // 0 ≤ something ≤ _maxSize
  j = 0
  while ( j < n )
    // do something with myArray(j)
    j++
  wend

The code can access myArray(0) through myArray(99), but not the "extra" 
hundred-and-first element myArray(100). Although the presence of the 
extra element is often harmless, I prefer (when I remember) to 
eliminate it by declaring:

  dim as long  myArray(_maxSize - 1)

With that declaration, the compiler-inserted array bounds-checking code 
is reliable. Both the compiler and I agree that the array contains 
exactly 100 accessible elements, and any transgression beyond the 
bounds will be flagged at runtime.
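
For comparison, here is a sketch of the same 0-based while loop in C, 
where the declared size and the number of accessible elements already 
coincide. MAX_SIZE and fill_first_n are hypothetical names, and the 
loop body (storing each index) merely stands in for "do something".

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SIZE 100

/* Writes each index into the first n elements (0..n-1) and returns
   how many elements were written; n is clamped to MAX_SIZE. */
size_t fill_first_n(long array[MAX_SIZE], size_t n)
{
    size_t j = 0;

    assert(array != NULL);
    if (n > MAX_SIZE) {
        n = MAX_SIZE;        /* never step past array[MAX_SIZE - 1] */
    }
    while (j < n) {
        array[j] = (long)j;  /* "do something" with element j */
        j++;
    }
    return j;
}
```

The loop condition j < n is the same in both languages, so code written 
this way needs no index adjustment when it moves between FB and C.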


For an entertaining discussion of the 0-based versus 1-based issue, see
<http://c2.com/cgi/wiki?ZeroAndOneBasedIndexes>

Robert P.