An occasional essay, meant to provoke interesting discussion.
0-based and 1-based arrays
--------------------------
In a typical 0-based array of n elements, the allowable index range is
from 0 to n - 1.
In a typical 1-based array of n elements, the allowable index range is
from 1 to n.
0-based arrays are compulsory in C, and hence in Mac Toolbox calls.
1-based arrays are the default in Fortran (which also lets you declare
other lower bounds) and are compulsory in MATLAB and in some scripting
languages.
Some languages, such as Visual Basic, allow a choice of 0- or 1-based.
Some languages, such as Pascal, allow upper and lower array bounds to
be any integer or (almost) any scalar.
Arrays in FB and C
------------------
FB's arrays are 0-based, but with an extra element. Storage is reserved
for 1 more element than the stated size. The declaration
dim as long myArray(10)
reserves memory for the 11 elements myArray(0) through myArray(10).
The allowable index range is from 0 to n, where n is the stated size.
This unusual definition may have been chosen -- many years ago -- so
that simple programs could be written with either 0-based code or
1-based code.
Arrays in C are typical 0-based arrays. The declaration
int myArray[10];
reserves memory for the 10 elements myArray[0] through myArray[9].
The allowable index range is from 0 to n - 1.
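The difference in element counts can be made concrete with a small C
sketch. The helper names fb_element_count, c_element_count and
my_array_count are hypothetical, introduced only for illustration; they
model the two declaration rules described above and are not part of FB
or of any C library.

```c
#include <stddef.h>

/* Hypothetical helper: number of elements FB reserves for
   "dim as long myArray(n)", i.e. indexes 0 through n. */
size_t fb_element_count(size_t stated_size)
{
    return stated_size + 1;
}

/* Hypothetical helper: number of elements C reserves for
   "int myArray[n];", i.e. indexes 0 through n - 1. */
size_t c_element_count(size_t stated_size)
{
    return stated_size;
}

/* The C declaration from the text: 10 elements. */
int myArray[10];

/* Element count worked out by the compiler from the declaration. */
size_t my_array_count(void)
{
    return sizeof myArray / sizeof myArray[0];
}
```

Under these assumptions fb_element_count(10) is 11 while
my_array_count() is 10, so a C array meant to hold everything that FB's
myArray(10) holds would have to be declared as int myArray[11].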
Failure to understand this difference between FB and C is responsible
for many FB bugs. Commonly they arise in the course of
- passing an array to a Toolbox function, or
- translating a C struct or Toolbox function declaration into FB.
The outcome is often a nasty kind of "off-by-one" error, in this case
reading or writing an element that lies just outside the array's
declared bounds.
The effects of accessing an out-of-bounds array element are undefined.
This kind of bug may cause an immediate crash (if you're lucky) or
random bad behaviour (if you're not). It can be quite a challenge to
debug random crashes of a program that otherwise seems to work
properly.
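As a rough illustration of where the off-by-one error comes from, the
following C sketch compares the two index rules for the same stated
size. All three function names are hypothetical, and the code only
tests indexes; it deliberately never touches an out-of-bounds element.

```c
#include <stdbool.h>
#include <stddef.h>

/* FB rule from the text: "dim myArray(n)" allows indexes 0..n. */
bool fb_index_ok(size_t index, size_t stated_size)
{
    return index <= stated_size;
}

/* C rule from the text: "int myArray[n];" allows indexes 0..n-1. */
bool c_index_ok(size_t index, size_t stated_size)
{
    return index < stated_size;
}

/* Hypothetical helper: true when an index is legal under FB's rule
   but out of bounds under C's rule for the same stated size.
   Only index == stated_size satisfies this. */
bool off_by_one_hazard(size_t index, size_t stated_size)
{
    return fb_index_ok(index, stated_size)
        && !c_index_ok(index, stated_size);
}
```

For a stated size of 10, only index 10 is flagged as a hazard: it is
the one element FB code may legitimately use that a C array or struct
field of the same stated size does not have.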
1-based and 0-based code in FB
------------------------------
Some programmers write 1-based code in FB. In so doing they ignore the
0th element and take advantage of FB's "extra" array element.
_maxSize = 100
dim as long myArray(_maxSize)
dim as long j, n
n = something // 0 ≤ something ≤ _maxSize
long if ( n != 0 )
  for j = 1 to n
    // do something with myArray(j)
  next
end if
Every sensible programmer turns on Array bounds checking (at least
during development), because it can catch so many easily-fixed
blunders. With the 1-based convention above, it would in general be a
bug if the program accessed myArray(0), but the bounds-checker cannot
flag that error.
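The blind spot can be modelled in C. This is only a sketch of the idea,
not of how FB's bounds checker is actually implemented: MAX_SIZE
mirrors _maxSize, and the three function names are hypothetical.

```c
#include <stdbool.h>

#define MAX_SIZE 100  /* mirrors _maxSize in the FB code */

/* Hypothetical model of a bounds check for an array declared as
   myArray(MAX_SIZE): every index from 0 to MAX_SIZE passes. */
bool passes_bounds_check(long index)
{
    return index >= 0 && index <= MAX_SIZE;
}

/* What the 1-based code intends: only 1..n hold meaningful data. */
bool meaningful_in_one_based_code(long index, long n)
{
    return index >= 1 && index <= n;
}

/* A stray index is an undetected bug when it passes the bounds
   check yet lies outside the range the program means to use. */
bool undetected_bug(long index, long n)
{
    return passes_bounds_check(index)
        && !meaningful_in_one_based_code(index, n);
}
```

With n = 50, index 0 counts as an undetected bug in this model, whereas
index 101 does not, because the bounds check rejects it.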
I have found that overall I make fewer programming mistakes if I
consistently treat FB's arrays as 0-based.
_maxSize = 100
dim as long myArray(_maxSize)
dim as long j, n
n = something // 0 ≤ something ≤ _maxSize
j = 0
while ( j < n )
  // do something with myArray(j)
  j++
wend
The code can access myArray(0) through myArray(99), but not the "extra"
hundred-and-first element myArray(100). Although the presence of the
extra element is often harmless, I prefer (when I remember) to
eliminate it by declaring:
dim as long myArray(_maxSize - 1)
With that declaration, the compiler-inserted array bounds-checking code
is reliable. Both the compiler and I agree that the array contains
exactly 100 accessible elements, and any transgression beyond the
bounds will be flagged at runtime.
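For comparison, here is a minimal C sketch of the same arrangement: an
array of exactly MAX_SIZE elements, a 0-based loop, and a checked store
standing in for the compiler-inserted bounds check. The names
checked_store and fill_first are hypothetical, and the check is a
hand-written assumption about what such a checker does, since C itself
inserts none.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SIZE 100  /* mirrors _maxSize in the FB code */

/* Exactly MAX_SIZE elements, like FB's myArray(_maxSize - 1). */
long myArray[MAX_SIZE];

/* Hypothetical stand-in for compiler-inserted bounds checking:
   stores value and returns true only when j is in 0..MAX_SIZE-1. */
bool checked_store(size_t j, long value)
{
    if (j >= MAX_SIZE) {
        return false;  /* out of bounds: flagged, nothing written */
    }
    myArray[j] = value;
    return true;
}

/* 0-based loop over the first n elements, as in the while loop
   above. Returns how many elements were stored, stopping at the
   first index the check rejects. */
size_t fill_first(size_t n, long value)
{
    size_t j = 0;
    while (j < n) {
        if (!checked_store(j, value)) {
            break;
        }
        j++;
    }
    return j;
}
```

Calling fill_first(100, 7) stores all 100 elements, while any attempt
to store at index 100 is rejected, which is the behaviour the tighter
declaration is meant to guarantee.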
For an entertaining discussion of the 0-based versus 1-based issue, see
<http://c2.com/cgi/wiki?ZeroAndOneBasedIndexes>
Robert P.