(Apologies to Tom T. Hall; I couldn’t resist, and it leaves room for a followup.)
Recently a gentleman asked me whether I have a CoCo 3 (more formally, a TRS-80 Color Computer 3). He was curious about whether I could run a couple of benchmarks both in Color BASIC and in BASIC09.
I do have a CoCo 3, but I need more space to excavate the relevant hardware and to set it up. I’m doing well to have space for my desktop computer, a Raspberry Pi 3 B+, and an FPGA-based CoCo 3-like system roughly as big as the RPi, the Matchbox CoCo.
I also have MAME installed on my desktop computer, and know from past experience that its CoCo emulations are very accurate when it comes to execution speed. (We’ll see a confirmation of that shortly.) So we fired up MAME and went looking for the benchmarks: one from Creative Computing and one from BYTE.
David Ahl’s Small Benchmark
David Ahl, founder of the great personal computer magazine Creative Computing, wrote a short and simple benchmark in BASIC. (The Wikipedia article for “Creative Computing Benchmark” has a good set of references.)
It has a crude test of the random number generator, but the major part of it repeatedly adds the integers from 1 to 100, with a twist: each such number has its 1024th root taken via repeated square root evaluation, and the result is raised to the 1024th power, via repeated squaring with the BASIC exponentiation operator. This definitely dominates execution time, and one can get some idea of how accurate the square root and exponentiation operators are by how far off the sum is from 5050–or rather how far a fifth of the sum is from 1010. (I don’t know why it’s done that way, unless it’s to add one last potential source of error.)
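For readers who don't have a BASIC handy, the structure described above can be sketched in Python. This is only an illustration under assumptions: the function name `ahl_benchmark` and the `seed` parameter are mine, not Ahl's, and Python's double-precision floats and random number generator will give figures quite different from those of any 8-bit BASIC.

```python
import math
import random


def ahl_benchmark(seed=None):
    """Return (accuracy, randomness) in the style of Ahl's benchmark.

    Hypothetical Python rendering; the name and the seed parameter are
    illustrative and not part of the original BASIC program.
    """
    rng = random.Random(seed)
    s = 0.0  # running sum of the round-tripped integers
    r = 0.0  # running sum of random numbers
    for n in range(1, 101):
        a = float(n)
        for _ in range(10):   # ten square roots = 1024th root
            a = math.sqrt(a)
            r += rng.random()
        for _ in range(10):   # ten squarings = 1024th power
            a = a ** 2
            r += rng.random()
        s += a
    # A perfect run would give s = 5050, so s / 5 = 1010.
    return abs(1010 - s / 5), abs(1000 - r)


accuracy, randomness = ahl_benchmark(seed=1)
print(accuracy, randomness)
```

The smaller the first figure, the more accurate the square root and exponentiation; the second figure is the distance of the sum of 2000 random numbers from its expected value of 1000.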
For Color BASIC, I used the code straight from Ahl’s article, save for leaving out the comments. I timed it with a stopwatch, and such is the talent of the folks behind MAME and its CoCo emulator that the result, including run time (2:23), exactly matched the CoCo entry in the article.
For BASIC09, I initially left out declarations, but had to initialize R and S, and I added lines to print DATE$ (whose value is a string holding the current date and time in the format yy/mm/dd hh:mm:ss) at the start and end. The source:
PROCEDURE Ahl
PRINT DATE$
s:=0
r:=0
PRINT " Accuracy Random"
FOR n:=1 TO 100
  a:=n
  FOR i:=1 TO 10
    a:=SQRT(a)
    r:=r+RND(0)
  NEXT i
  FOR i:=1 TO 10
    a:=a^2
    r:=r+RND(0)
  NEXT i
  s:=s+a
NEXT n
PRINT ABS(1010-s/5);" ";ABS(1000-r)
PRINT DATE$
19/05/17 18:05:56
 Accuracy Random
1.06906891E-03 2.6322155
19/05/17 18:06:23
So it took somewhere between twenty-six and twenty-eight seconds. That’s between about five and five and a half times as fast as the Color BASIC version, though to be fair we should note that Color BASIC comes up running at 0.89 MHz, while on the CoCo 3, OS-9 (and NitrOS-9, which is what I use now) switches to the 1.78 MHz clock rate the CoCo 3 is capable of. The “accuracy” result, which is really an error figure, is about 1.8 times that of the Color BASIC version. (On the other hand, the BASIC09 version is almost as fast as the IBM PC and does far better for accuracy.)
Exponentiation is by far the slowest operation in this benchmark, and the most error-inducing. After all, it’s typically done in three steps: take the logarithm of the base, multiply by the exponent, and exponentiate the product (each step increasing the error). Given that we’re raising it to a constant integer power, in particular squaring it, the only reason anyone would use exponentiation rather than simple multiplication is to be consistent with the benchmark as it was originally written. Just changing “a:=a^2” to “a:=SQ(a)” (SQ is a BASIC09 function that squares its argument) gives this result:
19/05/17 18:35:07
 Accuracy Random
9.45091248e-04 .203981891
19/05/17 18:35:13
decreasing the error as expected while cutting the run time to between five and seven seconds.
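To see why the three-step route loses accuracy, here is a small Python sketch that squares the integers from 2 to 100 both ways, rounding every intermediate result to a short float. It rests on assumptions: IEEE single precision (via `struct`) stands in for the five-byte floats of the CoCo BASICs, and the helper names are made up for the illustration. It does not reproduce either BASIC's actual arithmetic routines.

```python
import math
import struct


def f32(x):
    """Round to IEEE single precision (a stand-in for a short BASIC float)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def square_by_exp_log(a):
    """Square a the 'general exponentiation' way: exp(2 * ln(a))."""
    log_a = f32(math.log(a))      # step 1: logarithm of the base
    product = f32(2.0 * log_a)    # step 2: multiply by the exponent
    return f32(math.exp(product)) # step 3: exponentiate the product


def square_by_multiply(a):
    """Square a with a single multiplication."""
    return f32(a * a)


values = range(2, 101)
err_exp_log = sum(abs(square_by_exp_log(float(n)) - n * n) for n in values)
err_multiply = sum(abs(square_by_multiply(float(n)) - n * n) for n in values)
print("total error via exp/log: ", err_exp_log)
print("total error via multiply:", err_multiply)
```

For these inputs the multiplication is exact, since every square up to 10000 fits in a single-precision mantissa, while the exp/log route picks up a small rounding error at each step.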
Let’s give Color BASIC a chance to show how it will run with a similar change. With “A=A*A” instead of “A=A^2”, it has an “accuracy” figure of 1.93357468E-04, roughly a third of the value using exponentiation, a “random” value of 7.3876276, and runs in 1:29, almost a minute faster… but note that in the versions without exponentiation, BASIC09 is now between 12.7 and 17.8 times as fast as Color BASIC. Removing the major bottleneck, the slowest arithmetic operation in BASIC, shows off BASIC09’s advantages over Color BASIC that much more–so that declaring the loop control variables to be INTEGER instead of letting them default to REAL might be worthwhile.
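The "between X and Y" ranges quoted here come from the one-second resolution of DATE$: a difference of d seconds between two timestamps means the true elapsed time lies somewhere between d-1 and d+1 seconds. A short Python sketch of that arithmetic follows; the function names are hypothetical, and it assumes both timestamps fall on the same day.

```python
def to_seconds(hms):
    """Convert an 'hh:mm:ss' string to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def elapsed_bounds(start, end):
    """Bounds on elapsed seconds, given timestamps truncated to whole seconds."""
    nominal = to_seconds(end) - to_seconds(start)
    return max(nominal - 1, 0), nominal + 1


def speedup_range(slow_seconds, fast_bounds):
    """Range of speed ratios; assumes the lower time bound is above zero."""
    low, high = fast_bounds
    return slow_seconds / high, slow_seconds / low


# Exponentiation versions: Color BASIC took 2:23 (143 s) by stopwatch.
with_power = elapsed_bounds("18:05:56", "18:06:23")
print(with_power, speedup_range(143, with_power))

# Multiplication versions: Color BASIC took 1:29 (89 s).
with_multiply = elapsed_bounds("18:35:07", "18:35:13")
print(with_multiply, speedup_range(89, with_multiply))
```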
BYTE’s Prime Sieve
In BYTE v.6 #9, Jim Gilbreath wrote about looking for a benchmark and settling on a sieve of Eratosthenes program. Unlike Ahl’s benchmark, the intent was to compare languages, and the article is fun to read for the listings of various versions, e.g. PL/M, Ratfor, Forth, and even COBOL.
There is a BASIC version, but running it in Color BASIC gives an immediate “OM” error thanks to the array of 8192 five-byte floating point values used to hold what are effectively Boolean values. It would have to be rewritten to use an 8192-character string to fit on the CoCo…but Color BASIC strings have a maximum length of 255 characters. All right, how about an array of enough floating point values to be at least 8192 bytes? It’s a thought, and Color BASIC initializes values to zero–but I’ve found no way to get the address of a variable or array in any version of Color BASIC, so I won’t bother.
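The storage arithmetic behind that paragraph is easy to check. The Python lines below simply restate the figures given above (8192 flags, five bytes per floating point value); the variable names are mine.

```python
FLAGS = 8192          # flag count as stated in the text
BYTES_PER_FLOAT = 5   # size of a Color BASIC floating point value

# One float per flag, as in the BASIC listing from the article.
float_array_bytes = FLAGS * BYTES_PER_FLOAT

# One byte per flag, packed into just enough floats to cover 8192 bytes
# (ceiling division).
floats_needed = -(-FLAGS // BYTES_PER_FLOAT)

print(float_array_bytes, "bytes for one float per flag")
print(floats_needed, "floats would cover", FLAGS, "bytes")
```

So one float per flag needs 40960 bytes, while a byte-per-flag layout would need only 8192 bytes, if there were a way to address them.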
BASIC09, on the other hand, has a Boolean type, and a Boolean only takes up one byte of memory, so it’s easy to write:
PROCEDURE sieve
BASE 0
DIM i,k,prime,count:INTEGER; flags(8191):BOOLEAN
PRINT DATE$
PRINT "Only one iteration"
FOR i:=0 TO SIZE(flags)-1
  flags(i):=TRUE
NEXT i
count:=0
FOR i:=0 TO SIZE(flags)-1
  IF flags(i) THEN
    prime:=i+i+3
    FOR k:=i+prime TO SIZE(flags)-1 STEP prime
      flags(k):=FALSE
    NEXT k
    count:=count+1
  ENDIF
NEXT i
PRINT count;" primes"
PRINT DATE$
This deserves a little explanation: since the BASIC original uses zero as a subscript, we use the BASIC09 “BASE 0” statement to permit zero subscripts. I’ve recently found out that in Color BASIC, given an array dimensioned with DIM x(3), you actually have a four-element array, with subscripts running from 0 to 3 inclusive. If that’s documented anywhere, I have yet to find it.
And the output:
19/05/18 07:37:48
Only one iteration
1899 primes
19/05/18 07:38:03
So the run time is somewhere between fourteen and sixteen seconds. The BASIC09 “dir” command shows that the code size is 323 bytes, and the data size is 8229 bytes; support for a reasonable set of primitive types makes the difference between easily fitting in a 64K address space and an OM error.
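As a cross-check on the prime count, the same flag-array logic can be sketched in Python. The function name `byte_sieve` is mine, and a `bytearray` stands in for the one-byte BOOLEAN array; flag i represents the odd number 2i+3, exactly as in the BASIC09 listing.

```python
def byte_sieve(size=8191):
    """Count primes the way the sieve listing does: flag i stands for 2*i + 3.

    size=8191 matches flags(8191) under BASE 0 (subscripts 0 to 8190).
    """
    flags = bytearray([1]) * size   # one byte per flag, all initially true
    count = 0
    for i in range(size):
        if flags[i]:
            prime = i + i + 3
            # Strike out every multiple of this prime that has a flag.
            for k in range(i + prime, size, prime):
                flags[k] = 0
            count += 1
    return count


print(byte_sieve(), "primes")
```

With the default size this counts the odd primes from 3 through 16383 and agrees with the 1899 reported by the BASIC09 run.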
(By the way, the BYTE article has tables of results for various processors and languages, with Z80 systems split out into their own table. The best Z80 time, using a Digital Research PL/I compiler, is fourteen seconds… ERRATUM: I didn’t notice that the PL/I code did ten iterations, and it’s not fair to compare ten iterations in PL/I with one in BASIC09.)
Thanks to Maury Markowitz for asking about benchmarks and to archive.org for the archives of computing magazines.