The following program allows you to build up a catalogue of books, records, or telphone numbers. It illustrates how a computer can be used to store information, sort it, and retrieve it on command.
A collection of records of information on a computer is called a "database"; in the following examples we will assume that the records consist of names and telephone numbers. The program allows you to associate a telephone number with each name. You can then find out someones telephone number by typing their name, or as many letters of their name as are needed to identify them uniquely. As an example, assume we are entering the telephone numbers of five people. The symbol '>' is used to prefix a name to be entered:
>RUN COMMAND?>DEWAR J ?123 2234 COMMAND?>SMITH P S ?0223 314341 COMMAND?>NORTH Q ?119 2389 x191 COMMAND?>BOND J ?456 7789 COMMAND?>WEST A ?145 3456
We can then ask for an alphabetical list of the whole database, using the "*" command:
COMMAND?* BOND J 456 7789 DEWAR J 123 2234 NORTH Q 119 2389 x191 SMITH P S 0223 314341 WEST A 145 3456
Alternatively, we can ask for the telephone number of any person in the database:
COMMAND? NORT NORTH Q 119 2389 x191 COMMAND?S SMITH P S 0223 314341 COMMAND?DEWER NOT FOUND
An attempt to enter a name already entered will give a warning message:
COMMAND?>BOND J BOND J ALREADY EXISTS
Program Operation
The simplest way to store the records in the computer would be as a straightforward list. However, it would then be necessary to search through the whole list every time a new record was entered, or a record was being searched for. It would also be difficult to produce an alphabetical list of records, without first sorting all the records into order, which would be very time-consuming.
The Catalogue program therefore holds the records in a more sophisticated structure, called a 'tree'. Associated with every record are two 'pointers', which can be set to point to other records, or can be marked as pointing to nothing. As new records are entered, a tree is built up. Names lying earlier in the alphabet are inserted on the left-hand side of the tree, and names later in the alphabet on the right-hand side of the tree.
To see how this works in practice, a tree is built up with the five names given above. The first record goes at the top of the tree, and its two pointers are set to zero, indicated here by 'x', to indicate that there are no further records below it:
DEWAR / \ X X
Suppose we then add SMITH. This is later in the alphabet than DEWAR, and so it is attached to the right-hand branch of the tree:
DEWAR / \ X \ SMITH / \ X X
Next we add NORTH. First the name is compared with DEWAR, and since it is later in the alphabet we follow the right-hand branch. Then it is compared with SMITH. Since it is before SMITH in the alphabet we follow the left-hand branch. We have reached the end of the tree, and so NORTH is added there:
DEWAR / \ X \ SMITH / \ / X NORTH / \ X X
Finally, we add BOND and WEST, and the tree looks like this:
DEWAR / \ X \ SMITH / \ NORTH WEST / \ / \ X X X X
When searching the tree for a name we follow the same procedure, until either the name is found, or the end of a branch is reached in which case the message NOT FOUND' is given.
The advantage gained by using a tree structure depends somewhat on the shape of the tree; with a perfectly balanced tree of 31 records, a maximum of five comparisons need to be made to find any record, as opposed to 31 with a simple list of records. As the number of records increases, the saving becomes even greater.
A tree can be printed in alphabetical order by using the following simple recursive procedure:
To print the tree starting at a certain name
a. print the tree pointed to by the left-hand pointer,
b. print the name, and print the tree pointed to by the
right-hand pointer.
In the case of the tree above we obtain:
BOND, DEWAR, NORTH, SMITH, WEST
BBC Computer Version
5 REM ... Catalogue ... 10 DIM M%64,T%3:ZS=&3C00:F%=&1500:HIMEM=F% 20 !T%=0 30 REPEAT PRINT CHR$(12)"command"
Input command line. Commands: "
* - Print tree
> - Add name to tree.
40 INPUT $M% 50 IF?M%=ASC("*")PROCPRINT(!T%):GOTO 130 60 IF?M%=ASC(">")GOTO 100
Name on its own - search for it. If a close match is found print it (PROCPRINT), else say 'NOT FOUND'.
70 N%=M%:X%=T%:S%=1:E%=0:PROCSEARCH 80 IF E% PROCPUT:GOTO 130 90 PRINT "NOT FOUND":GOTO 130
Add name to tree. If no match (not E) then all is well; otherwise the name already exists.
100 N%=M%+1:X%=T%:K%=0:S%=0:PROCSEARCH 110 IF NOT E% UNTIL FALSE 120 PRINT $N%;" ALREADY EXISTS" 130 K%=GET:UNTIL FALSE
PROCSEARCH - Search tree. Operation depends on value of S:
S%=0 - Add name $N% to tree and input telephone number when the
end of a branch is reached.
S%=1 - Search tree for name, and return on a match of first few
characters.
1000 REM SEARCH TREE 1005 DEF PROCSEARCH 1010 IF !X%=0 GOTO 1100
Follow down pointer at X%, and compare name $J on tree with $N. If they match, return. If $N>$J, move X% to right-hand pointer; then keep searching down tree.
1020 X%=!X%:J%=X%+8:PROCCOMP 1030 IF E% ENDPROC 1040 IF C% X%=X%+4 1050 GOTO 1010
End of branch. If S=l give up since the name is not found; otherwise add the name $N to the tree here input the telephone number.
1100 IF S% ENDPROC 1110 Y%=X%:!X%=F%:X%=!X%:!X%=0:X%!4=0 1120 X%=X%+8:$X%=$N%:XS=X%+LEN($X%)+1 1130 INPUT $X%:XS=X%+LEN($X%1+1 1140 IF(X% AND &FFFF)>Z% PRINT "NO ROOM":! Y%=0: K%=GET: ENDPROC 1150 F%=X%:ENDPROC
PROCCOMP - Compare strings J% and N% character by character, and set C% to 1 if $N%>$JS. If searching the tree, settle for $N% matching the first few characters of $JS; if adding a name to the tree, insist on an exact match.
2000 REM COMPARE $ J% AND SNAB 2010 DEF PROCCOMP I%=-1:REPEAT I%=I%+1 2020 UNTIL J%?I%<>N%?I% OR N%?I%=13 2030 C%=(J%?I%<N%?I%) 2040 IF S% El=(MS?I%=13):ENDPROC 2050 E%= ( $J%=$N% ): ENDPROC
PROCPRINT - Print tree from Q% downwards. Print tree on left-hand branch, print record at Q%, then print tree on right-hand branch.
4005 DEF PROCPRINT(Q%) 4010 IF Q%=0 ENDPROC 4020 PROCPRINT(!Q%) 4040 J%=Q%+8:PROCPUT:Q%=Q%+4 4050 PROCPRINT(!Q%):ENDPROC
PROCPUT - Print record at J%. Tabulate telephone number to column 20.
5000 REM PRINT RECORD 5010 DEF PROCPUT PRINT $J%; 5020 J%=J%+LEN($J%)+1:PRINT TAB(20);$J%:ENDPROC
Variables
E% - Equal flag; E%=1 if a match is found F% - Next free memory location J% - Name string on tree M% - Command line string N% - Name string Q% - Local parameter for PROCPRINT S% - Search flag: S%=1 means search tree; S%=0 means T% - Pointer to top of tree X% - Pointer to current position on tree Y% - Pointer before adding name to tree Z% - Top of available memory
Atom Version
This version of the catalogue program is substantially identical to the version for the BBC Computer. The only major difference is that the routine to print the tree has to save the value of X before re-entering itself recursively, since on the Atom procedures do not have local parameters.
5 REM ... CATALOGUE ... 10 DIM M(64),T(3),F(-1) 20 !T=0;Z=#3BFF 30 DO PRINT $12"command"
Input command line. Commands: "
* - Print tree
> - Add name to tree.
40 INPUT $M 50 IF?M=CH"*"X=!T;GOSUB p;GOTO n 60 IF?M=CH">"GOTO a
Name on its own - search for it. If a close match is found print it (GOSUB q), else say 'NOT FOUND'.
70 N=M;X-Z;S=l;E=0;GOSUB s 80 IF E GOSUB q;GOTO n 90 PRINT "NOT FOUND"';GOTO n
a - Add name to tree. If no match (not E) then all is well; otherwise the name already exists.
100aN=M+1;X=T;E=0;S=0;GOSUB s 110 IF E:1 UNTIL 0 120 PRINT $N" ALREADY EXISTS" 130nLINK#FFE3;UNTIL 0
s - Search tree. Operation depends on value of S:
S=0 - Add name $N to tree and input telephone number when the end
of a branch is reached.
S=1 - Search tree for name, and return on a match of first few
characters.
1000sREM SEARCH TREE 1010 IF !X=0 GOTO b
Follow down pointer at X, and compare name J on tree with $N. If they match, return. If $N>$J, move X to right-hand pointer; then keep searching down tree.
3020 X=!X;J=X+8;GOSUB c 1030 IF E RETURN 1040 IF C X=X+4 1050 GOTO s
b - End of branch.
If S=1 give up since the name is not found; otherwise add the
name $N to the tree here, input the telephone number.
1100bIF S RETURN 1110 Y=X; !X=F; X=!X; !X=0; X!4 =0 1120 X=X+8; $X=$N;X=X+LENX+1 1130 INPUT$X; X=X+LENX+1 1140 IF X&#FFFF>Z PRINT "NO ROOM"; Y=0; LINK #FFE3; RETURN 1150 F=X; RETURN
c - Compare strings J and N character by character, and set C to 1 if $N>$J. If searching the tree, settle for $N matching the first few characters of $J; if adding a name to the tree, insist on an exact match.
2000cREM COMPARE $ J AND $N 2010 I=-1;DO I=I+1 2020 UNTIL J?I<>N?I OR N?I=13 2030 C=(J?I<N?I) 2040 IF S E=(N?I=13);RETURN 2050 E=($J=$N)gRETURN
p - Print tree from X downwards. First save X on stack. Print tree on left-hand branch, recover X and print record at X, then print tree on right-hand branch.
4000pREM PRINT TREE 4010 IF X=0 RETURN 4020 !F=X; F=F+2; X=!X; GOSUB p 4030 F=F-2;X=!F 4040 J=X+8;GOSUB q; X=X+4 4050 X=!X; GOTO p
q - Print record at J. Tabulate telephone number to column 20.
5000qREM PRINT RECORD5010 PRINT $J;DOPRINT " ";UNTIL COUNT=20 5020 J=J+LENJ+1;PRINT $J'gRETURN
Variables:
E - Equal flag; E=l if a match is found F - Next free memory location J - Name string on tree M - Command line string N - Name string S - Search flag: S=l means search tree; S=0 means add to tree T - Pointer to top of tree X - Pointer to current position on tree Y - Pointer before adding name to tree Z - Top of available memory
Further Suggestions
The program is not restricted to names and telephone numbers; in fact either field may be any sequence of up to 64 characters, and the whole of the first field is used for the alphabetical ordering.
A useful extension to the program would allow the database to be saved and loaded to and from tape. The values of !T% and F% (or !T and F in the Atom version) should be saved with the tree, since these give the address of the top of the tree and the next free location in memory respectively.