Words are sequences of letters in exactly the same way that numbers are sequences of digits, but one striking difference between words and numbers is that whereas every sequence of digits is a valid number, not every sequence of letters is a valid word. Thus, although it is simple to write a program to generate random numbers, it is impossible to write a program to generate random words; without, of course, providing the computer with a dictionary in the first place.
The following program illustrates this by taking a word and finding every possible permutation of the letters in that word. Among these permutations will be all the anagrams of that word. For example, for the letters OPST the program will give:
WORD?OPST THERE ARE 24 PERMUTATIONS: OPST OPTS OSPT OSTP OTPS OTSP POST POTS PSOT PSTO PTOS PTSO SOPT SOTP SPOT SPTO STOP STPO TOPS TOSP TPOS TPSO TSOP TSPO
If the original string is in alphabetical order, the program will produce the permutations in alphabetical order.
Program Operation
The permutation algorithm used by this program is one of the most efficient in producing a sequence of permutations in alphabetical order. It works as follows:
The first permutation is obviously the characters in alphabetical ordar; so, taking as an example the characters "ABCDEF", this is the first permutation. The last permutation is also obvious: it is sequence of characters in reverse order, FEDCBA".
To obtain the next permutation from any given sequence of letters, such as BCFEDA, we scan right-to-left looking a character that is smaller that the previous character. This will be called character 'I'. This character is then exchanged with the next higher character to its right, which will be called character 'J':
I J | | B C F E D A \ / \ / X / \ / \ B D F E C A
We then reverse the order of all the characters to the right of character I:
I J | | B D F E C A \ / \ / X / \ / \ B D A C E F
The result, "BDACEF", is the next permutation in alphabetical order.
BBC Computer Version
10 REM ... Anagrams 40 INPUT"STRING"A$
Set up array of letter positions such that CC%(1)=1, CC%(2)=2 ... CC%(N)=N . Calculate number of ermutations, factorial N.
50 N=LEN(A$):DIM CC%(N) 60 F=1 100 FOR J=1TON:CC%(J)=J:F=F*J:NEXT 102 PRINT "THERE ARE" F "PERMUTATIONS" '
Display first permutation. Then permute word, and display.
105 PROCDISPLAY 110 FOR H=2TOF:PROCPERMUTE:PROCDISPLAY:NEXT 120 END
PROCPERMUTE: permute word. Given any permutation in CC%(1) to CC%(N), a call to PROCPERMUTE will find the next permutation.
200 DEF PROCPERMUTE I=N-1: J=N 205 IF CC%(I)>=C%(I+1) I=I-1: GOTO 205 210 IF CC%(J)<=C%(I) J=J-1: GOTO 210 220 PROCSWAP 230 I=I+1:J=N:IF I=J ENDPROC 240 DO PROCSWAP: I=I+l: J=J-l:UNTIL I>=J 250 ENDPROC
PROCSWAP: Swap elements I and J.
300 DEF PROCSWAP T=CC%(I): CCS(I)=CCS(J) 310 CC%(J)=T: S=l-S: ENDPROC
PROCDISPLAY: Print permutation.
400 DEF PROCDISPLAY PRINT':FOR K=1TON 410 PRINT MID$(A$,CC%(K),CC%(K+1)); 420 NEXT:ENDPROC
Variables:
A$ - Word CC%(N) - Array being permuted; initially CC(N)=N F - Factorial N, number of permutations H - Current permutation number I,J - Items to be interchanged to get new permutation N - Number of letters in word S - Sign of current permutation
Variables:
Atom Version
The Atom version is virtually identical to the BBC version given above, except that the '?' operator is used instead of MID$ to extract characters from the word in $A (or A$).
30 DIMA(64) 40 INPUT"WORD"$A
Set up array of letter positions such that CC%(1)=l, CC%(2)=2 ... CC%(N)=N. Calculate number of permutations, factorial N.
50 N=LENA;DIM CC(N) 60 A=A-1;F=1;S=1 100 FOR J=1TON;CCJ=J;F=F*J;NEXT 102 PRINT "THERE ARE" F " PERMUTATIONS"
Display first permutation. Then permute word, and display.
105 GOSUBd 110 FOR H=2TOF;GOSUBp;GOSUBd;NEXT 120 END
p - permute word. Given any permutation an CC%(1) to CC$(N), a call to subroutine p will find the next permutation.
200pI=N-1; J=N 205 IF CC(I)>=CC(I+1) I=I-1; GOTO 205 210 IF CC(J)<=CC(I) J=J-1; GOTO 210 220 GOSUBs 230 I=I+1;J=N;IF I=J RETURN 240 DO GOSUBs; I=I+1; J=J-1;UNTIL I>=J 250 RETURN
s - Swap elements I and J.
300sT=CCI;CCI=CCJ;CCJ=T;S=l-S;RETURN
d - Print permutation.
400dPRINT ';FORK=1TON;PRINT $A?CCK;NEXT;RETURN
Variables:
$A - Word CC(N) - Array being permuted; initially CC(N)=N F - Factorial N, number of permutations H - Current permutation number I,J - Items to be interchanged to get new permutation N - Number of letters in word S - Sign of current permutation
Further Suggestions
The present program will give LEE twice in a list of the anagrams of the word EEL. An improvement would be to include a test for repeated latters in the original word.
Although this program will find all the permutations of a sequence of letters very rapidly, it is a very inefficient way of discovering that, for example, ORCHESTRA is an anagram of CARTHORSE because there are no less than 362880 permutations to test, and a much simpler method could be devised to verify that two words share the same letters.