BF is Turing-complete

Alan Turing came up with the idea for the so called Turing Machine, a very simplistic computing model, which yet is powerfull enough to calculate all possible function which can be calculated. A Turing Machine consists of an endless tape of cells, which each can contain any of a given set of symbols, a pointer pointing to one of the cells, and a certain finite state control. Depending on the function to be calculated the finite state control determines how the tape is manipulated. It does this by reading the symbol under the pointer, and decides what to do: write a new symbol on the current cell, move the tape to the left, or to the right, and what will be the next state.

Each turing machine can be specified by the five elements:

Many interesting properties of Turing Machines have been proven so far. For example, that each Turing Machine can be transformed into a Turing Machine with a tape that is cut-off on one side, and endless on the other side. Or that every Turing machine can be transformed into a Turing machine with set of symbols contains just two symbols.

A language is said to be 'Turing-complete', if all functions which can be calculated with a Turing Machine, can be calculated with this program. There are basically three approaches to proof that a language is Turing-complete. These are:

  1. Show there is some mapping from each possible Turing machine to a program in the language.
  2. Show that there is a program in the language that emulates a Universal Turing Machine.
  3. Show that the language is equivalent with (or a superset of) a language that is known to be Turing-complete.
On the rest of the page, proofs of each of the above approaches will be given.


Translating Turing Machines to BF programs

We can proof that BF is Turing-complete, if we can show for every possible Turing machine, there is an equivalent BF program. Of course, we only need to show that a Turing Machine with a tape limited on one side can be simulated. What we need is an algorithm, which given a Turing machine, specified by the six elements given above, generates a BF program, which will simulate the Turing machine.

To achieve this, the set of symbols A is mapped onto the numbers 0 to n-1, where n equals the number of symbols, such that ainit maps onto zero. Likewise, the set of states S is mapped onto the numbers 0 to m-1, where m equals the number of states, such that sinit maps onto zero, and sfinal maps onto 1.

To simulate a Turing machine, we need an array with unlimited length to represent the tape, a index into that array representing the position of the head, and the current state. To program this in BF we will need to assign these to some memory locations. The position of the head will be stored in memory location head, and the current state in the memory location state. Of course, some additional temporary memory locations will be needed. Using the terminology introduced on the page An introduction to programming in BF, we need a program of the following form:

Now we only need to fill in the middle part of the loop. For each tupple in f we need to generate some code. This means that for each symbol i in the range 0 to n-1, and each state j in the range 0 to m-1, we have to insert a line:
 map(state,head,cursymbol,
i,j,k,l,m,
newstate,newhead,newsymbol)
Where, given the tupple f(si,aj), k is the number representing the new state, l is the number representing the new symbol, and m is equal to 1 if the movement is right and equal to 0 if the movement is left. The procedure map is defined as follows:
 map(state,head,cursymbol,
astate,asymbol,anewstate,anewsymbol,movement,
newstate,newhead,newsymbol)
= 
copy(state,t1,t2)
const(t2,astate)
Equal(t1,t2,t3,t4,t5)
if(t3)
copy(cursymbol,t1,t2)
const(t2,asymbol)
Equal(t1,t2,t3,t4,t5)
if(t3)
const(newstate,anewstate)
const(newsymbol,anewsymbol)
copy(head,newhead,t1)
ifelse(movement,t1)
inc(newhead)
else(movement,t1)
dec(newhead)
endelse(t1)
endif(t3)
endif(t3)
In this code the expression const(v,c) stands for the piece of code that fills the memory location v with the value c. Now, we only need to assign real values to the variable representing the memory locations to arrive at the final BF program executing the given Turing machine. Note that memory location assigned to base should be larger than all other memory locations used.


Universal Turing Machine

The text of this section are based on
this BF program with explaination<
written by Daniel B. Cristofani.

A Universal Turing Machine (UTM) is a Turing machine that can simulate some Turing-complete computational model. By giving a BF program which simulates a particular UTM, we proof that BF is Turing-complete. The UTM that we implement here is taken from from Yurii Rogozhin's article Small universal Turing machines, in Theoretical Computer Science, November 1996 (volume 168 pgs 215-240). This UTM simulate a Turing-complete class of tag-systems. A tag-system transforms strings over an alphabet A = {a[1],a[2],...,a[n], a[n+1]} as follows: a positive integer m is chosen, and so is a function P that maps each a[i] for 1<=i<=n to a string P(a[i]) over the alphabet A. Now:

  1. if the string being transformed has fewer than m elements, the whole process stops now.
  2. m elements are removed from the beginning of the string
  3. call the first element removed a[k]; if k=n+1 the whole process stops now.
  4. P(a[k]) is appended to the string.
  5. steps 1-5 are repeated.
The particular class of tag-systems this Turing machine simulates is the class where m = 2, the initial string has length at least 2, and all P(a[i]) where 1<=i<=n are of the form a[n]a[n]B[i] where B[i] is some string over the alphabet A (B[i] is the empty string if and only if i=n).

The input for this Turing machine is mildly complex, and there is no error checking.

An example input string

We take as an example the tag-system over the alphabet A = {a[1], a[2], a[3], a[4]}, where m = 2 and:
P(a[1]) = a[3]a[3]a[2]a[1]a[4]
P(a[2]) = a[3]a[3]a[1]
P(a[3]) = a[3]a[3]
P(a[4]) = STOP

It meets the criteria above; and when applied to the initial string a[2]a[1]a[1] it gives:

a[2]a[1]a[1]
a[1]a[3]a[3]a[1]
a[3]a[1]a[3]a[3]a[2]a[1]a[4]
a[3]a[3]a[2]a[1]a[4]a[3]a[3]
a[2]a[1]a[4]a[3]a[3]a[3]a[3]
a[4]a[3]a[3]a[3]a[3]a[3]a[3]a[1]
a[3]a[3]a[3]a[3]a[3]a[1]
and then it's done.

Now, the encoding:

a[1] is 1
a[2] is 11111111111
a[3] is 11111111111111111
a[4] is 111111111111111111111
P(a[1]) is b1 b111111111111111111111b b1b b11111111111b b11111111111111111b b1111111111111111
P(a[2]) is b1 b1b b11111111111111111b b111111
P(a[3]) is b1 b11111111111111111b b
the initial string is 11111111111c1c1

and so the whole input is

b1 b11111111111111111b b
b1 b1b b11111111111111111b b111111
b1 b111111111111111111111b b1b b11111111111b b11111111111111111b b1111111111111111
b
11111111111c1c1
which when run together for input to the Turing machine becomes
b1b11111111111111111bbb1b1bb11111111111111111bb111111b1b111111111111111111111bb1bb11111111111bb11111111111111111bb1111111111111111b11111111111c1c1

The Universal Turing Machine

The Universal Turing Machine implemented by the BF program is a four state, six symbol (namely: '1','b','<','>','0', and 'c') Turing machine. For those interested, the state table of the machine is given in the following table. Each cell contains a 5-tupple representing the current state, the symbol read, the symbol written, the action (L stands for move left, R stands for move right and H stands for halt), and the new state.

11<L1 210R2 311R3 410R4
1b>R1 2b>L3 3b<R4 4bcL2
1>bL1 2><R2 3>bR3 4><R4
1<0R1 2<>L2 3< H 4<
10<L1 201L2 30cR1 40cL2
1c0R4 2cbR2 3c1R1 4cbR4

The initial state is 1, tape cells are set as per the input but with the termination code P(a[n+1])=STOP represented as a <b at the left end and all other cells on both sides (actually, only the right is needed) holding symbol 0, and the head is initially at the first 1 in the code for the initial string.

The minimal test case b1b1bbb1c1c11111 represents the tag-system where P(a[1]) = a[1]a[1] and P(a[2]) = STOP, applied to the string a[1]a[1]a[2]. This runs for 518 steps of the Turing machine, exercising all 23 Turing machine instructions, before halting with the output string a[1].

BF program

Below follows the BF program which implements the above Universal Turing Machine.
+++>+
[>>,[>+++++<[[->]<<]<[>]>]>-[<<<->++++++>>-[<<[-]>++>[<<+>->-]]]
<[-<[<]++[>]]<]<-[[>>>+<<<-]<]++>>>+[-[>++++++<-]>[<+>-]
+<<<+++>+>
[-
[<<+>->-
[<<[-]>>-
[<<++>+>-
[<<-->->>+++<-
[<<+>+>>--<-
[<<->->-
[<<++++>+>>+<-
[>-<-
[<<->->-
[<<->>-
[<<+++>>>-<-
[<<---->>>++<-
[<<++>>>+<-
[>[-]<-
[<<->>>+++<-
[<<->>>--<-
[<<++++>+>>+<-
[<<[-]>->>++<-
[<<+++++>+>>--<-
[<->>++<
[<<->>-
]]]]]]]]]]]]]]]]]]]]]]<[->>[<<+>>-]<<<[>>>+<<<-]<[>>>+<<<-]]>>]
>[>+++++>[<->-]<]>>[+++[<+++++>--]<[<+>>+++++<-]>-.[-]>]<<<.


URM is BF-complete

URM stands for Universal Register Machine. The statement "URM is BF-complete", simply means that all possible URM programs have an equivalent BF program. Actually, the set of possible BF programs is much larger than the set of possible URM programs. You would hardly believe it, but URM is also Turing-complete, due to the fact that the registers can store infinite numbers, although the mapping from general Turing machines to URM is very, very complicated. These fact combined, proof that BF is Turing-complete.

The Universal Register Machine is a machine with a fixed number of characters, and it supports the following commands:

Examples URM programs are:

There is almost a one-to-one mapping from URM to BF. The an expression maps to +, the sn expression maps to -, and the (x)n expression maps to [x], of course preceded with the right amouth of > and < to move to the indicated memory location. The translation of the above two URM programs to BF are:

It is possible to make context free grammars for both BF and URM, such that matching parse trees indicate equivalent programs. Such a grammar is given below:

URM BF
root = S1. root = S1
Sn = an Sn = +
Sn = sn Sn = -
Sn = Sn ; Sn Sn = Sn Sn
Sn = ( Sn )n Sn = [ Sn ]
Sn = Sn+1 Sn = < Sn+1 >
Sn = Sn-1 Sn = > Sn-1 <

Still, I feel that BF is far more elegant than URM, because it gives you more with less symbols. BF only needs the symbols "+", "-", "<", ">", "[", and "]", whereas URM needs "a", "s", ";", "(", ")", ".", and "0" to "9". Although, I have to admit, that BF programs will in general be longer than the equivalent URM program, and in most cases also more cryptic.