Here’s a follow-up article from C++ template metaprogramming, which talked about the basics of compile-time computation with C++ templates. This article is (vaguely) more practical, as we implement a brainfuck compiler that generates code via C++ templates!
The Language
Brainfuck is a relatively simple language. The brainfuck machine has at least 30000 one-byte cells initialized to 0, and a pointer that indicates the current cell being manipulated. The language consists of 8 commands: <>+-.,[]. From Wikipedia:
> increment the data pointer (to point to the next cell to the right).
< decrement the data pointer (to point to the next cell to the left).
+ increment (increase by one) the byte at the data pointer.
- decrement (decrease by one) the byte at the data pointer.
. output the byte at the data pointer as an ASCII encoded character.
, accept one byte of input, storing its value in the byte at the data pointer.
[ if the byte at the data pointer is zero, then instead of moving the instruction pointer
forward to the next command, jump it forward to the command after the matching ] command.
] if the byte at the data pointer is nonzero, then instead of moving the instruction pointer
forward to the next command, jump it back to the command after the matching [ command.
With this in mind, we can start implementing our compiler.
Representing the Machine
We first want a data structure to represent our brainfuck machine. This is passed around at runtime, because we still want to run brainfuck with IO side effects (getchar and putchar). We can encapsulate this in a simple struct that contains a pointer and the cell storage.
struct Machine {
int pointer;
char cells[30000];
inline Machine() {
pointer = 0;
std::memset(cells, 0, sizeof(cells));
}
};
Representing Programs
We can represent the program as a singly-linked list (yes, singly-linked lists are magic!) in a relatively straightforward way.
struct Stop {
inline static void run(Machine &m) { }
};
template<typename head, typename tail>
struct Program {
inline static void run(Machine &m) {
head::execute(m);
tail::run(m);
}
};
The head of the Program is the current instruction, and the tail of the Program contains the remainder of the instructions. Each instruction will have an execute(Machine &) function attached to it, which mutates the state of the machine. We only ever expect to move forward through program flow (there’s a trick which we can use for handling loops, which I’ll cover later), so this represents a brainfuck program pretty well.
Encoding Instructions
As we specified earlier, we have instructions that each have an execute(Machine &) function. Here’s an implementation of all the instructions, with the exception of the looping ones.
struct Incr {
inline static void execute(Machine &m) {
m.cells[m.pointer]++;
}
};
struct Decr {
inline static void execute(Machine &m) {
m.cells[m.pointer]--;
}
};
struct MoveLeft {
inline static void execute(Machine &m) {
m.pointer--;
}
};
struct MoveRight {
inline static void execute(Machine &m) {
m.pointer++;
}
};
struct Putchar {
inline static void execute(Machine &m) {
putchar(m.cells[m.pointer]);
}
};
struct Getchar {
inline static void execute(Machine &m) {
m.cells[m.pointer] = std::getchar();
}
};
Encoding Loops
Loops are an interesting case, but we can think of them like so: every set of instructions in a given loop block is a subprogram of our program. As such, we can wrap them in a Loop instruction, which will handle running the subprogram for us:
template<typename prog>
struct Loop {
inline static void execute(Machine &m) {
while (m.cells[m.pointer]) {
prog::run(m);
}
}
};
Writing the Dumper
We can write a simple dumping function which can dump what the program looks like, so we can reconstruct our program from the parsed input:
template<typename prog>
struct dump { };
template<typename tail>
struct dump<Program<Incr, tail>> {
static void print() {
putchar('+');
dump<tail>::print();
}
};
template<typename tail>
struct dump<Program<Decr, tail>> {
static void print() {
putchar('-');
dump<tail>::print();
}
};
template<typename tail>
struct dump<Program<MoveLeft, tail>> {
static void print() {
putchar('<');
dump<tail>::print();
}
};
template<typename tail>
struct dump<Program<MoveRight, tail>> {
static void print() {
putchar('>');
dump<tail>::print();
}
};
template<typename tail>
struct dump<Program<Putchar, tail>> {
static void print() {
putchar('.');
dump<tail>::print();
}
};
template<typename tail>
struct dump<Program<Getchar, tail>> {
static void print() {
putchar(',');
dump<tail>::print();
}
};
template<typename body, typename tail>
struct dump<Program<Loop<body>, tail>> {
static void print() {
putchar('[');
dump<body>::print();
putchar(']');
dump<tail>::print();
}
};
template<>
struct dump<Stop> {
static void print() { }
};
Writing the Parser
Here’s the “fun” bit — writing the parser in C++ templates. Our way of representing a program is a compile-time linked list, so we have to do the parsing at compile time. To make it less painful, we can use a C++11 variadic list (it still burns).
template<char... xs>
struct parse { };
We define the trivial cases: skip characters we don’t recognize, and emitting the end of input correctly.
template<char _, char... xs>
struct parse<_, xs...> {
typedef typename parse<xs...>::result result;
};
template<>
struct parse<> {
typedef Stop result;
};
Now, all the various instructions that aren’t loop-related ones.
template<char... xs>
struct parse<'+', xs...> {
typedef Program<Incr, typename parse<xs...>::result> result;
};
template<char... xs>
struct parse<'-', xs...> {
typedef Program<Decr, typename parse<xs...>::result> result;
};
template<char... xs>
struct parse<'<', xs...> {
typedef Program<MoveLeft, typename parse<xs...>::result> result;
};
template<char... xs>
struct parse<'>', xs...> {
typedef Program<MoveRight, typename parse<xs...>::result> result;
};
template<char... xs>
struct parse<'.', xs...> {
typedef Program<Putchar, typename parse<xs...>::result> result;
};
template<char... xs>
struct parse<',', xs...> {
typedef Program<Getchar, typename parse<xs...>::result> result;
};
Loops Strike Again!
Our parser uses immutable data structures, which makes it difficult to know how far we’ve parsed. To keep things simple, we can define a function which will skip the position of the parser’s input to the matching ] when we encounter a [. We keep track of how many loop depths we’re in with a counter.
template<int depth, char... xs>
struct skip_loop_block { };
template<int depth, char x, char... xs>
struct skip_loop_block<depth, x, xs...> {
typedef typename skip_loop_block<depth, xs...>::result result;
};
template<int depth, char... xs>
struct skip_loop_block<depth, ']', xs...> {
typedef typename skip_loop_block<depth - 1, xs...>::result result;
};
template<int depth, char... xs>
struct skip_loop_block<depth, '[', xs...> {
typedef typename skip_loop_block<depth + 1, xs...>::result result;
};
template<char... xs>
struct skip_loop_block<0, ']', xs...> {
typedef typename parse<xs...>::result result;
};
And now we can parse loops:
template<char... xs>
struct parse<'[', xs...> {
typedef Program<Loop<typename parse<xs...>::result>,
typename skip_loop_block<0, xs...>::result> result;
};
template<char... xs>
struct parse<']', xs...> {
typedef Stop result;
};
Taking it for a Test Drive
Here’s the venerable “Hello World!” program:
++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.------.--
------.>+.>.
We can parse it:
typedef parse<'+', '+', '+', '+', '+', '+', '+', '+', '+', '+', '[', '>', '+', '+', '+', '+', '+',
'+', '+', '>', '+', '+', '+', '+', '+', '+', '+', '+', '+', '+', '>', '+', '+', '+',
'>', '+', '<', '<', '<', '<', '-', ']', '>', '+', '+', '.', '>', '+', '.', '+', '+',
'+', '+', '+', '+', '+', '.', '.', '+', '+', '+', '.', '>', '+', '+', '.', '<', '<',
'+', '+', '+', '+', '+', '+', '+', '+', '+', '+', '+', '+', '+', '+', '+', '.', '>',
'.', '+', '+', '+', '.', '-', '-', '-', '-', '-', '-', '.', '-', '-', '-', '-', '-',
'-', '-', '-', '.', '>', '+', '.', '>', '.'>::result program;
Dump it out:
dump<program>::print();
And run it:
Machine m;
program::run(m);
And, as expected, we get:
Hello World!
For an example with nested loops:
+++[>+++++<-]>[>+>+++>+>++>+++++>++<[++<]>---]>.-.[>++>+<<--]>--.--.+.>>>++.<<.<------.+.+++++.>>-
.<++++.<--.>>>.<<---.<.-.-.>+.[+++++.---<]>>[.--.]<<.<+.++.++>+++[.<][.]<++.
typedef parse<'+', '+', '+', '[', '>', '+', '+', '+', '+', '+', '<', '-', ']', '>', '[', '>', '+',
'>', '+', '+', '+', '>', '+', '>', '+', '+', '>', '+', '+', '+', '+', '+', '>', '+',
'+', '<', '[', '+', '+', '<', ']', '>', '-', '-', '-', ']', '>', '-', '>', '-', '.',
'[', '>', '+', '+', '>', '+', '<', '<', '-', '-', ']', '>', '-', '-', '.', '-', '-',
'.', '+', '.', '>', '>', '>', '+', '+', '.', '<', '<', '.', '<', '-', '-', '-', '-',
'-', '-', '.', '+', '.', '+', '+', '+', '+', '+', '.', '>', '>', '-', '.', '<', '+',
'+', '+', '+', '.', '<', '-', '-', '.', '>', '>', '>', '.', '<', '<', '-', '-', '-',
'.', '<', '.', '-', '-', '>', '-', '.', '>', '+', '.', '[', '+', '+', '+', '+', '+',
'.', '-', '-', '-', '<', ']', '>', '>', '[', '.', '-', '-', '-', '>', ']', '<', '<',
'.', '<', '+', '.', '+', '+', '.', '+', '+', '>', '+', '+', '+', '[', '.', '<', ']',
'[', '.', ']', '<', '+', '+', '.'>::result program;
And we get:
Just another brainfuck hacker.
Extra Credit
We can compile our “Hello World!” program with GCC’s -O3 flag and dump the assembly from it with -S:
main:
# stuff we don't care about
movq stdout(%rip), %rsi
movl $72, %edi
call _IO_putc
movq stdout(%rip), %rsi
movl $101, %edi
call _IO_putc
movq stdout(%rip), %rsi
movl $108, %edi
call _IO_putc
movq stdout(%rip), %rsi
movl $108, %edi
call _IO_putc
movq stdout(%rip), %rsi
movl $111, %edi
call _IO_putc
movq stdout(%rip), %rsi
movl $32, %edi
call _IO_putc
movq stdout(%rip), %rsi
movl $87, %edi
call _IO_putc
movq stdout(%rip), %rsi
movl $111, %edi
call _IO_putc
movq stdout(%rip), %rsi
movl $114, %edi
call _IO_putc
movq stdout(%rip), %rsi
movl $108, %edi
call _IO_putc
movq stdout(%rip), %rsi
movl $100, %edi
call _IO_putc
movq stdout(%rip), %rsi
movl $33, %edi
call _IO_putc
movq stdout(%rip), %rsi
movl $10, %edi
call _IO_putc
# more stuff we don't care about
GCC optimized the program for us into just a bunch of _IO_putcs, removing all the unnecessary instructions. Neat!