New TLA+ Workshop
3 days, April 12-14. Register here!
Vim is Turing-Complete
No, not vimscript. Vim keystrokes are Turing-complete. Here’s how:
I’ll explain how this works, but first, a very brief overview of what I mean by “Turing-complete”.
A Brief Overview of Turing-Completeness
Very roughly: A decision problem is a standard math problem. We have some predicate P(x) (“x is divisible by 2”, “x contains an A”, etc) and we want to come up with an algorithm that correctly determine P for all possible finite inputs. If we can come up with an algorithm, we can then “implement” it in some mathematical system. One class of “implementation systems” are automata, and a Turing machine is one kind of automata. The Church-Turing Thesis hypothesizes that Turing machines are the upper bound of power: if you cannot implement the algorithm in a Turing machine, you cannot implement the algorithm in any system. This isn’t formally proven, but it’s empirically held up for a long-ass time. Nothing can be more powerful than a Turing machine.
Plenty of things are as powerful. We say a model of computation S is Turing-complete if it can solve any problem a Turing machine can, aka S is maximally powerful. The most direct way to prove S is Turing complete by showing that any Turing machine can be converted into an S-construct. Unfortunately Turing machines are a huge pain to work with, so we try to avoid that. Instead, the most common way to prove Turing-completeness is to write an interpreter for a system that we already know is Turing-complete, preferably a system that’s more agreeable. If we could write a C interpreter entirely in Vim keystrokes, then we’ve proven that Vim is Turing complete.
We’re not going to do that. C is even more of a pain than Turing machines are. It’s got too much stuff in it designed for “actually writing real programs” that makes it hard to encode. One of the more popular targets is Brainfuck, an esolang (esoteric language) from the 90s. BF is TC with only one datatype and eight commands. Esolangs commonly prove their own TCness by implementing a Brainfuck interpreter.
I’ve already written a Brainfuck interpreter in Vim keystrokes, so I could stop right there and call victory. But I don’t think it’s a good demonstration! I used way too many Vim Tricks (tm), like macro/register equivalence, that make understanding how it works really annoying. I wouldn’t be able to explain it well and this wouldn’t be a very informative newsletter.
So let’s talk tag systems.
Tag systems are one of my favorite automata, which is easily the third or fourth nerdiest thing I’ve said all month. We have a string of characters and a dictionary that maps single characters to strings:
We cut the first two characters in the string, then look up the mapping for the second character we removed and append it to the end of our string. In the above example, we’d remove
ab, then look up
b to get
acc, then append it to the end to get
cacc. The next five steps are:
We halt if we have fewer than two characters remaining. We can encode the truth-value of the decision problem in any number of ways: whether we end with zero or one characters, what the last character is, whether we halt at all. Hao Wang proved it’s Turing-complete.
This is a good candidate for Vim because there’s no “bookkeeping” internal state. We never change how we interpret the production rules. The input string is also our data tape, and we don’t need to emulate explicit conditionals. It’s all just text manipulation.
Here’s the Vim keystrokes again:
We can assume it’s loaded into a register, say
@q, for use as a macro. The “source file” should be in the same format as the example: input string on the first line, then a couple newlines, then the mappings. Every invocation of the macro corresponds to one step of the computation. Let’s break it down:
Move right, and then move left. We’ll come back to this later.
Delete the first two characters of the string. Vim automatically loads deleted characters into the default register. Since we do this twice, the first deleted character is clobbered, and the default register only has the second deleted character.
Paste the lasted deleted character on the line below, and then search for the next instance of that “word”.
* only looks for whole words, which in this case is a whitespace-separated single character. We now have our corresponding production rule.
Move from the key to the value and copy it into the default register.
Go to the first line of the file, jump to the end, and paste. The production is now appended to the end of the data string.
Cleanup. Delete the scratch character on the line below and go back to the beginning of the file. We have now completed one step of evaluation.
If this is in the
q register, we can append
@q to the macro to make it recursive. Then it will run until the macro errors out, which we want to associate with the halting state. For our purposes, defining the halting state as “less than two characters in the string” is fine. That’s what these two characters are for:
Vim macros define errors as “an error message or beep”. Trying to move right at the end of the line causes a beep, which cancels the macro. Since we start at the beginning of the line, the way for
lh to error is if we have one or zero characters, giving us the halting condition. Otherwise,
lh is a noop.
With this we’ve fully implemented a 2-tag system in Vim keystrokes, proving that Vim are Turing-complete.
Warning: If you use a recursive macro and your cyclic tag input doesn’t halt, the editor will freeze. Use at your own risk.
Why do this?
To my shock and horror, someone found an error in the proof:
Ruslan suggested fixing it with
/^<C-r>. That would work at the cost of introducing two new concepts (generalized searching, regular expressions). A simpler patch would be to use a sentinel value, like
We add the rule that
p cannot be part of the alphabet, so that
pa cannot appear in a production rule. Then instead of running
a, we run it on
pa, guaranteeing we find the prefix instead. The new vim string should be
I also patched a small bug when you have multiple empty production rules. Prefixes for empty production rules need exactly one space after them.