Sorry this newsletter is late! On Monday I ran a 35-person online TLA+ workshop. Overall I think it went pretty well! There’s a bunch of bugs to iron out and a bunch of improvements I want to make, then I’ll run it again in maybe February. I’ve got one more talk to do this Friday and then I’m done with my contracting work for the year!
Now I don’t know about you, but everybody I know in tech is talking about the new “GPT-3.5” (https://chat.openai.com/). Give it a prompt and it generates text that matches that prompt. And if that prompt is a request for code, the code it generates can be surprisingly accurate. So far it’s solved several days of Advent of Code, passed the 2022 AP CS A test, and mimicked a virtual machine. It can even take a code snippet and inject a bug, and then explain what the bug is!
(Obviously I’m simplifying a lot and it’s not “doing these things”, these are handpicked examples, it stumbles in other places, etc etc etc).
So first some initial thoughts:
That last part’s what I want to focus on. Using an AI well means finding cases where it’s okay for it to generate incorrect output and where using it is still more efficient than just doing the thing yourself, even counting the time spent fixing its mistakes.
In other words, we want tasks with solutions that are hard to find, easy to verify, and easy to fix.
Here’s two examples of that:
Me: What’s the property-based testing library for Rust?
GPT: The most popular property-based testing library for Rust is quickcheck, which is based on the Haskell library of the same name.
This is something I can easily look up, but it’s easier to check if quickcheck is a PBT library (it is) than to look for PBT libraries and find out about
I’d have to confirm these are all right (some are not—they’re engines, not high-level languages), but it’s still a place to start.
When we write a program, we’re doing two distinct activities. First, we’re figuring out an algorithm that solves our problem, and then we’re encoding that algorithm in our programming language’s syntax. It often happens, especially when working in an unfamiliar language, that you know exactly what you want to say but have no idea what you have to actually type.
(Put another way: how do you write a for-loop in bash?)
GPT outputs “look right” because they look syntactically correct, even if they’re behaviorally wrong. If our problem is “knowing the right syntax”, though, then we can use GPT to bootstrap the syntax.
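To make the syntax-versus-behavior split concrete, here’s a minimal Python sketch. The snippet in `GENERATED` is a made-up stand-in for model output, not something ChatGPT actually produced, and `sum_to`, `is_valid_syntax`, and `load_function` are hypothetical names. The snippet parses fine but has an off-by-one in its loop bounds, and once the syntax is on the page the fix is a small edit.

```python
import ast

# Hypothetical "generated" snippet: syntactically valid Python, but the loop
# stops one short of n, so it does not actually sum 1..n inclusive.
GENERATED = """
def sum_to(n):
    total = 0
    for i in range(1, n):
        total += i
    return total
"""


def is_valid_syntax(source: str) -> bool:
    """Return True if the source parses as Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def load_function(source: str, name: str):
    """Run the source in a fresh namespace and return the named function."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]


# The syntax check passes...
syntax_ok = is_valid_syntax(GENERATED)

# ...but the behavior check fails: 1 + 2 + 3 + 4 should be 10.
candidate = load_function(GENERATED, "sum_to")
behavior_ok = candidate(4) == 10

# With the syntax already in place, the fix is a small edit to the bounds.
FIXED = GENERATED.replace("range(1, n)", "range(1, n + 1)")
fixed = load_function(FIXED, "sum_to")
```

The syntax check is cheap and automatic. The behavior check is the part that still needs a person who knows what the code is supposed to do.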
Here’s an example. There’s a LaTeX package called TikZ for textually describing diagrams. It’s powerful but the syntax is hair-pullingly arcane with hundreds of special keywords. The index is fifty pages. About once a year I think “oh this is the perfect tool for my problem” and then have to spend an hour just refreshing myself on how to do anything.
First I told ChatGPT to generate a simple TikZ diagram:
Here’s what it looks like rendered:
Now that’s not a great diagram. But it’s valid, and in a form where I can easily modify it to get what I want. That makes writing TikZ a lot more tolerable!
Similarly, I can bootstrap library code:
That’s giving me integers which are at least one, not integers greater than one, but still, it’s got all of the syntax right, so now I can change it to get what I actually want.
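The generated library code isn’t reproduced here, so as a stand-in, here’s a dependency-free Python sketch of the same off-by-one. `ints_from` is a hypothetical helper, not part of any real testing library, and the `span` of 100 is an arbitrary choice. The only difference between "at least one" and "greater than one" is the lower bound.

```python
import random


def ints_from(lower: int, rng: random.Random, span: int = 100) -> int:
    """Draw a random integer in [lower, lower + span] (hypothetical helper)."""
    return rng.randint(lower, lower + span)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# "At least one": the value 1 is allowed.
at_least_one = [ints_from(1, rng) for _ in range(1000)]

# "Greater than one": bump the lower bound to 2 so 1 can never appear.
greater_than_one = [ints_from(2, rng) for _ in range(1000)]
```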
Neither of the above uses is a gamechanger: we’re talking about saving 10-15 minutes on a large task. AI maximalists are hoping it can eventually do the whole task for us. But that’s been five years out for the past two decades. I’m finding ways to make ChatGPT useful right now. Even if AI hits a brick wall tomorrow and never ever gets better, it’s still meaningfully improved my work.
In short, it’s gonna be just another tool in my toolkit. I imagine that I’ll find more ways to use it soon.