A week or so ago, GitHub announced GitHub Copilot, their AI-powered code completion assistant, powered by a model descended from OpenAI’s GPT-3. I’ve spent a lot of time working on developer productivity tools and am also now working on language-generation models at Anthropic, so I’m very interested in Copilot and its implications. I haven’t been invited to the beta yet (probably because I don’t use VS Code), so I haven’t had a chance to play with it, but I wanted to jot down some initial thoughts and reactions.
I want to caveat up front that these are all my personal views, not those of Anthropic, and I’m pretty sure that most of these reactions would have been the same even had I not joined Anthropic earlier this year.
It’s hard to deny that Copilot is very flashy and impressive in some ways; certainly, if you’d told me two years ago that we would have this level of automated code-writing based on generative models, I’d have been extremely skeptical.
That said, I struggle to see how Copilot would fit into my workflow in a way that would make a huge difference in my productivity. I feel that the hard part of my job is rarely actually writing the function-level code, but rather all the rest of my job — designing the system and the interactions between functions and components; figuring out what to build in the first place; debugging systems; and so on. And Copilot doesn’t help that much with those tasks.
One limitation that feels huge to me is that Copilot doesn’t really understand your particular codebase. Copilot only gets to “see” the text preceding the cursor in your current file, so it can’t even see your type definitions or functions defined in other modules. It seems to do an okay job of inferring or guessing what your types are or what methods might exist based on the code it does see, but even I, as an experienced engineer, would struggle mightily to write code if I were never able to look at documentation or definitions for the particular project I’m working on.
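To make the limitation concrete, here’s a hedged sketch (all names invented for illustration) of the situation: definitions live in one module, and the completion has to be written in another, where Copilot can’t see them.

```python
# models.py -- the definitions that live outside your current file,
# which Copilot never gets to see (these names are made up for illustration)
from dataclasses import dataclass


@dataclass
class User:
    name: str
    email: str

    def display_name(self) -> str:
        return f"{self.name} <{self.email}>"


# app.py -- while you edit here, Copilot sees only the text above your
# cursor in *this* file. It has to guess that User has a display_name()
# method; it might just as plausibly suggest user.full_name(), which
# doesn't exist and would fail at runtime.
def greet(user: User) -> str:
    return f"Hello, {user.display_name()}!"
```

A human in the same position would jump to the definition of `User`; Copilot has no equivalent move, so it can only pattern-match on what’s visible.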
In this sense and some others, I think it’s maybe productive to think of Copilot less as a pair programmer, and more as something like an in-editor interface to Stack Overflow. Like Stack Overflow, GitHub Copilot has extensive knowledge of your language and of open-source libraries and common idioms and patterns to solve common problems. Like Stack Overflow, it has very little knowledge of the particular types or idioms or utilities available in your codebase or application.
It’s obviously a bit more sophisticated than the existing corpus of Stack Overflow answers; it has some ability to improvise and stitch together pieces of functionality, and some ability to pattern-match the style, idioms, and some of the types and libraries of your current codebase. But, for instance, if you prompt it to download the contents of a URL, I expect it to be pretty unlikely to realize that your codebase has an internal helper that goes through smokescreen and otherwise configures appropriate local defaults; Copilot will almost certainly give the same answer that asking Stack Overflow would, and point you directly at a standard library like `requests`.
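A sketch of the contrast, with entirely hypothetical internal names (the helper, the environment variable, and the proxy address are all invented; I’m assuming a local egress proxy in the style of smokescreen):

```python
import os
import urllib.request


def fetch_url_generic(url: str) -> bytes:
    """The generic, Stack Overflow-style answer Copilot is likely to give:
    no proxy, no timeout, no local conventions."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def fetch_url_internal(url: str, timeout: float = 5.0) -> bytes:
    """A hypothetical internal helper your codebase might actually want,
    routing egress through a local proxy and applying sane defaults.
    The env var name and default address here are made up for illustration."""
    proxy = os.environ.get("EGRESS_PROXY", "http://localhost:4750")
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    )
    with opener.open(url, timeout=timeout) as resp:
        return resp.read()
```

Both functions “download the contents of a URL,” but only one of them is the right answer inside a codebase with established conventions — and that’s exactly the knowledge Copilot doesn’t have.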
All of this said, that’s my gut personal reaction; I actually suspect — both based on reflection, and on the reports from early users — that this tool is already more significant than I am inclined to believe.
Stack Overflow is a hugely valuable tool for developers, and even if GitHub Copilot “only” behaves like “Stack Overflow, but in your editor,” that might be a huge deal. I use Stack Overflow less than a lot of developers I know, because I am blessed with an excellent memory for software trivia, and because I like to fully understand the libraries I use and so I prefer puzzling through reference documentation. But even I find myself reading Stack Overflow answers fairly frequently, and it’s clearly even more valuable to junior engineers, helping point them towards the right documentation or make sense of dense documentation.
Also, there’s a long-running (and not incorrect) joke that, as an experienced developer, the more you shrug and say “I don’t get it” to a new tool, the more likely it is that it’s actually the new hotness and is going to be absolutely everywhere in a few years. I suspect that heuristic might apply here.
This is another reason I suspect Copilot — or its successors — will be a big deal.
There’s a saying in cryptography that “attacks only get better”. It’s an attempt to summarize the fact that the best known attacks on a cryptosystem can only improve with time, and that where there’s one weakness, there are likely to be others. Therefore, the right time to migrate off of a system is at the first sign of weakness, even if the attack isn’t yet practical, because it will only ever get more practical.
Similarly, the field of machine learning is moving incredibly quickly right now: models are only getting better, compute is only getting faster and cheaper, and the field is still accruing larger and larger amounts of money and investment. The right way to look at Copilot — and Codex, the language model behind it — is not as the final version of this product, but as the worst version of this product GitHub and OpenAI will ever release. Future ones are likely to only get more capable and more sophisticated, potentially in surprising and unexpected ways. So, even if my gut instinct were right and Copilot, as it was released, doesn’t matter that much, that would be weak evidence at best that it will stay that way.
Copilot’s launch has come with a lot of jokes and half-jokes about how computers are going to put programmers out of jobs.
It’s hard to say what will happen with future, better versions of this tool, and a lot of these jokes are predictions about the future trend of such technologies, more so than about Copilot itself. However, it seems clear to me that in the immediately foreseeable future, a chain of implications along these lines is likely to be much more relevant: better tools make software engineers more productive, which makes software cheaper to produce, which increases demand for software, which in turn increases demand for software engineers.
I think you can view this prediction as an instance of the Jevons Paradox; anything that makes software production more efficient will just increase the demand for software, rather than decreasing the demand for software engineering. AI tools will have to get much better at the entirety of the software development lifecycle, and take on a much larger fraction of the entire supply chain of creating software, before the effect flips and they start reducing demand for engineers.
Finally, and on a personal note: I joined Anthropic in part because it seems clear to me that machine learning is gearing up to have an incredible impact in a diverse set of fields, including my own areas of software systems and software development, and I want to gain more familiarity and expertise with the space, as well as some measure of impact on the directions it takes.
GitHub Copilot feels to me like a strong affirmation of that belief, and evidence that now is, in fact, a good time to be getting into the field. So this launch makes me continue to feel good about my choice of where to invest my time and intellect.
If Copilot makes you feel like you want to know more about language models, and the state of modern ML, Anthropic is hiring! ML experience is a plus but not at all required for strong software engineers and especially strong systems engineers; I came to the org with essentially no ML experience and have found it to be a very welcoming place and an excellent environment for coming up to speed very quickly, while also working on valuable systems problems from day one. If you’re curious, please feel free to reach out to me directly if you have any questions I could help with!