On Reasoning about Code

software systems

                        February 12, 2021


                        Ranger update
Ranger has now had enough vaccinations to go for walks! This is very exciting. He was 19.6 lbs at his last weighing, which is more than twice what he weighed when he came home!

Reasoning about code
Last week, I read “‘Reasoning about code’ is a scam,” which I saw via Hillel Wayne’s newsletter. Reasoning about code and software systems is a topic I care deeply about. Earlier last year I wrote a lengthy post explicating some of my personal philosophy of software engineering: a deep conviction that software systems can be understood. I would expand on that claim by asserting that “we can reason about software” is either synonymous with or a direct consequence of that understanding. I was therefore very interested in the article’s claim. Today I want to share a few reactions and responses.
Reasoning about software systems
First off, I want to muse a bit about the nature of “reasoning about code.” I care a lot about reasoning about software systems, which includes the concrete source code that makes them up but also, more broadly, their properties and behaviors. I tend to believe that we spend too much time talking about the actual code that constitutes systems (its concrete syntax, appearance, and local details) and too little about the larger-scale architecture, invariants, and design of those systems. These system properties include a lot of behavior that is not explicitly spelled out in source code, such as emergent properties like performance and resource consumption, but that nonetheless derives from the source code.
I think Graham, the article’s author, may be trying to gesture at some version of this distinction but doesn’t spell it out. Because of the close connection between source code and system behaviors, I’m going to use “reasoning about source code” as loosely synonymous with the broader notion of “reasoning about systems made out of software.”
Reasoning about code as a value
Graham says, of the argument that some pattern, language, or framework makes it easier to reason about code:

it’s a thought-terminating cliche. […] The idea is that there’s nowhere to go from here.

I think he’s right that “This approach makes it easier to reason about code” isn’t an argument-ender. It’s not immediately self-evident that some approach is better on that axis, and ease of reasoning about a system is not the only goal, nor always the most important one; we might accept a system that is harder to reason about but more performant or desirable in some other way.
However, I disagree that there’s nowhere to go from here. I think that the ability to reason about code and software systems is very important — I’ll say more in the next section — and we should absolutely take it into account.
The merits of consistency
Furthermore, if we do accept that premise, I think Graham’s next objection actually contains within it the seeds of an important point. Graham argues that because different people and different communities find different approaches or paradigms easier to reason about, all notions of “ease of reasoning” are relative and uninformative. He mentions at least one concrete example:

the Smalltalk folks noticed it was easier to teach object-oriented programming to children than to professional programmers, because the professional programmers already had mental toolkits for comprehending programming that didn’t integrate with the object model. It’s easier for them to “reason about” imperative code than objects.

I disagree that all notions of “easy to understand” are relative. However, if we accept that premise for the sake of argument, I think it suggests a very important principle. In systems with large numbers of developers with diverse backgrounds, most developers approaching the system will have to learn new models or tools or paradigms. However, in a large system, if we limit the total number of such new models, we can greatly improve their ability to reason about and communicate about the system as a whole!
Suppose we have a codebase written using one consistent paradigm (perhaps within one language and using one organizing framework). We need to add some new feature, and we have the choice of doing it in our existing paradigm, or switching models. If we switch to a new programming model, the new feature will be somewhat “simpler” to implement in some way — perhaps both in lines of code, and in terms of some more abstract conceptual space. Should we do so?
Well, every developer on the project, by construction, is already familiar with our existing paradigm, and finds it passably easy to reason about, since they have had time to learn it and work within it. So, for our existing developers, the cost of understanding the new feature will be something like [complexity in the old paradigm] if we stick with our existing tools, and [complexity in the new paradigm] + [cost to learn the new paradigm] if we switch. Even if the intrinsic complexity is lower, the cost of learning the new model may not be worth it. And, even if it is worth it, the calculus for a single feature and paradigm is different than if we were to implement each subsystem or major feature in a different paradigm. If we stipulate that developers’ ease of understanding a paradigm is largely a function of “have they worked with it before,” we get a strong argument for consistency in our choices of tools and paradigms.
As ever, there are no absolutes; if the “intrinsic complexity” in the new tool is sufficiently low, or if we can staff a team to work entirely on the new feature and we don’t anticipate a lot of crossover between the teams, the argument for switching tools gets stronger. Engineering is always about tradeoffs, and saying that some factor has weight is not the same as saying that factor is always the most important one. But I think even this reductive relativistic approach to “reasoning about code” has something important to teach about the value of consistency and of minimizing the tools we use, within a team and project.
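The bracketed comparison above can be written down as a toy calculation. This is only a sketch under loud assumptions: the `Feature` type, the `cost_to_understand` function, the paradigm names, and every number are invented for illustration, and it assumes that “cost” can be squeezed into a single number and that a paradigm’s learning cost is paid once per developer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    paradigm: str
    complexity: int  # effort to understand it, given you already know the paradigm


def cost_to_understand(features, known_paradigms, learning_cost):
    """Effort for one developer: every feature's complexity, plus a one-time
    learning cost for each paradigm in use that they do not already know."""
    unfamiliar = {f.paradigm for f in features} - set(known_paradigms)
    return sum(f.complexity for f in features) + sum(learning_cost[p] for p in unfamiliar)


# All numbers below are invented for illustration.
known = {"existing"}
learning_cost = {"new_a": 15, "new_b": 15, "new_c": 15}

# One feature: build it in the existing paradigm, or in a "simpler" new one.
stay = cost_to_understand([Feature("export", "existing", 10)], known, learning_cost)
switch = cost_to_understand([Feature("export", "new_a", 6)], known, learning_cost)

# Three features, each either in the existing paradigm or in its own new one.
all_existing = cost_to_understand(
    [Feature(n, "existing", 10) for n in ("export", "search", "billing")],
    known, learning_cost)
all_different = cost_to_understand(
    [Feature("export", "new_a", 6), Feature("search", "new_b", 6),
     Feature("billing", "new_c", 6)],
    known, learning_cost)

print(stay, switch)                 # 10 21
print(all_existing, all_different)  # 30 63
```

With these made-up numbers, each feature is individually simpler in its new paradigm, yet the total cost of understanding the system more than doubles once every feature brings its own paradigm to learn.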
Is reasoning a scam?
The article closes with the allegation that reasoning about code is entirely a scam:

Code is a particular representation of, at best, yesterday’s understanding of the problem you’re trying to solve. “Reasoning about code” is by necessity accidental complexity […]
This points to a need for code to be deletable way faster than it needs to be thought about.

I am thoroughly in favor of writing code that is easy to delete. However, this strikes me as a hopelessly simplistic view of the way that software engineering works, or should work.
Code is not written once, run, and then thrown away. Even in an idealized view of the world where we are drastically better at software engineering, much of the code we write exists to support systems that provide value in an ongoing fashion. Our desktop applications, our communication tools, the tools we use to connect with each other and to get work done; the code that runs in our cars and on our phones and in our houses; these all provide value in their ongoing day-to-day existence and functioning. By nature of their ongoing existence, they need to be continually maintained and evolved and debugged to better suit the needs of their users and the needs of tomorrow.
Whether or not we view this softwarification of the world as a good thing, it is our reality, and as software engineers we have to grapple with it. The purposes for which we build software require software that can evolve and be understood into the future, and that, in turn, requires that we be able to understand and reason about these systems and the code that creates them.
We should aspire to build these systems in decoupled ways that support rewriting and deleting individual components as part of that evolution. But we cannot, in today’s world, aspire to delete these systems in their entirety. And even deciding whether it is safe to delete a subsystem or a component is a question of reasoning about the larger system, or, as we might put it, “reasoning about code.”
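As a small illustration of why deletion is itself a reasoning problem, here is a sketch of the most basic question we would need to answer before removing a component: does anything else still depend on it? The `depends_on` map and the component names are hypothetical, and a static dependency map captures only part of what “safe” means; runtime behavior, stored data, and external users are outside this toy model.

```python
def dependents_of(component, depends_on):
    """Return every component that directly or transitively depends on `component`."""
    # Invert the edges so we can walk from a component to its users.
    used_by = {}
    for user, deps in depends_on.items():
        for dep in deps:
            used_by.setdefault(dep, set()).add(user)

    found, stack = set(), [component]
    while stack:
        for user in used_by.get(stack.pop(), ()):
            if user not in found:
                found.add(user)
                stack.append(user)
    return found


# Hypothetical system: each key depends on the components in its set.
depends_on = {
    "web": {"auth", "reports"},
    "reports": {"legacy_export"},
    "auth": set(),
    "legacy_export": set(),
    "old_importer": set(),
}

print(sorted(dependents_of("legacy_export", depends_on)))  # ['reports', 'web']
print(sorted(dependents_of("old_importer", depends_on)))   # []
```

Even in this tiny example, knowing that `old_importer` has no dependents while `legacy_export` does requires a model of the whole system, not just of the component we want to delete.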
Software engineering is about the time-extended development of evolving software systems. Code we wrote yesterday is part of today’s problem, whether we like it or not. We may choose to label that “accidental complexity,” but that label does not free us of the problem of dealing with it.
Writing systems that are amenable to understanding is one of the most important problems we can attack, for the sake of our future selves, future developers, and, ultimately, our future users.
