“Keep it simple!” “Just write simple code!” “Great devs come up with simple solutions!”
I get it. Simple code is better than complex code. It’s an important lesson, and I’ll admit it’s one we easily lose sight of; a lot of code is more complicated than it needs to be. But weirdly enough, no matter how many times people say “write simple code”, no matter how many times we repeat our mantras, a whole lot of software is really complicated! Why is that? Why can’t we write simple code? If you look online you’ll run into shit like this:
Most Serious Developers™ like to write Real Software™ with Real Code™. … We need to admit that not every application out there needs the same level of interface sophistication and operational scalability as Gmail. There is a whole world of apps out there that need well thought-out interfaces, complicated logic, solid architectures, smooth workflows, etc… but don’t need microservices or AI or chatbots or NoSQL or Redux or Kafka or Containers or whatever the tool dujour is.
This isn’t an explanation, it’s scapegoating! It’s your fault that the code is too complicated! And, like so many other scapegoating attempts, it’s completely useless for pursuing simplicity. If the only advice you have is “Do Better”, that leaves you completely powerless when Doing Better doesn’t give you simplicity.
To understand why programs are complex, we need to think about them structurally. Complexity is a budget. Different factors “cost” you complexity; you make your program more complicated to address these factors. Most factors can be addressed in a variety of ways, some simpler than others, but there will always be a bare minimum of complexity added. Then achieving simplicity becomes about managing the complexity instead of avoiding it. There is a limit to how simple you can make things.
We’d expect that different factors affect complexity in different ways. They should add different amounts and kinds of complexity. The complexity added by “we need to account for different browser implementations” should look different from that added by “we need to make a backwards-incompatible change to our API without disrupting current customers.”
What are some of these factors? Here are a few I came up with after a little bit of thinking. This isn’t anywhere close to a comprehensive list, or even an organized one, but it at least shows that complexity can come from a lot of sources.
If you have complex requirements, you’ll need complex code to handle them. How complicated the features get is domain-specific. But there are also “universal” features that appear in many types of programs across many domains.
- User configuration and options
- Tagging systems
- Third party integrations
- Data export
- Data auditing
It makes sense to study how these features add complexity to our projects, so we can catalog and share expected problems.
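To see how even a “universal” feature carries baked-in complexity, here’s a minimal sketch of one item from the list, a tagging system. Everything here (the `TagStore` name, its methods) is hypothetical, but even this toy version forces real decisions: tag normalization (“Work” vs “work ”), many-to-many lookup in both directions, and what untagging means.

```python
# A minimal in-memory tagging system. Even this toy forces policy
# decisions: normalization, bidirectional many-to-many lookup, untagging.
from collections import defaultdict

class TagStore:
    def __init__(self):
        self._tags_by_item = defaultdict(set)   # item -> tags
        self._items_by_tag = defaultdict(set)   # tag -> items

    @staticmethod
    def _normalize(tag):
        # Policy decision: tags are case-insensitive and trimmed.
        return tag.strip().lower()

    def tag(self, item, tag):
        tag = self._normalize(tag)
        self._tags_by_item[item].add(tag)
        self._items_by_tag[tag].add(item)

    def untag(self, item, tag):
        tag = self._normalize(tag)
        self._tags_by_item[item].discard(tag)
        self._items_by_tag[tag].discard(item)

    def items_with(self, tag):
        return set(self._items_by_tag[self._normalize(tag)])

store = TagStore()
store.tag("event-1", "Work")
store.tag("event-2", "work ")
print(store.items_with("WORK"))  # both events, thanks to normalization
```

And this version dodges the hard parts entirely: persistence, concurrent edits, tag renames, and tag hierarchies all add more.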
You have an event app that lets users create an event. Adding an “edit existing event” feature is easy. Adding a “create recurring event” feature is easy. Adding both features is messy. How do you handle users who want to edit recurring events?
This makes things more complex in two ways. First, feature interaction is a notorious source of bugs. Second, the two features are “simple” in incompatible ways. Your recurring event code isn’t gonna be nearly as simple once it also needs to account for people editing events.
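Here’s a sketch of that interaction, with a hypothetical data model. Editing one occurrence of a recurring event forces a new concept, per-occurrence overrides, that neither feature needed on its own:

```python
# Why "edit event" and "recurring event" interact badly: editing one
# occurrence forces an override table that every read must now consult.
from datetime import date, timedelta

class RecurringEvent:
    def __init__(self, title, start, weeks):
        self.title = title
        self.dates = [start + timedelta(weeks=i) for i in range(weeks)]
        self.overrides = {}  # date -> replacement title (the messy part)

    def edit_occurrence(self, when, new_title):
        # "Edit just this one": the clean recurring model no longer holds.
        self.overrides[when] = new_title

    def title_on(self, when):
        return self.overrides.get(when, self.title)

standup = RecurringEvent("Standup", date(2024, 1, 1), weeks=3)
standup.edit_occurrence(date(2024, 1, 8), "Standup (new room)")
print(standup.title_on(date(2024, 1, 8)))   # the override wins
print(standup.title_on(date(2024, 1, 15)))  # other occurrences keep the base title
```

And this sketch still ducks the genuinely hard questions, like what “edit all future occurrences” means once overrides exist.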
All products have at least two stakeholders: the client and the developers. Most also have other stakeholders in the company, like sales and marketing. And often the “client” is itself many different groups, like users and advertisers. These groups are not unified.
Stakeholders have conflicting requirements. Users want privacy while the advertisers want to know everything. Buyers want accurate reviews, sellers want 5-star ones. Balancing all of these means writing features that touch on multiple domains. And it leads to more feature interaction, which adds even more complexity.
Every edge case leads to more complexity. Either you special-case the edge case, in which case your logic is messier, or you incorporate the edge case into the core model, in which case the core model is messier. Some domains have more kinds of edge cases than others.
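The two options can be seen in a toy domain, splitting a bill in cents, where the edge case is totals that don’t divide evenly (function names are illustrative):

```python
# Two ways to absorb the same edge case: bolt on a special case, or
# fold it into the core model. Either way, something gets messier.

def split_special_cased(total_cents, n):
    # Option 1: special-case it. The happy path stays obvious, but a
    # bolted-on loop handles the leftover cents.
    share, remainder = divmod(total_cents, n)
    shares = [share] * n
    for i in range(remainder):  # the special case
        shares[i] += 1
    return shares

def split_generalized(total_cents, n):
    # Option 2: generalize the model. No special case, but the obvious
    # division becomes a less obvious cumulative formula.
    return [(total_cents * (i + 1)) // n - (total_cents * i) // n
            for i in range(n)]

print(split_special_cased(1000, 3))  # [334, 333, 333]
print(split_generalized(1000, 3))    # [333, 333, 334]
```

Both versions hand out every cent; they just pay the edge-case cost in different places.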
The constraint is normally performance, but any quantity can become one: latency, scalability, memory, battery, waste heat, number of API calls, your AWS bill. There are also multiple ways to measure a constraint: worst-case performance, average performance, standard deviation of performance (very important in realtime systems), etc.
Optimized code is more complex than unoptimized code. This follows from complexity as a budget. Regular code just needs to minimize complexity. Optimized code needs to minimize complexity and the constraint.
(Yes, yes, “premature optimization is the root of all evil.” Mature optimization adds complexity, too. If you never need to optimize any constraints over the course of the project then you are an extremely lucky person.)
- You’re working with a third party library that doesn’t have quite the right tools you need.
- You get information from a vendor who organizes data in a very different way.
- You need to represent tree data in a SQL database.
- The project’s programming language is poorly suited for a specific requirement.
This is what most people would call “accidental complexity”, since you wouldn’t have it with different tools. But a degree of accidental complexity is inevitable here. Most projects do multiple different things, and tools are rarely suitable for everything you need to do. Sure, document databases have an easier time with tree data, but if SQL is the right choice for every other problem you’ve got… well, guess you’re stuck with accidental complexity.
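Here’s what that SQL-tree mismatch looks like in practice, sketched with the standard library’s sqlite3. An adjacency list is the simplest encoding, but even a basic “all descendants” query needs a recursive CTE, complexity a document database wouldn’t have charged for:

```python
# Tree data in SQL via an adjacency list. A simple "all descendants"
# query already requires a recursive common table expression.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER)"
)
db.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [(1, "root", None), (2, "electronics", 1), (3, "phones", 2), (4, "books", 1)],
)

# Find every descendant of "root" (id 1).
rows = db.execute("""
    WITH RECURSIVE subtree(id, name) AS (
        SELECT id, name FROM category WHERE parent_id = 1
        UNION ALL
        SELECT c.id, c.name FROM category c
        JOIN subtree s ON c.parent_id = s.id
    )
    SELECT name FROM subtree ORDER BY name
""").fetchall()
print([name for (name,) in rows])  # ['books', 'electronics', 'phones']
```

And this is the easy operation; moving a subtree or enforcing “no cycles” in plain SQL adds more complexity still.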
The happy path’s easy, but a lot of complexity comes from handling the sad paths. Or putting in telemetry for crash forensics. Or adding retry logic that doesn’t DDOS your API. etc etc etc.
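As one concrete example of sad-path complexity, here’s a sketch of retry logic that backs off instead of hammering the API. The delays use exponential backoff with full jitter so a fleet of clients doesn’t retry in lockstep; all names here are illustrative:

```python
# Retry with exponential backoff and full jitter. The `sleep` function
# is injectable so tests don't actually wait.
import random
import time

def retry_with_backoff(operation, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call `operation` until it succeeds or attempts run out."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the failure
            # Full jitter: wait a random fraction of the doubling backoff.
            sleep(random.uniform(0, base_delay * 2 ** attempt))

# A flaky operation that fails twice before succeeding.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("try again")
    return "ok"

print(retry_with_backoff(flaky, sleep=lambda s: None))  # succeeds on attempt 3
```

Note how much the sad path costs: the happy path is the one-line `return operation()`; everything else is retry bookkeeping. A production version would also want to catch only retryable exceptions and cap the total delay.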
Special shoutout to maintaining robustness in concurrent and distributed systems, which are so complicated I make my livelihood off teaching people how to formally specify those systems. Not even implementing those systems, just how to describe systems that would be robust if you implement them correctly. And that still saves companies a lot of money! Concurrency makes code way more complex.
Not just security, which is a complexity topic unto itself, but anybody using your software for malicious purposes. Bots, cheaters, cybershills, harassers, stalkers… In addition to all of the regular problems, this adds a couple more layers of auxiliary complexity:
- Lots of people don’t think about malicious actors up front, meaning they have to retrofit security onto legacy code.
- You’re working against adversaries who are always evolving their approaches, meaning you must keep evolving yours, steadily adding more complexity.
If you already have an existing codebase, you can’t solve new software problems “from scratch”. You have to adapt the existing code to handle them, even if starting over would get you a simpler solution.
“Legacy” here is broader than most people think: code you wrote a month ago is legacy compared to code you’re writing now. If you receive requirement A, write a solution, and then receive requirement B, you’re likely to end up with a more complex solution than if you received A and B simultaneously, because your solution for A becomes legacy when addressing B.
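A sketch of that sequencing effect, with a made-up example. Requirement A: totals per user. Requirement B, arriving later: totals per region. The retrofit bolts a mode flag onto A’s code; had A and B arrived together, you’d have written one generic group-by:

```python
# Sequential requirements breed complexity: the legacy-shaped solution
# grows a flag and a branch; the fresh solution is one generic group-by.

def totals_legacy(orders, by_region=False):
    # B retrofitted onto A: a boolean flag whose branch will grow
    # another arm with every new grouping requirement.
    totals = {}
    for order in orders:
        key = order["region"] if by_region else order["user"]
        totals[key] = totals.get(key, 0) + order["amount"]
    return totals

def totals_fresh(orders, key):
    # What A-and-B-together would have produced: group by any field.
    totals = {}
    for order in orders:
        totals[order[key]] = totals.get(order[key], 0) + order["amount"]
    return totals

orders = [
    {"user": "ann", "region": "eu", "amount": 10},
    {"user": "bob", "region": "eu", "amount": 5},
    {"user": "ann", "region": "us", "amount": 7},
]
print(totals_legacy(orders, by_region=True))  # {'eu': 15, 'us': 7}
print(totals_fresh(orders, "user"))           # {'ann': 17, 'bob': 5}
```

Both give the same answers; the difference is which shape your code is in when requirement C arrives.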
This sounds like a myopic source of complexity, or at least an “accidental” one, but it’s just Conway’s Law in action. If you have two separate teams working on the codebase, you’ll see the code structure reflect that, even if it’s more complex than it needs to be.
Or if an expert leaves the company, the team is now working without that expert’s knowledge, so is less likely to see simple solutions to problems.
GDPR. Nuff said.
The hell is worse than the sum of its parts
Lots of stuff I left out in that list, like versioning and portability. But this is a good stopping point for now, because focusing on just the individual sources of complexity is misleading. The majority of your complexity budget is burned on these things interacting.
Two features could interact in a way that raises a security vulnerability, where the “simple” solution would unacceptably reduce performance. Or people start using your software in unexpected ways, leading to new stakeholders, who want things your legacy codebase makes difficult. Or a nasty edge case is most simply fixed by changing a different team’s module, but that team has ten other stakeholders and vetting that the change wouldn’t break anybody else would take too long. Or all three of those things happen at once and are all part of the same problem.
Repeat this many times, over many weeks and months and years, with different problems and features. Some will be independent, some will not. Eventually, bit by bit, your system will grow increasingly complex. You’ll simplify as best you can and find ways to make the system manageable. It’s complex, but it’s complexity born out of necessity, from many different systemic forces acting on many different constraints.
And then somebody rolls up and says “Why is this so complicated?! Stop chasing shiny objects and start being responsible!”
“Just write simple code!”
Simplicity is good. We should write simple code. But complexity is unavoidable. We do a disservice to ourselves by pretending that any software can be simple if we just try hard enough. Instead, we should study the factors that lead to complex software. That way we can learn how to recognize, predict, and manage complexity in our systems. And then we can seek simplicity within that context. It won’t give us simple software, but it will help us write simpler software. Nuance is better than mantras.
PBT Rebuttal Rebuttal
Brian Marick wrote a rebuttal to my essay about PBT. This is prolly the last post on it for the newsletter; if I write a response to his response, it will be a full blog post with examples and combinatorics and shit.