I've recently been really fascinated by the topic of complexity and what keeps us from keeping software simple. The wider net likes to blame "lazy programmers" and "evil managers" for this, as if any software could, with sufficient time, be made as simple as "hello world". I've instead been looking at how various factors create complexity "pressure". For example, code that needs to satisfy a physical constraint is more likely to be complex than code that doesn't.
One complexity pressure is "impedance": when the problem you are solving isn't well suited to the means you have to solve it. For example, if you need to write really fast software, then Python will be too slow. You can get around this by using a foreign function interface, as scientific libraries do, or by running multiple processes, as webdevs do, but these are solutions you might not need if you were using a faster language in the first place. In a sense, impedance is complexity that comes from using "the wrong tool for the job."
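To make the foreign function interface workaround concrete, here is a minimal sketch that calls C's `strlen` from Python through the standard `ctypes` module. It assumes a POSIX system where the C standard library can be loaded, and the `c_strlen` wrapper is a hypothetical name chosen for illustration. Scientific libraries do this at a much larger scale, but the extra ceremony is the same in kind:

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. find_library may return None; CDLL(None) then
# falls back to symbols already loaded in this process, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# ctypes cannot infer C signatures, so we have to declare them by hand.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Length in bytes of `text`, as computed by C's strlen (hypothetical wrapper)."""
    # C expects a NUL-terminated byte string, not a Python str, so encode first.
    return libc.strlen(text.encode("utf-8"))
```

None of the library loading, signature declarations, or encoding would exist if the whole program were written in C. That bookkeeping is the impedance.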
Saying that Python is the "wrong tool" for data science is a little inflammatory. It might have impedance flaws, but it also has a lot going for it: rapid prototyping, a huge community, a large ecosystem, etc. Surely those matter more than the added complexity of slowness!
More broadly, "use the right tool for the job" directly contradicts the best practice of "choose boring technology":
Adding technology to your company comes with a cost. As an abstract statement this is obvious: if we’re already using Ruby, adding Python to the mix doesn’t feel sensible because the resulting complexity would outweigh Python’s marginal utility. But somehow when we’re talking about Python and Scala or MySQL and Redis people lose their minds, discard all constraints, and start raving about using the best tool for the job.
In the majority of cases, the benefits of using the same language for ten different problems outweigh the benefits of using the perfect language for every problem.
One advantage of boring technology: you can manage a lot more complexity in tech you've mastered than in tech you haven't. Complexity (sometimes) happens when the amount of the system we have to think about is larger than the amount we can fit in our heads. But you can chunk information in familiar tech and so raise the amount you can "fit in your head".
The wrong tool might add complexity through impedance, but you can also manage more complexity due to familiarity, so you're still coming out ahead by using it.
(Another way that familiar tools help with complexity: the better you are at a tool, the easier it is to find simple and idiomatic solutions. I die a little inside whenever one of my clients solves a TLA+ problem with double-recursive functions. Nine times out of ten there's a simpler solution they're unfamiliar with.)
I'm inclined to think that the benefits of familiar tooling are enormous, such that you're often better off using a wrong-but-familiar tool than a right-but-exotic tool. And there's a problem with that. What you're familiar with is highly circumstantial. If a team wants to make a slick web app, but all they know is Fortran, then they have a good argument for writing the app backend in Fortran.
…And then, if they need a YAML parser or a PDF generator or a job queue, then those should be in Fortran too. Over time, all language ecosystems will develop all possible language tooling, even if the tooling is for a purpose the language is completely unsuited to.[1] Because no matter how wrong a tool is for the job, proper familiarity can turn it into the right tool.
(Obviously there's a limit to this, which is why some webtech companies have migrated their Ruby and Python codebases to things like Go or TypeScript.)
This also changes what the right tool for the job is. If you ask me, Python isn't that good as a language for scientific computing, both due to performance impedance and because of how hard it is to pipeline data. But because it has such a large data ecosystem, it's more appropriate now for that kind of work. Other languages now have more impedance because you'd need to reproduce those libraries.
This is going in a lot of different places. Let's bring it back together:
I think impedance is an interesting source of complexity because it's avoidable, but not one you always should avoid. You're often better off writing more complex software if you can stick with a familiar tool.
I wrote about modeling message queues in TLA+.
[1] Most often, fortunately, this involves wrappers around binaries. But not always.