It’s been a while! Let’s just say the past couple of weeks have not been kind to me and leave it at that.
So, newsletter. I got 40% through writing about book recommendations, then read Why Is Naming Things Hard? and threw it all out to talk about naming things instead. So let’s talk about naming things!
Why is naming things hard? Neil sums it up like this:
And naming things is hard because of this compression. English isn’t precise like C++, so compressing from a precise to an ambiguous language increases losses. Another big reason is that choosing the most important things to include is hard. You’ll notice that most disagreements about naming fall on these two axes: either the name isn’t just right for what you want to do … or the name doesn’t capture important parts of the functionality.
This doesn’t tell the whole story. Neil is only talking about why naming as identifiers is hard. The problem with “naming things” is that it covers a bunch of related concepts, each of which is hard in its own way. We’re not only faced with the individual challenges of naming things but also the challenge of realizing there are different things in the first place! Most people who talk about “naming things” don’t do this. They talk about a particular kind of “naming things”, often implicitly, as if that one kind covers the whole domain.
Naming things is hard because it itself is ambiguous. Here are some of the things it could mean:
Naming as Tokenizing
The translation between computer and human identifiers, such that the human can construct a mental model of the program with the help of the names. This is the kind of naming things that Neil talks about in his piece. It’s going from
http_get(request). The name “chunks” the program semantics into an easily identifiable form.
The problems here are as Neil said, though not 100% of the time. In some cases, English can be more precise than code, when multiple possible names map to the same code. This happens when the same code can serve different purposes. Is x = x + 1 incrementing a counter, or iterating through a list, or giving you unique identifiers? It could be any of them. The only difference is what name you choose.
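To make that concrete, here is a minimal sketch in Python. The three function names are made up for illustration. The bodies are identical, so the name is the only thing telling the reader which purpose is meant.

```python
# Three hypothetical functions with the same body.
# Only the names say what the "+ 1" is for.

def increment_counter(count: int) -> int:
    """Count one more occurrence of something."""
    return count + 1


def advance_index(index: int) -> int:
    """Move to the next position in a list."""
    return index + 1


def next_unique_id(last_id: int) -> int:
    """Hand out an identifier nobody has used yet."""
    return last_id + 1
```

Swap any of these names for another and the program still runs the same way. What changes is the mental model the reader builds.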
But the rest of the critiques still hold. You’re compressing a lot of program semantics into a small name with bad grammar.
Naming as Domain Modeling
This is naming as going the other way, translating the human concepts into programming semantics. There are going to be more fundamental concepts in the machine-view than the human-view, because the machine-view needs to work as both a model of the human world and also as a correct program. Or worse, a correct maintainable program.
Consider, I don’t know, an event calendar system. You want to have an event repeat on the second Thursday of every month except November. Think of all the domain concepts you’d need to introduce to accurately capture that one sentence:
LastDayOfMonthRule, I don’t know what else, you probably thought of it in a totally different way. None of this matters to the human. To the human, “second Thursday of every month except November” is barely more complex than “every day”. To the machine, this is a huge expansion of complexity.
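Here is one rough way that expansion could look in Python (3.9+). Every class name below is my own invention for the sake of the sketch, not a real library and not the only reasonable decomposition. Even this small version needs three named concepts to express one short English phrase.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class NthWeekdayOfMonthRule:
    """Hypothetical rule: the n-th given weekday of a month."""
    n: int        # 1 = first, 2 = second, ...
    weekday: int  # Monday = 0 ... Sunday = 6, as in the calendar module

    def occurrence_in(self, year: int, month: int) -> Optional[date]:
        first_weekday, days_in_month = calendar.monthrange(year, month)
        # Days from the 1st to the first matching weekday of the month.
        offset = (self.weekday - first_weekday) % 7
        day = 1 + offset + 7 * (self.n - 1)
        # A fifth Thursday, say, does not exist in every month.
        return date(year, month, day) if day <= days_in_month else None


@dataclass(frozen=True)
class ExcludedMonths:
    """Hypothetical exception: months in which the event never happens."""
    months: frozenset[int]

    def allows(self, month: int) -> bool:
        return month not in self.months


@dataclass(frozen=True)
class RecurringEvent:
    """Hypothetical event: a recurrence rule plus its exceptions."""
    rule: NthWeekdayOfMonthRule
    exclusions: ExcludedMonths

    def occurrences_in_year(self, year: int) -> list[date]:
        dates = []
        for month in range(1, 13):
            if not self.exclusions.allows(month):
                continue
            occurrence = self.rule.occurrence_in(year, month)
            if occurrence is not None:
                dates.append(occurrence)
        return dates


# "Second Thursday of every month except November"
book_club = RecurringEvent(
    rule=NthWeekdayOfMonthRule(n=2, weekday=calendar.THURSDAY),
    exclusions=ExcludedMonths(months=frozenset({11})),
)
```

Calling `book_club.occurrences_in_year(2024)` gives one date per month with November skipped. Each of those class names is now something a human has to understand, remember, and argue about.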
And since we’re programming the machine, all of those names become human concepts again. We need to give them names so we can refer to them, tokenize our code, and discuss them with others. We need names that make it clear why this extra machine-complexity is necessary to capture the human domain. That’s hard.
Naming as Bounding
This is more a purpose of naming than a kind of naming, but interesting enough to discuss here. Beyond just describing what a thing is, a name should describe what a thing isn’t. It needs to exclude. We do this all the time with naming, as a User entity trivially excludes “chairs”. Things get harder when there are concepts that may or may not fit, depending on where you draw the lines. If we’re tracking books, does that include ebooks? Audiobooks? Books on cassette? Torah scrolls?
(If you’re a library, maybe this doesn’t matter, they’re all Reservables. But some reference books cannot leave the library. Should they be excluded? If not, then Reservables is the wrong name. So what’s the right name?)
Naming as Codifying
If x must be a whole number, we say it has an integer type. If x can be a whole number or nothing at all, what type does it have? Is it a nullable type? An optional? A maybe? A voidable?
Several different communities saw the same thing, “types that may not have a value”, and codified it all with a name. Once there’s a name, it becomes possible to talk about it. When should you use optional types? What are the alternatives? Does type system X support them? Codifying creates a topic of discussion.
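As a small illustration, here is the concept under the name Python’s typing module gives it, `Optional`. The function `parse_whole_number` is a hypothetical helper I made up for this sketch. Other languages codify roughly the same idea under other names, such as `Maybe` in Haskell or `Option` in Rust.

```python
from typing import Optional


def parse_whole_number(text: str) -> Optional[int]:
    """Return the whole number written in `text`, or None if there isn't one.

    Hypothetical helper: the return type says "an int or nothing at all".
    """
    try:
        return int(text.strip())
    except ValueError:
        # Not a whole number, so there is no value to return.
        return None


quantity = parse_whole_number("42")
missing = parse_whole_number("forty-two")
```

Here `quantity` holds an integer and `missing` holds `None`. Having one word for that return type is what lets us ask questions like “should this function return an optional or raise an error?”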
This makes naming-as-codifying a very sensitive process. Once you’ve given it a name, that name becomes how people introduce the concept. Unlike a name in a codebase, you can’t change a codification. A misleading name will be confusing people for the rest of your life. The stakes for finding good names are much higher.
One reason codifying annoys me: we codify because we see patterns. If a community needs “types that may not have a value”, it will codify them, whether or not some other community already has. You get a jargon-fracture where some concept has ten different names, each with its own literature and community. This makes researching some topics hell.
Naming as Creation
This is one of the weirdest and most important kinds of naming. This is taking a collection of experiences, ideas, and touchpoints and condensing it into a thing. Carving out a section of the noosphere and saying “this is now X.” This is subtly different from naming-as-codifying. That’s when there’s this… whatever everybody already recognizes as a discrete whatever, and they’re trying to find a name for it. Here, people don’t think about the thing because it doesn’t exist as a concept yet. The act of naming is what turns it into something we can think about. But the two can overlap a lot.
I don’t know if a term already exists for this, but I’ve run into it often enough that I internally codify it as “articulating”. Kent Beck and Ward Cunningham first articulated patterns. Individual patterns existed before them, and people used them, but those two first made “patterns” a general software concept. Michael Jackson articulated the machine-world dichotomy and problem frames. Bertrand Meyer articulated the open-closed principle. Some of these are more debatable than others, whether they were entirely new concepts or codifications of existing ones, but the fundamental idea here is the naming itself generating a new space of knowledge.
I don’t think I’m describing this very well. But it’s something I think is really important, and I spend a lot of time trying to articulate new things. On Emulation is one example, as is Constructive vs Predicative Data. The hard part is connecting the dots and realizing there’s something there that needs a name. I find that once I’ve done that, explaining the ideas is pretty easy. Some of them are in “obvious once you hear it” territory.
Also naming them is hard, because names are boundaries and it’s really hard to tell what needs to be included in the articulation.
There are probably a lot more ways of “naming things” I couldn’t think of. They’re probably all hard, too. Naming things is hard because it could mean a bunch of different hard things. But you don’t realize that from just “naming things”. I can’t think of a better name for “naming things”, though. Naming things is hard.