Alloy is a formal specification language I use a lot. In Alloy, ≈all data is either an atom or a relation between atoms.
sig DataSource {}
sig Step {
  requires: set Step
}
Unlike most languages, Alloy has two notions of subtyping: a type (or "signature") can be extended with extends, which is exclusive, or it can have subsets declared with in, which are stackable. In this example, a source can be generic, a database, or a file, but never more than one of those. A Step can be generic, an extraction step, a load step, or both an extraction and a load step:
sig Database, File extends DataSource {}
sig Extract in Step {
  from: DataSource
}
sig Load in Step {
  to: DataSource
}
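A minimal sketch of how the difference could be probed in the Alloy Analyzer. The signatures are repeated so the model stands alone; the three commands and the scope of 3 are illustrative choices, not part of the original model.

```alloy
sig DataSource {}
sig Database, File extends DataSource {}

sig Step {
  requires: set Step
}
sig Extract in Step {
  from: DataSource
}
sig Load in Step {
  to: DataSource
}

// `in` subtypes stack: look for a step that is both an Extract and a Load.
run { some (Extract & Load) } for 3

// A generic step that is neither should also be possible.
run { some (Step - (Extract + Load)) } for 3

// `extends` subtypes are exclusive: look for a source that is both kinds.
// No counterexample is expected.
check { all s: DataSource | not (s in Database and s in File) } for 3
```

The two run commands should each produce an instance, while the check should come back clean, because signatures that extend the same parent are disjoint by definition.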
Some thoughts on this:
X, Y in Step, we can't have something be both X and Y. However, the Alloy analyzer can tell us if two inclusive subtypes are incompatible, and what rules make them so. This is still useful for modeling!
Alloy doesn't have a primitive boolean type. Instead, if you want a boolean "field", you use in:
sig DataSource {}
sig PrivateSource in DataSource {}
sig PremiumSource in DataSource {}
sig UntrustworthySource in DataSource {}
// each of these can have separate fields
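Here is a rough sketch of how these subset signatures can stand in for boolean fields, and how the analyzer can report that two of them are incompatible. The Owner signature, the owner field, the needsReview predicate, and the PremiumIsVetted rule are all hypothetical, added only for illustration.

```alloy
sig Owner {}
sig DataSource {}
sig PrivateSource in DataSource {
  owner: one Owner   // hypothetical field that only private sources carry
}
sig PremiumSource in DataSource {}
sig UntrustworthySource in DataSource {}

// Membership plays the role of a boolean field:
// "s.isUntrustworthy" becomes "s in UntrustworthySource".
pred needsReview[s: DataSource] {
  s in UntrustworthySource and s not in PrivateSource
}

// Hypothetical business rule: premium sources are never untrustworthy.
fact PremiumIsVetted { no (PremiumSource & UntrustworthySource) }

// Some source can need review, so an instance is expected here.
run { some s: DataSource | needsReview[s] } for 3

// A premium source can never need review, so no instance is expected.
run { some s: PremiumSource | needsReview[s] } for 3
```

When the second command finds no instance, a solver with unsat-core support can highlight PremiumIsVetted as one of the constraints responsible.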
Private, Premium, and Untrustworthy are adjectives applicable to the "Data Source" noun. The set of all private data sources is a subset of the set of all data sources. Every adjective acts as a subtype.
Or at least, that's what I thought, but the paper Lexical Semantics and Compositionality provides counterexamples: a "former senator" is not a senator and a "fake gun" is not a gun (pg 12). This reminds me of the Liskov substitution principle. There are valid sentences that make sense with "senator" but not "former senator", just as there are functions which can accept Senator but not FormerSenator.
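The "former senator" problem can be sketched in Alloy itself. This is only an illustration under assumed names: Person, Seat, the seat field, and the FormerSenatorsHoldNoSeat rule are hypothetical and do not come from the paper.

```alloy
sig Seat {}
sig Person {}
sig Senator in Person {
  seat: one Seat   // assumption: every senator holds exactly one seat
}

// Naively treat the adjective as a subtype of the noun.
sig FormerSenator in Senator {}

// What we actually know about former senators: they hold no seat.
fact FormerSenatorsHoldNoSeat { no FormerSenator.seat }

// Every FormerSenator is a Senator and so must hold a seat, which
// contradicts the fact above. No instance is expected.
run { some FormerSenator } for 3
```

Anything that holds for every Senator is forced onto every FormerSenator, which is the substitution failure in miniature: under these rules the only consistent model is one with no former senators at all.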
(The author starts relating this to a "possible worlds" interpretation of adjectives, which I want to spend time studying but which is way outside the scope of this newsletter.)
Diving down this rabbit hole led me to the concepts of hyponyms and hypernyms, as well as WordNet. The more I dig, the more I'm seeing ideas that look a lot like CS ideas, except more fully developed. I know there have been a lot of ties between linguistics and CS: computational linguistics is a thing, and I think early Lisps were used heavily in language studies (I haven't been able to find a source for that). But that's all between academics doing research, not practitioners building software.
At the same time though, I recognize that I know barely anything about linguistics aside from a bunch of Wikipedia articles. I've been critical in the past of "programmers must learn unicycling"-style thoughtpieces and don't want to fall into the same trap. I think there could be useful ideas in linguistics that can be extracted and turned into useful programming ideas, but I don't think everybody needs to learn linguistics.
What I really want to do is learn more about it myself, and then interview a bunch of linguistics/SE crossovers, and then see what comes of that. Maybe a project for later this summer.
I figure I should mention at least one attempt to make software look a little more like linguistics: the J language. In J, values are called "nouns", functions are "verbs", and map is an "adverb". As the inventor put it:
I now prefer the terms drawn from natural language, as illustrated by the terms shown on the right. Not only are they familiar to a broader audience, but they clarify the purposes of the parts of speech and of certain relations among them:
- A verb specifies an “action” upon a noun or nouns.
- An adverb applies to a verb to produce a related verb; thus +\ is the verb “partial sums.”
- A conjunction applies to two verbs, in the manner of the copulative conjunction and in the phrase “run and hide.”
- A name such as a or b behaves like a pronoun, serving as a surrogate for any referent linked to it by a copula.
Later iterations on J introduced "gerunds" and "idioms", but I'm not aware of any other inspiration from linguistics besides that.
Still working on it! Rewrote the "theory" part last week, now planning how to restructure the "applications" part. No completion estimates.