Two tweets from last week.
1) A question I asked Friday:¹
If you know what property-based testing is and don’t use it in your code, why don’t you use it? (not saying you should, just curious about the reasons why people don’t)
I wanted to get a sense of what barriers people had. By understanding the barriers better, we understand how to make it more usable, right? I got a lot of answers, which I’ll probably write up into a blog post at some point, but one answer in particular I want to talk about now:²
i feel like often when i see examples of property based testing (eg [link]) they use a lot of toy examples of functions to test (like Math.abs) that I find it hard to relate my actual code to
As the author of the Python PBT library says, “Every time someone uses reversing a list twice to demonstrate property-based testing, I take a drink. No, this isn’t a drinking game, I’m just being driven to drink by bad examples”.³ And I agree! I see the “reverse a list twice” example used to show how good PBT is, but it’s a bad example!
2) This tweet from Matt Fournier:⁴
Re-working the intro to FP first talk. […] I’m really struggling to find a good type safety example that is longer than Identity (worst example, the sepal width of FP), but 2-3 long slides max
“Sepal width” is a very common example in machine learning: classify flowers in a tagged data set. I was vaguely familiar with it, but hadn’t heard of it used as an insult before! I guess it’s so common in ML that everybody’s sick of seeing it. Just like everybody’s sick of seeing “reverse a list twice” used as an example for PBT.
In the best traditions of over-abstracting, let’s define a new term that covers these two things: a canonical example is one that is so widely used as a representative of a topic that people who know about that topic associate it with the example.
Other examples (hah!) of canonical examples: “hello world” for programming, factorial for recursion, counting words for MapReduce, bank transfers for concurrency and transactions, “i++ // increment i by one” for why comments are bad, Therac-25 for software accidents… I’m sure you can think of a bunch for whatever your domain of expertise is.
Canonical examples seem to have some common properties.
- They’re easy for everyone to remember.
- They’re easy for the viewer to understand.
- They’re easy to explain to other people, regardless of how well you understand the core topic.
- They’re easy for teachers to share with other teachers.
So canonical examples are extremely convenient to use, which is probably why they become canonical in the first place.
The problem is that canonical examples are very rarely insightful or persuasive. They have to be super shallow, otherwise they’d be too inconvenient! Take the following examples of property testing:
- “If you randomly generate a list and reverse it twice, you get the same list!”
- “If we generate a sequence of expected scene classifications and reverse-construct a pixel video from that, then the scene classifier will output classifications with scene breaks corresponding to the intended scene breaks in the generated expected classifications.” (from here)
The first gives you a much better sense of what property testing actually is. That’s why it’s canonical and the other isn’t! But all the reasons it’s easier to understand are also reasons why it’s less likely to persuade you PBT might be useful for actual real-life problems.
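For readers who haven’t seen it, here is roughly what the canonical example looks like in code. This is a hand-rolled sketch using only Python’s standard library, not the API of any real PBT library; the helper names (`random_int_list`, `check_property`) and the choice of integer lists up to length 20 are assumptions made for illustration.

```python
import random


def random_int_list(rng, max_len=20):
    """Hypothetical generator: a random-length list of random integers."""
    length = rng.randint(0, max_len)
    return [rng.randint(-1000, 1000) for _ in range(length)]


def check_property(prop, trials=200, seed=0):
    """Run prop on many random lists; return the first counterexample, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = random_int_list(rng)
        if not prop(xs):
            return xs
    return None


def reverse_twice_is_identity(xs):
    # The canonical property: reversing twice gives back the original list.
    return list(reversed(list(reversed(xs)))) == xs


def reverse_once_is_identity(xs):
    # A deliberately false property, to show what a failure looks like.
    return list(reversed(xs)) == xs


counterexample_true = check_property(reverse_twice_is_identity)
counterexample_false = check_property(reverse_once_is_identity)
```

The first check finds no counterexample and the second finds one almost immediately. That is the whole demonstration, which is rather the point: it shows the mechanics of generating inputs and checking a property, and nothing about whether the technique helps with code you actually care about.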
I spend a lot of time thinking about examples. Not just finding examples, but why and how we use them. Two years ago I wrote a blog post called Instructive and Persuasive Examples, which also tries to categorize examples, but since then my thinking on them has grown considerably more convoluted.
One other consequence of categorizing examples is that it makes it possible to talk about them on a meta-level. “Reversing a list twice” might be a canonical example of “canonical examples”, but likely not of “examples that are both canonical and persuasive.”
(Can you have a canonical example of “canonical examples of canonical examples of things”? Maybe not. The idea seems pointless to me, but I also know I can’t mentally go more than “one level of meta” before my brain stops. That frustrates me.)