Computer Things

Archive

GPT is revolutionary

I don’t feel comfortable making predictions about the future. There’s just too much that goes into it and it’s way too easy to be wrong. But if I don’t occasionally write my riskiest thoughts, do I really deserve a newsletter?! If this ends up being wrong, I promise I’ll do a postmortem. Here goes:

I think GPT-4 will be revolutionary. I think ChatGPT already is revolutionary, and in fact the revolution started in February, when OpenAI released ChatGPT Plus. We haven’t seen the full consequences of it yet.

Half of you are thinking that I’m saying the obvious and the other half think I drank the AI koolaid. I think I’m approaching in a very slightly different way than most of the stuff I’ve read, so please give me a chance to justify why I think this is a big deal.

GPT is different from other AI hype cycles

#234
March 24, 2023
Read more

Ergonomic APIs, channel invariants, and data views

Hi everyone! It’s two days after the March 20 TLA+ workshop, which means it’s time to start getting feedback and revising things for the May 15th workshop. At some point I want to talk about my “workshop technology” I use, but that’ll take some time to write up, so maybe next week? We’ll see.

(If you’re on the fence about the workshop: attendees can share specs they write after the class and I give them feedback. So not only are you learning TLA+, you get help applying it to your job!)

Anyway, here’s a software engineering idea I’ve been playing with but haven’t tried in practice yet, so I don’t know how well it works. Constraint solving1 has this technique called “channeling constraints”, where you represent data in multiple connected ways. Like if you’re constraint solving a schedule, you have two possible representations of “which talk is in slot T and room R” (in MiniZinc):

#233
March 22, 2023
Read more

Making Memes

Making memes

I’m ahead of schedule on workshop prep, which means I have time to think about things besides pedagogy and formal methods. And I’ve been thinking of a kind of article I commonly write:

  1. Here’s some examples of something.
  2. I am going to give that thing a name.
  3. Now that we have a name, let’s discuss it as its own topic.

Examples: edge case poisoning, mimicry, cleverness vs insight, constructive and predicative data.

#232
March 16, 2023
Read more

What can you code up in an hour?

We’re just one week from the TLA+ workshop! Thanks to everyone who signed up, and if you’re interested, there’s still two slots left!

I also published a new blog post, A Neovim Task Runner in 30 lines of Lua. It’s about a little Neovim script I wrote that can handle tasks like “execute this script on whatever file is in the left window, with whatever flags are in the active buffer’s b:flags variable. Yes, this is something that I actually needed. I have weird problems.

For the newsletter, the important bit is the “30 lines” part. There’s a lot of reasons why we program, but a major one is automation of manual tasks. For most people, the tradeoff is not between language A and language B but between language A and doing the task by hand.

Now for the obligatory XKCD:

#229
March 13, 2023
Read more

Formalizing Stability and Resilience Properties

Sent to me via mailbag:

Have you used TLA+ to model resilience as in resilience engineering (systems in general)? — Feodrippe

Resilience Engineering, to my understanding, refers to building systems that can function in the presence of disruptions. For example, if a backend service gets overloaded, its dependencies can automatically switch to fallback logic to keep serving customers and give the service time to recover.

I’ve not done much work personally with “resilience engineering”. But the idea comes up a lot in formal methods, in the form of resilience properties. These are goals of the systems that we want (and use resilience engineering) to achieve. And I think the details of resilience properties are interesting enough for a deep dive. I’ll use syntax, but it should be applicable to other specification languages, too.

#228
March 8, 2023
Read more

Predicate Logic for Programmers: Status Report

Two years ago I started a new book: Predicate Logic for Programmers. In it I said

People often ask me what’s the best math to learn for formal methods, and my answer is always “predicate logic”. 1 It’s super useful to specifying properties, understanding requirements, and just modeling things in general. Then they ask me how to learn it and I falter.

I estimated it would be in early-access by June 2021. But then real life intervened, and then ADHD happened, and then I didn’t touch it for two years. In January of this year I picked it up again, with the goal of having the first draft done by the end of winter. Well, that’s not happening either, but I am 10k words in, so that’s progress! Let’s do a rundown of what the book is, what I have planned, and what I’m struggling with.

Goals and Themes

#227
March 6, 2023
Read more

Hype Cycles Aren't "Gaslighting" You

I promised an update on the logic book today but I haven’t done a spite write in a while and oh boy a new one is just raring to go. Update is queued up to be sent out Monday.

Anyway, the candidate for today is The Great Gaslighting of the JS Age. The author Jared White looks at the current spike in React Discourse and claims that the pro-React arguments are identical to last decade’s pro-Angular arguments. This means that pro-React people are gaslighting everyone else.

I’m angry because for the past decade of web development, I and so many others like me feel like we’ve been repeatedly gaslit, and that so many of the “merchants of complexity” refuse to acknowledge the harm that’s been done.

There has been a small but mighty ecosystem of “influencers” peddling a sort of “pop culture developer abstractions” ethos on the web whether it’s about React, or CSS-in-JS, or Tailwind CSS, or “serverless”, or “microservices”, or (fill in the blank really)—and they’re continuing to gaslight and obfuscate the actual debates that matter.

I don’t have a dog in this fight. I learned formal methods to avoid writing CSS. I don’t give a single shit about React versus Vue versus Vanilla JS. But I do care about honest writing, and I especially care about mental health, and I want to say that calling this gaslighting is complete bullshit.

#226
March 2, 2023
Read more

Teaching Implication Better

Hello everyone! It’s finally March, or at least close enough to March for my purposes. First thing, we’re a month off from April Cools! April cools is a less-cringe version of April fools, where content creators like me publish content that is both genuine and totally out of genre. Last year I took a break from software engineering to write about microscopy. Other people wrote about singing church music, marathon food, and how to read rot13. If you’ve got a blog, I’d heartily recommend joining! It’s a lot of fun.

Anyway, between that and a bunch of work obligations, this is going to be a real busy month for me. I’m still committed to updating the newsletter six times a month, but it might be a bit erratic: instead of the alternating 1-2-1 schedule I’ve been on, it might be 2-2-0-2 instead. Just a heads up.

(If you want to make the newsletter a little easier on me, why not send in a question? If I get a critical mass of questions, or one really really deep question, I’ll do a newsletter about ‘em.)

Now, two of the things this month are related: the TLA+ workshop in March and the predicate logic book I’m working on.1 Predicate logic is a necessary requirement for using TLA+ well, but also has a lot of uses outside that. So I’m always trying to figure out how to teach it better. And two topics in predicate logic cause me the most trouble:

#225
February 27, 2023
Read more

Code review vs code proofreading

Administrative Stuff

  • Just one month until the March TLA+ Workshop! Thanks to everyone who already signed up, I’m in the process of revising everything and am real excited to share the new content. There’s still nine slots left if you want to join!
  • I have a new blog post up: NP-Complete isn’t (always) Hard, about NP-complete problems and modern SAT solvers.

Code review vs code proofreading

So in my last newsletter I had an aside on code review:

#224
February 20, 2023
Read more

Programming AIs worry me

For some inane reason, Github classifies me as a “major open source maintainer”, which means I get a free copilot subscription.1 I’ve been using it for a couple months now and I got to say, it’s a goddamn delight. It can write boilerplate like nobody’s business. I find the tool works best when I’m using it as a keystroke saving device, where it writes 1-2 lines at a time. I write x = and it completes with really[long][dict][lookup]. It’s all very easy.

And this easiness worries me. I got a lot more worried when I read What Do ChatGPT and AI-based Automatic Program Generation Mean for the Future of Software, by Bertrand Meyer. He starts by testing ChatGPT with a tricky spec:

I like to use the example of a function that starts with explicit values: 0 for 0, 1 for 1, 4 for 2, 9 for 3, 16 for 4, 25 for 5. […] This time I fed the above values to ChatGPT and for good measure added that the result for 6 is 35. Yes, 35, not a typo. Now, lo and behold, ChatGPT still infers the square root [sic] function!

About what I expect. After he told ChatGPT that it was for 6, ChatGPT gave back a giant if-else chain. This technically is compliant with his spec but doesn’t capture the spirit of what he wants. A good example of the limits of the tool.

#223
February 16, 2023
Read more

Maybe people do care about performance and reliability

It’s well-established consensus that software is slower and more bloated than it was 20, 40 years ago. One explanation is that software engineers don’t care about their work. Another is that it’s the interplay of a lot of different factors and blaming it on apathetic devs is a convenient way to avoid understanding the actual problems.

This post isn’t about that.

This post is about the other side of the coin, blaming it on apathetic consumers:

If you poll your existing customers about what you should work on next, they’re going to ask for features, not speed—unless the software is so slow it borders on unusable. And god forbid any red-blooded board of directors would allow the company to take a six-month detour from its product roadmap to work on technical debt. The pressure is always on us to build features, features, features.

Programmers want to write fast apps. But the market doesn’t care.

Programmers want to write bug-free apps. But the market doesn’t care.

#222
February 13, 2023
Read more

Creatively Misusing TLA+

I spent the past few weeks thinking about complexity and poking dead birds and stuff, but now that the March TLA+ workshop is available (use C0MPUT3RTHINGS for 15% off!), I’m back in teacher mode and making workshop improvements.1 TLA+ is intended for finding flaws in software designs. But what else can we do with it?

Creative Misuse is the use of a tool in an unintended way. For example, if you use a screwdriver to pry something open, or a book as a monitor stand. Creative misuse in software includes making games in Excel spreadsheets and using yes to test for broken hardware. Creative misuse is 1) very fun, and 2) expands the space of how useful the tool is. I love finding creative misuses for tools. Here’s a few of them for TLA+.

Data Generation

In most programming languages, we construct values step-by-step. In most formal specification languages, we instead take a big set of all possible values and then winnow them down with predicates. This is less efficient, which is why programming languages don’t do it, but it’s also a lot more expressive.

#221
February 7, 2023
Read more

Improve your debugging by asking broad questions

I recently had to help a friend debug a Word issue where fonts would randomly change to Greek symbols. It got me thinking about theories of debugging in general. At my last job, I was the Debugging Guy. I’d semiregularly have a sprint task like “this other team is seeing a weird behavior in an old system, help them figure out what’s going on.” I was pretty good at it, but I couldn’t explain what good debugging looked like to other people.

Since then, I found a couple of good resources. First is Julia Evan’s posts on debugging. Second is David Agans’ book Debugging, which I really like as an introduction, so much that I bought a bunch of copies to give as gifts to early-career friends.

On top of what they say, here’s one technique I like: you can speed up debugging by asking broader questions.

Debugging as hypothesis building

#220
February 2, 2023
Read more

Tag Systems

I’ve tried to write a blog post on tag systems for years now. Literally years, I think I first started drafting it out in 2018 or so? The problem is that there’s just so much to them, so many different approaches and models and concerns that trying to be comprehensive and rigorous is an exercise in madness.

So screw it. These are my noncomprehensive, poorly-researched thoughts on tag systems, thrown on the newsletter. This is not about implementation of tag systems, just their design.

What is a Tag

A tag is a metadata label associated with content. The tag name is also the id: two tags with the same name are the same tag. Tags appear in almost all large systems:

#219
January 30, 2023
Read more

New Workshop, Some Data-ish Pipeline Tricks

Lots of admin stuff today! First, we have a new blogpost, the full version of the complexity preview I shared last week. I’m also announcing a new TLA+ workshop! Or more precisely, three workshops. To make it easier on people’s schedules, there are three dates you can sign up for: March 20, May 15, and June 12. And there’s no fee to move between classes if something comes up and you can’t make your session. Use the code C0MPUT3RTHINGS for 15% off..1

Anyway, on to the main thing. A couple of years ago I started work on a Logic for Programmers pamphlet, then ADDed into some other project. I started work on it again last week with the hope (the hope) of having an early version available by the end of winter. I’m writing the book in Sphinx but compiling it to LaTeX and then a pdf. I like using Sphinx because it’s (relatively) easy to create “directives”, or new types of content with special processing rules.

The main technical challenge so far has been making an “exercises” directive. The pamphlet will have a lot of exercises. I want solutions to automatically be placed in the back of the book, and I want exercises and solutions to hyperlink to each other.2 So I need to turn this:

.. exercise:: Implication
  :name: impl-1

  My Exercise!

  .. sol::

    My Solution!
#218
January 23, 2023
Read more

Funny Programming Languages

One of the weirdest and most wonderful things about people is that they can make a joke out of anything. For any human discipline there’s people making jokes about that discipline. In programming, that starts with memes like “how do I exit vim” (as typified in places like r/programmerhumor), or funny examples of awful code (such as from TheDailyWTF).1

Those are both highly accessible kinds of jokes. More interesting to me is the humor that heavily rely on experience or background knowledge, where the expectation is that many people won’t have the context that makes it funny. This is comedy that is produced by programmers for programmers. By relying on the shared context of software engineering, the gags aren’t recognizable as mainstream jokes. You could easily explain why this is a joke to a normie, but could you explain Enterprise Fizzbuzz? Aphyr’s Technical Interview series?

(This is something that happens in all fields, not just programming. 441 is a hilarious juggling pattern.)

The most interesting is when the medium of programming is itself used as a comedy substrate. This would be something like vanillaJS. There’s also some cases where the joke is an entire programming language. Not many, since making a PL is hard, but a few!

#217
January 19, 2023
Read more

Use the Wrong Tool for the Job

I’ve recently been real fascinated by the topic of complexity and what keeps us from keeping software simple. The wider net likes to blame “lazy programmers” and “evil managers” for this, as if any software could, with sufficient time, be made as simple as “hello world”. I’ve instead been looking at how various factors create complexity “pressure”. Code that needs to satisfy a physical constraint is more likely to be complex than code that doesn’t, simply because it has to balance more concerns.

One complexity pressure is “impedance”: when the problem you are solving isn’t well suited for the means you have to solve it. For example, if you need to write really fast software, then Python will be too slow. You can get around this by shelling out to other libraries, as scientific programmers do, or going for wide scale, as webdevs do, but these are solutions you might not need if you were using a faster language in the first place. In a sense impedance is complexity that comes from using “the wrong tool for the job.”

Saying that python is the “wrong tool” for data science is a little inflammatory. It might have impedance flaws, but it also has a lot going for it— rapid prototyping, a huge community, a large ecosystem, etc. Surely those matter more than the added complexity of slowness!

More broadly, “use the right tool for the job” directly contradicts another best practice, which is choose boring technology:

#215
January 17, 2023
Read more

In Defense of Testing Mocks

Computer Things: 2021 Edition

It’s over a year late, I know, but the 2021 Newsletter collection is now available to purchase as a PDF. 70,000 words, 250 pages, 20 bucks. Unlike last year, there’s no private subscriber-only emails, so this is purely for people who want to read it on the go / give me money. I might add a postmortem review or an introduction piece if there’s enough interest.

I also wrote a bunch of automation so later editions get out a lot faster, I’ll try to get around to the 2022 edition by the end of the month.

In Defense of Testing Mocks

#214
January 9, 2023
Read more

Microfeatures I'd like to see in more languages

There are roughly three classes of language features:

  1. Features that the language is effectively designed around, such that you can’t add it after the fact. Laziness in Haskell, the borrow checker in Rust, etc.
  2. Features that heavily define how to use the language. Adding these are possible later, but would take a lot of design, engineering, and planning. I’d say pattern matching, algebraic data types, and async fall under here.
  3. Quality-of-life features that aren’t too hard to add, and don’t meaningfully change a language in its absence. Often syntactic sugar, like Python’s chained evaluators (if 2 <= x < 10).

Most PLT and language design work is focused around (1) and (2), because those are the most important, but I have a deep fondness for (3)-type features. Because they’re so minor, they’re the most likely to spread between languages, since the overhead of adding them is so small. Since I spend a lot of time in niche obscure languages, I also encounter a lot of cool QoL features that most people might not have seen before. Here’s a few of them!

Number Representations

#213
January 5, 2023
Read more

Why Modeling Finds Bugs (Without Model-Checking)

Happy new year everyone! There’s gonna be some changes to the newsletter this year, see after the article for details. But first, let’s talk formal methods.

I normally sell TLA+ based on the model checker: you give it a design and it tells you if the design satisfies the invariants. When other FM experts try to sell TLA+, they focus on the mental benefits of modeling: just the act of writing the model itself helps you find bugs. I avoid this framing because it sounds too much like snake-oil, and because it’s the same pitch that all the other fringe language evangelists use too.

But it’s also true! I find a lot of bugs when writing a spec and sometimes don’t even need to run the model checker. It’s a skill you pick up over time. I don’t want to say that without also providing some mechanisms for why it’s true. So here’s three reasons why the act of modeling itself finds bugs:

#212
January 2, 2023
Read more

Bugs that literally cost money

How much does a software bug cost a company?

Well that’s a messy question. It depends on the type of bug, the broken behavior, the type of company, etc. And it also depends on how we define “cost”. Catastrophic cases like Knight Capital and Ariane notwithstanding, most of the price of the bug is indirect: lost consumer confidence, opportunity cost of debugging, schedule impacts, etc. These are more nebulous costs and people can disagree on how much they actually are.

That got me interested in the class of bugs where the cost is direct and obvious. And the most obvious case of “obvious cost” is when the bug prevents me from spending money. In October I started recording instances of this I’ve run into. Here’s what I got so far:

  • My apartment complex has a dry cleaning service that’s controlled through an app. Every time I try to make an account, the app crashes. I do my dry cleaning somewhere else.
  • When trying to get a plane ticket, the airline rejected every credit card I tried. On inspecting the JavaScript I saw that the cc validation requests returned successfully but the console crashed on an [[Object object]]. I bought tickets from a different airline.
  • The Amazon counterfeiting situation has gotten so bad I first try to buy from manufacturer sites first. The manufacturer site asked me to make an account, but rejected my email. And my backup email. There was nothing for me to inspect, it just said the POST request returned an error. I bought the product from Amazon.
  • A friend of mine was trying to reserve a hotel, but the “submit” button wasn’t anywhere on the page. She stayed at a different hotel.
  • I tried to buy a new phone case from the Google store but the checkout screen never finished loading.
  • I can’t link my Citibank business card to my business bank account. Every time I try I just get “error”. Whenever I call in customer service gets “error”. I haven’t migrated off Citi yet but I am sorely tempted.
#211
December 21, 2022
Read more

2022 In Review

Happy end of 2022! It’s been a busy year personally and professionally, one of the most productive (on both fronts!) in a long time. Not that you’d be able to tell from the blog! I didn’t publish much this year, in fact it was the least active blog year since 2017. That’s because most of my time was taken up by a couple of really big projects and I never quite got into the work groove. But let’s cover everything I did this year.

Learn TLA+ (announcement)

This defined the first half of my year. I first wrote learntla at the end of 2016 and it was sorely in need of a rewrite. It was so important to get done that I put my blog on hiatus in Feb1 (with one exception) and worked nonstop ‘til July. I still have a lot of work I want to do but it’s at least past the point where I can get hit by a bus and learntla can stand as a good resource.

I had two motivations behind a complete rewrite. First, I’ve gotten a lot better at teaching in the past six years and knew I could make much better content if I started over. Second, I wanted to fix up the whole tech stack, including using a different markup format, which meant that I’d have to extract all the content from its markup anyway— more motivation to just not bother. I think I’m successful in both counts. The material is much better now, and it’s a lot easier to modify stuff without introducing inconsistencies.

#210
December 19, 2022
Read more

Universal SE Topics

Last Monday I ran a TLA+ workshop and last Friday I gave a talk on the Crossover Project. And that was the last thing I had scheduled for the year. I’m done. There will be one more newsletter with the annual end-of-year project wrapup (1 2) next week and then I’ll be off until 2023. I’m feeling equal parts relief and trepidation. When you’re a consultant an open schedule means you’re not getting paid. Still, I’m looking forward to two straight weeks of chocolatiering and hasslin’ friends.

But you’re not here for challenging self-introspection, you’re here for computer things! And the computer thing on my mind right now was inspired by last week’s crossover talk. When discussing what software and trad engineering can learn from each other, I often bring up the Snap-Fit Handbook. You know that little panel that holds the batteries in a remote or mouse? That’s a “snap fit”. The handbook is an entire manual dedicated to snap fit considerations. And there’s a lot of considerations! How to mold them, how strong to make them, all sorts of stuff.

A page from the handbook

My point is: what’s the software engineering equivalent of the snap-fit handbook? That’d be resources focused on a very widespread, but very specific, topic in SE applications. I suspect that there’s an insufficient focus on producing these resources. By that, I mean

#209
December 13, 2022
Read more

Making ChatGPT Useful

Sorry this newsletter is late! Monday I ran a 35 person online TLA+ workshop. Overall I think it went pretty well! There’s a bunch of bugs to iron out and a bunch of improvements I want to make, then I’ll run it again in maybe February. I’ve got one more talk to do this Friday and then I’m done with my contracting work for the year!

Now I don’t know about you, but everybody I know in tech is talking about the new “GPT-3.5” (https://chat.openai.com/). Give it a prompt and it generates text that matches that prompt. And if that prompt is a request for code, the code it generates can be surprisingly accurate. So far it’s solved several days of advent of code, passed the 2022 AP CS A test, and mimicked a virtual machine. It can even take a code snippet and inject a bug, and then explain what the bug is!

(Obviously I’m simplifying a lot and it’s not “doing these things”, these are handpicked examples, it stumbles in other places, etc etc etc).

So first some initial thoughts:

#208
December 7, 2022
Read more

I am disappointed by dynamic typing

Here’s weird thing about me: I’m pro-dynamic types. This is weird because I’m also pro-formal methods, in fact teach formal methods as a career, which seems completely antithetical. So on one hand I teach people how to do static analysis, on the other I use languages which make static analysis impossible.

My fondness for dynamic types is ultimately more social than technical:

  1. Python helped me escape academia and my first two jobs were in Ruby.
  2. I encounter a lot more smug static weenies than smug dynamic weenies, so I defend dynamic typing out of spite. There have been a few cases where I was surrounded by SDWs and I immediately flipped around to being pro-static.
  3. The industry is moving towards static typing and I like being weird and contrarian.

But there’s also a technical reason: I think dynamic typing is itself really neat and potentially opens up of really powerful tooling. It’s just … I’m not actually seeing that tooling actually exist, which makes me question if it’s possible in the first place.

#207
November 29, 2022
Read more

Let's Prove Leftpad; Content Aggregation

Hi everyone!

First of all, I published a new blog post. It’s an explanation of my Let’s Prove Leftpad project. Give it a read!

Now because I can’t in good conscience send a one-line newsletter, here’s a bit more motivation for the post and the project. The post is entirely inspired by this tweet:

#206
November 16, 2022
Read more

Why do we call it "boilerplate code?"

Now that Twitter is on a downward spiral I’m rewriting my favorite tweetstorms in a more permanent medium, so here’s the first: why do we have the term “boilerplate code”? It comes from the peculiar interplay of two industrial revolution technologies: steam engines and hot metal typesetting.

So let’s start with the steam engine. Steam engines run on steam, produced by a water boiler. The hotter you can get the steam, the more efficiently you can extract energy from it. Now according to the ideal gas law, if you increase the temperature of a gas while holding the volume constant, you’ll also increase the pressure. If the boiler can’t handle the pressure, it explodes. This is a bad thing.

An exploding water boiler

A lot goes into boiler engineering, but the relevant thing here is that boilers need to be uniformly cylindrical. If the boiler has sharp angles, or if the boiler wall is thinner in one place, then Bad Thing is more likely. Fortunately, there was a well-established method of producing large, flat sheets of metal with uniform thickness: rolling. Boilermakers started buying vast quantities of cast iron from rolling mills, so large rolled sheets came to be called “boiler plate”.

#205
November 14, 2022
Read more

Notes on Every Strangeloop 2022 Talk I Attended

I originally made this newsletter to tell people when I updated my blog. Then it grew into the abomination it is today.

Anyway, I just updated my blog. Here’s a 2000 word writeup on all the strangeloop talks I watched in September.


Notes on Every Strangeloop 2022 Talk I Attended • Hillel Wayne

This is my writeup of all the talks I saw at Strangeloop, written on the train ride back, while the talks were still fresh in my mind. Now that all the talks are online I can share it! Next year’s the last Strange Loop. You can buy tickets here. Opening Keynote: How expert programmers think about errors Topic: What we empirically know about how experts write code. Ultimately found a lot of it pretty conventional.

#204
November 2, 2022
Read more

Software Moves

No newsletter next week

I’m speaking at JAX.

Software Moves

I’m completely braindoggled from the conference prep and forgot how to write newsletters. After three or four attempts of writing about something, getting 300 words in, and hitting a wall, I’m giving up and phoning this one in. Have a brain dump.

#203
November 1, 2022
Read more

Software Isomorphisms

Last week Gabriella Gonzalez wrote What does “isomorphic” mean (in Haskell), which covers isomorphism from a first-principles perspective. At the very end she says

In my experience, the more you train your ability to reason formally about isomorphisms the more you broaden your ability to recognize disparate things as equivalent and draw interesting connections between them. [\n] For example, fluency with many common isomorphisms is a useful skill for API design because often there might be a way to take an API which is not very ergonomic and refactor it into an equivalent (isomorphic) API which is more ergonomic to use.

I want to expand a bit more on how they’re useful. I’ll be gleefully butchering rigor here, so if you want a more rigorous treatment I’ll rerecommend Gabriella’s post.

Isomorphisms

#202
October 25, 2022
Read more

Teaching Accidental and Essential Complexity

So you may know that I’m teaching a 1-day TLA+ workshop in December (just 10 slots left!) This is unlike my normal workshops because it’s only 1 day and for 35 people instead of 3.1 The two workshops share almost no content between them. To understand why, I need to go into a bit of teaching theory.

For the 700 or so new readers, TLA+ is a form of formal specification language. You can make a design for a system and then directly test the design itself for bugs. It’s an extraordinarily powerful technique but also notoriously difficult to learn. There are two reasons why. First, TLA+ involves a lot of skills and concepts that people aren’t familiar with, like formal logic and model checking. These concepts are all tightly interrelated and you need to understand them all to make TLA+ useful. This is essential difficulty: there’s no easy way to simplify the language while maintaining its power. Second, the tooling is very unforgiving, and there’s lots of unclear error messages and footguns. This is accidental difficulty: there’s only a couple people working on the main tooling and they haven’t had the time or resources to make things better.

I can tailor my workshops to focus on one of these two. To reduce the conceptual difficulty, I can “scaffold” the teaching, or introduce each topic in isolation and gradually build up the whole language. In TLA+, this is done by using a special language, called PlusCal, that compiles to TLA+. PlusCal is much easier for people to grok:

\* TLA+

\E s \in Servers:
  /\ status' = [status EXCEPT ![s] = TRUE]

\* PlusCal
with s \in Servers do
  status[s] := TRUE;
end with;
#201
October 18, 2022
Read more

What a Wedding Taught Me About Software Engineering

Sorry this newsletter is late! I was at a wedding on Monday and forgot to say “no newsletter this week”. As penance for my crimes, I’m going to turn my personal life into content.

One of the nice things about being in software is I don’t have to dress up much. I still try to look good at conferences and events, but it’s more “shirt jacket over white tee” fashion, not “suit” look-good. This was the second time this year I had to wear a tie and the experience made me wonder: why are fancy clothes so damn uncomfortable?

Lots of reasons, probably, but there’s one abstract reason that applies to a wider range of topics than just fashion. “Fanciness” and “Comfort” are both optimization targets. If you don’t prioritize comfort, your clothes will be more uncomfortable. Now imagine we take some piece of clothing and measure its comfort and fanciness, then look at all variants of that piece. Some of those variants will be fancier and less comfortable. If your only optimization target is fanciness, you should pick a variant to maximize the quality you care about. If you iteratively repeat the process, you’ll eventually end with very fancy and very uncomfortable clothing.1

In other words, fancy clothing is uncomfortable because if it wasn’t uncomfortable, we could make it more uncomfortable to eke out more fanciness.

#200
October 12, 2022
Read more

I am a SQL Injection Attack

Recently I’ve been very interested in “meatspace models”, where you explain a CS concept in terms of the real world. It’s the monad is a burrito gag except you actually pick something where the analogy is helpful. This seems especially effective with concurrency topics. I first saw this in terms of race conditions, my own first attempt was two-phase commit, and the Strangeloop TLA+ workshop was framed in terms of musical chairs.

In “Commonsense Computing”, Lewandowski et al argue that students learn concurrency faster when its presented as “real-life” problems, not abstract algorithms. I think about this a lot. Here’s my attempt at translating the two phase commit protocol:https://t.co/mwdbwT5NSj

— Hillel (@hillelogram) August 25, 2022

Are there meatspace models for things besides concurrency? Turns out I’ve already used a meatspace model to explain SQL injection attacks to layfriends (say if it’s on the news). At a very high level, an injection attack is the conflation of syntax and data, which isn’t a difference most people have encountered before. So here’s how I explain it.1

#198
October 4, 2022
Read more

Snippet Praxis

Hi everyone, I’m back from Strangeloop! As always it was a fantastic conference. I’ve already done a 2000-word writeup on it but I’m waiting for all the videos to be uploaded first.

(As part of SL I taught 7-hour, 30 person TLA+ workshop. Normally my workshops are 3 days and 4 people, so this was a big experiment for me, and it worked pretty well! Students said they really liked it; I’m aiming to run a web version of it in early December. Stay tuned!)

On the train ride back, I wrote a couple of tools I’ve wanted a while. The first is a markdown link adder. I often need to put links in my documents and manually going over to every [empty link]() is really annoying. So this opens a GUI with all the links in place, and then I can add them all in once place. Then it puts all the links in the appropriate places.

image_18.png

#197
September 27, 2022
Read more

New Article, "The Five Minute Feedback Fix"

Hi everyone!

I know I said there was no newsletter this week, but that doesn’t mean no new content! Last year I was approached by The ReadMe Project, GitHub’s new online zine, to write about formal methods. I decided to focus on decision tables, since they have a great strength-weight ratio and you can learn everything about them in only five minutes.

Hence, The five-minute feedback fix. You can now read it here:



The five-minute feedback fix · GitHub

Got 5 minutes? @hillelogram has a formal feedback method you can learn to deliver high-quality software that’s cheaper and faster:

#196
September 19, 2022
Read more

On the benefits of humanities in software engineering

No Newsletter Next Week

I’m frantically working on next Wednesday’s TLA+ Workshop. I started making it last month and it’s occupied all my time since.

Originally this newsletter was going to be about the essential vs accidental complexity in teaching a topic, but a couple of things intervened to change that. I’ll probably save that one for after Strange Loop.1

On the benefits of humanities in software engineering

#195
September 14, 2022
Read more

Data Invariants

New Post: Safety and Liveness Properties

Just a few useful ideas from the formal methods world. Read it here!

Data Invariants

Let’s stay on theme this week with a quick overview of data invariants. These are statements about the data that determine correctness: an invariant can only be broken if there’s a bug in your program.

#194
September 7, 2022
Read more

Giving Names to Things

First of all, new post: Software Mimicry! It’s about a thing I’ve seen in a lot of software tools and products that I was trying to capture in a single term, so I could share it and explore what it actually means for things. It’s also a rewrite and expansion of one of my very first newsletters: on emulation.

I find myself fascinating with terms. Mimicry is something I see in a lot of different places, and I’ve seen people identify particular instances of it, but not connect the dots and discuss the abstract concept of mimicry. Is “articulating” mimicry into a distinct term useful? I think so! By having a term for something you can discuss it as a thing instead of being forced to gesture vaguely at it through examples.

Take, for example, “side effect”. A side effect is when a function changes the outside world in addition to computing a value. Changing a global value is a side effect. Allocating or freeing memory is a side effect. Updating the cache, advancing the RNG, and making a POST request are all side effects. Side effects are generally something that increases software complexity, so are often implicitly avoided, or explicitly embedded in software constructs like monads and algebraic effects.

That all makes sense, right? But that only makes sense because I have the term “side effect”. Without it, can I really convincingly explain that “generating random numbers” shares a fundamental essence with deleting a reference or logging to a file? We can vaguely wave at it and maybe point out things like idempotetncy. But it’d be a major uphill battle because people don’t conceptually relate HTTP requests to playing a sound. Articulating the concept of “side effects”, then, creates a new category which all these belong to, making us see them as similar. Once we see them as similar, we can elaborate those similarities / find other examples / find synonyms / all the things we do with new knowledge.

#193
August 30, 2022
Read more

Why do arrays start at 0?

I was at my wits end for this newsletter after my first two ideas hit research barriers. Then someone linked me this story about why arrays start at 0 and bam I had my topic. Specifically, arguing that said link is wrong and does not, in fact, fully explain why arrays start at 0.

You should read it for full context, but the gist of Hoye’s argument is this:

  1. Back in the old days, all languages were either 1-indexed or “arbitrarily-indexed”, meaning you could pick an “index range” like 0:10 or -1:25.
  2. Dr. Martin Richards was tasked to write a language for the IBM 7094. That computer had a special job deck to calculate yacht handicaps for the president of IBM, which took priority over all other jobs.
  3. To keep people from having their jobs cut short by yacht-racing, Richards designed the language to compile as fast as possible. One optimization was setting arrays to start at 0.
  4. The language, BCPL, went on to inspire B, which went on to inspire C. C was so popular it made everybody switch from 1-indexing to 0-indexing.

I have a lot of problems with this article, like the author quoting Richards and then immediately turning around and claiming he said the opposite, as well as mocking anybody who thinks that 0-indexing could possibly be good:

#192
August 23, 2022
Read more

I have complicated feelings about TDD

There were a couple of threads this week on why Test Driven Development, or TDD, isn’t more widely used by programmers.

I spoze the historic and ongoing inability/unwillingness of the software trade to grasp and adopt test-driven development (TDD) is one of the most frustrating & demoralizing events of my forty-two years as a professional geek.

— GeePaw Hill (@GeePawHill) August 9, 2022

That thread (it’s a good one) argues that the problem was an organization failure by TDD proponents, pushing too hard and letting memetic decay transmute “TDD” into “tests r gud”. I have a different explanation: TDD isn’t as valuable as its strongest proponents believe. Most of them are basing TDD’s value on their experience, so I’ll base it on mine.

#191
August 16, 2022
Read more

Excel is Pretty Dang Cool

Surprise newsletter! I said there was none this week due to vacation, but that ended Monday and I’m forever cursed by a witch, so

People might remember this quote from last month’s newsletter:

Interactive computation is a common enough activity for me that I’d put a lot of time into learning something like this. …Maybe I should just get real good with Excel.

Did I? Did I! Here’s some of the fun stuff I discovered:

#190
August 12, 2022
Read more

Search less, browse more

New Blog Post: Crimes with Python Pattern Matching

First technical blog post since 2021 and it’s all about how to break the Python type system over your knee. I have a lot of fun figuring this one out. Read it here!

No newsletter next week

Vacation weekend!

#189
August 2, 2022
Read more

Information camouflage

side note do you know how hard it is to google the tree evolutionary tree? the internet pipes get all gummed up

— 🎵 Tim Blais 👨🏻‍🔬 (@acapellascience) July 23, 2022

I do a lot of research for my job and over time built a vocabulary for researching concepts. This one comes up a lot so I figured I’d get it written down.

Information camouflage is when piece of information A has a name similar to another, very different, much more popular piece B. This makes searching for A difficult because you always get results for B instead.

#188
July 26, 2022
Read more

On Metafiles

A metafile is a file that represents multiple possible files. The most common type of metafile in use is the template file:

<p>I passed in {number}</p>

Templates are usually filled out at runtime, so it “looks like” one file, but you can take a script that takes the template and produces a bunch of output files. That’s what makes it a metafile.

I first got interested in metafiles while working on learntla. I wanted to present specifications as a sequence of iterations but keep them all in sync, so that changing a name in the first one automatically propagated through the rest. My solution was put them all in a single XML file:

#187
July 19, 2022
Read more

Six Programming Languages I'd Like to See

I got 1,000 words into “what, exactly, is software complexity” before remembering that this is supposed to be less effort than the blog. So instead I’m going to list some ideas I had for programming languages. I think all of these are technically doable, so it’s more a “nobody else wants this badly enough” thing. Also caveat these might all already exist; I just know the languages I know!

A serious take on a contract-based language

Contracts are predicates on the code, usually attached to functions and treated like assertion statements. The canonical example is withdrawing from a bank account:

method withdraw(amnt: int)
requires amnt <= self.balance
ensures self.balance >= 0
{
  self.balance -= amnt;
}
#186
July 13, 2022
Read more

Beyond the Four-Document Model

Been a while! Now that learntla is out, I can get back to life. Exercising, eating, sleeping, and most importantly, updating this newsletter. Word of warning, though: all the stuff I want to talk about is inspired by things I did with learntla. Like, for example, limitations of the four-document model.

The four document model, aka The Grand Unified Theory of Documentation, states that there are four types of documentation:

  • Tutorials: A step-by-step lesson that gets a beginner doing something with your project. In TLA+, this would be setting up and running a simple spec.
  • How-to guides: A guide on how to solve a specific problem. In TLA+, this could be how to model a state machine.
  • Explanations: deep dives into a particular topic. For example, auxiliary variables.
  • Reference: clear, dry technical descriptions. Toolbox model configuration options.

If you want to learn more about the 4doc model, divio has a good page. I want to be clear that overall, I think the 4doc model is a great starting point that helped a lot of people write better documentation. Also, as learntla is teaching a language and not documenting a library, the 4doc model doesn’t map well.

#185
July 6, 2022
Read more

The new learntla is now online!

tl;dr: online TLA+ manual/advanced techniques/examples here.

TLA+ is a tool for testing abstract software designs. I first stumbled on it in 2016 and found it so useful I wrote a free online guide to help others learn it. Then I decided the guide wasn’t good enough and wrote Practical TLA+ in 2018. Then I decided the book wasn’t good enough and needed a second edition.

Just kidding! Book’s never getting a second edition, because it costs $26 and no matter how good I make it, it’s never going to have the reach of a free resource. So instead I took all the teaching lessons I learned over the past four years and turned them on rewriting the online guide from scratch.

Six months and 40,000 words later, the Learn TLA+ rewrite is now online! While I’m nowhere near done writing, the guide’s now complete enough to be a useful TLA+ resource, for beginners and experts alike.

#184
July 1, 2022
Read more

Updates on Learntla

Fifteen days left

Hi everyone! Been a couple of weeks since the last email. As mentioned, I took June off from weekly newsletter to work on the new version of the learntla web site. I have a requirement to have a version I’m happy uploading by the end of June, or else I owe a random one of you a thousand dollars. That’s just two weeks away, and I figure I should update everyone on how that’s coming so far.

Currently I’ve written about 28,000 words. That’s about twice the current version of learntla and about half the length of Practical TLA+. With the exception of the sections on Action Properties and pure TLA+, all of the core learning material is in “polish mode”. Action Props are about 60% done, pure TLA+ is maybe 30% done. I also need to take a lot more screenshots and make a lot more diagrams— dreary work, but easy enough to do.

In addition to core material, I’m writing a bunch of special topic essays. Here’s the stuff I want in the MVP:

#183
June 16, 2022
Read more

Reader Mailbag: June Crunch Edition

Hey everyone!

We gotta start with some housekeeping. First, you might have noticed that the newsletter email’s changed. Several people informed me that Gmail was sending the newsletter to spam and Buttondown suggested I send from a custom domain to fix it. There shouldn’t be any changes needed on your end, though if you have message filters you might need to update them. And you can still email support@buttondown.email if you see any issues.

Second, registration for the TLA+ conference is now available! Register here. I’m teaching this year’s TLA+ preconf workshop with an attendee cap of thirty. For the record, my usual cap is four. God help me.

Finally, this is the last weekly newsletter until July. I’m teaching a TLA+ workshop next week, and after that I’ve got to go all in on the learntla June deadline. I might do an off-the-cuff newsletter if something’s really stuck in my head but don’t expect anything. By July we’ll be back to at-least weekly as per usual.

#182
May 26, 2022
Read more

Codebases as communication

Conventionally we communicate programming ideas with talks, papers, and blog posts. But we can also communicate ideas with entire codebases. If someone finds a security exploit, she’ll sometimes publish a proof of concept to prove the exploit isn’t just theoretical.

Now let’s say the exploit PoC comes with a ton of command-line flags: verbose mode, configuration options, output formats, the whole works. Now the writer is communicating something subtly different: not just that the exploit exists, but she wants you to experiment with it. She’s making it as easy as possible for you to play with the exploit yourself and come up with variations and consequences.

This makes codebases like any other kind of communication medium. There are different styles you can use to say subtly different things. There are also different “genres”, or overt things you use the codebase to say. Some examples:

Of Genres

#181
May 16, 2022
Read more
 
Older archives
Brought to you by Buttondown, the easiest way to start and grow your newsletter.