There’s still one slot available for the TLA+ workshop! July 27-29, remote, going to be super intense. Sign up here.
I wanted to talk today about software augments but the first draft was way too raw for a Monday newsletter. Instead I’m phoning it in by sharing some of the random ideas I had last week that I will probably never flesh out. I’m distracted enough as it is. Feel free to steal any of these!
“Amazon Plays Dirty” Exercise
This is presented more as an exercise than as a potentially real issue. The purpose is to help us get creative and think through the different ways a system can be subverted.
Premise: Amazon sells a lot of third-party widgets. Amazon also makes its own widgets that are often competing for the same customers as the third parties. Amazon also controls the site they’re both featured on. These things do not mix.
I’m reminded of an essay on cryptocurrency miners:
I feel that a lot of people under-estimate Bitmain or assume that because they play dirty they wouldn’t be able to keep up without playing dirty. But that’s not true. They play dirty because it’s yet another place they can optimize their business, and because they know they can get away with it. Everything else they do is highly optimized as well.
So Amazon wants to 1) draw as many sales away from the third parties as possible and 2) get away with it. They can do all sorts of things, because they control the platform, but we'll assume they want plausible deniability. What dirty tricks can they pull, and how could outsiders detect them?
Here’s one: folk wisdom tells us that Amazon loses a ton of money for every hundred milliseconds of latency on page load. I’m not sure I fully believe this, since Amazon dumps like a gig of trackers on me every time I visit, but still. If this relation holds up, Amazon could inject a small amount of random latency into every competitor’s page load. At an individual level it would probably be unnoticeable, but in aggregate it would make people less likely to buy third-party widgets. And the only way to get evidence this is happening is to make a ton of calls and graph the page load times, which should be obvious enough that Amazon could find out you’re doing it and lay low.
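Here’s a toy sketch of the detection side. The numbers and distributions are entirely made up (a Gaussian around 300ms with a hypothetical 25ms injected delay); in practice you’d be measuring real page loads, but the point is that a delay invisible in any single sample shows up clearly in aggregate:

```python
# Simulate baseline page loads and loads with a small injected delay,
# then compare the sample means. All parameters here are invented for
# illustration -- real detection would use measured load times.
import random
import statistics

def simulate_loads(n, base_ms=300, jitter_ms=80, injected_ms=0, rng=None):
    rng = rng or random.Random(0)
    return [rng.gauss(base_ms + injected_ms, jitter_ms) for _ in range(n)]

def mean_gap(sample_a, sample_b):
    # Difference in average load time between the two groups.
    return statistics.mean(sample_b) - statistics.mean(sample_a)

rng = random.Random(42)
amazon = simulate_loads(5000, rng=rng)
third_party = simulate_loads(5000, injected_ms=25, rng=rng)
# Any single third-party load looks normal (the jitter dwarfs the delay),
# but across thousands of calls the injected 25ms emerges from the noise.
print(f"observed gap: {mean_gap(amazon, third_party):.1f} ms")
```

The catch, as above, is that making five thousand calls to graph load times is exactly the kind of traffic Amazon could notice.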
Of course it would be much simpler to just prioritize Amazon widgets in search results, but where’s the fun in that?
All features are malicious
A while back I read a quote like “all systems of communication will eventually be used to harass someone.” I can’t remember who said it, but they deserve an eponym for it. I think the example they used was harassment through the Google Docs “shared with” feature: the harasser would share documents with their target, who had no way of stopping it.
This seems like a good exercise whenever you’re making a new feature: can this be used to harass someone? If so, what kinds of controls need to be added so that the victim can stop it? Obviously we can’t cut harassment out entirely, since the attacker has an overwhelming advantage here, but we can at least mitigate it.
(For the purposes of these exercises, we’ll restrict ourselves to harassment where the attacker forces the victim to process information. This excludes things like doxxing, which is horrifying but outside the scope we’re thinking about right now.)
Visualizing Coordination Costs
Concurrent algorithms can often be faster than sequential ones. But not always: you have to spend time on coordination and information transfer, and sometimes that’s enough overhead to make concurrency not worth it. To take a canonical example of a distributed system, MapReduce has to take the dataset, break it down and map it across workers, retrieve the results, and then reduce them. For small datasets, all those extra steps could make it slower than doing everything on a single machine.
I’m not talking about costs related to mechanical sympathy (like being able to fit things in RAM) or costs from having to use parallelizable algorithms. I’m just fascinated by the actual cost of the extra coordination code. I’ve never seen it visualized. I want to see an algorithm’s runtime broken down into time spent coordinating across agents versus time spent actually computing the answer. It seems like one of those things where a visual breakdown would give you a lot more intuition into how it works.
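A rough sketch of what collecting that breakdown could look like, for a tiny map-reduce-style sum of squares. The split is my own assumption: “compute” is time the workers spend inside the map function, “coordination” is everything else (chunking, dispatch, collection, reduce). I’m using threads here just to keep the sketch self-contained; a real measurement would use processes or separate machines:

```python
# Break a map-reduce-style computation's wall time into "compute"
# (time inside the map function, measured by the workers themselves)
# and "coordination" (chunking, dispatch, collection, reduce).
import time
from concurrent.futures import ThreadPoolExecutor

def mapper(chunk):
    start = time.perf_counter()
    result = sum(x * x for x in chunk)       # the "real" work
    return result, time.perf_counter() - start

def timed_mapreduce(data, n_workers=4):
    wall_start = time.perf_counter()
    chunk_size = max(1, len(data) // n_workers)
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(mapper, chunks))
    answer = sum(r for r, _ in partials)     # the reduce step
    wall = time.perf_counter() - wall_start
    compute = max(t for _, t in partials)    # critical-path compute time
    return answer, wall, wall - compute      # answer, total, coordination

if __name__ == "__main__":
    answer, wall, coord = timed_mapreduce(list(range(200_000)))
    print(f"answer={answer} wall={wall*1000:.1f}ms coordination={coord*1000:.1f}ms")
```

Graph those two numbers across dataset sizes and worker counts and you’d get exactly the visualization I’m imagining: the point where the compute bar finally dwarfs the coordination bar is the point where concurrency starts paying for itself.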
PBT Event sourcing
Event sourcing is where, instead of storing the state of an app, you store the events that change the state and dynamically recompute the state from those events. It’s a common pattern in domain-driven design and large-scale development. It also seems like something that would be really amenable to property-based testing. TLA+ too, but then you have to introduce either recursive operators or functions on power sets, neither of which is beginner-friendly.
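A minimal sketch of the idea, for an invented event-sourced counter (the domain and event types are mine, not from anything real). I’m using plain `random` to stand in for a real PBT library like Hypothesis; the interesting part is the property itself: replaying the event log from scratch must reproduce the state you maintained incrementally.

```python
# Property-based-style test of an event-sourced counter: generate random
# event sequences and check that replaying the log from scratch matches
# the incrementally maintained state.
import random

def apply_event(state, event):
    kind, amount = event
    if kind == "add":
        return state + amount
    if kind == "reset":
        return 0
    raise ValueError(f"unknown event: {kind}")

def replay(events):
    # Rebuild state from nothing but the event log.
    state = 0
    for e in events:
        state = apply_event(state, e)
    return state

def random_event(rng):
    if rng.random() < 0.9:
        return ("add", rng.randint(-10, 10))
    return ("reset", 0)

def check_replay_property(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        events, live = [], 0
        for _ in range(rng.randint(0, 50)):
            e = random_event(rng)
            events.append(e)
            live = apply_event(live, e)   # state maintained event-by-event
        assert replay(events) == live     # property: replay == live state
    return True
```

With a real PBT library you’d also get shrinking for free: when a random event sequence breaks the property, the library minimizes it to the shortest failing log, which is a great debugging artifact for event-sourced systems.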
Link Aggregation vs Curation
I’m not a fan of the “link aggregator” style of newsletter because it feels like I’m phoning in content. But I also want to share cool stuff I found around the Internet. This means curating and analyzing instead of just linking and describing. Which, uh, is actually pretty high effort. Every piece of insightful content increases the potential for more insightful content analyzing the original. I should lean into that more. Here’s an example of what that looks like:
Wireless is a trap: An argument for why wireless peripherals are slower and less reliable than wired ones. What was particularly interesting to me was his discussion of polling and how it interferes with wifi networks. I joke that the two hardest problems in CS are videocalls and room reservations. “All videoconf software is terrible” is just the base state, and it sounds like a big chunk of that is just… using wifi? Sounds like there’s some evidence for it. I’m moving soon and am thinking of getting a desktop computer; maybe it’s time to go back to direct ethernet cables.
Something like that.