Hi everyone. To begin this newsletter I just wanted to say that I’m horrified by the ongoing invasion of Ukraine, and am wishing for strength to the brave Ukrainian people. For me, this war serves as a stark reminder of the privilege that is peace and physical safety. If you’d like to donate to humanitarian causes, I’ve seen Razom recommended as a reputable organization.
And now, on to less important things.
For the past six months or so, I’ve been having a lot of fun collaborating with Nicholas Schiefer, Johannes Schickling, and my PhD advisor Daniel Jackson on a new project called Riffle. We’re trying to simplify app development for both experts and novices, by applying ideas from databases to managing UI state in local-first apps. We published an essay with some findings from our initial explorations:
If you have a chance to read it, I’d appreciate hearing your feedback as a reply to this email, even if it’s just a quick one sentence reaction or question.
We’re still early in this line of work, and much remains to be done. So what comes next? To answer that, I want to zoom out a bit and reflect on the challenges of scoping research work.
I spent the first five years of my career as an early engineer at an ed tech startup. We were VC-backed, and followed the typically intense trajectory of that world. I remember often laughing at how absurd our growth targets looked, but sometimes realizing later that, to my surprise, we had actually hit them.
In the startup context, resources were scarce, and scoping was everything. Our goal was always to find some clever way to do 10x less work to deliver the same value to the customer. If the scope didn’t feel painfully small, it usually meant you were doing the wrong thing.
From a distance, “research” might seem like the exact opposite situation. No more pesky urgent deadlines and demanding customers! Over the past few years of being a grad student, I’ve found that there’s indeed some truth to this. It’s a luxury to have space to dig deeply into things.
However, there’s a flip side to this story. Scoping in startups may be brutally hard, but at least it’s clear what game you’re playing. Your goal is to ship value fast, grow fast, and eventually make money. It’s not always easy to measure these goals or to achieve them, but you know the basic parameters of your incentive loop, and you can get pretty good at the game if you play it a lot. I also found it grounding to know that this incentive loop was connected to reality—if schools kept paying us year after year, it meant we were really helping them out.
Meanwhile, in research, there are fewer constraints. You can say things like “take a decade or two to reinvent the entire tech stack” with a straight face—in fact, many people would say that’s a better use of research time than boring incremental improvements!
In my experience, this flexibility at the meta-level makes scoping in research harder in many ways than scoping at a startup. There’s danger on both sides of the tightrope. On the one hand, a mundane incremental improvement isn’t an interesting outcome. On the other hand, being too loose with scope might result in some ideas that seem cool, but never really come to fruition, or worse, miss the real essence of the problems people are actually facing. It’s also hard to design tight feedback loops, so it might take years to realize you’ve already fallen off the tightrope.
(By the way, a cynic might say that researchers are actually tightly constrained by the realities of grants and publications. This is probably true for some people, but most of the researchers I know are quite self-aware about this dynamic and still spend a lot of time grappling with questions of scope, separately from funding and publication incentives.)
How to scope research is obviously a deep question with no general answer. Some relevant ideas to consider might be how Xerox PARC was set up to “deliver nothing of business value in five years”, or Richard Feynman’s story about approaching research as an act of joyful play rather than being too goal-directed, or how my collaborations with Ink & Switch so far have been focused three-month summer sprints with a tightly scoped goal.
One lens we’re trying out with Riffle is to simultaneously take a “bottom-up” and “top-down” approach.
By bottom-up, I mean grounding the work in the existing realities of today: solving problems people have, re-using tools they already use. In our case, this means using SQLite + React to help an experienced developer build a full-featured music app.
The obvious benefit of bottom-up is convenience: it’s faster to just use SQLite than to try to build a new kind of database from scratch.
But I actually think the more important benefit of the bottom-up approach is getting a feel for the full complexity of the problem domain in a serious context of use. Building a music app is a different beast than building a toy app like TodoMVC, and it’s useful to encounter gnarly problems early. To take one simple example, since very early in the project we’ve been working with music collections containing hundreds of thousands of tracks, which ensures we won’t accidentally build a system that only works with small amounts of data. There are other more subtle feedback loops around developer experience—for example, we found that SQL seems reasonable in a toy app, but it gets pretty unwieldy for larger apps.
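To make the "SQL gets unwieldy" point concrete, here's a minimal sketch in Python using the stdlib `sqlite3` module. The schema (tracks, albums, playlist entries) is hypothetical, invented for illustration; it's not the actual schema from our music app.

```python
import sqlite3

# Hypothetical music-library schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE albums (id INTEGER PRIMARY KEY, title TEXT, artist TEXT);
    CREATE TABLE tracks (
        id INTEGER PRIMARY KEY,
        title TEXT,
        album_id INTEGER REFERENCES albums(id),
        duration_secs INTEGER
    );
    CREATE TABLE playlist_entries (
        playlist_id INTEGER,
        track_id INTEGER REFERENCES tracks(id),
        position INTEGER
    );
""")
conn.executemany("INSERT INTO albums VALUES (?, ?, ?)", [
    (1, "Blue", "Joni Mitchell"),
    (2, "Kind of Blue", "Miles Davis"),
])
conn.executemany("INSERT INTO tracks VALUES (?, ?, ?, ?)", [
    (1, "All I Want", 1, 214),
    (2, "So What", 2, 562),
    (3, "Blue in Green", 2, 337),
])
conn.executemany("INSERT INTO playlist_entries VALUES (?, ?, ?)", [
    (1, 2, 0),
    (1, 1, 1),
])

# Even a modest UI view -- "tracks in playlist 1, with album and artist,
# in playlist order" -- already takes a three-way join. Real app screens
# add search, filtering, and aggregation on top, and the SQL strings
# grow accordingly.
rows = conn.execute("""
    SELECT t.title, a.title AS album, a.artist
    FROM playlist_entries pe
    JOIN tracks t ON t.id = pe.track_id
    JOIN albums a ON a.id = t.album_id
    WHERE pe.playlist_id = ?
    ORDER BY pe.position
""", (1,)).fetchall()
print(rows)
```

This is fine at toy scale; the unwieldiness shows up when dozens of screens each need their own variant of queries like this.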
People often cite Alan Kay’s advice, “Simple things should be simple, complex things should be possible,” but they leave out the rest of the quote. Alan’s full point was that you should start by designing for the hardest tasks first, and only then reach downwards for simplicity (quote from “The Media Lab” by Stewart Brand):
It’s too tempting to build a simple tool with a low ceiling. We’re trying to take that to heart in this project by using our tool to make a complex thing early on.
On its own, the bottom-up approach carries lots of risks. Basing the project in the current culture of computing can constrain the imagination, making it harder to see how people might think differently in the future. And using existing tools can backfire by forcing you to grapple with their full complexity—for example, integrating with React has been tricky because it was designed around a different set of assumptions, about accessing data asynchronously over a network.
Meanwhile, the top-down approach is about hitting the reset button. As we discuss in the last section of the essay, we’re designing an integrated system which combines CRDT logic, data reshaping, and UI tree rendering all into a single end-to-end relational query, incrementally maintained. Forget about existing CRDTs and databases and React, replace it all!
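To give a flavor of "UI as derived relational data," here's a small sketch, again in Python with `sqlite3`. This is not the actual Riffle design: a plain SQLite view stands in for a genuinely incremental end-to-end query, and the schema and names are invented for illustration. The point it shows is the shape of the idea: both app data and UI state live in relations, and the rendered UI is just another query over them.

```python
import sqlite3

# Illustrative sketch only -- a plain SQLite view standing in for an
# incrementally-maintained end-to-end query. Schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tracks (id INTEGER PRIMARY KEY, title TEXT, play_count INTEGER);
    CREATE TABLE ui_state (key TEXT PRIMARY KEY, value TEXT);

    -- The "UI" is just another relation: one row per rendered list item,
    -- joining app data (tracks) with UI state (the search filter).
    CREATE VIEW track_list_items AS
    SELECT t.id, t.title || ' (' || t.play_count || ' plays)' AS label
    FROM tracks t, ui_state s
    WHERE s.key = 'search' AND t.title LIKE '%' || s.value || '%'
    ORDER BY t.play_count DESC;
""")
conn.executemany("INSERT INTO tracks VALUES (?, ?, ?)", [
    (1, "So What", 12),
    (2, "All Blues", 7),
    (3, "All I Want", 3),
])
conn.execute("INSERT INTO ui_state VALUES ('search', 'All')")

# A user action is just a write to ui_state; the derived "UI" relation
# reflects it on the next read, with no imperative state management.
labels = conn.execute("SELECT label FROM track_list_items").fetchall()
print(labels)
```

In the real design, the query engine would maintain this view incrementally as base tables change, rather than recomputing it on each read; that, plus running CRDT logic and UI tree construction through the same query, is what the top-down track is exploring.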
This isn’t a completely new idea (e.g., the Eve project had a lot in common with this approach), but I find it to be beautifully elegant. I’m most excited about the possibilities for provenance tracking: what happens if the system is always aware of the connections tying together past user actions all the way through to individual UI elements? How can that help provide better user experiences?
The benefit of the top-down approach comes from being untethered. If a direction feels intuitively right you can just pursue it, without worrying too much about connecting it to concrete problems yet. There’s more variance here, and looser feedback loops: the best case outcomes seem way cooler, but the risk is greater too. I think the most pernicious risk is that it’s easy to produce results that seem cool or novel with a top-down approach because of the weaker constraints, but that’s not necessarily a good proxy for long-term value to the world.
It can be hard to balance bottom-up and top-down, so one thing we’re trying is to split them as somewhat separate efforts within the project. The bottom-up track is okay with being pragmatic; a React library and deeper problem understanding would be great outcomes. The top-down track is okay with being untethered; it’s fine if it doesn’t produce anything that real people can use anytime soon. Orienting in one direction for any given week seems to make it easier to make individual scoping decisions; I’ve found similar success in the past with wearing either my designer hat or my engineer hat at separate times, to avoid getting muddled.
Of course, this might just raise new challenges, like context-switching between tracks and the difficulty of eventually trying to converge the various threads. We’ll see how it goes… let me know if you have any strategies or tricks you find useful for scoping this kind of work.
One quick update: I’ll be attending the Principles and Practice of Consistency for Distributed Data workshop at EuroSys in Rennes, France on April 5. Nicholas Schiefer and I will be presenting a paper that proposes a new synchronization model for local-first apps, which provides stronger data consistency guarantees than typical CRDTs by allowing for a git-like branching model.
Let me know if you’ll be attending the workshop and want to meet up! We’ll also be sharing the paper publicly soon, and I’ll send an update to this newsletter once it’s available.