Further below are 3 job roles including Senior roles in DS and DEng at organisations - if you’ve got a job to advertise, just reply to this newsletter
I’m on holiday, surfing down in Cornwall. For this issue I’m taking a lighthearted reflection on a couple of years of acquiring a new skill - sourdough baking - with some observations on how it applies to data science (‘natch):
I’ll be back to the usual focus on data science for the next issue.
In other news - we’re rebooting PyDataLondon and should be back at the old venue for Tuesday October 4th.
I noted a couple of issues back that my friend Douglas Squirrel was running his first in-person advisory event in London for tech leads (it was free to attend). There were 15 or so of us along to discuss an improv technique applied to business on “yes, and” rather than “no”. It was an interesting experience focused on helping get ourselves to understand the viewpoint from “the other side of the table”. One of the goals was to identify low-trust situations in a business meeting and to help move them forwards. The silly starter exercise focused on each person adding another line to a story, using “yes, and” as the join phrase. It worked pretty well (and despite involving aliens, actually made sense).
I’ve certainly been guilty of not being sympathetic in some meetings, notably when someone is asking for magic when they have no data but they still want a scientific result in a short time frame. Trying to disappoint them helpfully (another of Squirrel’s phrases) feels like a constructive step, if there aren’t many places to go with the “yes, and” technique. Being mindful about the need for empathy when people dig in on positions does feel like a useful thing to be aware of.
Squirrel will be running more events in a variety of cities, they’re focused more on tech leads in general (not just data science) and will probably turn into a useful social mixer, if you wanted to step outside of your usual events. He’s got plenty that run on Zoom too.
So, my early sourdough loaves started like this:
where, although I got “bread”, it was undercooked, or it might be burnt and poorly risen like this:
So began my bread making journey. This was early in lockdown (around 2020-05), we were pregnant and our early maternity travel plans were of course off, so I took to baking to use up old flour. Prior to this I’d never have considered baking and now - a couple of years later - it turns out I have a new hobby.
Sourdough is different from “instant yeast” recipes as it uses a naturally occurring yeast from the air which you make into a starter (and into a bubbling levain prior to making your dough). It takes longer to rise (overnight is great) and takes a bit more work, but also tastes a heck of a lot more interesting because of the time for the chemical reactions to develop.
When I started I read a lot online, but not from books. Hacker News and other geek sites had lots of links to sourdough bread making and lively discussions, and I just got started because “how hard can it be?”. It turns out - pretty hard, due to the number of variables at play. Without a good clear guide, you can get lost.
Many of the cookie-cutter recipe blogs copy each other, along with a pile of warnings (danger! chlorine will kill sourdough! weevils will be in your old flour! your old flour will be rancid - buy more here!). Learning to ignore the rubbish was a first step. It turns out brewing up a good sourdough starter using water and flour and “yeast from the air” wasn’t actually hard, it just took a week and then I had a lovely bubbling simple dough. If you’ve never tried this, I’d recommend a recipe like this (just start with plain flour if that’s all you have) and be patient. It is pretty amazing to see.
Top tip - a sourdough starter is resilient and easy to form. If you kill it, just make a new one. If you do something like drop your dish on the kitchen floor (I did this a couple of times!), just scoop up a bit of the starter, feed it, and it’ll quickly grow back. If you have too much - fry it up with garlic salt on top for a nice unrisen bread.
For my early recipes I tended to feed up the starter the day before with fresh flour and water, combine the next morning, leave for a few hours for the bulk-rise phase and then bake. This generally produced an “ok loaf”, not too flavourful, sometimes with a poor rise (so it was dense, and didn’t rise very high). It was pretty easy to arrive at this stage.
The obvious parallel is - at the start you don’t need the best solution on the planet. If you want to eat, the initial loaves are fine and you get a benchmark. Later you can iterate, but only if you know what “ok” looks like. I’ve come across lots of DS teams who push quickly for high-end (e.g. DNN) solutions without first checking if the simple stuff (linear models! simple text representations!) works, resulting in a complex and harder-to-diagnose solution.
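To make the “know what ok looks like” point concrete, here is a minimal sketch of setting benchmarks before reaching for anything fancy. The toy numbers and the helper names (`fit_line`, `mae`) are my own assumptions for illustration, not taken from any project; the idea is only that a “predict the mean” model and a straight line give you two scores that any complex model has to beat.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))

# Made-up toy data, purely illustrative.
train_x = [1, 2, 3, 4, 5, 6]
train_y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.1]
test_x = [7, 8]
test_y = [14.0, 16.1]

# Benchmark 0: always predict the training mean.
mean_pred = [mean(train_y)] * len(test_y)

# Benchmark 1: a straight line fitted on the training data.
slope, intercept = fit_line(train_x, train_y)
line_pred = [slope * x + intercept for x in test_x]

baseline_scores = {
    "mean": mae(test_y, mean_pred),
    "linear": mae(test_y, line_pred),
}
print(baseline_scores)
```

If a later, more complex model can’t clearly beat the “linear” score on the same held-out data, that is a hint the extra complexity isn’t yet paying for itself.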
Following early recipes from online forums I baked using a casserole dish. This works for a time, until the dry-baking (there’s no liquid in the dish, just your dough) causes the dish to crack. The dish also doesn’t retain much heat, so when you add your cold lump of dough the temperature drops, which affects your baking quality. Your dough really wants a consistent temperature for the 30 minutes or so of cooking. Without a lid you’d lose the moisture from the loaf to the home oven (bakeries have different ovens), which reduces the chance of a crispy crust.
Next up I switched to a Dutch Oven, which is a cast iron pot that retains its heat - and the moisture - when the dough goes in. Having pre-heated it, you try to lower in your dough without dropping it or burning your fingers. If you drop it, it’ll de-gas a bit, causing a deflation. My solution now is the inverted Dutch Oven - you put your dough into the upturned lid, then place the base on top as the “deep lid” (i.e. you have the whole unit upside down). The opening photo above is the result from using the Dutch Oven, a banneton basket without liner (lined with oat flakes and rice flour) and a couple of hundred loaves.
Getting the sourdough starter to work well is yet another critical step. It turns out you can geek out quite a lot on this. I’ve settled on feeding it a couple of times a week and keeping it in the fridge. I know from when our son was born that if you leave it in the fridge (uncovered, in a bowl) for 3 months and forget about it you can quickly recover it once you’re over the shock of having a newborn in the house. Starters are pretty resilient cultures.
The rice flour tip came via Twitter in response to one of my early bread photos. I was using plain flour, and it was suggested that rice flour would be just as good at stopping the dough sticking to the banneton basket without leaving loads of flour on the bread. And indeed, it was another good step forwards. Not having your bread stuck to the basket, or stuck to your hands, is another critical step to good loaf consistency (and self happiness).
What can go wrong? Well - you can be ambitious and decide to start using not 1 flour (e.g. strong white) but several, such as a strong white (for basic gluten strength) + brown + rye. Adding more flours changes the taste, that’s brilliant. It turns out that using non-white flours, notably rye, decreases the strength of the gluten in the dough and it is the gluten mesh that holds the bubbles from the yeast in. Fewer bubbles, less rise, more frustration with flatter breads. I spent half a year getting really frustrated with my failure to bake a tall sourdough using three flours.
One day I decided to return to just using white flour and suddenly - I had a great and tall loaf again. Up until this point I’d had no idea that my failure to bake a tall loaf was due to my flour combinations (nothing I’d read, and I read a lot, said as much!). I’d ended up experimenting with diastatic malt or ginger powder as raising agents, along with getting obsessive about the initial dough temperature (it can’t be a degree over 27C! oh yes, actually, it can, stop obsessing…). Have you had any joy using a raising agent like diastatic malt? There’s no DS parallel here, I’m just curious :-)
Sometimes going off on an experimental journey can lead you down a frustrating path and stepping back is the only sane step. By making things complex, I’d lost track of the source of the failure (the lack of a good rise) and I was assuming other parts of my process were at fault. If I’d stepped back to a simple but comparable solution earlier, I’d have diagnosed my real issue (notably using too much rye flour for the rise I was hoping for) much sooner.
I’ve seen just this issue in client projects where overly complex solutions evolve which are then hard to debug. Folk get attached to their positions (particularly if deep learning or complex cool processes are at play) and it is hard to argue about why things might fail. Building a simpler, but reasonably equivalent solution can often lead to some new clear thinking.
Having come this far I’m going to recommend Bread Science: The Chemistry and Craft of Making Bread. This is an amazing book by a passionate author, full of hand drawn diagrams explaining what’s going on - to understand things like the temperature gradient at a point in the loaf whilst it is in the oven, this is the book to go to. If you want to get started then Flour, Water, Salt, Yeast (Forkish) is hard to beat.
There are a couple of lessons here. You can get a long way (with sourdough or with DS) with a bit of trial and error. Be cautious about random blog posts; when you upgrade to a good book, you’ve probably taken a sensible step up. Keeping notes is critical - I use a Google Doc where I note the critical stages like quantities and times. Keeping photos (or well annotated graphs for DS!) helps give context to the old diary entries.
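For the DS side of note keeping, a rough sketch of a run diary might look like the following. This is not how I keep my own notes (that’s the Google Doc); the file name, the field names and the values are all made up for illustration. It just appends one JSON record per run to a text file so that old entries stay readable and comparable.

```python
import json
import tempfile
from datetime import date
from pathlib import Path

def log_run(path, **fields):
    """Append one dated record (one JSON object per line) to the diary file."""
    record = {"date": date.today().isoformat(), **fields}
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record

def read_runs(path):
    """Read every record back, oldest first."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Usage with hypothetical entries, written to a temporary directory
# so that running this sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    diary = Path(tmp) / "bake_diary.jsonl"
    log_run(diary, flour="strong white", hydration_pct=70,
            bulk_rise_hours=4, note="ok loaf, dense crumb")
    log_run(diary, flour="white + rye", hydration_pct=72,
            bulk_rise_hours=5, note="flatter, good flavour")
    runs = read_runs(diary)

print(len(runs), runs[-1]["note"])
```

The same shape works for experiments: swap the flour and hydration fields for model name, parameters and score, and keep a free-text note alongside.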
Sharing results almost certainly leads to useful feedback (I’m sure some of you will have further tips for me). Doing experiments is brilliant, but if you end up making really intricate schedules with complex recipes and things aren’t working - don’t be afraid to return to the basics to see if your fundamentals are good or bad. In bread baking it is easy to get sidetracked by random variation (e.g. changes in initial dough hydration caused by different natural humidity levels between summer and winter) which masks the effect of whatever intricate (and probably overkill) experiment I’d run.
The final bit of advice I’ll share came from something I read (I can’t find the reference, I think it is a baker’s quote) - “bake every loaf”. Even when my bread was doing poorly, appearing not to rise, or felt too wet to hold its structure, I’d still forge on and bake it and make notes. Completing every recipe with a tasting was the easiest way to learn. I’m pretty sure this holds true for most of our DS work too - bake every experiment to conclusion, reflect on the results, learn something, then iterate.
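One rough way to check whether random variation is swamping an experiment is to compare the size of the change against the run-to-run spread. The sketch below is a rule of thumb rather than a formal statistical test, and the loaf heights are invented numbers for illustration only; `effect_vs_noise` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev

def effect_vs_noise(control, variant):
    """Return (difference in means, standard error of that difference)."""
    diff = mean(variant) - mean(control)
    se = sqrt(stdev(control) ** 2 / len(control)
              + stdev(variant) ** 2 / len(variant))
    return diff, se

# Invented loaf heights in cm: the usual recipe vs. a tweaked one.
control = [9.0, 10.5, 8.5, 11.0, 9.5]
variant = [9.5, 11.0, 9.0, 10.0, 10.5]

diff, se = effect_vs_noise(control, variant)

# Rule of thumb (an assumption, not a formal test): treat the change as
# possibly real only if it is bigger than about two standard errors.
looks_real = abs(diff) > 2 * se
print(f"diff={diff:.2f}cm, se={se:.2f}cm, looks_real={looks_real}")
```

With these made-up numbers the tweak adds about 0.3cm on average, but the loaves vary by far more than that from bake to bake, so there’s no evidence yet that the tweak did anything. That’s the cue to either repeat the bake more times or go back to basics.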
If you’ve got tips to share on the sourdough journey, I’d happily hear them!
See recent issues of this newsletter for a dive back in time. Subscribe via the NotANumber site.
About Ian Ozsvald - author of High Performance Python (2nd edition), trainer for Higher Performance Python, Successful Data Science Projects and Software Engineering for Data Scientists, team coach and strategic advisor. I’m also on twitter, LinkedIn and GitHub.
Jobs are provided by readers, if you’re growing your team then reply to this and we can add a relevant job here. This list has 1,500+ subscribers. Your first job listing is free and it’ll go to all 1,500 subscribers 3 times over 6 weeks, subsequent posts are charged.
This is an exciting opportunity to join a diverse team of strategists, campaigners and creatives to tackle some of the world’s most pressing challenges at an impressive scale.
This role is for a software start-up, although it is part of a much larger established group, so they have solid finance behind them. You would be working on iGaming/online gambling products. As well as working on the product itself you would also work on improving the backend application architecture for performance, scalability and robustness, reducing complexity and making development easier.
Trust Power is an energy data startup. Our app, “Loop”, connects to a home’s smart meters, collects half-hourly usage data and combines with contextual data to provide personalised advice on how to reduce costs and carbon emissions. We have a rapidly growing customer base and lots of interesting data challenges to overcome. You’ll be working in a highly skilled team, fully empowered to use your skills to help our customers through the current energy crisis and beyond; transforming UK homes into the low carbon homes of the future. We’re looking for a mid to senior level data scientist with a bias for action and great communication skills.