#423: quantum of sollazzo – 25 May 2021
The data newsletter by @puntofisso.
Hi folks, I hope things are going well wherever you are, especially if you are in countries still badly affected by the pandemic. We’re starting to see the light at the end of the tunnel here in the UK. Personally, I was vaccinated a couple of weeks ago, and – as I start writing this issue – I’ve just come back from my first visit to the barber in over a year.
I cannot help but segue into this good little article by Financial Times journalist Tim Harford, Why bad times call for good data, which dispenses some sound wisdom: “Hindsight is a wonderful thing, of course: the next crisis will no doubt demand timely information about something new. But statistical infrastructure can be built to adapt.”
Yours truly will run a session at the next CogX Festival, happening in London and remotely on 14-16 June, centred on the work of the NHSX AI Skunkworks programme and involving two digital leaders I’ve been working with closely: NHS Resolution’s CIO Niamh McKenna and Kettering General Hospital’s Digital Director Ian Roddis. It should be fun.
Speaking of festivals, don’t miss the Loud Numbers Sonification Festival on 5 June! From 4pm (London time), data sonification duo Miriam Quick and Duncan Geere will be broadcasting a free, live event on YouTube with a stellar lineup that includes The Economist’s master of data Alex Selby-Boothroyd, The Conditional Orchestra’s Richard Bultitude, Sara Lenzi of the Data Sonification Archive, and others. See below.
This week’s data hero interview is with Gavin Freeguard, data pun supremo and animator of the Institute for Government’s Data Bites series. You probably also know Gavin’s famous National Data Strategy Sea Shanty.
Speaking of interviews: which people working in or with data would you like me to interview? Just hit reply and let me know. Thanks!
’Til next week,
Six questions to...
Gavin Freeguard, freelance consultant, Gavin Freeguard Ltd. Also associate at the Institute for Government, consultant at the Ada Lovelace Institute, and special adviser at the Open Data Institute.
What is your daily data work like and what tools do you use?
Historically, I’ve worked with all sorts of government data – usually stitching together spreadsheets from the Office for National Statistics and various government departments. I’ve worked almost exclusively in Excel – better for dataviz than many think and particularly useful for helping an organisation of mainly qualitative researchers to start thinking about and using data. I then use Daniel’s XL Toolbox – a plugin – to produce an EMF version of our charts, and export to Inkscape for some editing and for adding titles and sources. That file gives us PNGs for web and social media, and gives designers what they need to produce printed reports.
Tell me about a data project that you're proud of...
Whitehall Monitor, in general - helping people inside and outside government understand what government looks like, how it's changed and how it's performing. As part of that, our analysis of reshuffles and ministerial resignations, which gave a more data-informed view - the former sometimes helped change the narrative (in 2018, from 'nothing has changed' to 'actually, a lot has changed in important areas of government policy'), while the latter showed why making the data more widely available is worth doing, as lots of others (including news outlets) could use it. Also our work looking at how government measures its own performance (I can't believe we put this together in about three weeks!).
...and a data project that someone else did and you're jealous of.
So many things by the data teams at the FT, The Economist, The Pudding, the New York Times or the Washington Post! (With lots of honourable mentions for The Guardian, Reuters, Bloomberg, FiveThirtyEight and others.) A topical one - Transparency International UK's work in making data about who ministers are meeting much more useable, a great example of how we could make government data more accessible and useful.
If I say "dataset", you think of...
Pain. But also the pleasure that comes from solving problems and finding interesting things.
Give someone new to data a tip or lesson you wish you'd learned earlier.
Here's some I made earlier.
Data is or data are...
Data is. I'm told it was one of my tweets that prompted this poll. And that's even though I could wax lyrical for ages on why treating data as singular is a real problem!
Working towards change
“A comprehensive, national assessment of attitudes and stereotypes towards Asian Americans. The index is one of the first such studies in the last 20 years.”
Police Score Card
“We analyzed data on 13,147 US police departments. Read the Findings. See the Data for Each Department.”
A monumental piece of work.
COVID-19 test and deaths visualization
Colin Angus of the Sheffield Alcohol Research Group (ScHARR) at the University of Sheffield has created a series of charts representing COVID-19 data, shared in this Twitter thread, and has released the R code he used to generate them.
How America’s ‘places to be’ have shifted over the past 100 years
The Washington Post’s graphics team can turn every topic into an amazing data-driven scrolling-based story – here, it investigates how population figures have changed in different US states over time.
A report by the Data for Black Lives organization. Needless to say, this is not an impartial report, and its controversial angle will generate a lot of debate. But it rightly points to a number of intriguing data issues that do a good job of defining this era of intense data exploitation and scrutiny.
Tools & Tutorials
New Data Tools and Tips for Investigating Climate Change
GIJN’s Rowan Philp has put together a good write-up of different tools and ideas for working with climate datasets, which are often difficult to use.
The Epidemiologist R Handbook
A handbook, written by epidemiologists for epidemiologists, that “serves as a quick R code reference manual, provides task-centered examples addressing common epidemiological problems, assists epidemiologists transitioning to R, is accessible in settings with low internet-connectivity via an offline version”.
Created by the Indiana University Observatory on Social Media to help combat the spread of misinformation.
Highres spectrograms with the DFT Shift Theorem
Talking about data sonification, I can’t remember how I got to this article… I wish I was able to understand at least half of it (but the Fourier Transform is fascinating anyway).
Who owns Australia?
“Complex web of data reveals large swathes of country controlled by small number of billionaires and large companies”.
A fascinating piece of interactive journalism on land ownership, by The Guardian and supported by The Pew Charitable Trusts.
(via Lucilla Piccari)
Be Decision-Driven Not Data-Driven
“Maybe being data-driven is the wrong goal.”
Good Data Scientist, Bad Data Scientist
Among other things, “Good DS is obsessed with solving business problems. They relentlessly search for them, and then bring out the right tool once found. Bad DS is obsessed with applying a specific technology or tool. They’ll orient their search for problems around the tool they are looking to use.”
Support this newsletter & spread the word
Become a GitHub Sponsor. It costs about the price of a coffee per month, and you’ll get an Open Data Rottweiler sticker (and other stuff). Or you can Buy Me A Coffee.
quantum of sollazzo is also supported by ProofRed’s excellent proofreading service. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.