The data newsletter by @puntofisso.
Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. I am Giuseppe Sollazzo, or @puntofisso. I’ve been sending this newsletter since 2012 as a summary of all the articles with or about data that captured my attention over the previous week. The newsletter is and will always (well, for as long as I can keep going!) be free, but you’re welcome to become a friend via the links below.
Every week I include a six-question interview with an inspiring data person. This week, I speak with Donata Columbro, author of the Italian-language book “Ti spiego il dato”. Some of you might know her as the headteacher of data school “DataNinja”.
Riccardo Di Sipio sent me a correction that made me feel a little stupid. In issue 447 I linked to the Economist’s brilliant graphic showing the travel journey of probe Lucy, and I wrote “the approach they took to visualising the 12-year journey of Lucy, a space probe that visited eight different asteroids”. In fact, Lucy has just departed and will visit eight asteroids. Thank you, Riccardo. I’ll go and brew coffee.
’Til next week,
You're receiving this email because you subscribed to Quantum of Sollazzo, a weekly newsletter covering all things data, written by Giuseppe Sollazzo (@puntofisso). If you have a product or service to promote and want to support this newsletter, you can sponsor an issue.
Depends how you ask, says Reuters Graphics in this highly visual article.
“The mRNA jabs seem best—but all offer protection”. From The Economist, a chart with a title that can only be the work of Alex Selby-Boothroyd…
A good Twitter thread. Simpson’s Paradox is a statistical phenomenon in which a trend that shows up when you split data by groups disappears in the combined data or, even more counterintuitively, reverses. And of course there is so much grouped data about vaccines…
(h/t Paola Masuzzo)
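If you want to see the paradox with your own eyes, here is a minimal Python sketch. The counts are invented purely for illustration (they are not from the thread or from any real vaccine dataset), and the group names and function names are my own. Within each age group the vaccinated rate is lower, yet the combined rate looks higher, simply because most vaccinated people sit in the higher-risk group.

```python
# Invented, purely illustrative counts: (cases, people) per age group and status.
# These are NOT real vaccine figures.
data = {
    "under 50": {"vaccinated": (1, 200), "unvaccinated": (10, 1000)},
    "50 and over": {"vaccinated": (100, 1000), "unvaccinated": (20, 100)},
}


def rate(cases, people):
    """Share of people in a group who became a case."""
    return cases / people


def group_rates(data):
    """Case rate per status, computed separately inside each group."""
    return {
        group: {status: rate(*counts) for status, counts in by_status.items()}
        for group, by_status in data.items()
    }


def combined_rates(data):
    """Case rate per status after pooling all groups together."""
    totals = {}
    for by_status in data.values():
        for status, (cases, people) in by_status.items():
            c, p = totals.get(status, (0, 0))
            totals[status] = (c + cases, p + people)
    return {status: rate(c, p) for status, (c, p) in totals.items()}


per_group = group_rates(data)
overall = combined_rates(data)

for group, rates in per_group.items():
    print(group, {s: f"{r:.1%}" for s, r in rates.items()})
print("combined", {s: f"{r:.1%}" for s, r in overall.items()})
```

The takeaway is the usual one: before comparing headline rates, check whether the groups being pooled differ a lot in size and baseline risk.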
A very interesting Twitter thread on telling stories through company accounts, by journalism practitioner and lecturer Paul Bradshaw.
“we have millions of prospective likely-to-sell homes that are not already associated with an agent’s CRM contacts. To which agent or agents should we recommend each of these properties?”
Well, this is a peculiar application for a sector, real estate, which is not quite my greatest strength… but it’s a clear display of use of Multivariate Kernel Density Estimation in the wild.
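For readers who have never met a kernel density estimate, here is a rough Python sketch of the general idea, not of the article's actual pipeline. It assumes a Gaussian product kernel with one hand-picked bandwidth per dimension; the `gaussian_kde` function, the `past_listings` coordinates and the bandwidth values are all hypothetical. The estimate is high near places where an agent's (made-up) past listings cluster and low far away from them.

```python
import math


def gaussian_kde(points, bandwidths):
    """Return a density function built from a Gaussian product kernel.

    points: list of d-dimensional tuples (the observed sample).
    bandwidths: one smoothing width per dimension (assumed, hand-picked).
    """
    if not points:
        raise ValueError("need at least one point")
    n = len(points)
    d = len(bandwidths)
    # Normalising constant so the estimate integrates to 1.
    norm = n * math.prod(bandwidths) * (2 * math.pi) ** (d / 2)

    def density(x):
        total = 0.0
        for p in points:
            # Squared distance to p, scaled per dimension by the bandwidth.
            z2 = sum(((xi - pi) / h) ** 2 for xi, pi, h in zip(x, p, bandwidths))
            total += math.exp(-0.5 * z2)
        return total / norm

    return density


# Hypothetical 2D coordinates of one agent's past listings (arbitrary units).
past_listings = [(0.0, 0.0), (0.2, -0.1), (-0.3, 0.4), (0.1, 0.3), (3.0, 3.1)]
density = gaussian_kde(past_listings, bandwidths=(0.5, 0.5))

near = density((0.0, 0.1))  # close to the cluster of past listings
far = density((5.0, 5.0))   # far from every past listing
print(f"near cluster: {near:.4f}, far away: {far:.8f}")
```

In practice you would probably reach for a library implementation and a data-driven bandwidth rather than rolling your own, but the sketch shows what the estimator is doing under the hood.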
Last week, OpenAI announced that access to their GPT-3 API has become public (well, you still have to apply!), so this tutorial might be handy if you’re planning any natural language application.
“This repository includes datasets pre-processed to facilitate common operations and full correspondence with commonly used sources. Full correspondence is possible only through some compromise solutions: ensure that the data provided are fit for purpose for your specific use case. Please open an issue if you find problems with the data or have suggestions for improvement.”
Why is this useful? As the EDJnet folks explain, the data “provide concordance between local administrative units (LAUs) and NUTS regions, which is so often an obstacle for data journalism projects looking at local and/or geographical data.” Useful data with an open approach. Giorgio Comai had previously explained how the dataset was created.
Have you ever needed to work on 2 billion rows?
This is a pretty cool piece on the Off The Charts newsletter, in which Helen Atkinson, a visual data journalist, chronicles a day in the office.
Prukalpa Sankar, a featured “Six Questions” interviewee, has written an interesting new article about the approach to creating data strategies. Having been involved in data strategies for a few years now, I find this really helpful to think about the different angles of a business-driven technical data strategy.
“The story of where data governance started and how everything went wrong” in another interesting article by Prukalpa. “Control, not collaboration” is something I’ve seen time and time again.
Academic paper klaxon.
“Social media platforms attempting to curb abuse and misinformation have been accused of political bias. We deploy neutral social bots who start following different news sources on Twitter, and track them to probe distinct biases emerging from platform mechanisms versus user interactions. We find no strong or consistent evidence of political bias in the news feed. Despite this, the news and information to which U.S. Twitter users are exposed depend strongly on the political leaning of their early connections.”
Created as part of the 30 Day Map Challenge (launched by “Six Questions” graduate Topi Tjukanov), this Observable notebook looks at segments of US roads that are listed as heading towards various cities.
(via Soph’s Fair Warning)
Edurne Morillo, a “Six Questions” graduate and a support engineer at Datawrapper, shows cycling lanes around the world and explains how she did it.
The POLIS think-tank at the London School of Economics have released this useful “guide designed to help news organisations learn about the opportunities offered by AI to support their journalism.”
An interesting discussion on legal StackExchange.
Quantum of Sollazzo is also supported by ProofRed’s excellent proofreading service. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.