#473: quantum of sollazzo – 21 June 2022
The data newsletter by @puntofisso.
Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. I am Giuseppe Sollazzo, or @puntofisso. I’ve been sending this newsletter since 2012 to be a summary of all the articles with or about data that captured my attention over the previous week. The newsletter is and will always (well, for as long as I can keep going!) be free, but you’re welcome to become a friend via the links below.
Every week I include a six-question interview with an inspiring data person. This week, I speak with Mahima Singh, Data Editor at the Globe and Mail.
The data politicos out there might be interested in Who’s Watching Parliament?, by Ben Worthy, Cat Morgan and Stefani Langehennig. They have just completed a Leverhulme Trust funded project which looked at how new data tools like TheyWorkForYou are impacting upon Parliament. You can read the project report and summary here, and if you want to find out more drop Ben an email on email@example.com.
‘till next week,
Six questions to...
Mahima is Data Editor at The Globe and Mail.
What is your daily data work like and what tools do you use?
As a Data Editor at the Globe and Mail, I deal with many different kinds of projects; from daily assignments, long-term projects to tools and training for the newsroom. I also focus on assisting the newsroom with their data needs.
My weapon of choice is spreadsheets with a sprinkle of Python, but I am constantly working with different tools for projects.
My work mainly involves:
Getting data: open-source, scraping, record requests etc.
Analyzing the data: Code in Python or analysis in a spreadsheet
Visualizing the data: Creating top-level charts and graphics in our in-house tool
Finally, the fantastic folk on our visuals team create the visuals, treatment and interactives for our stories.
Tell me about a data project that you're proud of...
One of my first projects at the Globe and Mail was this analysis of the Supreme Court of Canada.
By regularly reading Supreme Court cases, our justice writer noticed a shift in how the country's top judges interpreted constitutional rights. They wanted to prove this hypothesis quantitatively.
I had to create an algorithm to track the bench's ruling patterns and evaluate how conservative or liberal the judges were.
Initially, I wrote multiple scrapers to pull text from the cases and analyze them automatically. But we all know how finicky scrapers can be. I soon realized what we were tracking wasn't objective enough for our code. After some trial and error, we concluded it was best to manually read through all the cases and create our own database. After months of reading and rereading cases, during which it felt like I was in law school, we finally had our data.
Cue the analysis.
I did most of the analysis in Google Sheets itself. Since this was an exploratory project, we tried to analyze the data from multiple angles, many of which didn't make it into the final story. (A lesson for anyone working with big data sets: not all findings end up in the story.)
Towards the end of the project, we set up various meetings with a Data Scientist from Ottawa university to validate our analysis and act as a third pair of eyes.
This was a very fun project for me because it was also a crash course in Canadian law for someone who had just immigrated to the country.
...and a data project that someone else did and you're jealous of.
I love to seek inspiration from projects outside the Wapo, NYT, Reuters multiverse. While these organizations have time and again set the standards for data-driven journalism, it's the projects from around the world (particularly Asia in my case) that really stick out for me.
Here are just a few of the many DataViz teams out there.
I'm a big fan of Kontinentalist, which does data-driven stories from Asia.
And I never cease to be amazed by the investigative pieces by Malaysiakini from Malaysia, this interactive from Thai news organization workpointtoday, and everything that the South China Morning Post
If I say "dataset", you think of...
One of the building blocks of a data story/project.
It can take many shapes and sizes, but ultimately it is a source, and just like any source, you interview it to get insights.
Data, set, go!
Give someone new to data a tip or lesson you wish you'd learned earlier.
Be kind to yourself. The industry is forever changing. The constant barrage of new tools, coding languages, and skill sets can get overwhelming, but it's okay not to know it all.
I have learnt that the best way to learn new skills is by incorporating them into your projects. There is something very empowering about going into a project, not knowing something and coming out the other side with skills you didn't have before.
Another major takeaway from my time in newsrooms is "Make connections." Like all journalism, data journalism, too, is a collaborative effort. If it takes a village, it makes sense to get to know the villagers. Building relationships with reporters keeps you sane in the workplace and presents opportunities for projects across different beats and fields. Plus, it has the bonus of sharing skills and learning something new.
Data is or data are...
For me, it's Data is!
It is what it is ¯\_(ツ)_/¯
Car Free Megacities
The talented folks at climate organization Possible have released this incredibly detailed, interactive noise map of London, Paris, and New York.
How popular was Boris Johnson really?
“Use our gadget to guess where public opinion on the Prime Minister was at key moments of his time in office.“
A “draw your own chart” adventure by the New Statesman.
(via Warning: Graphic Content)
How space debris threatens modern life
“Fragments of a defunct satellite were hurtling towards the space station and the crew was ordered into their escape shuttles.“
This is one of those boiling frog problems for which a solution is getting less and less viable. The sheer number of space vehicles in orbit is massive.
Just How Far Apart Are The Two Parties On Gun Control?
“While a plurality of Republicans (49 percent) said both are equally important and a majority of Democrats (57 percent) said protecting people from violence is more important, the difference is striking.“
Some of you, this side of the pond, might say… 57 percent is the higher of the two?!
Tools & Tutorials
10 Tips for Using Geolocation and Open Source Data to Fuel Investigations
Quite a few useful ideas in this brief listicle by GIJN, including how to identify time of the year through sunlight and shadows
bnomial – One machine learning question every day
This series is short, packed with info, playful. You can also track your own progress.
A Very, Very Tiny Grammar of Graphics
My suggestion is: fork this Observable notebook and create your own grammar.
How the Ancient Egyptians Built the Original Skyscrapers with Data
An entertaining yet eye-opening blog post from Atlan about the data management practices used in pyramid-building.
Dataviz, Data Analysis, & Interactive
Money flows: who’s investing in Laos, and what problems do they present?
A deep dive into foreign investment into Laos by Kontinentalist.
In their newsletter – which you should all subscribe to – they give some interesting detail on how they delivered this work: “what we did not expect was how tedious the data consolidation and gathering process would be. We spent months and months consolidating a variety of data sources, and digging for information about specific companies involved in these projects.“
The article has some pretty good dataviz.
One year in vis
Dataviz on dataviz klaxon for Datawrapper – and happy first birthday to their Data Vis Dispatch :)
How smarter AI will change creativity
“The promise and perils of a breakthrough in machine intelligence”, by The Economist, with a good set of reflections for us practitioners in this field.
Interview with a squirrel
Last week, news about the Google developer who started saying that their text-based AI model has become sentient was all the rage. As Janelle Shane writes in this absolutely spot-on proof that that’s a preposterous claim, “almost everyone else who has used these large text-generating AIs, myself included, is entirely unconvinced. Why? Because these large language models can also describe the experience of being a squirrel.“
Yes, she proceeds to rewrite the dialogue to be about a squirrel – it only takes a handful of word substitutions.
Design Patterns in Machine Learning Code and Systems
“Design patterns are not just a way to structure code. They also communicate the problem addressed and how the code or component is intended to be used.
Here are some patterns I’ve observed in machine learning code and systems, mostly from the Gang of Four design patterns book.“
By Amazon ML scientist Eugene Yan.
The essence of the web, every morning in your inbox
Tens of thousands of busy people start their day with their personalized digest by Refind. Sign up for free and pick your favorite topics and thought leaders. Subscribe here.
quantum of sollazzo is supported by ProofRed’s excellent proofreading. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.
casperdcl and iterative.ai
[*] this is for all $5+/month GitHub sponsors. If you are one of those and don’t appear here, please e-mail me.