#463: quantum of sollazzo – 12 April 2022
The data newsletter by @puntofisso.

Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. I am Giuseppe Sollazzo, or @puntofisso. I’ve been sending this newsletter since 2012 to be a summary of all the articles with or about data that captured my attention over the previous week. The newsletter is and will always (well, for as long as I can keep going!) be free, but you’re welcome to become a friend via the links below.
Every week I include a six-question interview with an inspiring data person. This week, I speak with Valérie Ouellet of CBC, where she is a Senior Data Journalist working in their Investigative Unit.
’Til next week,
Giuseppe @puntofisso
Six questions to...
Valérie Ouellet
Valérie is a Senior Data Journalist at CBC News (Canada).
What is your daily data work like and what tools do you use?
My job is to report on national investigations with a data-driven focus for Canada’s public broadcaster, CBC News. I work with data from all sources: FOI, scraping, open data portals and exclusive datasets I build from scratch using news clippings, court filings, interviews, etc. My main tools are spreadsheets, SQL/Python code and QGIS, plus OpenRefine/Comet Docs for messy data. Once I have vetted key findings, I share them with 2-3 academics/experts and chase main characters, all for on-camera interviews. I write all my stories and present them on our online, radio and television platforms, with the support of a producer. Most data visualizations are created by a talented team of graphic designers and/or developers.
Tell me about a data project that you're proud of...
I just used data analysis to document a manufacturer data dump of thousands of suspected injuries tied to breast implants sold in Canada. The last time I analyzed this data was in 2018 as part of the ICIJ’s “Implant Files” investigation. At the time, we suspected problems had been underreported for years, but couldn’t quantify the gap. When I dug back into the data in 2022, I found more than 5,900 new reports had been submitted to our federal health authority, Health Canada, by manufacturers on three dates - a ‘data dump’. Some reports dated back to the early 2000s, which goes against federal requirements and raises many questions on the accountability of our reporting system. Thanks to meticulous record-keeping, I was able to reuse my 2018 scripts to run my analysis quickly and put the story out as an exclusive. Proud that I trusted my instinct that there was more to this story and followed up.
...and a data project that someone else did and you're jealous of.
My deep admiration goes to investigative reporter Robyn Doolittle and data reporter Chen Wang for their investigative series into Canada’s “Power Gap”. Their two-and-a-half-year project compared the overall representation, salaries and leadership positions of women and men in 244 entities in the country. It took careful research, hundreds of FOI requests and complex data analysis to put it all together. The way the series rolled out is also to be commended, with a steady flow of sector-focused stories over months and several “Behind-the-scenes” pieces where the team broke down their data collection, data analysis and methodology in full transparency.
If I say "dataset", you think of...
A creative new way to approach, tell and visualize news stories, particularly to shed light on systemic discrimination, inequalities and social justice. A tool to better research and quantify many problems we hear about regularly as investigative journalists. A way to show sources that their lived experience has meaning for many more people, to connect the dots for our audience and to move the dial on big debates with exclusive information. An ironclad email requesting accountability from the “powers that be”, with not just testimonies or allegations they can easily dismiss, but hard numbers they must address.
Give someone new to data a tip or lesson you wish you'd learned earlier.
It took me a long time - thank you Imposter Syndrome - to understand that a great data-driven story doesn’t need to involve dozens of bots, fancy interactive visualizations or three coding languages to be newsworthy and serve the public interest. In Canada, access to data at all levels of government is still limited, unfortunately. Many times, I filed great stories by being the first person to answer a simple question with reliable data - like how many prisoners had been infected with COVID or what are vaccination rates like in provincial correctional facilities - and all it took was one spreadsheet and meticulous research with a rigorous methodology. What matters is the story and what it reveals, not the tool. Robots are really cool, but first ask yourself: are they your best, fastest way of getting to that story?
Data is or data are...
The eternal debate! I do a bit of both? Data is in the newsroom and on-air, data are with academics, experts, statisticians and data scientists.
Topical
A table set for fasting
“Because it follows the lunar calendar, Ramadan falls on a different solar date from year to year — which means that the length of the daytime fast also changes from year to year.”
Rose Mintzer-Sweeney at Datawrapper takes inspiration from The Economist and creates a visualization of fasting times.
Satellite images show bodies lay in Bucha for weeks, despite Russian claims.
Excellent investigative work from the New York Times.

Ukraine war threatens to deepen Russia’s demographic crisis
As Federica Cocco, one of the authors of this article, summarises in this Twitter thread, “in the mid-1990s Russia suffered a demographic crisis of historic proportions (caused by an economic crisis, rise in alcoholism & chronic disease, etc). Its population is still suffering the consequences.”
Is the EU’s asylum system ready to welcome Ukrainian refugees?
“Ukrainian refugees now enter the EU under the aegis of the ultra-fast special protection system, but regular reception centres across the Union are piling up hundreds of thousands of applications and rejecting many. EU member states’ asylum systems average more than 15 months of delay.”
A data-driven investigation by CIVIO/European Data Journalism Network, using Observable and with all data available for download.

We Study Virus Evolution. Here’s Where We Think the Coronavirus Is Going.
An opinion piece by a group of researchers for the New York Times.
Tools & Tutorials
OSINT Investigation Tools
“The BBC Africa Eye / Forensics Dashboard is a collation of some essential tools tailored to journalists using open source intelligence (OSINT) to carry out investigations on the African continent.”
I have no comment other than “Wow”. There are quite a few useful tools on this board.

Web Scraping Intro
An introduction to web scraping using Python’s BeautifulSoup.
Dataviz, Data Analysis, & Interactive
Data Visualization Events
A handy Google calendar by the Data Visualization Society.
(via Massimo Conte)
Teacher Pay and Inequality
“Teachers are 3x more likely to have multiple jobs compared to other workers - hometown is a big factor.”
It comes with a handy interactive dataviz made with Datawrapper. I also wonder what the UK data and EU data would be like.

Justice Map
A map of the US to “visualize race and income for your community and country.”

Sponsored content
The essence of the web, every morning in your inbox
Tens of thousands of busy people start their day with their personalized digest by Refind. Sign up for free and pick your favorite topics and thought leaders. Subscribe here.
quantum of sollazzo is supported by ProofRed’s excellent proofreading. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.

Sponsors*
casperdcl and iterative.ai
Jeff Wilson
Fay Simcock
Naomi Penfold
Steve Parks
[*] this is for all $5+/month GitHub sponsors. If you are one of those and don’t appear here, please e-mail me.