#458: quantum of sollazzo – 22 February 2022
The data newsletter by @puntofisso.
Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. I am Giuseppe Sollazzo, or @puntofisso. I’ve been sending this newsletter since 2012 to be a summary of all the articles with or about data that captured my attention over the previous week. The newsletter is and will always (well, for as long as I can keep going!) be free, but you’re welcome to become a friend via the links below.
Every week I include a six-question interview with an inspiring data person. This week, I speak with Keila Guimarães, a Data Lead working with Google Trends and Vaco.
Oh, one more thing. I’ve launched Wardle: a UK Local Authority based clone of Worldle, the map-guessing quiz inspired by Wordle. For you, administrative-boundary-loving nerds. You can play at wardle.puntofisso.net and, for an added laugh, try the UK constituency version at wardle-parli.puntofisso.net.
’Til next week,
This week’s edition is sponsored by OpenCage
OpenCage operates a highly available, simple to use, worldwide geocoding API based on open data like OpenStreetMap. With libraries for Python, R, MATLAB, Stata, and over 30 other programming languages, it's easy to dive in. Whether you just need to geocode one dataset, or you have an ongoing need, we offer cost-effective, flat-fee packages, and all the benefits of Open Data.
Try the API now on the OpenCage demo page.
Six questions to...
Keila is Data Lead, Google, on assignment from Vaco.
What is your daily data work like and what tools do you use?
Through my employment with Vaco, I work as a Data Lead for the global Trends team at Google, where I surface insights for newsrooms about what people search for on the internet. Prior to my transfer to London, I was a data curator with the same Google team, working in São Paulo, Brazil.
A big part of my everyday job is to help reporters find hidden stories in this huge database called Google Trends. These stories can be anything from people’s quick response to breaking news to mapping deeper shifts in behaviour.
I use a variety of data tools in my day-to-day work, from R scripts for cleaning and wrangling large portions of data to query languages for interrogating our datasets.
Tell me about a data project that you're proud of...
Back in March 2021, when Brazil was the epicentre of the Covid-19 outbreak, I assisted one of the country’s main publications, Revista Piauí, to map how searches for Covid-19 symptoms spiked ahead of a surge in cases in various Brazilian states. We plotted the curve of cases against the curve of searches for symptoms such as loss of taste, loss of smell and difficulty breathing. The pattern was striking throughout the pandemic.
We also assisted the newsroom in mapping the deeper impact of the pandemic on the wellbeing of Brazilians, looking into searches for mental health, economic hardship and overall physical pain. While at the epicentre of Covid-19 at the time, Brazil was also the epicentre for various search terms related to mental health and physical discomfort. Discovering these correlations was very interesting.
...and a data project that someone else did and you're jealous of.
At the moment I am following the Suisse Secrets project, a large investigation into the Swiss banking system. The series of stories unmasks how clients involved in torture, drug trafficking, money laundering and other serious crimes were allowed to hold their money in Credit Suisse. Coordinated by German newspaper Süddeutsche Zeitung, it has taken 160 journalists from 48 news organisations – including The Guardian, The New York Times and Le Monde – months to dig into the leaked data of more than 30,000 clients, who collectively hold more than $100 billion in the bank. Similar to recent large-scale investigations, such as the Panama Papers and the Pandora Papers, Suisse Secrets is a fresh example of how blurred the lines between data and traditional journalism have become. Even though the data isn’t explicitly visible at first glance, it would be impossible to report on this without the desire and ability to analyse vast quantities of data. This story shows the power of digging into large, secret datasets, helping to reveal wrongdoing and keep the powerful accountable.
If I say "dataset", you think of...
A source like any other, which needs to be researched, interviewed, challenged and seen in the perspective of a wider context. Similar to other sources, it also requires a good dose of scepticism.
Give someone new to data a tip or lesson you wish you'd learned earlier.
If I could talk to my younger self, I would remind her that the tools we use to explore and visualise data are enablers of good stories, not the main journey. Our data skills should be seen as facilitators of storytelling, without dogmas. There is something sexy about saying you work with data, something only performed by the few, and for a long time I wasn’t sure if I was using the right tools to be considered part of this tight-knit club. But I wish I had learned early on that, in data journalism, data skills without a headline are fruitless and the story is the ultimate goal.
Data is or data are...
Data is. I just wouldn’t be able to say it differently!
Where are industries clustered in the UK?
“While the bulk of investment into the UK goes to London and the South East, the country is dotted with industrial clusters that are mapped here.“
By Nicu Calcea and Josh Rayman at Investment Monitor.
The Lasting Legacy Of Redlining
“We looked at 138 formerly redlined cities and found most were still segregated — just like they were designed to be.“
De jure and de facto are sadly worlds apart on racial equality in the U.S.
The perfect storm
“Data shows why the volcanic lightning storm from the Tonga eruption was unlike anything on record.“
A very interesting angle about the Tonga eruption – I had no idea that volcanic lightning and seawater were a thing to evaluate in eruptions.
Tools & Tutorials
This web tool allows you to simulate traffic on a ring road – be careful, it’s addictive.
Dtale is a “visualizer for pandas data structures”. It integrates well with Jupyter.
360 Giving Data Quality Dashboard
If you’re interested in third sector grants data, 360 Giving has released this handy tool to explore the completeness and accuracy of the data.
How we make maps at The Economist
“To create the dozen or so maps published each week, we source GIS maps and data, which we then load into QGIS—our preferred mapping software.“
“A Real-Time Website Privacy Inspector“.
Dataviz, Data Analysis, & Interactive
Maryanne Wachter, who’s a convert to data visualization from structural and bridge engineering, has created a web application (https://bridge.watch) with interactive data visualizations of the 2021 National Bridge Inventory (a US open data set).
The real Montalbano!
This incredible visualization of Andrea Camilleri’s Montalbano novels is outstanding (and you bet it is – the clever folks of Accurat are behind it).
(via Massimo Conte)
Black Owned Cincinnati
Including data from the Cincinnati Open Data Portal, Dinushki De Livera explores Black ownership.
Radar Interference Tracker: A New Open Source Tool to Locate Active Military Radar Systems
“The Radar Interference Tracker (RIT) is a new tool created by Ollie Ballinger that allows anyone to search for and potentially locate active military radar systems anywhere on earth”.
The tool can be found here, while this article explains how it works.
Fake faces created by AI look more trustworthy than real people
“Synthetic human faces are so convincing they can fool even trained observers, and they may be highly effective for use in scams“
Andrew Ng: Unbiggen AI
“The AI pioneer says it’s time for smart-sized, “data-centric” solutions to big issues”.
The essence of the web, every morning in your inbox
Tens of thousands of busy people start their day with their personalized digest by Refind. Sign up for free and pick your favorite topics and thought leaders. Subscribe here.
quantum of sollazzo is supported by ProofRed’s excellent proofreading. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.
casperdcl and iterative.ai
[*] this is for all $5+/month GitHub sponsors. If you are one of those and don’t appear here, please e-mail me