Feb. 15, 2022, 11:45 a.m.

457: quantum of sollazzo

#457: quantum of sollazzo – 15 February 2022

The data newsletter by @puntofisso.


Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. I am Giuseppe Sollazzo, or @puntofisso. I’ve been sending this newsletter since 2012 to be a summary of all the articles with or about data that captured my attention over the previous week. The newsletter is and will always (well, for as long as I can keep going!) be free, but you’re welcome to become a friend via the links below.

Every week I include a six-question interview with an inspiring data person. This week, I speak with David Kane, a data-driven researcher who I had the pleasure to work with, and who’s the absolute guru of third sector data.

’Til next week,
Giuseppe @puntofisso


This week’s edition is sponsored by OpenCage

OpenCage operates a highly available, simple to use, worldwide geocoding API based on open data like OpenStreetMap. With libraries for Python, R, MATLAB, Stata, and over 30 other programming languages it's easy to dive in. Whether you just need to geocode one dataset, or you have an ongoing need, we offer cost-effective, flat-fee packages, and all the benefits of Open Data.

Try the API now on the OpenCage demo page.


Six questions to...

David Kane

David is a freelance researcher.
What is your daily data work like and what tools do you use?
A lot of my work is looking at data about charities and other non-profits, so my typical day would start with firing up a Jupyter notebook and running some queries about a particular set of charities, or merging some datasets together to produce analysis.

More often than not I’ll be working with the Charity Commission’s register of charities. I run a site called findthatcharity.uk which has a big database full of lots of data about charities - that database is often where I start with a piece of work.

I have VSCode open most of the day; it’s where I run notebooks and write Python scripts to process data. I’m not sure what I’d do without the Pandas data analysis library in Python.

Tell me about a data project that you're proud of...
I work a lot with 360Giving, an amazing charity that supports organisations to openly publish grants data, and helps people to use it to improve charitable giving.

This became even more important during the pandemic. To help people understand more about how funders responded to the crisis, in December we released a piece of research looking at grants made since March 2020, and how funders made sure that the charities they supported could continue to provide vital services.

It was a real team effort - the team at 360Giving worked with a wide variety of funders to help them publish high quality data quickly - sometimes for the first time.

...and a data project that someone else did and you're jealous of.
My degree was Geography and I’ll always be a geographer at heart, so basically anything that Alasdair Rae does. I particularly love his work using 3D modelling software to make beautiful maps - e.g. these maps of Australia, or using lidar to make 3D models of Scotland.

If I say "dataset", you think of...
Let’s get stuck in! What do I already know that I can link this to, to add extra context or make it more useful?

Give someone new to data a tip or lesson you wish you'd learned earlier.
Always look at your data. It’s amazing how easy it is, especially when you use something like Python or R, to go straight to analysis or attempt to clean the data without seeing what’s actually in it first. Excel has lots of flaws, but the best thing about it is that the data is always there right in front of you - so you can see patterns & oddities. So even if I’m not using Excel I try to make sure I’m looking at the actual data as much as I can.
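
As a minimal sketch of that habit in Python, the snippet below prints the first rows of a dataset and tallies blank cells and distinct values per column before any analysis. The sample rows, the column names and the `peek` helper are all invented for illustration; in pandas, `df.head()` and `df.describe()` cover similar ground.

```python
import csv
import io
from collections import Counter

# Made-up sample data standing in for a real file (names and values are illustrative).
SAMPLE_CSV = """name,income,region
Charity A,1200,London
Charity B,,Wales
Charity C,950,london
Charity D,-1,Scotland
"""

def peek(rows, n=3):
    """Return the first n rows plus, per column, the blank count and a value tally."""
    rows = list(rows)
    summary = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        summary[column] = {
            "blank": sum(1 for v in values if v.strip() == ""),
            "values": Counter(values),
        }
    return rows[:n], summary

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
head, summary = peek(reader)

for row in head:
    print(row)  # eyeball the raw rows first
for column, info in summary.items():
    print(column, "blank:", info["blank"], "distinct:", len(info["values"]))
```

Even on four rows this surfaces the kind of oddities mentioned above: a blank income, a suspicious -1, and "London" spelled two ways.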

Data is or data are...
Data is. It's a mass noun - like water. There's not really such a thing as a "datum", a single point of data that exists in isolation - it's the context and integration with other parts that turns a number or a piece of text into "data". So you can only ever refer to the whole thing. Or something.


Become a Friend of Quantum of Sollazzo →

If you enjoy this newsletter, you can support it by becoming a GitHub Sponsor. Or you can Buy Me a Coffee. I'll send you an Open Data Rottweiler sticker.
Quantum of Sollazzo will always be free.

You're receiving this email because you subscribed to Quantum of Sollazzo, a weekly newsletter covering all things data, written by Giuseppe Sollazzo (@puntofisso). If you have a product or service to promote and want to support this newsletter, you can sponsor an issue.


Topical

Beijing Winter Olympics Medal Count

There are many trackers of Olympic Medals, but I particularly like this one from Bloomberg this week.

You’re doing Italian food all wrong, say Italians

According to this YouGov poll, my spaghetti with cheese, ham, and mustard is probably not acceptable.

Interactive: How much of your country’s electricity is renewable?

“About 30 percent of the world’s electricity comes from renewables, including hydropower, solar and wind among others.”
Interactive quiz aside, the article gives a broad picture of renewable use.

Tools & Tutorials

Mother Jones’s Storytelling Tools

A lot of them.

Simutrans

“Simutrans is a freeware and open-source transportation simulator. Your goal is to establish a successful transport company. Transport passengers, mail and goods by rail, road, ship, and even air. Interconnect districts, cities, public buildings, industries and tourist attractions by building a transport network you always dreamed of.”
Basically, SimCity for transport.

Population around a point

“Human population within a distance, from any point in the world.”, using the Global Human Settlement Layer data.
(via Chris Weston)
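
How the tool itself computes its figures isn't described here, but the underlying idea can be sketched in a few lines of Python: sum the population of every grid cell whose centre lies within a given great-circle distance of a point. The cell coordinates and population counts below are made up, and `population_within` is a hypothetical helper, not part of any library.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def population_within(cells, lat, lon, radius_km):
    """Sum the population of cells whose centre is within radius_km of (lat, lon)."""
    return sum(
        population
        for cell_lat, cell_lon, population in cells
        if haversine_km(lat, lon, cell_lat, cell_lon) <= radius_km
    )

# Made-up grid cells: (latitude, longitude, population). Not real data.
CELLS = [
    (51.5, -0.1, 1000),
    (51.6, -0.1, 500),
    (53.5, -2.2, 800),
]

print(population_within(CELLS, 51.5, -0.1, 20))
```

With these invented cells, the first two fall inside a 20 km radius and the third is far outside it.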

falsisign

“For bureaucratic reasons, a colleague of mine had to print, sign, scan and send by email a high number of pages. To save trees, ink, time, and to stick it to the bureaucrats, I wrote this script.”
The utter idiocy of requests like these triggers creative solutions, it would appear.

Free KMZ to SHP to GeoJSON Online Converter

Although it can only do files up to 10MB, this tool might be useful.

Cheatsheet: expansion for continuous scales

For ggplot2. Yes, there’s a typo in the title.

How to: make your first interactive map in R

Here’s the tutorial and here’s a map of London house prices made with it.
(via Steve Parks)

How I made the viral map

“Six Questions” graduate Erin Davies explains how she made the viral map (featured in last week’s quantum of sollazzo) of where Americans were born and how to adapt it to a different dataset.

Data thinking

How we communicate uncertainty in polls and election forecasts

The Economist data team explains well how they approach the topic of uncertainty in polls, using a technique that allows them “to translate the polls into probabilities, which […] should give readers a better understanding of the uncertainty in the election in general, and the polls in particular.”
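
The Economist's actual model isn't reproduced here. As a rough illustration of the general idea only, this Python sketch turns a poll lead and an assumed amount of error into a win probability using a normal distribution. The function name and the input numbers are hypothetical.

```python
import math

def win_probability(lead_points, error_sd_points):
    """Chance that the true lead is above zero, assuming normally distributed error."""
    z = lead_points / error_sd_points
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical inputs: a 2-point lead with a 4-point standard deviation of error.
print(round(win_probability(2.0, 4.0), 2))  # prints 0.69
```

Under these assumptions, a 2-point lead is roughly a 69% chance of being ahead, which is a long way from a sure thing.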

Dataviz, Data Analysis, & Interactive

Can You Gerrymander Your Party to Power?

“To help you understand [gerrymandering] better, we created an imaginary state called Hexapolis, where your only mission is to gerrymander your party to power.”
(via Soph Warnes’ Fair Warning)
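
The mechanism is easy to show with toy numbers. In this Python sketch, the same 15 voters are split into three districts in two different ways, and the party with only 40% of the vote wins a majority of seats under one of them. The voters and both maps are invented for illustration and have nothing to do with Hexapolis itself.

```python
from collections import Counter

# 15 hypothetical voters: indexes 0-5 vote "A", indexes 6-14 vote "B".
VOTERS = list("AAAAAABBBBBBBBB")

def seats(districts):
    """Count districts won by each party; each district is a list of voter indexes."""
    won = Counter()
    for district in districts:
        tally = Counter(VOTERS[i] for i in district)
        won[tally.most_common(1)[0][0]] += 1
    return dict(won)

# Map 1: B voters are packed into one district, so A narrowly wins the other two.
PACKED = [[0, 1, 2, 6, 7], [3, 4, 5, 8, 9], [10, 11, 12, 13, 14]]

# Map 2: A voters are spread evenly, so B wins every district.
SPREAD = [[0, 1, 6, 7, 8], [2, 3, 9, 10, 11], [4, 5, 12, 13, 14]]

print(seats(PACKED))
print(seats(SPREAD))
```

Districts have five voters each and there are only two parties, so no district can end in a tie.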

Words Known Better by Males Than Females, and Vice Versa

“The chart below visualizes words disproportionately known by one sex and not the other.”
On Observable, so it’s editable, using data coming from a paper. Not a lot of it, it must be said, but enough to give you a sense of cultural norms. Oddly, I knew the word taffeta because of Young Frankenstein.

AI

A Taxonomy for AI / Data for Good

“This piece is the culmination of a series of explorations seeking to map and make sense of the landscape of efforts to apply data for good.” By DataKind co-founder Jake Porway. Nice chart, too.

UK to pilot world-leading approach to improve ethical adoption of AI in healthcare

Work thing klaxon. Led by my colleague Brhmie Balaram, in the NHS AI Lab we’re starting to work on piloting Algorithmic Impact Assessments (AIAs) in healthcare, as a way to become aware of, and ultimately eradicate, biases in algorithms.


Sponsored content

The essence of the web, every morning in your inbox

Tens of thousands of busy people start their day with their personalized digest by Refind. Sign up for free and pick your favorite topics and thought leaders. Subscribe here.

quantum of sollazzo is supported by ProofRed’s excellent proofreading. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.

Sponsors*: casperdcl and iterative.ai, Jeff Wilson, Fay Simcock, Naomi Penfold, Steve Parks

[*] This is for all $5+/month GitHub sponsors. If you are one of those and don’t appear here, please e-mail me.

You just read issue #457 of quantum of sollazzo. You can also browse the full archives of this newsletter.
