The data newsletter by @puntofisso.
Last week I had my first day off since November, and it was pretty good. I had forgotten the joys of a day of sweet laziness. Like many others, I had struggled to see the point of taking days off without being able to travel anywhere. As I should have known, it was refreshing, and I should probably do this again: just a day off in between working weeks.
I keep finding repeat instances of issues around definitions. This time, it’s in the discussion around COVID tests. I’ve heard reports that school pupils in the UK will be offered multiple tests in order to make their return to school safer, which is great news. What’s less great is the slightly hysterical debate about false positives, alleging that students might be (I quote) “forced to self-isolate needlessly”. Of course, the issue here is clear: every test comes with a certain share of false positives and false negatives. Generally, those who develop tests will try to minimise one at the expense of the other. This is normal, expected, and the way tests generally work. The problem, though, is the way public debate addresses the issue, suggesting that this is a flaw rather than an ordinary feature of any test.
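To make the trade-off concrete, here is a rough sketch with made-up numbers. The population size, prevalence, sensitivity, and specificity below are illustrative assumptions, not figures for any real COVID test, and the function name is hypothetical.

```python
def expected_test_outcomes(population, prevalence, sensitivity, specificity):
    """Expected counts of each outcome when a whole population is tested.

    sensitivity: share of infected people the test correctly flags.
    specificity: share of healthy people the test correctly clears.
    All inputs are illustrative assumptions, not real test data.
    """
    infected = population * prevalence
    healthy = population - infected
    true_positives = infected * sensitivity
    true_negatives = healthy * specificity
    return {
        "true_positives": true_positives,
        "false_negatives": infected - true_positives,
        "true_negatives": true_negatives,
        "false_positives": healthy - true_negatives,
    }


if __name__ == "__main__":
    # Two hypothetical tests applied to the same population: the second
    # catches more infections but wrongly flags more healthy people.
    for label, sens, spec in [("cautious", 0.80, 0.99), ("sensitive", 0.95, 0.95)]:
        outcomes = expected_test_outcomes(100_000, 0.01, sens, spec)
        print(label, {name: round(count) for name, count in outcomes.items()})
```

Neither hypothetical test is "broken": moving from one to the other simply trades missed infections for needless isolations, which is the ordinary feature the debate tends to overlook.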
If this sounds familiar, we’ve heard similar discussions around the vaccines. A vaccine – any vaccine – protects the recipient up to a certain level because of the way the immune system produces antibodies in response; but it also provides an extra level of protection when everyone around you is vaccinated, because the virus will not find fertile ground for transmission. This means that vaccines that produce a weak immune response but that are widely adopted will massively reduce the number of infections. Once again, most of the public debate fails to show which variables in the process are independent and which depend on one another. I’ve seen some discussions implicitly suggesting that protection is a binary yes/no (when, in fact, the response can vary). I’ll never cease to appeal for better, more in-depth discussions around definitions and processes, and I’m really worried to see how superficial the public debate sometimes is.
On a positive note, I liked the NPR immunity simulator, linked below, because I think it captures this problem well.
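The point about weak-but-widely-adopted vaccines can be sketched with a textbook simplification. This is not how the NPR simulator works. It assumes a homogeneously mixing population where the effective reproduction number is R0 × (1 − coverage × efficacy), and the function name and every number below are hypothetical.

```python
def effective_reproduction_number(r0, coverage, efficacy):
    """Average number of new infections caused by one case.

    Simplified model (an assumption, not a real-world estimate):
    a fraction `coverage` of people is vaccinated, and the vaccine
    blocks a fraction `efficacy` of transmission among them.
    """
    if not (0 <= coverage <= 1 and 0 <= efficacy <= 1):
        raise ValueError("coverage and efficacy must be between 0 and 1")
    return r0 * (1 - coverage * efficacy)


if __name__ == "__main__":
    r0 = 3.0  # hypothetical basic reproduction number
    # A strong vaccine that few people take vs. a weaker one almost everyone takes.
    strong_but_rare = effective_reproduction_number(r0, coverage=0.40, efficacy=0.90)
    weak_but_common = effective_reproduction_number(r0, coverage=0.95, efficacy=0.60)
    print(f"strong vaccine, 40% coverage: {strong_but_rare:.2f}")
    print(f"weak vaccine, 95% coverage:   {weak_but_common:.2f}")
```

Under these assumptions the weaker vaccine with broad uptake slows the spread more than the stronger one with patchy uptake, because protection depends on efficacy and coverage together rather than on a yes/no property of the individual.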
Ofcom rejected, as I thought it would, part of my FOI request for broadband data. I’m still working on a response. I think that the decision not to release the data in bulk comes from an overly cautious (and, I believe, incorrect) reading of FOIA and of the nature of the data. I think that releasing the data in bulk would be extremely useful to a variety of parties (planners, digital advocates, analysts, researchers) and would ultimately benefit both Ofcom and the providers. I’ll keep you posted if there are any updates.
Till next week,
The Negro League Stars That MLB Kept Out — And Is Finally Recognizing
This is an incredible story I didn’t know: “Major League Baseball’s long-overdue decision to recognize the statistics of players from the Negro Leagues means that MLB’s record book is expanding.”
The gender gap in European governments and parliaments
“While the proportion of women in the executive and legislative bodies of EU countries has grown over the years, access to key positions of political influence is still limited — in some member states more than in others.“
From the European Data Journalism Network, as are the following couple of links – all their articles can be freely redistributed, which is pretty awesome.
The gender pay gap could close in 257 years
The COVID-19 pandemic seems to have made things worse in terms of pay equality. Moreover, “85 countries have had no female leaders in the past 50 years. Globally, only 55% of women are in the labour market, compared to 78% of men. 72 countries prohibit women to open a bank account or request a loan. In not a single country do men spend as much time as women in unpaid work.“
On a similar note, the same applies to migrants.
What Are the Vaccine Roadblocks Where You Live?
The New York Times takes a look at what is preventing Americans from getting a COVID-19 vaccine.
We learn that “pinning blame on this misinformation campaign alone ignores dozens of other systemic issues that continue to compromise the American health care system.“
What will it be like when we go back to the office?
I have to admit this is personally interesting. I actually like the ability to choose to work from home if I have an office. Interestingly, a couple of jobs ago I had the option to work from home, but rarely used it (3-4 days per year). This set of game-like visualizations by Reuters is giving me a few reflections on what going back might be like.
“Some people are sick of working from home, but the office they remember is very different from the one they will return to.” The point about the end of meetings is interesting: people used to walk out together and probably have more meaningful interaction then than during the meeting; this is now lost.
How Herd Immunity Works — And What Stands In Its Way
“What will it take to finally halt the spread of the coronavirus in the U.S.? To answer that question, we’ve created a simulation of a mock disease we’re calling SIMVID-19.” Nicely done simulator by NPR.
Become a GitHub Sponsor. It costs about the price of a coffee per month, and you’ll get an Open Data Rottweiler sticker (and other stuff).
If you’re a supporter of this newsletter, thanks a lot for your support. Share this e-mail with a friend, or via social media.
Most Americans Don’t Have Enough Flood Insurance for Climate Change
Interestingly, I was watching a BBC programme about the growing risk of flooding and the history of the 1953 North Sea flood when I came across this article by Bloomberg Graphics: “The real potential cost of floods exceeds federal insurance premiums four-and-a-half times over”.
The power switch: tracking Britain’s record coal-free run
Niko Kommenda at The Guardian created this excellent tracker of UK coal-powered energy in 2019, and I had totally missed it at the time. It is still updating.
Hat tip to the J++ newsletter for highlighting it.
The charity sector is, like, really London-centric
“Mapping the locations of every charity in England using publicly available data.” Some data wrangling of OpenCharities data. All source code is available.
Open Cell ID
“OpenCelliD is the world’s largest open database of cell towers with a license CC BY-SA 4.0. Data has full world coverage and freely available for download. This tabular data has ~40 million rows and 6 columns in it but only 3 columns (latitude, longitude, and type) are used in this visualization. ” Well, this is pretty interesting.
Tourism hotspots hit hard by Covid-19 jobs crisis
Analysis by a dream team (including, among others, Paul Bradshaw and Alex Homer) based at the BBC Shared Data Unit and in partnership with the BBC Local News Partnership shows, using welfare claims data from the Department for Work and Pensions, that areas reliant on tourism have been dealt quite a blow by the pandemic.
What’s brilliant about this article is that its authors have published their methodology (screenshot below) in full.
Cube Composer is a puzzle game inspired by functional programming. The source code is openly available on GitHub.
“Big ideas in machine learning, simply explained. The rapidly increasing usage of machine learning raises complicated questions: How can we tell if models are fair? Why do models make the predictions that they do? What are the privacy implications of feeding enormous amounts of data into models? This ongoing series of interactive, formula-free essays will walk you through these important concepts.” Highly visual, interactive essays.
A database of sonification works
What it says on the tin, with links. (via Duncan Geere’s newsletter)
A Data Pipeline is a Materialized View
“The ideas presented in this post are not new. But materialized views never saw widespread adoption as a primary tool for building data pipelines, likely due to their limitations and ties to relational database technologies.“
The Building Blocks of a Modern Data Platform
A beginner’s guide to the best of breed tools and capabilities for your Data Platform initiative
10 Years of Open-Source Visualization
Did I learn anything from D3.js?, asks D3’s creator Mike Bostock.
Map of my personal data infrastructure
This is a map of my personal data liberation infrastructure, with links to the scripts and tools used; and my blog posts elaborating on different parts of it. Intriguing or mad, you judge.
Making waterlines in locator maps move: an experiment
From Hans at Datawrapper, how to make an animated locator map.
Low Earth Orbit Visualization
quantum of sollazzo is supported by my GitHub Sponsors, and by ProofRed, who offer an excellent proofreading service. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.