The data newsletter by @puntofisso.
Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. I am Giuseppe Sollazzo, or @puntofisso. I’ve been sending this newsletter since 2012 to be a summary of all the articles with or about data that captured my attention over the previous week. The newsletter is and will always (well, for as long as I can keep going!) be free, but you’re welcome to become a friend via the links below.
We have more sponsored content by Ed Freyfogle, organiser of location-based service meetup Geomob, co-host of the Geomob podcast, and co-founder of OpenCage, who has offered to introduce a set of points around the topic of geodata. His first entry starts a few paragraphs below on building or buying a geocoder.
The most clicked link last week was the mind-boggling visualization of time perception.
’Til next week,
You're receiving this email because you subscribed to Quantum of Sollazzo, a weekly newsletter covering all things data, written by Giuseppe Sollazzo (@puntofisso). If you have a product or service to promote and want to support this newsletter, you can sponsor an issue.
“…from the Milky Way to the edge of what can be seen”.
“Satellite data hints at the scale of their deception”. A brilliant piece by The Economist, starting with that trope about Mussolini and trains.
“After almost a year of war with Russia, EU citizens still predominantly support Ukraine and the EU’s approach to the crisis.”
“A rapid string of punishing storm systems, known as atmospheric rivers, has brought extreme amounts of rain and snow to California during the past weeks, but the sudden deluge has not made up for years of ongoing drought”, the New York Times reports.
I absolutely love these hand-drawn-like charts by Axios.
MIT-licensed system to create presentation slides, specifically aimed at developers. It uses Markdown.
“Frictionless is an open-source toolkit that brings simplicity to the data experience - whether you’re wrangling a CSV or engineering complex pipelines.”
It’s supported by the Sloan Foundation and the Open Data Institute.
An “interactive SVG Reference” that will clarify a few SVG concepts.
“I thought it would be really fun to test out ChatGPT’s code generation capabilities - focusing on three primary questions:
Can ChatGPT write ggplot2 code which seems to capture the semantic meaning requested in the prompt?
Does the generated code actually run and build the charts requested?
Can ChatGPT translate code between R and ggplot2 and Python and Seaborn?”
This Observable notebook is a handy tool that allows you to explore different colouring schemes for your choropleths.
“anywidget is a new Python library that greatly simplifies creating and publishing custom Jupyter Widgets. Unlike the traditional (cookiecutter) approach, with anywidget you 1) avoid fiddling with build steps and bundlers, 2) can prototype widgets from within a notebook, and 3) get a modern front-end developer experience.”
A set of handwritten notes from a data scientist on general data science, machine learning, statistics, deep learning, image processing, and general data structures and algorithms.
“How to use the full capabilities of Matplotlib to tell a more compelling story.”
Data analyst Milos Popovic explains how to use R to create beautiful maps that show population density.
Build or Buy? Should you try to create your own geocoder?
Welcome to part four of our series on geocoding.
Given freely available open-source software, and open data like OpenStreetMap, should you run your own geocoder? Or should you pay a service like ours to host the geocoding software for you? The whole point of open data is that you can do it yourself, right?
The short answer is that yes, you can run, or even write, your own geocoder. Unique technical requirements may mean it makes sense to craft your own custom service, but most people prefer to leave it to the experts and get on with their real work.
Our geocoding API aggregates many different open data sources and provides enterprise-level reliability. One factor we see many people overlook: setting up the software is one thing, keeping the underlying data current is another. Put another way: building is easy, maintaining is hard. OSM alone gets 4-5 million edits per day. Still, as a developer myself and long-time OSM contributor, I understand the inclination to get your hands dirty. Hopefully it helps put you at ease to know that we’re doing our part to give back to the open data and open source geo software community.
Finally, rest easy knowing that if you ever need to, the data and code are all there for you to dive into. That’s the real power of open-source and open data.
Have a project that will need geocoding? See our geocoding buyer’s guide for an overview of all the factors to consider when choosing between geocoding services.
“Drive more value for your organization by adopting an action-oriented mindset”
ML engineer Eugene Yan: “How can we improve a machine learning project’s chance of success? Over the years, I’ve explored various mechanisms in both my own projects and those of my team members. Most people who tried these mechanisms ended up adopting them in future projects. I’m sharing a few here that I hope will help you in your projects too.
While these mechanisms were developed with machine learning projects in mind, with a few tweaks, they can be applied to other technical endeavors too.”
“A Visual Survey of Text Visualization Techniques (IEEE PacificVis 2015 short paper)”
What you can do with Observable in a browser these days is absolutely outstanding.
Yes, another Observable notebook (I received Observable’s newsletter and went on a spree trying their most featured notebooks). I really like the ease of creating stories that are both interactive and customisable in the way they show the results of data analyses.
From this original article in Ukrainian, which sadly fails to get automatically translated into English.
From this tweet, a chart of the worst managed towns in France, based on levels of debt and taxation.
A long list of transformer models, including BERT, DALL-E, GPT, and more.
Quantum of Sollazzo is supported by ProofRed’s excellent proofreading. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.
[*] this is for all $5+/month GitHub sponsors. If you are one of those and don’t appear here, please e-mail me.