If you go back and analyse how many newsletters were late and which ones mention that I’m stressed, I think there’s some overlap. But regardless, there has been some amazing machine learning going on, so let’s check it out for a break!
Also, welcome to the 100s of new subscribers! We’re over 850 now!
Got this from a friend? Subscribe here!
I’m on the road again. This time I’ll be in Singapore and Greece for work. I’m giving a big talk here about the work ECMWF does in machine learning for numerical weather prediction. Unsurprisingly, I have some serious imposter syndrome going on.
This goes hand in hand with my habit of getting excited about too many cool things and saying yes to all of them. Within the coming month, I have multiple major deadlines for different projects, and I have no idea how I’m supposed to meet them all. So the next few weeks will be fairly intense.
After flying all day today, I have to give another shoutout to my Sony noise-cancelling headphones for keeping me sane during such a noisy endeavour. The only drawback: when I was watching movies and the PA came on, my head almost exploded from the max volume they force on you.
Two of my LinkedIn posts went pretty viral: one about ar5iv, the alternative HTML5 rendering of arXiv papers, and another about the NN-SVG web tool for neural network architectures. I shared both of these with you 8 months ago, but it looks like the ar5iv one just cracked 600,000 views, which is pretty neat.
In case you missed it, I wrote an article about my favourite VSCode Extensions and it’s still quite popular.
Last week I asked, “Where do you normally obtain data for your analysis?”, and here’s the gist of it:
It’s not uncommon that we have to collect data ourselves for scientific analysis. The dirty secret about “getting labels” for data is that someone has to sit down and label it. High-quality data is usually extremely important for better models, so we end up labelling data ourselves.
But we would be remiss not to check whether datasets already exist and whether they’re publicly available. In weather, for example, we have WeatherBench as a fantastic benchmark dataset.
I usually start my search on Kaggle, for the simple reason that datasets there often come with notebooks, which is a double win for me. Then I often use Google Dataset Search to find other supplemental data.
There is also OpenML, the Amazon AWS Registry and this Awesome List of Public Datasets.
The Roman Empire fell and elephants basically disappeared from Western Europe.
So we had to make scientific sketches based on stories and other sketches, which ended up as a hilarious case of the Telephone Game. I cropped a small part of it, so definitely check out the full chart below. Each elephant is also clickable, although many links are too old and don’t work anymore, I’m afraid.
I think there are some really adorable specimens in here!
Source: Uli Westphal
Post them on Twitter and tag me. I’d love to see what you come up with. Then I can include them in the next issue!