Hey folks 🎉
we've had such lovely weather this week. I hope you had a good one and enjoyed some time outside!
The Latest Fashion
- Python is getting a switch-case equivalent! PEP 622 proposes structural pattern matching via a new match-case statement. I always found it weird that Python didn't have one, so there we go!
- I've quite enjoyed the Kaggle Learn section. Their newest addition is an intro to AI Ethics course, a lightweight way to get an understanding of the important considerations when applying machine learning.
- This deep-learning enabled document parser has really impressed me lately. What was simply impossible four years ago is now available as open-source software.
My Current Obsession
NASA flew a helicopter on another planet!
Here you can see an image of the little guy jumping over its own shadow, and here's a video of it taking off! Building and testing an aircraft for another planet is such a feat of engineering. Think about it: have you ever flown (and crashed, because of course you did) a drone here on Earth? Now we're doing this with minutes of signal latency, in an atmosphere with less than one per cent of Earth's density. You'd basically have to climb to a height of 35 km in our atmosphere to get similar conditions, roughly three times the altitude planes usually cruise at.
Thing I Like
Still obsessed with my new bike. I've been riding around Edinburgh almost every day and finally getting away from my flat a bit for the first time in a year. It's magic for my mind. 10/10, do recommend.
Hot off the Press
Would you believe it? I made another YouTube video! This time I talk about my most productive VS Code extensions. I have also written an accompanying blog post that gives some context for those who prefer reading: https://dramsch.net/posts/my-10-favourite-vs-code-extensions/.
Machine Learning Insights
Regression problems often depend on minimising some metric like the mean squared error. While this is a viable loss function, it can be very difficult to gauge the absolute performance of a regression model from it. Here's where the Coefficient of Determination, or R^2 score, comes in. It's a rather simple concept to those of you with a more statistical background, but worth investigating regardless. A mean squared error of 30,000 or 3,000 tells us little unless we know the scale and noise level of the data. The R^2 score, however, always returns a value of at most 1, where 1 describes a model that predicts the data perfectly. Generally, it's a "higher is better" situation, and the R^2 score can even be negative: that happens when a model describes the unseen data worse than simply predicting the mean.
The calculation takes two main concepts. The first is the total sum of squares (TSS), which is proportional to the variance of the data. The second is the good ol' residual sum of squares (RSS), which is proportional to the mean squared error of the predictions. Then the R^2 score is R^2 = 1 - RSS / TSS, so we relate the residual error of the model's predictions to the variance of the data (ish). Measuring how much of the data's variance around its mean a model explains is a simple and very intuitive way to describe the model's capability to explain the data.
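As a quick sanity check, the calculation above is only a few lines of NumPy (the `r2_score` helper here is my own sketch; `sklearn.metrics.r2_score` computes the same thing):

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - RSS / TSS."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rss = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    tss = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - rss / tss

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(r2_score(y_true, y_pred))  # about 0.949
```

A perfect prediction makes RSS zero, so the score is exactly 1; predicting the mean everywhere makes RSS equal TSS, so the score is 0.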
So how do we get negative values? The simplest way would be to "forget" the y-intercept in a linear model fitted to data with an offset. For your (actually well-implemented) models, a negative score is usually a sign that the mean of the data explains the outcomes better than your model does, a clear indicator that the model does not capture the complexity of the data accurately.
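To see a negative score in action, here's a small sketch with made-up data: fitting a line through the origin to data that clearly has an offset.

```python
import numpy as np

# Made-up data with a clear offset: y = x + 10.
x = np.arange(10, dtype=float)
y = x + 10.0

# "Forget" the y-intercept: least-squares fit of y = w * x through the origin.
w = (x @ y) / (x @ x)
y_pred = w * x

# R^2 = 1 - RSS / TSS
rss = np.sum((y - y_pred) ** 2)
tss = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - rss / tss
print(r2)  # about -2.5: just predicting the mean of y would do better
```

The forced line through the origin misses the offset so badly that its squared error exceeds the data's own variance, which is exactly when R^2 dips below zero.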
Question of the Week
- Is the loss surface of neural networks convex, and what implications does that have?
Send me your answers or post them on Twitter and tag me. I'd love to see what you come up with, and then I can include them in the next issue!
Tidbits from the Web