I was a bit under the weather this week, but next week we’re finally releasing my MOOC segment on deep learning! Let’s dive into some other machine learning first!
***Thanks to David for sending in the brain scan paper!***
Got this from a friend? Subscribe here!
I have spent so much energy on the ECMWF MOOC for ML in weather and climate prediction. So on Monday, you can finally see my segment on deep learning! Very excited!
And this one feels huge: I was featured in Interesting Engineering! They wrote a very extensive glowing review of my Skillshare course on AI art!
There are two kinds of people in the world: those who have a favourite spoon, and those who just realized there are people who have favourite spoons. I just bought some “long small spoons”, and I am so happy. I completely forgot they existed until I saw a TikTok.
I have been doing a lot of behind-the-scenes work.
I added Calls for Proposals on PythonDeadlin.es for:
Since I have been working so much on the ECMWF MOOC, I felt like adding a “teaching” section to my website. This section showcases all the different ways I have taught different topics.
The VS Code Twitter account made a top 10 list of extensions. They were more web design focused, so I shared my VS Code Extension Top 10, which is more data science focused.
Last week I asked, “How can you select the most important features in a dataset?”, and here’s the gist of it:
Selecting the most important features in a dataset is a critical step in building machine learning models. It helps reduce the data's dimensionality and improve model performance by focusing on the most relevant information.
Let's first look at some potential steps and then analyze how we select features on a volcano dataset.
Here are some ideas:
In a volcano dataset, some potential features might include the volcano's height, the frequency of earthquakes in the area, the temperature of nearby hot springs, and the composition of volcanic gases. Using the ideas outlined above, we would start with a cross-correlation matrix to find and eliminate correlated features.
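As a small sketch of that first step, here is how a correlation matrix could flag redundant feature pairs. The data below is synthetic and the feature names are only illustrative stand-ins for the volcano example; the threshold of 0.8 is an arbitrary choice you would tune for your own data:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the volcano features (illustrative only)
rng = np.random.default_rng(42)
n = 200
earthquakes = rng.poisson(5, n).astype(float)
df = pd.DataFrame({
    "height_m": rng.normal(2500, 400, n),
    "earthquake_freq": earthquakes,
    "spring_temp_c": rng.normal(60, 8, n),
    # gas composition deliberately made to track earthquake frequency
    "gas_so2_ppm": earthquakes * 3 + rng.normal(0, 1, n),
})

# Pairwise Pearson correlation matrix
corr = df.corr()

# Flag feature pairs whose absolute correlation exceeds a chosen threshold
threshold = 0.8
pairs = [
    (corr.columns[i], corr.columns[j], corr.iloc[i, j])
    for i in range(len(corr.columns))
    for j in range(i + 1, len(corr.columns))
    if abs(corr.iloc[i, j]) > threshold
]
print(pairs)
```

On this synthetic data, only the earthquake-frequency/gas-composition pair should cross the threshold, which is exactly the situation described next.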
We might find that the frequency of earthquakes and the composition of volcanic gases are correlated time series, which is good to remember for further analysis. We can eliminate correlated features by hand, or use Leave-One-Feature-Out with a cheaper model, like XGBoost, to rank the importance of features and weigh the correlated features against each other. Then we'll use the remaining features to build a predictive model, or further investigate the relationship between height, earthquakes, spring temperature, and volcanic activity, eliminating the gas composition. This also has the added benefit of eliminating a possibly expensive and complicated measurement from our data pipeline!
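A minimal sketch of Leave-One-Feature-Out could look like the following. The data is synthetic, the feature names are hypothetical, and scikit-learn's `GradientBoostingRegressor` is used here as a stand-in for XGBoost so the example has no extra dependencies; the idea is the same with any cheap model:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the volcano dataset (illustrative only)
rng = np.random.default_rng(0)
n = 300
feature_names = ["height_m", "earthquake_freq", "spring_temp_c", "gas_so2_ppm"]
X = rng.normal(size=(n, 4))
# Make gas composition nearly redundant with earthquake frequency
X[:, 3] = 0.95 * X[:, 1] + rng.normal(scale=0.1, size=n)
# Target depends on earthquake frequency and spring temperature
y = 2 * X[:, 1] + X[:, 2] + rng.normal(scale=0.5, size=n)

def cv_score(features):
    """Mean cross-validated R^2 for a cheap boosted-tree model."""
    model = GradientBoostingRegressor(random_state=0)
    return cross_val_score(model, features, y, cv=5, scoring="r2").mean()

baseline = cv_score(X)
print(f"baseline R^2: {baseline:.3f}")

# Leave-One-Feature-Out: retrain without each feature and compare scores
for i, name in enumerate(feature_names):
    score = cv_score(np.delete(X, i, axis=1))
    print(f"without {name}: R^2 = {score:.3f} (drop = {baseline - score:+.3f})")
```

A feature whose removal barely moves the score (here, the redundant gas measurement) is a candidate for elimination, while a large score drop marks a feature the model genuinely needs.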
The space race was big, but how many space launches were there actually?
This YouTube video is a great visualization of the different launches, their names, and the countries responsible for the rockets taking off.
[Source: YouTube]
Post them on Mastodon and tag me. I'd love to see what you come up with. Then I can include them in the next issue!