6 episodes

Short, simple summaries of machine learning topics, to help you prepare for exams, interviews, reading the latest papers, or just a quick brush up. In less than two minutes, we'll cover the most obscure jargon and complex topics in machine learning.

For more details, including small animated presentations, please visit erikpartridge.com.

Please do join the conversation on Twitter at #mlbytes for corrections, conversations, support, and more.

Machine Learning Bytes Erik Partridge

- Technology

- 31 JUL 2019
K-Fold Cross Validation

K-fold cross validation is the practice of splitting a data set into k roughly equal pieces (folds), training our model on k-1 of the folds, and validating it on the remaining one, repeating the process so that each fold serves as the validation set exactly once. This is generally considered a best practice, or at least good practice, in machine learning, as averaging the k validation scores gives a more reliable characterization of your model's performance than a single train/validation split.
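
As a rough illustration of the splitting step, here is a minimal sketch in plain Python. The function name `k_fold_indices` and its parameters are made up for this example and do not come from the episode or from any library; in practice you would likely reach for a library implementation instead.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs, one per fold (hypothetical helper)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

# Each sample lands in exactly one validation fold.
for fold, (train_idx, val_idx) in enumerate(k_fold_indices(n_samples=10, k=3)):
    print(fold, len(train_idx), len(val_idx))
```

You would train on `train_idx`, score on `val_idx`, and average the k scores at the end.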

Machine Learning Mastery has a great post on the topic.

---

Send in a voice message: https://podcasters.spotify.com/pod/show/mlbytes/message
- 55 sec
- 30 JUL 2019
Stratified Sampling

Stratified sampling provides a mechanism by which to split a larger dataset into smaller pieces. While purely random approaches are commonly used, stratified sampling keeps the distribution of a chosen attribute, typically the class label, relatively consistent across the pieces. This can remove variation between splits that you actually wanted to keep, but it can also beneficially reduce the variance of your estimates.
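
To make that concrete, here is a small sketch of a stratified train/test split in plain Python, assuming the attribute we stratify on is the class label. The name `stratified_split` and the `test_fraction` parameter are invented for this example.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.25, seed=0):
    """Split indices so each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)  # random choice *within* each class
        n_test = round(len(idxs) * test_fraction)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return train_idx, test_idx

# An imbalanced toy dataset: 80 examples of class "a", 20 of class "b".
labels = ["a"] * 80 + ["b"] * 20
train_idx, test_idx = stratified_split(labels)
```

With a purely random split, the rare class "b" could easily end up over- or under-represented in the test set; here its share is fixed by construction.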

- 59 sec
- 25 JUL 2019
Boosting

Boosting is also an ensemble meta-algorithm, like bagging. However, in boosting we train a large number of weak but specialized learners, typically one after another so that each can focus on the examples its predecessors got wrong, and combine them weighted according to their strengths. For more information on boosting, consider watching the University of Washington's great lecture on the topic.
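
One well-known boosting algorithm is AdaBoost. The sketch below is a minimal version of it on one-dimensional data, using decision stumps (single thresholds) as the weak learners; it is one possible illustration, not necessarily the variant discussed in the episode. All function names and the toy data are assumptions made for this example.

```python
import math

def stump_predict(x, threshold, polarity):
    """A decision stump: predicts `polarity` at or above the threshold, else its opposite."""
    return polarity if x >= threshold else -polarity

def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(xs, ys, n_rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n  # start with uniform example weights
    ensemble = []
    for _ in range(n_rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # the learner's "strength"
        ensemble.append((alpha, threshold, polarity))
        # Up-weight the examples this stump got wrong, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data that no single threshold can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, n_rounds=3)
```

No single stump classifies this toy data correctly, but the weighted vote of three of them does.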

- 51 sec
- 24 JUL 2019
Bagging

Bagging is an ensemble meta-algorithm. Basically, we take some number of estimators (usually dozens-ish) and train each of them on a random subset of the training data, typically a bootstrap sample drawn with replacement. Then, we average the predictions of the individual estimators to make the resulting prediction. While this reduces the variance of your predictions (indeed, that is the core purpose of bagging), it may come at the cost of some added bias.
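
Here is a minimal sketch of that recipe in plain Python, assuming bootstrap sampling and a simple least-squares line as the base estimator. The helper names (`fit_line`, `bagged_fit`, `bagged_predict`) and the toy data are invented for illustration.

```python
import random

def fit_line(points):
    """Least-squares fit of y = slope * x + intercept to (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:  # degenerate sample: fall back to predicting the mean
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return slope, mean_y - slope * mean_x

def bagged_fit(points, n_estimators=25, seed=0):
    """Train each estimator on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_line(rng.choices(points, k=len(points))) for _ in range(n_estimators)]

def bagged_predict(models, x):
    """Average the individual estimators' predictions."""
    return sum(slope * x + intercept for slope, intercept in models) / len(models)

# Noisy samples around y = 2x + 1 (toy data for the example).
noise = random.Random(1)
points = [(float(x), 2.0 * x + 1.0 + noise.gauss(0, 0.5)) for x in range(20)]
models = bagged_fit(points)
print(bagged_predict(models, 10.0))
```

A linear fit is already a low-variance estimator, so the benefit here is small; bagging tends to help most with high-variance base estimators such as deep decision trees.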

For a more academic basis, see slide #13 of this lecture by Joëlle Pineau at McGill University.

- 50 sec
- 23 JUL 2019
Bias, Variance, and the Bias-Variance Tradeoff

The bias-variance trade-off is a key problem in your model search. Bias measures how far, on average, your model's predictions are from the true values. Low bias means the model can capture the salient details of a problem, and generally correlates with more complex algorithms, but it comes at the cost of variance. Variance is the degree to which your estimator's predictions for a given input stray from their mean when the model is trained on different samples of the data. High variance means that a model has overfit, learning the noise in the training set rather than the underlying problem. Most commonly, high bias = underfitting, high variance = overfitting.
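
For squared error, the trade-off can be written down exactly. The notation below is an assumption made for this note rather than something from the episode: the data follow $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, $\hat{f}$ is the trained model, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\!\left[\big(y - \hat{f}(x)\big)^2\right]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Making a model more flexible usually shrinks the first term and grows the second, which is why the two have to be balanced.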

Please consider joining the conversation on Twitter. I also blog from time to time. You can find me at erikpartridge.com.

For more academic sources, consider reading the slides from this fantastic Carnegie Mellon lecture.

- 1m
- 20 JUL 2019
Empirical Risk Minimization

The concept of empirical risk minimization drives modern approaches to training many machine learning algorithms, including deep neural networks. Today's thirty-second summary covers the basics of what you need to know, but the concept goes well beyond the simple case we discuss. If you are looking to discuss the topic further, please consider joining the conversation on Twitter.
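
In its simplest form, empirical risk minimization means picking, from a set of candidate models, the one with the lowest average loss on the training data. The sketch below shows that idea with a squared loss and a handful of candidate slopes; the function names, the data, and the candidate set are all made up for illustration.

```python
def empirical_risk(predict, data, loss):
    """Average loss of `predict` over the observed (x, y) pairs."""
    return sum(loss(predict(x), y) for x, y in data) / len(data)

def squared_loss(prediction, target):
    return (prediction - target) ** 2

# Toy training data, roughly following y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

# Hypothesis class: lines through the origin with slopes 0.0, 0.5, ..., 4.0.
candidate_slopes = [0.5 * i for i in range(9)]

# ERM: choose the hypothesis with the smallest empirical risk.
best_slope = min(
    candidate_slopes,
    key=lambda w: empirical_risk(lambda x: w * x, data, squared_loss),
)
print(best_slope)
```

Real training replaces the brute-force search with an optimizer such as gradient descent, but the quantity being minimized is the same kind of average.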

Lecture notes from Carnegie Mellon University (no affiliation).

- 38 sec