Machine Learning in R using caret: GBM (Gradient Boosting Machine) vs. Random Forest

We try to beat a random forest model by using a Gradient Boosting Machine (GBM). To do so, we use Max Kuhn's great caret package, which, among other strengths,
1. simplifies cross validation and model comparison, and
2. provides a common interface to many machine learning algorithms (see the sketch below).
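
To illustrate that common interface, here is a minimal sketch (not the code from the video; the built-in mtcars data stands in for the video's dataset):

library(caret)

# One resampling setup, reused for every model in this description:
ctrl <- trainControl(method = "cv", number = 5)  # 5-fold cross validation

# The same train() call fits any supported algorithm; only the
# 'method' string changes (e.g. "lm", "rpart", "rf", "gbm"):
set.seed(42)
lm_fit <- train(mpg ~ ., data = mtcars, method = "lm", trControl = ctrl)
lm_fit$results  # cross-validated RMSE, R-squared and MAE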

We start out with a simple decision tree and compare its prediction accuracy to a random forest model.
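
In code, that first comparison might look like this sketch (same assumed data and ctrl object as above; method = "rpart" requires the rpart package, method = "rf" the randomForest package):

set.seed(42)
tree_fit <- train(mpg ~ ., data = mtcars, method = "rpart", trControl = ctrl)

set.seed(42)  # same seed -> same CV folds, so results are directly comparable
rf_fit <- train(mpg ~ ., data = mtcars, method = "rf", trControl = ctrl)

min(tree_fit$results$RMSE)  # single decision tree: the simple baseline
min(rf_fit$results$RMSE)    # random forest: typically a clearly lower RMSE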

The first attempt to beat the random forest model fails: our GBM model yields a higher RMSE. So we improve the model by tuning its parameters over a custom grid: maximum tree depth, number of trees, minimum terminal node size, and shrinkage. caret again saves a lot of time: given the custom grid and only a few lines of code, it trains and compares 400 models via cross validation and reports the best tuning parameters.
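
A custom grid for those four GBM parameters could look like the sketch below. The values are illustrative assumptions, chosen so that 4 x 5 x 4 x 5 = 400 combinations result, matching the 400 models mentioned above; the grid in the video may differ:

gbm_grid <- expand.grid(
  interaction.depth = c(1, 3, 5, 7),                # maximum tree depth
  n.trees           = seq(100, 500, by = 100),      # number of trees
  n.minobsinnode    = c(5, 10, 15, 20),             # minimum terminal node size
  shrinkage         = c(0.01, 0.05, 0.1, 0.2, 0.3)  # learning rate
)
nrow(gbm_grid)  # 400 parameter combinations

set.seed(42)
gbm_fit <- train(mpg ~ ., data = mtcars, method = "gbm",
                 trControl = ctrl, tuneGrid = gbm_grid,
                 verbose = FALSE)  # gbm prints a lot of output without this
gbm_fit$bestTune  # the winning parameter combination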

Using the optimized model parameters, we finally beat the random forest model, if only by a narrow margin.
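
caret's resamples() makes that final comparison explicit. Because both models above were trained with the same seed and resampling setup, their cross validation folds match and the per-fold RMSE values can be compared directly (sketch, continuing from above):

resamps <- resamples(list(RandomForest = rf_fit, GBM = gbm_fit))
summary(resamps)                  # per-fold RMSE, R-squared and MAE
bwplot(resamps, metric = "RMSE")  # box-and-whisker plot; lower RMSE wins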

Blog post for this video (German, with Google Translate option):
https://statistik-dresden.de/archives...

Contact me, e.g. to discuss (online) R workshops / trainings / webinars:

LinkedIn: wolfriepl
Twitter: @statistikindd
Xing: https://www.xing.com/profile/Wolf_Riepl
Facebook: statistikdresden

https://statistik-dresden.de/kontakt
R Workshops: https://statistik-dresden.de/r-schulu...
Blog (German, translate option): https://statistik-dresden.de/statisti...

Playlist: Music Chart History
