Statistical Machine Learning plays a central role in the emerging field of Data Science, with supervised learning as one of its main pillars. Among supervised learning methods, one of the most powerful paradigms from the point of view of predictive performance is ensemble learning: instead of seeking to select a single best model out of many, one combines (aggregates) many good candidate models, typically through model averaging (regression) or majority voting (classification).
Ensemble learning methods typically outperform their single-model counterparts, largely because a well-constructed aggregation of suitable base learners tends to reduce the variance of the resulting estimator, thereby improving predictive performance. In this lecture I will present some of the most popular and commonly used ensemble learning methods, with examples in R.
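The variance-reduction effect can be seen in a minimal simulation (an illustrative sketch, not one of the lecture's examples): averaging B independent noisy estimators of the same quantity shrinks the variance roughly by a factor of B.

```r
# Illustrative sketch: averaging reduces variance.
# Each base estimator is modeled as the true value plus independent
# Gaussian noise (sd = 1); the ensemble averages B such estimators.
set.seed(42)
B     <- 100    # number of base estimators in the ensemble
n_rep <- 5000   # Monte Carlo repetitions

single_est   <- rnorm(n_rep, mean = 0, sd = 1)
ensemble_est <- replicate(n_rep, mean(rnorm(B, mean = 0, sd = 1)))

var(single_est)    # roughly 1
var(ensemble_est)  # roughly 1/B
```

For independent base estimators the ensemble variance is exactly 1/B of the single-estimator variance; in practice base learners are correlated, which is why methods like bagging and random forests work to decorrelate them.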