A hands-on explanation of Gradient Boosting Regression
Introduction
One of the most powerful ways of training models is to train multiple models and aggregate their predictions. This is the main concept of Ensemble Learning. While many flavours of Ensemble Learning exist, some of the most powerful are Boosting Algorithms. In my previous article, I broke down one of the most popular Boosting Algorithms: Adaptive Boosting. Today, I want to talk about its equally powerful twin: Gradient Boosting.
Boosting & Adaptive Boosting vs Gradient Boosting

Boosting refers to any Ensemble Method that can combine several weak learners (predictors with poor accuracy) into a strong learner (a predictor with high accuracy). The idea behind boosting is to train models sequentially, each one trying to correct its predecessor.
An Overview Of Adaptive Boosting
In Adaptive Boosting, each training instance is first assigned a weight, and a weak learner is trained on the weighted data. Based on the predictor's performance, measured by its weighted error rate, the predictor is then given its own separate weight. The higher the accuracy of the predictor, the higher its weight, and the more "say" it will have on the final prediction.
Once the predictor has made predictions, AdaBoost looks at the misclassified instances and boosts their instance weights. After normalising the instance weights so that they sum to 1, a new predictor is trained on the re-weighted data, and the process is repeated until a desirable result is reached or the maximum number of predictors is hit.
The final classification is done by taking a weighted vote. In other words, if we were predicting heart disease for a patient, and 60 stumps predicted 1 while 40 predicted 0, but the predictors voting 0 had a higher cumulative weight (i.e. those predictors had more "say"), then the final prediction would be 0.
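
To see this in practice, here is a minimal sketch using scikit-learn's AdaBoostClassifier on a toy dataset. The dataset and hyperparameters are illustrative assumptions, not recommendations:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Toy classification dataset, purely for illustration
X, y = make_classification(n_samples=500, random_state=42)

# 100 decision stumps trained sequentially; each one up-weights the
# instances its predecessors misclassified, and the final prediction
# is a weighted vote across all stumps.
ada_clf = AdaBoostClassifier(n_estimators=100, learning_rate=0.5, random_state=42)
ada_clf.fit(X, y)

print(ada_clf.predict(X[:5]))
```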
Gradient Boosting
In contrast to Adaptive Boosting, instead of sequentially boosting the weights of misclassified instances, Gradient Boosting fits each new predictor to its predecessor's residuals. Woah, hold it. What?
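
Before unpacking that, here is a rough sketch of the idea with plain DecisionTreeRegressors, fitting each new tree to the residual errors of the previous one. The data, tree depths, and number of stages are made up for illustration:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy regression data (illustrative only)
rng = np.random.RandomState(42)
X = rng.rand(200, 1)
y = 3 * X[:, 0] ** 2 + 0.05 * rng.randn(200)

# Stage 1: fit a weak learner to the original targets
tree1 = DecisionTreeRegressor(max_depth=2, random_state=42)
tree1.fit(X, y)

# Stage 2: fit the next learner to the residuals of the first
residuals1 = y - tree1.predict(X)
tree2 = DecisionTreeRegressor(max_depth=2, random_state=42)
tree2.fit(X, residuals1)

# Stage 3: fit a third learner to the remaining residuals
residuals2 = residuals1 - tree2.predict(X)
tree3 = DecisionTreeRegressor(max_depth=2, random_state=42)
tree3.fit(X, residuals2)

# The ensemble prediction is the sum of the trees' predictions
X_new = np.array([[0.4]])
y_pred = sum(tree.predict(X_new) for tree in (tree1, tree2, tree3))
print(y_pred)
```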