Gradient Boosting Regression Example

Gradient boosting regression example with the Auto MPG data set

This workflow uses gradient boosting regression to model the MPG (miles per gallon) of cars based on their HP (horsepower). It uses the Auto MPG Data Set from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Auto+MPG). To run this workflow, download the data set and specify its location in the File Reader node.

Loading the data
- Splitting the data into training and testing data sets
- Scatter plot of HP vs MPG

Training the gradient boosting tree learner
Each tree is a weak learner of maximum depth 2. The learning rate is 0.1. Only 20 iterations were used to avoid overfitting.

Testing the model
The model performance is evaluated by various metrics generated by the Numeric Scorer node. A scatter plot of the testing data, overlaid with the predicted outcome, is produced by the Python View node (with a Python script for visualization).

Workflow nodes and their annotations
- File Reader: MPG data set
- Scatter Plot: Plotting the raw data, MPG vs HP
- Column Rename: Renaming columns
- Partitioning: Train-test split, 70% - 30%
- Gradient Boosted Trees Learner (Regression): Learning gradient boosting regression, MPG vs HP
- Gradient Boosted Trees Predictor (Regression): Fitting the gradient boosting regression model
- Python View: Plotting data and predicted
- Numeric Scorer: Evaluating the model performance
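The training and testing steps described above can be illustrated outside KNIME with a minimal, from-scratch Python sketch. It is not the implementation used by the Gradient Boosted Trees nodes. It assumes a squared-error loss, and it uses synthetic HP/MPG values as a stand-in because the Auto MPG file has to be downloaded separately. The function names (`fit_tree`, `fit_gbr`, `score`) and the synthetic formula are hypothetical. Only the settings come from the workflow: trees of maximum depth 2, a learning rate of 0.1, 20 iterations, and a 70% - 30% split.

```python
import random

def fit_tree(xs, targets, depth):
    """Fit a one-feature regression tree. A leaf is a float, a split is a tuple."""
    mean = sum(targets) / len(targets)
    values = sorted(set(xs))
    if depth == 0 or len(values) < 2:
        return mean  # leaf: predict the mean target
    best = None  # (sum of squared errors, threshold)
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        if not left or not right:
            continue
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t)
    if best is None:
        return mean
    t = best[1]
    left = [(x, r) for x, r in zip(xs, targets) if x <= t]
    right = [(x, r) for x, r in zip(xs, targets) if x > t]
    return (t,
            fit_tree([x for x, _ in left], [r for _, r in left], depth - 1),
            fit_tree([x for x, _ in right], [r for _, r in right], depth - 1))

def predict_tree(tree, x):
    """Walk down the splits until a leaf value is reached."""
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x <= t else right
    return tree

def fit_gbr(xs, ys, n_trees=20, learning_rate=0.1, max_depth=2):
    """Boosting for squared error: each tree is fitted to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    trees = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_tree(xs, residuals, max_depth)
        trees.append(tree)
        preds = [p + learning_rate * predict_tree(tree, x) for p, x in zip(preds, xs)]
    return base, trees, learning_rate

def predict_gbr(model, x):
    base, trees, learning_rate = model
    return base + learning_rate * sum(predict_tree(t, x) for t in trees)

def score(y_true, y_pred):
    """Common regression metrics: R^2, MAE, MSE, RMSE."""
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    mae = sum(abs(a - b) for a, b in zip(y_true, y_pred)) / n
    mse = ss_res / n
    return {"r2": 1 - ss_res / ss_tot, "mae": mae, "mse": mse, "rmse": mse ** 0.5}

# Synthetic stand-in data (assumption): NOT the Auto MPG values.
random.seed(0)
hp = [random.randint(50, 220) for _ in range(300)]
mpg = [1800 / h + 8 + random.gauss(0, 2) for h in hp]

# 70% - 30% train-test split after shuffling.
rows = list(zip(hp, mpg))
random.shuffle(rows)
cut = len(rows) * 7 // 10
train, test = rows[:cut], rows[cut:]

model = fit_gbr([x for x, _ in train], [y for _, y in train])
metrics = score([y for _, y in test], [predict_gbr(model, x) for x, _ in test])
print(metrics)
```

With a learning rate of 0.1, each tree corrects only a tenth of the remaining residual, so 20 iterations leave the model deliberately short of the training data. That matches the workflow's stated aim of avoiding overfitting. To try the sketch on the real data, replace the synthetic `hp` and `mpg` lists with the corresponding columns of the downloaded file.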
