
kn_example_ml_variable_importance_re_engineering

Re-Engineering the variable importance by feeding the Score (of a binary classification) into a numeric model that provides variable importance


This is where the re-engineering of the feature importance happens: the score from the original model is fed into a regression model as the target (without the original target, of course), and the feature importance is then calculated.

Workflow outline (from the annotations):

- train.parquet: the census-income dataset.
- Train the original binary model (H2O Gradient Boosting Machine) and store it as gbm_model.zip (MOJO).
- Score the original data with your original model, i.e. just apply the original model to the (yes) training data.
- This is where the 'magic' happens: exclude *Target* and feed the score [P(Target=1)] as a numeric target into a model that gives us variable importance (H2O Gradient Boosting Machine Learner, Regression).
- Compare Percentage_original vs. Percentage_reengineered and Rank_original vs. Rank_reengineered.
- Try something similar with XGBoost Variable Importance (not entirely convincing) and with an H2O Random Forest Learner (Regression) (not so impressive).

Nodes used: Parquet Reader, H2O Gradient Boosting Machine Learner, H2O Gradient Boosting Machine Learner (Regression), H2O Local Context, Table to H2O, H2O Model to MOJO, H2O MOJO Writer, H2O MOJO Reader, H2O MOJO Predictor (Classification), Column Rename, Joiner, Counter Generation, Numeric Scorer, H2O Numeric Scorer, Decision Tree Learner, Decision Tree Predictor, XGBoost Variable Importance, H2O Random Forest Learner (Regression), Column Filter.
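The same idea can be sketched outside KNIME. The following is a minimal Python sketch using scikit-learn gradient boosting on synthetic data (an assumption — the workflow itself uses H2O GBM nodes and the census-income dataset): train a binary classifier, score the training data, then fit a regression model on the score (with the original target excluded) and compare the two sets of variable importances by percentage and rank.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              GradientBoostingRegressor)

# Stand-in for train.parquet / census-income: synthetic binary data
# (an assumption for a self-contained example).
X, y = make_classification(n_samples=2000, n_features=8, n_informative=4,
                           random_state=42)

# 1) Train the original binary model.
clf = GradientBoostingClassifier(random_state=42).fit(X, y)

# 2) Score the training data with the original model: P(Target=1).
score = clf.predict_proba(X)[:, 1]

# 3) The 'magic': drop the original target and fit a regression model
#    on the score; its variable importance is the re-engineered one.
reg = GradientBoostingRegressor(random_state=42).fit(X, score)

# 4) Compare original vs. re-engineered importance as percentages and ranks.
pct_original = 100 * clf.feature_importances_ / clf.feature_importances_.sum()
pct_reengineered = 100 * reg.feature_importances_ / reg.feature_importances_.sum()
rank_original = (-pct_original).argsort().argsort() + 1       # 1 = most important
rank_reengineered = (-pct_reengineered).argsort().argsort() + 1

rho, _ = spearmanr(rank_original, rank_reengineered)
print(f"Spearman rank correlation of importances: {rho:.2f}")
```

A high rank correlation indicates the regression model on the score recovers essentially the same importance ordering as the original classifier, which is the point the workflow makes by joining the Percentage/Rank columns and comparing them.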


