JKISeasor2-19_tomljh_ver5

Challenge 19: Dealing with Diabetes

Level: Easy or Medium

Description: In this challenge you will take the role of a clinician and check if machine learning can help you predict diabetes. You should create a solution that beats a baseline accuracy of 65%, and also works very well for both classes (having diabetes vs not having diabetes). We got an accuracy of 77% with a minimal workflow. If you'd like to take this challenge from easy to medium, try implementing:

* sampling techniques
* feature importance calculation
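As a rough illustration of the two optional tasks above, the sketch below shows one possible sampling technique (random oversampling of the minority class) and one possible feature importance calculation (permutation importance). Neither is taken from the workflow itself: the function names, the toy rows, and the `toy_predict` stand-in for a trained model are all hypothetical, and only the Python standard library is used. Oversampling would normally be applied to the training partition only.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    is as frequent as the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) equals the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, n_features, seed=0):
    """For each feature, shuffle its column and report the drop in accuracy."""
    rng = random.Random(seed)
    base = accuracy(predict, rows, labels)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [list(r[:j]) + [v] + list(r[j + 1:])
                    for r, v in zip(rows, column)]
        importances.append(base - accuracy(predict, shuffled, labels))
    return importances


# Toy data (hypothetical): two features, string labels as in the workflow.
rows = [[0.9, 1.0], [0.8, 0.0], [0.2, 1.0], [0.1, 0.0], [0.3, 1.0], [0.4, 0.0]]
labels = ["1", "1", "0", "0", "0", "0"]


def toy_predict(row):
    """Stand-in for a trained model: looks at the first feature only."""
    return "1" if row[0] > 0.5 else "0"


balanced_rows, balanced_labels = random_oversample(rows, labels, seed=42)
importances = permutation_importance(toy_predict, rows, labels,
                                     n_features=2, seed=42)
```

Because `toy_predict` ignores the second feature, shuffling that column cannot change its accuracy, so the importance of that feature is zero. With a real model, `predict` would wrap the trained classifier instead.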

Workflow annotations:

* Find Seed. Tip: Due to the small dataset and fast calculation using the method, I generated 50 seeds. Best seed: 509909
* Final Simplification. Ref: https://www.kaggle.com/code/yasinncndr/diabetes-prediction-feature-engineering-and-eda#Feature-Engineering
* read data - diabetes.csv
* Set the target variable to string type (outcome: int -> str)
* EDA
* Stratified sampling 70/30, seed = 594675
* Column labels: Glucose, BloodPressure, SkinThickness, Insulin, BMI, Age

Nodes used in the workflow:

* CSV Reader
* Number To String
* Data Explorer
* Partitioning
* Random Forest Learner
* Random Forest Predictor
* Scorer (JavaScript)
* Data Generator
* Math Formula
* Table Row To Variable Loop Start
* Table Row To Variable
* Variable Loop End
* Box Plot
* Rule Engine (6 instances)
* Missing Value
* Numeric Outliers
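The stratified 70/30 partitioning and the search over 50 seeds mentioned in the annotations can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the workflow's actual logic: `stratified_split`, `search_seeds`, and `toy_score` are hypothetical names, the label list is made up, and `toy_score` is an arbitrary placeholder where a real run would train a model on the training indices and score it on the test indices.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split row indices so each class keeps roughly the same
    proportion in the training and test partitions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)


def search_seeds(labels, evaluate, n_seeds=50, master_seed=0):
    """Try n_seeds random partitioning seeds and keep the one whose
    split gives the highest evaluate(train, test) score."""
    rng = random.Random(master_seed)
    best_seed, best_score = None, float("-inf")
    for _ in range(n_seeds):
        seed = rng.randrange(1_000_000)
        train, test = stratified_split(labels, seed=seed)
        score = evaluate(train, test)
        if score > best_score:
            best_seed, best_score = seed, score
    return best_seed, best_score


def toy_score(train, test):
    """Placeholder score (fraction of even test indices); a real run
    would return model accuracy on the test partition."""
    return sum(i % 2 == 0 for i in test) / len(test)


# Made-up labels: 100 rows of class "0" and 50 rows of class "1".
labels = ["0"] * 100 + ["1"] * 50
train_idx, test_idx = stratified_split(labels, seed=594675)
best_seed, best_score = search_seeds(labels, toy_score, n_seeds=50)
```

A score chosen as the best of 50 splits tends to be optimistic, so it may be worth confirming the result on a split that was not part of the search.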
