
JKISeasor2-19_tomljh_ver3


Challenge 19: Dealing with Diabetes

Level: Easy or Medium

Description: In this challenge you will take the role of a clinician and check whether machine learning can help you predict diabetes. Your solution should beat a baseline accuracy of 65% and also perform well for both classes (having diabetes vs. not having diabetes). We reached an accuracy of 77% with a minimal workflow. To take this challenge from easy to medium, try implementing:

* sampling techniques
* feature importance calculation
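The description asks for a model that works well for both classes, which overall accuracy alone cannot show. As a minimal sketch (plain Python, not part of the KNIME workflow; the labels and predictions below are made up for illustration), per-class recall can be computed next to accuracy like this:

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return overall accuracy and the recall of each class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    totals = Counter(y_true)  # number of true examples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recall = {label: hits[label] / totals[label] for label in totals}
    return accuracy, recall


# Hypothetical labels: "0" = no diabetes, "1" = diabetes.
y_true = ["0"] * 6 + ["1"] * 4
y_pred = ["0"] * 10  # a model that always predicts the majority class

accuracy, recall = per_class_report(y_true, y_pred)
print(accuracy, recall)
```

In this toy case the accuracy looks moderate while the recall for class "1" is zero, which is the kind of failure the second requirement is meant to rule out.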

Workflow summary:

* 1. Sampling techniques: stratified sampling, SMOTE
* 2. Forward Feature Selection
* acc = 79.22%

Workflow annotations:

* read data - diabetes.csv
* Set the target variable to string type
* Stratified sampling 70/30
* select features
* tree depth = 8
* apply to the test set
* outcome int -> str
* Oversample "outcome" class at each training sample
* EDA

Nodes used:

* CSV Reader
* Partitioning
* Forward Feature Selection
* Scorer (JavaScript)
* Gradient Boosted Trees Learner
* Gradient Boosted Trees Predictor
* Reference Column Filter
* Number To String
* SMOTE
* ROC Curve (JavaScript)
* Data Explorer
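The two sampling steps named in the workflow (a stratified 70/30 split, then oversampling the minority class on the training data only) can be sketched in plain Python. This is an illustrative approximation, not the implementation of the KNIME Partitioning or SMOTE nodes: the function names, the toy data, the neighbour count `k`, and the seeds are all assumptions, and the oversampler handles only the two-class case.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split row indices so each class keeps roughly the same train/test ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx += indices[:cut]
        test_idx += indices[cut:]
    return train_idx, test_idx


def smote_like(rows, labels, minority, k=3, seed=0):
    """Add synthetic minority rows until both classes have the same size.

    Each synthetic row lies on the segment between a random minority row
    and one of its k nearest minority neighbours (SMOTE-style interpolation).
    """
    rng = random.Random(seed)
    minority_rows = [r for r, l in zip(rows, labels) if l == minority]
    needed = (len(rows) - len(minority_rows)) - len(minority_rows)
    new_rows, new_labels = list(rows), list(labels)
    for _ in range(max(needed, 0)):
        base = rng.choice(minority_rows)
        # Nearest minority neighbours of `base` by squared Euclidean distance.
        neighbours = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        new_rows.append([a + gap * (b - a) for a, b in zip(base, other)])
        new_labels.append(minority)
    return new_rows, new_labels


# Toy data standing in for diabetes.csv: 14 rows of class "0", 6 of class "1".
rows = [[float(i), float(i % 5)] for i in range(20)]
labels = ["0"] * 14 + ["1"] * 6

train_idx, test_idx = stratified_split(labels, train_fraction=0.7)
train_rows = [rows[i] for i in train_idx]
train_labels = [labels[i] for i in train_idx]

# Oversample the training partition only; the test set stays untouched.
balanced_rows, balanced_labels = smote_like(train_rows, train_labels, minority="1")
print(len(train_idx), len(test_idx), len(balanced_rows))
```

Oversampling after the split, and only on the training rows, keeps synthetic records out of the test set, so the reported accuracy still reflects real data.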
