JKISeason3-18_tark

Explaining Cancer Predictions


Level: Hard

Description: You work as a researcher creating models to identify whether a breast tumor is benign or malignant, based on anonymized patient data. Besides obtaining a classifier that performs well on both benign and malignant cases, you must be able to explain how different feature values impact your results. Experiment with LIME and visualization techniques to explain your predictions and make your research more transparent. Hint: Learn more about this problem's data attributes here.

Author: Keerthan Shetty

Dataset: Breast Tumor Data in the KNIME Community Hub

URL: Datasets https://hub.knime.com/s/JMWDCxY4oCz_eK_o
URL: JKISeason3-18 https://www.knime.com/just-knime-it?pk_vid=f1a9625dd14a14c5171698895027e10b
URL: This challenge thread https://forum.knime.com/t/solutions-to-just-knime-it-challenge-18-season-3/83116?pk_vid=4e602e8568914d2d1726067200168798
URL: Explain Stroke Prediction Models with LIME in KNIME https://www.knime.com/blog/XAI-LIME-stroke-prediction

Workflow sections: Building a model; Prediction; LIME analysis of samples to be explained

Note: There is a high correlation between Uniformity of Cell Shape and Uniformity of Cell Size, so Uniformity of Cell Shape was omitted.

Node annotations:
Exclude Uniformity of Cell Shape
Convert target (Class) and Sample code number to string
Train 80, Test 20
Predict
Model performance
Reduce the sample size, because the following step takes a long time
Top input: test set instance rows to be explained; bottom input: test set distribution
Training a local GLM for each input instance to generate a Local Interpretable Model-agnostic Explanation
Append overall prediction confidence
Combine LIME values with the predicted values
Negative: Blue; Positive: Red
Convert Prediction and LIME chart to image
Average LIME value for each class

Nodes used: CSV Reader, Partitioning, Number to String, LIME Loop Start, Workflow Executor, Compute LIME, Loop End, Partitioning, Color Manager, Random Forest Predictor, Random Forest Learner, Capture Workflow Start, Capture Workflow End, Metanode, Column Appender, Linear Correlation, Column Filter, Column Rename (Regex), Metanode, Component, Metanode
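
The two ideas noted in the workflow annotations, dropping one of two highly correlated features and fitting a local linear model around each instance to be explained, can be sketched outside KNIME roughly as follows. This is a minimal illustration, not the implementation of the Linear Correlation or Compute LIME nodes. The function names (`pearson`, `solve`, `lime_explain`, `toy_black_box`), the Gaussian perturbation scheme, the exponential kernel, the feature values, and the toy classifier are all assumptions made for the example. In the workflow the black box is a Random Forest and the perturbations are drawn from the test set distribution.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def lime_explain(predict, instance, scales, n_samples=500, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around `instance`.

    Returns one coefficient per feature (intercept excluded), expressed
    per unit of the corresponding entry in `scales`.
    """
    rng = random.Random(seed)
    k = len(instance)
    # Accumulate the weighted normal equations (X^T W X) beta = X^T W y.
    xtwx = [[0.0] * (k + 1) for _ in range(k + 1)]
    xtwy = [0.0] * (k + 1)
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]  # standardized offsets
        sample = [v + d * s for v, d, s in zip(instance, z, scales)]
        y = predict(sample)
        # Closer perturbations get larger weights (assumed exponential kernel).
        w = math.exp(-sum(d * d for d in z) / kernel_width ** 2)
        row = [1.0] + z
        for i in range(k + 1):
            xtwy[i] += w * row[i] * y
            for j in range(k + 1):
                xtwx[i][j] += w * row[i] * row[j]
    return solve(xtwx, xtwy)[1:]


def toy_black_box(x):
    """Hypothetical stand-in for the trained classifier: P(malignant)."""
    score = 0.6 * x[0] + 0.3 * x[1] - 4.0
    return 1.0 / (1.0 + math.exp(-score))


# Made-up feature values on a 1-10 scale, for illustration only.
cell_size = [1, 1, 2, 4, 8, 10, 10, 3]
cell_shape = [1, 2, 2, 5, 7, 10, 9, 3]
print("correlation:", round(pearson(cell_size, cell_shape), 3))

# Explain one hypothetical instance with two features.
print("local coefficients:", lime_explain(toy_black_box, [5.0, 3.0], [1.0, 1.0]))
```

A positive local coefficient means that increasing the feature near this instance pushes the predicted probability up, and a negative one pushes it down. That is the same reading as the red (positive) and blue (negative) coloring used in the workflow.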
