
02_​Techniques_​for_​Dimensionality_​Reduction

Workflow

Techniques for Dimensionality Reduction
ETL, big data, data preprocessing, performance, accuracy, classification, dimensionality reduction, LDA, auto-encoder, t-SNE, PCA, backward feature elimination, forward feature selection, feature selection
Dimensionality Reduction

This workflow shows methods for dimensionality reduction and calculates a baseline where no dimensionality reduction technique is applied:

1. Baseline evaluation
2. Linear Discriminant Analysis (LDA)
3. Auto-encoder
4. t-SNE
5. High ratio of missing values
6. Low variance
7. High correlation with other data columns
8. Tree ensemble based
9. Principal Component Analysis (PCA)
10. Backward feature elimination
11. Forward feature selection

The ROC Curve shows the final performance of the different dimensionality reduction techniques. The positive class probabilities are accessible via the top output ports of the components. The accuracies obtained with the different techniques are accessible via the bottom output ports of the components.

Workflow annotation: This part takes a very long time! Execute with discretion!

Reading Data

Read the KDD train small data set as text files.

- Read small data set: 233 columns, 50K rows
- Separate target
- Select prediction task: churn, appetency, or upselling
- Eliminate columns with > 30% missing values
- Row sampling: 2500 rows, stratified on Target, for better performance

Node and component labels in the workflow: Reading full small data set, Target Selection, Column Selection by Missing Values, Column Splitter, Joiner, Row Sampling, Column Appender, Baseline Evaluation, Reduction based on LDA, Auto-encoder based Reduction, Reduction based on t-SNE, Reduction based on Missing Values, Reduction based on Low Variance, Reduction based on High Corr., Tree Ensemble based Reduction, Reduction based on PCA, Backward Feature Elimination, Forward Feature Selection, ROC Curve (Positive class probabilities), Bar Chart (Accuracies).
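The two simplest filters in the list, the missing-value ratio (technique 5) and low variance (technique 6), can be sketched outside KNIME in a few lines of plain Python. This is only an illustration under stated assumptions, not the workflow's implementation. The 30% missing-value threshold is taken from the workflow annotation. The function names, the toy table, and the zero-variance cutoff are hypothetical.

```python
from statistics import pvariance


def missing_ratio(column):
    """Return the fraction of entries in a column that are missing (None)."""
    return sum(value is None for value in column) / len(column)


def filter_columns(table, max_missing=0.3, min_variance=0.0):
    """Drop columns with too many missing values or too little variance.

    `table` maps a column name to a list of numbers (None = missing).
    The 0.3 default mirrors the "> 30% missing" rule in the annotation;
    the variance cutoff of 0.0 is an assumption made for this sketch.
    """
    kept = {}
    for name, column in table.items():
        if missing_ratio(column) > max_missing:
            continue  # filter 1: too many missing values
        present = [value for value in column if value is not None]
        if len(present) < 2 or pvariance(present) <= min_variance:
            continue  # filter 2: (near-)constant column
        kept[name] = column
    return kept


# Hypothetical toy data, not taken from the KDD data set.
table = {
    "a": [1.0, 2.0, 3.0, 4.0],    # complete and varying: kept
    "b": [None, None, 1.0, 2.0],  # 50% missing: dropped
    "c": [5.0, 5.0, 5.0, 5.0],    # zero variance: dropped
    "d": [1.0, None, 0.0, 3.0],   # 25% missing and varying: kept
}
reduced = filter_columns(table)
print(sorted(reduced))
```

In practice a variance threshold only makes sense on columns with comparable scales, so normalizing the columns first is a common choice. The sketch leaves that step out to stay short.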

Download

Get this workflow from the following link: Download

Nodes

02_Techniques_for_Dimensionality_Reduction consists of the following 960 node(s):

Plugins

02_​Techniques_​for_​Dimensionality_​Reduction contains nodes provided by the following 13 plugin(s):