Tool Migration - From Excel to Value with KNIME

This workflow shows how a no-code/low-code tool like KNIME Analytics Platform can substitute, expand, and considerably improve on the spectrum of capabilities offered by traditional spreadsheet tools. The focus is placed on:

1. Automation and Reusability
2. (Big) Data Access
3. Preprocessing and Machine Learning
4. Error Handling
5. Deployment

Section 3 (Preprocessing and Machine Learning) predicts which flight will be delayed. It is split into three parts: Preprocessing, Modelling, and Evaluation and Visualization.

Node annotations in the workflow:

- Start error handling
- "Diverted" to string
- Remove cancelled flights
- Create target class: >= 10 min delayed, otherwise not delayed
- Train: 80%, Test: 20%
- Remove missing values
- min: 0, max: 1
- Keep 10% of 5 mil rows
- Save best trained model
- Flight status 2018
- For prototyping
- End error handling: Use Parquet dataset or alternative CSV file in case of failure
- Flight status 2018
- Create log of errors
- Save log file
- Kaggle API
- Save image as .png
- Save data for report
- ~ 96% accuracy
- Keep 10 columns
- Keep 10 columns
- ~ 94% accuracy

Nodes in the workflow: Try (Variable Ports), Spark Column Filter, Spark SQL Query, Spark Row Filter, Spark SQL Query, Spark Partitioning, Spark Missing Value, Spark Missing Value (Apply), Spark Normalizer, Spark Row Sampling, Spark Transformations Applier, Model Writer, Parquet to Spark, Create Local Big Data Environment, Catch Errors (Generic Ports), CSV to Spark, Variable to Table Row, Create back-up folder with CSV, Active Branch Inverter, Excel Writer, Download Kaggle dataset(s), ROC Curve, Spark to Table, Image to Report, Data to Report, Spark Random Forest Learner, Spark Scorer, Spark Predictor (Classification), Spark PCA, Spark PCA, Spark Decision Tree Learner, Spark Scorer, Spark Predictor (Classification), Binary Classification Inspector, Spark to Table, Column Appender.
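
For readers who want to see the logic behind a few of the annotated steps outside KNIME, here is a minimal plain-Python sketch. It is not the workflow's implementation (the workflow uses Spark nodes). It only illustrates the >= 10 minute target class, the min 0 / max 1 normalization, the 80/20 partitioning, and the "use Parquet or fall back to CSV and log the error" pattern. All function names, the `delay_min` parameter, and the fixed random seed are hypothetical and chosen for illustration.

```python
import random

# Threshold taken from the workflow annotation: >= 10 min counts as delayed.
DELAY_THRESHOLD_MIN = 10


def label_delay(delay_min):
    """Return the target class for one flight (delay_min is a hypothetical field)."""
    return "delayed" if delay_min >= DELAY_THRESHOLD_MIN else "not delayed"


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_test_split(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of rows and split it into train and test partitions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def load_with_fallback(primary_loader, fallback_loader, error_log):
    """Try the primary source; on failure, record the error and use the fallback."""
    try:
        return primary_loader()
    except Exception as exc:
        error_log.append(f"{type(exc).__name__}: {exc}")
        return fallback_loader()
```

In this sketch, `primary_loader` would stand in for reading the Parquet dataset and `fallback_loader` for reading the back-up CSV file. The collected `error_log` entries correspond loosely to the "Create log of errors" and "Save log file" steps.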
