Machine Learning Meta Collection (with KNIME)
This meta collection is about machine learning. It contains links to examples demonstrating several types of machine learning, mostly with KNIME, and also some links on how to learn machine learning (again mostly with KNIME). It is not a complete collection of ML methods and algorithms, and it is far from answering all questions or covering all topics - it is more like a quick practical overview of some aspects, always with a focus on Minimal Viable Examples you could try at home. Please note that these examples do not substitute for a deeper understanding of your business problem and the various -statistical- implications to consider when using such models - in other words: terms and conditions *do* apply.
--------------- Learning Machine Learning (with KNIME) ---------
How to learn machine learning with KNIME
https://forum.knime.com/t/knime-based-machine-learning-course/21876/2?u=mlauber71
[L1-DS] - KNIME Analytics Platform for Data Scientists: Basics
Lesson 4. Machine Learning & Data Export
https://www.knime.com/self-paced-course/l1-ds-knime-analytics-platform-for-data-scientists-basics/lesson4?u=mlauber71
-----------------------------------------------------------------
Links to types of prediction models
https://forum.knime.com/t/how-to-find-the-optimal-process-parameter-based-on-quality-defects/20846/6?u=mlauber71
-----
1) Models for binary classification - 0/1 or Yes/No Targets
https://forum.knime.com/t/looking-for-options-to-evaluate-a-decision-tree/11384/2?u=mlauber71
Understand metrics like AUC and Gini (and use H2O.ai)
https://forum.knime.com/t/random-forest-model-not-working/12738/3?u=mlauber71
https://forum.knime.com/t/help-choosing-analytics-algorithm/11404/3?u=mlauber71
11 Important Model Evaluation Metrics for Machine Learning Everyone should know
https://www.analyticsvidhya.com/blog/2019/08/11-important-model-evaluation-error-metrics/
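
As a rough illustration of the AUC and Gini metrics mentioned above, here is a minimal sketch in plain Python. It is not taken from the linked threads. It assumes 0/1 labels and uses the rank-sum (Mann-Whitney) formulation of AUC, with Gini taken as 2 * AUC - 1. The function names and the toy data are made up for this example.

```python
def roc_auc(y_true, scores):
    """AUC via the rank-sum (Mann-Whitney U) formulation.

    Assumes y_true contains only 0 and 1. Tied scores get average ranks.
    """
    pairs = sorted(zip(scores, y_true))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # extend j to the end of the group of tied scores
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def gini(y_true, scores):
    """Gini coefficient as commonly used in scoring: 2 * AUC - 1."""
    return 2 * roc_auc(y_true, scores) - 1


# Hypothetical toy data: true labels and model scores
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(y_true, scores)
print("AUC :", auc)
print("Gini:", gini(y_true, scores))
```

An AUC of 0.5 (Gini of 0) corresponds to random ranking, and an AUC of 1.0 (Gini of 1) to a perfect separation of the two classes.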
-----
2) Model for Multiclass Targets (and explanation of Log Loss statistics)
https://forum.knime.com/t/any-advice-to-improve-the-performance-of-a-classification-model/12801/10?u=mlauber71
https://forum.knime.com/t/metrics-in-multiclass-classification/11193/3?u=mlauber71
Score Documents with multiple Classes?
https://forum.knime.com/t/urgent-what-is-wrong-with-my-decision-tree-predictor-for-new-data/13292/10?u=mlauber71
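
To make the Log Loss statistic for multiclass targets a bit more concrete, the following sketch computes it by hand in plain Python. It is an illustration under assumptions, not code from the linked threads: classes are encoded as integer indices, each row of probabilities sums to 1, and the function name and data are hypothetical.

```python
import math


def multiclass_log_loss(y_true, probabilities, eps=1e-15):
    """Mean negative log of the probability assigned to the true class.

    y_true: list of class indices (assumption: classes are encoded 0..K-1)
    probabilities: one list of K class probabilities per row
    eps: clipping bound so that a probability of 0 does not give infinity
    """
    total = 0.0
    for label, probs in zip(y_true, probabilities):
        p = min(max(probs[label], eps), 1 - eps)
        total += -math.log(p)
    return total / len(y_true)


# Hypothetical predictions for three rows and three classes
y_true = [0, 2, 1]
probabilities = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]

loss = multiclass_log_loss(y_true, probabilities)

# Reference point: a model that always predicts 1/3 for every class
baseline = multiclass_log_loss(y_true, [[1 / 3] * 3] * 3)

print("log loss:", loss)
print("uniform baseline:", baseline)
```

Lower is better. Comparing against the uniform baseline (which equals ln(K) for K classes) gives a quick sense of whether the model beats guessing.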
-----
3) Regression models (numeric Target)
https://forum.knime.com/t/predictive-analytics-for-sales/12858/3?u=mlauber71
https://forum.knime.com/t/forecasting-sales-per-customer-for-the-next-360-days/13221/4?u=mlauber71
https://forum.knime.com/t/evaluate-a-linear-regression-model/13305/2?u=mlauber71
https://forum.knime.com/t/how-to-identify-the-top-100-features-selected-from-mlp-model/11371/2?u=mlauber71
Regression collection (Time Series)
https://forum.knime.com/t/prediction-based-on-multi-variables/20184/5?u=mlauber71
predict how many future visitors a restaurant will receive (with H2O.ai)
https://www.knime.com/blog/solving-a-kaggle-challenge-using-the-combined-power-of-knime-analytics-platform-h2o?u=mlauber71
------------------------------------------------------------
PMML Models with numeric scores
https://forum.knime.com/t/export-pmml-that-outputs-class-probabilities/13244/2?u=mlauber71
-----------------------------------------------------------------
Data preparation steps
[preparation] Techniques for Dimensionality Reduction
https://hub.knime.com/knime/spaces/Examples/latest/04_Analytics/01_Preprocessing/02_Techniques_for_Dimensionality_Reduction/02_Techniques_for_Dimensionality_Reduction~7PBv1kGifxCng2qo
[preparation] Three New Techniques for Data Dimensionality Reduction in Machine Learning
https://www.knime.com/blog/three-new-techniques-for-data-dimensionality-reduction-in-machine-learning
[preparation] use R's vtreat to automatically prepare data for classification and regression tasks
https://forum.knime.com/t/is-artificial-intelligence-used-for-data-cleansing-techniques-used-by-knime/36209/6?u=mlauber71
[preparation] Spark Label Encoding, remove highly correlated variables - prepare the data in local Big Data environment
https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_bigdata_h2o_automl_spark/s_401_spark_label_encoder~mF4g6HTMX7J4m27Q
prepare the data in a big data environment:
- label encode string variables
- transform numbers into Double format (Spark ML likes that)
- remove highly correlated data
- remove NaN variables
- remove continuous variables
- optional: normalize the data
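
The preparation steps listed above can be sketched locally in plain Python. This is only an illustration of the idea and not the Spark workflow itself. Assumptions: the table is a dict of column lists, "remove NaN variables" is read as dropping columns that contain NaN, the correlation threshold of 0.95 is arbitrary, and normalization is min-max. The step about removing continuous variables is left out because the list does not say how such columns are identified. All names are hypothetical.

```python
import math


def label_encode(values):
    """Map each distinct string to a numeric code (sorted order), as float."""
    codes = {v: float(i) for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if one column has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def prepare(table, corr_threshold=0.95, normalize=False):
    # Steps 1 and 2: label encode string columns, cast numbers to float (Double)
    prepared = {}
    for name, values in table.items():
        if all(isinstance(v, str) for v in values):
            prepared[name] = label_encode(values)
        else:
            prepared[name] = [float(v) for v in values]

    # Step 3 (assumed reading): drop columns that contain NaN
    prepared = {
        name: col for name, col in prepared.items()
        if not any(math.isnan(x) for x in col)
    }

    # Step 4: of each highly correlated pair, keep only the earlier column
    kept = []
    for name in prepared:
        if all(abs(pearson(prepared[name], prepared[k])) < corr_threshold
               for k in kept):
            kept.append(name)
    prepared = {name: prepared[name] for name in kept}

    # Step 5 (optional): min-max normalization to the range [0, 1]
    if normalize:
        for name, col in prepared.items():
            lo, hi = min(col), max(col)
            prepared[name] = [(x - lo) / (hi - lo) if hi > lo else 0.0
                              for x in col]
    return prepared


# Hypothetical toy table
table = {
    "color": ["red", "blue", "red", "green"],
    "size_cm": [1, 2, 3, 4],
    "size_mm": [10, 20, 30, 40],                 # duplicates size_cm
    "weight": [5.0, float("nan"), 7.0, 8.0],     # contains NaN
}

result = prepare(table, normalize=True)
print(list(result))
```

In this toy table, size_mm is dropped because it is perfectly correlated with size_cm, and weight is dropped because it contains NaN.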
-----------------------------------------------------------------
How to handle missing values
Basic missing value handling
https://hub.knime.com/knime/spaces/Examples/latest/02_ETL_Data_Manipulation/04_Transformation/01_Handling_Missing_Values
some more advanced approaches to missing values
https://hub.knime.com/knime/spaces/Education/latest/Courses/L4-ML%20Introduction%20to%20Machine%20Learning%20Algorithms/Session_4/02_Solutions/02_Missing_Value_Handling_solution
Multiple Imputation for Missing Values
https://hub.knime.com/kathrin/spaces/Missing%20Value%20Imputation/latest/Mulitple%20Imputation%20for%20Missing%20Values
Comparing Missing Value Handling Methods
https://hub.knime.com/kathrin/spaces/Missing%20Value%20Imputation/latest/Comparing%20Missing%20Value%20Handling%20Methods
Employ R's Amelia package to replace missing values
https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_r_amelia/m_001_missing_values_amelia
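
For orientation, basic missing value handling often comes down to replacing gaps with a column statistic. The sketch below shows that idea in plain Python. It is a simplified stand-in and not what the linked workflows or the Amelia package do (multiple imputation is considerably more involved). The function names, strategy names and data are assumptions made for this example.

```python
import math
import statistics


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def impute(values, strategy="median"):
    """Replace missing entries with a statistic of the observed entries."""
    observed = [v for v in values if not is_missing(v)]
    if not observed:
        raise ValueError("column has no observed values to impute from")

    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    elif strategy == "most_frequent":
        fill = statistics.mode(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    return [fill if is_missing(v) else v for v in values]


# Hypothetical columns with gaps
ages = [31, None, 45, 28, float("nan")]
cities = ["Berlin", None, "Berlin", "Konstanz"]

ages_filled = impute(ages, "median")
cities_filled = impute(cities, "most_frequent")

print(ages_filled)
print(cities_filled)
```

To avoid leaking information, the fill value would normally be computed on the training data only and then applied unchanged to test or new data.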
-----------------------------------------------------------------
about unbalanced Targets
https://forum.knime.com/t/xgboost-predictor/23960/5?u=mlauber71
about unbalanced data and evaluation metrics (AUCPR)
https://forum.knime.com/t/problem-with-unbalanced-data-with-examples-attached/26227/4?u=mlauber71
another thread about how to handle imbalanced data
https://forum.knime.com/t/knime-fraud-detection-autoencoder/28859/17?u=mlauber71
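
The AUCPR metric mentioned above summarizes the precision-recall curve and tends to be more telling than accuracy when positives are rare. A minimal sketch in plain Python follows. It uses average precision as a common estimate of the area under the precision-recall curve, assumes 0/1 labels and no tied scores, and is not taken from the linked threads. Names and data are hypothetical.

```python
def average_precision(y_true, scores):
    """Average precision: mean of the precision at each true positive,
    going through the cases from highest to lowest score.

    Assumes 0/1 labels and no tied scores.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive case")

    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at this cut-off
    return total / n_pos


# Hypothetical unbalanced data: 2 positives among 10 cases
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.9, 0.8, 0.4]

ap = average_precision(y_true, scores)

# A random ranking would be expected to score around the share of positives
prevalence = sum(y_true) / len(y_true)

print("average precision:", ap)
print("prevalence (random baseline):", prevalence)
```

Unlike AUC, whose random baseline is always 0.5, the baseline for AUCPR depends on how rare the positive class is, so the two numbers are best read side by side.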
--------------- KNIME and H2O.ai ----------
H2O.ai models and KNIME in general
https://www.knime.com/nodeguide/analytics/h2o-machine-learning?u=mlauber71
simple example of how to use H2O.ai models in a Big Data environment
https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_h2o_sparkling_water?u=mlauber71
H2O.ai AutoML in KNIME for classification problems
https://forum.knime.com/t/h2o-ai-automl-in-knime-for-classification-problems/20923?u=mlauber71
H2O.ai AutoML in KNIME for regression problems
https://forum.knime.com/t/h2o-ai-automl-in-knime-for-regression-problems/20924?u=mlauber71
„Sparkling Predictions and Encoded Labels – Developing and Deploying Predictive Models on a Big Data Cluster with KNIME, Spark and H2O.ai“
(talk in German, slides in English)
https://www.youtube.com/watch?v=k8MsxzwEVrk&t=4335s
--------------- KNIME and Python ----------
use Python and KNIME to make a random forest (quick basic example)
https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_python_iris?u=mlauber71
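
For readers who want to see what such a random forest looks like in Python itself, here is a short standalone sketch. It is not the code of the linked workflow and runs outside KNIME. It assumes scikit-learn is installed. The split ratio, the number of trees and the random seed are arbitrary choices for this example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the iris data set that ships with scikit-learn
iris = load_iris()
X, y = iris.data, iris.target

# Hold out 30% of the rows for testing, keeping the class proportions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Train a random forest with 100 trees (arbitrary choice for this sketch)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate on the held-out rows
accuracy = accuracy_score(y_test, model.predict(X_test))
print("accuracy:", accuracy)

# Feature importances give a first hint about which columns matter
importances = dict(zip(iris.feature_names, model.feature_importances_))
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

Inside KNIME, comparable code would sit in a Python scripting node, with the table handed over by the node rather than loaded from scikit-learn.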
Python Installation (the very short story)
https://forum.knime.com/t/problem-with-setting-a-python-deep-learning-environment/19477/2?u=mlauber71
https://forum.knime.com/t/installing-a-new-library-in-python/25365/4?u=mlauber71
Python KNIME official installation
https://docs.knime.com/2020-07/python_installation_guide/index.html?u=mlauber71
Python and Deep Learning
https://docs.knime.com/latest/deep_learning_installation_guide/index.html?u=mlauber71
Python and Anaconda versions / Python and Keras
https://forum.knime.com/t/python-extension-not-recognizing-anaconda-environment-in-knime-3-7/12978/3?u=mlauber71
https://forum.knime.com/t/python-extension-not-recognizing-anaconda-environment-in-knime-3-7/12978/9?u=mlauber71
--------------- Special ----------
Rule Induction with Weka Rule Nodes and Yacaree Associator
https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_rule_induction_weka_hotspot_and_yacaree_rules?u=mlauber71
Not strictly a KNIME thing but very helpful books and blogs about ML and Python
https://machinelearningmastery.com/
Clustering Algorithms (small collection in KNIME)
https://forum.knime.com/t/ml-techniques-which-one-can-i-use-to-predict-sales-in-a-particular-country/28783/5?u=mlauber71