Regression - House Prices - Advanced Techniques including H2O.ai and Vtreat

Score Kaggle House Prices: Advanced Regression Techniques - prepare data with vtreat - use H2O.ai nodes and other models - measure results with RMSE
https://www.kaggle.com/c/house-prices-advanced-regression-techniques/overview

URL: vtreat for KNIME! https://win-vector.com/2020/06/28/vtreat-for-knime/
URL: H2O.ai AutoML (wrapped with Python) with vtreat data preparation in KNIME for regression problems https://hub.knime.com/-/spaces/-/latest/~QCQZSo2ffXuw5CdZ/
URL: H2O.ai AutoML in KNIME for regression problems https://forum.knime.com/t/h2o-ai-automl-in-knime-for-regression-problems/20924?u=mlauber71
URL: Medium: Data preparation for Machine Learning with KNIME and the Python “vtreat” package https://medium.com/p/efcaf58fa783

# Run AutoML for 60 seconds, or pick a longer runtime in seconds:
# 300 = 5 min, 600 = 10 min, 900 = 15 min, 1800 = 30 min, 3600 = 1 hour
# 7200 = 2 hours
# 14400 = 4 hours
# 16200 = 4.5 hours
# 18000 = 5 hours
# 21600 = 6 hours
# 25200 = 7 hours
# 28800 = 8 hours
# 36000 = 10 hours
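
The runtime values above are plain second counts. As a minimal sketch, the helper below derives such a value from hours and minutes so the budget does not have to be looked up by hand. The function name `runtime_seconds` is hypothetical and not part of the workflow. The closing comment assumes the budget is passed to H2O's `H2OAutoML` through its `max_runtime_secs` parameter, which the page itself does not show.

```python
def runtime_seconds(hours=0.0, minutes=0.0):
    """Convert a runtime budget given in hours and minutes to whole seconds."""
    return int(round(hours * 3600 + minutes * 60))


# Example: a 4.5 hour budget, matching the "16200 = 4.5 hours" entry above.
max_runtime_secs = runtime_seconds(hours=4.5)
print(max_runtime_secs)

# Assumption: in an H2O Python script the value would typically be used as
#   aml = H2OAutoML(max_runtime_secs=max_runtime_secs)
# This line is left as a comment so the sketch runs without H2O installed.
```

Deriving the number from hours and minutes keeps the intent readable and avoids typos in long second counts.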


Python Conda environment propagation. Please read this article for more details:

KNIME and Python — Setting up and managing Conda environments
https://medium.com/p/2ac217792539

KNIME — Machine Learning and Artificial Intelligence— A Collection
https://medium.com/p/12e0f7d83b50

About Machine-Learning — How it Fails and Succeeds
https://medium.com/p/9f3ab7cb9b00

dataset_regression_20.parquet => to be used in a Jupyter notebook to develop an H2O.ai model
Parquet Writer
collect all model results
Concatenate
locate and create /data/ folder with absolute paths
Collect Local Metadata
collect results of several regression models: H2O
Concatenate
dataset_regression_80.parquet
Parquet Writer
collect results of several regression models: KNIME
Concatenate
create initial test and training data (Kaggle House Prices: Advanced Regression Techniques)
Test Training
dataset_regression_vtreat_20.parquet
Parquet Writer
dataset_regression_vtreat_80.parquet
Parquet Writer
Table Partitioner
dataset_regression.parquet https://www.kaggle.com/c/house-prices-advanced-regression-techniques/overview
Parquet Reader
collect LightGBM model results
Concatenate
20
Table to H2O
80
Row Filter
Domain Calculator
Py_LightGBM
Py_LightGBM
Py_LightGBM_vtreat
Py_LightGBM_vtreat
20
Row Filter
RowID
RMSE ASCENDING => lowest value -> best model
Sorter
model_results.xlsx
Excel Writer
different H2O.ai models for regression tasks
H2O
models with vtreat data preparation
VTREAT
WEKA
F10 - open view
Select Parameters for Models
Merge Variables
Activate Conda Environment based on operating system (Windows or macOS); configure how to handle the environment (default = just check the names)
conda_environment_lightgbm
80
Table to H2O
H2O Local Context
various KNIME models for regression tasks => "RProp MLP Learner" here
KNIME
collect results of several regression models: VTREAT
Concatenate
collect results of several regression models: WEKA
Concatenate
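
The workflow collects the results of all models and sorts them by RMSE in ascending order, so the lowest value marks the best model. The following is a minimal sketch of that comparison step in plain Python. It is not the workflow's own code: the model names, the sample values, and the helper names `rmse` and `rank_models` are hypothetical and serve only as illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rank_models(actual, predictions_by_model):
    """Return (model name, RMSE) pairs sorted ascending, best model first."""
    scores = [(name, rmse(actual, preds)) for name, preds in predictions_by_model.items()]
    return sorted(scores, key=lambda item: item[1])


# Hypothetical sale prices and predictions, for illustration only.
actual = [200000.0, 150000.0, 310000.0]
predictions_by_model = {
    "model_a": [210000.0, 140000.0, 300000.0],
    "model_b": [205000.0, 152000.0, 308000.0],
}

ranking = rank_models(actual, predictions_by_model)
best_model, best_rmse = ranking[0]
print(best_model, round(best_rmse, 1))
```

Because RMSE squares each error before averaging, a few large misses weigh more heavily than many small ones, which is worth keeping in mind when comparing models on house prices.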
