

Sparkling predictions and encoded labels

Use Big Data technologies like Spark for robust and scalable data preparation. Use the latest AutoML technology like H2O.ai AutoML to create a robust model and deploy it in a Big Data environment (like Cloudera).

https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_bigdata_h2o_automl_spark

"Sparkling Predictions and Encoded Labels – Developing and Deploying Predictive Models on a Big Data Cluster with KNIME, Spark and H2O.ai" // Markus Lauber, Deutsche Telekom (presentation in German, slides in English)
https://www.youtube.com/watch?v=k8MsxzwEVrk&t=4335s

These are the steps covered in this collection:

- s_401 - prepare the data in your Big Data cluster: use a robust handmade label encoding in Spark, apply missing value rules, filter variables that are too highly correlated, filter continuous variables, and filter variables with NaN and other strange values (yes, you are allowed to investigate and clean your data)
- s_405 - use these rules to prepare the data and either store it in your Big Data cluster or download it as Parquet files
- s_410 - demonstrates how to use H2O.ai's AutoML package to develop a model with a Python wrapper (yes, you are free to use any H2O.ai or other ML environment that works with KNIME and Spark)
- s_415 - use the same AutoML with an R wrapper (and normalized data)
- s_420 - demonstrates what your production workflow might look like; with KNIME Server you could easily bring such a workflow into production
