
s_600_spark_h2o_automl_about_this_collection

s_600 - Sparkling predictions and encoded labels - "the poor man's ML Ops"

Use Big Data technologies like Spark for robust and scalable data preparation. Use the latest AutoML technology, such as H2O.ai AutoML, to create a robust model and deploy it in a Big Data environment (like Cloudera).

Please download the whole workflow group:
https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_bigdata_h2o_automl_spark_46

„Sparkling Predictions and Encoded Labels – Developing and Deploying Predictive Models on a Big Data Cluster with KNIME, Spark and H2O.ai“ // Markus Lauber, Deutsche Telekom (presentation in German, slides in English)
https://www.youtube.com/watch?v=k8MsxzwEVrk&t=4335s

These are the steps covered in this collection:
- s_601 - prepare the data in your Big Data cluster
  - use a robust handmade label encoding in Spark
  - apply missing value rules
  - filter variables which have a too high correlation
  - filter continuous variables
  - filter variables with NaN and strange values (yes, you are allowed to investigate and clean your data)
- s_605 - with these rules, prepare the data and either store them in your Big Data cluster or download them as Parquet files
- s_618 - use the generic H2O.ai AutoML with Spark
- s_620 - demonstrate what your production workflow might look like. With KNIME Server you could easily bring such a workflow into production: "The poor man's ML Ops"

Relative paths (set up with Create Folder and Create File/Folder Variables nodes): validate, data, big_data, model, model/validate, script, big_data/upload
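The "robust handmade label encoding" named under s_601 can be illustrated with a small sketch. The workflow's actual encoding rules are not described here, so everything below is an assumption: the function names, the frequency-based ranking, and the reserved code 0 for missing or unseen labels are hypothetical. The sketch is written in plain Python rather than Spark, only to show the idea of fitting a mapping once on training data and reusing it unchanged when scoring new data.

```python
from collections import Counter

UNSEEN_CODE = 0  # hypothetical reserved code for missing or unseen labels


def fit_label_encoding(values):
    """Build a label -> integer mapping from training values.

    Labels are ranked by descending frequency (ties broken alphabetically),
    starting at 1, so that code 0 stays free for missing or unseen labels.
    """
    counts = Counter(v for v in values if v is not None)
    ranked = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: code for code, label in enumerate(ranked, start=1)}


def apply_label_encoding(values, mapping):
    """Encode values with a fitted mapping; unknown labels and None get UNSEEN_CODE."""
    return [mapping.get(v, UNSEEN_CODE) for v in values]


# Fit on (hypothetical) training data, then encode new data with the same mapping.
train = ["red", "blue", "red", None, "green", "red", "blue"]
mapping = fit_label_encoding(train)
encoded_new = apply_label_encoding(["blue", "purple", None, "red"], mapping)

print(mapping)
print(encoded_new)
```

Because the mapping is an ordinary lookup table, it could be stored alongside the model and applied to new data as a join. Whether the workflow does it this way is not stated in the description, so treat that as one possible design rather than the author's method.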
