Training_​workflow

Workflow

In this use case, we use the NYC taxi dataset and a Random Forest to train a simple time series prediction model to predict taxi demand in the next hour based on data from past hours. For better scalability, we will train and test the model on a Spark cluster.
Demand prediction, random forest, time series prediction model, Spark cluster, NYC taxi dataset
Training Workflow Taxi Demand Prediction

Based on the NYC taxi dataset, this workflow uses a Random Forest to train a simple time series prediction model to predict taxi demand in the next hour based on data from past hours. The model is trained and tested on a Spark cluster for better scalability.

Given the large size of the dataset, train and deploy the machine learning model of choice on a Spark cluster. The KNIME Big Data Extension allows you to run a KNIME workflow on the big data platform you prefer, via in-database processing or via Spark.

Step annotations in the workflow: load the Parquet dataset to Spark; partition into training and test set; train the model; predict the test set.

Node labels in the workflow: Create Local Big Data Environment; Path to training set; Parquet to Spark; Find lag (twice); Spark Lag Column (twice); Split by date and time; Spark Random Forests Learner; Spark Predictor; Spark Numeric Scorer; Model Writer; View line plot.
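The idea behind the lag columns and the split by date and time can be illustrated outside KNIME. The following is a minimal plain-Python sketch, not the workflow's actual implementation. It assumes an hourly series of trip counts; the function names `add_lag_columns` and `split_by_time`, the number of lags, and the sample counts are all hypothetical and chosen only for illustration. Each training example pairs the demand of the previous hours (the lagged values) with the demand of the current hour (the target), and the examples are partitioned by a cutoff timestamp so that the test set lies strictly after the training set.

```python
from datetime import datetime, timedelta


def add_lag_columns(rows, n_lags):
    """Turn an hourly series into supervised examples.

    rows: list of (timestamp, demand) tuples sorted by timestamp.
    Returns a list of (timestamp, lagged_demands, target) tuples, where
    lagged_demands holds the demand of the n_lags preceding hours
    (oldest first) and target is the demand at the timestamp itself.
    """
    examples = []
    for i in range(n_lags, len(rows)):
        timestamp, target = rows[i]
        lagged = [rows[j][1] for j in range(i - n_lags, i)]
        examples.append((timestamp, lagged, target))
    return examples


def split_by_time(examples, cutoff):
    """Partition examples chronologically: train before cutoff, test from cutoff on."""
    train = [e for e in examples if e[0] < cutoff]
    test = [e for e in examples if e[0] >= cutoff]
    return train, test


# Hypothetical hourly trip counts (invented values, not taken from the NYC taxi dataset).
start = datetime(2017, 1, 1, 0)
counts = [120, 95, 80, 60, 75, 110, 150, 190]
series = [(start + timedelta(hours=h), c) for h, c in enumerate(counts)]

examples = add_lag_columns(series, n_lags=3)
train, test = split_by_time(examples, cutoff=start + timedelta(hours=6))

for timestamp, lagged, target in train:
    print(timestamp, lagged, "->", target)
```

A chronological split is used here instead of a random one because, with lagged features, a random split would let values from the test period leak into the training examples. The resulting `lagged` lists would serve as the input features of a regression model such as a random forest, with `target` as the value to predict.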

Download

Get this workflow from the following link: Download

Nodes

Training_workflow consists of the following 112 node(s):

Plugins

Training_​workflow contains nodes provided by the following 7 plugin(s):