
04_GoogleCloudExample

Working with Google Cloud services

This workflow demonstrates how to connect to various Google Cloud services, such as Google BigQuery, Google Cloud Dataproc, and Google Cloud Storage, from within KNIME Analytics Platform. The Google Authentication (API Key) node lets you authenticate with the various Google APIs using a service account and its p12 key file.
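
Outside of KNIME, the same service-account authentication can be scripted. The sketch below is a minimal example using the google-auth Python library, assuming a JSON key file (the KNIME node itself expects the older p12 format); the file path and scopes are placeholders, not part of this workflow.

    # Minimal sketch (assumption): authenticate with a Google service-account key
    # using the google-auth library. Key file path and scopes are placeholders.
    from google.oauth2 import service_account

    SCOPES = [
        "https://www.googleapis.com/auth/bigquery",
        "https://www.googleapis.com/auth/devstorage.read_write",
    ]
    credentials = service_account.Credentials.from_service_account_file(
        "service-account.json",  # placeholder key file (JSON format)
        scopes=SCOPES,
    )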

The output of the Google Authentication (API Key) node can be used as input for the Google BigQuery Connector node. The Google BigQuery Connector node provides a DB connection that can be used with the existing DB nodes to visually assemble queries that are executed in BigQuery. To upload large amounts of data into BigQuery, use the DB Loader node, since the JDBC-based interface imposes many restrictions.
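
As a rough scripted equivalent of the BigQuery Connector, the DB query nodes, and the DB Loader, the following Python sketch uses the google-cloud-bigquery client library; the project, dataset, and table names are made-up placeholders, and the credentials object comes from the authentication sketch above.

    # Minimal sketch (assumption): query BigQuery and bulk-load a table with the
    # google-cloud-bigquery client; project/dataset/table names are placeholders.
    import pandas as pd
    from google.cloud import bigquery

    client = bigquery.Client(credentials=credentials, project="my-project")

    # Run a query, comparable to what the DB query nodes assemble visually.
    rows = client.query(
        "SELECT name, value FROM `my-project.my_dataset.my_table` LIMIT 10"
    ).result()

    # Bulk-load a DataFrame instead of row-wise JDBC inserts (cf. the DB Loader node).
    df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})
    load_job = client.load_table_from_dataframe(df, "my-project.my_dataset.new_table")
    load_job.result()  # wait for the load job to finish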

The Google Cloud Storage Connector node connects KNIME Analytics Platform to your Google Cloud Storage buckets and lets you work with your files using the file handling nodes. The Google Cloud Storage File Picker node creates a pre-signed URL that can be used in the KNIME reader nodes to read directly from Google Cloud Storage, or shared with other users so they can access the selected files without authenticating.
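
For illustration, the sketch below generates such a pre-signed URL with the google-cloud-storage Python library; the bucket and object names are placeholders, and the one-hour expiration is an arbitrary choice.

    # Minimal sketch (assumption): create a pre-signed URL with the
    # google-cloud-storage client; bucket and object names are placeholders.
    from datetime import timedelta
    from google.cloud import storage

    storage_client = storage.Client(credentials=credentials, project="my-project")
    blob = storage_client.bucket("my-bucket").blob("data/example.csv")

    # Anyone holding this URL can read the object for one hour, no sign-in needed.
    url = blob.generate_signed_url(version="v4", expiration=timedelta(hours=1), method="GET")
    print(url)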

Finally, the Create Spark Context (Livy) node can be used to set up a Spark context on your Google Cloud Dataproc cluster. To use the node, you need to execute the Apache Livy initialization action during cluster creation; see the linked documentation for details. Once a context is created, you can use all the existing Spark nodes to visually assemble your Spark analysis flow.
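
For reference, a Spark session can also be requested directly from Livy's REST API, which is roughly the interface the Create Spark Context (Livy) node talks to; the sketch below assumes a reachable Livy endpoint on the Dataproc master node (the host name is a placeholder, 8998 is Livy's default port).

    # Minimal sketch (assumption): start a Spark session through Livy's REST API.
    # The host name is a placeholder; 8998 is Livy's default port.
    import time
    import requests

    LIVY_URL = "http://dataproc-master:8998"

    resp = requests.post(LIVY_URL + "/sessions", json={"kind": "spark"})
    session = resp.json()

    # Poll until the session is ready (or has failed).
    while session["state"] not in ("idle", "error", "dead"):
        time.sleep(5)
        session = requests.get(LIVY_URL + "/sessions/" + str(session["id"])).json()
    print("Livy session state:", session["state"])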


Workflow annotations (from the workflow canvas): this workflow demonstrates how to connect to various Google Cloud services; for more information see the workflow metadata (View -> Description). Use the p12 key file for authentication and connect to Google Cloud Storage; use the file handling nodes to work with Google Cloud Storage; use the DB nodes to work with Google BigQuery and the DB Loader to upload data; create the table and upload the data into it (BigQuery does not support spaces); use the Spark nodes to work with Google Cloud Dataproc and save the Spark results as Parquet (Parquet does not support spaces).

Nodes

Google Authentication (API Key), Google BigQuery Connector, DB Table Creator, DB Loader, DB Reader, Data Generator, Column Rename, CSV Reader, Google Cloud Storage Connection, Google Cloud Storage File Picker, Create Spark Context (Livy), Table to Spark, Spark Column Rename, Spark Decision Tree Learner, Spark Predictor (Classification), Spark Scorer, Spark to Parquet
