KNIME Extension for Apache Spark core infrastructure version 4.3.0.v202011281457 by KNIME AG, Zurich, Switzerland
This node allows you to execute arbitrary Java code to persist an existing Spark RDD, e.g. by writing it to HDFS (see the provided templates). Simply enter the Java code in the text area.
Note that this node also supports flow variables as input to your Spark job. To use a flow variable, simply double-click the variable in the "Flow Variable List".
It is also possible to use external Java libraries. In order to
include such external jar or zip files, add their location in the
"Additional Libraries" tab using the control buttons.
For details see the "Additional Libraries" tab description below.
The libraries you use must be present on your cluster and added to the class path of your Spark job server.
They are not uploaded automatically!
You can define reusable templates with the "Create templates..." button. Templates are stored in the user's workspace by default and can be accessed via the "Templates" tab. For details, see the "Templates" tab description below.
Enter your Java code here.
The JavaSparkContext can be accessed via the method input parameter sc. The input JavaRDD<Row> can be accessed via the method input parameter rowRDD.
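As a rough illustration of the kind of logic such a snippet might contain, the sketch below turns the values of one row into a delimited text line. It is plain Java so that it can be tried outside a cluster; the class name RowLineFormatter and its methods are hypothetical and not part of the node. Inside the snippet, one would presumably apply similar logic to each element of rowRDD (for example with map) and then persist the result with saveAsTextFile, as hinted in the comments.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of the KNIME node: formats the values of one
// row as a delimited line. In the snippet body, similar logic could be applied
// per row, e.g. rowRDD.map(...) followed by saveAsTextFile(outputPath)
// (assumption: the exact snippet structure depends on the provided templates).
public class RowLineFormatter {

    /** Joins the values of one row into a single delimited text line. */
    public static String toDelimitedLine(List<?> values, String delimiter) {
        StringJoiner line = new StringJoiner(delimiter);
        for (Object value : values) {
            line.add(escape(value, delimiter));
        }
        return line.toString();
    }

    /** Renders null as an empty field and quotes values that would break the line format. */
    public static String escape(Object value, String delimiter) {
        if (value == null) {
            return "";
        }
        String text = value.toString();
        boolean needsQuotes = text.contains(delimiter)
                || text.contains("\"")
                || text.contains("\n");
        if (!needsQuotes) {
            return text;
        }
        // Double embedded quotes, then wrap the whole value in quotes.
        return "\"" + text.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        // prints: 1,"a,b",,x
        System.out.println(toDelimitedLine(Arrays.asList(1, "a,b", null, "x"), ","));
    }
}
```

Keeping the formatting in a small static method makes it easy to check locally before running it as part of a Spark job.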
Flow variables:
You can access input flow variables by defining them in the Input table.
To define a flow variable, simply double-click the variable in the "Flow Variable List".
Press Ctrl+Space to open an auto-completion box with all available classes, methods and fields. When you select a class and press Enter, an import statement is generated if it is missing.
Note that the snippet lets you define custom global variables and custom imports. To view the hidden editor parts, simply click the plus symbols in the editor.
Allows you to add additional jar files to the Java snippet class path.
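A flow variable is often used to parameterize where the RDD is written. The following sketch, under the assumption that a base directory and a run name arrive as string flow variables, shows one way to combine them into an output path. The class OutputPathBuilder and the example values are hypothetical; how the flow variable is bound to a Java field is defined in the Input table of the node.

```java
// Hypothetical helper, not part of the KNIME node: builds an output path from
// two string values that, in the snippet, would come from flow variables.
public class OutputPathBuilder {

    /** Joins a base directory and a name with exactly one slash between them. */
    public static String build(String baseDir, String name) {
        if (baseDir == null || baseDir.trim().isEmpty()) {
            throw new IllegalArgumentException("base directory must not be empty");
        }
        if (name == null) {
            throw new IllegalArgumentException("name must not be null");
        }
        // Drop trailing slashes from the base and leading slashes from the name.
        String base = baseDir.trim().replaceAll("/+$", "");
        String child = name.trim().replaceAll("^/+", "");
        if (child.isEmpty()) {
            throw new IllegalArgumentException("name must not be empty");
        }
        return base + "/" + child;
    }

    public static void main(String[] args) {
        // prints: hdfs:///tmp/export/run_42
        System.out.println(build("hdfs:///tmp/export/", "/run_42"));
    }
}
```

Validating the values before the Spark job writes anything helps to catch an unset flow variable early.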
The libraries you use must be present on your cluster and added to the class path of your Spark job server.
They are not uploaded automatically!
Provides predefined templates and allows you to define new reusable templates by saving the current snippet state.
To use this node in KNIME, install KNIME Extension for Apache Spark from the following update site:
A zipped version of the software site can be downloaded here.