DL4J Feedforward Learner (legacy)

This Node Is Deprecated: This node is kept for backwards compatibility, but its use in new workflows is no longer recommended. The documentation below might contain more information.

This node trains the network configuration specified by the Deep Learning Model. The model can be trained in supervised or unsupervised mode using several training methods, such as Stochastic Gradient Descent. The output layer of the network, which can be configured in the node dialog, is added automatically by this node. The node also provides options for regularization, gradient normalization and learning refinements. Before training, the inputs are automatically converted into a vector format that the network can process.

There are two cases for the model input. If the supplied model is untrained, the learner trains it from scratch. If the model was trained by a previous learner, the node tries to reuse the network parameters of the trained model to initialise the parameters of the new network for the new training run. It can only try, because the network configuration may have been changed between learner nodes. This way, methods like Transfer Learning can be implemented. The output of the node is a trained Deep Learning Model containing the original configuration and the tuned network weights and biases.
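The reuse of parameters from a previously trained model can be pictured with a small sketch. It is not the node's actual implementation: the matching rule (same parameter name and same number of values), the function `init_from_pretrained` and the parameter names `layer0_W` and `layer1_W` are assumptions made up for illustration.

```python
def init_from_pretrained(new_params, old_params):
    """Initialise new_params from old_params where name and size match.

    Hypothetical matching rule: a parameter is reused only if the trained
    model has an entry with the same name and the same number of values.
    """
    result = {}
    reused = []
    for name, values in new_params.items():
        old = old_params.get(name)
        if old is not None and len(old) == len(values):
            result[name] = list(old)      # take over the trained values
            reused.append(name)
        else:
            result[name] = list(values)   # keep the fresh initialisation
    return result, reused


# The second layer changed size between learner runs, so only the first is reused.
old = {"layer0_W": [0.3, -0.2, 0.5, 0.1], "layer1_W": [0.7, 0.9]}
new = {"layer0_W": [0.0, 0.0, 0.0, 0.0], "layer1_W": [0.0, 0.0, 0.0]}
params, reused = init_from_pretrained(new, old)
```

In this sketch a layer whose shape changed falls back to its fresh initialisation, which is one plausible reading of why the node can only "try" to reuse parameters.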

The KNIME Deeplearning4J Integration has been marked as legacy with KNIME Analytics Platform 5.0 and will be deprecated in a future version. If you are using this extension in a production workflow, consider switching to one of the other deep learning integrations available in KNIME Analytics Platform.

Options

Learning Parameters

Training Mode

Whether to do supervised or unsupervised training.

SUPERVISED - label column needs to be specified
UNSUPERVISED - label column can be omitted

Use Seed

Whether to use a seed value for training. A seed makes different learning runs comparable: if the same seed is used and the configuration has not changed, the results will be the same between learning runs.

Seed

The seed value to use. Any integer may be used.
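The effect of a fixed seed can be illustrated with Python's standard random number generator. This is a generic sketch and not the node's code: `init_weights` and the value range are assumptions chosen for the example.

```python
import random


def init_weights(n, seed=None):
    """Draw n pseudo-random initial weights; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)
    return [rng.uniform(-0.1, 0.1) for _ in range(n)]


run_a = init_weights(4, seed=42)
run_b = init_weights(4, seed=42)  # same seed: identical starting weights
run_c = init_weights(4, seed=7)   # different seed: different starting weights
```

Two runs that start from the same seed draw the same random numbers, which is what makes their results comparable.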

Number of Training Iterations

The number of parameter updates that will be done on one batch of input data.
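As a rough worked example, the total number of parameter updates in a training run can be estimated from this option together with the Batch Size and Epochs options described under Data Parameters. The sketch assumes that every batch of every epoch receives the configured number of iterations; the function name and the figures are illustrative only.

```python
import math


def total_parameter_updates(num_rows, batch_size, epochs, iterations):
    """Estimate the number of updates, assuming `iterations` updates per batch."""
    batches_per_epoch = math.ceil(num_rows / batch_size)  # the last batch may be smaller
    return epochs * batches_per_epoch * iterations


# 1000 rows in batches of 32 give 32 batches per epoch.
updates = total_parameter_updates(num_rows=1000, batch_size=32, epochs=5, iterations=2)
```

With these numbers the estimate is 5 epochs x 32 batches x 2 iterations = 320 updates.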

Optimization Algorithm

The type of optimization method to use. The following algorithms are available:

LINE_GRADIENT_DESCENT - normal gradient descent
CONJUGATE_GRADIENT
HESSIAN_FREE
LBFGS
STOCHASTIC_GRADIENT_DESCENT - gradient descent using minibatches

Do Backpropagation

Whether to do backpropagation. If this option is chosen, the learner will perform supervised training using the specified techniques and hyperparameters.

Do Pretraining

Whether to do pretraining. If this option is chosen, the learner will perform unsupervised pretraining (Contrastive Divergence) of the network parameters. This option is only applicable to Restricted Boltzmann Machines and Autoencoders.

Do Finetuning

Whether to do finetuning. If this option is chosen, the learner will perform supervised finetuning of the network parameters.

Use Pretrained Updater

Whether to use the pretrained updater of a trained model. Some updaters keep a history of previous gradients, so this option specifies whether a supplied updater should be reused or a new one should be created. This option only takes effect if the Deep Learning Model supplied at the input was previously trained and contains a saved updater.

Updater Type

The type of updater to use. The updater specifies how the raw gradients are modified. If a pretrained updater is used, this option is ignored. The following methods are available:

SGD
ADAM
ADADELTA
NESTEROVS
ADAGRAD
RMSPROP
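To show what "modifying the raw gradients" can mean, and why some updaters carry a history worth reusing, here is a minimal textbook-style sketch of plain SGD next to AdaGrad. It is not the DL4J implementation; the class names, the `step` method and the epsilon value are assumptions for illustration.

```python
import math


class SgdUpdater:
    """Stateless: scales the raw gradient by the learning rate."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def step(self, grad):
        return [self.learning_rate * g for g in grad]


class AdagradUpdater:
    """Stateful: keeps a running sum of squared gradients per parameter."""

    def __init__(self, learning_rate, epsilon=1e-8):
        self.learning_rate = learning_rate
        self.epsilon = epsilon
        self.history = None  # this is the state a saved updater would carry

    def step(self, grad):
        if self.history is None:
            self.history = [0.0] * len(grad)
        self.history = [h + g * g for h, g in zip(self.history, grad)]
        return [
            self.learning_rate * g / (math.sqrt(h) + self.epsilon)
            for g, h in zip(grad, self.history)
        ]


sgd_step = SgdUpdater(0.1).step([2.0])
adagrad = AdagradUpdater(0.1)
first = adagrad.step([2.0])
second = adagrad.step([2.0])  # same gradient, smaller step because of the history
```

Because the AdaGrad step depends on the accumulated history, continuing training with a fresh updater behaves differently from continuing with the saved one.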

Use Regularization

Whether to use regularization techniques to prevent overfitting.

L1 Regularization Coefficient

Strength of L1 regularization.

L2 Regularization Coefficient

Strength of L2 regularization.
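The two coefficients can be read as weights on penalty terms that are added to the training loss. The sketch below uses a common textbook form (L1 on absolute values, L2 on half the squared values); the exact scaling used by the underlying library is an assumption here, and `regularization_penalty` is a made-up helper.

```python
def regularization_penalty(weights, l1=0.0, l2=0.0):
    """Penalty added to the loss: l1 * sum(|w|) + 0.5 * l2 * sum(w^2)."""
    l1_term = l1 * sum(abs(w) for w in weights)
    l2_term = 0.5 * l2 * sum(w * w for w in weights)
    return l1_term + l2_term


weights = [1.0, -2.0, 3.0]
penalty = regularization_penalty(weights, l1=0.1, l2=0.01)
```

For these weights the L1 term is 0.1 x 6 = 0.6 and the L2 term is 0.5 x 0.01 x 14 = 0.07, so larger coefficients penalise large weights more strongly.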

Use Gradient Normalization

Whether to use gradient normalization.

Gradient Normalization Strategy

The gradient normalization strategy to use. Strategies are applied to the raw gradients before they are passed to the updater. An explanation can be found at:
http://deeplearning4j.org/doc/org/deeplearning4j/nn/conf/GradientNormalization.html

RenormalizeL2PerLayer
RenormalizeL2PerParamType
ClipElementWiseAbsoluteValue
ClipL2PerLayer
ClipL2PerParamType

Gradient Normalization Threshold

Threshold value for gradient normalization.
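The strategies fall into three families, sketched below on a flat list of gradient values. This is a simplified illustration rather than the library code: the function names are made up, and the difference between the PerLayer and PerParamType variants (which set of gradients the norm is computed over) is left out.

```python
import math


def l2_norm(grad):
    return math.sqrt(sum(g * g for g in grad))


def clip_elementwise_absolute_value(grad, threshold):
    """Limit every gradient value to the range [-threshold, threshold]."""
    return [max(-threshold, min(threshold, g)) for g in grad]


def clip_l2(grad, threshold):
    """Rescale the gradient only if its L2 norm exceeds the threshold."""
    norm = l2_norm(grad)
    if norm <= threshold:
        return list(grad)
    return [g * threshold / norm for g in grad]


def renormalize_l2(grad):
    """Always rescale the gradient to unit L2 norm (no threshold involved)."""
    norm = l2_norm(grad)
    if norm == 0.0:
        return list(grad)
    return [g / norm for g in grad]


clipped = clip_elementwise_absolute_value([0.5, -3.0, 2.0], 1.0)
clipped_norm = clip_l2([3.0, 4.0], 1.0)
renormalized = renormalize_l2([3.0, 4.0])
```

In this reading, the threshold matters for the clipping strategies, while the renormalizing strategies divide by the norm regardless of its size.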

Use Momentum

Whether to use momentum.

Momentum Rate

Rate of influence of the momentum term.

Momentum After

Schedule for momentum value change during training. This is specified in the following format:
'iteration':'momentum rate','iteration':'momentum rate' ...
This creates a map, which maps the iteration to the momentum rate that should be used. E.g. '2:0.8' means that the rate '0.8' should be used in iteration '2'. Leave empty if you do not want to use a schedule.
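The schedule format can be made concrete with a small parser. This is only a sketch: the helper names are hypothetical, the tolerance for surrounding quotes is an assumption, and so is the rule that a scheduled rate stays in effect until the next scheduled iteration.

```python
def parse_momentum_schedule(text):
    """Parse 'iteration:rate,iteration:rate' into a dict {iteration: rate}."""
    schedule = {}
    text = text.strip()
    if not text:
        return schedule  # empty field: no schedule
    for entry in text.split(","):
        iteration, rate = entry.split(":")
        # Quotes around the values are tolerated and removed.
        schedule[int(iteration.strip().strip("'"))] = float(rate.strip().strip("'"))
    return schedule


def momentum_at(iteration, base_rate, schedule):
    """Return the rate for an iteration, assuming each entry holds until the next one."""
    rate = base_rate
    for start in sorted(schedule):
        if iteration >= start:
            rate = schedule[start]
    return rate


schedule = parse_momentum_schedule("2:0.8,10:0.9")
```

With this schedule and a Momentum Rate of 0.5, the sketch uses 0.5 before iteration 2, 0.8 from iteration 2, and 0.9 from iteration 10 onwards.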

Use Drop Connect

Whether to use Drop Connect.

Global Parameters

Use Global Learning Rate: Whether to override the learning rates specified in the individual layers of the network with one value for all layers.
Global Learning Rate: The learning rate to use for all layers.
Use Global Drop Out Rate: Whether to override the drop out rates specified in the individual layers of the network with one value for all layers.
Global Drop Out Rate: The drop out rate to use for all layers.
Use Global Weight Initialization Strategy: Whether to override the weight initialization strategies specified in the individual layers of the network with one strategy for all layers.
Global Weight Initialization Strategy: The weight initialization strategy to use for all layers.

Data Parameters

Batch Size: The number of examples used for one minibatch.
Epochs: The number of epochs to train the network, i.e. the number of full passes over the whole data set.
Size of Input Image: If the input table contains images, the dimensionality of the images needs to be specified. The value must be three numbers separated by commas that give the dimension sizes of the images (size x,size y,number of channels). E.g. 64,64,3
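The image size format can be checked with a few lines of code. The helper `parse_image_size` is hypothetical, and the final multiplication only illustrates how many values one image of that size contains.

```python
def parse_image_size(text):
    """Parse 'size x,size y,number of channels' into a tuple of three positive ints."""
    parts = [int(p.strip()) for p in text.split(",")]
    if len(parts) != 3 or any(p <= 0 for p in parts):
        raise ValueError("expected three positive integers, e.g. 64,64,3")
    size_x, size_y, channels = parts
    return size_x, size_y, channels


size_x, size_y, channels = parse_image_size("64,64,3")
values_per_image = size_x * size_y * channels  # 64 * 64 * 3
```

A 64 x 64 image with 3 channels therefore corresponds to 12288 values per example.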

Column Selection

Label Column: The column of the input table containing labels for supervised learning.
Input Column Selection: The columns of the input table containing the training data for the network.

Output Layer Parameter

Number of Output Units: The number of outputs for this layer. For supervised training this value is determined automatically, hence it is not possible to set it. For unsupervised training this value specifies the number of neurons in the output layer.
Learning Rate: The learning rate that should be used for this layer.
Weight Initialization Strategy: The strategy which will be used to set the initial weights for this layer.
Loss Function: The type of loss function that should be used for this layer.
Activation Function: The type of activation function that should be used for this layer.
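For supervised training, one plausible way to derive the number of output units is to count the distinct values in the label column and encode each label as a one-hot vector. The sketch below shows that idea; it is an assumption about the mechanism and not taken from the node's source, and `one_hot_encode` is a made-up helper.

```python
def one_hot_encode(labels):
    """Map each label to a one-hot vector over the sorted distinct labels."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    vectors = []
    for label in labels:
        vector = [0.0] * len(classes)
        vector[index[label]] = 1.0
        vectors.append(vector)
    return classes, vectors


classes, targets = one_hot_encode(["cat", "dog", "cat", "bird"])
num_output_units = len(classes)  # one output unit per distinct label
```

Here three distinct labels lead to three output units, and the first row ("cat") is encoded as [0.0, 1.0, 0.0].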

Input Ports

Finished configuration of a deep learning network.
Data table containing training data.

Output Ports

Trained Deep Learning Model

Views

Learning Status: Shows information about the current learning run and offers an option to stop training early. If training is stopped before the last epoch, the model is saved in its current state.

Workflows

09_Simple_Anomaly_Detection_Using_A_Convolutional_Net (KNIME Hub)

Installation

To use this node in KNIME, install the extension KNIME Deeplearning4J Integration (64bit only) (legacy) from the v5.5 update site, following the NodePit Product and Node Installation Guide. A zipped version of the software site is also available for download.

Plugin provider: KNIME AG, Zurich, Switzerland

Plugin version: 5.5.0.v202502211431

On NodePit since: 2025-07-02

Last update: 2025-08-08

Tags: Deprecated

KNIME versions: Since v3.6
