Performs a grid search of parameter pairs for a classifier (Y-axis, default is LinearRegression with the "Ridge" parameter) and the PLSFilter (X-axis, "# of Components") and chooses the best pair found for the actual prediction.
The initial grid is evaluated with 2-fold CV to determine the values of the parameter pairs for the selected type of evaluation (e.g., accuracy). The best point in the grid is then taken and a 10-fold CV is performed with the adjacent parameter pairs. If a better pair is found, it becomes the new center and another 10-fold CV is performed (a form of hill climbing). This process is repeated until no better pair is found or the best pair is on the border of the grid.
If the best pair is on the border, GridSearch can automatically extend the grid and continue the search. See the properties 'gridIsExtendable' (option '-extend-grid') and 'maxGridExtensions' (option '-max-grid-extensions <num>').
GridSearch can handle doubles, integers (values are just cast to int) and booleans (0 is false, otherwise true). The types float, char and long are supported as well.
The best filter/classifier setup can be accessed after the buildClassifier call via the getBestFilter/getBestClassifier methods.
Note on the implementation: after the data has been passed through the filter, a default NumericCleaner filter is applied to the data in order to avoid numbers that become too small and might produce NaNs in other schemes.
(based on WEKA 3.6)
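The search procedure described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Weka's implementation: the function name grid_hill_climb and the two scoring callbacks are hypothetical, the callbacks stand in for the 2-fold and 10-fold cross-validation runs, and the sketch assumes that lower scores are better (as for an error measure). Grid extension is left out; the function only reports whether the search stopped on the border.

```python
from typing import Callable, Dict, Tuple

Point = Tuple[int, int]  # (x index, y index) into the parameter grid


def grid_hill_climb(
    n_x: int,
    n_y: int,
    coarse_score: Callable[[Point], float],
    fine_score: Callable[[Point], float],
) -> Tuple[Point, bool]:
    """Return (best grid point, whether it lies on the grid border).

    Assumption for this sketch: lower scores are better.
    """
    # Phase 1: score every grid point with the cheap evaluation
    # (stand-in for the 2-fold CV over the initial grid).
    points = [(x, y) for x in range(n_x) for y in range(n_y)]
    center = min(points, key=coarse_score)

    cache: Dict[Point, float] = {}

    def fine(p: Point) -> float:
        # Stand-in for the 10-fold CV; cached so each pair is scored once.
        if p not in cache:
            cache[p] = fine_score(p)
        return cache[p]

    # Phase 2: hill climbing over adjacent parameter pairs.
    while True:
        cx, cy = center
        if cx in (0, n_x - 1) or cy in (0, n_y - 1):
            # Best pair is on the border: stop (a caller could extend the grid).
            return center, True
        # The center is not on the border, so all 8 neighbours are inside the grid.
        neighbours = [
            (cx + dx, cy + dy)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        ]
        best = min(neighbours, key=fine)
        if fine(best) >= fine(center):
            # No strictly better adjacent pair: the center is the result.
            return center, False
        center = best
```

Requiring a strict improvement before moving the center guarantees that the loop terminates on a finite grid.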
For further options, click the 'More' button in the dialog.
All Weka dialogs have a panel where you can specify classifier-specific parameters.
The Preliminary Attribute Check tests the underlying classifier against the DataTable specification at the inport of the node. Columns that are compatible with the classifier are marked with a green 'ok'. Columns which are potentially not compatible are assigned a red error message.
Important: If a column is marked as 'incompatible', it does not necessarily mean that the classifier cannot be executed! Sometimes, the error message 'Cannot handle String class' simply means that no nominal values are available (yet). This may change during execution of the predecessor nodes.
Capabilities: [Numeric attributes, Date attributes, Missing values, Numeric class, Date class, Missing class values]
Dependencies: [Nominal attributes, Binary attributes, Unary attributes, Empty nominal attributes, Numeric attributes, Date attributes, String attributes, Relational attributes, Missing values, No class, Nominal class, Binary class, Unary class, Empty nominal class, Numeric class, Date class, String class, Relational class, Missing class values, Only multi-Instance data]
min # Instance: 1
E: Determines the parameter used for evaluation (default: CC):
  CC = Correlation coefficient
  RMSE = Root mean squared error
  RRSE = Root relative squared error
  MAE = Mean absolute error
  RAE = Relative absolute error
  COMB = Combined = (1-abs(CC)) + RRSE + RAE
  ACC = Accuracy
  KAP = Kappa
y-property: The Y option to test (without leading dash). (default: classifier.ridge)
y-min: The minimum for Y. (default: -10)
y-max: The maximum for Y. (default: +5)
y-step: The step size for Y. (default: 1)
y-base: The base for Y. (default: 10)
y-expression: The expression for Y. Available parameters: BASE, FROM, TO, STEP, and I (the current iteration value, running from 'FROM' to 'TO' with step size 'STEP'). (default: 'pow(BASE,I)')
filter: The filter to use (on X axis). Full classname of filter to include, followed by scheme options. (default: weka.filters.supervised.attribute.PLSFilter)
x-property: The X option to test (without leading dash). (default: filter.numComponents)
x-min: The minimum for X. (default: +5)
x-max: The maximum for X. (default: +20)
x-step: The step size for X. (default: 1)
x-base: The base for X. (default: 10)
x-expression: The expression for the X value. Available parameters: BASE, MIN, MAX, STEP, and I (the current iteration value, running from 'FROM' to 'TO' with step size 'STEP'). (default: 'pow(BASE,I)')
extend-grid: Whether the grid can be extended. (default: no)
max-grid-extensions: The maximum number of grid extensions (-1 is unlimited). (default: 3)
sample-size: The size (in percent) of the sample to search the initial grid with. (default: 100)
traversal: The type of traversal for the grid. (default: COLUMN-WISE)
log-file: The log file to log the messages to. (default: none)
S: Random number seed. (default: 1)
D: If set, classifier is run in debug mode and may output additional info to the console
W: Full name of base classifier. (default: weka.classifiers.functions.LinearRegression)
D: Produce debugging output. (default no debugging output)
S: Set the attribute selection method to use. 1 = None, 2 = Greedy. (default 0 = M5' method)
C: Do not try to eliminate colinear attributes.
R: Set ridge parameter (default 1.0e-8).
D: Turns on output of debugging information.
C: The number of components to compute. (default: 20)
U: Updates the class attribute as well. (default: off)
M: Turns replacing of missing values on. (default: off)
A: The algorithm to use. (default: PLS1)
P: The type of preprocessing that is applied to the data. (default: center)
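To make the grid options above more concrete, the following Python sketch evaluates the expression 'pow(BASE,I)' for the Y-axis defaults (y-min -10, y-max 5, y-step 1, y-base 10), shows the int and boolean conversions mentioned in the description, and computes the COMB measure exactly as its formula is written. All function names are hypothetical helpers for illustration and are not part of Weka. The sketch also assumes that CC, RRSE and RAE are passed in whatever units the caller uses.

```python
import math
from typing import List


def axis_values(minimum: float, maximum: float, step: float, base: float) -> List[float]:
    """Evaluate pow(BASE, I) for I = minimum, minimum + step, ..., maximum."""
    # Small tolerance so that floating-point step sizes still reach the maximum.
    count = int(math.floor((maximum - minimum) / step + 1e-9)) + 1
    return [math.pow(base, minimum + i * step) for i in range(count)]


def to_int(value: float) -> int:
    """Integer properties: the grid value is simply cast (truncated toward zero)."""
    return int(value)


def to_bool(value: float) -> bool:
    """Boolean properties: 0 is false, anything else is true."""
    return value != 0


def combined_score(cc: float, rrse: float, rae: float) -> float:
    """COMB = (1 - abs(CC)) + RRSE + RAE, taken literally from the option text."""
    return (1 - abs(cc)) + rrse + rae


# Ridge values tested along the Y axis with the default settings:
# 16 values, one per exponent from -10 to 5.
ridge_values = axis_values(-10, 5, 1, 10)
```

With these defaults the ridge parameter is searched on a logarithmic scale, which is why the Y axis is defined through exponents rather than through the ridge values themselves.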