Shapley Values originated in game theory and, in the context of machine learning, have recently become a popular tool for explaining model predictions. The Shapley Value of a feature for a given row and prediction indicates how much that feature contributed to the deviation of the prediction from the base prediction (i.e. the mean prediction over the full sampling data). In theory, the Shapley Values of all features add up to the difference between the actual prediction and the mean prediction, but this loop only produces approximations because calculating exact Shapley Values is typically infeasible.
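As a rough sketch of this property (the notation below is ours, not the nodes'): writing f for the model, x for the row of interest, x_1, ..., x_N for the rows of the sampling table, and phi_j for the Shapley Value of feature j,

\[
\phi_0 = \frac{1}{N}\sum_{i=1}^{N} f(x_i), \qquad f(x) - \phi_0 \;\approx\; \sum_{j=1}^{p} \phi_j ,
\]

where the approximation would be an equality for exact Shapley Values.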
A typical Shapley Values loop consists of only three nodes: the Shapley Values Loop Start node, the predictor node for the model you want to explain (e.g. a Random Forest Predictor node), and the Shapley Values Loop End node.
For each row in the ROI (Row of Interest) table, the Shapley Values Loop Start node creates a number of perturbed rows, i.e. rows in which some of the features are randomly exchanged with the features of rows from the sampling table (for the exact details we refer to Algorithm 1 in the paper Explaining prediction models and individual predictions with feature contributions by Strumbelj and Kononenko). Your task is to obtain predictions for these perturbed rows, usually via the Predictor node corresponding to your model. The Shapley Values Loop End node collects these predictions and calculates an approximation of the Shapley Values for each combination of feature and target.
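For illustration, the following is a minimal Python sketch of this kind of permutation-sampling approximation. It mirrors the idea of Algorithm 1 but is not the nodes' actual implementation; the function name approximate_shapley, the predict callable, and the use of NumPy arrays are our own assumptions.

import numpy as np

def approximate_shapley(predict, x, sampling_data, iterations=1000, seed=None):
    """Approximate the Shapley Value of every feature of the row x.

    predict       -- callable mapping a 2D array of rows to a 1D array of predictions
    x             -- 1D array, the row of interest (ROI)
    sampling_data -- 2D array from which replacement rows are drawn
    """
    rng = np.random.default_rng(seed)
    n_features = x.shape[0]
    phi = np.zeros(n_features)
    for _ in range(iterations):
        order = rng.permutation(n_features)                    # random feature order
        z = sampling_data[rng.integers(len(sampling_data))]    # random sampling-table row
        for pos, j in enumerate(order):
            # Perturbed rows: features up to (and including/excluding) j in the
            # permutation come from x, the remaining features from z.
            mask_with = np.isin(np.arange(n_features), order[:pos + 1])
            mask_without = np.isin(np.arange(n_features), order[:pos])
            with_j = np.where(mask_with, x, z)
            without_j = np.where(mask_without, x, z)
            # Marginal contribution of feature j for this permutation and row.
            phi[j] += predict(with_j[None, :])[0] - predict(without_j[None, :])[0]
    return phi / iterations

For a scikit-learn regressor, predict could be model.predict; the sum of the returned values then approximates the difference between the prediction for x and the mean prediction over sampling_data, matching the property described above.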
These nodes support collection and vector columns such as List, Bit Vector, and Byte Vector columns, in which case each element of the collection/vector can be treated as an individual feature. Note that this requires all collections/vectors in a single column to have the same length, i.e. contain the same number of elements. It is also possible to treat collections and vectors as single features, in which case the respective option has to be set in the dialog.
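As a small illustration (purely hypothetical data and column names, using pandas rather than KNIME), treating each element of a fixed-length vector column as an individual feature corresponds to an expansion like this:

import pandas as pd

# A column of fixed-length lists: every cell must contain the same
# number of elements for the element-wise treatment to work.
df = pd.DataFrame({"fingerprint": [[0, 1, 1], [1, 0, 1]]})

# Expand into one feature column per element, e.g. fingerprint[0] ... fingerprint[2].
expanded = pd.DataFrame(df["fingerprint"].tolist(),
                        columns=[f"fingerprint[{i}]" for i in range(3)])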
To use this node in KNIME, install the KNIME Machine Learning Interpretability Extension from the corresponding update site, following the NodePit Product and Node Installation Guide.