Calculates a correlation coefficient for each pair of selected columns, i.e. a measure of the correlation between the two variables.
All measures are based on the ranks of the cell values, where the rank of a
cell value is its position in a sorted list of all entries.
All correlations can be calculated on any kind of DataColumn. Note, however,
that the default ordering of the values is used. If no ordering is
defined for the column, the string representation of the values is
used instead.
The node uses fractional ranks for equal values, i.e. tied values each receive the mean of the positions they occupy.
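As an illustration of fractional ranking, the following is a minimal Python sketch and not the node's actual implementation. The function name `fractional_ranks` is hypothetical. It relies on Python's natural ordering of the values, which stands in for the column's default ordering.

```python
def fractional_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    # Indices of the values in ascending order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Mean of the 1-based positions start+1 .. end+1.
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


print(fractional_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

The two tied values occupy positions 2 and 3 in the sorted list, so both receive the rank 2.5.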
Spearman's rank correlation coefficient
is a statistical measure of the strength of a monotonic relationship
between paired data. A monotonic relationship is one that preserves
the given order between ordered sets, i.e. the dependent variable
either never decreases or never increases as its independent
variable increases.
The value of this measure ranges
from −1 (strong negative correlation) to +1 (strong positive
correlation). A perfect Spearman correlation of +1 or −1 occurs when
each of the variables is a perfect monotone function of the other.
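One common way to compute Spearman's coefficient is to take the Pearson correlation of the fractional ranks. The Python sketch below follows that approach. It is an assumption about the general technique, not a description of the node's internals, and the helper names are hypothetical.

```python
from math import sqrt


def fractional_ranks(values):
    """1-based ranks; ties get the mean of their positions (simple O(n^2) version)."""
    s = sorted(values)
    # First 1-based position of v, shifted to the middle of its run of ties.
    return [s.index(v) + 1 + (s.count(v) - 1) / 2 for v in values]


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(fractional_ranks(x), fractional_ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # monotone but non-linear in x
print(spearman(x, y))         # close to +1
print(spearman(x, y[::-1]))   # close to -1
```

Because only the ranks enter the calculation, the cubic relationship in the example still yields a perfect correlation.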
For Spearman's rank correlation coefficient, the p-value and the degrees
of freedom are also computed. The p-value is the probability that an
uncorrelated system produces a correlation at least
as extreme as the observed one, assuming that the correlation has mean zero and
follows a t-distribution with df degrees of freedom.
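The usual textbook form of this test converts the coefficient into a t-statistic with n − 2 degrees of freedom, where n is the number of rows used. The sketch below assumes that form. Whether the node uses exactly df = n − 2 is an assumption, and `spearman_t_statistic` is a hypothetical name.

```python
from math import copysign, inf, sqrt


def spearman_t_statistic(rho, n):
    """Return (t, df) for a correlation rho computed from n paired observations."""
    df = n - 2  # assumed degrees of freedom
    if abs(rho) >= 1.0:
        # A perfect correlation gives an unbounded statistic.
        return copysign(inf, rho), df
    t = rho * sqrt(df / (1.0 - rho * rho))
    return t, df


t, df = spearman_t_statistic(0.6, 10)
print(t, df)
```

The two-sided p-value is then the probability that a t-distributed variable with `df` degrees of freedom exceeds `abs(t)` in magnitude. The Python standard library has no t-distribution, so one option is `2 * scipy.stats.t.sf(abs(t), df)` from SciPy.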
Goodman and Kruskal's gamma
as well as
Kendall's tau rank correlation coefficient
are used to measure the strength of association between two measured
quantities. Both are based on the number of concordant and
discordant pairs. Kendall's Tau A and Tau B coefficients can be
considered standardized forms of gamma. The difference between
them is that Tau A does not account for tied
values, while Tau B adjusts for them. Tied observations
are two or more observations that have the same value. Both
Goodman and Kruskal's gamma and Kendall's Tau A are mostly suitable for square
tables, whereas Tau B is most appropriately used for rectangular
tables. The coefficients range from −1 (100% negative
association, or perfect inversion) to +1 (100% positive association,
or perfect agreement). A value of zero indicates the absence of
association.
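To make the role of concordant and discordant pairs concrete, here is a small Python sketch using the standard textbook definitions of gamma, Tau A and Tau B. It is not taken from the node's source code, all function names are hypothetical, and degenerate inputs (for example a constant column) are not handled.

```python
from itertools import combinations
from math import sqrt


def sign(a, b):
    """Return +1, -1 or 0 depending on how a compares to b."""
    return (a > b) - (a < b)


def pair_counts(x, y):
    """Count concordant, discordant and tied pairs over all pairs of rows."""
    concordant = discordant = tied_x = tied_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sx, sy = sign(x1, x2), sign(y1, y2)
        if sx == 0:
            tied_x += 1  # pair is tied in the first variable
        if sy == 0:
            tied_y += 1  # pair is tied in the second variable
        if sx * sy > 0:
            concordant += 1  # both variables move in the same direction
        elif sx * sy < 0:
            discordant += 1  # the variables move in opposite directions
    return concordant, discordant, tied_x, tied_y


def gamma_and_tau(x, y):
    """Return (gamma, tau_a, tau_b) for two equally long sequences."""
    c, d, tx, ty = pair_counts(x, y)
    n0 = len(x) * (len(x) - 1) // 2  # total number of pairs
    gamma = (c - d) / (c + d)        # ties are left out entirely
    tau_a = (c - d) / n0             # no adjustment for ties
    tau_b = (c - d) / sqrt((n0 - tx) * (n0 - ty))  # adjusted for ties
    return gamma, tau_a, tau_b


x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 4, 4]  # one discordant pair and one pair tied in y
print(pair_counts(x, y))    # (8, 1, 0, 1)
print(gamma_and_tau(x, y))
```

All three coefficients share the numerator C − D and differ only in the denominator, which is why Tau A and Tau B can be read as differently standardized versions of gamma.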
Rows containing missing values are ignored, i.e. they are not used in the calculations. If you need different behavior, handle the missing values before this node.
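The effect of this rule can be sketched in Python as follows. The representation is an assumption made for illustration: each row is a tuple and a missing value is written as `None`. The function name `drop_rows_with_missing` is hypothetical.

```python
def drop_rows_with_missing(rows):
    """Keep only the rows in which every cell has a value."""
    return [row for row in rows if all(value is not None for value in row)]


rows = [(1.0, 2.0), (None, 3.0), (4.0, None), (5.0, 6.0)]
complete = drop_rows_with_missing(rows)
print(complete)  # [(1.0, 2.0), (5.0, 6.0)]
```

A row is dropped as soon as one of its cells is missing, so the number of rows entering the correlation can be smaller than the number of rows in the input table.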
To use this node in KNIME, install the extension KNIME Statistics Nodes.