Rank Correlation

Calculates rank-based correlation coefficients between selected pairs of columns to measure the strength and direction of monotonic relationships.

This node supports several correlation methods based on ranked data:

  • Spearman’s rank correlation: Measures the strength of a monotonic relationship between two variables. Values range from -1 (strong negative correlation) to +1 (strong positive correlation), with associated p-values and degrees of freedom computed.
  • Goodman and Kruskal’s gamma: Assesses association based on the numbers of concordant and discordant pairs.
  • Kendall’s tau: Includes Tau A and Tau B variants. Tau A makes no adjustment for tied values and is mostly suitable for square tables, while Tau B adjusts for ties and is more appropriate for rectangular tables.

The node uses fractional ranking for tied values. If column values lack a natural order, string representations are used for sorting. Columns of any data type can be analyzed, but results depend on the default ordering.
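Fractional ranking assigns tied values the mean of the positions they occupy in the sorted order. The following is a minimal sketch of that idea in plain Python; the function name `fractional_ranks` is hypothetical and is not part of the node.

```python
def fractional_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    # Indices of the values in ascending order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Mean of the 1-based positions start+1 .. end+1.
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


print(fractional_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
print(fractional_ranks(["b", "a", "b"]))   # [2.5, 1.0, 2.5]
```

The second call sorts strings lexicographically, which mirrors how values without a natural order would be ranked by their string representation.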

Rows with missing values are excluded from calculations. To apply different handling, address missing values beforehand.
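As an illustration of what excluding rows means for a single column pair, the sketch below drops every row in which either value is missing. It assumes that `None` marks a missing value and that exclusion happens per column pair; the text above does not state whether the node excludes rows per pair or across all selected columns. The helper name is hypothetical.

```python
def drop_incomplete_pairs(xs, ys):
    """Keep only the rows where both values are present (None marks a missing value)."""
    kept = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    return [x for x, _ in kept], [y for _, y in kept]


xs = [1.0, None, 3.0, 4.0]
ys = [2.0, 5.0, None, 8.0]
print(drop_incomplete_pairs(xs, ys))  # ([1.0, 4.0], [2.0, 8.0])
```

Replacing missing values before the node runs (for example with a column median) keeps those rows in the calculation instead.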

Options

Correlation type
Choose the type of correlation. Each coefficient ranges from −1 (100% negative association, or perfect inversion) to +1 (100% positive association, or perfect agreement). A value of zero indicates the absence of association.
  • Spearman's Rho: Spearman's rank correlation coefficient is a statistical measure of the strength of a monotonic relationship between paired data. A monotonic relationship preserves or reverses the order of the values, i.e., as one variable increases, the other either never decreases or never increases. The value of this measure ranges from −1 (strong negative correlation) to +1 (strong positive correlation). A perfect Spearman correlation of +1 or −1 occurs when each of the variables is a perfect monotone function of the other. For Spearman's rank correlation coefficient, the p-value and degrees of freedom (df) are also computed. The p-value is the probability that an uncorrelated system produces a correlation at least as extreme as the observed one, under the assumption that the correlation has mean zero and follows a t-distribution with df degrees of freedom.
  • Kendall's Tau A: Kendall's tau rank correlation coefficient measures the strength of association between two measured quantities. It is based on the number of concordant and discordant pairs and can be considered a standardized form of Gamma. The Tau A statistic makes no adjustment for tied values (two or more observations having the same value) and is mostly suitable for square tables.
  • Kendall's Tau B: Kendall's tau rank correlation coefficient measures the strength of association between two measured quantities. It is based on the number of concordant and discordant pairs and can be considered a standardized form of Gamma. The Tau B statistic adjusts for tied values (two or more observations having the same value) and is most appropriately used for rectangular tables.
  • Goodman and Kruskal's Gamma: Goodman and Kruskal's gamma is used to measure the strength of association between two measured quantities. It's based on the number of concordant and discordant pairs.
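The three pair-based measures differ only in how they normalize the difference between concordant and discordant pairs. The sketch below uses the standard textbook definitions and is not taken from the node's implementation; all function names are hypothetical, and inputs for which a denominator is zero are not handled.

```python
import math


def pair_counts(xs, ys):
    """Count concordant, discordant, and tied pairs over all i < j."""
    conc = disc = ties_x = ties_y = 0
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            dx = xs[i] - xs[j]
            dy = ys[i] - ys[j]
            if dx == 0 and dy == 0:
                continue          # tied in both variables
            elif dx == 0:
                ties_x += 1       # tied in x only
            elif dy == 0:
                ties_y += 1       # tied in y only
            elif dx * dy > 0:
                conc += 1         # both variables move in the same direction
            else:
                disc += 1         # the variables move in opposite directions
    return conc, disc, ties_x, ties_y


def tau_a(xs, ys):
    """(C - D) divided by the total number of pairs; no adjustment for ties."""
    c, d, _, _ = pair_counts(xs, ys)
    n = len(xs)
    return (c - d) / (n * (n - 1) / 2)


def tau_b(xs, ys):
    """(C - D) with a denominator that is adjusted for ties in x and in y."""
    c, d, tx, ty = pair_counts(xs, ys)
    return (c - d) / math.sqrt((c + d + tx) * (c + d + ty))


def gamma(xs, ys):
    """(C - D) / (C + D); tied pairs are left out entirely."""
    c, d, _, _ = pair_counts(xs, ys)
    return (c - d) / (c + d)


xs = [1, 2, 2, 3]
ys = [1, 3, 2, 4]
print(round(tau_a(xs, ys), 4))  # 0.8333
print(round(tau_b(xs, ys), 4))  # 0.9129
print(gamma(xs, ys))            # 1.0
```

In this example one pair is tied in `xs`, so Tau A stays below 1 even though no pair is discordant, Tau B partly compensates for the tie, and Gamma ignores it.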
Correlation columns
Select the columns for which correlation values should be computed.
Include only column pairs with a valid correlation
Check this option to include only those column pairs for which the correlation could be computed. All other column pairs are omitted from the output table.
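The sketch below illustrates one situation in which a correlation cannot be computed, namely a column without any variation, and how such pairs could be filtered out. This is an assumed example; the text does not list the cases the node treats as invalid, and the names `ranks` and `spearman` are hypothetical helpers.

```python
import math
from itertools import combinations


def ranks(values):
    """Fractional ranks: tied values get the mean of their 1-based positions."""
    ordered = sorted(values)
    # First 1-based position plus (count - 1) / 2 is the mean position of a tie group.
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def spearman(xs, ys):
    """Pearson correlation of the ranks, or None when a column has no variation."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return None  # the coefficient is undefined for a constant column
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return sxy / math.sqrt(sxx * syy)


columns = {"a": [1, 2, 3, 4], "b": [4, 3, 2, 1], "const": [7, 7, 7, 7]}
all_pairs = {(p, q): spearman(columns[p], columns[q]) for p, q in combinations(columns, 2)}
# Keep only the pairs for which a coefficient exists.
valid_pairs = {pair: rho for pair, rho in all_pairs.items() if rho is not None}
print(valid_pairs)  # {('a', 'b'): -1.0}
```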
p-value
Select which p-value should be computed for Spearman's rank correlation coefficient.
  • two-sided: Corresponds to the probability of obtaining a correlation value that is at least as extreme as the observed correlation.
  • one-sided (right): Corresponds to the probability of obtaining a correlation value that shows even greater positive association.
  • one-sided (left): Corresponds to the probability of obtaining a correlation value that shows even greater negative association.
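The three p-value variants can be sketched from a t-distributed test statistic. The example assumes the common textbook approach, t = ρ · sqrt(df / (1 − ρ²)) with df = n − 2; the text only states that a t-distribution with df degrees of freedom is used, so the exact formula in the node is an assumption here. The t-distribution CDF is approximated numerically so that the sketch needs only the standard library, and |ρ| = 1 is not handled.

```python
import math


def t_cdf(t, df, steps=2000):
    """Student t CDF via Simpson's rule on the density (numerical approximation)."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(u * u / df))

    # Integrate the density from 0 to |t|; steps must be even for Simpson's rule.
    b = abs(t)
    h = b / steps
    total = pdf(0.0) + pdf(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    area = total * h / 3
    # The density is symmetric around 0, so the CDF is 0.5 plus or minus the area.
    return 0.5 + math.copysign(area, t)


def spearman_p_values(rho, n):
    """Return the two-sided and both one-sided p-values for a Spearman coefficient."""
    df = n - 2                                  # assumed degrees of freedom
    t = rho * math.sqrt(df / (1 - rho * rho))   # assumed test statistic
    cdf = t_cdf(t, df)
    return {
        "two-sided": 2 * min(cdf, 1 - cdf),     # at least as extreme in either direction
        "one-sided (right)": 1 - cdf,           # at least as large a positive association
        "one-sided (left)": cdf,                # at least as large a negative association
    }


for name, p in spearman_p_values(0.5, 10).items():
    print(name, round(p, 3))
```

For a positive coefficient the right-sided p-value is half the two-sided one, and the left-sided p-value is its complement.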

Input Ports

Input data to evaluate.

Output Ports

Correlation values, p-values and degrees of freedom.
Correlation values in a matrix representation.
A model containing the correlation measures. This model can be read by the Correlation Filter node.
A table containing the fractional ranks of the columns, where a rank corresponds to the value's position in a sorted table.

Views

Correlation Matrix
Square table view showing the pairwise correlation values of all columns. The color range runs from dark red (strong negative correlation) through white (no correlation) to dark blue (strong positive correlation). If a correlation value for a pair of columns is not available, the corresponding cell contains a missing value (shown as a cross in the color view).
