Hierarchical Clustering based on molecular fingerprints
### Backend implementation

**utilities/canvasHCBuild**

canvasHCBuild is used to implement this node.

Available linkage types:

- single
- complete
- average
- centroid
- mcquitty
- ward
- weightedcentroid
- flexiblebeta
- schrodinger

The statsFile reports clustering-efficiency statistics for each possible number of clusters (n).

__Definitions of the statistics used in statsFile__

R-Squared (RSQ) is 1.0 - (W/T), where:

W is the sum of the within-cluster variances over all n clusters, and

T is the total variance
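The RSQ definition can be illustrated with a minimal sketch. It assumes that W is the pooled within-cluster sum of squared deviations and T is the sum of squared deviations of all points from the overall centroid, which is the usual reading of this formula. The function names and the toy 2D points are hypothetical; canvasHCBuild works on fingerprints and its exact computation is not shown here.

```python
def sum_of_squares(points):
    """Sum of squared deviations of the points from their own centroid."""
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    return sum((p[d] - centroid[d]) ** 2 for p in points for d in range(dim))


def r_squared(points, labels):
    """RSQ = 1 - W/T for one cluster assignment (assumed definitions of W and T)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    # W: within-cluster sum of squares, pooled over all clusters.
    w = sum(sum_of_squares(members) for members in clusters.values())
    # T: total sum of squares, ignoring the cluster assignment.
    t = sum_of_squares(points)
    return 1.0 - w / t


# Toy data: two tight pairs that are far apart from each other.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
rsq_two_clusters = r_squared(points, [0, 0, 1, 1])
rsq_one_cluster = r_squared(points, [0, 0, 0, 0])
```

With a single cluster W equals T, so RSQ is 0. Splitting the toy data into its two natural pairs leaves very little within-cluster variance, so RSQ is close to 1.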

Semipartial R-Squared (SPRSQ) is the gradient of RSQ with respect to n, i.e. the change in RSQ between successive numbers of clusters.

SPRSQRank is the rank of the SPRSQ value among all possible choices of n (for clarity, only the top sqrt(n) ranks are listed). It is useful for choosing a locally optimal n within a desired range.
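As a rough illustration of how SPRSQ and SPRSQRank relate to RSQ, the sketch below takes SPRSQ at n to be RSQ(n + 1) - RSQ(n) and gives rank 1 to the largest value. Both the sign convention and the ranking direction are assumptions, and the RSQ values are illustrative numbers, not output from canvasHCBuild.

```python
def semipartial_r_squared(rsq_by_n):
    """Map n to RSQ(n + 1) - RSQ(n): the RSQ lost by going from n + 1 to n clusters."""
    return {n: rsq_by_n[n + 1] - rsq_by_n[n] for n in rsq_by_n if n + 1 in rsq_by_n}


def rank_descending(values_by_n):
    """Rank 1 goes to the largest value (assumed ranking direction)."""
    ordered = sorted(values_by_n, key=lambda n: values_by_n[n], reverse=True)
    return {n: rank for rank, n in enumerate(ordered, start=1)}


# Illustrative RSQ values for n = 1..5 clusters (hypothetical numbers).
rsq_by_n = {1: 0.0, 2: 0.3, 3: 0.8, 4: 0.95, 5: 1.0}
sprsq = semipartial_r_squared(rsq_by_n)
sprsq_rank = rank_descending(sprsq)

# Restrict the ranking to a desired range of n to look for a local optimum.
ranks_in_range = {n: r for n, r in sprsq_rank.items() if 2 <= n <= 4}
```

Filtering the ranks to a window of n values, as in the last line, mirrors the idea of choosing a locally optimal n within a desired range.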

Kelley Penalty is Kelley's clustering efficiency metric (Kelley et al., Protein Engineering 9(11), pp. 1063-1065 (1996)).

IsKelleyMinimum indicates whether this number of clusters is the global minimum of the Kelley penalty. It is useful for choosing a globally optimal n.
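The sketch below shows one way to compute a Kelley-style penalty and flag its global minimum. It follows a common reading of Kelley et al.: the average cluster spread at each n is rescaled to the range 1 to N - 1 (N being the number of clustered items) and then added to n. The helper names and spread values are hypothetical, and canvasHCBuild may compute the penalty differently.

```python
def kelley_penalties(avg_spread_by_n, n_items):
    """Penalty(n) = normalized average spread at n + n (assumed form of the metric)."""
    lo = min(avg_spread_by_n.values())
    hi = max(avg_spread_by_n.values())
    if hi == lo:
        raise ValueError("average spread is constant; normalization is undefined")
    # Rescale the average spread linearly onto the interval [1, n_items - 1].
    scale = (n_items - 2) / (hi - lo)
    return {n: scale * (s - lo) + 1 + n for n, s in avg_spread_by_n.items()}


def is_kelley_minimum(penalties):
    """Flag the single n whose penalty is the global minimum."""
    best_n = min(penalties, key=penalties.get)
    return {n: n == best_n for n in penalties}


# Hypothetical average spreads for n = 1..4 clusters of 6 items.
avg_spread_by_n = {1: 8.0, 2: 3.0, 3: 2.5, 4: 2.0}
penalties = kelley_penalties(avg_spread_by_n, n_items=6)
flags = is_kelley_minimum(penalties)
```

In this toy case the spread drops sharply between one and two clusters and only slowly after that, so the penalty is lowest at n = 2 and that row alone would be flagged.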

