20200428_Pikairos Is it possible to extend a distance matrix

I have an iterative workflow where a distance matrix is calculated using the Distance Matrix Calculate node. At each new iteration there are only a few new data points, because most of the points come from older iterations. Is there a way to avoid recalculating all the distances between the old points at each iteration?

I can save the distance matrix, but then I don’t know how to extend it with the new data points by calculating only the distances between the new data points and the old ones (and of course between the new data points themselves). I am thinking of a sort of “distance matrix extension/concatenation”. Is this possible?
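To make the "extension" idea concrete outside of KNIME, here is a minimal Python sketch. It assumes Euclidean distance and plain lists of coordinate tuples. The function name `extend_distance_matrix` and the sample points are hypothetical and are not part of the workflow. The saved matrix is copied as is, and only the new-vs-old and new-vs-new entries are calculated.

```python
import math


def extend_distance_matrix(old_matrix, old_points, new_points):
    """Return the full distance matrix for old_points + new_points.

    old_matrix is the already-computed square matrix for old_points. It is
    copied, not recomputed. Only distances involving a new point are calculated.
    """
    n_old = len(old_points)
    n_new = len(new_points)
    size = n_old + n_new

    # Start from zeros and copy the old block into the top-left corner.
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(n_old):
        for j in range(n_old):
            matrix[i][j] = old_matrix[i][j]

    for a, new_point in enumerate(new_points):
        row = n_old + a
        # New vs old: mirrored because Euclidean distance is symmetric.
        for j, old_point in enumerate(old_points):
            d = math.dist(new_point, old_point)
            matrix[row][j] = d
            matrix[j][row] = d
        # New vs new: upper triangle only, mirrored. The diagonal stays 0.
        for b in range(a + 1, n_new):
            d = math.dist(new_point, new_points[b])
            matrix[row][n_old + b] = d
            matrix[n_old + b][row] = d
    return matrix


# Hypothetical example data: two old points, one new point.
old_points = [(0.0, 0.0), (3.0, 4.0)]
old_matrix = [[0.0, 5.0], [5.0, 0.0]]
new_points = [(6.0, 8.0)]
extended = extend_distance_matrix(old_matrix, old_points, new_points)
for matrix_row in extended:
    print(matrix_row)
```

With n old points and k new points, this computes n*k + k*(k-1)/2 distances instead of all (n+k)*(n+k-1)/2, which is where the saving comes from when k is small.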

Workflow annotations:

One step classical solution:
Generate 1 column with 100 random numbers. Here only 1 column, but it could be many for the example. It doesn't change anything, nor the metrics, etc.
Work out Euclidean distances.
Extract pairs. As you can see, there are fewer than 100x100. This is because it only returns the values of the half-matrix minus the diagonal.
Direct results using the Distance Matrix Calculate node.

To do the same as with the Distance Matrix Pair Extractor, you need a recursive loop here:
In this example we add the rows to the distance matrix one by one, but you could add as many as you want here. This is just an example :)
Start with the first row against itself.
Send the last distance matrix results to the recursive starting loop node to add new incoming distances.
Work out the final # of iterations: here in this example 10, because the table has 10 rows and we add one row to the matrix at each iteration. The final # of rows is needed to stop the following recursive loop.

Old vs New:
Do its similarity search.
Keep only the header. This is all we need at this stage to have the right table header for the recursive loop.

Old vs Old:
Previously calculated distances.
Filter IN numbers & RowID.
Eliminate any possible duplicated result. This is just to get a cleaner result if you have duplicated incoming rows.
Regenerate a clean RowID to reinject into the recursive starting loop node.

New vs Old:
Filter IN numbers & RowID.
Generate a unique identifier for all the rows. This is important later, since Similarity Search works based on a row identifier.
Filter OUT any duplicated comparison.
Sort by 1st match-pair identifier and then by 2nd match-pair identifier.
Filter useless columns.
Set the RowID with the new identifier.

New vs New:
Concatenate all the distances calculated so far.
Group by 1st identifier.
Sort again by 1st identifier to get the equivalent of a full distance matrix (not exactly the same, though, since this is the full matrix).
That's all, folks.

Nodes in the workflow: Random Numbers Generator, Distance Matrix Calculate, Distance Matrix Pair Extractor, Recursive Loop Start, Recursive Loop End, Interactive Table (local), Rule-based Row Filter, Row Filter, Similarity Search, Column Splitter, Column Filter, Duplicate Row Filter, RowID, Math Formula, Sorter, Concatenate, GroupBy, Table Row to Variable.
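The annotations describe two routes to the same pair table: a direct calculation followed by pair extraction, and a loop that adds one row per iteration and carries the earlier distances forward. The Python sketch below mirrors that comparison under stated assumptions. It uses plain loops rather than Similarity Search nodes, 10 rows of 1 random column, and hypothetical function names (`direct_pairs`, `incremental_pairs`). It is an illustration of the idea, not a translation of the workflow.

```python
import itertools
import math
import random


def direct_pairs(points):
    """All unordered pairs (i, j) with i < j: the half-matrix minus the diagonal."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i, j in itertools.combinations(range(len(points)), 2)
    }


def incremental_pairs(points):
    """Build the same pair table by adding one row per iteration."""
    pairs = {}  # distances carried over from earlier iterations
    seen = []   # indices of the rows already in the matrix
    for new_index in range(len(points)):
        # Only new-vs-old distances are computed in this iteration.
        for old_index in seen:
            pairs[(old_index, new_index)] = math.dist(
                points[old_index], points[new_index]
            )
        seen.append(new_index)
    return pairs


random.seed(0)  # fixed seed so repeated runs use the same points
points = [(random.random(),) for _ in range(10)]  # 1 column, 10 rows

direct = direct_pairs(points)
incremental = incremental_pairs(points)
print(len(direct))            # 10 * 9 / 2 pairs
print(direct == incremental)  # both routes give the same table
```

The pair count n*(n-1)/2 is the reason the extracted table has fewer than n x n rows: each unordered pair appears once and the zero diagonal is left out.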
