Extraction

Nodes for extracting various kinds of information, mainly from unstructured text.

This category contains 17 nodes.

PDF Parser Streamable

Extract plain text from PDF files.

Phone Number Formatter Streamable

Format and parse phone numbers.

Regex Extractor Streamable

Extract fragments from text using regular expressions.
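As a rough illustration of the idea (not the node's actual implementation), regex-based extraction can be sketched in Python as follows. The function name `extract_fragments` and the date pattern are assumptions chosen for the example.

```python
import re


def extract_fragments(text: str, pattern: str) -> list[str]:
    """Return every non-overlapping match of `pattern` in `text`.

    Hypothetical helper for illustration; the node's own options
    (capture groups, flags, output columns) are not modeled here.
    """
    return [match.group(0) for match in re.finditer(pattern, text)]


# Example: pull ISO-style dates out of a sentence.
dates = extract_fragments(
    "Released 2021-03-04, patched 2021-04-15.",
    r"\d{4}-\d{2}-\d{2}",
)
print(dates)
```

Using `re.finditer` with `group(0)` returns the full match even when the pattern contains capture groups, which keeps the sketch predictable.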

Regex Extractor Deprecated

Extract fragments from text using regular expressions.

String Similarity Streamable

Calculate similarities between strings.
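The listing does not say which similarity measures the node offers. As one common possibility, a normalized Levenshtein similarity could look like the sketch below; the choice of measure and the names `levenshtein` and `similarity` are assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                        # deletion
                current[j - 1] + 1,                     # insertion
                previous[j - 1] + (char_a != char_b),   # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Map the edit distance to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


print(similarity("kitten", "sitting"))
```

Dividing by the length of the longer string keeps the score comparable across string pairs of different lengths.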

TF-IDF Similarity Streamable

Calculate the similarity between strings as the cosine of their TF-IDF weighted term vectors.
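To make the measure concrete, here is a minimal pure-Python sketch of TF-IDF weighted cosine similarity. It assumes whitespace tokenization and a smoothed IDF of the form log((1 + N) / (1 + df)) + 1; the node may tokenize and weight differently, and the names `tfidf_vectors` and `cosine` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vecs = tfidf_vectors(["the cat sat", "the cat slept", "dogs bark loudly"])
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Strings that share no terms score 0.0, and rare shared terms contribute more to the score than terms that appear in every string.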

URL Extractor 

Extract URLs from arbitrary text.
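A simplified sketch of URL extraction is shown below. It assumes that only `http` and `https` URLs are of interest and that trailing sentence punctuation should be dropped; the pattern and the name `extract_urls` are illustrative and much less thorough than a production extractor.

```python
import re

# Deliberately simple pattern: a scheme followed by characters that are
# not whitespace, angle brackets, or quotes.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(text: str) -> list[str]:
    """Return http(s) URLs found in `text`, without trailing punctuation."""
    return [m.group(0).rstrip(".,;:!?)") for m in URL_PATTERN.finditer(text)]


print(extract_urls("See https://example.com/docs, or http://example.org."))
```

Stripping trailing punctuation is a heuristic: it fixes the common case of a URL at the end of a sentence, at the cost of truncating the rare URL that legitimately ends in one of those characters.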