This category contains 8 nodes.
The BERT Embedder node calculates embeddings of the input texts.
The node uses a BERT model and adds a predefined neural network on top.
The node converts each token to its root form (lemma), stripping inflection such as case, plural, conjugation, etc.
The node performs morphological analysis of the text and assigns tags for singular/plural, gender, case, conjugation, animacy, etc. to the tokens.
The node recognizes named entities in the document and tags them using the generalized spaCy NE tag set.
The node assigns a part-of-speech tag to each token of the document.
The node converts a string column with raw text to the KNIME Document format, using its own tokenizer based on the selected spaCy model.
The node converts String or Document data to vectors (lists of doubles) using the embedder provided by the selected spaCy model.
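
To make the "list of doubles" output more concrete, the following is a minimal sketch of one common way to turn a text into a single vector: averaging the vectors of its tokens. The embedding table `TOY_VECTORS`, the helper `text_to_vector`, and the whitespace tokenization are all hypothetical and chosen for illustration only; whether the node pools token vectors in exactly this way is an assumption, not something the description states.

```python
from typing import Dict, List

# Hypothetical 3-dimensional embedding table (illustration only).
# A real spaCy model ships vectors with far more dimensions.
TOY_VECTORS: Dict[str, List[float]] = {
    "good": [1.0, 0.0, 2.0],
    "coffee": [3.0, 2.0, 0.0],
}


def text_to_vector(
    text: str, table: Dict[str, List[float]], dim: int = 3
) -> List[float]:
    """Average the vectors of whitespace-separated, lowercased tokens.

    Tokens missing from the table contribute a zero vector, and an
    empty text yields an all-zero vector (both are assumptions).
    """
    tokens = text.lower().split()
    if not tokens:
        return [0.0] * dim

    total = [0.0] * dim
    for token in tokens:
        vec = table.get(token, [0.0] * dim)
        for i, value in enumerate(vec):
            total[i] += value

    # Divide each component by the token count to get the mean vector.
    return [value / len(tokens) for value in total]


vector = text_to_vector("Good coffee", TOY_VECTORS)
print(vector)  # [2.0, 1.0, 1.0]
```

In spaCy itself, the comparable value is exposed as `Doc.vector`, which by default is the average of the token vectors; the sketch above only mimics that pooling step with plain Python lists.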