This is a workflow for topic classification.
After the Documents are converted into word vectors, the task becomes a traditional classification problem that can be solved with any supervised machine learning algorithm. We chose a decision tree, but any other classifier would have worked as well.
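To make the idea concrete, here is a minimal sketch in plain Python rather than KNIME nodes. It turns tokenized documents into binary word vectors and fits a tiny Gini-based decision tree on them. The toy documents, the labels, and the helper names (`to_vector`, `build_tree`, `predict`) are all assumptions made for illustration; this is not how KNIME's decision tree learner is implemented.

```python
from collections import Counter

def to_vector(tokens, vocabulary):
    """Binary word vector: 1 if the term occurs in the document, else 0."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def build_tree(rows, labels, features):
    """Recursively split on the binary feature with the lowest weighted Gini."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf: pure node or nothing left to split on

    def split_impurity(f):
        absent = [l for r, l in zip(rows, labels) if r[f] == 0]
        present = [l for r, l in zip(rows, labels) if r[f] == 1]
        if not absent or not present:
            return float("inf")  # this feature does not separate anything
        return (len(absent) * gini(absent) + len(present) * gini(present)) / len(labels)

    best = min(features, key=split_impurity)
    if split_impurity(best) == float("inf"):
        return majority
    remaining = [f for f in features if f != best]
    branches = {}
    for value in (0, 1):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        branches[value] = build_tree([r for r, _ in subset],
                                     [l for _, l in subset], remaining)
    return (best, branches)

def predict(tree, row):
    """Walk the tree until a leaf (a class label) is reached."""
    while isinstance(tree, tuple):
        feature, branches = tree
        tree = branches[row[feature]]
    return tree

# Hypothetical toy training set: (tokens, topic)
train_docs = [
    (["goal", "match", "team"], "sports"),
    (["team", "coach", "season"], "sports"),
    (["election", "vote", "party"], "politics"),
    (["vote", "minister", "law"], "politics"),
]
vocabulary = sorted({t for tokens, _ in train_docs for t in tokens})
rows = [to_vector(tokens, vocabulary) for tokens, _ in train_docs]
labels = [label for _, label in train_docs]

tree = build_tree(rows, labels, list(range(len(vocabulary))))
prediction = predict(tree, to_vector(["vote", "law"], vocabulary))
```

Once the documents are rows of numbers, nothing in `build_tree` is specific to text, which is why any other supervised learner could take its place.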
The "Limit # keywords" metanode artificially caps the number of extracted keywords, and with it the number of columns produced. Because the dataset used here is quite small, having too many columns relative to the number of rows in the training set would risk poor generalization.
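The effect of such a cap can be sketched in a few lines of Python. The description does not say which criterion the metanode uses to pick keywords, so ranking terms by document frequency is an assumption here, as are the function name `top_keywords` and the sample documents.

```python
from collections import Counter

def top_keywords(documents, max_keywords):
    """Keep at most max_keywords terms, ranked by how many documents contain them.

    Ranking by document frequency is an assumed criterion for illustration.
    """
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    # Descending frequency, then alphabetical so the result is deterministic
    ranked = sorted(doc_freq.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:max_keywords]]

# Hypothetical tokenized documents
documents = [
    ["team", "goal", "match"],
    ["team", "coach"],
    ["vote", "party", "team"],
    ["vote", "law"],
]
keywords = top_keywords(documents, max_keywords=3)
```

With seven distinct terms but only four documents, keeping three keywords leaves three columns instead of seven.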
The Document Vector Applier node applies the word vector extracted from the training set to the test set and removes any words that are present in the test set but not in the training set.
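The same behavior can be illustrated outside KNIME with a short Python sketch. The function name `apply_vocabulary` and the sample data are hypothetical; the point is only that the test document is encoded with the training columns, so unseen words never become columns.

```python
def apply_vocabulary(tokens, training_vocabulary):
    """Encode a document using only the columns learned from the training set."""
    present = set(tokens)
    # Terms outside training_vocabulary are never looked up, so they are dropped
    return {term: int(term in present) for term in training_vocabulary}

# Hypothetical vocabulary extracted from the training set
training_vocabulary = ["coach", "team", "vote"]

# "referee" occurs only in the test document
test_tokens = ["team", "referee", "vote"]
test_vector = apply_vocabulary(test_tokens, training_vocabulary)
```

Training and test rows end up with identical columns, which a trained model needs in order to score the test set.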
The Category To Class node extracts the content of the category field of each Document and places it in a column named "class".
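As a rough Python analogy, assuming a hypothetical `Document` type that carries its category as metadata, the step amounts to copying that field into a separate "class" entry for every row:

```python
from dataclasses import dataclass

@dataclass
class Document:
    """Hypothetical stand-in for a KNIME Document with a category field."""
    text: str
    category: str

def category_to_class(documents):
    """Return one row per document with the category copied into 'class'."""
    return [{"document": doc, "class": doc.category} for doc in documents]

docs = [
    Document("The team scored a late goal.", "sports"),
    Document("Parliament passed the new law.", "politics"),
]
table = category_to_class(docs)
class_column = [row["class"] for row in table]
```

The "class" column is what a supervised learner would then use as its target.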
To use this workflow, download it and open it in KNIME.