Exercise3

News relevance webinar

This workflow was presented at the webinar "KNIME & Redfield Present State-of-the-Art Language Models, Now Available Through Low-Code/No-Code" on September 29th, 2022.

Get the vectorized documents (last output port from Keyword Search) to predict the relevance of the texts.

1) Split the texts into 2 groups: labeled and unlabeled
2) Optional - color the labeled data by relevance
3) Split the vectors into columns - use the Split Collection Column node
4) Partition the labeled data set into training and test sets, using a ratio of 70/30 or 80/20
5) Train the model. Suggestion: use Random Forest
6) Apply the trained model to both the test set and the unlabeled set
7) Feed the predictions to the Prediction model evaluation component. Upper port - test set, bottom port - unlabeled set

Workflow annotations: Label, Labelled articles, New articles, Topic analysis, Co-occurrence, Tag cloud, Provide the number of texts.

Workflow nodes: CSV Reader (2), Concatenate, Number To String, Row Sampling, Keyword Search, Text analysis dashboard, Prediction model evaluation.
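For readers who want to see the data-preparation steps outside KNIME, here is a minimal Python sketch of steps 1, 3, and 4 only (label split, vector-to-columns, 70/30 partition). It is not the workflow's implementation: the row layout, the field names `id`, `vector`, and `label`, the `vector_<i>` column naming, and the helper functions are all hypothetical and chosen for illustration. Model training and prediction (steps 5 to 7) are left to the KNIME nodes and are not reproduced here.

```python
import random

# Hypothetical input: 15 texts, each with a document vector stored as one
# collection-style field. The first 10 carry a relevance label; the rest
# are unlabeled (label is None).
rows = [
    {
        "id": i,
        "vector": [i * 0.1, i * 0.2, i * 0.3],
        "label": ("relevant" if i % 2 else "irrelevant") if i < 10 else None,
    }
    for i in range(15)
]


def split_by_label(rows):
    """Step 1: separate labeled rows from unlabeled rows."""
    labeled = [r for r in rows if r["label"] is not None]
    unlabeled = [r for r in rows if r["label"] is None]
    return labeled, unlabeled


def split_collection_column(row, column="vector"):
    """Step 3: expand the vector field into one column per element.

    The vector_<i> naming is an assumption made for this sketch.
    """
    flat = {key: value for key, value in row.items() if key != column}
    for i, value in enumerate(row[column]):
        flat[f"{column}_{i}"] = value
    return flat


def partition(rows, train_fraction=0.7, seed=42):
    """Step 4: shuffle with a fixed seed, then cut into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


labeled, unlabeled = split_by_label(rows)
labeled = [split_collection_column(r) for r in labeled]
unlabeled = [split_collection_column(r) for r in unlabeled]
train, test = partition(labeled, train_fraction=0.7)

print(len(train), "training rows,", len(test), "test rows,", len(unlabeled), "unlabeled rows")
```

The unlabeled rows get the same column expansion as the labeled ones so that a model trained on the training set can later be applied to both the test set and the unlabeled set, as in step 6.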
