Exercise3

News relevance webinar

This workflow was presented at the webinar "KNIME & Redfield Present State-of-the-Art Language Models, Now Available Through Low-Code/No-Code" on September 29th, 2022.

Get the vectorized documents (last output port of Keyword Search) and use them to predict the relevance of the texts.

1) Split the texts into two groups: labeled and unlabeled.
2) Optional: color the labeled data by relevance.
3) Split the vectors into columns using the Split Collection Column node.
4) Partition the labeled data set into training and test sets, using a 70/30 or 80/20 ratio.
5) Train the model. Suggestion: use Random Forest.
6) Apply the trained model to both the test set and the unlabeled set.
7) Feed the predictions to the Prediction model evaluation component: upper port for the test set, bottom port for the unlabeled set.

Workflow nodes: Spacy Model Selector, CSV Reader (two instances), Concatenate, Number To String, Row Sampling, Keyword Search, Text analysis dashboard, Prediction model evaluation.

Node annotations: en_core_web_md; Label; Topic analysis; Co-occurrence; Tag cloud; Labelled articles; New articles; Provide the number of texts.
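For readers who want to see the logic of steps 1, 4, 5 and 6 outside KNIME, the following is a minimal Python sketch using only the standard library. It is not the workflow's implementation. The sample vectors, the labels "relevant" and "irrelevant", and all function names are hypothetical. A nearest-centroid classifier stands in for the suggested Random Forest so that the example stays short and dependency-free. Step 3 has no counterpart here because each vector is already a list of numeric values.

```python
import random
from statistics import fmean

# Hypothetical data: (document vector, label). A label of None marks an unlabeled text.
rows = [
    ([0.9, 0.1, 0.8], "relevant"),
    ([0.8, 0.2, 0.9], "relevant"),
    ([0.7, 0.3, 0.7], "relevant"),
    ([0.9, 0.2, 0.6], "relevant"),
    ([0.8, 0.1, 0.7], "relevant"),
    ([0.1, 0.9, 0.2], "irrelevant"),
    ([0.2, 0.8, 0.1], "irrelevant"),
    ([0.3, 0.7, 0.3], "irrelevant"),
    ([0.1, 0.8, 0.4], "irrelevant"),
    ([0.2, 0.9, 0.3], "irrelevant"),
    ([0.85, 0.15, 0.75], None),
    ([0.15, 0.85, 0.25], None),
]


def split_labeled(data):
    """Step 1: separate labeled rows from unlabeled rows."""
    labeled = [r for r in data if r[1] is not None]
    unlabeled = [r for r in data if r[1] is None]
    return labeled, unlabeled


def partition(data, train_fraction=0.7, seed=42):
    """Step 4: shuffle with a fixed seed and cut into training and test sets."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def train_centroids(train):
    """Step 5 (stand-in for Random Forest): one mean vector per label."""
    by_label = {}
    for vec, label in train:
        by_label.setdefault(label, []).append(vec)
    return {
        label: [fmean(column) for column in zip(*vecs)]
        for label, vecs in by_label.items()
    }


def predict(centroids, vec):
    """Step 6: assign the label whose centroid is closest (squared Euclidean)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(centroids[label], vec))


labeled, unlabeled = split_labeled(rows)
train, test = partition(labeled, train_fraction=0.7)
model = train_centroids(train)

# Evaluate on the held-out test set, then score the unlabeled texts.
test_predictions = [predict(model, vec) for vec, _ in test]
accuracy = sum(p == label for p, (_, label) in zip(test_predictions, test)) / len(test)
unlabeled_predictions = [predict(model, vec) for vec, _ in unlabeled]

print(f"train={len(train)} test={len(test)} accuracy={accuracy:.2f}")
print(unlabeled_predictions)
```

The two prediction lists correspond to what step 7 feeds into the evaluation component: the test-set predictions can be compared against known labels, while the unlabeled predictions can only be inspected.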
