
02_Active_Learning_for_Document_Classification

Workflow

Active Learning for Document Classification

This workflow defines a fully automated web-based application that labels your data using active learning. It was designed so that business analysts can easily go through documents and label them with any number of classes. In each iteration, the user labels more documents and the model is retrained on the instances labeled so far. With every new iteration, the model proposes documents based on an exploration vs. exploitation approach. Once the overall potential falls below a value the user is happy with, they can exit the loop and export the model to label the remaining instances.
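As a rough illustration of how an exploration score and an exploitation score might be combined to pick the next documents to label, here is a minimal Python sketch. It is not the KNIME implementation: the function names, the equal weighting, and the example scores are all assumptions made for illustration. Exploitation is modeled as the normalized entropy of the predicted class probabilities (high when the model is unsure), and exploration is taken as a precomputed potential per document.

```python
import math

def entropy_uncertainty(class_probs):
    """Normalized Shannon entropy of one document's predicted class distribution.

    Returns a value between 0 (model is certain) and 1 (all classes equally likely).
    """
    n = len(class_probs)
    if n < 2:
        return 0.0
    entropy = sum(-p * math.log(p) for p in class_probs if p > 0)
    return entropy / math.log(n)

def combine_scores(exploration, exploitation, weight=0.5):
    """Weighted average of the two scores (the 0.5 weight is an assumption)."""
    return weight * exploration + (1 - weight) * exploitation

def top_k(scores, k):
    """Return the k document ids with the highest score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical example data: exploration potentials and predicted probabilities.
exploration = {"doc1": 0.9, "doc2": 0.2, "doc3": 0.4}
predicted = {"doc1": [0.5, 0.5], "doc2": [0.99, 0.01], "doc3": [0.6, 0.4]}

combined = {
    doc: combine_scores(exploration[doc], entropy_uncertainty(predicted[doc]))
    for doc in predicted
}
to_label = top_k(combined, k=2)
```

In this toy example, the document with both high potential and an undecided prediction ranks first, while the confidently classified document in a sparse region is left for later.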

This workflow is made to be deployed on KNIME WebPortal via KNIME Server.






Tags: active learning, human-in-the-loop, guided analytics, document classification, sentiment analysis, topic detection, tag cloud, topic, activelearning, uncertainty, entropy, sampling, uncertainty sampling, movie reviews, highliting, exploration, exploitation, label density, potential
This workflow defines a fully automated web-based application that will label your data using active learning with an exploration / exploitation strategy. This workflow is made to be deployed on KNIME WebPortal via KNIME Server.

To test the Guided Analytics application on KNIME Analytics Platform:
- Right-click the Label component and select "Execute and Open Views"
- Follow the in-view instructions
- After saving your interactions, right-click Active Learning Loop End and select "Step Loop Execution"
- Open the Label component view again to see the second iteration of the human-in-the-loop

The Process Step by Step
1. Upload your documents and enter / upload the labels you want to use
2. Start labeling your data
3. Monitor overall potential as you provide more labels
4. When the overall potential falls below a desired amount, exit the loop
5. Download the model and the labels, and visualize the results

[Workflow diagram. Nodes and components: Upload, Text Preprocessing, Pre-processing, Graph Density Initializer, Initialize / Train Classifier with Available Labels, Active Learning Loop Start, Density Scorer, Entropy Uncertainty Scorer, Exploration/Exploitation Score Combiner, Top k Selector, Label, Density Updater, Active Learning Loop End, Post-processing, Deploy. Annotations: "Exploration" (Potential, reduce density); "Exploitation"; "top 50 documents by Potential"; "Show user current predictions and ask for more labels" (output: new labels); "Allow user to download the model trained on all the labeled data"; port notes "port 0: labeled, port 1: not labeled", "port 0: terms, port 1: docs", "top: new iteration labels, bottom: labeled + unlabeled".]
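The exploration side of the loop (initialize a density, score by potential, reduce density after labeling) can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the Gaussian weighting, the `radius` parameter, the 2-D toy points, and both function names are hypothetical and are not taken from the KNIME nodes, whose exact formulas are not given here.

```python
import math

def initial_potentials(points, radius=1.0):
    """Potential of each point: Gaussian-weighted closeness summed over all points.

    Points in dense regions get a high potential (assumed formula).
    """
    return [
        sum(math.exp(-math.dist(p, q) ** 2 / radius ** 2) for q in points)
        for p in points
    ]

def reduce_after_label(potentials, points, labeled_idx, radius=1.0):
    """Subtract the labeled point's influence from every point's potential.

    Neighbors of the labeled point lose most of their potential, so the next
    pick is steered toward regions that have not been explored yet.
    """
    center = points[labeled_idx]
    peak = potentials[labeled_idx]
    return [
        max(0.0, pot - peak * math.exp(-math.dist(p, center) ** 2 / radius ** 2))
        for p, pot in zip(points, potentials)
    ]

# Toy document vectors: three close together, one far away (hypothetical data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
potentials = initial_potentials(points)

# First pick: the point with the highest potential (center of the dense group).
first = max(range(len(points)), key=potentials.__getitem__)

# After it is labeled, its neighborhood is "used up" and the isolated point wins.
potentials = reduce_after_label(potentials, points, first)
second = max(range(len(points)), key=potentials.__getitem__)
```

Under these assumptions, a stopping rule like the one in step 4 could compare the sum or maximum of the remaining potentials against a user-chosen threshold.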

Download

Get this workflow from the following link: Download

Resources

Nodes

02_Active_Learning_for_Document_Classification consists of the following 552 node(s):

Plugins

02_Active_Learning_for_Document_Classification contains nodes provided by the following 12 plugin(s):