03_Parsing_the_KNIME_Forum

Parsing the KNIME Forum

This workflow demonstrates how to parse the KNIME Forum. It works in stages. First, we read the list of topics from the front page of the forum. Afterwards, we go to each category separately. In each category, we search for all topics that are less than 9 days old. This limit is there mainly to speed up the workflow. If a next page is available, we take it into consideration as well. After parsing the different thread pages, we read all information from the individual topics.
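The stages described above (download a category page, extract the topics with XPath, keep only recent ones, follow the next page if there is one) can be sketched outside KNIME roughly as follows. This is only an illustration, not the workflow's actual configuration: the page markup, the `class` and `data-created` attributes, the URLs, and the helper names `fetch` and `recent_topics` are all assumptions made up for the example, and the download step is replaced by an in-memory dictionary so the sketch runs without network access.

```python
from datetime import datetime, timedelta
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-ins for downloaded category pages.
# The real forum markup will differ; these only illustrate the idea.
PAGES = {
    "/c/general": """<html><body>
      <a class="topic" href="/t/1" data-created="2024-05-09">Loop question</a>
      <a class="topic" href="/t/2" data-created="2024-05-03">XPath help</a>
      <a class="next" href="/c/general?page=2">next</a>
    </body></html>""",
    "/c/general?page=2": """<html><body>
      <a class="topic" href="/t/3" data-created="2024-04-01">Old thread</a>
    </body></html>""",
}


def fetch(url):
    """Stand-in for the download step; a real run would issue an HTTP request."""
    return PAGES[url]


def recent_topics(start_url, now, max_age_days=9):
    """Collect (href, title) for topics newer than max_age_days, across pages."""
    cutoff = now - timedelta(days=max_age_days)
    topics = []
    url = start_url
    while url is not None:
        root = ET.fromstring(fetch(url))
        # XPath query for topic links (assumed markup).
        for link in root.findall(".//a[@class='topic']"):
            created = datetime.strptime(link.get("data-created"), "%Y-%m-%d")
            if created > cutoff:
                topics.append((link.get("href"), link.text))
        # Follow the "next page" link if one is available.
        next_link = root.find(".//a[@class='next']")
        url = next_link.get("href") if next_link is not None else None
    return topics


topics = recent_topics("/c/general", now=datetime(2024, 5, 10))
print(topics)  # [('/t/1', 'Loop question'), ('/t/2', 'XPath help')]
```

The age cutoff plays the same role as the 9-day limit in the workflow: it keeps the number of thread pages to visit small. The sketch still walks every page of a category, which is a simplification.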

Workflow annotations: Download HTML pages from the web; Parse generated XML documents to extract topic and content.

Node labels: enter url from KNIME forum page; collect all forums and all threads; rm missings; get all posts; Parse first post and thread title; filter threads without comments; get all posts only comments; collect data.

Nodes shown: Table Creator, HttpRetriever (MISSING), HtmlParser (MISSING), XPath, Row Filter, Column Filter, Concatenate.