execute-failed-stackoverflowerror-null-on-the-html-parser-23567

Demonstrates how to use the 'Content-Type' header to filter downloaded URLs before passing them to the HTML parser.

I also set a download size limit so that downloading stops once a file reaches 0.5 MB.

Workflow nodes: Table Creator, HTTP Retriever, HTTP Result Data Extractor, Row Filter, HTML Parser, Column Filter

Workflow annotations:

Hint 1: Define a file size limit here to avoid downloading huge PDF files and save time (I set this to 0.5 MB, adjust as necessary).

Extract HTTP headers.

Hint 2: Check if the 'content-type' header matches 'text/html.*', remove other types.
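For readers who want to see the two hints outside of KNIME, here is a minimal Python sketch of the same logic. It is not what the KNIME nodes run internally; the function names (`is_html`, `filter_html`, `read_limited`), the row layout with a `headers` dictionary, and the reading of 0.5 MB as 512 * 1024 bytes are all assumptions made for illustration. Only the pattern 'text/html.*' and the 0.5 MB limit come from the workflow.

```python
import re

# Pattern from Hint 2, matched case-insensitively against the header value.
HTML_PATTERN = re.compile(r"text/html.*", re.IGNORECASE)

# Limit from Hint 1 (assumption: 0.5 MB is taken as 512 * 1024 bytes).
MAX_BYTES = 512 * 1024


def is_html(headers):
    """Return True if the 'content-type' header matches 'text/html.*'."""
    # HTTP header names are case-insensitive, so normalise keys first.
    normalised = {name.lower(): value for name, value in headers.items()}
    content_type = normalised.get("content-type", "")
    return HTML_PATTERN.fullmatch(content_type.strip()) is not None


def filter_html(results):
    """Keep only rows whose headers indicate HTML (hypothetical row layout)."""
    return [row for row in results if is_html(row["headers"])]


def read_limited(chunks, max_bytes=MAX_BYTES):
    """Collect byte chunks until max_bytes is reached, then stop reading."""
    buffer = bytearray()
    for chunk in chunks:
        remaining = max_bytes - len(buffer)
        if remaining <= 0:
            break
        buffer.extend(chunk[:remaining])
    return bytes(buffer)


# Made-up example rows standing in for downloaded results.
rows = [
    {"url": "https://example.com/", "headers": {"Content-Type": "text/html; charset=UTF-8"}},
    {"url": "https://example.com/a.pdf", "headers": {"Content-Type": "application/pdf"}},
    {"url": "https://example.com/b", "headers": {}},
]
html_rows = filter_html(rows)
```

Rows with a missing or non-HTML 'content-type' are dropped before anything reaches a parser, and `read_limited` stops consuming data once the limit is hit, which mirrors the intent of the Row Filter step and the size limit in the workflow.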
