Content Retriever (Labs)

Retrieves the HTML from the browser session for further processing in KNIME.

Options

Retrieve tag type

Select the type of element that will be extracted from the page.

Available options:

Page: Extract the whole page as XML.
Link: Extract all anchor (<a>) elements and their href attribute from the page.
Paragraph: Extract all paragraph (<p>) elements and their inner text from the page.
Button: Extract all button (<button>) elements and their text from the page.
Image: Extract all image (<img>) elements and their alt text from the page.
Heading: Extract all heading (<h1> to <h6>) elements and their text from the page.
Table: Extract all table (<table>) elements from the page.
Unordered list: Extract all unordered list (<ul>) elements from the page.
Ordered list: Extract all ordered list (<ol>) elements from the page.
Page Title: Extract the title (<title>) of the page.
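
To make the Link option more concrete, the following is a minimal sketch of what extracting anchor elements and their href attributes from retrieved HTML could look like. It uses only Python's standard html.parser module and is not the node's actual implementation; the names LinkExtractor, extract_links and sample_html are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects (href, text) pairs for every <a> element in the fed HTML."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> that is currently open
        self._text = []      # text fragments seen inside the open <a>
        self._in_a = False   # True while we are between <a> and </a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_a = True
            self._href = dict(attrs).get("href")  # None if no href is set
            self._text = []

    def handle_data(self, data):
        if self._in_a:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_a:
            self.links.append((self._href, "".join(self._text).strip()))
            self._in_a = False


def extract_links(html):
    """Return a list of (href, text) tuples for all anchors in html."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page content standing in for HTML retrieved from a browser session.
sample_html = """
<html><head><title>Demo</title></head>
<body>
  <p>Intro</p>
  <a href="https://www.knime.com">KNIME</a>
  <a href="/docs">Docs <b>here</b></a>
</body></html>
"""

print(extract_links(sample_html))
# [('https://www.knime.com', 'KNIME'), ('/docs', 'Docs here')]
```

The other tag types could be handled the same way by matching a different tag name (for example "p" or "img") and collecting the relevant text or attribute.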

Retrieval delay (seconds): Specifies how long to wait, in seconds, before the HTML is retrieved. Node execution is prolonged by exactly this amount of time.
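
The effect of the retrieval delay can be sketched as a simple wait before the page source is read. This is an illustration under assumptions, not the node's code: retrieve_after_delay and fake_page_source are hypothetical names, and fake_page_source merely stands in for a real browser session.

```python
import time


def retrieve_after_delay(get_html, delay_seconds):
    """Wait delay_seconds, then call get_html() and return its result."""
    if delay_seconds < 0:
        raise ValueError("delay_seconds must be non-negative")
    time.sleep(delay_seconds)  # execution is prolonged by this amount
    return get_html()


# Hypothetical stand-in for a browser session; a real session would
# return the live page source at the moment it is called.
def fake_page_source():
    return "<html><body><p>Loaded</p></body></html>"


start = time.monotonic()
html = retrieve_after_delay(fake_page_source, 0.2)
elapsed = time.monotonic() - start

print(html)
print(f"elapsed: {elapsed:.2f} s")
```

A delay like this is typically useful when a page fills in content via scripts after the initial load, so that the HTML is read only once that content is present.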

This node has no views


To use this node in KNIME, install the extension KNIME Python Extension Development (Labs) from the update site below, following the NodePit Product and Node Installation Guide:

v5.11

A zipped version of the software site can be downloaded here.

Plugin provider: KNIME AG, Zurich, Switzerland

Plugin version: 5.11.0.v202602211520

On NodePit since: 2026-03-10

Last update: 2026-06-15

Tags: Modern UI

KNIME versions: Since v5.2
