mbox Reader

Reads an mbox file and creates a table with one row for each message in the file.

The parser is based on Apache James Mime4J. The mbox format is specified in RFC 4155 (The application/mbox Media Type) and in the mbox manpage.

The result is provided as binary objects (BLOB), which can be further processed by the “mbox Message Extractor” and “mbox Header Extractor” nodes.
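To illustrate the "one row per message" output, here is a minimal sketch that uses Python's standard `mailbox` module rather than Mime4J, so it is not the node's implementation. The helper names `read_mbox_rows` and `demo`, the column names, and the sample data are assumptions made for this example.

```python
import mailbox
import os
import tempfile


def read_mbox_rows(path):
    """Return one dict per message: a row index and the raw message bytes."""
    box = mailbox.mbox(path, create=False)
    try:
        return [
            {"row": i, "message": msg.as_bytes()}  # raw bytes, comparable to a BLOB cell
            for i, msg in enumerate(box)
        ]
    finally:
        box.close()


# Hypothetical two-message mbox content, used only for this demonstration.
SAMPLE = (
    b"From alice@example.com Mon Jan  1 00:00:00 2024\n"
    b"Subject: first\n"
    b"\n"
    b"Hello\n"
    b"\n"
    b"From bob@example.com Tue Jan  2 00:00:00 2024\n"
    b"Subject: second\n"
    b"\n"
    b"World\n"
)


def demo():
    """Write the sample to a temporary file and read it back as rows."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.mbox")
        with open(path, "wb") as fh:
            fh.write(SAMPLE)
        return read_mbox_rows(path)


if __name__ == "__main__":
    for row in demo():
        print(row["row"], len(row["message"]), "bytes")
```

Each row keeps the message as unparsed bytes, which mirrors the idea of deferring header and body extraction to later steps.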

Options

mbox file
The mbox file to parse. Attention: on a Mac, a .mbox “file” exported with Mail.app is actually a directory (a “package”). Do not select the directory itself; navigate into it and select the file called “mbox”. Pro tip: You can drag and drop .mbox files into your KNIME workflow, and a new node will be added for reading the dropped file.
Encoding
The character encoding of the mbox file
From Line Pattern
The regular expression used to detect the “From ” lines that separate the messages. The first option ^From \\S+@\\S.*\\d{4}$ is stricter and only matches “From ” lines that contain an @ character. The second option ^From \\S+.*\\d{4}$ also matches “From ” lines without an @; use it, for example, if you need to parse Thunderbird mbox content.
Max. message size
Maximum size of a single message in megabytes. Attention: if this value is too small, messages will not be parsed correctly.
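The difference between the two From Line Pattern options can be checked with a short sketch. It assumes that the doubled backslashes in the option text are Java string escaping, so the patterns are written here with single backslashes for Python's `re` module. The `classify` helper and both sample lines are hypothetical; the second line imitates the "From - " style commonly seen in Thunderbird files.

```python
import re

# Patterns from the option text, with single backslashes (assumed unescaped form).
STRICT = re.compile(r"^From \S+@\S.*\d{4}$")
LENIENT = re.compile(r"^From \S+.*\d{4}$")


def classify(line):
    """Report which of the two patterns accepts the given line."""
    return {
        "strict": bool(STRICT.match(line)),
        "lenient": bool(LENIENT.match(line)),
    }


# Illustrative separator lines, not taken from a real mailbox.
with_at = "From alice@example.com Mon Jan  1 00:00:00 2024"
without_at = "From - Mon Jan  1 00:00:00 2024"

if __name__ == "__main__":
    print(classify(with_at))     # {'strict': True, 'lenient': True}
    print(classify(without_at))  # {'strict': False, 'lenient': True}
```

A line without an @ is rejected by the strict pattern, so a file made up only of such lines would not be split into messages unless the second option is selected.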

Input Ports

This node has no input ports

Output Ports

Parsed mbox, one message per row

Views

This node has no views
