Generic Workflow Nodes for KNIME: SeqAn version by Freie Universitaet Berlin, Universitaet Tuebingen, and the SeqAn Team

The Deferred Frequency Index (DFI) is a tool for string mining under frequency constraints, i.e., predicates that depend solely on the frequency of a pattern in the data. The frequency of a pattern is defined as the number of distinct sequences in a database that contain the pattern at least once. The implementation currently provides three predicates and can easily be extended with user-defined frequency predicates. The frequencies are calculated during the construction of a suffix tree over all databases, which makes it possible to limit index construction to a problem-specific minimum referred to as the optimal monotonic hull.
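
The frequency definition above can be illustrated with a minimal Python sketch. It is a naive scan, not the suffix-tree construction DFI uses, and the names `frequency`, `frequencies`, `db1`, and `db2` are hypothetical, chosen only for this illustration.

```python
from typing import Iterable, List


def frequency(pattern: str, database: Iterable[str]) -> int:
    """Count the sequences in `database` that contain `pattern` at least once.

    A sequence with several occurrences of the pattern still counts once.
    """
    return sum(1 for sequence in database if pattern in sequence)


def frequencies(pattern: str, databases: List[List[str]]) -> List[int]:
    """Return one frequency per database, in database order."""
    return [frequency(pattern, db) for db in databases]


# Toy data (assumption): two small databases of DNA strings.
db1 = ["ACGTAC", "TTACGA", "GGGG"]
db2 = ["ACAC", "TTTT"]

print(frequencies("AC", [db1, db2]))  # prints [2, 1]
```

Note that "ACGTAC" contains "AC" twice but contributes only 1 to the frequency, since the definition counts sequences rather than occurrences.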

(c) Copyright 2010 by David Weese and Marcel H. Schulz

Turn this option off to disable version update notifications of the application.
Set minimal and maximal frequency per database.
Minimal support in the first (with --growth) or all (with --entropy) databases.
Minimal support ratio between the first and second databases.
Maximal entropy of support values of all databases.
Specify database alphabet.
Output only left- and right-maximal substrings.
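
The three kinds of frequency predicate configured by the options above can be sketched roughly as follows. This is an illustration under stated assumptions, not DFI's actual implementation: the function names are hypothetical, the source does not say whether support values are absolute counts or relative fractions, and the entropy here is assumed to be the Shannon entropy (in bits) of the support values normalized to sum to 1.

```python
import math
from typing import Sequence, Tuple


def minmax_ok(freqs: Sequence[int], bounds: Sequence[Tuple[int, int]]) -> bool:
    """Each database's frequency must lie within its own (min, max) bounds."""
    return all(lo <= f <= hi for f, (lo, hi) in zip(freqs, bounds))


def growth_ok(supports: Sequence[float], min_support: float, min_ratio: float) -> bool:
    """Minimal support in the first database and a minimal support ratio
    between the first and second databases."""
    first, second = supports[0], supports[1]
    if first < min_support:
        return False
    if second == 0:
        return True  # assumption: a zero denominator counts as unbounded growth
    return first / second >= min_ratio


def entropy_ok(supports: Sequence[float], min_support: float, max_entropy: float) -> bool:
    """Minimal support in all databases and a maximal entropy of the supports."""
    if any(s < min_support for s in supports):
        return False
    total = sum(supports)
    if total == 0:
        return False  # assumption: a pattern that never occurs is rejected
    probs = [s / total for s in supports if s > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return entropy <= max_entropy


print(minmax_ok([2, 1], [(1, 5), (0, 1)]))
print(growth_ok([0.6, 0.2], min_support=0.5, min_ratio=2.0))
print(entropy_ok([0.9, 0.1], min_support=0.05, max_entropy=0.9))
```

Because each function depends only on per-database frequencies, a user-defined predicate in the same spirit would be another function over the same support values.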

Input Ports

Database files in FASTA/FASTQ or plain-text format (one string per line). [fq,fastq,fa,fasta,faa,ffn,fna,frn,embl,gbk,raw,sam]

Output Ports

Change output filename. Default: <stdout>. [txt]


Dfi Std Output
The text sent to standard out during the execution of Dfi.
Dfi Error Output
The text sent to standard error during the execution of Dfi. (If it appears in gray, it is the output of a previous failed run, preserved for troubleshooting.)

To use this node in KNIME, install SeqAn from the following update site:



