FileFilter

Extracts or manipulates portions of data from peak, feature or consensus-feature files.

Options

version: Version of the tool that generated this parameters file.
in_type: Input file type -- default: determined from file extension or content
rt: Retention time range to extract
mz: m/z range to extract (applies to ALL ms levels!)
int: Intensity range to extract
sort: Sorts the output according to RT and m/z.
log: Name of log file (created only when specified)
debug: Sets the debug level
threads: Sets the number of threads allowed to be used by the TOPP tool
no_progress: Disables progress logging to command line
force: Overrides tool-specific checks
test: Enables the test mode (needed for internal use only)
sn: Write peaks with S/N > 'sn' values only
rm_pc_charge: Remove MS(2) spectra with these precursor charges. All spectra without precursor are kept!
pc_mz_range: MSn (n>=2) precursor filtering according to their m/z value. Do not use this flag in conjunction with 'mz', unless you want to actually remove peaks in spectra (see 'mz'). RT filtering is covered by 'rt' and compatible with this flag.
pc_mz_list: List of m/z values. If a precursor window covers ANY of these values, the corresponding MS/MS spectrum will be kept.
level: MS levels to extract
sort_peaks: Sorts the peaks according to m/z
no_chromatograms: No conversion to space-saving real chromatograms, e.g. from SRM scans
remove_chromatograms: Removes chromatograms stored in a file
remove_empty: Removes spectra and chromatograms without peaks.
mz_precision: Store base64 encoded m/z data using 32 or 64 bit precision
int_precision: Store base64 encoded intensity data using 32 or 64 bit precision
indexed_file: Whether to add an index to the file when writing
zlib_compression: Whether to store data with zlib compression (lossless compression)
masstime: Apply MS Numpress compression algorithms in m/z or rt dimension (recommended: linear)
lossy_mass_accuracy: Desired (absolute) m/z accuracy for lossy compression (e.g. use 0.0001 for a mass accuracy of 0.2 ppm at 500 m/z, default uses -1.0 for maximal accuracy).
intensity: Apply MS Numpress compression algorithms in intensity dimension (recommended: slof or pic)
float_da: Apply MS Numpress compression algorithms for the float data arrays (recommended: slof or pic)
remove_zoom: Remove zoom (enhanced resolution) scans
remove_mode: Remove scans by scan mode
remove_activation: Remove MSn scans where any of its precursors features a certain activation method
remove_collision_energy: Remove MSn scans with a collision energy in the given interval
remove_isolation_window_width: Remove MSn scans whose isolation window width is in the given interval
select_zoom: Select zoom (enhanced resolution) scans
select_mode: Selects scans by scan mode
select_activation: Retain MSn scans where any of its precursors features a certain activation method
select_collision_energy: Select MSn scans with a collision energy in the given interval
select_isolation_window_width: Select MSn scans whose isolation window width is in the given interval
select_polarity: Retain MSn scans with a certain scan polarity
replace_pc_charge: Replaces in_charge with out_charge in all precursors.
similarity_threshold: Similarity threshold when matching MS2 spectra. (-1 = disabled).
rt: Retention time tolerance [s] when matching precursor positions. (-1 = disabled)
mz: m/z tolerance [Th] when matching precursor positions. (-1 = disabled)
use_ppm_tolerance: If ppm tolerance should be used. Otherwise Da are used.
blacklist: True: remove matched MS2. False: retain matched MS2 spectra. Other levels are kept
q: Overall quality range to extract [0:1]
map: Non-empty list of maps to be extracted from a consensus (indices are 0-based).
map_and: Consensus features are kept only if they contain exactly one feature from each map (as given above in 'map')
blacklist: True: remove matched MS2. False: retain matched MS2 spectra. Other levels are kept
maps: Maps used for black/white list filtering
rt: Retention tolerance [s] for precursor to consensus feature position
mz: m/z tolerance [Th] for precursor to consensus feature position
use_ppm_tolerance: If ppm tolerance should be used. Otherwise Da are used.
charge: Charge range to extract
size: Size range to extract
remove_meta: Expects a 3-tuple (3 entries in the list) of the form <name> 'lt|eq|gt' <value>: the name of the meta value, the comparison operator (less than, equal, or greater than), and the value to compare against. All comparisons are done after converting the given value to the data type of the meta value (for lists, only the length is compared, not the content!).
remove_hull: Remove hull from features.
remove_clashes: Remove features with ID clashes (different sequences mapped to one feature)
keep_best_score_id: In case of multiple peptide identifications, keep only the ID with the best score
sequences_whitelist: Keep only features containing whitelisted substrings, e.g. features containing LYSNLVER or the modification (Oxidation). To control comparison method used for whitelisting, see 'id:sequence_comparison_method'.
sequence_comparison_method: Comparison method used to determine if a feature is whitelisted.
accessions_whitelist: Keep only features with whitelisted accessions, e.g. sp|P02662|CASA1_BOVIN
remove_annotated_features: Remove features with annotations
remove_unannotated_features: Remove features without annotations
remove_unassigned_ids: Remove unassigned peptide identifications
rt: Retention tolerance [s] for precursor to id position
mz: m/z tolerance [Th] for precursor to id position
blacklist_imperfect: Allow for mismatching precursor positions (see 'id:blacklist')
max_intensity: Maximal intensity considered for histogram construction. By default, it is calculated automatically (see 'auto_mode'). Only provide this parameter if you know what you are doing (and change 'auto_mode' to '-1')! All intensities equal to or above 'max_intensity' are added to the last histogram bin. If 'max_intensity' is chosen too small, the noise estimate might be too small as well. If it is chosen too big, the bins become quite large (which you could counter by increasing 'bin_count', at the cost of increased runtime). In general, the Median-S/N estimator is more robust to a manual 'max_intensity' than the MeanIterative-S/N estimator.
auto_max_stdev_factor: Parameter for 'max_intensity' estimation (if 'auto_mode' == 0): mean + 'auto_max_stdev_factor' * stdev
auto_max_percentile: Parameter for 'max_intensity' estimation (if 'auto_mode' == 1): the 'auto_max_percentile'-th percentile
auto_mode: Method used to determine the maximal intensity: -1 --> use 'max_intensity'; 0 --> 'auto_max_stdev_factor' method (default); 1 --> 'auto_max_percentile' method
win_len: Window length in Thomson
bin_count: Number of bins for intensity values
min_required_elements: Minimum number of elements required in a window (otherwise it is considered sparse)
noise_for_empty_window: Noise value used for sparse windows
write_log_messages: Write out log messages in case of sparse windows or median in rightmost histogram bin
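
The interplay of 'auto_mode', 'auto_max_stdev_factor', 'auto_max_percentile' and 'max_intensity' described above can be sketched in Python as follows. This is only an illustrative sketch, not the OpenMS implementation: the function names, the default values (3.0 and 95), the use of a population standard deviation, the nearest-rank percentile and the equal-width binning are all assumptions made for the example.

```python
import math
import statistics


def estimate_max_intensity(intensities, auto_mode=0, max_intensity=None,
                           auto_max_stdev_factor=3.0, auto_max_percentile=95):
    """Pick the upper histogram bound as described for 'auto_mode'.

    Hypothetical helper; the defaults are illustrative assumptions.
    """
    if auto_mode == -1:
        # Manual mode: the caller must supply 'max_intensity'.
        if max_intensity is None:
            raise ValueError("auto_mode == -1 requires max_intensity")
        return float(max_intensity)
    if auto_mode == 0:
        # mean + 'auto_max_stdev_factor' * stdev
        mean = statistics.fmean(intensities)
        stdev = statistics.pstdev(intensities)  # population stdev (assumption)
        return mean + auto_max_stdev_factor * stdev
    if auto_mode == 1:
        # Nearest-rank percentile (assumption about the exact definition).
        ordered = sorted(intensities)
        rank = max(1, math.ceil(auto_max_percentile * len(ordered) / 100))
        return float(ordered[rank - 1])
    raise ValueError("auto_mode must be -1, 0 or 1")


def histogram_bin(intensity, max_intensity, bin_count):
    """Map a non-negative intensity to a bin index.

    Intensities equal to or above 'max_intensity' go to the last bin.
    """
    if intensity >= max_intensity:
        return bin_count - 1
    return int(intensity / max_intensity * bin_count)
```

With a small 'max_intensity', many peaks collapse into the last bin, which is one way to see why the description warns that the noise estimate may then come out too small.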

Input Ports

: Input file [mzML,featureXML,consensusXML]
: Input file containing MS2 spectra that should be retained or removed from the mzML file. Matching tolerances are taken from the 'spectra:blackorwhitelist:similarity_threshold|rt|mz' options. [mzML,opt.]
: Input file containing consensus features whose corresponding MS2 spectra should be removed from the mzML file. Matching tolerances are taken from the 'consensus:blackorwhitelist:rt' and 'consensus:blackorwhitelist:mz' options. If 'consensus:blackorwhitelist:maps' is specified, only these maps will be used. [consensusXML,opt.]
: Input file containing MS2 identifications whose corresponding MS2 spectra should be removed from the mzML file. Matching tolerances are taken from the 'id:rt' and 'id:mz' options. This tool requires all IDs to be matched to an MS2 spectrum and quits with an error otherwise. Use 'id:blacklist_imperfect' to allow for mismatches. [idXML,opt.]
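
The tolerance-based matching that these optional inputs rely on ('rt' in seconds, 'mz' in Th or ppm depending on 'use_ppm_tolerance', with -1 disabling a check) might look roughly like the Python sketch below. It is a hedged illustration only: the function names are hypothetical, and taking the ppm tolerance relative to the reference m/z is an assumption, not something the option descriptions state.

```python
def mz_tolerance_in_th(reference_mz, tolerance, use_ppm):
    """Convert an m/z tolerance to Th; a ppm value scales with the reference m/z."""
    if use_ppm:
        return reference_mz * tolerance * 1e-6
    return tolerance


def precursor_matches(spec_rt, spec_mz, ref_rt, ref_mz,
                      rt_tol, mz_tol, use_ppm=False):
    """Return True if a precursor position lies within both tolerances.

    A negative tolerance disables the corresponding check (mirrors '-1 = disabled').
    """
    if rt_tol >= 0 and abs(spec_rt - ref_rt) > rt_tol:
        return False
    if mz_tol >= 0:
        allowed = mz_tolerance_in_th(ref_mz, mz_tol, use_ppm)
        if abs(spec_mz - ref_mz) > allowed:
            return False
    return True


# A precursor 0.004 Th away from a reference at m/z 500 is inside a
# 10 ppm window (0.005 Th) but outside a 5 ppm window (0.0025 Th).
inside_10ppm = precursor_matches(100.0, 500.004, 101.0, 500.0, 2.0, 10.0, use_ppm=True)
inside_5ppm = precursor_matches(100.0, 500.004, 101.0, 500.0, 2.0, 5.0, use_ppm=True)
```

The same conversion explains the 'lossy_mass_accuracy' example in the options: 0.2 ppm at m/z 500 corresponds to 0.0001 Th.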

Output Ports

: Output file [mzML,featureXML,consensusXML]

Views

FileFilter Std Output: The text sent to standard out during the execution of FileFilter.
FileFilter Error Output: The text sent to standard error during the execution of FileFilter. (If it appears in gray, it is the output of a previously failing run, which is preserved for your troubleshooting.)

Installation

To use this node in KNIME, install the extension OpenMS from the below update site following our NodePit Product and Node Installation Guide:

v5.5

Plugin provider: Freie Universitaet Berlin, Universitaet Tuebingen, ZIB (GKN-Team) and the OpenMS Team

Plugin version: 3.4.0.202501170921

On NodePit since: 2025-07-02

Last update: 2025-08-02

KNIME versions: v5.5, v5.4, v5.3, v5.2, v5.1, v4.7, v4.6, v4.5, v4.4, v4.3, v4.2, v4.1, v4.0, v3.7, v3.6
