FileFilter

Extracts or manipulates portions of data from peak, feature or consensus-feature files.


Options

version
Version of the tool that generated this parameters file.
in_type
Input file type -- default: determined from file extension or content
rt
Retention time range to extract
mz
m/z range to extract (applies to ALL ms levels!)
int
Intensity range to extract
sort
Sorts the output according to RT and m/z.
log
Name of log file (created only when specified)
debug
Sets the debug level
threads
Sets the number of threads allowed to be used by the TOPP tool
no_progress
Disables progress logging to command line
force
Overrides tool-specific checks
test
Enables the test mode (needed for internal use only)
sn
Write peaks with S/N > 'sn' values only
rm_pc_charge
Remove MS(2) spectra with these precursor charges. All spectra without precursor are kept!
pc_mz_range
MSn (n>=2) precursor filtering according to their m/z value. Do not use this flag in conjunction with 'mz', unless you want to actually remove peaks in spectra (see 'mz'). RT filtering is covered by 'rt' and compatible with this flag.
pc_mz_list
List of m/z values. If a precursor window covers ANY of these values, the corresponding MS/MS spectrum will be kept.
level
MS levels to extract
sort_peaks
Sorts the peaks according to m/z
no_chromatograms
No conversion to space-saving real chromatograms, e.g. from SRM scans
remove_chromatograms
Removes chromatograms stored in a file
remove_empty
Removes spectra and chromatograms without peaks.
mz_precision
Store base64 encoded m/z data using 32 or 64 bit precision
int_precision
Store base64 encoded intensity data using 32 or 64 bit precision
indexed_file
Whether to add an index to the file when writing
zlib_compression
Whether to store data with zlib compression (lossless compression)
masstime
Apply MS Numpress compression algorithms in m/z or rt dimension (recommended: linear)
lossy_mass_accuracy
Desired (absolute) m/z accuracy for lossy compression (e.g. use 0.0001 for a mass accuracy of 0.2 ppm at 500 m/z, default uses -1.0 for maximal accuracy).
intensity
Apply MS Numpress compression algorithms in intensity dimension (recommended: slof or pic)
float_da
Apply MS Numpress compression algorithms for the float data arrays (recommended: slof or pic)
remove_zoom
Remove zoom (enhanced resolution) scans
remove_mode
Remove scans by scan mode
remove_activation
Remove MSn scans where any precursor uses a given activation method
remove_collision_energy
Remove MSn scans with a collision energy in the given interval
remove_isolation_window_width
Remove MSn scans whose isolation window width is in the given interval
select_zoom
Select zoom (enhanced resolution) scans
select_mode
Selects scans by scan mode
select_activation
Retain MSn scans where any precursor uses a given activation method
select_collision_energy
Select MSn scans with a collision energy in the given interval
select_isolation_window_width
Select MSn scans whose isolation window width is in the given interval
select_polarity
Retain MSn scans with a certain scan polarity
replace_pc_charge
Replaces in_charge with out_charge in all precursors.
similarity_threshold
Similarity threshold when matching MS2 spectra. (-1 = disabled).
rt
Retention time tolerance [s] when matching precursor positions. (-1 = disabled)
mz
m/z tolerance [Th] when matching precursor positions. (-1 = disabled)
use_ppm_tolerance
Whether to use a ppm tolerance. Otherwise, Da is used.
blacklist
True: remove matched MS2 spectra. False: retain matched MS2 spectra. Other MS levels are kept.
q
Overall quality range to extract [0:1]
map
Non-empty list of maps to be extracted from a consensus (indices are 0-based).
map_and
Consensus features are kept only if they contain exactly one feature from each map (as given above in 'map')
blacklist
True: remove matched MS2 spectra. False: retain matched MS2 spectra. Other MS levels are kept.
maps
Maps used for black/white list filtering
rt
Retention time tolerance [s] for precursor to consensus feature position
mz
m/z tolerance [Th] for precursor to consensus feature position
use_ppm_tolerance
Whether to use a ppm tolerance. Otherwise, Da is used.
charge
Charge range to extract
size
Size range to extract
remove_meta
Expects a 3-tuple (three entries in the list): <name> 'lt|eq|gt' <value>. The first entry is the name of the meta value, the second is the comparison operator (less than, equal, or greater than), and the third is the value to compare against. All comparisons are done after converting the given value to the data type of the meta value (for lists, only the length is compared, not the content).
remove_hull
Remove hull from features.
remove_clashes
Remove features with id clashes (different sequences mapped to one feature)
keep_best_score_id
In case of multiple peptide identifications, keep only the ID with the best score
sequences_whitelist
Keep only features containing whitelisted substrings, e.g. features containing LYSNLVER or the modification (Oxidation). To control the comparison method used for whitelisting, see 'id:sequence_comparison_method'.
sequence_comparison_method
Comparison method used to determine if a feature is whitelisted.
accessions_whitelist
Keep only features with whitelisted accessions, e.g. sp|P02662|CASA1_BOVIN
remove_annotated_features
Remove features with annotations
remove_unannotated_features
Remove features without annotations
remove_unassigned_ids
Remove unassigned peptide identifications
rt
Retention time tolerance [s] for precursor to ID position
mz
m/z tolerance [Th] for precursor to id position
blacklist_imperfect
Allow for mismatching precursor positions (see 'id:blacklist')
max_intensity
Maximal intensity considered for histogram construction. By default, it is calculated automatically (see 'auto_mode'). Only provide this parameter if you know what you are doing (and change 'auto_mode' to '-1')! All intensities equal to or above 'max_intensity' are added to the last histogram bin. If 'max_intensity' is chosen too small, the noise estimate might be too small as well. If it is chosen too big, the bins become quite large (which you could counter by increasing 'bin_count', at the cost of increased runtime). In general, the Median-S/N estimator is more robust to a manual 'max_intensity' than the MeanIterative-S/N estimator.
auto_max_stdev_factor
Parameter for 'max_intensity' estimation (if 'auto_mode' == 0): mean + 'auto_max_stdev_factor' * stdev
auto_max_percentile
Parameter for 'max_intensity' estimation (if 'auto_mode' == 1): the 'auto_max_percentile'-th percentile
auto_mode
Method used to determine the maximal intensity: -1 --> use 'max_intensity'; 0 --> 'auto_max_stdev_factor' method (default); 1 --> 'auto_max_percentile' method
win_len
Window length in Thomson
bin_count
Number of bins for intensity values
min_required_elements
Minimum number of elements required in a window (otherwise it is considered sparse)
noise_for_empty_window
Noise value used for sparse windows
write_log_messages
Write out log messages in case of sparse windows or median in rightmost histogram bin
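
The three 'auto_mode' settings and the "last histogram bin" rule described for 'max_intensity' can be illustrated with a short Python sketch. This is not the OpenMS implementation: the function names `estimate_max_intensity` and `histogram_bin` are hypothetical, the default values for the factor and percentile are placeholders rather than values taken from this page, and the use of the population standard deviation, a nearest-rank percentile, and equal-width bins starting at zero are all assumptions made for illustration.

```python
import math
import statistics


def estimate_max_intensity(intensities, auto_mode=0, max_intensity=-1.0,
                           auto_max_stdev_factor=3.0, auto_max_percentile=95):
    """Illustrative upper bound for the intensity histogram (hypothetical helper)."""
    if auto_mode == -1:
        # Manual mode: trust the user-supplied 'max_intensity'.
        return max_intensity
    if auto_mode == 0:
        # mean + 'auto_max_stdev_factor' * stdev
        # (population stdev is an assumption; the page does not say which one).
        mean = statistics.fmean(intensities)
        stdev = statistics.pstdev(intensities)
        return mean + auto_max_stdev_factor * stdev
    if auto_mode == 1:
        # Nearest-rank percentile (one of several possible percentile definitions).
        ordered = sorted(intensities)
        rank = max(1, math.ceil(auto_max_percentile * len(ordered) / 100))
        return ordered[rank - 1]
    raise ValueError("auto_mode must be -1, 0 or 1")


def histogram_bin(intensity, max_intensity, bin_count):
    """Map an intensity to a bin index; values >= max_intensity go to the last bin."""
    if intensity >= max_intensity:
        return bin_count - 1
    # Equal-width bins over [0, max_intensity) are assumed here.
    return int(intensity / max_intensity * bin_count)
```

Under these assumptions, a 'max_intensity' that is too small pushes most peaks into the last bin, while one that is too large makes each bin wide, which matches the trade-off described for 'max_intensity' and 'bin_count'.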

Input Ports

Input file [mzML,featureXML,consensusXML]
Input file containing MS2 spectra that should be retained or removed from the mzML file! Matching tolerances are taken from 'spectra:blackorwhitelist:similarity_threshold|rt|mz' options. [mzML,opt.]
Input file containing consensus features whose corresponding MS2 spectra should be removed from the mzML file! Matching tolerances are taken from 'consensus:blackorwhitelist:rt' and 'consensus:blackorwhitelist:mz' options. If 'consensus:blackorwhitelist:maps' is specified, only these maps will be used. [consensusXML,opt.]
Input file containing MS2 identifications whose corresponding MS2 spectra should be removed from the mzML file! Matching tolerances are taken from 'id:rt' and 'id:mz' options. This tool will require all IDs to be matched to an MS2 spectrum, and quit with error otherwise. Use 'id:blacklist_imperfect' to allow for mismatches. [idXML,opt.]
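
The RT and m/z matching tolerances referenced by the optional input ports can be read as a window around each reference position. The Python sketch below shows one plausible reading, not the tool's actual code: the helper names are hypothetical, the window is assumed to be symmetric, the ppm tolerance is assumed to be relative to the reference m/z, a negative tolerance is treated as "disabled" (as stated for the 'spectra:blackorwhitelist' options), and whitelist mode is assumed to drop unmatched MS2 spectra.

```python
def mz_tolerance_da(reference_mz, tolerance, use_ppm_tolerance):
    """Convert the m/z tolerance to Da (hypothetical helper)."""
    if use_ppm_tolerance:
        # ppm is interpreted relative to the reference m/z (assumption).
        return reference_mz * tolerance * 1e-6
    return tolerance


def precursor_matches(prec_rt, prec_mz, ref_rt, ref_mz,
                      rt_tol, mz_tol, use_ppm_tolerance=False):
    """True if the precursor lies within both tolerance windows."""
    # A negative tolerance disables the check for that dimension (assumption).
    rt_ok = rt_tol < 0 or abs(prec_rt - ref_rt) <= rt_tol
    mz_ok = mz_tol < 0 or abs(prec_mz - ref_mz) <= mz_tolerance_da(
        ref_mz, mz_tol, use_ppm_tolerance)
    return rt_ok and mz_ok


def keep_ms2_spectrum(matched, blacklist):
    """Blacklist: remove matched MS2 spectra. Whitelist: retain matched ones."""
    return (not matched) if blacklist else matched
```

For example, with a 10 ppm tolerance at a reference m/z of 500, the window is about 0.005 Da wide on each side, so a precursor at 500.004 would match while one at 500.01 would not.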

Output Ports

Output file [mzML,featureXML,consensusXML]

Views

FileFilter Std Output
The text sent to standard out during the execution of FileFilter.
FileFilter Error Output
The text sent to standard error during the execution of FileFilter. (If it appears in gray, it is the output of a previously failing run, which is preserved for your troubleshooting.)
