IDFilter

Filters results from protein or peptide identification engines based on different criteria.


Options

version: Version of the tool that generated this parameters file.
var_mods: Keep only peptide hits with variable modifications (as defined in the 'SearchParameters' section of the input file).
remove_duplicate_psm: Removes duplicated PSMs per spectrum and retains the one with the higher score.
remove_shared_peptides: Only peptides matching exactly one protein are kept. Remember that isoforms count as different proteins!
keep_unreferenced_protein_hits: Proteins not referenced by a peptide are retained in the IDs.
remove_decoys: Removes decoy proteins, identified by the information in their user parameters. Usually used in combination with 'delete_unreferenced_peptide_hits'.
delete_unreferenced_peptide_hits: Peptides not referenced by any protein are deleted in the IDs. Usually used in combination with 'score:protein' or 'thresh:prot'.
remove_peptide_hits_by_metavalue: Expects a 3-tuple (three list entries) of the form <name> 'lt|eq|gt|ne' <value>: the name of the meta value, the comparison operator (less than, equal, greater than, not equal), and the value to compare against. The given value is converted to the data type of the meta value before the comparison. For list-valued meta values, only the length is compared, not the content.
log: Name of log file (created only when specified)
debug: Sets the debug level
threads: Sets the number of threads allowed to be used by the TOPP tool
no_progress: Disables progress logging to command line
force: Overrides tool-specific checks
test: Enables the test mode (needed for internal use only)
rt: Retention time range to extract.
mz: Mass-to-charge range to extract.
length: Keep only peptide hits with a sequence length in this range.
charge: Keep only peptide hits with charge states in this range.
psm: The score which should be reached by a peptide hit to be kept. (use 'NAN' to disable this filter)
peptide: The score which should be reached by a peptide hit to be kept. (use 'NAN' to disable this filter)
type_peptide: Score used for filtering. If empty, the main score is used.
protein: The score which should be reached by a protein hit to be kept. All proteins are filtered based on their singleton scores irrespective of grouping. Use in combination with 'delete_unreferenced_peptide_hits' to remove affected peptides. (use 'NAN' to disable this filter)
type_protein: The type of the score which should be reached by a protein hit to be kept. If empty, the most recently set score is used.
proteingroup: The score which should be reached by a protein group to be kept. Performs group level score filtering (including groups of single proteins). Use in combination with 'delete_unreferenced_peptide_hits' to remove affected peptides. (use 'NAN' to disable this filter)
whitelist:protein_accessions: All peptides that do not reference at least one of the provided protein accessions are removed. Only proteins in the provided list are retained.
whitelist:ignore_modifications: Compare whitelisted peptides by sequence only.
whitelist:modifications: Keep only peptides with sequences that contain (any of) the selected modification(s).
blacklist:protein_accessions: All peptides that reference at least one of the provided protein accessions are removed. Only proteins not in the provided list are retained.
blacklist:ignore_modifications: Compare blacklisted peptides by sequence only.
blacklist:modifications: Remove all peptides with sequences that contain (any of) the selected modification(s).
blacklist:RegEx: Remove all peptides with (unmodified) sequences matched by the RegEx. For example, [BJXZ] removes ambiguous peptides.
enzyme: Enzyme used for the digestion of the sample.
specificity: Specificity of the filter
missed_cleavages: Range of allowed missed cleavages in the peptide sequences. By default, missed cleavages are ignored.
methionine_cleavage: Allow methionine cleavage at the N-terminus of the protein.
number_of_missed_cleavages: Range of allowed missed cleavages in the peptide sequences. For example: 0:1 -> peptides with two or more missed cleavages will be removed, 0:0 -> peptides with any missed cleavages will be removed.
enzyme: Enzyme used for the digestion of the sample.
p_value: Retention time filtering by the p-value predicted by RTPredict.
p_value_1st_dim: Retention time filtering by the p-value predicted by RTPredict for first dimension.
error: Filtering by deviation to theoretical mass (disabled for negative values).
unit: Absolute or relative error.
n_spectra: Keep only the 'n' best spectra (i.e., PeptideIdentifications) (for n > 0). A spectrum is considered better if it has a higher scoring peptide hit than the other spectrum.
n_peptide_hits: Keep only the 'n' highest scoring peptide hits per spectrum (for n > 0).
spectrum_per_peptide: Keep one spectrum per peptide. Value determines if same sequence but different charges or modifications are treated as separate peptides or the same peptide. (default: false = filter disabled).
n_protein_hits: Keep only the 'n' highest scoring protein hits (for n > 0).
strict: Keep only the highest scoring peptide hit. Similar to n_peptide_hits=1, but if there are ties between two or more highest scoring hits, none are kept.
n_to_m_peptide_hits: Peptide hit rank range to extract.
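
The semantics of a few of the options above can be illustrated with a short Python sketch. This is not the OpenMS implementation: all function names are hypothetical, peptide hits are modelled as plain (sequence, score) tuples, higher scores are assumed to be better, and missed cleavages are counted with a simplified trypsin rule (cleavage after K or R unless followed by P).

```python
import re
from typing import List, Tuple


def metavalue_matches(value, op: str, target: str) -> bool:
    """Evaluate one '<name> lt|eq|gt|ne <value>' comparison (hypothetical helper)."""
    if isinstance(value, (list, tuple)):
        # List-valued meta values are compared by length only, not by content.
        left, right = len(value), int(target)
    else:
        # Convert the user-supplied string to the meta value's own data type.
        left, right = value, type(value)(target)
    return {"lt": left < right, "eq": left == right,
            "gt": left > right, "ne": left != right}[op]


def count_missed_cleavages(sequence: str) -> int:
    """Count internal K/R residues that are followed by a residue other than P.

    Simplified trypsin rule, assumed here for illustration only.
    """
    return len(re.findall(r"[KR](?=[^P])", sequence))


def in_range(count: int, spec: str) -> bool:
    """Check a count against a 'min:max' range such as '0:1' (inclusive)."""
    low, high = (int(part) for part in spec.split(":"))
    return low <= count <= high


def is_ambiguous(sequence: str, pattern: str = "[BJXZ]") -> bool:
    """True if the unmodified sequence is matched anywhere by the pattern."""
    return re.search(pattern, sequence) is not None


def best_strict(hits: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Keep the single top-scoring hit; keep nothing if the top score is tied."""
    if not hits:
        return []
    top = max(score for _, score in hits)
    winners = [hit for hit in hits if hit[1] == top]
    return winners if len(winners) == 1 else []
```

With these helpers, a range of 0:1 rejects a peptide such as AKRGK (two internal cleavage sites under the simplified rule), and best_strict returns an empty list whenever two hits share the highest score.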

Input Ports

: input file [idXML,consensusXML]
: Filename of a FASTA file containing protein sequences. All peptides that are not referencing a protein in this file are removed. All proteins whose accessions are not present in this file are removed. [fasta,opt.]
: Only peptides with the same sequence and modification assignment as any peptide in this file are kept. Use with 'whitelist:ignore_modifications' to only compare by sequence. [idXML,opt.]
: Filename of a FASTA file containing protein sequences. All peptides that are referencing a protein in this file are removed. All proteins whose accessions are present in this file are removed. [fasta,opt.]
: Peptides with the same sequence and modification assignment as any peptide in this file are filtered out. Use with 'blacklist:ignore_modifications' to only compare by sequence. [idXML,opt.]
: fasta protein sequence database. [fasta,opt.]
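
As a rough illustration of how the peptide whitelist and blacklist inputs interact with 'whitelist:ignore_modifications' and 'blacklist:ignore_modifications', the following Python sketch compares peptides either with or without their modification assignment. It is not the OpenMS implementation: the helper names are hypothetical, and modified peptides are assumed to be written as strings with modification names in parentheses, e.g. PEPM(Oxidation)TIDEK.

```python
import re
from typing import Iterable, List


def strip_mods(peptide: str) -> str:
    """Drop parenthesised modification names and terminal dots (assumed notation)."""
    return re.sub(r"\([^)]*\)", "", peptide).replace(".", "")


def filter_by_peptide_list(hits: Iterable[str], listed: Iterable[str],
                           keep_listed: bool,
                           ignore_modifications: bool = False) -> List[str]:
    """Whitelist (keep_listed=True) or blacklist (keep_listed=False) filtering."""
    # Compare full modified strings, or bare sequences when modifications are ignored.
    key = strip_mods if ignore_modifications else (lambda peptide: peptide)
    listed_keys = {key(peptide) for peptide in listed}
    return [hit for hit in hits if (key(hit) in listed_keys) == keep_listed]


hits = ["PEPM(Oxidation)TIDEK", "PEPMTIDEK", "AAAK"]
reference = ["PEPMTIDEK"]

# Whitelist, modifications respected: only the exact unmodified form survives.
exact = filter_by_peptide_list(hits, reference, keep_listed=True)
# Whitelist, sequence only: the oxidised form is kept as well.
by_sequence = filter_by_peptide_list(hits, reference, keep_listed=True,
                                     ignore_modifications=True)
# Blacklist, sequence only: both forms are removed.
remaining = filter_by_peptide_list(hits, reference, keep_listed=False,
                                   ignore_modifications=True)
```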

Output Ports

: output file [idXML,consensusXML]

Views

IDFilter Std Output: The text sent to standard out during the execution of IDFilter.
IDFilter Error Output: The text sent to standard error during the execution of IDFilter. (If it appears in gray, it is the output of a previously failing run, which is preserved for troubleshooting.)

Installation

To use this node in KNIME, install the OpenMS extension from its update site.

Plugin provider: Freie Universitaet Berlin, Universitaet Tuebingen, ZIB (GKN-Team) and the OpenMS Team

Plugin version: 3.4.0.202501170921

On NodePit since: 2024-12-06

Last update: 2025-06-07

KNIME versions: v5.4, v5.3, v5.2, v5.1, v4.7, v4.6, v4.5, v4.4, v4.3, v4.2, v4.1, v4.0, v3.7, v3.6
