OpenSwathWorkflow

Generic Workflow Nodes for KNIME: OpenMS version 2.3.0.201712211252 by Freie Universitaet Berlin, Universitaet Tuebingen, and the OpenMS Team

Complete workflow to run OpenSWATH

Web Documentation for OpenSwathWorkflow

Options

version
Version of the tool that generated this parameters file.
tr_type
input file type -- default: determined from file extension or content
sort_swath_maps
Sort the input SWATH files when matching them to the SWATH windows from swath_windows_file
use_ms1_traces
Extract the precursor ion trace(s) and use for scoring
enable_uis_scoring
Enable additional scoring of identification assays
min_upper_edge_dist
Minimal distance to the edge to still consider a precursor, in Thomson
rt_extraction_window
Only extract RT around this value (-1 means extract over the whole range, a value of 600 means to extract around +/- 300 s of the expected elution).
extra_rt_extraction_window
Output an XIC with an RT window that is larger by this much (e.g. to visually inspect a larger area of the chromatogram)
mz_extraction_window
Extraction window used (in Thomson, to use ppm see -ppm flag)
ppm
m/z extraction_window is in ppm
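The window options above can be illustrated with a small sketch. It follows the stated RT semantics (a value of 600 means +/- 300 s around the expected elution, -1 means the whole range). For m/z, it assumes the window is likewise a total width centered on the target m/z; that is an assumption made here by analogy, not something this page states. The function names rt_bounds and mz_bounds are hypothetical and are not part of OpenMS.

```python
def rt_bounds(expected_rt, rt_extraction_window):
    """Return (lower, upper) RT bounds in seconds, or None for the whole range."""
    if rt_extraction_window < 0:
        # -1 means: extract over the whole RT range
        return None
    half = rt_extraction_window / 2.0
    return (expected_rt - half, expected_rt + half)


def mz_bounds(target_mz, mz_extraction_window, ppm=False):
    """Return (lower, upper) m/z bounds.

    Assumption: the window is a total width centered on target_mz.
    With ppm=True the width is given in parts per million of target_mz,
    otherwise it is given in Thomson.
    """
    if ppm:
        half = target_mz * mz_extraction_window * 1e-6 / 2.0
    else:
        half = mz_extraction_window / 2.0
    return (target_mz - half, target_mz + half)


print(rt_bounds(1200.0, 600.0))  # (900.0, 1500.0)
print(rt_bounds(1200.0, -1))     # None
print(mz_bounds(500.0, 0.05))
print(mz_bounds(500.0, 50.0, ppm=True))
```

At m/z 500, a 50 ppm window under this assumption is 0.025 Th wide, which shows why the same numeric value means very different things depending on the ppm flag.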
sonar
Data is scanning SWATH (SONAR) data
min_rsq
Minimum r-squared of RT peptides regression
min_coverage
Minimum relative amount of RT peptides to keep
split_file_input
The input files each contain one single SWATH (alternatively: all SWATH are in separate files)
use_elution_model_score
Turn on elution model score (EMG fit to peak)
readOptions
Whether to run OpenSWATH directly on the input data, to cache data to disk first, or to perform a data reduction step first. If you choose cache, make sure to also set tempDirectory.
mz_correction_function
Use the retention time normalization peptide MS2 masses to perform a mass correction (linear, weighted by intensity linear or quadratic) of all spectra.
irt_mz_extraction_window
Extraction window used for iRT and m/z correction (in Thomson; to use ppm, see ppm_irtwindow)
ppm_irtwindow
iRT m/z extraction_window is in ppm
tempDirectory
Temporary directory, used for example to store cached files
extraction_function
Function used to extract the signal
batchSize
The batch size of chromatograms to process (0 means to only have one batch, sensible values are around 500-1000)
log
Name of log file (created only when specified)
debug
Sets the debug level
threads
Sets the number of threads allowed to be used by the TOPP tool
no_progress
Disables progress logging to command line
force
Overwrite tool specific checks.
test
Enables the test mode (needed for internal use only)
retentionTimeInterpretation
How to interpret the provided retention time (the retention time column can either be interpreted to be in iRT, minutes or seconds)
override_group_label_check
Override an internal check that assures that all members of the same PeptideGroupLabel have the same PeptideSequence (this ensures that only different isotopic forms of the same peptide can be grouped together in the same label group). Only turn this off if you know what you are doing.
force_invalid_mods
Force reading even if invalid modifications are encountered (OpenMS may not recognize the modification)
alignmentMethod
How to perform the alignment to the normalized RT space using anchor points. 'linear': perform linear regression (for few anchor points). 'interpolated': interpolate between anchor points (for few, noise-free anchor points). 'lowess': use local regression (for many, noisy anchor points). 'b_spline': use b-splines for smoothing.
outlierMethod
Which outlier detection method to use (valid: 'iter_residual', 'iter_jackknife', 'ransac', 'none'). Iterative methods remove one outlier at a time. Jackknife approach optimizes for maximum r-squared improvement while 'iter_residual' removes the datapoint with the largest residual error (removal by residual is computationally cheaper, use this with lots of peptides).
useIterativeChauvenet
Whether to use Chauvenet's criterion when using iterative methods. This should be used if the algorithm removes too many datapoints but it may lead to true outliers being retained.
RANSACMaxIterations
Maximum iterations for the RANSAC outlier detection algorithm.
RANSACMaxPercentRTThreshold
Maximum threshold in RT dimension for the RANSAC outlier detection algorithm (in percent of the total gradient). Default is set to 3%, which is around +/- 4 minutes on a 120-minute gradient.
RANSACSamplingSize
Sampling size of data points per iteration for the RANSAC outlier detection algorithm.
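As a quick check of the arithmetic behind RANSACMaxPercentRTThreshold, the sketch below converts a percentage of the total gradient into an absolute RT tolerance. The function name is hypothetical, and how OpenMS applies the tolerance internally is not described on this page.

```python
def ransac_rt_threshold_minutes(gradient_length_min, max_percent_rt_threshold=3.0):
    """Convert a threshold given in percent of the gradient into minutes."""
    return gradient_length_min * max_percent_rt_threshold / 100.0


# 3% of a 120-minute gradient
print(ransac_rt_threshold_minutes(120.0, 3.0))
```

For a 120-minute gradient this gives 3.6 minutes, which the description rounds to roughly 4 minutes.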
estimateBestPeptides
Whether the algorithms should try to choose the best peptides based on their peak shape for normalization. Use this option if you do not expect all your peptides to be detected in a sample and too many 'bad' peptides enter the outlier removal step (e.g. due to them being endogenous peptides or using a less curated list of peptides).
InitialQualityCutoff
The initial overall quality cutoff for a peak to be scored (range ca. -2 to 2)
OverallQualityCutoff
The overall quality cutoff for a peak to go into the retention time estimation (range ca. 0 to 10)
NrRTBins
Number of RT bins to use to compute coverage. This option should be used to ensure that there is a complete coverage of the RT space (this should detect cases where only a part of the RT gradient is actually covered by normalization peptides)
MinPeptidesPerBin
Minimal number of peptides that are required for a bin to be counted as 'covered'
MinBinsFilled
Minimal number of bins required to be covered
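The three coverage options (NrRTBins, MinPeptidesPerBin, MinBinsFilled) work together. The following is a minimal sketch of such a coverage check, assuming equal-width bins over a known RT range; the function names, the binning details and the example values are illustrative assumptions, not the OpenMS implementation or its defaults.

```python
def count_covered_bins(rts, rt_min, rt_max, nr_rt_bins, min_peptides_per_bin):
    """Count bins that contain at least min_peptides_per_bin peptides."""
    counts = [0] * nr_rt_bins
    width = (rt_max - rt_min) / nr_rt_bins
    for rt in rts:
        if rt < rt_min or rt > rt_max:
            continue  # ignore peptides outside the considered RT range
        # Clamp so that rt == rt_max falls into the last bin
        idx = min(int((rt - rt_min) / width), nr_rt_bins - 1)
        counts[idx] += 1
    return sum(1 for c in counts if c >= min_peptides_per_bin)


def has_sufficient_coverage(rts, rt_min, rt_max, nr_rt_bins,
                            min_peptides_per_bin, min_bins_filled):
    """True if enough bins are covered by normalization peptides."""
    covered = count_covered_bins(rts, rt_min, rt_max, nr_rt_bins,
                                 min_peptides_per_bin)
    return covered >= min_bins_filled


# Peptides that only cover the first half of a 0-100 RT range
first_half_only = [5.0, 8.0, 15.0, 22.0, 31.0, 44.0]
print(count_covered_bins(first_half_only, 0.0, 100.0, 10, 1))          # 5
print(has_sufficient_coverage(first_half_only, 0.0, 100.0, 10, 1, 8))  # False
```

The example reproduces the case the NrRTBins description warns about: the peptides fit well locally but cover only part of the gradient, so the check fails.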
span
Span parameter for lowess
num_nodes
Number of nodes for b spline
stop_report_after_feature
Stop reporting after feature (ordered by quality; -1 means do not stop).
rt_normalization_factor
The normalized RT is expected to be between 0 and 1. If your normalized RT has a different range, pass this here (e.g. it goes from 0 to 100, set this value to 100)
quantification_cutoff
Cutoff in m/z below which peaks should not be used for quantification any more
write_convex_hull
Whether to write out all points of all features into the featureXML
uis_threshold_sn
S/N threshold to consider identification transition (set to -1 to consider all)
uis_threshold_peak_area
Peak area threshold to consider identification transition (set to -1 to consider all)
stop_after_feature
Stop finding after feature (ordered by intensity; -1 means do not stop).
min_peak_width
Minimal peak width (s), discard all peaks below this value (-1 means no action).
background_subtraction
Try to apply a background subtraction to the peak (experimental). The background is estimated at the peak boundaries, either the smoothed or the raw chromatogram data can be used for that.
recalculate_peaks
Tries to get better peak picking by looking at peak consistency of all picked peaks. Tries to use the consensus (median) peak border if the variation within the picked peaks is too large.
use_precursors
Use precursor chromatogram for peak picking
recalculate_peaks_max_z
Determines the maximal Z-Score (difference measured in standard deviations) that is considered too large for peak boundaries. If the Z-Score is above this value, the median is used for peak boundaries (default value 1.0).
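The interplay of recalculate_peaks and recalculate_peaks_max_z can be sketched as follows. This is an illustration under assumptions: it computes the Z-score of each border against the mean and population standard deviation of all borders and replaces outliers with the median. Whether OpenMS centers on the mean or uses a sample standard deviation is not stated here, and recalculate_borders is a hypothetical name.

```python
import statistics


def recalculate_borders(borders, max_z=1.0):
    """Replace peak borders whose Z-score exceeds max_z with the median border."""
    median = statistics.median(borders)
    mean = statistics.mean(borders)
    sd = statistics.pstdev(borders)
    if sd == 0:
        # All borders agree; nothing to correct
        return list(borders)
    return [median if abs(b - mean) / sd > max_z else b for b in borders]


# Left borders (in seconds) of the same peak picked in five transitions
left_borders = [100.0, 101.0, 99.0, 100.0, 120.0]
print(recalculate_borders(left_borders, max_z=1.0))
# [100.0, 101.0, 99.0, 100.0, 100.0]
```

Only the border at 120 s deviates by more than one standard deviation, so only that one is replaced by the consensus value.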
minimal_quality
Only applies if compute_peak_quality is set: peaks below this quality threshold are not considered
resample_boundary
For computing peak quality, how many extra seconds should be sampled left and right of the actual peak
compute_peak_quality
Tries to compute a quality value for each peakgroup and detect outlier transitions. The resulting score is centered around zero and values above 0 are generally good and below -1 or -2 are usually bad.
sgolay_frame_length
The number of consecutive data points used for smoothing. This number has to be odd. If it is not, 1 will be added.
sgolay_polynomial_order
Order of the polynomial that is fitted.
gauss_width
Gaussian width in seconds, estimated peak size.
use_gauss
Use Gaussian filter for smoothing (alternative is Savitzky-Golay filter)
peak_width
Force a certain minimal peak_width on the data (e.g. extend the peak at least by this amount on both sides) in seconds. -1 turns this feature off.
signal_to_noise
Signal-to-noise threshold at which a peak will not be extended any more. Note that setting this too high (e.g. 1.0) can lead to peaks whose flanks are not fully captured.
write_sn_log_messages
Write out log messages of the signal-to-noise estimator in case of sparse windows or median in rightmost histogram bin
remove_overlapping_peaks
Try to remove overlapping peaks during peak picking
method
Which method to choose for chromatographic peak-picking (OpenSWATH legacy on raw data, corrected picking on smoothed chromatogram or Crawdad on smoothed chromatogram).
dia_extraction_window
DIA extraction window in Th.
dia_centroided
Use centroided DIA data.
dia_byseries_intensity_min
DIA b/y series minimum intensity to consider.
dia_byseries_ppm_diff
DIA b/y series minimal difference in ppm to consider.
dia_nr_isotopes
DIA nr of isotopes to consider.
dia_nr_charges
DIA nr of charges to consider.
peak_before_mono_max_ppm_diff
DIA maximal difference in ppm to count a peak at lower m/z when searching for evidence that a peak might not be monoisotopic.
max_iteration
Maximum number of iterations used by the Levenberg-Marquardt algorithm.
use_shape_score
Use the shape score (this score measures the similarity in shape of the transitions using a cross-correlation)
use_coelution_score
Use the coelution score (this score measures the similarity in coelution of the transitions using a cross-correlation)
use_rt_score
Use the retention time score (this score measures the difference in retention time)
use_library_score
Use the library score
use_intensity_score
Use the intensity score
use_nr_peaks_score
Use the number of peaks score
use_total_xic_score
Use the total XIC score
use_sn_score
Use the SN (signal to noise) score
use_dia_scores
Use the DIA (SWATH) scores. If turned off, will not use fragment ion spectra for scoring.
use_ms1_correlation
Use the correlation scores with the MS1 elution profiles
use_sonar_scores
Use the scores for SONAR scans (scanning swath)
use_ms1_fullscan
Use the full MS1 scan at the peak apex for scoring (ppm accuracy of precursor and isotopic pattern)
use_uis_scores
Use UIS scores for peptidoform identification

Input Ports

Input files separated by blank [mzML,mzXML]
transition file ('TraML','tsv','pqp') [traML,tsv,pqp]
transition file ('TraML') [traML,opt.]
RT normalization file (how to map the RTs of this run to the ones stored in the library). If set, tr_irt may be omitted. [trafoXML,opt.]
Optional, tab-separated file containing the SWATH windows for extraction, one window per line: a header line 'lower_offset upper_offset', followed by lines such as '400 425', and so on. Note that the first line is a header and will be skipped. [,opt.]
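To make the SWATH windows file layout concrete, here is a minimal sketch that parses such a file: tab-separated, two columns, first line skipped as a header. The reader function is hypothetical and only mirrors the format described above; the second window (425 to 450) is an invented illustrative row, since the description only shows the first one.

```python
import csv
import io


def read_swath_windows(handle):
    """Parse a tab-separated SWATH windows file into (lower, upper) tuples."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # the first line is a header and is skipped
    windows = []
    for row in reader:
        if not row or not row[0].strip():
            continue  # tolerate blank lines
        windows.append((float(row[0]), float(row[1])))
    return windows


example = "lower_offset\tupper_offset\n400\t425\n425\t450\n"
print(read_swath_windows(io.StringIO(example)))
# [(400.0, 425.0), (425.0, 450.0)]
```

Because the first line is always skipped, a file without a header would silently lose its first window, so it is worth checking that the header line is present.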

Output Ports

output file [featureXML,Inactive]
TSV output file (mProphet compatible TSV file) [tsv,Inactive]
OSW output file (PyProphet compatible SQLite file) [osw,Inactive]
Also output all computed chromatograms in mzML (chrom.mzML) or sqMass (SQLite format) [mzML,sqMass,Inactive]

Views

OpenSwathWorkflow Std Output
The text sent to standard out during the execution of OpenSwathWorkflow.
OpenSwathWorkflow Error Output
The text sent to standard error during the execution of OpenSwathWorkflow. (If it appears in gray, it is the output of a previously failing run, which is preserved for your troubleshooting.)

Update Site

To use this node in KNIME, install Generic Workflow Nodes for KNIME: OpenMS from the following update site:
