MSSimulator

Generic Workflow Nodes for KNIME: OpenMS version 2.5.0.202002201806 by Freie Universitaet Berlin, Universitaet Tuebingen, and the OpenMS Team

A highly configurable simulator for mass spectrometry experiments.

Web Documentation for MSSimulator

Options

version
Version of the tool that generated this parameters file.
log
Name of log file (created only when specified)
debug
Sets the debug level
threads
Sets the number of threads allowed to be used by the TOPP tool
no_progress
Disables progress logging to command line
force
Overwrite tool specific checks.
test
Enables the test mode (needed for internal use only)
enzyme
Enzyme to use for digestion (select 'no cleavage' to skip digestion)
model
The cleavage model to use for digestion. 'Trained' is based on a log likelihood model (see DOI:10.1021/pr060507u).
min_peptide_length
Minimum peptide length after digestion (shorter ones will be discarded)
threshold
Model threshold for calling a cleavage. Higher values increase the number of cleavages. -2 will give no cleavages, +4 almost full cleavage.
missed_cleavages
Maximum number of missed cleavages considered. All possible resulting peptides will be created.
rt_column
Modelling of an RT or CE column
auto_scale
Scale predicted RTs/MTs to the given 'total_gradient_time'? If 'true', for CE this means that 'CE:lenght_d', 'CE:length_total', and 'CE:voltage' have no influence.
total_gradient_time
The duration [s] of the gradient.
sampling_rate
Time interval [s] between consecutive scans
min
Start of RT Scan Window [s]
max
End of RT Scan Window [s]
feature_stddev
Standard deviation of shift in retention time [s] from predicted model (applied to every single feature independently)
affine_offset
Global offset in retention time [s] from predicted model
affine_scale
Global scaling in retention time from predicted model
distortion
Distortion of the elution profiles. Good presets are 0 for a perfect elution profile, 1 for a slightly distorted elution profile, and so on. For trapping instruments (e.g. Orbitrap), distortion should be >4.
value
Width of the Exponential Gaussian Hybrid distribution shape of the elution profile. This does not correspond directly to the width in [s].
variance
Random component of the width (set to 0 to disable randomness), i.e. scale parameter for the Lorentzian variation of the variance (Note: The scale parameter has to be >= 0).
value
Asymmetric component of the EGH. Higher absolute(!) values lead to more skewness (negative values cause fronting, positive values cause tailing). Tau parameter of the EGH, i.e. time constant of the exponential decay of the Exponential Gaussian Hybrid distribution shape of the elution profile.
variance
Random component of skewness (set to 0 to disable randomness), i.e. scale parameter for the Lorentzian variation of the time constant (Note: The scale parameter has to be >= 0).
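As an illustration of how the width (sigma) and the asymmetric component (tau) shape an elution profile, here is a minimal sketch of the commonly published form of the Exponential Gaussian Hybrid function. The function name and arguments are hypothetical, and how MSSimulator maps its 'value' parameters onto sigma and tau internally is not stated here, so treat this as an assumption rather than the tool's implementation.

```python
import math


def egh(t, height, t_apex, sigma, tau):
    """Exponential Gaussian Hybrid intensity at time t.

    sigma controls the width, tau the asymmetry (tau > 0 tails,
    tau < 0 fronts). The profile is zero wherever the denominator
    is not positive. All names here are illustrative assumptions.
    """
    dt = t - t_apex
    denom = 2.0 * sigma * sigma + tau * dt
    if denom <= 0.0:
        return 0.0
    return height * math.exp(-(dt * dt) / denom)


# Sample a tailing profile around an apex at 100 s.
profile = [egh(t, 1.0, 100.0, 5.0, 2.0) for t in range(80, 131, 10)]
```

With tau set to 0 the expression reduces to a plain Gaussian, which matches the description of tau as the skewness-controlling term.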
model_file
SVM model for retention time prediction
pH
pH of buffer
alpha
Exponent Alpha used to calculate mobility
mu_eo
Electroosmotic flow
lenght_d
Length of capillary [cm] from injection site to MS
length_total
Total length of capillary [cm]
voltage
Voltage applied to capillary
dt_simulation_on
Modelling detectability enabled? This can serve as a filter to remove peptides that ionize badly, thus reducing the peptide count
min_detect
Minimum peptide detectability accepted. Peptides with a lower score will be removed
dt_model_file
SVM model for peptide detectability prediction
ionized_residues
List of residues (as three letter code) that will be considered during ES ionization. The N-term is always assumed to carry a charge. This parameter will be ignored during MALDI ionization
charge_impurity
List of charged ions that contribute to charge with weight of occurrence (their sum is scaled to 1 internally), e.g. ['H:1'] or ['H:0.7' 'Na:0.3'], ['H:4' 'Na:1'] (which internally translates to ['H:0.8' 'Na:0.2'])
max_impurity_set_size
Maximal number of combinations of charge impurities allowed (each generating one feature) per charge state. E.g. assuming charge=3 and this parameter is 2, then we could choose to allow '3H+, 2H+Na+' features (given certain 'charge_impurity' constraints), but not '3H+, 2H+Na+, 3Na+'
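The weight scaling described for 'charge_impurity' can be sketched as follows. This is only an illustration of the arithmetic (weights divided by their sum); the helper name is hypothetical and the parsing is an assumption about the '<ion>:<weight>' entry format shown in the examples.

```python
def normalize_charge_impurity(entries):
    """Turn entries like 'H:4' into weights that sum to 1.

    Hypothetical helper: it only mirrors the scaling described in the
    parameter text, not the tool's own parser.
    """
    weights = {}
    for entry in entries:
        ion, sep, raw = entry.partition(":")
        if not sep:
            raise ValueError(f"expected '<ion>:<weight>', got {entry!r}")
        weights[ion] = weights.get(ion, 0.0) + float(raw)
    total = sum(weights.values())
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    return {ion: weight / total for ion, weight in weights.items()}


scaled = normalize_charge_impurity(["H:4", "Na:1"])
```

For the input above, 'H' ends up with four fifths of the weight and 'Na' with one fifth, in line with the ['H:0.8' 'Na:0.2'] example.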
ionization_probability
Probability for the binomial distribution of the ESI charge states
ionization_probabilities
List of probabilities for different charge states (starting at charge=1, 2, ...) during MALDI ionization (the list must sum up to 1.0)
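To make the role of 'ionization_probability' more concrete, the sketch below computes a binomial distribution over charge states. It assumes that the number of trials equals the number of chargeable sites on a peptide (the residues listed in 'ionized_residues' plus the N-term); that mapping, and the function name, are assumptions for illustration and not confirmed details of the simulator.

```python
import math


def charge_state_distribution(n_sites, p):
    """Binomial probability of observing k charges for k = 0..n_sites.

    n_sites: assumed number of chargeable sites on the peptide.
    p: per-site ionization probability.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return [
        math.comb(n_sites, k) * p ** k * (1.0 - p) ** (n_sites - k)
        for k in range(n_sites + 1)
    ]


distribution = charge_state_distribution(3, 0.8)
```

A higher probability shifts the distribution toward higher charge states. The list supplied via 'ionization_probabilities' for MALDI, in contrast, is given explicitly and has to sum to 1.0 on its own.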
lower_measurement_limit
Lower m/z detector limit
upper_measurement_limit
Upper m/z detector limit
enabled
Enable RAW signal simulation? (select 'false' if you only need feature-maps)
peak_shape
Peak shape used around each isotope peak. Be aware that the area under the curve is constant for both types, but the maximal height will differ (~ 2:3 = Lorentz:Gaussian) due to the wider base of the Lorentzian.
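The approximate 2:3 height ratio can be checked with a short calculation. The sketch assumes that both peak shapes share the same area and the same full width at half maximum (FWHM); the function names are illustrative only.

```python
import math


def gaussian_height(area, fwhm):
    """Apex height of a Gaussian with the given area and FWHM."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return area / (sigma * math.sqrt(2.0 * math.pi))


def lorentzian_height(area, fwhm):
    """Apex height of a Lorentzian with the given area and FWHM."""
    gamma = fwhm / 2.0  # half width at half maximum
    return area / (math.pi * gamma)


ratio = lorentzian_height(1.0, 0.01) / gaussian_height(1.0, 0.01)
```

Under these assumptions the ratio is independent of the chosen area and FWHM and comes out close to two thirds.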
value
Instrument resolution at 400 Th
type
How does resolution change with increasing m/z? QTOFs usually show 'constant' behavior, FTs have linear degradation, and on Orbitraps the resolution decreases with the square root of mass
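The three resolution behaviors can be sketched as simple scaling rules relative to the reference resolution at 400 Th. The exact formulas and the option names 'linear' and 'sqrt' are assumptions made for this illustration; only 'constant' is quoted in the description above.

```python
import math


def resolution_at(mz, resolution_400, model):
    """Estimated resolution at a given m/z.

    resolution_400 is the instrument resolution at 400 Th.
    The scaling rules below are assumed, not taken from the tool.
    """
    if model == "constant":
        return resolution_400
    if model == "linear":
        return resolution_400 * 400.0 / mz
    if model == "sqrt":
        return resolution_400 * math.sqrt(400.0 / mz)
    raise ValueError(f"unknown resolution model: {model!r}")


def fwhm_at(mz, resolution_400, model):
    """Peak width (FWHM, in Th) implied by R = m/z divided by FWHM."""
    return mz / resolution_at(mz, resolution_400, model)


for model in ("constant", "linear", "sqrt"):
    print(model, resolution_at(1600.0, 100000.0, model))
```

In this sketch all three models agree at 400 Th and diverge as m/z increases, with the linear model losing resolution fastest.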
scaling
Scale of baseline. Set to 0 to disable simulation of baseline
shape
The baseline is modeled by an exponential probability density function (pdf) with f(x) = shape*e^(- shape*x)
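The baseline formula f(x) = shape*e^(- shape*x) can be evaluated directly, as in the sketch below. Multiplying the density by 'scaling' and treating x as a non-negative offset along the m/z axis are assumptions for illustration; the description does not state how the two parameters are combined.

```python
import math


def baseline(x, shape, scaling=1.0):
    """Scaled exponential density at offset x (zero for negative x).

    Combining 'scaling' and 'shape' by simple multiplication is an
    assumption; a scaling of 0 yields no baseline at all.
    """
    if x < 0.0:
        return 0.0
    return scaling * shape * math.exp(-shape * x)


values = [baseline(x, 0.5, scaling=100.0) for x in (0.0, 1.0, 2.0, 4.0)]
```

Larger 'shape' values give a baseline that starts higher and decays faster.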
sampling_points
Number of raw data points per FWHM of the peak
file
Contaminants file with sum formula and absolute RT interval. See 'OpenMS/examples/simulation/contaminants.txt' for details
error_mean
Average systematic m/z error (in Da)
error_stddev
Standard deviation for m/z errors. Set to 0 to disable simulation of m/z errors
scale
Constant scale factor of the feature intensity. Set to 1.0 to get the real intensity values provided in the FASTA file
scale_stddev
Standard deviation of peak intensity (relative to the scaled peak height). Set to 0 to get simple rescaled intensities
rate
Poisson rate of shot noise per unit m/z (random peaks in m/z, where the number of peaks per unit m/z follows a Poisson distribution). Set this to 0 to disable simulation of shot noise
intensity-mean
Shot noise intensity mean (exponentially distributed with given mean)
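The shot noise description (a Poisson-distributed number of peaks per unit m/z, each with an exponentially distributed intensity) can be sketched with the standard library alone. The function names, the uniform placement of peaks within each unit interval, and the seeding are assumptions made for this example, not the simulator's implementation.

```python
import math
import random


def poisson_sample(rng, rate):
    """Draw a Poisson-distributed count (Knuth's multiplication method)."""
    if rate <= 0.0:
        return 0
    limit = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_shot_noise(mz_min, mz_max, rate, intensity_mean, seed=0):
    """Return (m/z, intensity) noise peaks within [mz_min, mz_max)."""
    rng = random.Random(seed)
    peaks = []
    for unit in range(math.floor(mz_min), math.ceil(mz_max)):
        for _ in range(poisson_sample(rng, rate)):
            mz = unit + rng.random()  # assumed: uniform within the unit
            intensity = rng.expovariate(1.0 / intensity_mean)
            if mz_min <= mz < mz_max:
                peaks.append((mz, intensity))
    return peaks


noise = simulate_shot_noise(400.0, 410.0, rate=2.0, intensity_mean=50.0, seed=1)
```

A rate of 0 produces no peaks at all, which corresponds to disabling shot noise.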
mean
Mean value of white noise (Gaussian) being added to each *measured* signal intensity
stddev
Standard deviation of white noise being added to each *measured* signal intensity
mean
Mean intensity value of the detector noise (Gaussian distribution)
stddev
Standard deviation of the detector noise (Gaussian distribution)
status
Create Tandem-MS scans?
tandem_mode
Algorithm to generate the tandem-MS spectra. 0 - fixed intensities, 1 - SVC prediction (abundant/missing), 2 - SVR prediction of peak intensity
svm_model_set_file
File containing the filenames of SVM Models for different charge variants
ms2_spectra_per_rt_bin
Number of allowed MS/MS spectra in a retention time bin.
min_mz_peak_distance
The minimal distance (in Th) between two peaks for concurrent selection for fragmentation. Also used to define the m/z width of an exclusion window (distance +/- from m/z of precursor). If you set this lower than the isotopic envelope of a peptide, you might get multiple fragment spectra pointing to the same precursor.
mz_isolation_window
All peaks within a mass window (in Th) of a selected peak are also selected for fragmentation.
exclude_overlapping_peaks
If true, overlapping or nearby peaks (within 'min_mz_peak_distance') are excluded for selection.
charge_filter
Charges considered for MS2 fragmentation.
use_dynamic_exclusion
If true, dynamic exclusion is applied.
exclusion_time
The time (in seconds) a feature is excluded.
max_list_size
The maximal number of precursors in the inclusion list.
min_rt
Minimal RT in seconds.
max_rt
Maximal RT in seconds.
rt_step_size
RT step size in seconds.
rt_window_size
RT window size in seconds.
min_protein_id_probability
Minimal protein probability for a protein to be considered identified.
min_pt_weight
Minimal pt weight of a precursor
min_mz
Minimal mz to be considered in protein based LP formulation.
max_mz
Maximal mz to be considered in protein based LP formulation.
use_peptide_rule
Use peptide rule instead of minimal protein id probability
min_peptide_ids
If use_peptide_rule is true, this parameter sets the minimal number of peptide ids for a protein id
min_peptide_probability
If use_peptide_rule is true, this parameter sets the minimal probability for a peptide to be safely identified
add_single_spectra
If true, the MS2 spectra for each peptide signal are included in the output (might be a lot). They will have a meta value 'MSE_DebugSpectrum' attached, so they can be filtered out. Native MS_E spectra will have 'MSE_Spectrum' instead.
isotope_model
Model to use for isotopic peaks ('none' means no isotopic peaks are added, 'coarse' adds isotopic peaks in unit mass distance, 'fine' uses the hyperfine isotopic generator to add accurate isotopic peaks). Note that adding isotopic peaks is very slow.
max_isotope
Defines the maximal isotopic peak which is added if 'isotope_model' is 'coarse'
max_isotope_probability
Defines the maximal isotopic probability to cover if 'isotope_model' is 'fine'
add_metainfo
Adds the type of peaks as metainfo to the peaks, like y8+, [M-H2O+2H]++
add_losses
Adds common losses to those ions expected to have them; only water and ammonia losses are considered
sort_by_position
Sort output by position
add_precursor_peaks
Adds peaks of the unfragmented precursor ion to the spectrum
add_all_precursor_charges
Adds precursor peaks with all charges in the given range
add_abundant_immonium_ions
Add most abundant immonium ions
add_first_prefix_ion
If set to true, e.g. b1 ions are added
add_y_ions
Add peaks of y-ions to the spectrum
add_b_ions
Add peaks of b-ions to the spectrum
add_a_ions
Add peaks of a-ions to the spectrum
add_c_ions
Add peaks of c-ions to the spectrum
add_x_ions
Add peaks of x-ions to the spectrum
add_z_ions
Add peaks of z-ions to the spectrum
y_intensity
Intensity of the y-ions
b_intensity
Intensity of the b-ions
a_intensity
Intensity of the a-ions
c_intensity
Intensity of the c-ions
x_intensity
Intensity of the x-ions
z_intensity
Intensity of the z-ions
relative_loss_intensity
Intensity of loss ions, in relation to the intact ion intensity
precursor_intensity
Intensity of the precursor peak
precursor_H2O_intensity
Intensity of the H2O loss peak of the precursor
precursor_NH3_intensity
Intensity of the NH3 loss peak of the precursor
add_isotopes
If set to 1, isotope peaks of the product ion peaks are added
max_isotope
Defines the maximal isotopic peak which is added, add_isotopes must be set to 1
add_metainfo
Adds the type of peaks as metainfo to the peaks, like y8+, [M-H2O+2H]++
add_first_prefix_ion
If set to true, e.g. b1 ions are added
hide_y_ions
Hide peaks of y-ions in the spectrum
hide_y2_ions
Hide peaks of y2-ions in the spectrum
hide_b_ions
Hide peaks of b-ions in the spectrum
hide_b2_ions
Hide peaks of b2-ions in the spectrum
hide_a_ions
Hide peaks of a-ions in the spectrum
hide_c_ions
Hide peaks of c-ions in the spectrum
hide_x_ions
Hide peaks of x-ions in the spectrum
hide_z_ions
Hide peaks of z-ions in the spectrum
hide_losses
Hides common losses from those ions expected to have them; only water and ammonia losses are considered
y_intensity
Intensity of the y-ions
b_intensity
Intensity of the b-ions
a_intensity
Intensity of the a-ions
c_intensity
Intensity of the c-ions
x_intensity
Intensity of the x-ions
z_intensity
Intensity of the z-ions
relative_loss_intensity
Intensity of loss ions, in relation to the intact ion intensity
ionization_type
Type of Ionization (MALDI or ESI)
type
Select the labeling type you want for your experiment
ICPL_fixed_rtshift
Fixed retention time shift between labeled pairs. If set to 0.0 only the retention times, computed by the RT model step are used.
label_proteins
Enables protein-labeling. (select 'false' if you only need peptide-labeling)
ICPL_light_channel_label
UniMod Id of the light channel ICPL label.
ICPL_medium_channel_label
UniMod Id of the medium channel ICPL label.
ICPL_heavy_channel_label
UniMod Id of the heavy channel ICPL label.
fixed_rtshift
Fixed retention time shift between labeled peptides. If set to 0.0 only the retention times computed by the RT model step are used.
modification_lysine
Modification of Lysine in the medium SILAC channel
modification_arginine
Modification of Arginine in the medium SILAC channel
modification_lysine
Modification of Lysine in the heavy SILAC channel. If left empty, two-channel SILAC is assumed.
modification_arginine
Modification of Arginine in the heavy SILAC channel. If left empty, two-channel SILAC is assumed.
iTRAQ
4plex or 8plex iTRAQ?
reporter_mass_shift
Allowed shift (uniformly distributed - left to right) in Da from the expected position (of e.g. 114.1, 115.1)
channel_active_4plex
Four-plex only: Each channel that was used in the experiment and its description (114-117) in format <channel>:<name>, e.g. "114:myref","115:liver".
channel_active_8plex
Eight-plex only: Each channel that was used in the experiment and its description (113-121) in format <channel>:<name>, e.g. "113:myref","115:liver","118:lung".
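The '<channel>:<name>' format used by 'channel_active_4plex' and 'channel_active_8plex' can be parsed as sketched below. The helper and the constant are hypothetical and only mirror the format given in the descriptions; the caller supplies the set of channels it considers valid.

```python
FOURPLEX_CHANNELS = (114, 115, 116, 117)


def parse_active_channels(entries, allowed_channels):
    """Map channel number to description for entries like '114:myref'.

    Hypothetical helper mirroring the documented entry format.
    """
    active = {}
    for entry in entries:
        channel_text, sep, name = entry.partition(":")
        if not sep or not name:
            raise ValueError(f"expected '<channel>:<name>', got {entry!r}")
        channel = int(channel_text)
        if channel not in allowed_channels:
            raise ValueError(f"channel {channel} is not available")
        active[channel] = name
    return active


channels = parse_active_channels(["114:myref", "115:liver"], FOURPLEX_CHANNELS)
```

Channels that do not appear in the list are treated as unused in this sketch.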
isotope_correction_values_4plex
Override default values (see Documentation); use the following format: <channel>:<-2Da>/<-1Da>/<+1Da>/<+2Da> ; e.g. '114:0/0.3/4/0' , '116:0.1/0.3/3/0.2'
isotope_correction_values_8plex
Override default values (see Documentation); use the following format: <channel>:<-2Da>/<-1Da>/<+1Da>/<+2Da> ; e.g. '113:0/0.3/4/0' , '116:0.1/0.3/3/0.2'
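An entry in the '<channel>:<-2Da>/<-1Da>/<+1Da>/<+2Da>' format can be split into its four correction values as follows. The helper name and the dictionary keys are illustrative assumptions; the sketch only reflects the format stated above and says nothing about how the values are applied.

```python
ISOTOPE_OFFSETS = ("-2Da", "-1Da", "+1Da", "+2Da")


def parse_isotope_correction(entry):
    """Parse '114:0/0.3/4/0' into (114, {'-2Da': 0.0, ...}).

    Hypothetical helper mirroring the documented entry format.
    """
    channel_text, sep, values_text = entry.partition(":")
    if not sep:
        raise ValueError(f"expected '<channel>:<values>', got {entry!r}")
    values = [float(value) for value in values_text.split("/")]
    if len(values) != len(ISOTOPE_OFFSETS):
        raise ValueError("expected exactly four correction values")
    return int(channel_text), dict(zip(ISOTOPE_OFFSETS, values))


channel, correction = parse_isotope_correction("114:0/0.3/4/0")
```

Rejecting entries that do not carry exactly four values catches the most likely formatting mistake early.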
Y_contamination
Efficiency of labeling tyrosine ('Y') residues. 0=off, 1=full labeling
labeling_efficiency
Describes the distribution of the labeled peptide over the different states (unlabeled, mono- and di-labeled)
biological
Controls the 'biological' randomness of the generated data (e.g. systematic effects like deviations in RT). If set to 'random' each experiment will look different. If set to 'reproducible' each experiment will have the same outcome (given that the input data is the same)
technical
Controls the 'technical' randomness of the generated data (e.g. noise in the raw signal). If set to 'random' each experiment will look different. If set to 'reproducible' each experiment will have the same outcome (given that the input data is the same)

Input Ports

Input protein sequences [FASTA]

Output Ports

output: simulated MS raw (profile) data [mzML]
output: ground-truth picked (centroided) MS data [mzML]
output: ground-truth features [featureXML]
output: ground-truth features, grouping ESI charge variants of each parent peptide [consensusXML]
output: ground-truth features, grouping labeled variants [consensusXML]
output: ground-truth features caused by contaminants [featureXML]
output: ground-truth MS2 peptide identifications [idXML]

Views

MSSimulator Std Output
The text sent to standard out during the execution of MSSimulator.
MSSimulator Error Output
The text sent to standard error during the execution of MSSimulator. (If it appears in gray, it's the output of a previously failing run, which is preserved for your troubleshooting.)

Installation

To use this node in KNIME, install OpenMS from the following update site:

KNIME 4.2