MasonFragSequencing

Given a FASTA file with fragments, simulate sequencing thereof.

This program is a more lightweight version of mason_sequencing without support for the application of VCF and fragment sampling. Output of SAM is also not available. However, it uses the same code for the simulation of the reads as the more powerful mason_simulator.

You can use mason_frag_sequencing if you want to implement your own fragmentation behaviour, e.g. if you have implemented your own bias models.
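As an illustration of custom fragmentation, the following minimal sketch samples fragments from a reference sequence and formats them as FASTA, the input format this program expects. The function names, the normal length model, the uniform start positions and the record naming are all assumptions made for this example; they are not part of Mason.

```python
import random


def sample_fragments(reference, num_fragments, mean_len=300, stddev_len=30, seed=42):
    """Sample fragments with normally distributed lengths and uniform start positions.

    This is a hypothetical, unbiased model; a custom bias model would replace
    the uniform choice of `start` below.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    rng = random.Random(seed)
    fragments = []
    for _ in range(num_fragments):
        # Clamp the sampled length to [1, len(reference)] so it always fits.
        length = int(round(rng.gauss(mean_len, stddev_len)))
        length = max(1, min(len(reference), length))
        start = rng.randint(0, len(reference) - length)
        fragments.append((start, reference[start:start + length]))
    return fragments


def fragments_to_fasta(fragments):
    """Format (start, sequence) pairs as FASTA text, one record per fragment."""
    lines = []
    for index, (start, sequence) in enumerate(fragments):
        lines.append(f">fragment_{index} start={start}")
        lines.append(sequence)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy reference for demonstration only.
    toy_rng = random.Random(1)
    toy_reference = "".join(toy_rng.choice("ACGT") for _ in range(2000))
    print(fragments_to_fasta(sample_fragments(toy_reference, 3)), end="")
```

The resulting text can be saved to a FASTA file and used as the fragment input; the sequencing simulation itself is then left to this program.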

Web Documentation for MasonFragSequencing

Options

version-check
Turn this option off to disable version update notifications of the application.
quiet
Low verbosity.
verbose
Higher verbosity.
very-verbose
Highest verbosity.
seed
Seed to use for random number generator.
force-single-end
Force single-end simulation although --out-right is given.
seq-technology
Set sequencing technology to simulate.
seq-mate-orientation
Orientation for paired reads. See section Read Orientation below.
seq-strands
Strands to simulate from, only applicable to paired sequencing simulation.
embed-read-info
Whether or not to embed read information.
read-name-prefix
Read names will have this prefix.
enable-bs-seq
Enable BS-seq simulation.
bs-seq-protocol
Protocol to use for BS-Seq simulation.
bs-seq-conversion-rate
Conversion rate for unmethylated Cs to become Ts.
illumina-read-length
Read length for Illumina simulation.
illumina-prob-insert
Per-base probability for insertion in Illumina sequencing.
illumina-prob-deletion
Per-base probability for deletion in Illumina sequencing.
illumina-prob-mismatch-scale
Scaling factor for Illumina mismatch probability.
illumina-prob-mismatch
Average per-base mismatch probability in Illumina sequencing.
illumina-prob-mismatch-begin
Per-base mismatch probability of first base in Illumina sequencing.
illumina-prob-mismatch-end
Per-base mismatch probability of last base in Illumina sequencing.
illumina-position-raise
Point where the error curve rises, relative to the read length.
illumina-quality-mean-begin
Mean PHRED quality for non-mismatch bases of first base in Illumina sequencing.
illumina-quality-mean-end
Mean PHRED quality for non-mismatch bases of last base in Illumina sequencing.
illumina-quality-stddev-begin
Standard deviation of PHRED quality for non-mismatch bases of first base in Illumina sequencing.
illumina-quality-stddev-end
Standard deviation of PHRED quality for non-mismatch bases of last base in Illumina sequencing.
illumina-mismatch-quality-mean-begin
Mean PHRED quality for mismatch bases of first base in Illumina sequencing.
illumina-mismatch-quality-mean-end
Mean PHRED quality for mismatch bases of last base in Illumina sequencing.
illumina-mismatch-quality-stddev-begin
Standard deviation of PHRED quality for mismatch bases of first base in Illumina sequencing.
illumina-mismatch-quality-stddev-end
Standard deviation of PHRED quality for mismatch bases of last base in Illumina sequencing.
sanger-read-length-model
The model to use for sampling the Sanger read length.
sanger-read-length-min
The minimal read length when the read length is sampled uniformly.
sanger-read-length-max
The maximal read length when the read length is sampled uniformly.
sanger-read-length-mean
The mean read length when the read length is sampled with normal distribution.
sanger-read-length-error
The read length standard deviation when the read length is sampled with normal distribution.
sanger-prob-mismatch-scale
Scaling factor for Sanger mismatch probability.
sanger-prob-mismatch-begin
Per-base mismatch probability of first base in Sanger sequencing.
sanger-prob-mismatch-end
Per-base mismatch probability of last base in Sanger sequencing.
sanger-prob-insertion-begin
Per-base insertion probability of first base in Sanger sequencing.
sanger-prob-insertion-end
Per-base insertion probability of last base in Sanger sequencing.
sanger-prob-deletion-begin
Per-base deletion probability of first base in Sanger sequencing.
sanger-prob-deletion-end
Per-base deletion probability of last base in Sanger sequencing.
sanger-quality-match-start-mean
Mean PHRED quality for non-mismatch bases of first base in Sanger sequencing.
sanger-quality-match-end-mean
Mean PHRED quality for non-mismatch bases of last base in Sanger sequencing.
sanger-quality-match-start-stddev
Standard deviation of PHRED quality for non-mismatch bases of first base in Sanger sequencing.
sanger-quality-match-end-stddev
Standard deviation of PHRED quality for non-mismatch bases of last base in Sanger sequencing.
sanger-quality-error-start-mean
Mean PHRED quality for erroneous bases of first base in Sanger sequencing.
sanger-quality-error-end-mean
Mean PHRED quality for erroneous bases of last base in Sanger sequencing.
sanger-quality-error-start-stddev
Standard deviation of PHRED quality for erroneous bases of first base in Sanger sequencing.
sanger-quality-error-end-stddev
Standard deviation of PHRED quality for erroneous bases of last base in Sanger sequencing.
454-read-length-model
The model to use for sampling the 454 read length.
454-read-length-min
The minimal read length when the read length is sampled uniformly.
454-read-length-max
The maximal read length when the read length is sampled uniformly.
454-read-length-mean
The mean read length when the read length is sampled with normal distribution.
454-read-length-stddev
The read length standard deviation when the read length is sampled with normal distribution.
454-no-sqrt-in-std-dev
For the error model: if set, (sigma = k * r) is used; otherwise (sigma = k * sqrt(r)).
454-proportionality-factor
Proportionality factor for calculating the standard deviation proportional to the read length.
454-background-noise-mean
Mean of lognormal distribution to use for the noise.
454-background-noise-stddev
Standard deviation of lognormal distribution to use for the noise.
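The two standard-deviation formulas given for 454-no-sqrt-in-std-dev can be sketched as follows. This is only an illustration of the formulas as written in the option text: the function name is hypothetical, k is assumed to be the value of 454-proportionality-factor, and r is assumed to be the length that the standard deviation scales with.

```python
import math


def sigma_454(k, r, no_sqrt_in_std_dev=False):
    """Return sigma = k * r if the flag is set, otherwise sigma = k * sqrt(r).

    k: proportionality factor (assumed: 454-proportionality-factor)
    r: length the standard deviation scales with (assumption for this sketch)
    """
    if r < 0:
        raise ValueError("r must be non-negative")
    if no_sqrt_in_std_dev:
        return k * r
    return k * math.sqrt(r)


if __name__ == "__main__":
    for r in (1, 4, 9):
        print(r, sigma_454(0.15, r), sigma_454(0.15, r, no_sqrt_in_std_dev=True))
```

With the flag set, the standard deviation grows linearly in r; without it, the growth is slower, following the square root of r.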

Input Ports

Path to input file. [fq,fq.bgzf,fq.gz,fastq,fastq.bgzf,fastq.gz,fa,fa.bgzf,fa.gz,fasta,fasta.bgzf,fasta.gz,faa,faa.bgzf,faa.gz,ffn,ffn.bgzf,ffn.gz,fna,fna.bgzf,fna.gz,frn,frn.bgzf,frn.gz,embl,embl.bgzf,embl.gz,gbk,gbk.bgzf,gbk.gz,raw,raw.bgzf,raw.gz,sam,sam.bgzf,sam.gz,bam]
Path to file with Illumina error profile. The file must be a text file with floating point numbers separated by space, each giving a positional error rate. [txt,opt.]
FASTQ file to use as a template for left-end reads. [fq,fq.bgzf,fq.gz,fastq,fastq.bgzf,fastq.gz,fa,fa.bgzf,fa.gz,fasta,fasta.bgzf,fasta.gz,faa,faa.bgzf,faa.gz,ffn,ffn.bgzf,ffn.gz,fna,fna.bgzf,fna.gz,frn,frn.bgzf,frn.gz,embl,embl.bgzf,embl.gz,gbk,gbk.bgzf,gbk.gz,raw,raw.bgzf,raw.gz,sam,sam.bgzf,sam.gz,bam,opt.]
FASTQ file to use as a template for right-end reads. [fq,fq.bgzf,fq.gz,fastq,fastq.bgzf,fastq.gz,fa,fa.bgzf,fa.gz,fasta,fasta.bgzf,fasta.gz,faa,faa.bgzf,faa.gz,ffn,ffn.bgzf,ffn.gz,fna,fna.bgzf,fna.gz,frn,frn.bgzf,frn.gz,embl,embl.bgzf,embl.gz,gbk,gbk.bgzf,gbk.gz,raw,raw.bgzf,raw.gz,sam,sam.bgzf,sam.gz,bam,opt.]
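The Illumina error profile port expects a text file of space-separated floating point numbers, each a positional error rate. A minimal sketch for producing and reading back such a file's content is shown below. The helper names and the linear ramp are illustrative assumptions, as is the choice of one value per read position; only the space-separated layout comes from the description above.

```python
def linear_profile(read_length, first_rate, last_rate):
    """Build one error rate per position, rising linearly (illustrative choice)."""
    if read_length < 1:
        raise ValueError("read_length must be at least 1")
    if read_length == 1:
        return [first_rate]
    step = (last_rate - first_rate) / (read_length - 1)
    return [first_rate + step * position for position in range(read_length)]


def format_error_profile(rates):
    """Format rates as a single line of space-separated floating point numbers."""
    for rate in rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"error rate out of range [0, 1]: {rate}")
    return " ".join(f"{rate:.6g}" for rate in rates) + "\n"


def parse_error_profile(text):
    """Parse whitespace-separated floating point numbers back into a list."""
    return [float(token) for token in text.split()]


if __name__ == "__main__":
    profile_text = format_error_profile(linear_profile(10, 0.002, 0.012))
    print(profile_text, end="")
    print(parse_error_profile(profile_text))
```

Writing the formatted text to a .txt file gives a candidate input for this port; whether the number of values must match the simulated read length is not stated here and should be checked against the tool's own documentation.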

Output Ports

Output of single-end/left-end reads. [fq,fq.bgzf,fq.gz,fastq,fastq.bgzf,fastq.gz,fa,fa.bgzf,fa.gz,fasta,fasta.bgzf,fasta.gz,faa,faa.bgzf,faa.gz,ffn,ffn.bgzf,ffn.gz,fna,fna.bgzf,fna.gz,frn,frn.bgzf,frn.gz,raw,raw.bgzf,raw.gz,sam,sam.bgzf,sam.gz,bam]
Output of right-end reads. Giving this option enables paired-end simulation. [fq,fq.bgzf,fq.gz,fastq,fastq.bgzf,fastq.gz,fa,fa.bgzf,fa.gz,fasta,fasta.bgzf,fasta.gz,faa,faa.bgzf,faa.gz,ffn,ffn.bgzf,ffn.gz,fna,fna.bgzf,fna.gz,frn,frn.bgzf,frn.gz,raw,raw.bgzf,raw.gz,sam,sam.bgzf,sam.gz,bam]

Views

MasonFragSequencing Std Output
The text sent to standard out during the execution of MasonFragSequencing.
MasonFragSequencing Error Output
The text sent to standard error during the execution of MasonFragSequencing. (If it appears in gray, it is the output of a previously failing run, which is preserved for your troubleshooting.)
