RabemaEvaluate

Compare the SAM/BAM output (MAPPING.sam or MAPPING.bam) of any read mapper against the RABEMA gold standard previously built with rabema_build_gold_standard. The inputs are a reference FASTA file, a gold standard interval (GSI) file, and the SAM/BAM file to evaluate.

The input SAM/BAM file must be sorted by queryname. The program will create a FASTA index file REF.fa.fai for fast random access to the reference.
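
Sorting by queryname implies that all alignments of one read sit on consecutive lines. As a rough pre-check, the following Python sketch scans SAM text and reports the first read name that reappears after a different read. It is not part of RABEMA: the function name first_ungrouped_read is hypothetical, and the sketch only tests grouping, which is a necessary condition for name sorting and not necessarily the check the tool itself performs.

```python
def first_ungrouped_read(sam_lines):
    """Return the first read name that reappears after another read, else None.

    Assumption: input is uncompressed SAM text, one record per line.
    This tests grouping by name only, not the full sort order.
    """
    seen = set()
    previous = None
    for line in sam_lines:
        # Skip blank lines and header lines (SAM headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        name = line.split("\t", 1)[0]
        if name != previous:
            if name in seen:
                return name  # this read was already closed by a different read
            seen.add(name)
            previous = name
    return None
```

A result of None means every read name forms one contiguous block. Any other result names a read whose alignments are split, so the file would need to be re-sorted by name before evaluation.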

Web Documentation for RabemaEvaluate

Options

version-check
Turn this option off to disable version update notifications of the application.
verbose
Enable verbose output.
very-verbose
Enable even more verbose output.
dont-check-sorting
Do not check sortedness (by name) of input SAM/BAM files. This is required if the reads are not sorted by name in the original FASTQ files. Files from the SRA and ENA generally are sorted.
oracle-mode
Enable oracle mode. This is used for simulated data, when the input GSI file gives exactly one position that is considered the true sample position.
only-unique-reads
Consider only reads that have a single alignment in the mapping result file. Useful for precision computation.
match-N
When set, N matches all characters without penalty.
distance-metric
Set distance metric. Valid values: hamming, edit. Default: edit.
max-error
Maximal error rate to build the gold standard for, in percent. This parameter is an integer and is relative to the read length. The error rate is ignored in oracle mode; there, the distance of the read at the sample position is used, individually for each read. Default: 0
benchmark-category
Set benchmark category. One of {all, all-best, any-best}. Default: all
trust-NM
When set, we trust the alignment and distance from the SAM/BAM file and no realignment is performed. Off by default.
extra-pos-tag
If the CIGAR string is absent, the missing alignment end position can be provided by this BAM tag.
ignore-paired-flags
When set, we ignore all SAM/BAM flags related to pairing. This is necessary when analyzing SAM from SOAP's soap2sam.pl script.
DONT-PANIC
Do not stop program execution if an additional hit was found that indicates that the gold standard is incorrect.
show-missed-intervals
Show details for each missed interval from the GSI.
show-invalid-hits
Show details for invalid hits (with too high error rate).
show-additional-hits
Show details for additional hits (low enough error rate but not in the gold standard).
show-hits
Show details for hit intervals.
show-try-hit
Show details for each alignment in SAM/BAM input.
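
To make the distance-metric, match-N and max-error options more concrete, here is a minimal Python sketch of the two metrics and of turning a percentage into an error budget for one read. All function names are hypothetical and none of this is RABEMA's code. Three assumptions are built in: the percentage is rounded down, the edit distance is computed globally between two whole strings (a mapper would align the read against a reference window), and with match_n off characters are compared literally, which may differ from how the tool treats N.

```python
def hamming_distance(read, ref, match_n=False):
    """Count mismatching positions between two equal-length strings."""
    if len(read) != len(ref):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(
        1
        for a, b in zip(read, ref)
        if a != b and not (match_n and "N" in (a, b))
    )


def edit_distance(read, ref, match_n=False):
    """Global Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(ref) + 1))
    for i, a in enumerate(read, 1):
        cur = [i]
        for j, b in enumerate(ref, 1):
            same = a == b or (match_n and "N" in (a, b))
            cur.append(min(
                prev[j] + 1,                      # delete a from the read
                cur[j - 1] + 1,                   # insert b into the read
                prev[j - 1] + (0 if same else 1)  # match or substitute
            ))
        prev = cur
    return prev[-1]


def max_errors(read_length, max_error_percent):
    """Error budget for one read; rounding down is an assumption."""
    return (read_length * max_error_percent) // 100
```

For example, under these assumptions a 36 bp read evaluated with a max-error of 8 would be allowed 2 errors, and an alignment counts as within budget when its distance under the chosen metric does not exceed that number.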

Input Ports

Path to load reference FASTA from. [fq,fq.bgzf,fq.gz,fastq,fastq.bgzf,fastq.gz,fa,fa.bgzf,fa.gz,fasta,fasta.bgzf,fasta.gz,faa,faa.bgzf,faa.gz,ffn,ffn.bgzf,ffn.gz,fna,fna.bgzf,fna.gz,frn,frn.bgzf,frn.gz,embl,embl.bgzf,embl.gz,gbk,gbk.bgzf,gbk.gz,raw,raw.bgzf,raw.gz,sam,sam.bgzf,sam.gz,bam]
Path to load gold standard intervals from. If compressed using gzip, the file will be decompressed on the fly. [gsi,gsi.gz]
Path to load the read mapper SAM or BAM output from. [bam,sam,sam.bgzf,sam.gz]

Output Ports

Path to write the statistics to as TSV. [rabema_report_tsv]
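
Because the statistics are written as TSV, they can be loaded with standard tooling. The sketch below reads such a file into a list of rows using Python's csv module. The function name read_report_rows is hypothetical, and skipping lines that start with "#" is an assumption about the report layout, not something this page states. No column names are assumed.

```python
import csv


def read_report_rows(path, comment_prefix="#"):
    """Return the non-comment rows of a tab-separated file as lists of strings.

    Assumption: lines starting with comment_prefix are comments.
    """
    rows = []
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank lines and assumed comment lines.
            if not row or row[0].startswith(comment_prefix):
                continue
            rows.append(row)
    return rows
```

If the report turns out to have a header row, the first returned row can be used as column names for the rest.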

Views

RabemaEvaluate Std Output
The text sent to standard out during the execution of RabemaEvaluate.
RabemaEvaluate Error Output
The text sent to standard error during the execution of RabemaEvaluate. (If it appears in gray, it is the output of a previously failing run, which is preserved for troubleshooting.)
