BWA

BWA is a fast light-weighted tool that aligns relatively short sequences (queries) to a sequence database (target), such as the human reference genome. It implements three different algorithms, all based on Burrows-Wheeler Transform (BWT). The first algorithm BWA-backtrack is designed for short queries up to ~200bp with low error rate (<3%). It does gapped global alignment w.r.t. queries, supports paired-end reads, and is one of the fastest short read alignment algorithms to date while also visiting suboptimal hits. The second algorithm BWA-SW is designed for long reads (ranging from 70bp to 1Mbp) with more errors. It performs heuristic Smith-Waterman-like alignment to find high-scoring local hits (and thus chimera). On low-error short queries, BWA-SW is slower and less accurate than the first algorithm, but on long queries, it is better. The latest algorithm BWA-MEM is similar to BWA-SW. It is generally recommended for high-quality queries as it is faster and more accurate than BWA-SW. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.
Source: http://bio-bwa.sourceforge.net/bwa.shtml

Options

Index reference sequence
Uncheck this option if the reference sequence is already indexed (using bwa). In this case the reference sequence will not be indexed again.
Algorithm for constructing BWT index
  • BWT-SW: Algorithm implemented in BWT-SW. Algorithm implemented in BWT-SW. This method works with the whole human genome.
  • IS: IS linear-time algorithm for constructing suffix array. It requires 5.37N memory where N is the size of the database. IS is moderately fast, but does not work with database larger than 2GB. IS is the default algorithm due to its simplicity. The current codes for IS algorithm are reimplemented by Yuta Mori.
Algorithm for mapping
Define which algorithm should be used for mapping.
  • BWA_MEM: This algorithm is designed for 70bp - 100bp sequence reads. It is generally recommended for high-quality queries as it is faster and more accurate than the other two algorithms. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.
  • BWA-backtrack: This algorithm is designed for longer sequences ranged from 70bp to 1 Mbp.
  • BWA-SW: Analogous to BWA-backtrack BWA-SW is also designed for longer sequences ranged from 70bp to 1 Mbp.
Build color-space index
Build a color-space index. The input fasta should be in nucleotide space.
Specify read group header
GATK requires a read group identifier (ID) and sample information (SM), e.g. specified by ’@RG\tID:foo\tSM:bar’. All reads within a read group are assumed to come from the same instrument run and to therefore share the same error model. SM defines the name of the sample sequenced in a read group. Read group header will not be added if BWA-SW is used.
Optional Parameters
For each computing step, additional parameters can be specified. For a list of all parameters, see online BWA user manual.
Number of threads
Set the number of threads to be used.
BWA aln (sub-command of BWA-backtrack), BWA-SW and BWA-MEM can be run in multi-threading mode.
Increasing the number of threads speeds up the node, but also increases the memory required for the calculations.

Preference page

HTE
Set threshold for repeated execution. Only used if HTE is enabled in the preference page.
Path to bwa
Set the path to BWA executable. This will be done automatically if the path is already defined in the preference page.
Path to the reference sequence
Choose a reference sequence, such as a genome sequence, to map the reads (file type: FastA). This will be done automatically if the path is already defined in the preference page.

Input Ports

Icon
Cell 0: Path to ReadFile1
Cell 1: (Optional) Path to ReadFile2

Output Ports

Icon
Cell 0: Path to SAM file

Views

STDOUT / STDERR
The node offers a direct view of its standard out and the standard error of the tool.

Workflows

Links

Developers

You want to see the source code for this node? Click the following button and we’ll use our super-powers to find it for you.