
02_Read_Mapping_and_Variant_Annotation

Activity I: Read Mapping
1. Configure the Input File node labeled (Reference gene) by selecting the FASTA file human_hbb_gene.fasta located under the provided sequencing data folder.
2. Similarly, configure the second Input File node labeled (Sequencing reads) by selecting the FASTQ file SRR10218764_40K.fastq.
3. Use the provided YaraMapper node to perform read mapping. (Hint: connect the outputs of Indexed Genome and Sequencing Reads to the YaraMapper node, in that order.)
4. Configure the Copy/Move Files node at the end by selecting a folder on your computer under the option Use source name and output directory.

Data source
- Reference gene: https://www.ncbi.nlm.nih.gov/gene?Db=gene&Cmd=DetailsSearch&Term=3043
- Sequencing reads: https://www.ncbi.nlm.nih.gov/sra/SRR10218764

Activity II: Explore called variants and annotate them
1. Adjust the settings of the CSV Reader node so that the VCF columns are loaded correctly into the table. Specifically, set Column Delimiter = \t and Comment Char = ##, and make sure the Has Row Header checkbox is unchecked.
2. Modify the View & Annotate Variants component by adding and connecting Color Manager, Table Editor and Parallel Coordinates Plot nodes. (Hint: use Ctrl + double-click to open and edit the component, and use the workflow annotations inside the component as guidelines.)
3. Inspect the results using the composite view of the component. Select the variants with higher quality scores and annotate them by editing the table.
4. Use the Row Filter node to filter the selected variants and write the output to an Excel file using the Excel Writer (XLS) node.

Workflow annotations: Input Sequences; Reference Sequences - Indexing; Persist files by moving them from temporary locations to a directory of choice.
Node labels: Indexed Genome; Reference gene (Configure Me!); Sequencing reads (Configure Me!); Load VCF file as a table (Configure Me!).
Nodes used: Port to URI, YaraIndexer, YaraMapper, Copy/Move Files, View & Annotate Variants, Input File (x2), CSV Reader.
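The CSV Reader settings from Activity II, step 1 (tab delimiter, lines starting with ## treated as comments) together with the quality-based selection from steps 3 and 4 can be sketched outside KNIME in plain Python. This is only an illustration under stated assumptions, not what the nodes do internally: the sample VCF text is made up, the helper names read_vcf_table and filter_by_quality are hypothetical, and the quality threshold of 30 is an arbitrary choice.

```python
import csv

# Made-up VCF content for illustration only (not the workflow's real data).
SAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "##source=example\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "HBB\t120\t.\tA\tG\t48.5\tPASS\tDP=30\n"
    "HBB\t431\t.\tC\tT\t12.0\tPASS\tDP=8\n"
    "HBB\t977\t.\tG\tA\t99.0\tPASS\tDP=55\n"
)


def read_vcf_table(text, comment_prefix="##", delimiter="\t"):
    """Return (header, rows) from VCF text.

    Lines starting with comment_prefix are skipped, mirroring Comment Char = ##.
    The remaining first line (#CHROM ...) is used as the column header.
    """
    data_lines = [
        line for line in text.splitlines()
        if line and not line.startswith(comment_prefix)
    ]
    # QUOTE_NONE: VCF fields are plain tab-separated text, so quotes are literal.
    reader = csv.reader(data_lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    header = [name.lstrip("#") for name in next(reader)]
    rows = [dict(zip(header, fields)) for fields in reader]
    return header, rows


def filter_by_quality(rows, min_qual):
    """Keep variants whose QUAL value is present and at least min_qual."""
    return [
        row for row in rows
        if row["QUAL"] != "." and float(row["QUAL"]) >= min_qual
    ]


header, rows = read_vcf_table(SAMPLE_VCF)
high_quality = filter_by_quality(rows, min_qual=30.0)  # threshold is an assumption

print(header)
for row in high_quality:
    print(row["CHROM"], row["POS"], row["REF"], row["ALT"], row["QUAL"])
```

Using ## rather than # as the comment prefix is what keeps the #CHROM line available as a column header while still dropping the VCF meta-information lines.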
