Variant_Prioritization

Reproducible Variant Prioritization

Finding disease-causal variants among the large number of variants present in a sample remains a major challenge in the analysis of next-generation sequencing data ("Needles in stacks of needles", Cooper 2011).
One of the most frequently used formats for storing variant information is the Variant Call Format (VCF). Because extracting information from the complex genetic variation data encoded in VCF files is not straightforward, several command-line tools exist for filtering and querying VCF files, with the ultimate goal of detecting disease-causal variants.
This workflow illustrates how to mine VCF files within KNIME Analytics Platform to find variants associated with a specific disease.
We use three common tools, BCFtools, VCFtools and VEP (via the Ensembl REST API), to filter and annotate the variants. The domain expert can interactively select variants of interest and filter them by allele frequency in the 1000 Genomes Project and gnomAD, or by the predicted deleteriousness of a variant (SIFT score).
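The filtering logic described above can be illustrated outside KNIME with a minimal Python sketch. The field names (`af_1000g`, `af_gnomad`, `sift_score`) and the default cutoffs are assumptions made for illustration; the workflow's actual column names and thresholds are not given here. The 0.05 SIFT cutoff is a commonly used convention, and combining the frequency and SIFT criteria with a logical AND is a choice of this sketch, whereas the workflow lets the user apply the filters interactively.

```python
from typing import Optional


def is_rare(af: Optional[float], max_af: float = 0.01) -> bool:
    """Treat a variant as rare if it is absent from the database or below max_af."""
    return af is None or af <= max_af


def is_predicted_deleterious(sift: Optional[float], cutoff: float = 0.05) -> bool:
    """SIFT scores range from 0 to 1; low scores predict a damaging substitution."""
    return sift is not None and sift <= cutoff


def prioritize(variants, max_af: float = 0.01, sift_cutoff: float = 0.05):
    """Keep variants that are rare in both populations and predicted deleterious."""
    return [
        v for v in variants
        if is_rare(v.get("af_1000g"), max_af)
        and is_rare(v.get("af_gnomad"), max_af)
        and is_predicted_deleterious(v.get("sift_score"), sift_cutoff)
    ]


# Made-up example records (hypothetical values, not real annotations).
variants = [
    {"id": "var1", "af_1000g": 0.0004, "af_gnomad": 0.0002, "sift_score": 0.01},
    {"id": "var2", "af_1000g": 0.25, "af_gnomad": 0.22, "sift_score": 0.02},
    {"id": "var3", "af_1000g": None, "af_gnomad": None, "sift_score": 0.60},
]

print([v["id"] for v in prioritize(variants)])  # prints ['var1']
```

Here `var2` is dropped because it is common in both populations, and `var3` is dropped because its SIFT score does not predict a damaging effect.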

Requirements:
- Run Bash scripts
- Install tabix, VCFtools and BCFtools
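
Whether these requirements are met can be checked before running the workflow. The following Python sketch looks for the tools on the `PATH` and builds example command lines for indexing a VCF file and computing statistics. The specific invocations (`tabix -f -p vcf`, `bcftools stats`, `vcftools --gzvcf ... --freq`) are assumptions about what such a step might run, not the workflow's actual scripts, and the function names are hypothetical. Note that tabix expects a bgzip-compressed VCF file.

```python
import shutil
import subprocess

# Tools listed in the requirements above.
REQUIRED_TOOLS = ["bash", "tabix", "vcftools", "bcftools"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def build_commands(vcf_gz: str, out_prefix: str):
    """Build example command lines (assumed invocations, for illustration only)."""
    return [
        # Index the bgzip-compressed VCF; -f overwrites an existing index.
        ["tabix", "-f", "-p", "vcf", vcf_gz],
        # Summary statistics, written to standard output.
        ["bcftools", "stats", vcf_gz],
        # Per-site allele frequencies, written to <out_prefix>.frq.
        ["vcftools", "--gzvcf", vcf_gz, "--freq", "--out", out_prefix],
    ]


def run_all(vcf_gz: str, out_prefix: str) -> None:
    """Run the commands in order, stopping at the first failure."""
    missing = missing_tools()
    if missing:
        raise RuntimeError("Missing required tools: " + ", ".join(missing))
    for command in build_commands(vcf_gz, out_prefix):
        subprocess.run(command, check=True)
```

Calling `missing_tools()` returns an empty list when everything is installed; `run_all("sample.vcf.gz", "sample")` would then execute the three commands on a hypothetical input file.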

Here, we create a reproducible workflow for a typical bioinformatics application: variant annotation and filtering. For that, we combine typical command-line tools with built-in functionality in KNIME Analytics Platform, including shared components, REST services and interactive visualizations.

Workflow steps:
- Input: input VCF file
- Ensembl REST API: map coordinates from assembly GRCh37 to GRCh38 if needed, and annotate the resulting variants using VEP
- Command-line tools: use tabix to index the VCF file, and VCFtools and BCFtools to compute statistics
- Result Summary: summarize results
- Filter Variants: filter variants by quality or location
- Select Annotated Variants: make a selection (by default, everything is selected)
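
The two Ensembl REST API steps (coordinate mapping from GRCh37 to GRCh38 and VEP annotation) can be sketched in Python as follows. This is a minimal illustration, not the workflow's own implementation, which uses KNIME nodes: the helper names are hypothetical, and the URL patterns follow the public Ensembl REST endpoints `map/:species/:asm_one/:region/:asm_two` and `vep/:species/region/:region/:allele`. Only `fetch_json` contacts the server.

```python
import json
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def map_url(chrom: str, start: int, end: int,
            source: str = "GRCh37", target: str = "GRCh38") -> str:
    """URL that maps a forward-strand region from one assembly to another."""
    return f"{ENSEMBL_REST}/map/human/{source}/{chrom}:{start}..{end}:1/{target}"


def vep_region_url(chrom: str, start: int, end: int, alt_allele: str) -> str:
    """URL that requests VEP consequences for an alternate allele at a region."""
    return f"{ENSEMBL_REST}/vep/human/region/{chrom}:{start}-{end}:1/{alt_allele}"


def fetch_json(url: str, timeout: int = 30):
    """Send a GET request and decode the JSON response (requires network access)."""
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the URLs works offline; the coordinates are illustrative only.
print(map_url("X", 1000000, 1000100))
print(vep_region_url("9", 22125503, 22125503, "C"))
```

Passing either URL to `fetch_json` returns the decoded response. When annotating many variants, requests should be batched or throttled to stay within the service's rate limits.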
