Finding disease-causal variants among the many variants present remains a major challenge in the analysis of next-generation sequencing data ("Needles in stacks of needles", Cooper 2011).
One of the most frequently used formats for storing variant information is the Variant Call Format (VCF). Because extracting information from the complex genetic variation data encoded in VCF files is not straightforward, several command-line tools exist for filtering and querying VCF files, with the ultimate goal of detecting disease-causal variants.
This workflow illustrates how to mine your VCF files within KNIME Analytics Platform to find variants associated with a specific disease.
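To give a sense of why VCF data takes some work to query, here is a minimal Python sketch that splits a single VCF data line into its eight fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) and unpacks the semicolon-separated INFO field. It is not part of the workflow; the function names `parse_info` and `parse_vcf_line` and the example record are illustrative assumptions, and real files also carry header lines, per-sample genotype columns, and typed INFO values that this sketch ignores.

```python
def parse_info(info_field):
    """Split an INFO string such as 'DP=14;AF=0.5;DB' into a dict.

    Keys without a value (flags such as 'DB') are stored as True.
    Values are kept as strings; no type conversion is attempted.
    """
    info = {}
    if info_field in ("", "."):
        return info
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


def parse_vcf_line(line):
    """Parse the eight fixed, tab-separated columns of one VCF data line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line needs at least 8 tab-separated columns")
    chrom, pos, var_id, ref, alt, qual, filt, info = fields[:8]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": None if var_id == "." else var_id,
        "ref": ref,
        "alt": alt.split(","),  # several ALT alleles are comma-separated
        "qual": None if qual == "." else float(qual),
        "filter": filt,
        "info": parse_info(info),
    }


# Hypothetical example record, used only for illustration.
example = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB;H2"
record = parse_vcf_line(example)
print(record["chrom"], record["pos"], record["alt"], record["info"]["AF"])
```

Dedicated tools such as BCFtools and VCFtools handle the header, genotype columns, and compressed, indexed files, which is why they are generally preferable to hand-written parsing for real data.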
We use three common tools, BCFtools, VCFtools and VEP (via the Ensembl REST API), to filter and annotate the variants. The domain expert can interactively select variants of interest and filter by allele frequency in the 1000 Genomes Project and gnomAD, or by the predicted deleteriousness of a variant (SIFT score).
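The filtering logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the workflow's actual implementation: the field names (`af_1000g`, `af_gnomad`, `sift_score`), the function `is_candidate`, the example variants, and both thresholds are assumptions. The sketch keeps variants that are rare in both population datasets and have a low SIFT score, since low SIFT scores (commonly around 0.05 or below) indicate a predicted deleterious effect.

```python
def is_candidate(variant, max_af=0.01, max_sift=0.05):
    """Return True if a variant is rare and predicted deleterious.

    Assumptions of this sketch:
    - a missing allele frequency (None) means the variant was not
      observed in that dataset and does not exclude it;
    - a missing SIFT score excludes the variant, because there is
      no prediction to rank it by.
    """
    for key in ("af_1000g", "af_gnomad"):
        af = variant.get(key)
        if af is not None and af > max_af:
            return False  # too common in a reference population
    sift = variant.get("sift_score")
    return sift is not None and sift <= max_sift


# Hypothetical annotated variants, used only for illustration.
variants = [
    {"id": "var1", "af_1000g": 0.001, "af_gnomad": 0.002, "sift_score": 0.01},
    {"id": "var2", "af_1000g": 0.20, "af_gnomad": 0.18, "sift_score": 0.01},
    {"id": "var3", "af_1000g": None, "af_gnomad": 0.0005, "sift_score": 0.40},
    {"id": "var4", "af_1000g": None, "af_gnomad": None, "sift_score": None},
]

candidates = [v["id"] for v in variants if is_candidate(v)]
print(candidates)
```

In the workflow itself these choices are made interactively by the domain expert, so the cutoffs here should be read as adjustable parameters rather than recommended values.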
Requirements:
- Ability to run Bash scripts
- tabix, VCFtools and BCFtools installed
Variant_Prioritization consists of 163 nodes, provided by 11 plugins.