NGS related nodes for KNIME Workbench version 0.2.200.v201510231057 by Bernd Jagla, Institut Pasteur
ROIs enable a new way of visualizing, characterizing, and analyzing
aligned sequence data from next generation sequencing (NGS)
experiments, with implications for quality control as
well as for the biological interpretation of such experiments.
Moving away from the strict linear order that is implied by the reference
allows for a much more detailed analysis of the alignment than
was previously possible. We show this with two
examples: one from the quality control angle, where we identify and
characterize regions with coverage profiles that can most certainly
be associated with technical artefacts, and one where our
method can help or even guide the interpretation of miRNA data.
Though we restricted ourselves to these two examples for reasons of
brevity, the potential of this method is much broader: it can be
applied to ChIP-seq, RNA-seq, transcription start site analysis, and
potentially many more biological problems.
This technology can also help in understanding the technical biases
imposed on the experiments by the instruments and protocols.
The region of interest (ROI) file format is a text-based format that is compatible with the BED6 file format. Columns are therefore separated by the tab character, and the first four columns give the location of the region (reference, start, end) and a name. The fifth column holds the length of the region in lieu of the BED score and is followed by the strand (+/-). After these comes an arbitrary but consistent (within the file) number of numerical columns (metrics), followed by a comma-separated list of integer values (usually coverage values).
The file format allows for comment lines, which have “#” as the first character. By convention, a single line in the file header starting with “##” contains the column names.
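
The layout described above can be read with a few lines of Python. The following is a minimal sketch, not the parser used by the KNIME nodes: the names `RoiRecord`, `read_column_names` and `parse_roi` are hypothetical, the column names in the sample header are purely illustrative, and the sketch assumes that the names on the “##” line are tab-separated like the data columns and that metric values can be read as floating-point numbers.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class RoiRecord:
    """One data line of an ROI file (hypothetical container, for illustration)."""
    ref: str              # column 1: reference sequence name
    begin: int            # column 2: start position
    end: int              # column 3: end position
    name: str             # column 4: region name
    length: int           # column 5: region length (stored in place of the BED score)
    strand: str           # column 6: "+" or "-"
    metrics: List[float]  # zero or more numerical metric columns
    coverage: List[int]   # last column: comma-separated integer values


def read_column_names(lines: Iterable[str]) -> Optional[List[str]]:
    """Return the names from the first "##" line, or None if there is none.

    Assumption: the names are tab-separated, like the data columns.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("##"):
            return line[2:].split("\t")
    return None


def parse_roi(lines: Iterable[str]) -> Iterator[RoiRecord]:
    """Yield one RoiRecord per data line, skipping blank and comment lines."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # "#" comments and the "##" header are not data
        fields = line.split("\t")
        if len(fields) < 7:
            raise ValueError(f"expected at least 7 columns, got {len(fields)}")
        yield RoiRecord(
            ref=fields[0],
            begin=int(fields[1]),
            end=int(fields[2]),
            name=fields[3],
            length=int(fields[4]),
            strand=fields[5],
            metrics=[float(v) for v in fields[6:-1]],
            coverage=[int(v) for v in fields[-1].split(",")],
        )


# Invented sample data, only to show the call pattern.
sample = [
    "# an ordinary comment line",
    "##ref\tbegin_pos\tend_pos\tregion_name\tlength\tstrand\tmax_count\tcounts",
    "chr1\t100\t105\tregion_0\t5\t+\t4\t1,2,4,4,3",
]
print(read_column_names(sample))
for record in parse_roi(sample):
    print(record.name, record.length, record.metrics, record.coverage)
```

Because the number of metric columns is only fixed within a file, the sketch treats everything between the strand and the final column as metrics rather than hard-coding a count.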
See http://www.seqan.de/projects/ngs-roi/ for further information.
To use this node in KNIME, install NGS related nodes for KNIME Workbench from the following update site: