
**phenobo** version **2.1.6**

**GeneticNetworkScore** is part of the phenotype and metabotype analysis implemented in **PheNoBo**.
This node is the successor of the **PhenoToGeno** node and the **MetaboToGeno** node.
It is the predecessor of the **NetworkScore** node.

The aim of GeneticNetworkScore is to refine the gene scores of PhenoToGeno and MetaboToGeno.
The node calculates a new score for each gene based on a genetic network.
The procedure increases the scores of genes that interact with known causal genes for the patient's condition.
Therefore, the GeneticNetworkScore node enables the detection of new disease genes.

GeneticNetworkScore requires two input tables: the initial gene scores and a genetic network.
For details on the format of the tables, see the Input Port section and
the example files provided at https://github.com/marie-sophie/mapra.

The node implements a random walk with restart on a genetic network.
The random walk with restart is an iterative procedure based on the function s_{t+1} = (1-r)Ms_{t} + rs_{0}.
The function describes a random score transfer along the edges of the network.
s_{t} is a vector and denotes the scores of all genes after t iterations.
The vector s_{0} contains the initial scores calculated by PhenoToGeno or MetaboToGeno.
M is a (sparse) transition matrix representing the edges of the genetic network.
The entries m_{i,j} of M give the probability of transferring scores from gene j to gene i.
The parameter r gives the fraction of the original scores s_{0} that is not distributed within the network.
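
The update rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not the node's actual implementation: the function name `random_walk_with_restart`, the dense list-of-rows representation of M, and the three-gene path network are assumptions made for the example.

```python
def random_walk_with_restart(M, s0, r=0.9, steps=2):
    """Iterate s_{t+1} = (1 - r) * M * s_t + r * s_0 for a fixed number of steps.

    M is a dense list of rows; M[i][j] is the probability of transferring
    score from gene j to gene i, so every column of M sums to 1.
    """
    n = len(s0)
    s = list(s0)
    for _ in range(steps):
        s = [
            (1 - r) * sum(M[i][j] * s[j] for j in range(n)) + r * s0[i]
            for i in range(n)
        ]
    return s


# Hypothetical path network A - B - C with unweighted edges.
# Column j holds the outgoing transfer probabilities of gene j.
M = [
    [0.0, 0.5, 0.0],  # score flowing into A
    [1.0, 0.0, 1.0],  # score flowing into B
    [0.0, 0.5, 0.0],  # score flowing into C
]
s0 = [1.0, 0.0, 0.0]  # all of the initial score sits on gene A

scores = random_walk_with_restart(M, s0, r=0.9, steps=2)
print(scores)
```

With these inputs, two steps give scores of roughly 0.905, 0.09 and 0.005 for A, B and C. Gene C is two edges away from A, so it receives a score only from the second step onward, and the scores still sum to 1 because each column of M sums to 1.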

Finally, the gene scores of the random walk with restart are translated into enrichment scores.
The enrichment score of a gene with gene score g is determined as log_{10}(gn) where n denotes the total number of genes.
If the enrichment score is greater than 0, the gene score is higher than expected for a random prediction (where all genes get a score of n^{-1}).
If the enrichment score is lower than 0, the gene score is lower than expected for a random prediction.
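
As a small sketch of this conversion, the function below computes log_{10}(gn) for a single gene. The name `enrichment_score` and the handling of a gene score of 0 (returned as negative infinity) are assumptions for illustration; the description does not say how the node treats that case.

```python
import math


def enrichment_score(gene_score, n_genes):
    """Return log10(gene_score * n_genes).

    A gene score of 0 is mapped to -inf here; this is an assumption,
    since log10(0) is undefined.
    """
    if gene_score <= 0:
        return float("-inf")
    return math.log10(gene_score * n_genes)


n = 4  # hypothetical total number of genes
print(enrichment_score(1 / n, n))  # uniform score n^{-1} gives 0.0
print(enrichment_score(0.5, n))    # above the random expectation, positive
print(enrichment_score(0.1, n))    # below the random expectation, negative
```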

- **Use Weighted Edges**: This option includes edge weights in the calculations. The edge weights are translated into the probabilities m_{i,j} of the transition matrix M of the random walk with restart. The probability of transferring scores along an edge is proportional to the weight of the edge. If this option is checked, the table at input port 1 has to provide a column with integer edge weights.
- **Restart Probability**: The restart probability r controls the fraction of the original scores that is kept rather than distributed among the nodes of the network. For example, if the restart probability is set to r=0.9 (default value), 90% of the original score of a gene is kept and 10% of its score is distributed among its neighbors.
- **Number of Iterations**: This option refers to the parameter t (number of steps) of the random walk with restart. It influences how far the score is spread among the neighbors of a node. For example, if the number of steps is t=2 (default value), the direct neighbors and the neighbors of the direct neighbors receive scores from a node.
- **Iterate until Convergence**: This option provides an alternative to the option *Number of Iterations*. If you choose this option, the scores are approximated for an infinite number of steps (t=∞). This means that the score of a node is distributed among all other nodes of the network.
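
The options can be illustrated with a short Python sketch. Everything here is an assumption made for the example rather than the node's actual code: the function names, the column normalization (each edge weight divided by the total weight of the edges at the source gene), the stopping tolerance `tol`, and the cap `max_iter`. The sketch also assumes that `s0` contains every gene of the network.

```python
from collections import defaultdict


def transition_matrix(edges, use_weights=False):
    """Build a sparse, column-normalized M as {target: {source: probability}}.

    edges: iterable of (gene1, gene2, weight) rows describing undirected edges.
    Without use_weights, every edge counts with weight 1.
    """
    out = defaultdict(dict)  # out[source][target] = edge weight
    for g1, g2, w in edges:
        w = w if use_weights else 1
        out[g1][g2] = w
        out[g2][g1] = w
    M = defaultdict(dict)
    for source, targets in out.items():
        total = sum(targets.values())
        for target, w in targets.items():
            M[target][source] = w / total
    return M


def rwr_step(M, s, s0, r):
    """One update s_{t+1} = (1 - r) * M * s_t + r * s_0 on dict-based vectors."""
    return {
        g: (1 - r) * sum(p * s[src] for src, p in M.get(g, {}).items()) + r * s0[g]
        for g in s0
    }


def rwr_until_convergence(M, s0, r=0.9, tol=1e-10, max_iter=10000):
    """Repeat the update until the total change between two steps is below tol."""
    s = dict(s0)
    for _ in range(max_iter):
        nxt = rwr_step(M, s, s0, r)
        change = sum(abs(nxt[g] - s[g]) for g in s0)
        s = nxt
        if change < tol:
            break
    return s


# Hypothetical network table with columns gene1, gene2, weight.
edges = [("A", "B", 3), ("B", "C", 1)]
M_plain = transition_matrix(edges, use_weights=False)
M_weighted = transition_matrix(edges, use_weights=True)
print(M_plain["A"]["B"], M_weighted["A"]["B"])  # transfer probability from B to A

s0 = {"A": 1.0, "B": 0.0, "C": 0.0}
scores = rwr_until_convergence(M_weighted, s0, r=0.9)
print(scores)
```

In this toy network, gene B passes half of its distributed score to A when edges are unweighted and three quarters when the weights 3 and 1 are used. Stopping once successive score vectors barely differ is one common way to approximate the limit t=∞.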

**Scored Genes**: a table produced by the PhenoToGeno node or the MetaboToGeno node. GeneticNetworkScore does not require all columns generated by PhenoToGeno or MetaboToGeno. This node only depends on the columns **gene_id** and **gene_probability**.

**Network**: a table representing a genetic network. Each row corresponds to an undirected edge of the network. The edges are described by 2 columns called **gene1** and **gene2** giving the gene ids of the edge's vertices. If the option *Use Weighted Edges* is checked, the table requires a third column named **weight** with integer values.

**Gene Scores**: Each row represents a gene and consists of 3 columns: **gene_id**, **gene_probability** and **enrichment_score**. The column gene_probability contains modified gene scores based on the scores from the table at input port 0. The gene probability indicates the likelihood that the gene is causal for the patient's disease. The column enrichment_score is a gene score that is normalized for the total number of genes. If the enrichment score is above 0, the gene probability is higher than expected for a random prediction.

To use this node in KNIME, download the plugin file, save it to the plugin folder of your KNIME installation, and restart KNIME.
