GeneticNetworkScore

GeneticNetworkScore is part of the phenotype and metabotype analysis implemented in PheNoBo. This node is the successor of the PhenoToGeno node and the MetaboToGeno node. It is the predecessor of the NetworkScore node.

The aim of GeneticNetworkScore is to refine the gene scores of PhenoToGeno and MetaboToGeno. The node calculates a new score for each gene based on a genetic network. The procedure increases the scores of genes that interact with known causal genes for the patient's condition. Therefore, the GeneticNetworkScore node enables the detection of new disease genes.

GeneticNetworkScore requires two input tables: the initial gene scores and a genetic network. For details on the table formats, see the Input Ports section and the example files provided at https://github.com/marie-sophie/mapra.

The node implements a random walk with restart on a genetic network. The random walk with restart is an iterative procedure based on the update rule s_{t+1} = (1 - r) M s_t + r s_0. The rule describes a random score transfer along the edges of the network. s_t is a vector and denotes the scores of all genes after t iterations. The vector s_0 contains the initial scores calculated by PhenoToGeno or MetaboToGeno. M is a (sparse) transition matrix representing the edges of the genetic network. The entries m_{i,j} of M give the probability of transferring scores from gene j to gene i. The parameter r gives the fraction of the original scores s_0 that is not distributed within the network.
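The update rule can be illustrated with a minimal sketch in plain Python. This is not the node's source code: the function name, the three-gene network and the dense list-of-lists matrix are assumptions chosen for readability (the node itself is described as using a sparse matrix).

```python
def random_walk_with_restart(M, s0, r, steps):
    """Apply s_{t+1} = (1 - r) * M * s_t + r * s_0 for a fixed number of steps.

    M is a dense transition matrix (list of rows); M[i][j] is the probability
    of transferring score from gene j to gene i. This is an illustrative
    sketch, not the node's implementation.
    """
    n = len(s0)
    s = list(s0)
    for _ in range(steps):
        s = [
            (1 - r) * sum(M[i][j] * s[j] for j in range(n)) + r * s0[i]
            for i in range(n)
        ]
    return s


# Hypothetical path network A - B - C. Each column sums to 1, so the
# total score is preserved in every step.
M = [
    [0.0, 0.5, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 0.5, 0.0],
]
s0 = [1.0, 0.0, 0.0]  # all initial score on gene A

scores = random_walk_with_restart(M, s0, r=0.9, steps=2)
```

With r=0.9 and two steps, gene A keeps most of its score, its direct neighbor B receives a small share, and C (a neighbor of a neighbor) receives an even smaller one.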

Finally, the gene scores of the random walk with restart are translated into enrichment scores. The enrichment score of a gene with gene score g is determined as log10(g · n), where n denotes the total number of genes. If the enrichment score is greater than 0, the gene score is higher than expected for a random prediction (where all genes get a score of 1/n). If the enrichment score is lower than 0, the gene score is lower than expected for a random prediction.
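As a small worked example, the enrichment score can be computed as follows. The function name is hypothetical, and returning negative infinity for a gene score of 0 is an assumption of this sketch; the description does not state how the node handles that case.

```python
import math


def enrichment_scores(gene_scores):
    """Return log10(g * n) for each gene score g, where n is the number of genes."""
    n = len(gene_scores)
    # Assumption: a score of 0 maps to -inf, because log10(0) is undefined.
    return [math.log10(g * n) if g > 0 else float("-inf") for g in gene_scores]


# Three genes, so a random prediction would assign 1/3 to each of them.
example = enrichment_scores([0.905, 0.09, 0.005])

# A uniform score vector sits exactly at the random expectation.
uniform = enrichment_scores([0.25, 0.25, 0.25, 0.25])
```

In the first example only the first gene scores above the random expectation of 1/3, so only its enrichment score is positive.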

Options

Use Weighted Edges: This option includes edge weights in the calculations. The edge weights are translated into the probabilities m_{i,j} of the transition matrix M of the random walk with restart. The probability of transferring scores along an edge is proportional to the weight of that edge. If this option is checked, the table at input port 1 has to provide a column with integer edge weights.
Restart Probability: The restart probability r controls the fraction of the original scores that is retained at each gene rather than distributed within the network. For example, if the restart probability is set to r=0.9 (default value), 90% of the original score of a gene is kept and 10% of its score is distributed among its neighbors.
Number of Iterations: This option refers to the parameter t (number of steps) of the random walk with restart. It influences how far the score is spread among the neighbors of a node. For example, if the number of steps is t=2 (default value), the direct neighbors and the neighbors of the direct neighbors receive scores from a node.
Iterate until Convergence: This option provides an alternative to the option Number of Iterations. If you choose this option, the scores are approximated for an infinite number of steps (t=∞). This means that the score of a node is distributed among all other nodes of the network.
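The options above can be sketched together in plain Python. Several details are assumptions, since the description does not specify them: edge weights are turned into probabilities by normalizing each column of the weight matrix, convergence is declared when the largest score change falls below a tolerance, and the names `transition_matrix`, `rwr_until_convergence`, `tol` and `max_iter` are hypothetical.

```python
def transition_matrix(genes, edges, weighted=True):
    """Build a column-normalized transition matrix from undirected edges.

    edges is a list of (gene1, gene2, weight) tuples. With weighted=False
    every edge counts as 1. Column normalization is an assumption of this sketch.
    """
    index = {gene: k for k, gene in enumerate(genes)}
    n = len(genes)
    W = [[0.0] * n for _ in range(n)]
    for gene1, gene2, weight in edges:
        w = float(weight) if weighted else 1.0
        i, j = index[gene1], index[gene2]
        W[i][j] += w  # undirected edge: score can flow both ways
        W[j][i] += w
    col_sums = [sum(W[i][j] for i in range(n)) for j in range(n)]
    # Genes without any edge keep an all-zero column.
    return [
        [W[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def rwr_until_convergence(M, s0, r, tol=1e-10, max_iter=10000):
    """Iterate s_{t+1} = (1 - r) * M * s_t + r * s_0 until the scores stop changing."""
    n = len(s0)
    s = list(s0)
    for _ in range(max_iter):
        nxt = [
            (1 - r) * sum(M[i][j] * s[j] for j in range(n)) + r * s0[i]
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(nxt, s)) < tol:
            return nxt
        s = nxt
    return s


genes = ["A", "B", "C"]
edges = [("A", "B", 3), ("B", "C", 1)]  # hypothetical weighted network

M_weighted = transition_matrix(genes, edges, weighted=True)
M_unweighted = transition_matrix(genes, edges, weighted=False)
converged = rwr_until_convergence(M_weighted, [1.0, 0.0, 0.0], r=0.9)
```

In the weighted matrix, gene B passes three quarters of its transferred score to A and one quarter to C; without weights it splits the score evenly between them.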

Input Ports

Scored Genes (port 0): a table produced by the PhenoToGeno node or the MetaboToGeno node. GeneticNetworkScore does not require all columns generated by PhenoToGeno or MetaboToGeno; it only uses the columns gene_id and gene_probability.
Network (port 1): a table representing a genetic network. Each row corresponds to an undirected edge of the network. The edges are described by two columns called gene1 and gene2, which give the gene ids of the edge's vertices. If the option Use Weighted Edges is checked, the table requires a third column named weight with integer values.
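To make the expected columns concrete, here is a minimal sketch that reads both tables from CSV text. The node itself consumes KNIME tables, so the CSV representation, the helper names and the gene ids G1 to G3 are made up for illustration; only the column names gene_id, gene_probability, gene1, gene2 and weight come from the description above.

```python
import csv
import io

# Hypothetical contents of the two input tables.
scored_genes_csv = """gene_id,gene_probability
G1,0.7
G2,0.2
G3,0.1
"""

network_csv = """gene1,gene2,weight
G1,G2,3
G2,G3,1
"""


def read_scored_genes(text):
    """Map each gene_id to its gene_probability; other columns are ignored."""
    rows = csv.DictReader(io.StringIO(text))
    return {row["gene_id"]: float(row["gene_probability"]) for row in rows}


def read_network(text, weighted):
    """Return undirected edges as (gene1, gene2, weight) tuples.

    The weight column is only read when weighted is True; otherwise every
    edge gets weight 1.
    """
    edges = []
    for row in csv.DictReader(io.StringIO(text)):
        weight = int(row["weight"]) if weighted else 1
        edges.append((row["gene1"], row["gene2"], weight))
    return edges


gene_scores = read_scored_genes(scored_genes_csv)
edges = read_network(network_csv, weighted=True)
```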

Output Ports

Gene Scores: Each row represents a gene and consists of 3 columns: gene_id, gene_probability and enrichment_score. The column gene_probability contains modified gene scores based on the scores from the table at input port 0. The gene probability indicates the likelihood that the gene is causal for the patient's disease. The column enrichment_score is a gene score that is normalized for the total number of genes. If the enrichment score is above 0, the gene probability is higher than expected for a random prediction.


Views

This node has no views


Installation

To use this node in KNIME, download the file referenced below, save it to your KNIME plugin folder, and restart KNIME.

v5.6

Plugin provider:

Plugin version: 2.1.6

On NodePit since: 2025-08-15

Last update: 2025-08-18

KNIME versions: Since v3.6
