MetaboToGeno is part of the metabotype analysis implemented in PheNoBo. This node is the successor of the ScoreMetabolites node and a predecessor of the NetworkScore node.

The aim of MetaboToGeno is to transform the per-metabolite results of ScoreMetabolites into predictions of causal genes. The MetaboToGeno algorithm calculates a score for each gene based on the p values reported by ScoreMetabolites. The score of a gene x indicates the probability that x is the causal gene for the patient's metabotype.

MetaboToGeno requires three input tables: the metabolite scores, the metabolite-gene associations and the set of all genes to score. For details on the table formats, see the Input Ports section and the example files provided at

The MetaboToGeno algorithm is a procedure with 2 main steps.
The first step of the procedure is a score transformation at metabolite level. The p value of each metabolite is converted to the probability that the metabolite is associated with the patient's disease. A metabolite with p value p gets a new score 1/(1+np) where n denotes the total number of metabolites.
The second step transfers a metabolite score to all genes that have an association with that metabolite. If a metabolite does not have any known associations, its score is distributed among all genes. There are several methods (see dialog options) to handle genes that obtain scores from more than one metabolite.
The algorithm of MetaboToGeno is derived from the Phen-Gen tool (see Javed et al., 2014) and is described in more detail at...
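
The two steps can be illustrated with a minimal Python sketch. It is not the node's actual implementation: the function names, the dictionary-based data layout and the ids M1 to M3 and G1 to G3 are hypothetical. In particular, the sketch assumes that "distributed among all genes" means an even split of the score, and that only genes from the all-genes table receive scores.

```python
def transform_p_values(p_values):
    """Step 1: map each metabolite p value p to the score 1 / (1 + n * p)."""
    n = len(p_values)  # n = total number of metabolites
    return {m: 1.0 / (1.0 + n * p) for m, p in p_values.items()}


def transfer_scores(metabolite_scores, associations, all_genes):
    """Step 2: collect, for every gene, the scores of its metabolites.

    associations maps a metabolite id to a (possibly empty) set of gene ids.
    ASSUMPTION: a metabolite without known genes has its score split evenly
    across all genes; the description only says the score is "distributed".
    ASSUMPTION: genes missing from all_genes are ignored.
    """
    per_gene = {gene: {} for gene in all_genes}
    for metabolite, score in metabolite_scores.items():
        genes = associations.get(metabolite, ())
        if genes:
            for gene in genes:
                if gene in per_gene:
                    per_gene[gene][metabolite] = score
        else:
            share = score / len(all_genes)
            for gene in all_genes:
                per_gene[gene][metabolite] = share
    return per_gene


# Hypothetical example data
p_values = {"M1": 0.001, "M2": 0.5, "M3": 0.02}
associations = {"M1": {"G1", "G2"}, "M2": {"G2"}, "M3": set()}
all_genes = ["G1", "G2", "G3"]

metabolite_scores = transform_p_values(p_values)
per_gene = transfer_scores(metabolite_scores, associations, all_genes)
```

With n = 3, the metabolite M2 (p = 0.5) gets the score 1/(1 + 1.5) = 0.4, and G3 receives only its share of the unannotated metabolite M3. How a gene's collected scores are merged into one value depends on the Gene Annotation Mode.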


Gene Annotation Mode
This option specifies how MetaboToGeno handles genes that obtain scores from more than one metabolite. There are two possible modes.
Combination of all metabolite scores: This method combines the scores of all metabolites annotated to a gene. The final score of a gene is determined as 1-(1-s1)(1-s2)...(1-sn) with s1 to sn denoting the scores of its metabolites.
Maximum metabolite score: This method takes the maximum score of all metabolites annotated to a gene. The final score of a gene is determined as max(s1,s2,...,sn) with s1 to sn denoting the scores of its metabolites.
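
The two modes can be compared with a short Python sketch. The function names are hypothetical. The sketch assumes that the combination mode yields 1-(1-s1)(1-s2)...(1-sn), i.e. the probability that at least one annotated metabolite is associated with the disease, so that higher metabolite scores raise the gene score. Returning 0 for a gene without any metabolite score is also an assumption.

```python
from math import prod


def combine_all(scores):
    """Combination mode (ASSUMED form): 1 - (1-s1)(1-s2)...(1-sn)."""
    return 1.0 - prod(1.0 - s for s in scores)


def maximum_score(scores):
    """Maximum mode: max(s1, s2, ..., sn); 0.0 if the gene has no scores."""
    return max(scores, default=0.0)


scores = [0.4, 0.5]
combined = combine_all(scores)   # 1 - 0.6 * 0.5, about 0.7
highest = maximum_score(scores)  # 0.5
```

Under this reading, the combination mode never yields a lower score than the maximum mode, because every additional metabolite can only increase the gene score.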

Input Ports

Output of ScoreMetabolites: a table produced by the ScoreMetabolites node. MetaboToGeno does not require all columns generated by ScoreMetabolites; it only depends on the columns metabolite_id and significance.
Associations Metabolite - Gene: a table representing associations between metabolites and genes. These associations can include indirect relations (e.g. results from GWAS) and data about biochemical reactions (e.g. gene x catalyzes a reaction that forms metabolite y). The table should have two columns named metabolite_id and gene_id. Each association is represented as a pair of a metabolite id (e.g. Metabolon id) and a gene id (e.g. Ensembl id). Note that the gene id is allowed to be missing (for a metabolite without known genes) whereas the metabolite id is required in every row.
All metabolites that should be considered in the MetaboToGeno algorithm have to occur at least once in the table.
All genes: a table with a single column gene_id. It contains gene ids (e.g. Ensembl gene ids) of all genes to score.
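
As an illustration of the association table layout, the following Python sketch reads such a table from CSV text. The CSV serialization, the function name and the ids M1 to M3, G1 and G2 are assumptions made for this example. The node itself reads KNIME tables, and only the column names metabolite_id and gene_id come from the description above.

```python
import csv
import io

# Hypothetical association table; M3 has no known gene (empty gene_id)
associations_csv = """metabolite_id,gene_id
M1,G1
M1,G2
M2,G2
M3,
"""


def read_associations(handle):
    """Return a dict mapping each metabolite id to its set of gene ids."""
    associations = {}
    for row in csv.DictReader(handle):
        metabolite = (row["metabolite_id"] or "").strip()
        if not metabolite:
            raise ValueError("metabolite_id is required in every row")
        # Register the metabolite even if it has no gene in this row
        genes = associations.setdefault(metabolite, set())
        gene = (row["gene_id"] or "").strip()
        if gene:
            genes.add(gene)
    return associations


associations = read_associations(io.StringIO(associations_csv))
```

A metabolite with a missing gene id still appears as a key with an empty set, which matches the requirement that every metabolite to be considered occurs at least once in the table.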

Output Ports

Gene Scores: Each row represents a gene from the table at Input Port 2 and consists of 3 columns: gene_id, gene_probability and contribution. The gene probability gives the likelihood that the gene is causal for the patient's metabotype. The column contribution lists the metabolites that contributed most to the score of the gene.
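
The shape of the output rows can be sketched in Python as follows. The three column names come from the description above; everything else is an assumption made for illustration. The sketch uses the maximum mode for gene_probability, and it fills contribution with the two highest-scoring metabolite ids joined by a semicolon, since the description does not specify how "contributed most" is determined or formatted.

```python
def build_gene_rows(per_gene_scores, top_k=2):
    """Build one output row per gene from {gene: {metabolite: score}}.

    ASSUMPTIONS: maximum mode for gene_probability; contribution holds the
    top_k metabolite ids by score, joined with ';' (hypothetical format).
    """
    rows = []
    for gene_id, by_metabolite in per_gene_scores.items():
        ranked = sorted(by_metabolite, key=by_metabolite.get, reverse=True)
        rows.append({
            "gene_id": gene_id,
            "gene_probability": max(by_metabolite.values(), default=0.0),
            "contribution": ";".join(ranked[:top_k]),
        })
    return rows


# Hypothetical per-gene metabolite scores
example = {
    "G1": {"M1": 0.9, "M3": 0.3},
    "G2": {"M1": 0.9, "M2": 0.4, "M3": 0.3},
    "G3": {},
}
rows = build_gene_rows(example)
```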



This node has no views

