Examples

Trail finding

Suppose CoMetGeNe.py (trail finding) is executed as follows:
python2 CoMetGeNe.py eco data/eco/ -dG 2 -dD 1 -o eco.out

Metabolic pathway maps for Escherichia coli K-12 MG1655 (eco) are automatically downloaded from KEGG and stored under data/eco/. At most two genes (-dG 2) and one reaction (-dD 1) can be skipped. Results are saved in the output file eco.out.

For the above example, CoMetGeNe identifies 238 trails with spans ranging from 2 to 10 (i.e., each of the 238 trails contains between 2 and 10 unique metabolic reactions). Below is the output corresponding to a trail of span 3:
path_eco00564.kgml: Found a trail of span 3 containing skipped vertices
path_eco00564.kgml: 110 -> [139] -> 104 -> 123
path_eco00564.kgml: R02054 -> [R04864] -> R02053 -> R03416
path_eco00564.kgml: eco:b3821 -> [eco:b2836] -> eco:b3821 -> eco:b3825
path_eco00564.kgml: 3.1.1.32 -> [2.3.1.40] -> 3.1.1.4 -> 3.1.1.5
path_eco00564.kgml: Skipped genes: eco:b3823, eco:b3822
  • path_eco00564.kgml is the file name for the pathway map 00564 in eco (glycerophospholipid metabolism), retrieved automatically from KEGG by CoMetGeNe.
  • The four lines with entities separated by arrows (->) represent the trail R02054 -> R02053 -> R03416 in four different ways, using:
    • The KGML identifiers of the reactions in the trail (110 -> 104 -> 123).
    • The KEGG R numbers associated with the reactions in the trail (R02054 -> R02053 -> R03416). Span is computed in terms of distinct R numbers in the trail.
    • The names of genes whose products are involved in reactions in the trail (eco:b3821 -> eco:b3821 -> eco:b3825).
    • The EC numbers associated with the reactions in the trail (3.1.1.32 -> 3.1.1.4 -> 3.1.1.5).
  • The reaction R04864 was skipped (allowed because CoMetGeNe.py was executed with the option -dD 1): it is shown in square brackets, along with the corresponding KGML identifier (139), associated gene (b2836), and EC number (2.3.1.40).
  • Two genes were skipped (allowed because CoMetGeNe.py was executed with the option -dG 2): eco:b3823 and eco:b3822.
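
The bracket convention described above can be checked with a few lines of Python. This is a minimal sketch, not part of CoMetGeNe: the helper name split_trail is hypothetical, and the code assumes each output line has the form "<file>: item -> item -> ..." with skipped vertices in square brackets, as in the example.

```python
def split_trail(line):
    """Return (kept, skipped) items from one arrow-separated output line."""
    # Drop a leading "<file>.kgml: " prefix, if any. Gene names such as
    # "eco:b3821" contain a colon, but never a colon followed by a space.
    _, _, body = line.rpartition(": ")
    kept, skipped = [], []
    for item in body.split("->"):
        item = item.strip()
        if item.startswith("[") and item.endswith("]"):
            skipped.append(item[1:-1])  # skipped vertex, brackets removed
        else:
            kept.append(item)
    return kept, skipped


line = "path_eco00564.kgml: R02054 -> [R04864] -> R02053 -> R03416"
kept, skipped = split_trail(line)
span = len(set(kept))  # span counts distinct R numbers in the trail
print(kept, skipped, span)
```

For the R-number line of the example, the kept items are R02054, R02053, and R03416, the skipped item is R04864, and the span is 3, which agrees with the first line of the output.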

Trail grouping

Trail grouping by genes

Suppose trail finding was performed for species aae, bbn, eco, and mpn. A small part of the CSV obtained when grouping CoMetGeNe trails by genes with eco as the reference species is reproduced below (slightly reformatted for readability):
eco_gene;chr;str;aae;bbn;mpn
b0114; chr; + ; . ; . ; x
b0115; chr; + ; . ; . ; x
b0116; chr; + ; x ; . ; x

From the table above, it can be seen that:

  • Species aae has at least two neighboring homologues to the gene b0116 in eco;
  • Species bbn has no neighboring homologues for any of the three genes in eco;
  • Species mpn has neighboring homologues for all three genes in eco.
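
The same reading can be done programmatically. The sketch below is an illustration only: the function name marked_species is hypothetical, the sample is embedded as a string rather than read from a real CoMetGeNe output file, and it assumes that the first three columns are metadata and that "x" is the symbol of interest.

```python
import csv
import io

# The excerpt above, embedded as text for illustration.
SAMPLE = """\
eco_gene;chr;str;aae;bbn;mpn
b0114; chr; + ; . ; . ; x
b0115; chr; + ; . ; . ; x
b0116; chr; + ; x ; . ; x
"""


def marked_species(text, mark="x", n_meta=3):
    """Map each reference gene to the species whose cell equals `mark`."""
    rows = [
        [cell.strip() for cell in row]  # tolerate the readability padding
        for row in csv.reader(io.StringIO(text), delimiter=";")
        if row
    ]
    header, body = rows[0], rows[1:]
    species = header[n_meta:]  # columns after eco_gene, chr, str
    return {
        row[0]: [s for s, cell in zip(species, row[n_meta:]) if cell == mark]
        for row in body
    }


by_gene = marked_species(SAMPLE)
print(by_gene)
```

Here b0114 and b0115 map to mpn only, while b0116 maps to both aae and mpn, matching the three observations above.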

Trail grouping by reactions

Suppose trail finding was performed for species aae, bbn, eco, and mpn. A small part of the CSV obtained when grouping CoMetGeNe trails by reactions with eco as the reference species is reproduced below (slightly reformatted for readability):
reaction;eco_gene;pathway; aae;bbn;mpn
R07618; b0116; 00010 00020 00280 00620 00640; . ; o ; x
R03270; b0114; 00010 00020 00620; o ; o ; x
R00014; b0114; 00010 00020 00620; o ; o ; x
R02569; b0115; 00010 00020 00620; o ; o ; x

From the table above, it can be seen that:

  • Species aae performs reaction R07618 but none of the other three reactions;
  • Species bbn performs none of the four reactions;
  • Species mpn performs all four reactions using products of neighboring genes.
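
To summarize such a table per species, the symbols in each species column can simply be tallied. The sketch below is illustrative rather than part of CoMetGeNe: symbol_counts is a hypothetical helper, the excerpt is embedded as a string, and the code assumes three metadata columns (reaction, eco_gene, pathway). It deliberately does not interpret the symbols; their meaning is the one given by the observations above.

```python
import csv
import io
from collections import Counter

# The excerpt above, embedded as text for illustration.
SAMPLE = """\
reaction;eco_gene;pathway;aae;bbn;mpn
R07618;b0116;00010 00020 00280 00620 00640;.;o;x
R03270;b0114;00010 00020 00620;o;o;x
R00014;b0114;00010 00020 00620;o;o;x
R02569;b0115;00010 00020 00620;o;o;x
"""


def symbol_counts(text, n_meta=3):
    """Count how often each symbol occurs in every species column."""
    rows = [
        [cell.strip() for cell in row]
        for row in csv.reader(io.StringIO(text), delimiter=";")
        if row
    ]
    header, body = rows[0], rows[1:]
    species = header[n_meta:]
    counts = {s: Counter() for s in species}
    for row in body:
        for s, cell in zip(species, row[n_meta:]):
            counts[s][cell] += 1
    return counts


counts = symbol_counts(SAMPLE)
for name, tally in counts.items():
    print(name, dict(tally))
```

For this excerpt, aae has one "." and three "o", bbn has four "o", and mpn has four "x", which is the pattern summarized in the list above.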

Created by Alexandra Zaharia. Maintained by Alain Denise.