Metabolomics → GO Bridge: protonation-normalization coverage probe
Generated by coverage_probe.py — all numbers
computed live from OLS4 (ChEBI), the Rhea REST API, and the GO rhea2go mapping.
Why this probe exists
Reported metabolite ChEBI ids rarely match Rhea participants directly,
for two reasons we test as successive normalization tiers:
- Protonation: Rhea writes participants in their major protonation
state at pH 7.3 (citrate(3-), ATP(4-)); repositories report neutral
forms. We expand over the ChEBI relations is_protonated_form_of /
is_deprotonated_form_of (the protonation family).
- Structure / skeleton: a study reports a generic, non-stereospecific
compound (isoleucine) while Rhea uses the stereospecific zwitterion
(L-isoleucine zwitterion). We expand over the broader structural
relations (+ tautomer/enantiomer + generic→specific children), bounded
to the seed's InChIKey skeleton. This tier is stereo/charge-blind, so
it is reported separately as the more permissive fallback.
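The protonation tier can be pictured as a graph closure: starting from the reported id, follow (de)protonation relations until no new forms appear, then check whether any member of that family is a Rhea participant. The following is a minimal sketch under stated assumptions, not the probe's actual code: the function names, the toy ids ("acid", "anion(1-)", "anion(2-)") and the edge table are all hypothetical placeholders, and the structure tier is left out.

```python
from collections import deque

# Toy relation graph: ids and edges are illustrative placeholders, NOT ChEBI data.
PROTONATION_EDGES = {
    "acid": {"anion(1-)"},
    "anion(1-)": {"acid", "anion(2-)"},
    "anion(2-)": {"anion(1-)"},
}
# Hypothetical set of ids that appear as participants in some reaction.
RHEA_PARTICIPANTS = {"anion(2-)"}


def protonation_family(seed, edges):
    """Breadth-first closure over (de)protonation relations, seed included."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for neighbour in edges.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def tier_reached(seed, edges, participants):
    """Return the first tier at which the seed connects to a participant."""
    if seed in participants:
        return "exact"
    if protonation_family(seed, edges) & participants:
        return "proton"
    return "miss"  # the structure tier is omitted from this sketch


family = protonation_family("acid", PROTONATION_EDGES)
tier = tier_reached("acid", PROTONATION_EDGES, RHEA_PARTICIPANTS)
print(sorted(family), tier)
```

Checking tiers in order of strictness is what makes the reported tier the least permissive one that succeeds.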
Headline
- Metabolites probed: 26 (resolved to ChEBI: 26)
- In a Rhea reaction — exact: 0/26 → +protonation: 22/26 → +structure: 25/26
- Reaching a GO MF term (rhea2go) — exact: 0 → +protonation: 22 → +structure: 24 (of 26)
- Recovered by protonation: 22; additionally by structure/skeleton: 3
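The headline figures are cumulative: each tier adds its own recoveries to those of the stricter tiers before it. As a small sketch of that arithmetic (variable names are illustrative; the tier counts are the ones reported in this section: 22 protonation, 3 structure, 1 miss):

```python
from collections import Counter
from itertools import accumulate

TIER_ORDER = ["exact", "proton", "struct"]

# One entry per probed metabolite, using the tier counts reported above.
tiers = ["proton"] * 22 + ["struct"] * 3 + ["miss"] * 1

counts = Counter(tiers)  # missing tiers (here "exact") count as 0
total = len(tiers)
cumulative = list(accumulate(counts[t] for t in TIER_ORDER))

headline = " → ".join(
    f"{'' if t == 'exact' else '+'}{t}: {n}/{total}"
    for t, n in zip(TIER_ORDER, cumulative)
)
print(headline)  # exact: 0/26 → +proton: 22/26 → +struct: 25/26
```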
Per-metabolite
Tier reached: exact < proton (protonation) < struct (skeleton) < — (miss).
q is the net charge of the seed; GO MF counts are cumulative across tiers.
| Metabolite | Seed ChEBI | q | Tier | GO MF (exact→proton→struct) | Rhea-matched form |
|---|---|---|---|---|---|
| citric acid | CHEBI:30769 | 0 | proton | 0→8→8 | citrate(3-) (CHEBI:16947, q=-3) |
| cis-aconitic acid | CHEBI:32805 | 0 | proton | 0→2→2 | cis-aconitate(3-) (CHEBI:16383, q=-3) |
| isocitric acid | CHEBI:30887 | 0 | struct | 0→0→6 | D-erythro-isocitrate(3-) (CHEBI:15563, q=-3) |
| 2-oxoglutaric acid | CHEBI:30915 | 0 | proton | 0→152→152 | 2-oxoglutarate(2-) (CHEBI:16810, q=-2) |
| succinic acid | CHEBI:15741 | 0 | proton | 0→92→92 | succinate(2-) (CHEBI:30031, q=-2) |
| fumaric acid | CHEBI:18012 | 0 | proton | 0→18→18 | fumarate(2-) (CHEBI:29806, q=-2) |
| (S)-malic acid | CHEBI:30797 | 0 | proton | 0→19→19 | (S)-malate(2-) (CHEBI:15589, q=-2) |
| oxaloacetic acid | CHEBI:30744 | 0 | proton | 0→129→129 | oxaloacetate(2-) (CHEBI:16452, q=-2) |
| pyruvic acid | CHEBI:32816 | 0 | proton | 0→135→135 | pyruvate (CHEBI:15361, q=-1) |
| L-lactic acid | CHEBI:53408 | 0 | — | 0→0→0 | — |
| D-glucose | CHEBI:17634 | 0 | struct | 0→0→0 | aldehydo-D-glucose (CHEBI:42758, q=0) |
| D-glucose 6-phosphate | CHEBI:14314 | 0 | struct | 0→0→1 | aldehydo-D-glucose 6-phosphate(2-) (CHEBI:57584, q=-2) |
| D-fructose 6-phosphate | CHEBI:15946 | 0 | proton | 0→2→2 | D-fructose 6-phosphate(2-) (CHEBI:57579, q=-2) |
| phosphoenolpyruvic acid | CHEBI:44897 | 0 | proton | 0→23→23 | phosphonatoenolpyruvate (CHEBI:58702, q=-3) |
| 3-phospho-D-glyceric acid | CHEBI:17794 | 0 | proton | 0→25→25 | 3-phosphonato-D-glycerate(3-) (CHEBI:58272, q=-3) |
| L-glutamic acid | CHEBI:16015 | 0 | proton | 0→132→132 | L-glutamate(1-) (CHEBI:29985, q=-1) |
| L-aspartic acid | CHEBI:17053 | 0 | proton | 0→40→40 | L-aspartate(1-) (CHEBI:29991, q=-1) |
| L-alanine | CHEBI:16977 | 0 | proton | 0→51→51 | L-alanine zwitterion (CHEBI:57972, q=0) |
| glycine | CHEBI:15428 | 0 | proton | 0→53→53 | glycine zwitterion (CHEBI:57305, q=0) |
| L-serine | CHEBI:17115 | 0 | proton | 0→32→32 | L-serine zwitterion (CHEBI:33384, q=0) |
| ATP | CHEBI:15422 | 0 | proton | 0→492→492 | ATP(4-) (CHEBI:30616, q=-4) |
| ADP | CHEBI:16761 | 0 | proton | 0→370→370 | ADP(3-) (CHEBI:456216, q=-3) |
| AMP | CHEBI:18374 | 0 | proton | 0→2→2 | 1-(5-phosphonato-beta-D-ribosyl)-5'-AMP(4-) (CHEBI:59457, q=-4) |
| NAD(+) | CHEBI:15846 | 1 | proton | 0→447→447 | NAD(1-) (CHEBI:57540, q=-1) |
| acetyl-CoA | CHEBI:15351 | 0 | proton | 0→385→385 | acetyl-CoA(4-) (CHEBI:57288, q=-4) |
| succinyl-CoA | CHEBI:15380 | 0 | proton | 0→304→304 | succinyl-CoA(5-) (CHEBI:57292, q=-5) |
Recovered by structure (skeleton) normalization
Generic / stereochemistry mismatches the protonation tier could not fix,
recovered by InChIKey-skeleton expansion:
- isocitric acid (CHEBI:30887) → D-erythro-isocitrate(3-) (CHEBI:15563, q=-3)
- D-glucose (CHEBI:17634) → aldehydo-D-glucose (CHEBI:42758, q=0)
- D-glucose 6-phosphate (CHEBI:14314) → aldehydo-D-glucose 6-phosphate(2-) (CHEBI:57584, q=-2)
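One way to read "InChIKey skeleton" is the first 14-character block of a standard InChIKey, which hashes the connectivity layer and ignores stereochemistry and protonation. That reading is an assumption here, and the probe's actual bounding rule may differ. A minimal sketch, with hypothetical function names and obviously synthetic keys (not real compounds):

```python
def skeleton(inchikey: str) -> str:
    """Return the first (connectivity) block of a standard InChIKey."""
    blocks = inchikey.strip().split("-")
    # A standard InChIKey has three blocks of 14, 10 and 1 characters.
    if [len(b) for b in blocks] != [14, 10, 1]:
        raise ValueError(f"not a standard InChIKey: {inchikey!r}")
    return blocks[0]


def same_skeleton(a: str, b: str) -> bool:
    """True when two keys share connectivity, whatever their stereo/charge."""
    return skeleton(a) == skeleton(b)


# Synthetic keys for illustration only.
seed_key = "AAAAAAAAAAAAAA-BBBBBBBBSA-N"       # generic, neutral form
candidate_key = "AAAAAAAAAAAAAA-CCCCCCCCSA-M"  # stereospecific, charged form
other_key = "ZZZZZZZZZZZZZZ-BBBBBBBBSA-N"       # different connectivity

print(same_skeleton(seed_key, candidate_key))  # True
print(same_skeleton(seed_key, other_key))      # False
```

Because the comparison discards the second and third blocks, it accepts any stereoisomer or charge state of the same skeleton, which is why this tier is the more permissive one.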
Example GO molecular functions reached
- citric acid (CHEBI:30769) → GO:0003878 ATP citrate synthase activity; GO:0003994 aconitate hydratase activity; GO:0008814 citrate CoA-transferase activity
- cis-aconitic acid (CHEBI:32805) → GO:0047613 aconitate decarboxylase activity; GO:0047614 aconitate delta-isomerase activity
- isocitric acid (CHEBI:30887) → GO:0003994 aconitate hydratase activity; GO:0004449 isocitrate dehydrogenase (NAD+) activity; GO:0004450 isocitrate dehydrogenase (NADP+) activity
- 2-oxoglutaric acid (CHEBI:30915) → GO:0000908 taurine dioxygenase activity; GO:0003973 (S)-2-hydroxy-acid oxidase activity; GO:0003992 N2-acetyl-L-ornithine:2-oxoglutarate 5-transaminase activity
- succinic acid (CHEBI:15741) → GO:0000104 succinate dehydrogenase activity; GO:0000908 taurine dioxygenase activity; GO:0003962 cystathionine gamma-synthase activity
- fumaric acid (CHEBI:18012) → GO:0000104 succinate dehydrogenase activity; GO:0004018 N6-(1,2-dicarboxyethyl)AMP AMP-lyase (fumarate-forming) activity; GO:0004056 argininosuccinate lyase activity
- (S)-malic acid (CHEBI:30797) → GO:0004029 aldehyde dehydrogenase (NAD+) activity; GO:0004333 fumarate hydratase activity; GO:0004471 malate dehydrogenase (decarboxylating) (NAD+) activity
- oxaloacetic acid (CHEBI:30744) → GO:0000104 succinate dehydrogenase activity; GO:0000908 taurine dioxygenase activity; GO:0001716 L-amino-acid oxidase activity
- pyruvic acid (CHEBI:32816) → GO:0000286 L-alanine dehydrogenase (NAD+) activity; GO:0001716 L-amino-acid oxidase activity; GO:0003941 L-serine ammonia-lyase activity
- D-glucose 6-phosphate (CHEBI:14314) → GO:0047641 aldose-6-phosphate reductase (NADPH) activity
- D-fructose 6-phosphate (CHEBI:15946) → GO:0034700 allulose 6-phosphate 3-epimerase activity; GO:0047905 fructose-6-phosphate phosphoketolase activity
- phosphoenolpyruvic acid (CHEBI:44897) → GO:0003849 3-deoxy-7-phosphoheptulonate synthase activity; GO:0003866 3-phosphoshikimate 1-carboxyvinyltransferase activity; GO:0004029 aldehyde dehydrogenase (NAD+) activity
- 3-phospho-D-glyceric acid (CHEBI:17794) → GO:0004617 phosphoglycerate dehydrogenase activity; GO:0004618 phosphoglycerate kinase activity; GO:0004619 phosphoglycerate mutase activity
- L-glutamic acid (CHEBI:16015) → GO:0000107 imidazoleglycerol-phosphate synthase activity; GO:0000514 3-sulfino-L-alanine: proton, glutamate antiporter activity; GO:0000515 aspartate:glutamate, proton antiporter activity
- L-aspartic acid (CHEBI:17053) → GO:0000515 aspartate:glutamate, proton antiporter activity; GO:0001716 L-amino-acid oxidase activity; GO:0003948 N4-(beta-N-acetylglucosaminyl)-L-asparaginase activity
- L-alanine (CHEBI:16977) → GO:0000286 L-alanine dehydrogenase (NAD+) activity; GO:0001716 L-amino-acid oxidase activity; GO:0004021 L-alanine:2-oxoglutarate transaminase activity
- glycine (CHEBI:15428) → GO:0003870 5-aminolevulinate synthase activity; GO:0004029 aldehyde dehydrogenase (NAD+) activity; GO:0004046 aminoacylase activity
- L-serine (CHEBI:17115) → GO:0001716 L-amino-acid oxidase activity; GO:0003882 CDP-diacylglycerol-serine O-phosphatidyltransferase activity; GO:0003884 D-amino-acid oxidase activity
- ATP (CHEBI:15422) → GO:0000285 1-phosphatidylinositol-3-phosphate 5-kinase activity; GO:0000309 nicotinamide-nucleotide adenylyltransferase activity; GO:0000823 inositol-1,4,5-trisphosphate 6-kinase activity
- ADP (CHEBI:16761) → GO:0000285 1-phosphatidylinositol-3-phosphate 5-kinase activity; GO:0000823 inositol-1,4,5-trisphosphate 6-kinase activity; GO:0000824 inositol-1,4,5,6-tetrakisphosphate 3-kinase activity
- AMP (CHEBI:18374) → GO:0004635 phosphoribosyl-AMP cyclohydrolase activity; GO:0004636 phosphoribosyl-ATP diphosphatase activity
- NAD(+) (CHEBI:15846) → GO:0000210 NAD+ diphosphatase activity; GO:0000215 tRNA 2'-phosphotransferase activity; GO:0000286 L-alanine dehydrogenase (NAD+) activity
- acetyl-CoA (CHEBI:15351) → GO:0000225 N-acetylglucosaminylphosphatidylinositol deacetylase activity; GO:0003841 1-acylglycerol-3-phosphate O-acyltransferase activity; GO:0003846 2-acylglycerol O-acyltransferase activity
- succinyl-CoA (CHEBI:15380) → GO:0003841 1-acylglycerol-3-phosphate O-acyltransferase activity; GO:0003846 2-acylglycerol O-acyltransferase activity; GO:0003852 2-isopropylmalate synthase activity
Residual misses (after both normalization tiers)
Resolved to ChEBI but matched no Rhea reaction even after protonation and
skeleton normalization — typically derivatives Rhea represents only in a
conjugated/acylated form, or compounds genuinely absent from Rhea.
- L-lactic acid (CHEBI:53408, family size 1)
Method / reproducibility
- ChEBI access + protonation traversal: chebi.py (OLS4).
- Rhea network + rhea2go: rhea.py (Rhea REST, GO external2go).
- Caches under .cache/ (gitignored); delete to force a fresh pull.
- This is a coverage probe (does the bridge connect?), not yet a
statistical enrichment; GO-BP lift + ORA are the next step (see the
project page).
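For readers unfamiliar with the rhea2go step, GO external2go files are plain text with one mapping per line in the form `RHEA:<id> > GO:<term name> ; GO:<id>`, and comment lines start with `!`. The sketch below parses that layout into a Rhea→GO lookup. It is not the code in rhea.py: the function name is hypothetical and the Rhea ids in the sample lines are placeholders, paired with GO terms listed in this report purely for illustration.

```python
def parse_external2go(lines):
    """Map each source id (e.g. 'RHEA:...') to the set of GO ids it points to."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # skip blanks and header/comment lines
        source, sep, target = line.partition(" > ")
        if not sep:
            continue  # not a mapping line
        go_id = target.rsplit(" ; ", 1)[-1].strip()
        mapping.setdefault(source.strip(), set()).add(go_id)
    return mapping


# Placeholder Rhea ids; the pairing with GO terms is illustrative only.
sample = [
    "! example header comment",
    "RHEA:00001 > GO:aconitate hydratase activity ; GO:0003994",
    "RHEA:00002 > GO:succinate dehydrogenase activity ; GO:0000104",
    "RHEA:00002 > GO:taurine dioxygenase activity ; GO:0000908",
]

rhea_to_go = parse_external2go(sample)
print(rhea_to_go["RHEA:00001"])  # {'GO:0003994'}
```

Splitting on the last ` ; ` keeps the parser robust to term names that contain semicolons or colons.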