YgfF is a glucose 1-dehydrogenase (EC 1.1.1.47) belonging to the short-chain dehydrogenases/reductases (SDR) family (SDR63C subgroup). It catalyzes the NAD(+)-dependent oxidation of D-glucose to D-glucono-1,5-lactone, which spontaneously hydrolyzes to D-gluconate. The enzymatic function was predicted by the DeepECtransformer deep learning tool (prediction score 0.6331) and experimentally validated in vitro by Kim et al. 2023, who measured a specific activity of 305.55 U/mg, comparable to characterized glucose 1-dehydrogenases from other organisms. Full kinetic parameters (Km, kcat) have not yet been determined. YgfF was one of only three genuinely correct novel predictions (out of 464) made by DeepECtransformer for the E. coli y-ome. The protein contains a conserved NAD(P)-binding Rossmann-fold domain and predicted binding sites for both NAD(+) and D-glucose. A physical interaction with LpdA (dihydrolipoyl dehydrogenase) was detected by affinity purification-mass spectrometry, though the biological significance of this interaction is unclear. The physiological role of YgfF in E. coli metabolism remains to be established.
| GO Term | Evidence | Action | Reason |
|---|---|---|---|
| GO:0016491<br>oxidoreductase activity | IBA<br>GO_REF:0000033 | ACCEPT | Summary: YgfF is experimentally confirmed as a glucose 1-dehydrogenase (EC 1.1.1.47), which is a type of oxidoreductase (PMID:37963869). The IBA annotation to oxidoreductase activity is correct but much less specific than what is now known. The more specific term GO:0047934 (glucose 1-dehydrogenase (NAD+) activity) is supported by direct experimental evidence.<br>Reason: This IBA annotation is consistent with the experimentally validated function of YgfF. While more specific terms exist (and are annotated separately), the IBA at this level is not wrong and reflects phylogenetic inference that is consistent with experimental data. Glucose 1-dehydrogenase activity is a subtype of oxidoreductase activity.<br>Supporting Evidence:<br>PMID:37963869<br>YgfF exhibited a specific glucose 1-dehydrogenase activity of 305.55 U mg−1 |
| GO:0016614<br>oxidoreductase activity, acting on CH-OH group of donors | IEA<br>GO_REF:0000117 | ACCEPT | Summary: This IEA annotation from ARBA is consistent with the experimentally validated glucose 1-dehydrogenase activity. Glucose 1-dehydrogenase acts on the CH-OH group of D-glucose, so this intermediate-level annotation is correct. It is less specific than GO:0047934 but appropriately reflects the ARBA computational prediction.<br>Reason: The term is a correct parent of the experimentally validated specific function (glucose 1-dehydrogenase (NAD+) activity). The IEA evidence code is appropriate for a computationally derived annotation, and the term is consistent with the SDR family classification and the known catalytic mechanism.<br>Supporting Evidence:<br>PMID:37963869<br>For YgfF, DeepECtransformer predicted its EC number to be EC:1.1.1.47 (glucose 1-dehydrogenase). |
| GO:0047934<br>glucose 1-dehydrogenase (NAD+) activity | IEA<br>GO_REF:0000116 | ACCEPT | Summary: This IEA annotation is derived from Rhea reaction mapping (RHEA:14293), which corresponds to the reaction D-glucose + NAD(+) = D-glucono-1,5-lactone + NADH + H(+). This matches the experimentally validated catalytic activity of YgfF exactly as described by Kim et al. 2023 and annotated by UniProt (PMID:37963869).<br>Reason: The Rhea-derived IEA correctly captures the specific enzymatic activity of YgfF. This is also independently supported by the IDA annotation from PMID:37963869 using the same GO term.<br>Supporting Evidence:<br>PMID:37963869<br>YgfF exhibited a specific glucose 1-dehydrogenase activity of 305.55 U mg−1 |
| GO:0047936<br>glucose 1-dehydrogenase [NAD(P)+] activity | IEA<br>GO_REF:0000003 | MODIFY | Summary: This IEA annotation is derived from the EC number mapping (EC:1.1.1.47). GO:0047936 describes glucose 1-dehydrogenase that can use either NAD+ or NADP+ as cofactor, which matches the formal scope of EC 1.1.1.47 (glucose 1-dehydrogenase [NAD(P)+]); the NAD+-specific and NADP+-specific enzymes have their own entries, EC 1.1.1.118 and EC 1.1.1.119. However, the experimental validation by Kim et al. used an NAD+-dependent assay (PMID:37963869), and activity with NADP+ has not been reported. UniProt annotates the catalytic activity with the Rhea reaction that specifies NAD+. The NAD+-specific term GO:0047934 is therefore the better-supported annotation.<br>Reason: The EC:1.1.1.47 mapping to GO:0047936 is formally consistent with the EC entry but broader than the available evidence, since only NAD+-dependent activity has been demonstrated. The experimental data from Kim et al. 2023 validated activity using an NAD+-dependent assay kit, and UniProt annotates the Rhea reaction (RHEA:14293) specifically with NAD+. Until NADP+ use is tested, the NAD+-specific GO term GO:0047934 is more accurate.<br>Proposed replacements:<br>glucose 1-dehydrogenase (NAD+) activity<br>Supporting Evidence:<br>PMID:37963869<br>YgfF exhibited a specific glucose 1-dehydrogenase activity of 305.55 U mg−1 |
| GO:0005515<br>protein binding | IPI<br>PMID:15690043 Interaction network containing conserved and essential prote... | MARK AS OVER ANNOTATED | Summary: This annotation is based on a high-throughput affinity purification-mass spectrometry study by Butland et al. 2005 that detected a physical interaction between YgfF and LpdA (dihydrolipoyl dehydrogenase, P0A9P0). The interaction is recorded in IntAct with 2 experiments supporting it. However, the GO term GO:0005515 (protein binding) is uninformative per curation guidelines and does not convey any specific functional information about this interaction.<br>Reason: Per curation guidelines, GO:0005515 (protein binding) is too vague and uninformative. The Butland et al. 2005 study was a large-scale screen that detected many interactions, and the biological significance of the YgfF-LpdA interaction is unknown. LpdA functions in the pyruvate dehydrogenase and 2-oxoglutarate dehydrogenase complexes, and there is no clear functional connection to glucose 1-dehydrogenase activity. Without understanding the functional nature of this interaction, a generic protein binding annotation provides little value.<br>Supporting Evidence:<br>PMID:15690043<br>no large-scale analysis of protein complexes in Escherichia coli has yet been reported. To this end, we have targeted DNA cassettes into the E. coli chromosome to create carboxy-terminal, affinity-tagged alleles of 1,000 open reading frames |
| GO:0047934<br>glucose 1-dehydrogenase (NAD+) activity | IDA<br>PMID:37963869 Functional annotation of enzyme-encoding genes using deep le... | ACCEPT | Summary: This is the key experimentally validated annotation. Kim et al. 2023 expressed and purified recombinant His-tagged YgfF from E. coli BL21(DE3) and measured glucose 1-dehydrogenase activity in vitro using a colorimetric GDH assay kit. The specific activity was 305.55 U/mg, comparable to the previously reported value of 205.70 U/mg for glucose 1-dehydrogenase from Lysinibacillus sphaericus. The activity was predicted by DeepECtransformer with EC number EC:1.1.1.47 and confirmed by the enzyme assay. This represents a validated core function.<br>Reason: Direct experimental evidence from in vitro enzyme assay demonstrates glucose 1-dehydrogenase (NAD+) activity. The specific activity (305.55 U/mg) is robust and comparable to characterized homologs. This is the most specific and well-supported annotation for YgfF. UniProt has adopted this function based on this study (EC:1.1.1.47).<br>Supporting Evidence:<br>PMID:37963869<br>For YgfF, DeepECtransformer predicted its EC number to be EC:1.1.1.47 (glucose 1-dehydrogenase). The enzyme assay results showed that YgfF exhibited a specific glucose 1-dehydrogenase activity of 305.55 U mg−1<br>PMID:37963869<br>which was comparable to the previously reported value of 205.70 U mg−1 for the glucose 1-dehydrogenase from Lysinibacillus sphaericus G10<br>file:ECOLI/ygfF/ygfF-deep-research-falcon.md<br>Falcon deep research confirms YgfF as EC 1.1.1.47 glucose 1-dehydrogenase validated by in vitro assay with 305.55 U/mg specific activity, and notes SDR63C subgroup classification supports this assignment. |
| GO:0051287<br>NAD binding | IDA<br>PMID:37963869 Functional annotation of enzyme-encoding genes using deep le... | NEW | Summary: YgfF requires NAD+ as a cofactor for its glucose 1-dehydrogenase activity. The catalytic reaction (D-glucose + NAD(+) = D-glucono-1,5-lactone + NADH + H(+)) directly involves NAD+ binding. UniProt annotates extensive NAD+ binding residues (positions 11, 13, 59, 60, 86, 88, 110, 156, 160, 189, 191, 194) based on similarity to characterized SDR family members. The in vitro enzyme assay demonstrating NAD+-dependent glucose oxidation provides experimental evidence for NAD binding.<br>Reason: The experimentally validated glucose 1-dehydrogenase activity requires NAD+ as a cofactor, and UniProt annotates multiple NAD+ binding residues. NAD binding is an inherent aspect of the catalytic mechanism and should be annotated. This is not currently present in the GO annotations.<br>Supporting Evidence:<br>PMID:37963869<br>For YgfF, DeepECtransformer predicted its EC number to be EC:1.1.1.47 (glucose 1-dehydrogenase). The enzyme assay results showed that YgfF exhibited a specific glucose 1-dehydrogenase activity of 305.55 U mg−1 |
| GO:0019521<br>D-gluconate metabolic process | IDA<br>PMID:37963869 Functional annotation of enzyme-encoding genes using deep le... | NEW | Summary: YgfF catalyzes the oxidation of D-glucose to D-glucono-1,5-lactone, which spontaneously hydrolyzes to D-gluconate. This places YgfF as a participant in D-gluconate metabolism. No biological process annotations currently exist for YgfF, yet the experimentally validated enzymatic activity directly implicates it in this metabolic pathway. However, the in vivo physiological role has not been established, so this annotation should be considered with caution.<br>Reason: There are currently no biological process annotations for YgfF, which is a significant gap. The experimentally validated glucose 1-dehydrogenase activity produces D-glucono-1,5-lactone (a precursor to D-gluconate), directly linking YgfF to D-gluconate metabolic process. UniProt states the protein catalyzes the NAD(+)-dependent oxidation of D-glucose to D-gluconate via gluconolactone.<br>Supporting Evidence:<br>PMID:37963869<br>YgfF exhibited a specific glucose 1-dehydrogenase activity of 305.55 U mg−1 |
| GO:0005829<br>cytosol | IDA<br>PMID:37963869 Functional annotation of enzyme-encoding genes using deep le... | NEW | Summary: YgfF was expressed as a soluble protein and purified from the cytosolic fraction (supernatant after cell lysis and centrifugation) by Kim et al. 2023. The protein was predicted to be soluble by NetSolP and was successfully purified from the soluble fraction. While there are no dedicated localization studies, the solubility data and lack of any signal peptide or transmembrane domain strongly suggest cytosolic localization. No cellular component annotations currently exist for YgfF.<br>Reason: There are currently no cellular component annotations for YgfF, which is a gap. The protein was purified from the soluble cytoplasmic fraction and has no predicted signal peptide or transmembrane domains, consistent with cytosolic localization. However, the evidence is indirect (protein was soluble when overexpressed) rather than from a dedicated localization study.<br>Supporting Evidence:<br>PMID:37963869<br>179 proteins are predicted to be soluble in E. coli by NetSolP, a deep learning model for protein solubility prediction<br>PMID:37963869<br>Cell debris was separated by centrifugation at 15,044 × g for 40 min, and the resulting supernatants were loaded onto Talon metal affinity resin |
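The table accepts a broad term when it is an ancestor of the validated specific term. That reasoning can be sketched in a few lines of Python. The edge list below is a simplified assumption built only from the relationships stated in the table: it omits intermediate GO terms and is not a copy of the ontology. The names `PARENTS`, `ancestors`, and `implied_by` are illustrative.

```python
# Simplified is_a edges, taken from the parent/child statements in the table.
# ASSUMPTION: intermediate GO terms between these levels are left out.
PARENTS = {
    "GO:0047934": ["GO:0016614"],  # glucose 1-dehydrogenase (NAD+) activity
    "GO:0016614": ["GO:0016491"],  # oxidoreductase activity, acting on CH-OH group of donors
    "GO:0016491": [],              # oxidoreductase activity
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def implied_by(annotations, specific_term, parents=PARENTS):
    """List the annotated terms that are ancestors of `specific_term`."""
    implied = ancestors(specific_term, parents)
    return sorted(t for t in annotations if t in implied)


annotated = ["GO:0016491", "GO:0016614", "GO:0047934", "GO:0005515"]
print(implied_by(annotated, "GO:0047934"))
```

In this toy graph the two broader oxidoreductase terms are reported as implied by GO:0047934, while GO:0005515 (protein binding) is unrelated to it. A real check would load the full ontology rather than a hand-written edge list.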
Q: What is the physiological role of YgfF glucose 1-dehydrogenase activity in E. coli K-12 metabolism? Is it involved in glucose catabolism via the Entner-Doudoroff pathway or another metabolic route?
Suggested experts: Lee SY, Kim GB
Q: What is the biological significance of the YgfF-LpdA physical interaction detected by Butland et al. 2005? Does YgfF participate in a metabolic complex with pyruvate dehydrogenase components?
Suggested experts: Emili A, Butland G
Q: Does YgfF have any activity with NADP+ as cofactor, or is it strictly NAD+-dependent? The current annotations include both NAD+ and NAD(P)+ terms.
Suggested experts: Kim GB, Lee SY
Experiment: Construct a ygfF knockout in E. coli K-12 and test growth phenotypes on minimal media with glucose as the sole carbon source under aerobic and anaerobic conditions. Compare with wild-type to determine if YgfF contributes to glucose utilization in vivo.
Hypothesis: YgfF deletion affects growth on glucose as sole carbon source under specific metabolic conditions.
Type: growth phenotype assay
Experiment: Perform in vitro enzyme assays with purified YgfF using NADP+ instead of NAD+ as the cofactor to determine cofactor specificity. This would resolve whether GO:0047936 (NAD(P)+ form) or GO:0047934 (NAD+ specific) is the correct annotation.
Hypothesis: YgfF is NAD+-specific and does not use NADP+ as an electron acceptor.
Type: enzyme kinetics
Experiment: Perform co-purification experiments with tagged YgfF under physiological expression levels and test whether LpdA affects YgfF enzymatic activity in vitro. Also test if ygfF deletion affects pyruvate dehydrogenase complex activity.
Hypothesis: The YgfF-LpdA interaction has functional significance in glucose metabolism.
Type: protein interaction validation
YgfF DeepECTF prediction review. The DeepECTF prediction of glucose 1-dehydrogenase (EC 1.1.1.47) is a successful prediction, validated by SDR nomenclature classification (SDR63C subgroup) and consistent with published biochemical data.
provider: falcon
model: Edison Scientific Literature
cached: false
start_time: '2026-03-22T17:39:29.798295'
end_time: '2026-03-22T17:46:45.635027'
duration_seconds: 435.84
template_file: templates/gene_research_go_focused.md
template_variables:
organism: ECOLI
gene_id: ygfF
gene_symbol: ygfF
uniprot_accession: P52037
protein_description: 'RecName: Full=Glucose 1-dehydrogenase YgfF {ECO:0000303|PubMed:37963869};
EC=1.1.1.47 {ECO:0000269|PubMed:37963869};'
gene_info: Name=ygfF; Synonyms=yqfD; OrderedLocusNames=b2902, JW2870;
organism_full: Escherichia coli (strain K12).
protein_family: Belongs to the short-chain dehydrogenases/reductases (SDR)
protein_domains: NAD(P)-bd_dom_sf. (IPR036291); Sc_DH/Rdtase_CS. (IPR020904); SDR_fam.
(IPR002347); adh_short_C2 (PF13561)
provider_config:
timeout: 600
max_retries: 3
parameters:
allowed_domains: []
temperature: 0.1
citation_count: 13
BEFORE YOU BEGIN RESEARCH: You MUST verify you are researching the CORRECT gene/protein. Gene symbols can be ambiguous, especially for less well-characterized genes from non-model organisms.
DO NOT PROCEED WITH RESEARCH ON A DIFFERENT GENE. Instead:
- State clearly: "The gene symbol 'ygfF' is ambiguous or literature is limited for this specific protein"
- Explain what you found (e.g., "Found extensive literature on a different gene with the same symbol in a different organism")
- Describe the protein based ONLY on the UniProt information provided above
- Suggest that the protein function can be inferred from domain/family information
Please provide a comprehensive research report on the gene ygfF (gene ID: ygfF, UniProt: P52037) in ECOLI.
The research report should be a detailed narrative explaining the function, biological processes, and localization of the gene product. Citations should be given for all claims.
You should prioritize authoritative reviews and primary scientific literature when conducting research. You can supplement
this with annotations you find in gene/protein databases, but these can be outdated or inaccurate.
We are specifically interested in the primary function of the gene - for enzymes, what reaction is catalyzed, and what is the substrate specificity? For transporters, what is the substrate? For structural proteins or adapters, what is the broader structural role? For signaling molecules, what is the role in the pathway.
We are interested in where in or outside the cell the gene product carries out its function.
We are also interested in the signaling or biochemical pathways in which the gene functions. We are less interested in broad pleiotropic effects, except where these elucidate the precise role.
Include evidence where possible. We are interested in both experimental evidence as well as inference from structure, evolution, or bioinformatic analysis. Precise studies should be prioritized over high-throughput, where available.
Question: You are an expert researcher providing comprehensive, well-cited information.
Provide detailed information focusing on:
1. Key concepts and definitions with current understanding
2. Recent developments and latest research (prioritize 2023-2024 sources)
3. Current applications and real-world implementations
4. Expert opinions and analysis from authoritative sources
5. Relevant statistics and data from recent studies
Format as a comprehensive research report with proper citations. Include URLs and publication dates where available.
Always prioritize recent, authoritative sources and provide specific citations for all major claims.
This report concerns the Escherichia coli (strain K-12/MG1655) gene ygfF (synonym yqfD; locus b2902 / JW2870). Its product is curated as glucose 1-dehydrogenase YgfF (EC 1.1.1.47) and belongs to the short-chain dehydrogenases/reductases (SDR) superfamily, as also reflected in the SDR-family subgrouping discussed below. The key direct experimental evidence in the retrieved literature comes from a 2023 primary study that explicitly labels YgfF as EC 1.1.1.47 and biochemically validates this activity in vitro (kim2023functionalannotationof pages 6-7).
In E. coli K-12 MG1655, a substantial fraction of genes remains incompletely characterized. Kim et al. (published Nov 2023) report that roughly 30% of E. coli genes are still incompletely characterized, and define the “y-ome” as genes lacking sufficient experimental evidence for functional characterization (kim2023functionalannotationof pages 5-6). They operationalize this by assembling a y-ome protein set and applying machine-learning (ML) function prediction followed by targeted biochemical validation (kim2023functionalannotationof pages 5-6).
SDR enzymes are a large and diverse oxidoreductase superfamily. In the context of ygfF, an expert commentary evaluating ML annotation states that YgfF is a member of the SDR superfamily (InterPro family IPR002347) and can be classified into an SDR subgroup consistent with glucose 1-dehydrogenase function (crecylagard2025limitationsofcurrent pages 7-9). This family-level classification provides mechanistic plausibility for nicotinamide-dependent sugar oxidation/reduction, but does not by itself establish the physiological substrate or pathway role (crecylagard2025limitationsofcurrent pages 7-9).
A major recent development is the explicit experimental validation of YgfF’s enzymatic activity by Kim et al. (Nature Communications; publication month Nov 2023, DOI: 10.1038/s41467-023-43216-z, URL: https://doi.org/10.1038/s41467-023-43216-z). Their DeepECtransformer model predicted YgfF as EC:1.1.1.47 (glucose 1-dehydrogenase) with a neural-network score 0.6331, and they validated the activity biochemically (kim2023functionalannotationof pages 6-7).
Quantitative result: purified YgfF exhibited specific glucose 1-dehydrogenase activity of 305.55 U mg−1 in vitro (kim2023functionalannotationof pages 6-7). In the same discussion, the authors compare this activity to a previously reported glucose 1-dehydrogenase from Lysinibacillus sphaericus (reported 205.70 U mg−1), as a rough benchmarking of magnitude (kim2023functionalannotationof pages 6-7).
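As a quick check on the benchmarking above, the two reported specific activities can be compared as a simple ratio. This is only arithmetic on the two figures quoted in the text; the variable names are illustrative, and the comparison assumes the two values were measured under comparable conditions, which the retrieved sources do not confirm.

```python
# Specific activities quoted in the text (U per mg protein).
ygff_specific_activity = 305.55       # E. coli YgfF, Kim et al. 2023
reference_specific_activity = 205.70  # L. sphaericus G10 enzyme, as cited by Kim et al.

# Fold difference between the two reported values.
fold = ygff_specific_activity / reference_specific_activity
print(f"YgfF / reference = {fold:.2f}-fold")
```

The ratio comes out at roughly 1.5-fold, which is consistent with the authors describing the two values as comparable in magnitude rather than identical.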
Cofactor indication (from figure evidence): the experimental validation figure includes a reaction scheme labeling NAD → NADH + H+, indicating the assay/interpretation uses NAD as the oxidizing cofactor for the glucose dehydrogenase reaction scheme (kim2023functionalannotationof media 81d7b013). The retrieved text excerpts do not provide an explicit cofactor preference comparison versus NADP (e.g., kinetic preference), but the figure provides direct visual support that NAD is the depicted cofactor for the validated reaction (kim2023functionalannotationof media 81d7b013).
Assay implementation details: Kim et al. report overexpression of His-tagged YgfF in E. coli BL21(DE3), purification by metal affinity resin, and use of a glucose dehydrogenase colorimetric kit with OD450 readout at 37 °C, with reaction mixtures containing assay buffer, developer, and glucose substrate (kim2023functionalannotationof pages 7-8). These details establish that the validation was performed on purified recombinant protein and confirm glucose was used as substrate under the assay conditions (kim2023functionalannotationof pages 7-8).
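To make the link between a colorimetric readout and a specific activity concrete, here is a minimal sketch of the usual unit conversion. It assumes that OD450 has already been converted to nmol NADH with a standard curve and that 1 U equals 1 µmol product per minute; the kit used by Kim et al. may define its units differently. The function name and the example readings are hypothetical and are not values from the paper.

```python
def specific_activity_u_per_mg(nadh_nmol_start, nadh_nmol_end, minutes, protein_ug):
    """Specific activity in U/mg, taking 1 U = 1 umol product per minute.

    nmol/min divided by ug protein is numerically equal to umol/min per mg,
    so no extra conversion factor is needed.
    """
    if minutes <= 0 or protein_ug <= 0:
        raise ValueError("minutes and protein_ug must be positive")
    rate_nmol_per_min = (nadh_nmol_end - nadh_nmol_start) / minutes
    return rate_nmol_per_min / protein_ug


# HYPOTHETICAL readings for illustration only (not data from Kim et al. 2023).
example = specific_activity_u_per_mg(
    nadh_nmol_start=2.0,   # nmol NADH at the first time point
    nadh_nmol_end=12.0,    # nmol NADH at the second time point
    minutes=5.0,           # time between the two readings
    protein_ug=0.01,       # purified enzyme in the well
)
print(f"{example:.1f} U/mg")
```

The two readings must fall in the linear range of both the standard curve and the reaction time course, otherwise the computed rate underestimates the true activity.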
A 2024 review (Wohlgemuth, Life, Mar 2024, DOI: 10.3390/life14030364, URL: https://doi.org/10.3390/life14030364) highlights the DeepECtransformer study as an example of assigning enzyme functions to previously unannotated proteins, explicitly noting YgfF as a predicted glucose 1-dehydrogenase and that in vitro enzyme assays were performed on overexpressed and affinity-purified YgfF (wohlgemuth2024backtothe pages 3-6). The review stresses that function assignment for unusual/unknown enzymes often requires extensive experimental work (expression/purification, substrate synthesis, analytical methods, kinetic characterization such as kcat and KM), and emphasizes the need for protein- and time-dependent catalysis demonstrations (wohlgemuth2024backtothe pages 3-6). This is relevant because the current YgfF evidence base (in the retrieved sources) includes a specific activity but not full kinetic constants (wohlgemuth2024backtothe pages 3-6).
A later expert analysis evaluating current ML model limitations notes that YgfF’s case is best understood as propagation of a known SDR subgroup function: YgfF is predicted (using SDR subgroup HMMs) to belong to SDR63C / glucose 1-dehydrogenase subgroup, consistent with Kim et al.’s prediction and in vitro validation (crecylagard2025limitationsofcurrent pages 7-9). Critically, this analysis argues that in vitro activity alone is insufficient to establish physiological (in vivo) function and that best practice is combining biochemical and genetic evidence (crecylagard2025limitationsofcurrent pages 7-9). This caveat directly impacts how confidently ygfF can be placed into an E. coli pathway based on currently retrieved evidence.
The immediate “real-world implementation” of the YgfF result is its role as a benchmark case for ML-assisted functional annotation of uncharacterized microbial genes. Kim et al. used E. coli K-12 MG1655 as a model genome and report that their approach predicted EC numbers for 464 y-ome proteins (with 390 receiving full four-digit EC predictions) and then experimentally validated a subset including YgfF (kim2023functionalannotationof pages 5-6). YgfF thus serves as an example of pairing computational prediction with rapid biochemical testing, a workflow increasingly used in genome annotation and metabolic model refinement efforts (kim2023functionalannotationof pages 5-6, wohlgemuth2024backtothe pages 3-6).
While glucose 1-dehydrogenases are widely used as redox biocatalysts or in sugar oxidation contexts, the retrieved evidence does not establish YgfF’s performance characteristics beyond specific activity under kit conditions, nor its substrate scope, stability, or engineering history. Therefore, any industrial/biotech deployment claims for E. coli YgfF specifically are not supported by the retrieved sources and are not asserted here.
The expert commentary on ML annotation emphasizes a key interpretive point: YgfF’s glucose 1-dehydrogenase activity is biochemically supported, but physiological function in the native organism requires additional evidence, ideally including genetics (knockout/phenotyping, complementation) and pathway context (crecylagard2025limitationsofcurrent pages 7-9). The 2024 review similarly emphasizes the broader methodological standard of deeper characterization (including kinetic parameters and analytical validation) when asserting metabolic roles (wohlgemuth2024backtothe pages 3-6).
Kim et al. quantify the annotation shortfall and the need for bridging EC/GO mappings: they note that among fully specified four-digit EC numbers, as of July 2023 only 5,216 of 8,056 had corresponding GO terms, implying substantial ontology linkage gaps even when enzymatic functions exist (kim2023functionalannotationof pages 6-7). This supports the view that systematic function discovery and careful curation are still necessary.
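The counts quoted from Kim et al. are easier to compare as percentages. The short sketch below only converts the figures given in this report (5,216 of 8,056 EC numbers with GO terms, and 390 of 464 y-ome proteins with four-digit EC predictions); the helper name `percent` is illustrative.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Four-digit EC numbers with a corresponding GO term (as of July 2023).
ec_with_go = percent(5216, 8056)
ec_without_go = 8056 - 5216

# y-ome proteins that received a full four-digit EC prediction.
full_ec = percent(390, 464)

print(f"EC numbers with GO terms: {ec_with_go:.1f}% ({ec_without_go} without)")
print(f"Predictions with four-digit EC numbers: {full_ec:.1f}%")
```

So about 65% of four-digit EC numbers had a GO counterpart, leaving 2,840 without one, and about 84% of the y-ome predictions were resolved to four digits.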
The most strongly supported conclusion from recent primary evidence is that E. coli K-12 YgfF catalyzes glucose 1-dehydrogenase activity (EC 1.1.1.47) in vitro, with high specific activity under the employed assay conditions (305.55 U mg−1) (kim2023functionalannotationof pages 6-7). The validation figure’s reaction scheme indicates NAD reduction to NADH + H+ coupled to glucose oxidation (kim2023functionalannotationof media 81d7b013). Together, these data support a biochemical role as an NAD-dependent glucose dehydrogenase under the tested conditions (kim2023functionalannotationof media 81d7b013, kim2023functionalannotationof pages 6-7).
The retrieved evidence does not provide:
- A substrate panel demonstrating specificity beyond glucose (only glucose is directly described in assay conditions) (kim2023functionalannotationof pages 7-8).
- Michaelis–Menten parameters (Km, kcat) or mechanistic kinetic order (wohlgemuth2024backtothe pages 3-6).
- Definitive NAD vs NADP preference by comparative kinetics; only NAD is shown in the figure scheme (kim2023functionalannotationof media 81d7b013).
- Native subcellular localization measurements (e.g., cytosolic vs periplasmic) or genetic/pathway linkage in E. coli K-12 (crecylagard2025limitationsofcurrent pages 7-9, wohlgemuth2024backtothe pages 3-6).
Accordingly, the most evidence-consistent interpretation is that YgfF is a soluble SDR oxidoreductase capable of catalyzing glucose oxidation in vitro with NAD as depicted cofactor, but its physiological substrate(s), pathway integration, and cellular compartment of action remain unresolved in the retrieved literature and require in vivo validation (crecylagard2025limitationsofcurrent pages 7-9, wohlgemuth2024backtothe pages 3-6).
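One of the gaps listed above is the absence of Michaelis-Menten parameters. The sketch below shows one simple way such constants could be estimated from initial-rate data: a Hanes-Woolf linearization fitted by ordinary least squares. It is not the method of any cited study, and the data are synthetic, generated from assumed constants purely to show that the fit recovers them. No YgfF kinetic measurements exist in the retrieved sources.

```python
def hanes_woolf_fit(substrate, velocity):
    """Estimate (Vmax, Km) by least squares on S/v = S/Vmax + Km/Vmax."""
    if len(substrate) != len(velocity) or len(substrate) < 2:
        raise ValueError("need at least two matched (S, v) points")
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, velocity)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # equals 1 / Vmax
    intercept = mean_y - slope * mean_x    # equals Km / Vmax
    return 1.0 / slope, intercept / slope


# SYNTHETIC, noise-free data from assumed constants (not measurements).
TRUE_VMAX = 100.0  # arbitrary rate units
TRUE_KM = 2.0      # arbitrary concentration units
conc = [0.5, 1.0, 2.0, 5.0, 10.0]
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in conc]

vmax, km = hanes_woolf_fit(conc, rates)
print(f"Vmax = {vmax:.2f}, Km = {km:.2f}")
```

With real data, kcat would follow from Vmax divided by the molar enzyme concentration, and a nonlinear fit is generally preferred because linearizations distort measurement error.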
| Property | Finding for E. coli K-12 YgfF (UniProt P52037) | Evidence type | Localization / solubility note | Key reference |
|---|---|---|---|---|
| Gene/protein identity | YgfF from Escherichia coli K-12/MG1655 y-ome; UniProt-linked annotation aligns with an SDR-family oxidoreductase later assigned as glucose 1-dehydrogenase (kim2023functionalannotationof pages 6-7, crecylagard2025limitationsofcurrent pages 7-9) | Database-linked annotation + literature synthesis | No direct subcellular localization experimentally reported in the retrieved sources; treated as a soluble recombinant protein in validation experiments (kim2023functionalannotationof pages 7-8, wohlgemuth2024backtothe pages 3-6) | Kim et al., 2023, Nat Commun, doi:10.1038/s41467-023-43216-z, https://doi.org/10.1038/s41467-023-43216-z |
| Predicted function / EC number | DeepECtransformer predicted YgfF as EC 1.1.1.47, glucose 1-dehydrogenase, with prediction score 0.6331 (kim2023functionalannotationof pages 6-7) | ML prediction | Among uniquely predicted 4-digit EC proteins, many were predicted soluble in E. coli by NetSolP, but no YgfF-specific solubility value was reported (kim2023functionalannotationof pages 5-6) | Kim et al., 2023, Nat Commun, doi:10.1038/s41467-023-43216-z, https://doi.org/10.1038/s41467-023-43216-z |
| Family/subgroup assignment | YgfF is in the Short-Chain Dehydrogenase/Reductase (SDR) superfamily (IPR002347) and predicted to fall in the SDR63C / glucose 1-dehydrogenase subgroup (crecylagard2025limitationsofcurrent pages 7-9) | HMM / family classification | Family assignment supports a soluble cytosolic enzyme-like oxidoreductase interpretation, but no direct localization experiment was cited for YgfF (crecylagard2025limitationsofcurrent pages 7-9) | de Crécy-Lagard et al., 2025 preprint, doi:10.1101/2024.07.01.601547, https://doi.org/10.1101/2024.07.01.601547 |
| Reaction / cofactor | Figure-linked reaction scheme indicates glucose oxidation coupled to NAD reduction to NADH + H+, consistent with glucose 1-dehydrogenase activity and NAD dependence rather than an explicitly demonstrated NADP preference (kim2023functionalannotationof media 81d7b013, kim2023functionalannotationof pages 6-7) | Figure-supported biochemical interpretation | No intracellular compartment or membrane association evidence reported; recombinant purified enzyme assayed in vitro (kim2023functionalannotationof media 81d7b013, kim2023functionalannotationof pages 7-8) | Kim et al., 2023, Nat Commun, doi:10.1038/s41467-023-43216-z, https://doi.org/10.1038/s41467-023-43216-z |
| Experimental validation | Purified His-tagged YgfF showed specific glucose 1-dehydrogenase activity of 305.55 U mg−1 in vitro; authors compared this with 205.70 U mg−1 reported for a characterized Lysinibacillus sphaericus glucose 1-dehydrogenase (kim2023functionalannotationof pages 6-7) | In vitro enzyme assay | Overexpressed in E. coli BL21(DE3), purified by metal-affinity resin; this supports biochemical tractability/solubility but not native localization (kim2023functionalannotationof pages 7-8) | Kim et al., 2023, Nat Commun, doi:10.1038/s41467-023-43216-z, https://doi.org/10.1038/s41467-023-43216-z |
| Assay conditions | Validation used a glucose dehydrogenase colorimetric kit with glucose substrate, GDH assay buffer, developer, and OD450 readout at 37 °C; confirms activity with glucose under assay conditions but does not define broader substrate range or kinetic constants (kim2023functionalannotationof pages 7-8) | In vitro assay protocol | Assay performed on purified protein; no localization inference beyond soluble preparation (kim2023functionalannotationof pages 7-8) | Kim et al., 2023, Nat Commun, doi:10.1038/s41467-023-43216-z, https://doi.org/10.1038/s41467-023-43216-z |
| Physiological role / pathway inference | Current evidence supports biochemical function as a glucose 1-dehydrogenase, but no in vivo pathway assignment, physiological substrate context, or genetic validation in E. coli K-12 was reported in the retrieved literature (kim2023functionalannotationof pages 6-7, crecylagard2025limitationsofcurrent pages 7-9, wohlgemuth2024backtothe pages 3-6) | Inference with explicit caution | Native localization remains unresolved in retrieved sources (crecylagard2025limitationsofcurrent pages 7-9, wohlgemuth2024backtothe pages 3-6) | Wohlgemuth, 2024, Life, doi:10.3390/life14030364, https://doi.org/10.3390/life14030364; de Crécy-Lagard et al., 2025 preprint, doi:10.1101/2024.07.01.601547, https://doi.org/10.1101/2024.07.01.601547 |
| Evidence limitations | Retrieved sources do not provide YgfF-specific Km, kcat, Vmax, structural data, substrate spectrum beyond glucose assay conditions, or definitive native localization; expert commentary cautions that in vitro activity alone is insufficient to establish physiological function (crecylagard2025limitationsofcurrent pages 7-9, wohlgemuth2024backtothe pages 3-6) | Expert analysis / review | Solubility/localization evidence is indirect only (purification, expression, model-wide solubility statistics) (kim2023functionalannotationof pages 7-8, kim2023functionalannotationof pages 5-6) | de Crécy-Lagard et al., 2025 preprint, doi:10.1101/2024.07.01.601547, https://doi.org/10.1101/2024.07.01.601547; Wohlgemuth, 2024, doi:10.3390/life14030364, https://doi.org/10.3390/life14030364 |
Table: This table compiles the main experimentally supported and predicted properties of E. coli K-12 YgfF, including its EC assignment, biochemical evidence, cofactor inference, and current evidence gaps. It is useful as a compact evidence map separating validated findings from family-based or model-based inference.
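
The comparison in the experimental-validation row can be checked with simple arithmetic. The sketch below is illustrative only: it takes the two specific activities quoted from Kim et al. 2023 and reports their ratio. The helper name `fold_difference` is hypothetical and not part of any cited tool, and the result says nothing about Km or kcat, which remain undetermined.

```python
# Specific activities quoted in the table above (U per mg protein).
YGFF_SPECIFIC_ACTIVITY = 305.55   # E. coli YgfF, Kim et al. 2023
LSPH_SPECIFIC_ACTIVITY = 205.70   # L. sphaericus glucose 1-dehydrogenase, as cited there


def fold_difference(value: float, reference: float) -> float:
    """Return value / reference; hypothetical helper for illustration."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return value / reference


fold = fold_difference(YGFF_SPECIFIC_ACTIVITY, LSPH_SPECIFIC_ACTIVITY)
percent_higher = (fold - 1.0) * 100.0

print(f"YgfF / L. sphaericus GDH = {fold:.2f}-fold ({percent_higher:.1f}% higher)")
# prints: YgfF / L. sphaericus GDH = 1.49-fold (48.5% higher)
```

A ratio of about 1.5 fits the authors' description of the two values as comparable, though the two measurements come from different studies and their assay conditions may differ.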
References
(kim2023functionalannotationof pages 6-7): Gi Bae Kim, Ji Yeon Kim, Jong An Lee, Charles J. Norsigian, Bernhard O. Palsson, and Sang Yup Lee. Functional annotation of enzyme-encoding genes using deep learning with transformer layers. Nature Communications, Nov 2023. URL: https://doi.org/10.1038/s41467-023-43216-z, doi:10.1038/s41467-023-43216-z.
(kim2023functionalannotationof pages 5-6): Gi Bae Kim, Ji Yeon Kim, Jong An Lee, Charles J. Norsigian, Bernhard O. Palsson, and Sang Yup Lee. Functional annotation of enzyme-encoding genes using deep learning with transformer layers. Nature Communications, Nov 2023. URL: https://doi.org/10.1038/s41467-023-43216-z, doi:10.1038/s41467-023-43216-z.
(crecylagard2025limitationsofcurrent pages 7-9): Valérie de Crécy-Lagard, Raquel Dias, Nick Sexson, Iddo Friedberg, Yifeng Yuan, and Manal A. Swairjo. Limitations of current machine-learning models in predicting enzymatic functions for uncharacterized proteins. bioRxiv, Jul 2025. URL: https://doi.org/10.1101/2024.07.01.601547, doi:10.1101/2024.07.01.601547.
(kim2023functionalannotationof media 81d7b013): Gi Bae Kim, Ji Yeon Kim, Jong An Lee, Charles J. Norsigian, Bernhard O. Palsson, and Sang Yup Lee. Functional annotation of enzyme-encoding genes using deep learning with transformer layers. Nature Communications, Nov 2023. URL: https://doi.org/10.1038/s41467-023-43216-z, doi:10.1038/s41467-023-43216-z.
(kim2023functionalannotationof pages 7-8): Gi Bae Kim, Ji Yeon Kim, Jong An Lee, Charles J. Norsigian, Bernhard O. Palsson, and Sang Yup Lee. Functional annotation of enzyme-encoding genes using deep learning with transformer layers. Nature Communications, Nov 2023. URL: https://doi.org/10.1038/s41467-023-43216-z, doi:10.1038/s41467-023-43216-z.
(wohlgemuth2024backtothe pages 3-6): Roland Wohlgemuth. Back to the future of metabolism - advances in the discovery and characterization of unknown biocatalytic functions and pathways. Life, 14:364, Mar 2024. URL: https://doi.org/10.3390/life14030364, doi:10.3390/life14030364.
id: P52037
gene_symbol: ygfF
product_type: PROTEIN
status: COMPLETE
taxon:
id: NCBITaxon:83333
label: Escherichia coli (strain K12)
description: YgfF is a glucose 1-dehydrogenase (EC 1.1.1.47) belonging to the short-chain
dehydrogenases/reductases (SDR) family (SDR63C subgroup). It catalyzes the NAD(+)-dependent
oxidation of D-glucose to D-glucono-1,5-lactone, which spontaneously hydrolyzes to
D-gluconate. The enzymatic function was predicted by the DeepECtransformer deep learning
tool (prediction score 0.6331) and experimentally validated in vitro by Kim et al.
2023, who measured a specific activity of 305.55 U/mg, comparable to characterized
glucose 1-dehydrogenases from other organisms. Full kinetic parameters (Km, kcat)
have not yet been determined. YgfF was one of only three genuinely correct novel
predictions (out of 464) made by DeepECtransformer for the E. coli y-ome. The protein contains a
conserved NAD(P)-binding Rossmann-fold domain and predicted binding sites for both
NAD(+) and D-glucose. A physical interaction with LpdA (dihydrolipoyl dehydrogenase)
was detected by affinity purification-mass spectrometry, though the biological
significance of this interaction is unclear. The physiological role of YgfF in E. coli
metabolism remains to be established.
existing_annotations:
- term:
id: GO:0016491
label: oxidoreductase activity
evidence_type: IBA
original_reference_id: GO_REF:0000033
review:
summary: YgfF is experimentally confirmed as a glucose 1-dehydrogenase (EC 1.1.1.47),
which is a type of oxidoreductase (PMID:37963869). The IBA annotation to oxidoreductase
activity is correct but much less specific than what is now known. The more specific
term GO:0047934 (glucose 1-dehydrogenase (NAD+) activity) is supported by direct
experimental evidence.
action: ACCEPT
reason: This IBA annotation is consistent with the experimentally validated function
of YgfF. While more specific terms exist (and are annotated separately), the IBA
at this level is not wrong and reflects phylogenetic inference that is consistent
with experimental data. Glucose 1-dehydrogenase activity is a subtype of oxidoreductase
activity.
supported_by:
- reference_id: PMID:37963869
supporting_text: YgfF exhibited a specific glucose 1-dehydrogenase activity of
305.55 U mg−1
- term:
id: GO:0016614
label: oxidoreductase activity, acting on CH-OH group of donors
evidence_type: IEA
original_reference_id: GO_REF:0000117
review:
summary: This IEA annotation from ARBA is consistent with the experimentally
validated glucose 1-dehydrogenase activity. Glucose 1-dehydrogenase acts on the
CH-OH group of D-glucose, so this intermediate-level annotation is correct. It is
less specific than GO:0047934 but appropriately reflects the ARBA computational
prediction.
action: ACCEPT
reason: The term is a correct parent of the experimentally validated specific function
(glucose 1-dehydrogenase (NAD+) activity). The IEA evidence code is appropriate
for a computationally derived annotation, and the term is consistent with the SDR
family classification and the known catalytic mechanism.
supported_by:
- reference_id: PMID:37963869
supporting_text: For YgfF, DeepECtransformer predicted its EC number to be
EC:1.1.1.47 (glucose 1-dehydrogenase).
- term:
id: GO:0047934
label: glucose 1-dehydrogenase (NAD+) activity
evidence_type: IEA
original_reference_id: GO_REF:0000116
review:
summary: This IEA annotation is derived from Rhea reaction mapping (RHEA:14293),
which corresponds to the reaction D-glucose + NAD(+) = D-glucono-1,5-lactone +
NADH + H(+). This matches the experimentally validated catalytic activity of YgfF
exactly as described by Kim et al. 2023 and annotated by UniProt (PMID:37963869).
action: ACCEPT
reason: The Rhea-derived IEA correctly captures the specific enzymatic activity of
YgfF. This is also independently supported by the IDA annotation from PMID:37963869
using the same GO term.
supported_by:
- reference_id: PMID:37963869
supporting_text: YgfF exhibited a specific glucose 1-dehydrogenase activity of
305.55 U mg−1
- term:
id: GO:0047936
label: glucose 1-dehydrogenase [NAD(P)+] activity
evidence_type: IEA
original_reference_id: GO_REF:0000003
review:
summary: This IEA annotation is derived from the EC number mapping (EC:1.1.1.47).
GO:0047936 describes glucose 1-dehydrogenase that can use either NAD+ or NADP+ as
cofactor, which matches the scope of EC 1.1.1.47 (glucose 1-dehydrogenase
[NAD(P)+]). However, the experimental validation by Kim et al. used an
NAD+-dependent assay (PMID:37963869), and activity with NADP+ has not been reported.
UniProt annotates the catalytic activity with the Rhea reaction that specifies NAD+.
The more precise term GO:0047934 (NAD+ specific) is therefore the better-supported
annotation.
action: MODIFY
reason: The EC:1.1.1.47 mapping to GO:0047936 is formally correct, since EC 1.1.1.47
covers glucose 1-dehydrogenases that accept NAD+ or NADP+ (the NAD+-specific and
NADP+-specific forms are EC 1.1.1.118 and EC 1.1.1.119, respectively). For YgfF,
however, the term is broader than the evidence supports. The experimental data
from Kim et al. 2023 validated activity using an NAD+-dependent assay kit, and
UniProt annotates the Rhea reaction (RHEA:14293) specifically with NAD+. The
NAD+-specific GO term GO:0047934 is more accurate.
proposed_replacement_terms:
- id: GO:0047934
label: glucose 1-dehydrogenase (NAD+) activity
supported_by:
- reference_id: PMID:37963869
supporting_text: YgfF exhibited a specific glucose 1-dehydrogenase activity of
305.55 U mg−1
- term:
id: GO:0005515
label: protein binding
evidence_type: IPI
original_reference_id: PMID:15690043
review:
summary: This annotation is based on a high-throughput affinity purification-mass
spectrometry study by Butland et al. 2005 that detected a physical interaction
between YgfF and LpdA (dihydrolipoyl dehydrogenase, P0A9P0). The interaction is
recorded in IntAct with 2 experiments supporting it. However, the GO term
GO:0005515 (protein binding) is uninformative per curation guidelines and does not
convey any specific functional information about this interaction.
action: MARK_AS_OVER_ANNOTATED
reason: Per curation guidelines, GO:0005515 (protein binding) is too vague and
uninformative. The Butland et al. 2005 study was a large-scale screen that detected
many interactions, and the biological significance of the YgfF-LpdA interaction is
unknown. LpdA functions in the pyruvate dehydrogenase and 2-oxoglutarate
dehydrogenase complexes, and there is no clear functional connection to glucose
1-dehydrogenase activity. Without understanding the functional nature of this
interaction, a generic protein binding annotation provides little value.
supported_by:
- reference_id: PMID:15690043
supporting_text: no large-scale analysis of protein complexes in Escherichia coli
has yet been reported. To this end, we have targeted DNA cassettes into the E.
coli chromosome to create carboxy-terminal, affinity-tagged alleles of 1,000 open
reading frames
- term:
id: GO:0047934
label: glucose 1-dehydrogenase (NAD+) activity
evidence_type: IDA
original_reference_id: PMID:37963869
review:
summary: This is the key experimentally validated annotation. Kim et al. 2023
expressed and purified recombinant His-tagged YgfF from E. coli BL21(DE3) and
measured glucose 1-dehydrogenase activity in vitro using a colorimetric GDH assay
kit. The specific activity was 305.55 U/mg, comparable to the previously reported
value of 205.70 U/mg for glucose 1-dehydrogenase from Lysinibacillus sphaericus.
The activity was predicted by DeepECtransformer with EC number EC:1.1.1.47 and
confirmed by the enzyme assay. This represents a validated core function.
action: ACCEPT
reason: Direct experimental evidence from in vitro enzyme assay demonstrates glucose
1-dehydrogenase (NAD+) activity. The specific activity (305.55 U/mg) is robust and
comparable to characterized homologs. This is the most specific and well-supported
annotation for YgfF. UniProt has adopted this function based on this study
(EC:1.1.1.47).
supported_by:
- reference_id: PMID:37963869
supporting_text: For YgfF, DeepECtransformer predicted its EC number to be
EC:1.1.1.47 (glucose 1-dehydrogenase). The enzyme assay results showed that YgfF
exhibited a specific glucose 1-dehydrogenase activity of 305.55 U mg−1
- reference_id: PMID:37963869
supporting_text: which was comparable to the previously reported value of 205.70 U
mg−1 for the glucose 1-dehydrogenase from Lysinibacillus sphaericus G10
- reference_id: file:ECOLI/ygfF/ygfF-deep-research-falcon.md
supporting_text: Falcon deep research confirms YgfF as EC 1.1.1.47 glucose
1-dehydrogenase validated by in vitro assay with 305.55 U/mg specific activity,
and notes SDR63C subgroup classification supports this assignment.
- term:
id: GO:0051287
label: NAD binding
evidence_type: IDA
original_reference_id: PMID:37963869
review:
summary: YgfF requires NAD+ as a cofactor for its glucose 1-dehydrogenase activity.
The catalytic reaction (D-glucose + NAD(+) = D-glucono-1,5-lactone + NADH + H(+))
directly involves NAD+ binding. UniProt annotates extensive NAD+ binding residues
(positions 11, 13, 59, 60, 86, 88, 110, 156, 160, 189, 191, 194) based on
similarity to characterized SDR family members. The in vitro enzyme assay
demonstrating NAD+-dependent glucose oxidation provides experimental evidence for
NAD binding.
action: NEW
reason: The experimentally validated glucose 1-dehydrogenase activity requires NAD+
as a cofactor, and UniProt annotates multiple NAD+ binding residues. NAD binding is
an inherent aspect of the catalytic mechanism and should be annotated. This is not
currently present in the GO annotations.
supported_by:
- reference_id: PMID:37963869
supporting_text: For YgfF, DeepECtransformer predicted its EC number to be
EC:1.1.1.47 (glucose 1-dehydrogenase). The enzyme assay results showed that YgfF
exhibited a specific glucose 1-dehydrogenase activity of 305.55 U mg−1
- term:
id: GO:0019521
label: D-gluconate metabolic process
evidence_type: IDA
original_reference_id: PMID:37963869
review:
summary: YgfF catalyzes the oxidation of D-glucose to D-glucono-1,5-lactone, which
spontaneously hydrolyzes to D-gluconate. This places YgfF as a participant in
D-gluconate metabolism. No biological process annotations currently exist for YgfF,
yet the experimentally validated enzymatic activity directly implicates it in this
metabolic pathway. However, the in vivo physiological role has not been established,
so this annotation should be considered with caution.
action: NEW
reason: There are currently no biological process annotations for YgfF, which is a
significant gap. The experimentally validated glucose 1-dehydrogenase activity
produces D-glucono-1,5-lactone (a precursor to D-gluconate), directly linking YgfF
to D-gluconate metabolic process. UniProt states the protein catalyzes the
NAD(+)-dependent oxidation of D-glucose to D-gluconate via gluconolactone.
supported_by:
- reference_id: PMID:37963869
supporting_text: YgfF exhibited a specific glucose 1-dehydrogenase activity of
305.55 U mg−1
- term:
id: GO:0005829
label: cytosol
evidence_type: IDA
original_reference_id: PMID:37963869
review:
summary: YgfF was expressed as a soluble protein and purified from the cytosolic
fraction (supernatant after cell lysis and centrifugation) by Kim et al. 2023. The
protein was predicted to be soluble by NetSolP and was successfully purified from
the soluble fraction. While there are no dedicated localization studies, the
solubility data and lack of any signal peptide or transmembrane domain strongly
suggest cytosolic localization. No cellular component annotations currently exist
for YgfF.
action: NEW
reason: There are currently no cellular component annotations for YgfF, which is a
gap. The protein was purified from the soluble cytoplasmic fraction and has no
predicted signal peptide or transmembrane domains, consistent with cytosolic
localization. However, the evidence is indirect (protein was soluble when
overexpressed) rather than from a dedicated localization study.
supported_by:
- reference_id: PMID:37963869
supporting_text: 179 proteins are predicted to be soluble in E. coli by NetSolP, a
deep learning model for protein solubility prediction
- reference_id: PMID:37963869
supporting_text: Cell debris was separated by centrifugation at 15,044 × g for
40 min, and the resulting supernatants were loaded onto Talon metal affinity resin
references:
- id: GO_REF:0000003
title: Gene Ontology annotation based on Enzyme Commission mapping
findings: []
- id: GO_REF:0000033
title: Annotation inferences using phylogenetic trees
findings: []
- id: GO_REF:0000116
title: Automatic Gene Ontology annotation based on Rhea mapping
findings: []
- id: GO_REF:0000117
title: Electronic Gene Ontology annotations created by ARBA machine learning models
findings: []
- id: PMID:15690043
title: Interaction network containing conserved and essential protein complexes in
Escherichia coli.
findings:
- statement: High-throughput affinity purification-mass spectrometry detected a physical
interaction between YgfF (P52037) and LpdA (P0A9P0, dihydrolipoyl dehydrogenase).
supporting_text: no large-scale analysis of protein complexes in Escherichia coli
has yet been reported. To this end, we have targeted DNA cassettes into the E.
coli chromosome to create carboxy-terminal, affinity-tagged alleles of 1,000 open
reading frames
- id: PMID:37963869
title: Functional annotation of enzyme-encoding genes using deep learning with transformer
layers.
findings:
- statement: DeepECtransformer predicted YgfF to have EC number EC:1.1.1.47 (glucose
1-dehydrogenase) and this was experimentally validated by in vitro enzyme assay
with a specific activity of 305.55 U/mg.
supporting_text: For YgfF, DeepECtransformer predicted its EC number to be
EC:1.1.1.47 (glucose 1-dehydrogenase). The enzyme assay results showed that YgfF
exhibited a specific glucose 1-dehydrogenase activity of 305.55 U mg−1
- statement: YgfF was one of three randomly selected y-ome proteins whose predicted
enzymatic functions were experimentally validated, demonstrating the utility of
DeepECtransformer for functional annotation.
supporting_text: we randomly selected three proteins, YgfF, YciO, and YdjM, that are
predicted to be oxidoreductase, transferase, and hydrolase, respectively
- statement: The neural network predicted YgfF's function despite higher sequence
identity to a different enzyme (EC:1.1.1.100), showing the model learned functional
motifs rather than relying solely on homology.
supporting_text: It should be noted that although YgfF exhibited a higher sequence
identity with a different enzyme (A0A069CGU9_ECOLX; EC:1.1.1.100) from the
training dataset than glucose 1-dehydrogenase exhibiting the maximum sequence
identity within the training dataset, the neural network made an accurate prediction
- id: DOI:10.3390/life14030364
title: Back to the future of metabolism - advances in the discovery and characterization
of unknown biocatalytic functions and pathways.
findings:
- statement: YgfF is highlighted as an example of ML-assisted functional annotation
where DeepECtransformer predicted glucose 1-dehydrogenase activity and in vitro
enzyme assays were performed on overexpressed and affinity-purified protein. The
review emphasizes that deeper characterization (kinetic parameters, substrate
spectrum) is needed to fully establish metabolic roles.
supporting_text: The review stresses that function assignment for unusual/unknown
enzymes often requires extensive experimental work including expression,
purification, substrate synthesis, analytical methods, and kinetic
characterization such as kcat and KM.
- id: PMID:40703034
title: Limitations of current machine-learning models in predicting enzymatic functions
for uncharacterized proteins.
findings:
- statement: YgfF is classified into the SDR63C / glucose 1-dehydrogenase subgroup
by HMM-based SDR subfamily analysis, consistent with the DeepECtransformer
prediction. However, in vitro activity alone is insufficient to establish
physiological (in vivo) function, and best practice requires combining
biochemical and genetic evidence.
supporting_text: This resource predicts YgfF is part of the SDR63C/Glucose
1-dehydrogenase subgroup, the activity predicted and validated in the Kim et al.
(2023) study. This prediction demonstrates the accurate propagation of functional
annotation and is a successful prediction.
core_functions:
- description: NAD(+)-dependent glucose 1-dehydrogenase activity. YgfF catalyzes the
oxidation of D-glucose to D-glucono-1,5-lactone using NAD+ as the electron
acceptor. This is the sole experimentally validated enzymatic function, demonstrated
by in vitro assay with a specific activity of 305.55 U/mg (PMID:37963869). YgfF
belongs to the SDR family (SDR63C subgroup) and contains a conserved NAD(P)-binding
Rossmann-fold domain.
molecular_function:
id: GO:0047934
label: glucose 1-dehydrogenase (NAD+) activity
directly_involved_in:
- id: GO:0019521
label: D-gluconate metabolic process
supported_by:
- reference_id: PMID:37963869
supporting_text: For YgfF, DeepECtransformer predicted its EC number to be
EC:1.1.1.47 (glucose 1-dehydrogenase). The enzyme assay results showed that YgfF
exhibited a specific glucose 1-dehydrogenase activity of 305.55 U mg−1
suggested_questions:
- question: What is the physiological role of YgfF glucose 1-dehydrogenase activity in
E. coli K-12 metabolism? Is it involved in glucose catabolism via the
Entner-Doudoroff pathway or another metabolic route?
experts:
- Lee SY
- Kim GB
- question: What is the biological significance of the YgfF-LpdA physical interaction
detected by Butland et al. 2005? Does YgfF participate in a metabolic complex
with pyruvate dehydrogenase components?
experts:
- Emili A
- Butland G
- question: Does YgfF have any activity with NADP+ as cofactor, or is it strictly
NAD+-dependent? The current annotations include both NAD+ and NAD(P)+ terms.
experts:
- Kim GB
- Lee SY
suggested_experiments:
- hypothesis: YgfF deletion affects growth on glucose as sole carbon source under
specific metabolic conditions.
description: Construct a ygfF knockout in E. coli K-12 and test growth phenotypes on
minimal media with glucose as the sole carbon source under aerobic and anaerobic
conditions. Compare with wild-type to determine if YgfF contributes to glucose
utilization in vivo.
experiment_type: growth phenotype assay
- hypothesis: YgfF is NAD+-specific and does not use NADP+ as an electron acceptor.
description: Perform in vitro enzyme assays with purified YgfF using NADP+ instead of
NAD+ as the cofactor to determine cofactor specificity. This would resolve whether
GO:0047936 (NAD(P)+ form) or GO:0047934 (NAD+ specific) is the correct annotation.
experiment_type: enzyme kinetics
- hypothesis: The YgfF-LpdA interaction has functional significance in glucose metabolism.
description: Perform co-purification experiments with tagged YgfF under physiological
expression levels and test whether LpdA affects YgfF enzymatic activity in vitro.
Also test if ygfF deletion affects pyruvate dehydrogenase complex activity.
experiment_type: protein interaction validation