Uncharacterized sugar kinase belonging to the PfkB/ribokinase carbohydrate kinase family (COG0524). UniProt assigns EC 2.7.1.-, indicating a phosphotransferase with an alcohol group as acceptor, but the specific sugar substrate is unknown. The protein contains the PfkB domain (Pfam PF00294) and matches InterPro signatures for the ribokinase/fructokinase superfamily (IPR002139, IPR002173, IPR011611). CDD classifies it in the YegV_kinase_like subfamily (cd01944). yegV is transcribed as part of the yegTUV operon (a single ~3.3 kb mRNA), which is repressed by the single-target regulator GgaR/YegW and derepressed by ADP-glucose (ADPG). Deletion of ggaR increases yegTUV mRNA ~30-fold and increases glycogen accumulation, linking the yegTUV operon, and by extension YegV, to carbon/glycogen metabolism. De Crecy-Lagard et al. 2025 (PMID:40703034) highlighted YegV as a case where a computational method (DeepECTF) incorrectly predicted the specific activity as EC 2.7.1.92 (dehydro-2-deoxygluconokinase), an activity that in fact belongs to KdgK (b3526). This illustrates the challenge of distinguishing substrate specificity among nonisofunctional paralogs within the PfkB superfamily. No experimental characterization of YegV enzymatic activity or substrate has been published to date.
| GO Term | Evidence | Action | Reason |
|---|---|---|---|
| GO:0006796<br>phosphate-containing compound metabolic process | IEA<br>GO_REF:0000117 | ACCEPT | Summary: IEA annotation from ARBA machine learning models mapping YegV to phosphate-containing compound metabolic process. This is a very broad biological process term that encompasses any metabolic process involving phosphate groups. As a predicted kinase (EC 2.7.1.-), YegV would indeed participate in phosphate-containing compound metabolism by catalyzing phosphoryl transfer from ATP to a sugar substrate.<br>Reason: This is a very general biological process annotation that is consistent with the predicted kinase function of YegV. While broad, it is not incorrect for a member of the PfkB kinase family. The annotation correctly reflects that YegV is involved in some form of phosphorylation, even though the specific substrate and pathway are unknown. For an uncharacterized enzyme, this level of generality is appropriate and avoids over-annotation. |
| GO:0016301<br>kinase activity | IEA<br>GO_REF:0000002 | MODIFY | Summary: IEA annotation from InterPro2GO mapping based on InterPro signatures IPR002139 (ribokinase/fructokinase) and IPR002173 (carbohydrate/purine kinase PfkB conserved site). YegV belongs to the PfkB/ribokinase family, and UniProt assigns EC 2.7.1.- (phosphotransferase with OH group as acceptor) based on sequence similarity.<br>Reason: While kinase activity (GO:0016301) is not wrong, a more informative term is available. YegV is specifically annotated by UniProt as an "uncharacterized sugar kinase" belonging to the PfkB carbohydrate kinase family (Pfam PF00294). CDD classifies it in the YegV_kinase_like subfamily (cd01944). The term GO:0019200 (carbohydrate kinase activity), defined as "catalysis of the transfer of a phosphate group to a carbohydrate", would be more specific and appropriate. This term correctly captures the sugar kinase function predicted from domain architecture without specifying an incorrect substrate. De Crecy-Lagard et al. 2025 (PMID:40703034) specifically noted that while the first three digits of the EC number (2.7.1) are reliably predicted for YegV, the specific substrate (4th digit) is unknown, making GO:0019200 the right level of specificity.<br>Proposed replacements: carbohydrate kinase activity<br>Supporting Evidence: PMID:40703034<br>b2100 was annotated as a dehydro-2-deoxygluconokinase (EC 2.7.1.92) using DeepECTF and as an uncharacterized sugar kinase YegV (EC 2.7.1.-) in UniProt (Supplementary Table 1b). These 2 predictions differ by the fourth or last position of the EC number that specifies substrate specificity. This protein is a member of a superfamily of sugar kinases with multiple nonisofunctional paralogous subgroups that phosphorylate different substrates |
| GO:0016772<br>transferase activity, transferring phosphorus-containing groups | IEA<br>GO_REF:0000117 | ACCEPT | Summary: IEA annotation from ARBA mapping YegV to transferase activity transferring phosphorus-containing groups (GO:0016772). This is the direct parent of kinase activity (GO:0016301) in the GO hierarchy: catalytic activity > transferase activity > transferase activity, transferring phosphorus-containing groups > kinase activity.<br>Reason: This is a correct but general parent term for kinase activity. As YegV belongs to the PfkB kinase family and UniProt assigns EC 2.7.1.-, it is indeed a transferase that transfers phosphorus-containing groups. While less informative than GO:0016301 (kinase activity) or GO:0019200 (carbohydrate kinase activity), it is not wrong and is acceptable as a broader IEA annotation that is hierarchically consistent with the more specific kinase annotations. |
| GO:0019200<br>carbohydrate kinase activity | ISS<br>PMID:40703034 Limitations of current machine learning models in predicting... | NEW | Summary: New annotation proposed based on domain architecture and family membership. YegV is a member of the PfkB/ribokinase carbohydrate kinase family (COG0524, Pfam PF00294). UniProt names it "uncharacterized sugar kinase YegV" with EC 2.7.1.- (phosphotransferase with OH acceptor, substrate unknown). CDD assigns it to the YegV_kinase_like subfamily (cd01944). De Crecy-Lagard et al. 2025 (PMID:40703034) confirmed the first three EC digits (2.7.1) are reliable for this protein, which is consistent with the sugar kinase assignment from PfkB family membership, while noting the specific substrate remains unknown.<br>Reason: GO:0019200 (carbohydrate kinase activity) is the most informative molecular function term that can be applied without over-specifying the substrate. This term captures the sugar kinase function predicted from multiple independent lines of evidence: PfkB domain (Pfam PF00294), InterPro ribokinase/fructokinase signatures (IPR002139), COG0524 membership, CDD YegV_kinase_like classification (cd01944), and UniProt naming as "uncharacterized sugar kinase." It avoids the error of specifying a particular sugar substrate, which PMID:40703034 demonstrated leads to incorrect annotations due to paralog confusion within this large superfamily.<br>Supporting Evidence: PMID:40703034<br>This protein is a member of a superfamily of sugar kinases with multiple nonisofunctional paralogous subgroups that phosphorylate different substrates (Supplementary Table 1d, bottom). The dehydro-2-deoxygluconokinase (EC 2.7.1.92) activity is encoded by another member of this superfamily KdgK/b3526 (Supplementary Table 1d, bottom). Here, DeepECTF predicted correctly the first 3 digits of the EC number but not the last, making an overpropagation mistake<br>file:ECOLI/yegV/yegV-deep-research-falcon.md<br>Falcon deep research confirms yegV as a putative sugar kinase in the yegTUV operon regulated by GgaR/YegW in response to ADPG, linking it to glycogen metabolism. No direct enzymology for YegV exists; the carbohydrate kinase annotation is the most specific term supported by domain architecture. |
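The reviews above reason about which GO term is the parent of which. As a minimal sketch, that reasoning can be written as an ancestor lookup. The `PARENT` map is a hand-written, single-parent simplification made for this example (the real GO graph is a DAG and should be read from the ontology itself), and the link from GO:0019200 to GO:0016301 is an assumption that matches the MODIFY proposal above.

```python
# Hypothetical, simplified single-parent view of the GO terms discussed above.
# The real ontology is a DAG; this map is an assumption for illustration only.
PARENT = {
    "GO:0019200": "GO:0016301",  # carbohydrate kinase activity -> kinase activity
    "GO:0016301": "GO:0016772",  # kinase activity -> transferase activity, transferring phosphorus-containing groups
    "GO:0016772": "GO:0016740",  # -> transferase activity
    "GO:0016740": "GO:0003824",  # -> catalytic activity
}


def ancestors(term):
    """Return the chain of ancestors of `term`, nearest parent first."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def is_descendant(child, parent):
    """True if `parent` appears anywhere above `child` in the simplified chain."""
    return parent in ancestors(child)


if __name__ == "__main__":
    # The proposed replacement term sits below both accepted IEA terms.
    print(ancestors("GO:0019200"))
    print(is_descendant("GO:0019200", "GO:0016772"))
```

Under this simplified view, accepting GO:0016772 while proposing GO:0019200 is consistent, because the broader term is an ancestor of the more specific one.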
Q: What is the specific sugar substrate of YegV? Metabolomic profiling of yegV knockout strains under various carbon sources could reveal accumulated substrates.
Suggested experts: de Crecy-Lagard V
Q: Is there a specific growth condition or carbon source where yegV expression is induced, which would provide clues to its physiological substrate?
Suggested experts: Mori H
Experiment: Express and purify recombinant YegV, then screen for kinase activity against a panel of monosaccharides and sugar derivatives (including ribose, fructose, galactose, gluconate derivatives, and other PfkB family substrates) using a coupled enzyme assay that monitors ADP production. Include divalent cations (Mg2+, Mn2+) and monovalent cations (K+) as the PfkB family typically requires these for activity.
Hypothesis: YegV phosphorylates a specific monosaccharide or sugar derivative that can be identified through substrate screening
Type: In vitro substrate screening with coupled enzyme assay
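The experiment above does not say which coupled assay would be used. One common design, assumed here, couples ADP formation to NADH oxidation through pyruvate kinase and lactate dehydrogenase and follows the absorbance drop at 340 nm. The sketch below converts such a trace into an ADP production rate; the function names and the example readings are hypothetical, and the only constant used is the standard NADH extinction coefficient at 340 nm (6220 M^-1 cm^-1).

```python
# Assumption: PK/LDH-coupled assay, one NADH oxidized per ADP produced.
NADH_EXTINCTION_M_CM = 6220.0  # molar extinction coefficient of NADH at 340 nm


def slope_per_min(times_min, absorbances):
    """Least-squares slope of absorbance versus time (A340 units per minute)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbances))
    return sxy / sxx


def adp_rate_um_per_min(times_min, absorbances, path_cm=1.0):
    """ADP production rate in micromolar per minute from an A340 time course."""
    slope = slope_per_min(times_min, absorbances)
    # NADH is consumed, so the slope is negative; flip the sign for a positive rate.
    return -slope / (NADH_EXTINCTION_M_CM * path_cm) * 1e6


if __name__ == "__main__":
    # Hypothetical readings for one candidate substrate, not measured data.
    times = [0.0, 1.0, 2.0, 3.0]
    a340 = [1.0000, 0.9378, 0.8756, 0.8134]
    print(round(adp_rate_um_per_min(times, a340), 3))
```

In a screen, the same calculation would be run for each candidate sugar and for a no-substrate control, and the control rate subtracted to remove background ATPase activity.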
Experiment: Perform growth phenotyping of a yegV deletion strain on a diverse panel of carbon sources (Biolog phenotype microarray) to identify conditions where YegV is required for growth, thereby providing clues to its substrate and metabolic pathway.
Hypothesis: Deletion of yegV causes a growth defect under specific nutrient conditions that reveal its physiological role
Type: Phenotype microarray carbon source screen
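A phenotype microarray produces one growth curve per carbon source and strain. A minimal sketch of how such curves could be compared, assuming area under the curve (AUC) as the growth summary and a hypothetical cutoff of 0.5 for the mutant-to-wild-type ratio. The carbon-source names and readings are placeholders, not data.

```python
def auc(times, values):
    """Trapezoidal area under a growth curve."""
    points = list(zip(times, values))
    return sum((t1 - t0) * (y0 + y1) / 2.0
               for (t0, y0), (t1, y1) in zip(points, points[1:]))


def growth_defects(times, wt_curves, mutant_curves, threshold=0.5):
    """Return {carbon_source: mutant/WT AUC ratio} for ratios below `threshold`."""
    hits = {}
    for source, wt_values in wt_curves.items():
        wt_area = auc(times, wt_values)
        if wt_area <= 0:
            continue  # wild type does not grow here, so the ratio is uninformative
        ratio = auc(times, mutant_curves[source]) / wt_area
        if ratio < threshold:
            hits[source] = ratio
    return hits


if __name__ == "__main__":
    # Placeholder curves for two hypothetical carbon sources.
    times = [0.0, 1.0, 2.0]
    wt = {"source_A": [0.0, 1.0, 2.0], "source_B": [0.0, 1.0, 2.0]}
    mutant = {"source_A": [0.0, 1.0, 2.0], "source_B": [0.0, 0.0, 0.5]}
    print(growth_defects(times, wt, mutant))
```

Carbon sources flagged this way would be candidates for follow-up, since a growth defect alone does not identify the substrate of the deleted kinase.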
YegV DeepECTF prediction review. The DeepECTF prediction of dehydro-2-deoxygluconokinase (EC 2.7.1.92) is a paralog overannotation error. YegV is correctly identified as a sugar kinase (the first 3 EC digits are correct), but the specific substrate assignment is wrong: that activity belongs to KdgK/b3526. YegV's substrate specificity remains unknown.
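The disagreement described above is at the fourth EC position only. A minimal sketch of that comparison, with hypothetical helper names; a `-` is treated as a literal placeholder that differs from any specific digit, which is how the two annotations are contrasted in the review.

```python
def split_ec(ec):
    """Split an EC number such as '2.7.1.92' or '2.7.1.-' into its four fields."""
    parts = ec.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four EC fields, got {ec!r}")
    return parts


def first_divergent_level(ec_a, ec_b):
    """Return the 1-based EC level at which two EC numbers first differ, or None."""
    for level, (a, b) in enumerate(zip(split_ec(ec_a), split_ec(ec_b)), start=1):
        if a != b:
            return level
    return None


def shared_prefix(ec_a, ec_b):
    """Return the dotted prefix on which both EC numbers agree."""
    level = first_divergent_level(ec_a, ec_b)
    parts = split_ec(ec_a)
    return ".".join(parts if level is None else parts[:level - 1])


if __name__ == "__main__":
    predicted = "2.7.1.92"  # DeepECTF prediction discussed above
    curated = "2.7.1.-"     # UniProt assignment for YegV
    print(first_divergent_level(predicted, curated))
    print(shared_prefix(predicted, curated))
```

For the YegV case the two annotations agree on `2.7.1` and diverge at level 4, the position that encodes substrate specificity.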
provider: falcon
model: Edison Scientific Literature
cached: false
start_time: '2026-03-22T18:04:23.979939'
end_time: '2026-03-22T18:13:44.687197'
duration_seconds: 560.71
template_file: templates/gene_research_go_focused.md
template_variables:
  organism: ECOLI
  gene_id: yegV
  gene_symbol: yegV
  uniprot_accession: P76419
  protein_description: 'RecName: Full=Uncharacterized sugar kinase YegV; EC=2.7.1.-;'
  gene_info: Name=yegV; OrderedLocusNames=b2100, JW2087;
  organism_full: Escherichia coli (strain K12).
  protein_family: Belongs to the carbohydrate kinase PfkB family.
  protein_domains: Carboh/pur_kinase_PfkB_CS. (IPR002173); PfkB_dom. (IPR011611);
    Ribo/fructo_kinase. (IPR002139); Ribokinase-like. (IPR029056); PfkB (PF00294)
provider_config:
  timeout: 600
  max_retries: 3
  parameters:
    allowed_domains: []
    temperature: 0.1
citation_count: 17
BEFORE YOU BEGIN RESEARCH: You MUST verify you are researching the CORRECT gene/protein. Gene symbols can be ambiguous, especially for less well-characterized genes from non-model organisms.
DO NOT PROCEED WITH RESEARCH ON A DIFFERENT GENE. Instead:
- State clearly: "The gene symbol 'yegV' is ambiguous or literature is limited for this specific protein"
- Explain what you found (e.g., "Found extensive literature on a different gene with the same symbol in a different organism")
- Describe the protein based ONLY on the UniProt information provided above
- Suggest that the protein function can be inferred from domain/family information
Please provide a comprehensive research report on the gene yegV (gene ID: yegV, UniProt: P76419) in ECOLI.
The research report should be a detailed narrative explaining the function, biological processes, and localization of the gene product. Citations should be given for all claims.
You should prioritize authoritative reviews and primary scientific literature when conducting research. You can supplement
this with annotations you find in gene/protein databases, but these can be outdated or inaccurate.
We are specifically interested in the primary function of the gene - for enzymes, what reaction is catalyzed, and what is the substrate specificity? For transporters, what is the substrate? For structural proteins or adapters, what is the broader structural role? For signaling molecules, what is the role in the pathway.
We are interested in where in or outside the cell the gene product carries out its function.
We are also interested in the signaling or biochemical pathways in which the gene functions. We are less interested in broad pleiotropic effects, except where these elucidate the precise role.
Include evidence where possible. We are interested in both experimental evidence as well as inference from structure, evolution, or bioinformatic analysis. Precise studies should be prioritized over high-throughput, where available.
Question: You are an expert researcher providing comprehensive, well-cited information.
Provide detailed information focusing on:
1. Key concepts and definitions with current understanding
2. Recent developments and latest research (prioritize 2023-2024 sources)
3. Current applications and real-world implementations
4. Expert opinions and analysis from authoritative sources
5. Relevant statistics and data from recent studies
Format as a comprehensive research report with proper citations. Include URLs and publication dates where available.
Always prioritize recent, authoritative sources and provide specific citations for all major claims.
The E. coli K-12 gene yegV (UniProt P76419) encodes a protein consistently described in the literature as a putative sugar kinase and is located in the yegTUV operon. The strongest recent (2024) evidence connected to yegV is operon-level regulation: the GntR-family transcription factor YegW, renamed GgaR, is a single-target repressor of yegTUV and responds to ADP-glucose (ADPG), coupling yegTUV expression to glycogen accumulation and carbon storage physiology. However, no direct biochemical characterization of YegV’s catalytic reaction or substrate specificity was identified in the retrieved full-text primary literature; thus, YegV’s enzymatic function remains unvalidated beyond family/domain inference from UniProt and predicted annotations. (saito2024regulatoryroleof pages 7-10, saito2024regulatoryroleof pages 10-12)
The sources retrieved refer explicitly to E. coli K-12 and the yegTUV operon, describing yegV as a putative sugar kinase and (in one study) linking YegV to UniProt P76419, consistent with the user-provided target. (saito2024regulatoryroleof pages 7-10, chowdhury2021theproteininteractome pages 14-15)
Saito et al. (Microorganisms, Jan 2024, https://doi.org/10.3390/microorganisms12010115) show that GgaR/YegW binds upstream of yegTUV and acts as a repressor. Multiple lines of evidence support this:
- yegTUV transcription is strongly repressed by GgaR: deletion of ggaR yields ~30-fold higher yegTUV mRNA (RT-qPCR) and the yegTUV transcript is detectable by Northern blot in a ggaR mutant but not in wild type. (saito2024regulatoryroleof pages 10-12, saito2024regulatoryroleof pages 7-10)
- ADPG acts as an effector: gel-shift assays show ADPG disrupts the GgaR–DNA complex in a concentration-dependent manner (while glucose and ADP were also tested). (saito2024regulatoryroleof pages 10-12, saito2024regulatoryroleof media 12354a51)
- Physiology: deleting ggaR (derepressing yegTUV) increases glycogen accumulation (iodine staining and quantitative assays), indicating this module channels carbon into glycogen storage under certain conditions. (saito2024regulatoryroleof pages 12-15, saito2024regulatoryroleof media 190aa077)
Interpretation: these results strongly implicate the yegTUV operon, and therefore the yegV gene product as part of that locus, in glycogen-related carbon storage regulation, but they do not identify which yegTUV gene product(s) directly act on ADPG or glycogen metabolism. (saito2024regulatoryroleof pages 7-10, saito2024regulatoryroleof pages 12-15)
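The ~30-fold increase cited above was measured by RT-qPCR, but the retrieved excerpts do not state how it was quantified. As a minimal sketch, assuming the widely used 2^-ΔΔCt method with a reference gene, the relationship between Ct values and fold change looks like this. All Ct values below are hypothetical and are not taken from Saito et al.

```python
import math


def fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression (test versus control) by the 2^-ddCt method."""
    d_ct_test = ct_target_test - ct_ref_test  # normalize target to reference gene
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    dd_ct = d_ct_test - d_ct_ctrl
    return 2.0 ** (-dd_ct)


def dd_ct_for_fold(fold):
    """ddCt value that corresponds to a given fold change."""
    return -math.log2(fold)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 5 cycles earlier
    # in the ggaR deletion strain than in wild type, at equal reference Ct.
    print(fold_change(20.0, 15.0, 25.0, 15.0))
    # A 30-fold increase corresponds to a ddCt of roughly -4.9 cycles.
    print(round(dd_ct_for_fold(30.0), 2))
```

Read this way, a 30-fold derepression is a shift of about five PCR cycles, which is large relative to typical technical variation.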
In the 2024 paper, yegV is described as “the putative sugar kinase” within yegTUV, and the authors explicitly discuss that any of the predicted activities (YegT transporter, YegU hydrolase, YegV kinase) could act on ADPG, providing a hypothesis for operon function. (saito2024regulatoryroleof pages 7-10)
Across the retrieved primary literature, there is no direct enzymology for YegV:
- no demonstrated substrate(s), products, experimentally supported four-digit EC number, kinetic constants, or active-site mutagenesis;
- no direct evidence that YegV phosphorylates ADPG or a specific sugar;
- no direct subcellular localization experiments (e.g., fractionation/fluorescent tagging). (saito2024regulatoryroleof pages 7-10, chowdhury2021theproteininteractome pages 14-15)
Accordingly, the UniProt description “uncharacterized sugar kinase; PfkB family” should be treated as bioinformatic inference until validated experimentally in E. coli K-12. (crecylagard2025limitationsofcurrent pages 7-9)
Saito et al. conclude that GgaR/YegW is a single-target transcription factor for yegTUV, integrating an intracellular metabolite signal (ADPG) to tune operon expression and glycogen accumulation. (saito2024regulatoryroleof pages 10-12, saito2024regulatoryroleof pages 2-3)
Shimada et al. (J. Bacteriol., Feb 2011, https://doi.org/10.1128/JB.01214-10) report a Cra-associated site at/near yegTUV (Table of Cra-associated loci), suggesting the locus may also connect to global carbon metabolism regulation. The retrieved excerpt provides site-level association but not downstream functional consequences for yegV. (shimada2011novelmembersof pages 5-6)
No direct localization evidence for YegV was found. Given its predicted enzymatic role as a sugar kinase, YegV is plausibly cytosolic, but this remains unconfirmed in the retrieved literature. (saito2024regulatoryroleof pages 7-10)
The principal 2024 advance is the detailed characterization of the GgaR–ADPG–yegTUV regulatory axis, providing a concrete physiological context for yegV (as part of yegTUV) in glycogen accumulation and growth trade-offs in glucose minimal media. (saito2024regulatoryroleof pages 12-15, saito2024regulatoryroleof media 190aa077)
A 2024 bioRxiv preprint (version posted Oct 15, 2024, doi: 10.1101/2024.07.01.601547) argues that supervised ML approaches are not designed to predict functions of true unknowns and can produce systematic misannotations; the authors emphasize using multiple evidence types (e.g., gene neighborhood, structural active-site signatures) and reporting uncertainty. This is directly relevant to yegV because it remains uncharacterized and sits in an enzyme superfamily where paralog divergence can cause over-annotation errors. (crecylagard2025limitationsofcurrent pages 7-9, crecylagard2025limitationsofcurrent pages 1-4)
Because YegV’s molecular function is not experimentally established, there are no yegV-specific applied implementations supported by the retrieved literature. Instead, the operon-level regulatory findings suggest practical directions:
- Metabolic engineering / carbon storage control: manipulating ggaR/yegTUV expression could modulate glycogen accumulation in E. coli K-12, a phenotype relevant to carbon partitioning and stress physiology studies. (saito2024regulatoryroleof pages 12-15)
- Systems biology / central carbon regulation: the reported ADPG-responsive regulator provides a tractable model for studying metabolite-responsive transcriptional control in bacteria. (saito2024regulatoryroleof pages 10-12)
Consistent with the emphasis on integrating evidence types and avoiding overconfident predictions for unknown enzymes, the following experiments would directly answer the remaining questions:
- purified YegV enzymatic assays against a targeted panel of sugars/sugar-phosphates and nucleotide sugars (including ADPG-related hypotheses);
- genetic tests (ΔyegV and complemented strains) measuring glycogen, growth, and metabolite pools;
- subcellular localization by tagging/fractionation;
- structure/active-site motif comparison within PfkB family members, followed by mutagenesis. (crecylagard2025limitationsofcurrent pages 7-9, chowdhury2021theproteininteractome pages 14-15)
| Source (citation short) | Publication date | Evidence type | Key findings about yegV/yegTUV | What is NOT established (gaps) | URL/DOI |
|---|---|---|---|---|---|
| Saito et al., Microorganisms (saito2024regulatoryroleof pages 7-10, saito2024regulatoryroleof pages 2-3) | Jan 2024 | Regulatory, operon mapping, phenotype | yegV is annotated as a putative sugar kinase within the yegTUV operon; yegTUV is transcribed as a single ~3.3 kb mRNA and is directly repressed by YegW/GgaR; prior deletion data cited in the paper indicate loss of yegV reduced glycogen accumulation in Kornberg medium; authors propose yegT, yegU, or yegV could act on ADP-glucose (ADPG), linking the operon to glycogen metabolism. | No direct biochemical assay for YegV; no confirmed substrate specificity; no direct proof that YegV itself acts on ADPG; no PfkB/ribokinase motif validation in the paper; no localization data. | https://doi.org/10.3390/microorganisms12010115 |
| Saito et al., Microorganisms (saito2024regulatoryroleof pages 15-16, saito2024regulatoryroleof pages 7-10, saito2024regulatoryroleof pages 10-12, saito2024regulatoryroleof pages 12-15) | Jan 2024 | Effector biology, transcriptional regulation, glycogen phenotype | GgaR/YegW is a single-target repressor of yegTUV; ADPG weakens GgaR-DNA binding in gel-shift assays; deletion of ggaR causes ~30-fold higher yegTUV mRNA and markedly increased glycogen accumulation, supporting an ADPG-responsive regulatory link between yegTUV and glycogen storage. | These experiments support operon-level function, not YegV enzymology; no catalytic reaction, EC assignment, or cellular localization for YegV is demonstrated. | https://doi.org/10.3390/microorganisms12010115 |
| Chowdhury et al., Proteomes (chowdhury2021theproteininteractome pages 14-15) | Apr 2021 | High-throughput PPI, comparative conservation | YegV (UniProt P76419) is listed as an uncharacterized sugar kinase and a Top 10% conserved candidate interactor of the glycolytic enzyme GpmI, suggesting possible integration with central carbon metabolism. | Interaction is from high-throughput datasets only and is not independently validated; no substrate, reaction, pathway assignment, or localization is established. | https://doi.org/10.3390/proteomes9020016 |
| Shimada et al., J. Bacteriol. (shimada2011novelmembersof pages 5-6) | Feb 2011 | DNA binding / regulon mapping | A Cra-associated site is reported at/near yegTUV, indicating this locus may also be connected to broader carbon-metabolism regulation via Cra. | The excerpt does not provide direct expression changes, functional annotation of yegV, biochemical data, or proof of Cra-dependent regulation of yegV specifically. | https://doi.org/10.1128/JB.01214-10 |
Table: This table compiles the key evidence retrieved for E. coli K-12 yegV/yegTUV, separating direct findings from unresolved gaps. It is useful for distinguishing supported operon-level regulatory inferences from the still-unproven molecular function of YegV itself.
References
(saito2024regulatoryroleof pages 7-10): Shunsuke Saito, Ikki Kobayashi, Motoki Hoshina, Emi Uenaka, Atsushi Sakurai, Sousuke Imamura, and Tomohiro Shimada. Regulatory role of GgaR (YegW) for glycogen accumulation in Escherichia coli K-12. Microorganisms, 12:115, Jan 2024. URL: https://doi.org/10.3390/microorganisms12010115, doi:10.3390/microorganisms12010115.
(saito2024regulatoryroleof pages 10-12): Shunsuke Saito, Ikki Kobayashi, Motoki Hoshina, Emi Uenaka, Atsushi Sakurai, Sousuke Imamura, and Tomohiro Shimada. Regulatory role of GgaR (YegW) for glycogen accumulation in Escherichia coli K-12. Microorganisms, 12:115, Jan 2024. URL: https://doi.org/10.3390/microorganisms12010115, doi:10.3390/microorganisms12010115.
(chowdhury2021theproteininteractome pages 14-15): Shomeek Chowdhury, Stephen Hepper, Mudassir K. Lodi, Milton H. Saier, and Peter Uetz. The protein interactome of glycolysis in Escherichia coli. Proteomes, 9:16, Apr 2021. URL: https://doi.org/10.3390/proteomes9020016, doi:10.3390/proteomes9020016.
(crecylagard2025limitationsofcurrent pages 7-9): Valérie de Crécy-Lagard, Raquel Dias, Nick Sexson, Iddo Friedberg, Yifeng Yuan, and Manal A. Swairjo. Limitations of current machine-learning models in predicting enzymatic functions for uncharacterized proteins. bioRxiv, Jul 2025. URL: https://doi.org/10.1101/2024.07.01.601547, doi:10.1101/2024.07.01.601547.
(saito2024regulatoryroleof media 12354a51): Shunsuke Saito, Ikki Kobayashi, Motoki Hoshina, Emi Uenaka, Atsushi Sakurai, Sousuke Imamura, and Tomohiro Shimada. Regulatory role of GgaR (YegW) for glycogen accumulation in Escherichia coli K-12. Microorganisms, 12:115, Jan 2024. URL: https://doi.org/10.3390/microorganisms12010115, doi:10.3390/microorganisms12010115.
(saito2024regulatoryroleof pages 12-15): Shunsuke Saito, Ikki Kobayashi, Motoki Hoshina, Emi Uenaka, Atsushi Sakurai, Sousuke Imamura, and Tomohiro Shimada. Regulatory role of GgaR (YegW) for glycogen accumulation in Escherichia coli K-12. Microorganisms, 12:115, Jan 2024. URL: https://doi.org/10.3390/microorganisms12010115, doi:10.3390/microorganisms12010115.
(saito2024regulatoryroleof media 190aa077): Shunsuke Saito, Ikki Kobayashi, Motoki Hoshina, Emi Uenaka, Atsushi Sakurai, Sousuke Imamura, and Tomohiro Shimada. Regulatory role of GgaR (YegW) for glycogen accumulation in Escherichia coli K-12. Microorganisms, 12:115, Jan 2024. URL: https://doi.org/10.3390/microorganisms12010115, doi:10.3390/microorganisms12010115.
(saito2024regulatoryroleof pages 2-3): Shunsuke Saito, Ikki Kobayashi, Motoki Hoshina, Emi Uenaka, Atsushi Sakurai, Sousuke Imamura, and Tomohiro Shimada. Regulatory role of GgaR (YegW) for glycogen accumulation in Escherichia coli K-12. Microorganisms, 12:115, Jan 2024. URL: https://doi.org/10.3390/microorganisms12010115, doi:10.3390/microorganisms12010115.
(shimada2011novelmembersof pages 5-6): Tomohiro Shimada, Kaneyoshi Yamamoto, and Akira Ishihama. Novel members of the Cra regulon involved in carbon metabolism in Escherichia coli. Journal of Bacteriology, 193:649-659, Feb 2011. URL: https://doi.org/10.1128/jb.01214-10, doi:10.1128/jb.01214-10.
(crecylagard2025limitationsofcurrent pages 1-4): Valérie de Crécy-Lagard, Raquel Dias, Nick Sexson, Iddo Friedberg, Yifeng Yuan, and Manal A. Swairjo. Limitations of current machine-learning models in predicting enzymatic functions for uncharacterized proteins. bioRxiv, Jul 2025. URL: https://doi.org/10.1101/2024.07.01.601547, doi:10.1101/2024.07.01.601547.
(saito2024regulatoryroleof pages 15-16): Shunsuke Saito, Ikki Kobayashi, Motoki Hoshina, Emi Uenaka, Atsushi Sakurai, Sousuke Imamura, and Tomohiro Shimada. Regulatory role of GgaR (YegW) for glycogen accumulation in Escherichia coli K-12. Microorganisms, 12:115, Jan 2024. URL: https://doi.org/10.3390/microorganisms12010115, doi:10.3390/microorganisms12010115.
id: P76419
gene_symbol: yegV
product_type: PROTEIN
status: COMPLETE
taxon:
  id: NCBITaxon:83333
  label: Escherichia coli (strain K12)
description: >-
  Uncharacterized sugar kinase belonging to the PfkB/ribokinase carbohydrate kinase family
  (COG0524). UniProt assigns EC 2.7.1.-, indicating a phosphotransferase with an
  alcohol group as acceptor, but the specific sugar substrate is unknown. The protein contains
  the PfkB domain (Pfam PF00294) and matches InterPro signatures for the ribokinase/fructokinase
  superfamily (IPR002139, IPR002173, IPR011611). CDD classifies it in the YegV_kinase_like
  subfamily (cd01944). yegV is transcribed as part of the yegTUV operon (a single ~3.3 kb
  mRNA), which is repressed by the single-target regulator GgaR/YegW and derepressed by
  ADP-glucose (ADPG). Deletion of ggaR increases yegTUV mRNA ~30-fold and increases glycogen
  accumulation, linking the yegTUV operon, and by extension YegV, to carbon/glycogen
  metabolism. De Crecy-Lagard et al. 2025 (PMID:40703034) highlighted YegV as a case where
  a computational method (DeepECTF) incorrectly predicted the specific activity as
  EC 2.7.1.92 (dehydro-2-deoxygluconokinase), an activity that in fact belongs to KdgK
  (b3526). This illustrates the challenge of distinguishing substrate specificity among
  nonisofunctional paralogs within the PfkB superfamily. No experimental characterization
  of YegV enzymatic activity or substrate has been published to date.
existing_annotations:
- term:
    id: GO:0006796
    label: phosphate-containing compound metabolic process
  evidence_type: IEA
  original_reference_id: GO_REF:0000117
  review:
    summary: >-
      IEA annotation from ARBA machine learning models mapping YegV to phosphate-containing
      compound metabolic process. This is a very broad biological process term that encompasses
      any metabolic process involving phosphate groups. As a predicted kinase (EC 2.7.1.-),
      YegV would indeed participate in phosphate-containing compound metabolism by catalyzing
      phosphoryl transfer from ATP to a sugar substrate.
    action: ACCEPT
    reason: >-
      This is a very general biological process annotation that is consistent with the
      predicted kinase function of YegV. While broad, it is not incorrect for a member
      of the PfkB kinase family. The annotation correctly reflects that YegV is involved
      in some form of phosphorylation, even though the specific substrate and pathway
      are unknown. For an uncharacterized enzyme, this level of generality is appropriate
      and avoids over-annotation.
- term:
    id: GO:0016301
    label: kinase activity
  evidence_type: IEA
  original_reference_id: GO_REF:0000002
  review:
    summary: >-
      IEA annotation from InterPro2GO mapping based on InterPro signatures IPR002139
      (ribokinase/fructokinase) and IPR002173 (carbohydrate/purine kinase PfkB conserved
      site). YegV belongs to the PfkB/ribokinase family, and UniProt assigns EC 2.7.1.-
      (phosphotransferase with OH group as acceptor) based on sequence similarity.
    action: MODIFY
    reason: >-
      While kinase activity (GO:0016301) is not wrong, a more informative term is available.
      YegV is specifically annotated by UniProt as an "uncharacterized sugar kinase" belonging
      to the PfkB carbohydrate kinase family (Pfam PF00294). CDD classifies it in the
      YegV_kinase_like subfamily (cd01944). The term GO:0019200 (carbohydrate kinase
      activity), defined as "catalysis of the transfer of a phosphate group to a carbohydrate",
      would be more specific and appropriate. This term correctly captures the sugar kinase
      function predicted from domain architecture without specifying an incorrect substrate.
      De Crecy-Lagard et al. 2025 (PMID:40703034) specifically noted that while the first
      three digits of the EC number (2.7.1) are reliably predicted for YegV, the specific
      substrate (4th digit) is unknown, making GO:0019200 the right level of specificity.
    proposed_replacement_terms:
    - id: GO:0019200
      label: carbohydrate kinase activity
    supported_by:
    - reference_id: PMID:40703034
      supporting_text: >-
        b2100 was annotated as a dehydro-2-deoxygluconokinase (EC 2.7.1.92) using
        DeepECTF and as an uncharacterized sugar kinase YegV (EC 2.7.1.-) in UniProt
        (Supplementary Table 1b). These 2 predictions differ by the fourth or last
        position of the EC number that specifies substrate specificity. This protein
        is a member of a superfamily of sugar kinases with multiple nonisofunctional
        paralogous subgroups that phosphorylate different substrates
- term:
id: GO:0016772
label: transferase activity, transferring phosphorus-containing groups
evidence_type: IEA
original_reference_id: GO_REF:0000117
review:
summary: >-
IEA annotation from ARBA mapping YegV to transferase activity transferring
phosphorus-containing groups (GO:0016772). This is the direct parent of kinase
activity (GO:0016301) in the GO hierarchy: catalytic activity > transferase
activity > transferase activity, transferring phosphorus-containing groups >
kinase activity.
action: ACCEPT
reason: >-
This is a correct but general parent term for kinase activity. As YegV belongs
to the PfkB kinase family and UniProt assigns EC 2.7.1.-, it is indeed a
transferase that transfers phosphorus-containing groups. While less informative
than GO:0016301 (kinase activity) or GO:0019200 (carbohydrate kinase activity),
it is not wrong and is acceptable as a broader IEA annotation that is hierarchically
consistent with the more specific kinase annotations.
- term:
id: GO:0019200
label: carbohydrate kinase activity
evidence_type: ISS
original_reference_id: PMID:40703034
review:
summary: >-
New annotation proposed based on domain architecture and family membership. YegV
is a member of the PfkB/ribokinase carbohydrate kinase family (COG0524, Pfam PF00294).
UniProt names it "uncharacterized sugar kinase YegV" with EC 2.7.1.- (phosphotransferase
with OH acceptor, substrate unknown). CDD assigns it to the YegV_kinase_like subfamily
(cd01944). De Crecy-Lagard et al. 2025 (PMID:40703034) noted that the first three
EC digits (2.7.1, phosphotransferase with an alcohol group as acceptor) were
predicted correctly for this protein, which is consistent with a carbohydrate
kinase, while the specific substrate remains unknown.
action: NEW
reason: >-
GO:0019200 (carbohydrate kinase activity) is the most informative molecular function
term that can be applied without over-specifying the substrate. This term captures
the sugar kinase function predicted from several concordant sequence-based signatures:
PfkB domain (Pfam PF00294), InterPro ribokinase/fructokinase signatures (IPR002139),
COG0524 membership, CDD YegV_kinase_like classification (cd01944), and UniProt
naming as "uncharacterized sugar kinase." It avoids specifying a particular sugar
substrate, which PMID:40703034 showed can produce incorrect annotations through
overpropagation between paralogs within this large superfamily.
supported_by:
- reference_id: PMID:40703034
supporting_text: >-
This protein is a member of a superfamily of sugar kinases with multiple
nonisofunctional paralogous subgroups that phosphorylate different substrates
(Supplementary Table 1d, bottom). The dehydro-2-deoxygluconokinase (EC 2.7.1.92)
activity is encoded by another member of this superfamily KdgK/b3526
(Supplementary Table 1d, bottom). Here, DeepECTF predicted correctly the first
3 digits of the EC number but not the last, making an overpropagation mistake
- reference_id: file:ECOLI/yegV/yegV-deep-research-falcon.md
supporting_text: Falcon deep research confirms yegV as a putative sugar kinase
in the yegTUV operon regulated by GgaR/YegW in response to ADPG, linking it
to glycogen metabolism. No direct enzymology for YegV exists; the carbohydrate
kinase annotation is the most specific term supported by domain architecture.
references:
- id: GO_REF:0000002
title: Gene Ontology annotation through association of InterPro records with GO terms
findings: []
- id: GO_REF:0000117
title: Electronic Gene Ontology annotations created by ARBA machine learning models
findings: []
- id: DOI:10.3390/microorganisms12010115
title: Regulatory role of GgaR (YegW) for glycogen accumulation in Escherichia
coli K-12.
findings:
- statement: yegV is part of the yegTUV operon (~3.3 kb transcript), repressed
by the single-target regulator GgaR/YegW. ADPG disrupts GgaR-DNA binding.
Deletion of ggaR causes ~30-fold higher yegTUV mRNA and increased glycogen
accumulation, linking the operon to carbon/glycogen metabolism.
supporting_text: yegV is described as the putative sugar kinase within the
yegTUV operon; deletion of ggaR yields ~30-fold higher yegTUV mRNA and
increased glycogen accumulation
- id: DOI:10.3390/proteomes9020016
title: The protein interactome of glycolysis in Escherichia coli.
findings:
- statement: YegV (P76419) appears as a highly conserved (Top 10%) candidate
interactor of the glycolytic enzyme GpmI in high-throughput PPI analysis,
suggesting possible integration with central carbon metabolism.
supporting_text: YegV is listed as an uncharacterized sugar kinase and a
Top 10% conserved candidate interactor of the glycolytic enzyme GpmI
- id: DOI:10.1128/JB.01214-10
title: Novel members of the Cra regulon involved in carbon metabolism in
Escherichia coli.
findings:
- statement: A Cra-associated site is reported at/near yegTUV, suggesting the
locus may connect to broader carbon-metabolism regulation via Cra.
supporting_text: Cra-associated site reported at/near yegTUV locus
- id: PMID:40703034
title: >-
Limitations of current machine learning models in predicting enzymatic functions
for uncharacterized proteins.
findings:
- statement: >-
YegV (b2100) is a member of a sugar kinase superfamily where DeepECTF incorrectly
predicted EC 2.7.1.92 (dehydro-2-deoxygluconokinase), an activity that actually
belongs to KdgK (b3526)
supporting_text: >-
For example, b2100 was annotated as a dehydro-2-deoxygluconokinase (EC 2.7.1.92)
using DeepECTF and as an uncharacterized sugar kinase YegV (EC 2.7.1.-) in UniProt
(Supplementary Table 1b)
- statement: >-
The specific substrate of YegV is unknown; only the first three EC digits
(2.7.1) are reliably predicted, while the fourth digit specifying substrate
was incorrectly propagated from a paralog
supporting_text: >-
These 2 predictions differ by the fourth or last position of the EC number that
specifies substrate specificity. This protein is a member of a superfamily of sugar
kinases with multiple nonisofunctional paralogous subgroups that phosphorylate
different substrates
- statement: >-
The dehydro-2-deoxygluconokinase activity is encoded by KdgK/b3526, not YegV,
and the misprediction illustrates overpropagation error from nonisofunctional
paralogs
supporting_text: >-
The dehydro-2-deoxygluconokinase (EC 2.7.1.92) activity is encoded by another
member of this superfamily KdgK/b3526 (Supplementary Table 1d, bottom). Here,
DeepECTF predicted correctly the first 3 digits of the EC number but not the last,
making an overpropagation mistake (error 6 in Table 1 and Fig. 1)
core_functions:
- description: >-
Predicted carbohydrate kinase activity. YegV is an uncharacterized member of the
PfkB/ribokinase carbohydrate kinase family that likely catalyzes phosphorylation
of an unknown sugar substrate using ATP as the phosphoryl donor. The specific sugar
substrate has not been experimentally determined.
molecular_function:
id: GO:0019200
label: carbohydrate kinase activity
directly_involved_in:
- id: GO:0006796
label: phosphate-containing compound metabolic process
supported_by:
- reference_id: PMID:40703034
supporting_text: >-
This protein is a member of a superfamily of sugar kinases with multiple
nonisofunctional paralogous subgroups that phosphorylate different substrates
suggested_questions:
- question: >-
What is the specific sugar substrate of YegV? Metabolomic profiling of yegV knockout
strains under various carbon sources could reveal accumulated substrates.
experts:
- de Crecy-Lagard V
- question: >-
Is there a specific growth condition or carbon source where yegV expression is induced,
which would provide clues to its physiological substrate?
experts:
- Mori H
suggested_experiments:
- hypothesis: >-
YegV phosphorylates a specific monosaccharide or sugar derivative that can be
identified through substrate screening
description: >-
Express and purify recombinant YegV, then screen for kinase activity against a panel
of monosaccharides and sugar derivatives (including ribose, fructose, galactose,
gluconate derivatives, and other PfkB family substrates) using a coupled enzyme
assay that monitors ADP production. Include divalent cations (Mg2+, Mn2+) and
monovalent cations (K+), since PfkB family kinases typically require a divalent
cation and are often activated by monovalent cations.
experiment_type: In vitro substrate screening with coupled enzyme assay
- hypothesis: >-
Deletion of yegV causes a growth defect under specific nutrient conditions that
reveal its physiological role
description: >-
Perform growth phenotyping of a yegV deletion strain on a diverse panel of carbon
sources (Biolog phenotype microarray) to identify conditions where YegV is required
for growth, thereby providing clues to its substrate and metabolic pathway.
experiment_type: Phenotype microarray carbon source screen