YrhB is a small (94 aa, 10.6 kDa) uncharacterized protein in E. coli K12 encoded by b3446. It belongs to the Imm35 (Immunity protein 35) family (Pfam PF15567 / InterPro IPR029082), which was identified computationally as part of the polymorphic toxin system immunity protein repertoire (Zhang et al. 2012, PMID:22731697). Imm35 is specifically associated with the papain-like peptidase Tox-PL1, suggesting it functions as a peptidase inhibitor (PMID:22731697). A study in BL21(DE3) reported chaperone-like activity for YrhB under heat shock conditions (Ahn et al. 2012, PMID:22569261), though this has not been independently confirmed in K12 and the primary evolved function is more likely related to its Imm35 domain. Transcriptomic data show yrhB is upregulated 4.3-fold under TPEN (zinc chelation) stress, suggesting a possible link to metal homeostasis. The protein remains at UniProt evidence level PE 4 (Predicted). Notably, DeepECTF (a deep learning enzyme function predictor) incorrectly predicted EC 4.1.2.50 (6-carboxytetrahydropterin synthase) for YrhB (de Crécy-Lagard et al. 2025, PMID:40703034). The authors classify this as a logic error: E. coli already encodes the bona fide 6-carboxytetrahydropterin synthase as QueD (b2765), and a queD mutant lacks this activity, which argues against a redundant enzyme such as YrhB.
| GO Term | Evidence | Action | Reason |
|---|---|---|---|
| GO:0030153<br>bacteriocin immunity | ISS<br>PMID:22731697 Polymorphic toxin systems: Comprehensive characterization of... | NEW | Summary: YrhB contains the Imm35 domain (Pfam PF15567, InterPro IPR029082), which was identified by Zhang et al. (2012) as an immunity protein family in polymorphic toxin systems. Imm35 is specifically associated with the papain-like peptidase Tox-PL1, suggesting it functions as a peptidase inhibitor. While not experimentally validated for YrhB specifically, the domain assignment is robust and based on comprehensive bioinformatic analysis of polymorphic toxin-immunity gene neighborhoods across bacteria.<br>Reason: The Imm35 domain (PF15567) is the only recognized domain in YrhB. Zhang et al. (2012) systematically characterized immunity protein families in bacterial polymorphic toxin systems using comparative genomics, identifying Imm35 as specifically associated with Tox-PL1 papain-like peptidase toxins. GO:0030153 (bacteriocin immunity) is the closest available GO biological process term for this predicted function. This would be an ISS-level annotation based on sequence similarity to characterized immunity protein families.<br>Supporting Evidence:<br>PMID:22731697: Imm35 is specifically associated only with the papain-like peptide Tox-PL1, suggesting that it functions specifically as a peptidase inhibitor<br>file:ECOLI/yrhB/yrhB-deep-research-falcon.md: Falcon deep research found no primary literature validating YrhB function beyond TPEN stress induction (4.3-fold) and the DeepECTF misprediction critique. The Imm35 domain-based immunity protein annotation remains the most informative functional assignment. |
| GO:0030414<br>peptidase inhibitor activity | ISS<br>PMID:22731697 Polymorphic toxin systems: Comprehensive characterization of... | NEW | Summary: Zhang et al. (2012) identified Imm35 as specifically associated with the papain-like peptidase Tox-PL1, suggesting it functions as a peptidase inhibitor. YrhB contains the Imm35 domain (PF15567), making peptidase inhibitor activity the most likely molecular function.<br>Reason: Imm35 is specifically associated with Tox-PL1 papain-like peptidase toxins, and Zhang et al. (2012) explicitly suggest it functions as a peptidase inhibitor. GO:0030414 (peptidase inhibitor activity) captures this predicted molecular function.<br>Supporting Evidence:<br>PMID:22731697: Imm35 is specifically associated only with the papain-like peptide Tox-PL1, suggesting that it functions specifically as a peptidase inhibitor |
| GO:0005737<br>cytoplasm | IDA<br>PMID:22569261 YrhB is a highly stable small protein with unique chaperone-... | NEW | Summary: Immunity proteins in polymorphic toxin systems are typically cytoplasmic, as they must be present in the cytoplasm to protect the producing cell from auto-intoxication. Ahn et al. (2012) identified YrhB as a soluble intracellular protein in BL21(DE3) through systematic proteome-wide analyses.<br>Reason: Immunity proteins in polymorphic toxin systems are characteristically cytoplasmic. Ahn et al. (2012, PMID:22569261) showed YrhB is a soluble intracellular protein in BL21(DE3). Cytoplasmic localization is consistent with both the immunity protein function and the experimental data.<br>Supporting Evidence:<br>PMID:22569261: Escherichia coli YrhB (10.6 kDa) from strain BL21(DE3) that is commonly used for protein overexpression is a stable chaperone-like protein and indispensable for supporting the growth of BL21(DE3) at 48 °C but not defined as conventional heat shock protein (HSP) |
Q: What is the cognate toxin for YrhB/Imm35 in E. coli K12? Is there a Tox-PL1-type toxin gene in the genomic neighborhood of yrhB (b3446)?
Q: Is the chaperone-like activity reported by Ahn et al. (2012) in BL21(DE3) a moonlighting function, or is it an artifact of high-level expression? Does K12 YrhB show the same activity?
Q: Has the DeepECTF misprediction of EC 4.1.2.50 for YrhB been propagated into any databases?
Experiment: Test whether yrhB deletion in K12 affects susceptibility to polymorphic toxins from competing strains, particularly those encoding Tox-PL1-type toxin domains.
Hypothesis: If YrhB functions as an Imm35 immunity protein, a yrhB deletion mutant should be more susceptible to Tox-PL1 papain-like peptidase toxins from competing bacteria.
Experiment: Examine the genomic neighborhood of yrhB (b3446) for adjacent toxin-encoding genes to identify the cognate toxin.
Hypothesis: Polymorphic toxin immunity genes are typically found immediately downstream of their cognate toxin gene.
Experiment: Replicate the chaperone-like activity assays from Ahn et al. (2012) using purified K12 YrhB to determine if this is strain-specific to BL21(DE3).
Hypothesis: The chaperone-like activity may be a general property of YrhB or may be specific to BL21(DE3) expression conditions.
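The genomic-neighborhood experiment above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `Gene` record, the `flanking_genes` helper, and every coordinate below are hypothetical placeholders, not real E. coli K12 annotation data. A real analysis would load coordinates from a genome annotation file.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical minimal gene record used only for this sketch."""
    name: str
    start: int
    end: int
    strand: str


def flanking_genes(genes, target, flank=2):
    """Return (lower-coordinate, higher-coordinate) neighbors of `target`.

    Genes are ordered by start coordinate. Which side counts as
    'upstream' depends on the strand of the target gene.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    names = [g.name for g in ordered]
    i = names.index(target)  # raises ValueError if the target is absent
    lower = ordered[max(0, i - flank):i]
    higher = ordered[i + 1:i + 1 + flank]
    return lower, higher


# Placeholder coordinates, invented for illustration only.
genes = [
    Gene("geneA", 100, 900, "+"),
    Gene("geneB", 1000, 2500, "+"),
    Gene("target", 2520, 2804, "+"),
    Gene("geneC", 3000, 3600, "-"),
]

lower, higher = flanking_genes(genes, "target", flank=1)
print([g.name for g in lower], [g.name for g in higher])
```

For a plus-strand immunity gene, a candidate toxin gene would be expected among the lower-coordinate neighbors. Each neighbor would then need a separate domain search for a Tox-PL1-type domain, which this sketch does not attempt.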
YrhB DeepECTF prediction review. The DeepECTF prediction of 6-carboxytetrahydropterin synthase (EC 4.1.2.50) is incorrect. This activity is already catalyzed by QueD/b2765 in E. coli, and queD mutants lack this activity, which argues against a redundant enzyme. YrhB contains an Imm35 domain, suggesting a role in bacteriocin immunity.
provider: falcon
model: Edison Scientific Literature
cached: false
start_time: '2026-03-22T18:22:57.468313'
end_time: '2026-03-22T18:29:52.596917'
duration_seconds: 415.13
template_file: templates/gene_research_go_focused.md
template_variables:
organism: ECOLI
gene_id: yrhB
gene_symbol: yrhB
uniprot_accession: P46857
protein_description: 'RecName: Full=Uncharacterized protein YrhB;'
gene_info: Name=yrhB; OrderedLocusNames=b3446, JW3411;
organism_full: Escherichia coli (strain K12).
protein_family: Not specified in UniProt
protein_domains: Imm35. (IPR029082); Imm35 (PF15567)
provider_config:
timeout: 600
max_retries: 3
parameters:
allowed_domains: []
temperature: 0.1
citation_count: 5
BEFORE YOU BEGIN RESEARCH: You MUST verify you are researching the CORRECT gene/protein. Gene symbols can be ambiguous, especially for less well-characterized genes from non-model organisms.
DO NOT PROCEED WITH RESEARCH ON A DIFFERENT GENE. Instead:
- State clearly: "The gene symbol 'yrhB' is ambiguous or literature is limited for this specific protein"
- Explain what you found (e.g., "Found extensive literature on a different gene with the same symbol in a different organism")
- Describe the protein based ONLY on the UniProt information provided above
- Suggest that the protein function can be inferred from domain/family information
Please provide a comprehensive research report on the gene yrhB (gene ID: yrhB, UniProt: P46857) in ECOLI.
The research report should be a detailed narrative explaining the function, biological processes, and localization of the gene product. Citations should be given for all claims.
You should prioritize authoritative reviews and primary scientific literature when conducting research. You can supplement this with annotations you find in gene/protein databases, but these can be outdated or inaccurate.
We are specifically interested in the primary function of the gene - for enzymes, what reaction is catalyzed, and what is the substrate specificity? For transporters, what is the substrate? For structural proteins or adapters, what is the broader structural role? For signaling molecules, what is the role in the pathway.
We are interested in where in or outside the cell the gene product carries out its function.
We are also interested in the signaling or biochemical pathways in which the gene functions. We are less interested in broad pleiotropic effects, except where these elucidate the precise role.
Include evidence where possible. We are interested in both experimental evidence as well as inference from structure, evolution, or bioinformatic analysis. Precise studies should be prioritized over high-throughput, where available.
Question: You are an expert researcher providing comprehensive, well-cited information.
Provide detailed information focusing on:
1. Key concepts and definitions with current understanding
2. Recent developments and latest research (prioritize 2023-2024 sources)
3. Current applications and real-world implementations
4. Expert opinions and analysis from authoritative sources
5. Relevant statistics and data from recent studies
Format as a comprehensive research report with proper citations. Include URLs and publication dates where available.
Always prioritize recent, authoritative sources and provide specific citations for all major claims.
The E. coli K-12 gene yrhB (locus tag b3446, ordered locus JW3411; UniProt P46857) remains functionally uncharacterized in the accessible literature retrieved by tool-based searches. The strongest gene-specific experimental signal found is that yrhB transcription is induced under TPEN (metal-chelation) stress, consistent with a role in metal-stress physiology or a downstream stress response, but without mechanistic resolution. (easton2007identificationandcharacterization pages 81-83)
A notable recent development is that a state-of-the-art machine-learning (ML) functional annotation system (DeepECTransformer/DeepECTF) predicted YrhB to be 6-carboxytetrahydropterin synthase (EC 4.1.2.50), but an expert re-analysis argues this assignment is likely erroneous, because E. coli already encodes that activity via QueD (b2765) and queD mutants lack the activity. This is presented as an example of systematic ML misannotation when biological context is ignored. (crecylagard2025limitationsofcurrent pages 7-9, crecylagard2025limitationsofcurrent media 04ef014f)
Identity used in this report: Escherichia coli (strain K-12) gene yrhB, locus tag b3446. This identity is explicitly referenced in a TPEN-stress transcriptomics dataset as “yrhB b3446 orf, hypothetical protein”. (easton2007identificationandcharacterization pages 81-83)
Symbol ambiguity check: Within the retrieved corpus, “yrhB” consistently refers to the E. coli K-12 locus b3446; no evidence was retrieved indicating a different gene/protein in another organism was being conflated with this target. (easton2007identificationandcharacterization pages 81-83)
In bacterial genomics, “hypothetical/uncharacterized protein” generally denotes a predicted coding sequence with limited or no direct experimental validation of molecular function, biological role, localization, or physiological pathway. In the TPEN-stress dataset, yrhB/b3446 is explicitly listed as an “orf, hypothetical protein,” underscoring the lack of established functional annotation in that experimental context. (easton2007identificationandcharacterization pages 81-83)
A central concept for functional annotation is that sequence similarity/domain calls can be informative but may fail when paralogs diverge or when models infer common labels under uncertainty. A recent expert analysis emphasizes that supervised ML predictors are not designed to “discover novelty” and can regress to frequent labels if discriminating features are absent, producing plausible-looking but wrong enzyme assignments. (crecylagard2025limitationsofcurrent pages 7-9)
No retrieved primary study provided direct biochemical characterization (substrate, reaction, kinetics) for YrhB/P46857. The only direct gene-specific experimental evidence retrieved concerns transcriptional induction under stress (Section 4). (easton2007identificationandcharacterization pages 81-83)
A recent expert-led evaluation of DeepECTF predictions reports that YrhB/b3446 was predicted to be 6-carboxytetrahydropterin synthase (EC 4.1.2.50). The authors argue this prediction is refuted by biological context: E. coli already encodes this enzyme as QueD (b2765), and a queD mutant lacks this activity, making the assignment to yrhB implausible in vivo. (crecylagard2025limitationsofcurrent pages 7-9)
This refutation is also presented visually in a table of “refuted predictions,” which specifically lists YrhB/b3446 and the rationale for rejecting the EC assignment. (crecylagard2025limitationsofcurrent media 04ef014f)
Interpretation: The most defensible conclusion from the retrieved evidence is not that YrhB has no enzymatic activity, but that there is currently no validated evidence supporting the specific enzymatic role EC 4.1.2.50 for yrhB in E. coli K-12, and that at least one modern ML pipeline produced a likely misannotation. (crecylagard2025limitationsofcurrent pages 7-9, crecylagard2025limitationsofcurrent media 04ef014f)
A Zn(II)-responsive gene/protein study reports that, after 30 minutes of TPEN stress, yrhB (b3446) is among the up-regulated genes (listed as an “orf, hypothetical protein”). The supplementary table reports mean fold change = 4.3 with P = 2.75×10⁻². (easton2007identificationandcharacterization pages 81-83)
What TPEN implies: TPEN is a membrane-permeable chelator that perturbs metal availability (commonly Zn(II)), producing a metal-starvation/chelation stress response. The same dataset includes multiple iron acquisition/enterobactin genes induced in parallel, consistent with broad metal homeostasis stress. (easton2007identificationandcharacterization pages 78-81, easton2007identificationandcharacterization pages 81-83)
Inference boundary: Induction under TPEN indicates yrhB is responsive to metal chelation stress, but this does not establish that YrhB directly binds metals, transports metals, or participates in a defined metal homeostasis pathway. (easton2007identificationandcharacterization pages 81-83)
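To put the reported induction on the log scale commonly used for expression data, the fold change above can be converted as follows. This is a minimal sketch: the fold change and P value are the ones quoted in the text, while the two thresholds are assumptions chosen only for illustration and are not taken from the cited study.

```python
import math

# Values quoted above for yrhB (b3446) after 30 minutes of TPEN stress.
mean_fold_change = 4.3
p_value = 2.75e-2

# Illustrative cutoffs (assumptions, not from the cited study).
MIN_ABS_LOG2_FC = 1.0  # corresponds to at least a 2-fold change
MAX_P = 0.05

log2_fc = math.log2(mean_fold_change)
passes = abs(log2_fc) >= MIN_ABS_LOG2_FC and p_value <= MAX_P

print(f"log2 fold change = {log2_fc:.2f}")
print(f"passes illustrative thresholds: {passes}")
```

A 4.3-fold induction is therefore a little over two doublings. As the text notes, passing such cutoffs shows responsiveness to the stress condition only, not a mechanism.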
No direct experimental localization (e.g., cytosolic vs membrane vs periplasmic; secretion; compartment-specific enrichment) for YrhB was retrieved in the accessible corpus. Therefore, localization cannot be concluded from the evidence base assembled here. (easton2007identificationandcharacterization pages 81-83)
A bioRxiv preprint (version posted Oct 15, 2024, DOI: 10.1101/2024.07.01.601547, URL: https://doi.org/10.1101/2024.07.01.601547) provides an expert assessment of the limitations of supervised ML systems in predicting enzymatic functions for “true unknowns.” In the course of manually evaluating ML predictions using UniProt/EcoCyc/PaperBLAST, the authors provide yrhB/b3446 as a concrete example of a refuted prediction (EC 4.1.2.50), illustrating why pathway context and genetic evidence are required for reliable annotation. (crecylagard2025limitationsofcurrent pages 7-9, crecylagard2025limitationsofcurrent media 04ef014f)
The most immediate “real-world” impact of the retrieved yrhB evidence is in genome annotation pipelines and enzyme function prediction benchmarks. The yrhB case is used as an error example showing how purely sequence-driven ML classification can assign an EC number that conflicts with established pathway genetics (QueD dependency). This has practical implications for:
- Automated metabolic reconstruction (avoiding spurious pathway redundancy)
- Prioritizing targets for experimental characterization (focus on truly unknown proteins)
- Designing validation strategies that include genetic/in vivo tests in addition to in vitro activity screening (crecylagard2025limitationsofcurrent pages 7-9)
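The kind of context check that the critique argues for can be sketched as a simple filter over predicted EC numbers. This is a hypothetical illustration, not the authors' procedure or part of any ML tool: the `KnownEnzyme` record and `flag_conflicts` function are invented names, and the only data used are the QueD facts stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownEnzyme:
    """Hypothetical record of a characterized enzyme-encoding gene."""
    gene: str
    ec_number: str
    mutant_lacks_activity: bool  # True if the single-gene mutant loses the activity


def flag_conflicts(predictions, known_enzymes):
    """Flag predictions whose EC number is already explained by a
    characterized gene whose mutant lacks the activity."""
    by_ec = {}
    for enzyme in known_enzymes:
        by_ec.setdefault(enzyme.ec_number, []).append(enzyme)

    conflicts = []
    for gene, ec_number in predictions:
        for enzyme in by_ec.get(ec_number, []):
            if enzyme.gene != gene and enzyme.mutant_lacks_activity:
                conflicts.append((gene, ec_number, enzyme.gene))
    return conflicts


# Facts taken from the text: QueD (b2765) encodes EC 4.1.2.50 and a queD
# mutant lacks the activity; the ML tool predicted the same EC for yrhB.
known = [KnownEnzyme("queD", "4.1.2.50", True)]
predicted = [("yrhB", "4.1.2.50")]

conflicts = flag_conflicts(predicted, known)
print(conflicts)
```

A flagged prediction is a candidate for manual review, not an automatic rejection. The mutant phenotype is what makes redundancy implausible here, so a check based on EC overlap alone would be weaker.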
Transcriptomic induction under TPEN stress provides a concrete, testable starting point for functional follow-up: yrhB may participate in (or be co-regulated with) metal-homeostasis or general stress modules, which can guide targeted genetics (knockout/overexpression) and proteomics. (easton2007identificationandcharacterization pages 81-83)
| Claim (what is known/predicted) | Evidence type (experimental vs computational critique) | Condition/Context | Key quantitative data | Source (with URL + year) | Notes/uncertainty |
|---|---|---|---|---|---|
| The target identity matches E. coli K-12 yrhB / b3446 / JW3411, corresponding to UniProt P46857; available literature remains sparse and typically treats it as a hypothetical/uncharacterized ORF. (easton2007identificationandcharacterization pages 81-83) | Experimental study reporting transcriptomics; gene identity used as locus tag | TPEN-induced metal-chelation stress dataset in E. coli | Up-regulated with mean fold change 4.3 and P = 2.75E-02. (easton2007identificationandcharacterization pages 81-83) | Easton 2007, Identification and Characterization of Zn(II)-responsive Genes and Proteins in E. coli (journal metadata unavailable in the retrieved context). | Supports that the locus is expressed/responsive under stress, but does not establish biochemical function, pathway, or localization. |
| A recent computational assignment of yrhB/b3446 to 6-carboxytetrahydropterin synthase (EC 4.1.2.50) should be treated with skepticism and is likely incorrect. (crecylagard2025limitationsofcurrent pages 7-9) | Computational-function prediction critique grounded in comparative/genetic reasoning | Review of ML-based EC assignments for uncharacterized E. coli proteins | No direct assay for YrhB reported; critique notes that E. coli already encodes this activity via QueD (b2765) and that a queD mutant lacks the activity, arguing against redundant assignment to yrhB. (crecylagard2025limitationsofcurrent pages 7-9) | de Crécy-Lagard et al. 2025, bioRxiv preprint (version posted Oct 15, 2024; cited here as 2025), DOI/URL: https://doi.org/10.1101/2024.07.01.601547 | This is the clearest recent expert analysis touching yrhB, but it is a negative/critical annotation statement, not a direct experimental characterization of YrhB itself. |
| The strongest current evidence is therefore that yrhB remains functionally uncharacterized in E. coli K-12 despite detectable stress-responsive transcription. (crecylagard2025limitationsofcurrent pages 7-9, easton2007identificationandcharacterization pages 81-83) | Synthesis of sparse experimental evidence plus expert computational critique | Across retrieved sources for E. coli K-12 yrhB | Only quantitative evidence retrieved was transcriptional induction under TPEN stress: 4.3-fold, P = 2.75E-02. (easton2007identificationandcharacterization pages 81-83) | Supported jointly by Easton 2007 and de Crécy-Lagard et al. 2025; URL available for 2025 source: https://doi.org/10.1101/2024.07.01.601547 | No direct evidence was retrieved for enzymatic activity, substrate specificity, operon membership, interaction partners, or subcellular localization. |
| Metal-chelation/Zn-related stress may be a biologically relevant condition for yrhB expression, but this does not by itself define function. (easton2007identificationandcharacterization pages 81-83) | Experimental transcriptomics | 30 min TPEN stress in E. coli | Fold change 4.3, P = 2.75E-02. (easton2007identificationandcharacterization pages 81-83) | Easton 2007. | Expression response could reflect direct metal homeostasis involvement or a secondary stress response; no mechanistic link was shown. |
| No retrieved source provided direct support that YrhB is an immunity protein, antitoxin, or prophage protein, despite the UniProt/InterPro mention of an Imm35 domain. (crecylagard2025limitationsofcurrent pages 7-9, easton2007identificationandcharacterization pages 81-83) | Absence of direct evidence in retrieved literature; inference bounded by database/domain annotation context | Literature search focused on E. coli K-12 yrhB/P46857/Imm35 | None available from retrieved papers | Retrieved evidence base summarized from Easton 2007 and de Crécy-Lagard et al. 2025; URL available for 2025 source: https://doi.org/10.1101/2024.07.01.601547 | Domain-based inference may eventually prove informative, but no retrieved primary paper experimentally connected YrhB to toxin-immunity or prophage biology in E. coli K-12. |
Table: This table summarizes the limited evidence retrieved for E. coli K-12 yrhB (b3446/JW3411; UniProt P46857). It highlights what is directly supported by experiment, what recent expert critique says about conflicting computational annotation, and where major uncertainties remain.
References
(easton2007identificationandcharacterization pages 81-83): JA Easton. Identification and characterization of Zn(II)-responsive genes and proteins in E. coli. Unknown journal, 2007.
(crecylagard2025limitationsofcurrent pages 7-9): Valérie de Crécy-Lagard, Raquel Dias, Nick Sexson, Iddo Friedberg, Yifeng Yuan, and Manal A. Swairjo. Limitations of current machine-learning models in predicting enzymatic functions for uncharacterized proteins. bioRxiv, Jul 2025. URL: https://doi.org/10.1101/2024.07.01.601547, doi:10.1101/2024.07.01.601547.
(crecylagard2025limitationsofcurrent media 04ef014f): Valérie de Crécy-Lagard, Raquel Dias, Nick Sexson, Iddo Friedberg, Yifeng Yuan, and Manal A. Swairjo. Limitations of current machine-learning models in predicting enzymatic functions for uncharacterized proteins. bioRxiv, Jul 2025. URL: https://doi.org/10.1101/2024.07.01.601547, doi:10.1101/2024.07.01.601547.
(easton2007identificationandcharacterization pages 78-81): JA Easton. Identification and characterization of Zn(II)-responsive genes and proteins in E. coli. Unknown journal, 2007.
id: P46857
gene_symbol: yrhB
product_type: PROTEIN
status: COMPLETE
taxon:
id: NCBITaxon:83333
label: Escherichia coli (strain K12)
description: >-
YrhB is a small (94 aa, 10.6 kDa) uncharacterized protein in E. coli K12 encoded by b3446.
It belongs to the Imm35 (Immunity protein 35) family (Pfam PF15567 / InterPro IPR029082),
which was identified computationally as part of the polymorphic toxin system immunity
protein repertoire (Zhang et al. 2012, PMID:22731697). Imm35 is specifically associated
with the papain-like peptidase Tox-PL1, suggesting it functions as a peptidase inhibitor
(PMID:22731697). A study in BL21(DE3) reported chaperone-like activity for YrhB under
heat shock conditions (Ahn et al. 2012, PMID:22569261), though this has not been
independently confirmed in K12 and the primary evolved function is more likely related
to its Imm35 domain. Transcriptomic data show yrhB is upregulated 4.3-fold under TPEN
(zinc chelation) stress, suggesting a possible link to metal homeostasis.
The protein remains at UniProt evidence level PE 4 (Predicted).
Notably, DeepECTF (a deep learning enzyme function predictor) incorrectly predicted
EC 4.1.2.50 (6-carboxytetrahydropterin synthase) for YrhB (de Crécy-Lagard et al. 2025,
PMID:40703034). The authors classify this as a logic error: E. coli already encodes the
bona fide 6-carboxytetrahydropterin synthase as QueD (b2765), and a queD mutant lacks
this activity, which argues against a redundant enzyme such as YrhB.
tags:
- uncharacterized
- polymorphic-toxin-system
- ML-misannotation-case-study
existing_annotations:
# NOTE: The GOA file for yrhB (P46857) returned 0 annotations from QuickGO.
# This is consistent with UniProt PE level 4 (Predicted) and RecName "Uncharacterized protein YrhB".
# There are no existing GO annotations to review.
# Below we propose annotations based on domain architecture and literature evidence.
- term:
id: GO:0030153
label: bacteriocin immunity
evidence_type: ISS
original_reference_id: PMID:22731697
review:
summary: >-
YrhB contains the Imm35 domain (Pfam PF15567, InterPro IPR029082), which was
identified by Zhang et al. (2012) as an immunity protein family in polymorphic
toxin systems. Imm35 is specifically associated with the papain-like peptidase
Tox-PL1, suggesting it functions as a peptidase inhibitor. While not experimentally
validated for YrhB specifically, the domain assignment is robust and based on
comprehensive bioinformatic analysis of polymorphic toxin-immunity gene neighborhoods
across bacteria.
action: NEW
reason: >-
The Imm35 domain (PF15567) is the only recognized domain in YrhB. Zhang et al.
(2012) systematically characterized immunity protein families in bacterial
polymorphic toxin systems using comparative genomics, identifying Imm35 as
specifically associated with Tox-PL1 papain-like peptidase toxins. GO:0030153
(bacteriocin immunity) is the closest available GO biological process term for
this predicted function. This would be an ISS-level annotation based on sequence
similarity to characterized immunity protein families.
additional_reference_ids:
- PMID:22731697
supported_by:
- reference_id: PMID:22731697
supporting_text: >-
Imm35 is specifically associated only with the papain-like peptide Tox-PL1,
suggesting that it functions specifically as a peptidase inhibitor
- reference_id: file:ECOLI/yrhB/yrhB-deep-research-falcon.md
supporting_text: Falcon deep research found no primary literature validating
YrhB function beyond TPEN stress induction (4.3-fold) and the DeepECTF
misprediction critique. The Imm35 domain-based immunity protein annotation
remains the most informative functional assignment.
- term:
id: GO:0030414
label: peptidase inhibitor activity
evidence_type: ISS
original_reference_id: PMID:22731697
review:
summary: >-
Zhang et al. (2012) identified Imm35 as specifically associated with the
papain-like peptidase Tox-PL1, suggesting it functions as a peptidase inhibitor.
YrhB contains the Imm35 domain (PF15567), making peptidase inhibitor activity
the most likely molecular function.
action: NEW
reason: >-
Imm35 is specifically associated with Tox-PL1 papain-like peptidase toxins,
and Zhang et al. (2012) explicitly suggest it functions as a peptidase inhibitor.
GO:0030414 (peptidase inhibitor activity) captures this predicted molecular function.
supported_by:
- reference_id: PMID:22731697
supporting_text: >-
Imm35 is specifically associated only with the papain-like peptide Tox-PL1,
suggesting that it functions specifically as a peptidase inhibitor
- term:
id: GO:0005737
label: cytoplasm
evidence_type: IDA
original_reference_id: PMID:22569261
review:
summary: >-
Immunity proteins in polymorphic toxin systems are typically cytoplasmic, as they
must be present in the cytoplasm to protect the producing cell from auto-intoxication.
Ahn et al. (2012) identified YrhB as a soluble intracellular protein in BL21(DE3)
through systematic proteome-wide analyses.
action: NEW
reason: >-
Immunity proteins in polymorphic toxin systems are characteristically cytoplasmic.
Ahn et al. (2012, PMID:22569261) showed YrhB is a soluble intracellular protein
in BL21(DE3). Cytoplasmic localization is consistent with both the immunity protein
function and the experimental data.
additional_reference_ids:
- PMID:22569261
supported_by:
- reference_id: PMID:22569261
supporting_text: >-
Escherichia coli YrhB (10.6 kDa) from strain BL21(DE3) that is commonly used for
protein overexpression is a stable chaperone-like protein and indispensable for
supporting the growth of BL21(DE3) at 48 °C but not defined as conventional heat
shock protein (HSP)
references:
- id: PMID:22731697
title: >-
Polymorphic toxin systems: Comprehensive characterization of trafficking modes,
processing, mechanisms of action, immunity and ecology using comparative genomics.
findings:
- statement: >-
Imm35 (PF15567) was identified as an immunity protein family in bacterial
polymorphic toxin systems, specifically associated with the papain-like peptidase
Tox-PL1 toxin domain.
supporting_text: >-
Imm35 is specifically associated only with the papain-like peptide Tox-PL1,
suggesting that it functions specifically as a peptidase inhibitor
- statement: >-
Over 90 families of immunity proteins were identified in polymorphic toxin systems,
neutralizing between one and at least 27 distinct types of toxin domains.
supporting_text: >-
Over 90 families of immunity proteins might neutralize anywhere between a single
to at least 27 distinct types of toxin domains
- id: PMID:22569261
title: >-
YrhB is a highly stable small protein with unique chaperone-like activity in
Escherichia coli BL21(DE3).
findings:
- statement: >-
YrhB from E. coli BL21(DE3) showed chaperone-like activity: it prevented
heat-induced aggregation of PurK, promoted in vitro refolding of uridine
phosphorylase, and reduced inclusion body formation. YrhB was upregulated
only under heat shock. However, this was demonstrated in BL21(DE3), not K12.
supporting_text: >-
Escherichia coli YrhB (10.6 kDa) from strain BL21(DE3) that is commonly used for
protein overexpression is a stable chaperone-like protein and indispensable for
supporting the growth of BL21(DE3) at 48 °C but not defined as conventional heat
shock protein (HSP)
- id: DOI:10.1007/978-0-8176-4747-1
title: Identification and characterization of Zn(II)-responsive genes and proteins
in E. coli.
findings:
- statement: yrhB (b3446) is upregulated 4.3-fold (P=2.75e-02) under TPEN
(zinc chelation) stress after 30 minutes, suggesting a possible link to
metal homeostasis or stress response.
supporting_text: yrhB b3446 up-regulated under TPEN stress with mean fold
change 4.3 and P = 2.75e-02
- id: PMID:40703034
title: >-
Limitations of current machine learning models in predicting enzymatic functions
for uncharacterized proteins.
findings:
- statement: >-
DeepECTF incorrectly predicted EC 4.1.2.50 (6-carboxytetrahydropterin synthase)
for YrhB. The authors classify this as a logic error because E. coli already
encodes this enzyme as QueD (b2765), and a queD mutant lacks the activity.
supporting_text: >-
YrhB/b3446 is predicted to be a 6-carboxytetrahydropterin synthase (EC 4.1.2.50),
but E. coli already encodes this enzyme (QueD/b2765) and a queD mutant lacks
this activity (Zallot et al. 2017)
- statement: >-
This exemplifies how ML models can ignore existing gene-function assignments
in the organism, leading to logically impossible predictions.
supporting_text: >-
current ML methods not only mostly fail to make novel predictions but also make
basic logic errors in their predictions that human annotators avoid by leveraging
the available knowledge base
- id: PMID:9278503
title: The complete genome sequence of Escherichia coli K-12.
findings:
- statement: yrhB (b3446) was identified as a protein-coding gene during sequencing of the E. coli K12 genome.
supporting_text: >-
Of 4288 protein-coding genes annotated, 38 percent have no attributed function
core_functions:
- description: >-
Predicted immunity protein in polymorphic toxin system. YrhB contains the Imm35
domain (PF15567), a computationally identified immunity protein family that is
specifically associated with the papain-like peptidase Tox-PL1, suggesting it
functions as a peptidase inhibitor. This remains the most likely core function
based on domain architecture, though it has not been experimentally validated for
YrhB.
molecular_function:
id: GO:0030414
label: peptidase inhibitor activity
directly_involved_in:
- id: GO:0030153
label: bacteriocin immunity
locations:
- id: GO:0005737
label: cytoplasm
supported_by:
- reference_id: PMID:22731697
supporting_text: >-
Imm35 is specifically associated only with the papain-like peptide Tox-PL1,
suggesting that it functions specifically as a peptidase inhibitor
proposed_new_terms: []
suggested_questions:
- question: >-
What is the cognate toxin for YrhB/Imm35 in E. coli K12? Is there a Tox-PL1-type
toxin gene in the genomic neighborhood of yrhB (b3446)?
- question: >-
Is the chaperone-like activity reported by Ahn et al. (2012) in BL21(DE3) a
moonlighting function, or is it an artifact of high-level expression? Does K12
YrhB show the same activity?
- question: >-
Has the DeepECTF misprediction of EC 4.1.2.50 for YrhB been propagated into any
databases?
suggested_experiments:
- description: >-
Test whether yrhB deletion in K12 affects susceptibility to polymorphic toxins
from competing strains, particularly those encoding Tox-PL1-type toxin domains.
hypothesis: >-
If YrhB functions as an Imm35 immunity protein, a yrhB deletion mutant should
be more susceptible to Tox-PL1 papain-like peptidase toxins from competing bacteria.
- description: >-
Examine the genomic neighborhood of yrhB (b3446) for adjacent toxin-encoding genes
to identify the cognate toxin.
hypothesis: >-
Polymorphic toxin immunity genes are typically found immediately downstream of
their cognate toxin gene.
- description: >-
Replicate the chaperone-like activity assays from Ahn et al. (2012) using purified
K12 YrhB to determine if this is strain-specific to BL21(DE3).
hypothesis: >-
The chaperone-like activity may be a general property of YrhB or may be specific
to BL21(DE3) expression conditions.