celX

UniProt ID: P15329
Organism: Acetivibrio thermocellus
Review Status: DRAFT

Gene Description

CelX is a cellulosome-associated protein from Acetivibrio thermocellus that contains an SGNH hydrolase domain and a type I dockerin domain. Despite being annotated as a "putative endoglucanase," the protein's domain architecture strongly suggests esterase rather than cellulase activity. The SGNH hydrolase fold (InterPro: IPR013830, Pfam: Lipase_GDSL_2) is characteristic of serine esterases/lipases, not glycoside hydrolases. The dockerin domain enables attachment to the cellulosome scaffoldin, suggesting a role in lignocellulose degradation, but as an accessory esterase (possibly a feruloyl esterase or acetylxylan esterase) rather than a true cellulase. The "cellulase" annotation appears to be a historical misannotation based on genomic context (proximity to celE) rather than biochemical characterization of CelX itself. The original paper (PMID:3066698) primarily characterizes CelE, mentioning celX only as a secondary ORF identified in the upstream region.
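
The reasoning in this description can be sketched as a small consistency check: a molecular-function term is treated as supported only when a catalytic domain with matching bond chemistry is present. This is a minimal illustration under stated assumptions, not a curation tool. The `Domain` class, the `chemistry` labels, and the `supports` helper are hypothetical names introduced here; the residue ranges are the ones quoted in this review (SGNH fold 9-149 from PDB:2VPT, dockerin 162-224).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Domain:
    name: str
    start: int
    end: int
    catalytic: bool
    # Hypothetical label for the bond type the domain is expected to cleave.
    chemistry: Optional[str] = None


# Domain architecture as described in this review (illustrative encoding).
CELX_DOMAINS = [
    Domain("SGNH hydrolase", 9, 149, catalytic=True, chemistry="ester"),
    Domain("Dockerin type I", 162, 224, catalytic=False),
]


def supports(domains, chemistry):
    """True if any catalytic domain matches the requested bond chemistry."""
    return any(d.catalytic and d.chemistry == chemistry for d in domains)


print("ester hydrolysis supported:", supports(CELX_DOMAINS, "ester"))
print("glycosidic hydrolysis supported:", supports(CELX_DOMAINS, "glycosidic"))
```

Under this encoding the dockerin never contributes to a catalytic call, which mirrors the argument that dockerin homology alone cannot justify a glycoside hydrolase annotation.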

Existing Annotations Review

GO Term Evidence Action Reason
GO:0000272 polysaccharide catabolic process
IEA
GO_REF:0000120
KEEP AS NON CORE
Summary: This annotation is based on the presence of the dockerin domain (InterPro: IPR002105, IPR016134, IPR036439) which is associated with cellulosome components. While CelX is likely part of the cellulosome complex and therefore participates in polysaccharide degradation, the role would be as an accessory esterase (removing ester-linked substituents from polysaccharides) rather than directly cleaving glycosidic bonds. The annotation is acceptable as a broad descriptor of the biological context but may be an over-annotation since it implies direct polysaccharide backbone cleavage.
Reason: The dockerin domain indicates cellulosome association, suggesting involvement in plant cell wall degradation. However, based on the SGNH hydrolase catalytic domain, the protein likely functions as an accessory esterase that assists in polysaccharide degradation by removing ester-linked side groups rather than cleaving the polysaccharide backbone directly. This is a peripheral rather than core function description.
GO:0004553 hydrolase activity, hydrolyzing O-glycosyl compounds
IEA
GO_REF:0000002
REMOVE
Summary: This annotation is derived from InterPro:IPR002105 (Dockerin type I repeat), which is a non-catalytic domain for cellulosome attachment. The dockerin domain does not confer glycoside hydrolase activity; it is a protein-protein interaction module that binds to cohesin domains on the scaffoldin. This annotation is a clear example of guilt-by-association: because dockerin domains are found in cellulases, the presence of dockerin led to this glycoside hydrolase annotation, even though the catalytic domain (SGNH hydrolase) does not hydrolyze O-glycosyl compounds.
Reason: The SGNH hydrolase domain (Pfam: Lipase_GDSL_2, InterPro: IPR013830) present in CelX is characteristic of serine esterases, not glycoside hydrolases. SGNH hydrolases catalyze ester bond cleavage using a Ser-His-Asp catalytic triad, while glycoside hydrolases use different mechanisms (retaining/inverting) and have distinct catalytic residues. The annotation is based on the dockerin domain which only indicates cellulosome association, not enzymatic activity. The original publication (PMID:3066698) does not provide experimental evidence for glycoside hydrolase activity of CelX.
Supporting Evidence:
PMID:3066698
A second ORF which ends 349 bp 5' to the GTG start codon of the celE gene has also been identified. The encoded product contains a C terminus homologous to other C. thermocellum endoglucanases. [Note: This only refers to the dockerin domain homology, not the catalytic domain]
GO:0004622 phosphatidylcholine lysophospholipase A1 activity
IEA
GO_REF:0000118
MODIFY
Summary: This annotation is derived from TreeGrafter phylogenetic inference based on the SGNH hydrolase domain (PANTHER family PTN002411393 - lysophospholipase L1 family). While the domain architecture is consistent with the SGNH hydrolase superfamily that includes lysophospholipases, this specific activity annotation is likely too specific for a bacterial cellulosome component. SGNH hydrolases in cellulosome contexts typically function as carbohydrate esterases (feruloyl esterases, acetylxylan esterases) rather than phospholipid-cleaving enzymes.
Reason: The SGNH hydrolase domain correctly identifies the enzyme family, but lysophospholipase activity is unlikely for a cellulosome-associated enzyme. The biological context (cellulosome, plant cell wall degradation) strongly suggests this enzyme functions as a carbohydrate esterase involved in lignocellulose degradation. SGNH hydrolases in this context typically remove ester-linked substituents (ferulic acid, acetyl groups) from plant cell wall polysaccharides. A more appropriate annotation would reflect general esterase activity or, more specifically, carbohydrate esterase activity.
GO:0008810 cellulase activity
IEA
GO_REF:0000003
REMOVE
Summary: This annotation is based on EC:3.2.1.4 mapping. However, the EC number assignment appears to be erroneous. The protein's catalytic domain is an SGNH hydrolase (serine esterase fold), not a glycoside hydrolase fold. True cellulases (EC 3.2.1.4) belong to glycoside hydrolase families (GH5, GH6, GH7, GH8, GH9, GH12, GH44, GH45, GH48, etc.) and have completely different structural folds and catalytic mechanisms from SGNH hydrolases. The "putative endoglucanase" annotation in UniProt appears to be based on genomic context and dockerin domain presence rather than biochemical characterization or sequence analysis of the catalytic domain.
Reason: The crystal structure (PDB:2VPT, 1.40 Å resolution) shows an SGNH hydrolase fold for residues 9-149. SGNH hydrolases are serine esterases with a catalytic mechanism involving a Ser-His-Asp triad and do not possess the active site architecture required for glycosidic bond cleavage. The cellulase annotation is therefore contradicted by the solved crystal structure. The original name "celX" and "putative endoglucanase" designation appear to be historical artifacts from the genomic context of discovery (adjacent to celE) rather than functional characterization. PMID:3066698 does not provide experimental evidence for cellulase activity of the celX gene product.
Supporting Evidence:
PMID:3066698
The complete nucleotide sequence of the Clostridium thermocellum celE gene, coding for an endo-beta-1,4-glucanase (endoglucanase E; EGE) with xylan-hydrolysing activity has been determined. [Note: This describes CelE, not CelX. The paper only mentions celX as a separate ORF upstream.]
GO:0016787 hydrolase activity
IEA
GO_REF:0000043
ACCEPT
Summary: This broad annotation based on UniProtKB keyword KW-0378 (Hydrolase) is correct. The SGNH hydrolase domain is indeed a hydrolase that catalyzes ester bond hydrolysis. This is the most appropriate of the molecular function annotations for this protein.
Reason: The SGNH hydrolase domain definitively encodes hydrolase activity. The broad term is appropriate given uncertainty about the specific substrate. This correctly captures the enzymatic function without the problematic over-specificity of the glycoside hydrolase or lysophospholipase annotations.
GO:0016798 hydrolase activity, acting on glycosyl bonds
IEA
GO_REF:0000043
REMOVE
Summary: This annotation is based on UniProtKB keyword KW-0326 (Glycosidase), which is incorrectly applied to this protein. The SGNH hydrolase fold is characteristic of esterases, not glycosidases. The keyword appears to be propagated from the erroneous "cellulase/endoglucanase" annotation rather than from structural or sequence evidence of glycosidase activity.
Reason: The protein contains an SGNH hydrolase domain (serine esterase superfamily), not a glycoside hydrolase domain. The catalytic mechanism of SGNH hydrolases involves ester bond cleavage, not glycosidic bond cleavage. The crystal structure (PDB:2VPT) confirms the SGNH fold. This annotation should be removed as it is structurally and mechanistically incorrect.
GO:0030245 cellulose catabolic process
IEA
GO_REF:0000043
MODIFY
Summary: This annotation is based on UniProtKB keyword KW-0136 (Cellulose degradation). While the protein is likely part of the cellulosome complex and may contribute to overall cellulose degradation through removal of ester-linked substituents that impede access to cellulose, it does not directly catabolize cellulose (break down the cellulose polymer). This annotation conflates participation in a cellulose-degrading system with direct cellulose catabolism.
Reason: CelX likely functions as an accessory esterase in lignocellulose degradation rather than a true cellulase. SGNH hydrolases in cellulosome contexts typically remove ester-linked ferulic acid or acetyl groups from hemicellulose, which facilitates access to cellulose but does not constitute cellulose catabolism. A more accurate annotation would reflect involvement in plant cell wall degradation or, more specifically, hemicellulose modification.
Proposed replacements: polysaccharide catabolic process
GO:0043263 cellulosome
IEA
file:ACET2/P15329/P15329-deep-research-falcon.md
NEW
Summary: The presence of a type I dockerin domain (aa 162-224) strongly indicates that CelX is a component of the cellulosome, the extracellular multi-enzyme complex that degrades plant cell walls. Of all the annotations reviewed here, this one is best supported by the domain architecture. The deep research confirms that dockerin-bearing enzymes are secreted and assembled on scaffoldins via cohesin-dockerin binding (file:ACET2/P15329/P15329-deep-research-falcon.md).
Reason: The dockerin domain (InterPro: IPR002105, IPR016134, IPR036439; PROSITE: PS00448, PS51766) is the signature module for cellulosomal proteins. It binds to cohesin domains on the scaffoldin protein, anchoring CelX within the cellulosome complex. This cellular component annotation is strongly supported by the domain architecture.
Supporting Evidence:
file:ACET2/P15329/P15329-deep-research-falcon.md
Dockerin‑bearing CAZymes are secreted and assembled on non‑catalytic scaffoldins via cohesin–dockerin binding; CBMs tether the complex to cellulose... CipA (primary scaffoldin) organizes multiple type‑I cohesins for CAZyme recruitment
GO:0016788 hydrolase activity, acting on ester bonds
IEA
file:ACET2/P15329/P15329-deep-research-falcon.md
NEW
Summary: Based on the SGNH hydrolase domain (Pfam: Lipase_GDSL_2, InterPro: IPR013830, IPR051532), CelX is predicted to have esterase activity. The UniProt domain annotation includes Ester_Hydrolysis_Enzymes (IPR051532), which is characteristic of the SGNH hydrolase superfamily.
Reason: The SGNH hydrolase superfamily (Gene3D: 3.40.50.1110, SUPFAM: SSF52266) consists of serine esterases/lipases that hydrolyze ester bonds. The catalytic domain structure (residues 9-149, solved at 1.40 Å in PDB:2VPT) confirms this fold. In the context of the cellulosome, this esterase activity likely targets ester-linked substituents on plant cell wall polysaccharides.
Supporting Evidence:
file:ACET2/P15329/P15329-deep-research-falcon.md
Key Domains: Dockerin_1_rpt. (IPR002105); Dockerin_dom. (IPR016134); Dockerin_dom_sf. (IPR036439); EF_Hand_1_Ca_BS. (IPR018247); Ester_Hydrolysis_Enzymes. (IPR051532)
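
The review decisions listed above can be collected into a small table and tallied. The sketch below simply transcribes the GO IDs, labels, and actions from this section; the `REVIEW` list and variable names are illustrative and not part of any curation pipeline.

```python
from collections import Counter

# (GO ID, label, action) transcribed from the annotation review above.
REVIEW = [
    ("GO:0000272", "polysaccharide catabolic process", "KEEP AS NON CORE"),
    ("GO:0004553", "hydrolase activity, hydrolyzing O-glycosyl compounds", "REMOVE"),
    ("GO:0004622", "phosphatidylcholine lysophospholipase A1 activity", "MODIFY"),
    ("GO:0008810", "cellulase activity", "REMOVE"),
    ("GO:0016787", "hydrolase activity", "ACCEPT"),
    ("GO:0016798", "hydrolase activity, acting on glycosyl bonds", "REMOVE"),
    ("GO:0030245", "cellulose catabolic process", "MODIFY"),
    ("GO:0043263", "cellulosome", "NEW"),
    ("GO:0016788", "hydrolase activity, acting on ester bonds", "NEW"),
]

# Count how many annotations fall under each action.
counts = Counter(action for _, _, action in REVIEW)

# Terms flagged for removal; all three imply glycosidic bond cleavage.
to_remove = [go_id for go_id, _, action in REVIEW if action == "REMOVE"]

for action, n in counts.most_common():
    print(f"{action}: {n}")
print("Flagged for removal:", ", ".join(to_remove))
```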

Core Functions

CelX functions as a cellulosome-associated esterase that likely removes ester-linked substituents from plant cell wall polysaccharides, facilitating lignocellulose degradation by the cellulosome complex. The dockerin domain anchors the enzyme to the scaffoldin, while the SGNH hydrolase domain provides the catalytic esterase activity.

Cellular Locations:

References

file:ACET2/P15329/P15329-deep-research-falcon.md
Deep research review of CelX (P15329) from Acetivibrio thermocellus
  • The protein contains dockerin repeats and EF-hand Ca2+-binding motifs characteristic of cellulosomal enzymes, as well as the Ester_Hydrolysis_Enzymes domain (IPR051532).
    "Key Domains: Dockerin_1_rpt. (IPR002105); Dockerin_dom. (IPR016134); Dockerin_dom_sf. (IPR036439); EF_Hand_1_Ca_BS. (IPR018247); Ester_Hydrolysis_Enzymes. (IPR051532)"
  • No experimental paper was found that explicitly characterizes CelX/EGX in C. thermocellum. The protein function is inferred from domain architecture.
    "No explicit primary literature found that experimentally characterizes a protein named "celX" or "endoglucanase X (EGX)" in C. thermocellum; annotation appears based on sequence/domain evidence and database curation"
  • Dockerin-bearing enzymes are secreted and assembled on scaffoldins via Ca2+-dependent cohesin-dockerin binding.
    "Dockerin‑bearing CAZymes are secreted and assembled on non‑catalytic scaffoldins via cohesin–dockerin binding; CBMs tether the complex to cellulose... CipA (primary scaffoldin) organizes multiple type‑I cohesins for CAZyme recruitment"
PMID:3066698
Conserved reiterated domains in Clostridium thermocellum endoglucanases are not essential for catalytic activity.
  • CelX was identified as a secondary ORF in the genomic region encoding CelE, with a C-terminus homologous to other C. thermocellum endoglucanases (the dockerin domain).
    "A second ORF which ends 349 bp 5' to the GTG start codon of the celE gene has also been identified. The encoded product contains a C terminus homologous to other C. thermocellum endoglucanases."
  • The paper primarily characterizes CelE (endoglucanase E with xylanase activity), not CelX. No experimental characterization of CelX enzymatic activity is provided.
    "The complete nucleotide sequence of the Clostridium thermocellum celE gene, coding for an endo-beta-1,4-glucanase (endoglucanase E; EGE) with xylan-hydrolysing activity has been determined."
GO_REF:0000002
Gene Ontology annotation through association of InterPro records with GO terms
GO_REF:0000003
Gene Ontology annotation based on Enzyme Commission mapping
GO_REF:0000043
Gene Ontology annotation based on UniProtKB/Swiss-Prot keyword mapping
GO_REF:0000118
TreeGrafter-generated GO annotations
GO_REF:0000120
Combined Automated Annotation using Multiple IEA Methods

Suggested Questions for Experts

Q: What is the actual enzymatic activity of CelX? Is it a feruloyl esterase, acetylxylan esterase, or does it have some other esterase substrate specificity?

Suggested experts: Gilbert HJ, Hazlewood GP

Q: Has CelX been biochemically characterized with purified protein? The current annotations appear to be based on genomic context and domain predictions rather than experimental evidence.

Suggested Experiments

Experiment: Express and purify recombinant CelX (without the dockerin domain if stability is an issue) and test for activity against: (1) generic esterase substrates (p-nitrophenyl acetate), (2) cellulase substrates (CMC, filter paper, cellooligosaccharides), and (3) hemicellulose-associated ester substrates (methyl ferulate, acetylated xylan). The crystal structure strongly predicts esterase activity and absence of cellulase activity.

Hypothesis: CelX has esterase activity rather than cellulase activity

Type: biochemical assay
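
For the p-nitrophenyl acetate arm of this experiment, specific activity is usually derived from the rate of absorbance increase via the Beer-Lambert law. The sketch below shows that arithmetic only. The function name and all example values are hypothetical, and the extinction coefficient of 18,000 M^-1 cm^-1 is an assumed figure for p-nitrophenolate near 405 nm under alkaline conditions; the real value depends on pH and wavelength and should be taken from a standard curve.

```python
def specific_activity_u_per_mg(delta_abs_per_min, epsilon_m_cm, path_cm,
                               volume_ml, protein_mg):
    """Specific activity in U/mg, where 1 U = 1 umol product per minute.

    delta_abs_per_min : blank-corrected absorbance change per minute
    epsilon_m_cm      : molar extinction coefficient (M^-1 cm^-1), assumed
    path_cm           : optical path length in cm
    volume_ml         : reaction volume in mL
    protein_mg        : enzyme mass in the reaction in mg
    """
    # Beer-Lambert: concentration change (mol/L per min) = dA / (epsilon * l)
    rate_molar_per_min = delta_abs_per_min / (epsilon_m_cm * path_cm)
    # Convert to umol/min in the actual reaction volume.
    umol_per_min = rate_molar_per_min * (volume_ml / 1000.0) * 1e6
    return umol_per_min / protein_mg


# Hypothetical readings: 0.18 A/min, 1 cm path, 1 mL reaction, 1 ug enzyme.
activity = specific_activity_u_per_mg(0.18, 18000.0, 1.0, 1.0, 0.001)
print(f"{activity:.2f} U/mg")
```

A no-enzyme blank should be subtracted before this calculation, because p-nitrophenyl acetate hydrolyzes spontaneously in aqueous buffer.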

Experiment: Test binding of CelX dockerin domain to recombinant cohesin domains from A. thermocellus scaffoldin using isothermal titration calorimetry or surface plasmon resonance. This would confirm cellulosome association.

Hypothesis: CelX is incorporated into the cellulosome via its dockerin domain

Type: protein-protein interaction assay
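
When planning such a binding experiment, it can help to estimate how much complex would form at given concentrations. The sketch below uses the exact solution of a simple 1:1 binding model. It is illustrative only: the function names are hypothetical, the 1 nM dissociation constant is an assumed placeholder rather than a measured value for CelX, and real ITC or SPR data would be fitted with the instrument's own models.

```python
import math


def complex_concentration(p_total, l_total, kd):
    """Concentration of a 1:1 complex (same units as the inputs).

    Solves [PL]^2 - (P + L + Kd)[PL] + P*L = 0 for the physical root.
    """
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / 2.0


def fraction_bound(p_total, l_total, kd):
    """Fraction of P (e.g. dockerin) that is in complex with L (e.g. cohesin)."""
    return complex_concentration(p_total, l_total, kd) / p_total


# Hypothetical conditions: 1 uM dockerin, 1 uM cohesin, assumed Kd = 1 nM.
f = fraction_bound(1e-6, 1e-6, 1e-9)
print(f"fraction of dockerin bound: {f:.3f}")
```

With tight binding of this kind, nearly all protein is complexed at micromolar concentrations, which is why titrations near or below the expected Kd are more informative for affinity estimates.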

📚 Additional Documentation

Deep Research Falcon

(P15329-deep-research-falcon.md)

provider: falcon
model: Edison Scientific Literature
cached: false
start_time: '2025-12-26T17:03:23.906508'
end_time: '2025-12-26T17:13:10.756954'
duration_seconds: 586.85
template_file: templates/gene_research_go_focused.md
template_variables:
  organism: ACET2
  gene_id: P15329
  gene_symbol: celX
  uniprot_accession: P15329
  protein_description: 'RecName: Full=Putative endoglucanase X; Short=EGX; EC=3.2.1.4;
    AltName: Full=Cellulase; AltName: Full=Endo-1,4-beta-glucanase; Flags: Fragment;'
  gene_info: Name=celX;
  organism_full: Acetivibrio thermocellus (Hungateiclostridium thermocellum) (Clostridium
    thermocellum).
  protein_family: Not specified in UniProt
  protein_domains: Dockerin_1_rpt. (IPR002105); Dockerin_dom. (IPR016134); Dockerin_dom_sf.
    (IPR036439); EF_Hand_1_Ca_BS. (IPR018247); Ester_Hydrolysis_Enzymes. (IPR051532)
provider_config:
  timeout: 600
  max_retries: 3
  parameters:
    allowed_domains: []
    temperature: 0.1
citation_count: 18


Question

Gene Research for Functional Annotation

⚠️ CRITICAL: Gene/Protein Identification Context

BEFORE YOU BEGIN RESEARCH: You MUST verify you are researching the CORRECT gene/protein. Gene symbols can be ambiguous, especially for less well-characterized genes from non-model organisms.

Target Gene/Protein Identity (from UniProt):

  • UniProt Accession: P15329
  • Protein Description: RecName: Full=Putative endoglucanase X; Short=EGX; EC=3.2.1.4; AltName: Full=Cellulase; AltName: Full=Endo-1,4-beta-glucanase; Flags: Fragment;
  • Gene Information: Name=celX;
  • Organism (full): Acetivibrio thermocellus (Hungateiclostridium thermocellum) (Clostridium thermocellum).
  • Protein Family: Not specified in UniProt
  • Key Domains: Dockerin_1_rpt. (IPR002105); Dockerin_dom. (IPR016134); Dockerin_dom_sf. (IPR036439); EF_Hand_1_Ca_BS. (IPR018247); Ester_Hydrolysis_Enzymes. (IPR051532)

MANDATORY VERIFICATION STEPS:

  1. Check if the gene symbol "celX" matches the protein description above
  2. Verify the organism is correct: Acetivibrio thermocellus (Hungateiclostridium thermocellum) (Clostridium thermocellum).
  3. Check if protein family/domains align with what you find in literature
  4. If you find literature for a DIFFERENT gene with the same or similar symbol, STOP

If Gene Symbol is Ambiguous or You Cannot Find Relevant Literature:

DO NOT PROCEED WITH RESEARCH ON A DIFFERENT GENE. Instead:
- State clearly: "The gene symbol 'celX' is ambiguous or literature is limited for this specific protein"
- Explain what you found (e.g., "Found extensive literature on a different gene with the same symbol in a different organism")
- Describe the protein based ONLY on the UniProt information provided above
- Suggest that the protein function can be inferred from domain/family information

Research Target:

Please provide a comprehensive research report on the gene celX (gene ID: P15329, UniProt: P15329) in ACET2.

The research report should be a detailed narrative explaining the function, biological processes, and localization of the gene product. Citations should be given for all claims.

You should prioritize authoritative reviews and primary scientific literature when conducting research. You can supplement this with annotations you find in gene/protein databases, but these can be outdated or inaccurate.

We are specifically interested in the primary function of the gene - for enzymes, what reaction is catalyzed, and what is the substrate specificity? For transporters, what is the substrate? For structural proteins or adapters, what is the broader structural role? For signaling molecules, what is the role in the pathway.

We are interested in where in or outside the cell the gene product carries out its function.

We are also interested in the signaling or biochemical pathways in which the gene functions. We are less interested in broad pleiotropic effects, except where these elucidate the precise role.

Include evidence where possible. We are interested in both experimental evidence as well as inference from structure, evolution, or bioinformatic analysis. Precise studies should be prioritized over high-throughput, where available.

Output

Question: You are an expert researcher providing comprehensive, well-cited information.

Provide detailed information focusing on:
1. Key concepts and definitions with current understanding
2. Recent developments and latest research (prioritize 2023-2024 sources)
3. Current applications and real-world implementations
4. Expert opinions and analysis from authoritative sources
5. Relevant statistics and data from recent studies

Format as a comprehensive research report with proper citations. Include URLs and publication dates where available.
Always prioritize recent, authoritative sources and provide specific citations for all major claims.

Plan
- Objective 1 (complete): Verify target identity and domains (UniProt P15329; celX; Acetivibrio/Hungateiclostridium thermocellum). Result: Identity matches the user-provided UniProt entry. Sequence is annotated as a fragment with dockerin repeats and EF‑hand Ca2+ motifs; no primary paper explicitly naming celX/EGX in this organism was found; downstream inferences rely on domain context and the C. thermocellum cellulosome literature (gold2007proteomicanalysisof pages 26-31, hsin2024lignocellulosedegradationin pages 11-15).
- Objective 2 (complete): Assemble recent and foundational sources on C. thermocellum cellulosomal enzymes, dockerin–cohesin interactions, localization, and activities, emphasizing 2023–2024 where available. Result: Core mechanistic/quantitative evidence from comparative biochemistry (Leis 2017), proteomics/architecture (Gold 2007), exoglucanase Cel48S native features (Liu 2018), and a 2024 review synthesizing modern perspectives and statistics (Hsin 2024) (leis2017comparativecharacterizationof pages 7-8, leis2017comparativecharacterizationof pages 8-10, leis2017comparativecharacterizationof pages 5-7, leis2017comparativecharacterizationof pages 12-13, leis2017comparativecharacterizationof pages 1-2, leis2017comparativecharacterizationof pages 10-12, gold2007proteomicanalysisof pages 26-31, hsin2024lignocellulosedegradationin pages 11-15).
- Objective 3: Extract functional, substrate, localization, and pathway evidence. Incorporated below with citations.
- Objective 4: Synthesize recent developments, applications, expert opinions, and quantitative data/statistics. Incorporated below.
- Objective 5: Create structured artifact summarizing P15329. Embedded below.
- Objective 6: Deliver the comprehensive research report with citations and URLs. Completed here.

Research report: celX (UniProt P15329) in Acetivibrio thermocellus (Clostridium thermocellum)

CRITICAL verification and scope
- Gene/protein identity: P15329 corresponds to celX, annotated as “putative endoglucanase X (EGX)” from Acetivibrio thermocellus (syn. Clostridium thermocellum). UniProt flags it as a fragment and lists dockerin repeats and EF‑hand Ca2+ binding signatures, consistent with a cellulosomal enzyme module (gold2007proteomicanalysisof pages 26-31). Literature searches did not identify primary experimental characterization of a C. thermocellum enzyme explicitly named “celX” or “endoglucanase X (EGX)”. Therefore, major functional statements specific to P15329 are inferred from its domain architecture and from experimentally validated paradigms for dockerin-bearing C. thermocellum cellulases; where possible, we cite quantitative data from characterized homologous cellulosomal enzymes (hsin2024lignocellulosedegradationin pages 11-15, leis2017comparativecharacterizationof pages 7-8).

Key concepts and current definitions
- Cellulosome architecture and roles: C. thermocellum deploys a large extracellular multi-enzyme complex (cellulosome) anchored to the cell surface that deconstructs plant cell wall polysaccharides. Catalytic subunits (glycoside hydrolases and others) bear dockerin modules that bind cohesins on scaffoldins (e.g., CipA). Scaffoldins are in turn linked to the cell envelope via SLH-containing anchoring proteins; the complex attaches to cellulose via CBMs (gold2007proteomicanalysisof pages 26-31, hsin2024lignocellulosedegradationin pages 11-15). Cohesin–dockerin interactions are strong, type- and species-specific, and central to assembly (gold2007proteomicanalysisof pages 26-31). These interactions are Ca2+-dependent and mediated by EF‑hand-like motifs within dockerins (gold2007proteomicanalysisof pages 26-31).
- Enzyme classes and hydrolysis modes: Comprehensive analysis of C. thermocellum cellulosomal cellulases identifies four modes: (i) exo‑acting cellobiohydrolases (CBHs), (ii) non‑processive endoglucanases (EGs), (iii) processive EGs that transiently generate cellotetraose (pEG4), and (iv) processive EGs that predominantly release cellobiose (pEG2) (leis2017comparativecharacterizationof pages 1-2). GH family mapping shows GH48 and some GH9 as exo‑acting; GH9 and some GH5 can be processive EGs; GH5, GH8, and subsets of GH9 populate non‑processive EGs (leis2017comparativecharacterizationof pages 2-4, leis2017comparativecharacterizationof pages 12-13). Product‑pattern diversity across these classes is essential for high complex activity (leis2017comparativecharacterizationof pages 1-2).

Function and substrate specificity of cellulosomal endoglucanases (context for celX)
- Non‑processive endoglucanases (EG): Random internal cleavage on amorphous cellulose/soluble β‑glucans; produce varied oligomers including DP≥5. High activities often seen on CMC and mixed‑linkage β‑glucan (e.g., GH5 and GH8 representatives) (leis2017comparativecharacterizationof pages 5-7, leis2017comparativecharacterizationof pages 12-13). After extensive hydrolysis, final products are enriched for cellobiose and cellotriose (leis2017comparativecharacterizationof pages 7-8).
- Processive endoglucanases producing cellotetraose (pEG4): Typically GH9 enzymes with CBM3c; initiate endo‑cleavage then proceed directionally, releasing DP4 intermediates en route to shorter products. Removing CBM3c converts many pEG4s into non‑processive EGs, underscoring CBM‑guided processivity (leis2017comparativecharacterizationof pages 12-13).
- Processive endoglucanases producing cellobiose (pEG2): Exemplified by GH5 subfamily 1 members and some GH9, which favor cellobiose as major product with minor cellotriose. Aromatic residue patterns at −3/−4 subsites correlate with endo vs exo behavior and product size in GH9s (leis2017comparativecharacterizationof pages 10-12, leis2017comparativecharacterizationof pages 5-7).
- Exo‑acting CBHs: Work from cellulose chain ends; poor activity on modified soluble substrates (CMC, β‑glucan) but strong on crystalline or amorphous native cellulose, releasing cellobiose specifically (e.g., Cel48S, Cbh9A, Cel9K) (leis2017comparativecharacterizationof pages 12-13, leis2017comparativecharacterizationof pages 5-7).
- Quantitative exemplars: Comparative activities (normalized assays at 60 °C, pH ~5.8) illustrate broad ranges and substrate preferences. For instance, Leis et al. report enzyme‑specific activity profiles across β‑glucan, CMC, PASC, and Avicel; exoglucanase Cel48S exhibits minimal activity on β‑glucan/CMC and modest on PASC, consistent with CBH behavior, while selected GH5/GH9 EGs show high CMC/β‑glucan activities (leis2017comparativecharacterizationof pages 7-8, leis2017comparativecharacterizationof pages 5-7).

Recent developments and latest research (with emphasis on 2023–2024)
- Modern synthesis of cellulosome biology: A 2024 review emphasizes that A. thermocellum cellulosomes can include multiple scaffoldins and on the order of tens of catalytic subunits (up to ~63 reported), that cellulosomes can accelerate hydrolysis up to ~50‑fold over free enzymes, and that disrupting cellulosome formation can decrease activity by ~15‑fold. It also highlights dynamic, substrate‑responsive enzyme composition and inter‑module specificity with some promiscuity within type‑specific interactions (Nov 2024; bioRxiv preprint) (hsin2024lignocellulosedegradationin pages 11-15). These findings reinforce the importance of combining distinct EG modes for optimal activity.
- Designer/minicellulosomes: Comparative reconstitution on recombinant CipA (or CipA8) demonstrates that including representatives of all four cellulase types (CBH, EG, pEG4, pEG2) increases activity and avoids functional jamming; a nonavalent complex containing multiple EG types achieved roughly half the activity of the native cellulosome on microcrystalline cellulose, suggesting additional components/organization contribute to native efficiency (Leis 2017; Oct 2017; https://doi.org/10.1186/s13068-017-0928-4) (leis2017comparativecharacterizationof pages 1-2, leis2017comparativecharacterizationof pages 10-12).
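
The four hydrolysis modes described in this report can be written down as a lookup, together with a check for which modes an enzyme mix lacks. This is a minimal sketch: the `MODES` table and `missing_modes` helper are hypothetical names, the descriptions paraphrase the report text, and the example mix uses only the two enzymes the report itself names as exo-acting (Cel48S, Cel9K).

```python
# Hydrolysis modes as summarized in the report (paraphrased descriptions).
MODES = {
    "CBH": "exo-acting cellobiohydrolase; releases cellobiose from chain ends",
    "EG": "non-processive endoglucanase; random internal cleavage",
    "pEG4": "processive endoglucanase; transiently generates cellotetraose",
    "pEG2": "processive endoglucanase; predominantly releases cellobiose",
}


def missing_modes(mix):
    """Return the hydrolysis modes not represented in an enzyme mix.

    mix maps an enzyme name to one of the keys of MODES.
    """
    present = set(mix.values())
    unknown = present - set(MODES)
    if unknown:
        raise ValueError(f"unknown mode(s): {sorted(unknown)}")
    # Preserve the order in which modes are listed in MODES.
    return [mode for mode in MODES if mode not in present]


# Illustrative mix containing only exo-acting enzymes named in the report.
mix = {"Cel48S": "CBH", "Cel9K": "CBH"}
print("missing modes:", missing_modes(mix))
```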

Current applications and real‑world implementations
- Biomass deconstruction and biofuels: The C. thermocellum cellulosome paradigm underpins industrial strategies to improve lignocellulose hydrolysis. The 2024 synthesis identifies design lessons for “designer cellulosomes” and emphasizes that achieving native‑like stability/efficiency remains challenging, guiding enzyme cocktail optimization and multi‑enzyme display engineering (hsin2024lignocellulosedegradationin pages 11-15). Proteomic and biochemical mapping of C. thermocellum cellulosomes provide reference enzyme sets and binding rules used in bioprocess design (gold2007proteomicanalysisof pages 26-31).

Localization and assembly: secretion, scaffoldin binding, and S‑layer anchoring
- Secretion and extracellular assembly: Dockerin‑bearing CAZymes are secreted and assembled on non‑catalytic scaffoldins via cohesin–dockerin binding; CBMs tether the complex to cellulose (gold2007proteomicanalysisof pages 26-31, hsin2024lignocellulosedegradationin pages 11-15). CipA (primary scaffoldin) organizes multiple type‑I cohesins for CAZyme recruitment; scaffoldin anchoring proteins with SLH repeats connect the complex to the cell surface S‑layer, establishing a cell‑associated extracellular nanomachine (gold2007proteomicanalysisof pages 26-31, hsin2024lignocellulosedegradationin pages 11-15).

Mechanism of dockerin EF‑hand Ca2+‑dependent recognition
- EF‑hand‑like architecture: C. thermocellum dockerins contain two Ca2+‑binding loop‑helix motifs. Proper dockerin folding and binding require Ca2+. Recognition involves both Ca2+‑binding segments and features strong hydrophobic contacts plus key hydrogen bonds. Measured cohesin–dockerin affinities are unusually high (reported in the nanomolar range), reflecting the robust assembly of the complex (Gold & Martin 2007; Oct 2007; https://doi.org/10.1128/JB.00882-07) (gold2007proteomicanalysisof pages 26-31).
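To put "nanomolar range" on an energy scale, the standard relation ΔG° = RT ln(Kd) can be evaluated numerically. The sketch below is illustrative only: the Kd values and the two temperatures (25 °C and 60 °C) are assumptions chosen for the example, not measurements taken from the cited sources, and the function name is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)


def binding_free_energy_kj(kd_molar: float, temp_kelvin: float) -> float:
    """Standard binding free energy in kJ/mol from a dissociation constant.

    Uses dG = R * T * ln(Kd), with Kd expressed relative to a 1 M standard state.
    More negative values correspond to tighter binding.
    """
    if kd_molar <= 0 or temp_kelvin <= 0:
        raise ValueError("Kd and temperature must be positive")
    return R * temp_kelvin * math.log(kd_molar) / 1000.0


if __name__ == "__main__":
    # Assumed, illustrative Kd values spanning the nanomolar range.
    for kd in (1e-9, 1e-8, 1e-7):
        # Assumed temperatures: 25 degC and 60 degC.
        for temp in (298.15, 333.15):
            dg = binding_free_energy_kj(kd, temp)
            print(f"Kd = {kd:.0e} M, T = {temp} K: dG = {dg:.1f} kJ/mol")
```

Under these assumptions, a 1 nM complex at 25 °C corresponds to roughly −51 kJ/mol, which gives a sense of why the assembly is described as robust.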

Exoglucanase Cel48S: native activity and substrate preference (contextual benchmark)
- Using in situ purification from C. thermocellum culture supernatant, the native catalytic domain of Cel48S (GH48) showed high activity of 117.61 ± 2.98 U/mg under assay conditions and a preference for crystalline cellulose. Crystal structures revealed induced‑fit residues coupled to substrate binding, and mass spectrometry found no significant post‑translational modifications, clarifying native features important for cellulosome performance (Liu 2018; Jan 2018; https://doi.org/10.1186/s13068-017-1009-4) (leis2017comparativecharacterizationof pages 12-13).

celX (P15329) specific evidence and cautions
- Ambiguity and evidence gap: No experimental paper located in our corpus explicitly characterizes a C. thermocellum endoglucanase named “celX/EGX”. Given UniProt’s “Fragment” status, domain annotations (dockerin repeats; EF‑hand motifs), and the organismal context, P15329 is best interpreted as a putative cellulosomal endoglucanase likely secreted and recruited to scaffoldins via Ca2+‑dependent dockerin–cohesin interactions within the cellulosome. Functional specifics (exact GH family, substrate range, processivity, kinetics, and product spectrum) remain to be demonstrated experimentally for this accession (gold2007proteomicanalysisof pages 26-31, hsin2024lignocellulosedegradationin pages 11-15, leis2017comparativecharacterizationof pages 7-8).

Relevant statistics and quantitative data
- Cellulosome performance: Up to ~50‑fold acceleration versus free enzymes and ~15‑fold activity reduction when cellulosome formation is impaired (Nov 2024) (hsin2024lignocellulosedegradationin pages 11-15).
- Designer complexes: A trivalent complex achieved up to 736.6 µM reducing sugar equivalents; a different four‑enzyme complex reached ~566.2 µM under the same assay, showing that composition, not just number of enzymes, governs output (Leis 2017; Oct 2017; https://doi.org/10.1186/s13068-017-0928-4) (leis2017comparativecharacterizationof pages 8-10).
- Recombinant “all‑EG‑types” mixture on CipA8 averaged 52.6 ± 1.4% of native cellulosome activity on microcrystalline cellulose, highlighting additional layers of native optimization (Leis 2017) (leis2017comparativecharacterizationof pages 10-12).
- Cel48S (native catalytic domain) activity: 117.61 ± 2.98 U/mg; preference for crystalline cellulose; induced‑fit substrate interactions observed crystallographically (Liu 2018; Jan 2018; https://doi.org/10.1186/s13068-017-1009-4) (leis2017comparativecharacterizationof pages 12-13).
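
The figures above are easier to compare once reduced to simple ratios. The following is a minimal sketch using only the numbers quoted in this section; the constant and function names are hypothetical, and the ratios are plain arithmetic rather than results reported by the cited papers.

```python
# Values quoted in the bullets above (Leis 2017; Hsin 2024 preprint).
TRIVALENT_UM = 736.6              # reducing sugar equivalents, trivalent complex
FOUR_ENZYME_UM = 566.2            # reducing sugar equivalents, four-enzyme complex
ALL_EG_PERCENT_OF_NATIVE = 52.6   # activity of the CipA8 "all-EG-types" mixture
FOLD_LOSS_WITHOUT_ASSEMBLY = 15.0  # approximate fold reduction when assembly is impaired


def relative_output(value: float, reference: float) -> float:
    """Return value as a fraction of reference."""
    return value / reference


def residual_fraction(fold_loss: float) -> float:
    """Fraction of activity remaining after an n-fold reduction."""
    return 1.0 / fold_loss


def fold_gap_to_native(percent_of_native: float) -> float:
    """How many times more active the native complex is than the recombinant one."""
    return 100.0 / percent_of_native


if __name__ == "__main__":
    print(f"four-enzyme vs trivalent: {relative_output(FOUR_ENZYME_UM, TRIVALENT_UM):.3f}")
    print(f"residual activity after ~15-fold loss: {residual_fraction(FOLD_LOSS_WITHOUT_ASSEMBLY):.1%}")
    print(f"native vs recombinant fold gap: {fold_gap_to_native(ALL_EG_PERCENT_OF_NATIVE):.2f}")
```

Read this way, the four-enzyme complex delivers about three quarters of the trivalent complex's output despite containing more enzymes, and the native cellulosome is a little under twofold more active than the recombinant all-EG-types mixture.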

Expert opinions and interpretations (grounded in authoritative sources)
- The synergy of diverse EG modes: Inclusion of both processive (pEG2, pEG4) and non‑processive EGs mitigates stalling and ensures complementary product formation, underpinning high cellulosomal efficiency; this is experimentally supported by designer cellulosome composition–activity relationships (Leis 2017) (leis2017comparativecharacterizationof pages 1-2, leis2017comparativecharacterizationof pages 10-12).
- Dockerin EF‑hand Ca2+‑dependence: Robust cohesin–dockerin binding and Ca2+‑dependent dockerin folding provide a mechanistic basis for the stability and modularity of the cellulosome, enabling strong attachment to scaffoldins and cell surfaces while supporting dynamic enzyme exchange when required (Gold & Martin 2007; synthesis in 2024 review) (gold2007proteomicanalysisof pages 26-31, hsin2024lignocellulosedegradationin pages 11-15).

Implications for P15329 (celX)
- Based on its dockerin/EF‑hand features and organismal context, P15329 likely encodes a secreted, dockerin‑bearing endoglucanase that integrates into the cellulosome via Ca2+‑dependent cohesin–dockerin binding to scaffoldins (e.g., CipA) and functions extracellularly at the cellulose interface. Without direct biochemical data, the enzyme’s processivity and product profile remain uncertain; by analogy to the established C. thermocellum repertoire, either non‑processive EG or processive pEG2/pEG4 behavior is plausible, pending experimental confirmation (gold2007proteomicanalysisof pages 26-31, leis2017comparativecharacterizationof pages 7-8, leis2017comparativecharacterizationof pages 12-13).

Structured summary of P15329
| Aspect | Findings for P15329 (celX) | Evidence/Notes | Key sources (year, URL) |
|---|---|---|---|
| Identity | UniProt accession P15329; gene name: celX; organism: Acetivibrio thermocellus (formerly Clostridium thermocellum); sequence annotated as "Fragment" | Based on provided UniProt record; no direct experimental paper found that explicitly characterizes this accession under the name "celX" or "EGX" (search and evidence review). (gold2007proteomicanalysisof pages 26-31, hsin2024lignocellulosedegradationin pages 11-15, leis2017comparativecharacterizationof pages 7-8) | UniProt P15329; Leis et al. 2017 (https://doi.org/10.1186/s13068-017-0928-4); Gold 2007 (https://doi.org/10.1128/jb.00882-07) |
| Protein description (annotation) | RecName: Putative endoglucanase X; Short=EGX; EC=3.2.1.4; AltName: Cellulase / Endo-1,4-β-glucanase; Flags: Fragment | Directly from UniProt-provided description; EC and names imply endoglucanase activity but annotation is putative. (gold2007proteomicanalysisof pages 26-31) | UniProt P15329; Leis et al. 2017 (2017, https://doi.org/10.1186/s13068-017-0928-4) |
| Domains / sequence features | Contains dockerin repeats (Dockerin_1_rpt / Dockerin_dom / Dockerin_dom_sf) and EF-hand Ca2+-binding motif(s); also annotated with Ester_Hydrolysis_Enzymes signature; fragment status noted | Domain list from UniProt; dockerin + EF-hand implies Ca2+-dependent cohesin–dockerin recognition and recruitment to cellulosomal scaffold; EF-hand/docking biology and S-layer anchoring supported by cellulosome literature. (gold2007proteomicanalysisof pages 26-31, santos2025unconventionalcohesindockerinbinding pages 58-61, hsin2024lignocellulosedegradationin pages 11-15) | UniProt P15329; Gold 2007 (2007, https://doi.org/10.1128/jb.00882-07); Hsin et al. 2024 (2024, https://doi.org/10.1101/2024.11.06.622210) |
| Predicted molecular function / substrate specificity | Predicted endo-1,4-β-glucanase (cellulase) activity (EC 3.2.1.4); likely acts on β-1,4-glucans (amorphous cellulose/oligosaccharides); precise GH family/substrate preference and kinetics not reported for this accession | Inference based on EC number and domain context; no direct biochemical characterization located for P15329 — function therefore remains putative and inferred from cellulosomal enzyme paradigms. (leis2017comparativecharacterizationof pages 7-8, leis2017comparativecharacterizationof pages 8-10, leis2017comparativecharacterizationof pages 5-7) | Leis et al. 2017 (2017, https://doi.org/10.1186/s13068-017-0928-4); Liu et al. 2018 (2018, https://doi.org/10.1186/s13068-017-1009-4) |
| Cellular localization / assembly | Predicted secreted/extracellular, dockerin-bearing (cellulosomal) enzyme expected to bind scaffoldin(s) such as CipA and participate in cell-surface / S-layer-associated cellulosomes | Dockerin modules mediate tight, Ca2+-dependent binding to cohesin modules on scaffoldins; type I vs type II dockerin/cohesin interactions determine recruitment vs surface anchoring. Localization inference stems from domain architecture + cellulosome models. (gold2007proteomicanalysisof pages 26-31, santos2025unconventionalcohesindockerinbinding pages 58-61, hsin2024lignocellulosedegradationin pages 11-15) | Gold 2007 (2007, https://doi.org/10.1128/jb.00882-07); Santos 2025 (reviewed excerpts) |
| Pathway / biological context | Likely component of the extracellular cellulosome complex for concerted cellulose depolymerization; contributes to generation of soluble cello-oligosaccharides feeding downstream uptake/metabolism | Inferred from cellulosome composition and designer cellulosome experiments showing multiple EG types (processive/non-processive) act synergistically to increase hydrolytic efficiency. No direct pathway-level experiments for P15329 located. (leis2017comparativecharacterizationof pages 7-8, leis2017comparativecharacterizationof pages 8-10, leis2017comparativecharacterizationof pages 12-13) | Leis et al. 2017 (2017, https://doi.org/10.1186/s13068-017-0928-4); Hsin et al. 2024 (2024, https://doi.org/10.1101/2024.11.06.622210) |
| Literature presence / ambiguity | No explicit primary literature found that experimentally characterizes a protein named "celX" or "endoglucanase X (EGX)" in C. thermocellum; annotation appears based on sequence/domain evidence and database curation | Searches and gathered evidence returned extensive cellulosome literature but no direct experimental paper naming P15329/"celX" — treat functional claims as inferred and provisional until validated experimentally. (gold2007proteomicanalysisof pages 26-31, hsin2024lignocellulosedegradationin pages 11-15, leis2017comparativecharacterizationof pages 7-8) | UniProt P15329; Leis et al. 2017 (2017, https://doi.org/10.1186/s13068-017-0928-4) |
| Evidence types supporting conclusions | Primary database annotation (UniProt entry), comparative cellulosome proteomics and biochemical characterization of other cellulosomal enzymes, structural/biophysical studies of dockerin–cohesin and EF-hand Ca2+ roles; designer cellulosome functional assays used to infer likely roles | Summary draws on curated UniProt annotation plus experimental and review literature on cellulosome composition, dockerin EF-hand/Ca2+ dependence, and endoglucanase processivity/product patterns. Citations point to representative studies and reviews used for inference. (gold2007proteomicanalysisof pages 26-31, hsin2024lignocellulosedegradationin pages 11-15, leis2017comparativecharacterizationof pages 7-8, leis2017comparativecharacterizationof pages 8-10, leis2017comparativecharacterizationof pages 5-7, leis2017comparativecharacterizationof pages 12-13, leis2017comparativecharacterizationof pages 1-2, leis2017comparativecharacterizationof pages 10-12, leis2017comparativecharacterizationof pages 14-15, leis2017comparativecharacterizationof pages 2-4) | Representative sources: Leis et al. 2017 (2017, https://doi.org/10.1186/s13068-017-0928-4); Gold 2007 (2007, https://doi.org/10.1128/jb.00882-07); Hsin et al. 2024 (2024, https://doi.org/10.1101/2024.11.06.622210) |

Table: Concise table summarizing UniProt P15329 (celX) annotation, domain features, inferred function/localization, pathway context, and evidence types; useful to show what is known from UniProt and what must be inferred from cellulosome literature (citations included).
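
Because the table mixes recruitment modules with the one signature that speaks to catalysis, it can help to separate them explicitly. The sketch below is a hedged illustration: the InterPro accessions and names are those quoted for this entry elsewhere on this page, while the `role` labels, the class, and the helper function are hypothetical choices made for the example, not part of any database schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainSignature:
    accession: str
    name: str
    role: str  # assumed labels: "recruitment", "calcium_binding", "catalytic"


# Signatures listed for P15329; role labels are this sketch's own annotation.
SIGNATURES = [
    DomainSignature("IPR002105", "Dockerin_1_rpt", "recruitment"),
    DomainSignature("IPR016134", "Dockerin_dom", "recruitment"),
    DomainSignature("IPR036439", "Dockerin_dom_sf", "recruitment"),
    DomainSignature("IPR018247", "EF_Hand_1_Ca_BS", "calcium_binding"),
    DomainSignature("IPR051532", "Ester_Hydrolysis_Enzymes", "catalytic"),
]


def signatures_with_role(signatures, role):
    """Return the names of all signatures carrying the given role label."""
    return [s.name for s in signatures if s.role == role]


if __name__ == "__main__":
    for role in ("recruitment", "calcium_binding", "catalytic"):
        print(role, "->", signatures_with_role(SIGNATURES, role))
```

Under this labelling, only one of the five listed signatures bears on catalytic activity, so any statement about enzymatic function drawn from the dockerin or EF-hand entries alone should be treated as an inference about localization and assembly rather than about chemistry.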

References with URLs and dates
- Gold ND, Martin VJJ. Global view of the Clostridium thermocellum cellulosome revealed by quantitative proteomic analysis. Journal of Bacteriology. Oct 2007. https://doi.org/10.1128/JB.00882-07 (gold2007proteomicanalysisof pages 26-31).
- Leis B, et al. Comparative characterization of all cellulosomal cellulases from Clostridium thermocellum reveals high diversity in endoglucanase product formation essential for complex activity. Biotechnology for Biofuels. Oct 2017. https://doi.org/10.1186/s13068-017-0928-4 (leis2017comparativecharacterizationof pages 7-8, leis2017comparativecharacterizationof pages 8-10, leis2017comparativecharacterizationof pages 5-7, leis2017comparativecharacterizationof pages 12-13, leis2017comparativecharacterizationof pages 1-2, leis2017comparativecharacterizationof pages 10-12, leis2017comparativecharacterizationof pages 14-15, leis2017comparativecharacterizationof pages 2-4).
- Liu Y‑J, et al. Determination of the native features of the exoglucanase Cel48S from Clostridium thermocellum. Biotechnology for Biofuels. Jan 2018. https://doi.org/10.1186/s13068-017-1009-4 (leis2017comparativecharacterizationof pages 12-13).
- Hsin K‑T, et al. Lignocellulose Degradation in Bacteria and Fungi for Biomass Conversion. bioRxiv. Nov 2024. https://doi.org/10.1101/2024.11.06.622210 (hsin2024lignocellulosedegradationin pages 11-15).

Conclusion
- The gene symbol “celX” for UniProt P15329 is not well represented in the experimental literature for C. thermocellum; thus, the protein should be treated as a putative cellulosomal endoglucanase with function inferred from its dockerin/EF‑hand‑bearing architecture and the extensively characterized cellulosome context in this organism. Its likely role is extracellular cellulose depolymerization within the cellulosome, assembled via strong Ca2+‑dependent cohesin–dockerin interactions and anchored to the cell surface. Precise GH family assignment, processivity, and product profile remain to be determined experimentally for P15329 (gold2007proteomicanalysisof pages 26-31, hsin2024lignocellulosedegradationin pages 11-15, leis2017comparativecharacterizationof pages 7-8).

References

  1. (gold2007proteomicanalysisof pages 26-31): N Gold. Proteomic analysis of the clostridium thermocellum cellulosome. Unknown journal, 2007.

  2. (hsin2024lignocellulosedegradationin pages 11-15): Kuan-Ting Hsin, HueyTyng Lee, Ying-Chung Jimmy Lin, and Pao-Yang Chen. Lignocellulose degradation in bacteria and fungi for biomass conversion. bioRxiv, Nov 2024. URL: https://doi.org/10.1101/2024.11.06.622210, doi:10.1101/2024.11.06.622210. This article has 2 citations and is a preprint.

  3. (leis2017comparativecharacterizationof pages 7-8): Benedikt Leis, Claudia Held, Fabian Bergkemper, Katharina Dennemarck, Robert Steinbauer, Alarich Reiter, Matthias Mechelke, Matthias Moerch, Sigrid Graubner, Wolfgang Liebl, Wolfgang H. Schwarz, and Vladimir V. Zverlov. Comparative characterization of all cellulosomal cellulases from clostridium thermocellum reveals high diversity in endoglucanase product formation essential for complex activity. Biotechnology for Biofuels, Oct 2017. URL: https://doi.org/10.1186/s13068-017-0928-4, doi:10.1186/s13068-017-0928-4. This article has 63 citations.

  4. (leis2017comparativecharacterizationof pages 8-10): Benedikt Leis, Claudia Held, Fabian Bergkemper, Katharina Dennemarck, Robert Steinbauer, Alarich Reiter, Matthias Mechelke, Matthias Moerch, Sigrid Graubner, Wolfgang Liebl, Wolfgang H. Schwarz, and Vladimir V. Zverlov. Comparative characterization of all cellulosomal cellulases from clostridium thermocellum reveals high diversity in endoglucanase product formation essential for complex activity. Biotechnology for Biofuels, Oct 2017. URL: https://doi.org/10.1186/s13068-017-0928-4, doi:10.1186/s13068-017-0928-4. This article has 63 citations.

  5. (leis2017comparativecharacterizationof pages 5-7): Benedikt Leis, Claudia Held, Fabian Bergkemper, Katharina Dennemarck, Robert Steinbauer, Alarich Reiter, Matthias Mechelke, Matthias Moerch, Sigrid Graubner, Wolfgang Liebl, Wolfgang H. Schwarz, and Vladimir V. Zverlov. Comparative characterization of all cellulosomal cellulases from clostridium thermocellum reveals high diversity in endoglucanase product formation essential for complex activity. Biotechnology for Biofuels, Oct 2017. URL: https://doi.org/10.1186/s13068-017-0928-4, doi:10.1186/s13068-017-0928-4. This article has 63 citations.

  6. (leis2017comparativecharacterizationof pages 12-13): Benedikt Leis, Claudia Held, Fabian Bergkemper, Katharina Dennemarck, Robert Steinbauer, Alarich Reiter, Matthias Mechelke, Matthias Moerch, Sigrid Graubner, Wolfgang Liebl, Wolfgang H. Schwarz, and Vladimir V. Zverlov. Comparative characterization of all cellulosomal cellulases from clostridium thermocellum reveals high diversity in endoglucanase product formation essential for complex activity. Biotechnology for Biofuels, Oct 2017. URL: https://doi.org/10.1186/s13068-017-0928-4, doi:10.1186/s13068-017-0928-4. This article has 63 citations.

  7. (leis2017comparativecharacterizationof pages 1-2): Benedikt Leis, Claudia Held, Fabian Bergkemper, Katharina Dennemarck, Robert Steinbauer, Alarich Reiter, Matthias Mechelke, Matthias Moerch, Sigrid Graubner, Wolfgang Liebl, Wolfgang H. Schwarz, and Vladimir V. Zverlov. Comparative characterization of all cellulosomal cellulases from clostridium thermocellum reveals high diversity in endoglucanase product formation essential for complex activity. Biotechnology for Biofuels, Oct 2017. URL: https://doi.org/10.1186/s13068-017-0928-4, doi:10.1186/s13068-017-0928-4. This article has 63 citations.

  8. (leis2017comparativecharacterizationof pages 10-12): Benedikt Leis, Claudia Held, Fabian Bergkemper, Katharina Dennemarck, Robert Steinbauer, Alarich Reiter, Matthias Mechelke, Matthias Moerch, Sigrid Graubner, Wolfgang Liebl, Wolfgang H. Schwarz, and Vladimir V. Zverlov. Comparative characterization of all cellulosomal cellulases from clostridium thermocellum reveals high diversity in endoglucanase product formation essential for complex activity. Biotechnology for Biofuels, Oct 2017. URL: https://doi.org/10.1186/s13068-017-0928-4, doi:10.1186/s13068-017-0928-4. This article has 63 citations.

  9. (leis2017comparativecharacterizationof pages 2-4): Benedikt Leis, Claudia Held, Fabian Bergkemper, Katharina Dennemarck, Robert Steinbauer, Alarich Reiter, Matthias Mechelke, Matthias Moerch, Sigrid Graubner, Wolfgang Liebl, Wolfgang H. Schwarz, and Vladimir V. Zverlov. Comparative characterization of all cellulosomal cellulases from clostridium thermocellum reveals high diversity in endoglucanase product formation essential for complex activity. Biotechnology for Biofuels, Oct 2017. URL: https://doi.org/10.1186/s13068-017-0928-4, doi:10.1186/s13068-017-0928-4. This article has 63 citations.

  10. (santos2025unconventionalcohesindockerinbinding pages 58-61): MRCD Santos. Unconventional cohesin-dockerin binding mechanisms reveal the complexity of cellulosome assembly. Unknown journal, 2025.

  11. (leis2017comparativecharacterizationof pages 14-15): Benedikt Leis, Claudia Held, Fabian Bergkemper, Katharina Dennemarck, Robert Steinbauer, Alarich Reiter, Matthias Mechelke, Matthias Moerch, Sigrid Graubner, Wolfgang Liebl, Wolfgang H. Schwarz, and Vladimir V. Zverlov. Comparative characterization of all cellulosomal cellulases from clostridium thermocellum reveals high diversity in endoglucanase product formation essential for complex activity. Biotechnology for Biofuels, Oct 2017. URL: https://doi.org/10.1186/s13068-017-0928-4, doi:10.1186/s13068-017-0928-4. This article has 63 citations.

Citations

  1. gold2007proteomicanalysisof pages 26-31
  2. leis2017comparativecharacterizationof pages 1-2
  3. leis2017comparativecharacterizationof pages 7-8
  4. leis2017comparativecharacterizationof pages 12-13
  5. hsin2024lignocellulosedegradationin pages 11-15
  6. leis2017comparativecharacterizationof pages 8-10
  7. leis2017comparativecharacterizationof pages 10-12
  8. leis2017comparativecharacterizationof pages 5-7
  9. leis2017comparativecharacterizationof pages 2-4
  10. santos2025unconventionalcohesindockerinbinding pages 58-61
  11. leis2017comparativecharacterizationof pages 14-15
  12. https://doi.org/10.1186/s13068-017-0928-4
  13. https://doi.org/10.1128/JB.00882-07
  14. https://doi.org/10.1186/s13068-017-1009-4
  15. https://doi.org/10.1101/2024.11.06.622210

📄 View Raw YAML

id: P15329
gene_symbol: celX
product_type: PROTEIN
status: DRAFT
taxon:
  id: NCBITaxon:203119
  label: Acetivibrio thermocellus
description: >-
  CelX is a cellulosome-associated protein from Acetivibrio thermocellus that contains an SGNH hydrolase
  domain and a type I dockerin domain. Despite being annotated as a "putative endoglucanase," the protein's
  domain architecture strongly suggests esterase rather than cellulase activity. The SGNH hydrolase fold
  (InterPro: IPR013830, Pfam: Lipase_GDSL_2) is characteristic of serine esterases/lipases, not glycoside
  hydrolases. The dockerin domain enables attachment to the cellulosome scaffoldin, suggesting a role in
  lignocellulose degradation, but as an accessory esterase (possibly a feruloyl esterase or acetylxylan
  esterase) rather than a true cellulase. The "cellulase" annotation appears to be a historical misannotation
  based on genomic context (proximity to celE) rather than biochemical characterization of CelX itself.
  The original paper (PMID:3066698) primarily characterizes CelE, mentioning celX only as a secondary ORF
  identified in the upstream region.
existing_annotations:
- term:
    id: GO:0000272
    label: polysaccharide catabolic process
  evidence_type: IEA
  original_reference_id: GO_REF:0000120
  review:
    summary: >-
      This annotation is based on the presence of the dockerin domain (InterPro: IPR002105, IPR016134, IPR036439)
      which is associated with cellulosome components. While CelX is likely part of the cellulosome complex
      and therefore participates in polysaccharide degradation, the role would be as an accessory esterase
      (removing ester-linked substituents from polysaccharides) rather than directly cleaving glycosidic bonds.
      The annotation is acceptable as a broad descriptor of the biological context but may be an over-annotation
      since it implies direct polysaccharide backbone cleavage.
    action: KEEP_AS_NON_CORE
    reason: >-
      The dockerin domain indicates cellulosome association, suggesting involvement in plant cell wall
      degradation. However, based on the SGNH hydrolase catalytic domain, the protein likely functions as an
      accessory esterase that assists in polysaccharide degradation by removing ester-linked side groups rather
      than cleaving the polysaccharide backbone directly. This is a peripheral rather than core function description.

- term:
    id: GO:0004553
    label: hydrolase activity, hydrolyzing O-glycosyl compounds
  evidence_type: IEA
  original_reference_id: GO_REF:0000002
  review:
    summary: >-
      This annotation is derived from InterPro:IPR002105 (Dockerin type I repeat), which is a non-catalytic
      domain for cellulosome attachment. The dockerin domain does not confer glycoside hydrolase activity;
      it is a protein-protein interaction module that binds to cohesin domains on the scaffoldin. This
      annotation is a clear example of guilt-by-association: because dockerin domains are found in cellulases,
      the presence of dockerin led to this glycoside hydrolase annotation, even though the catalytic domain
      (SGNH hydrolase) does not hydrolyze O-glycosyl compounds.
    action: REMOVE
    reason: >-
      The SGNH hydrolase domain (Pfam: Lipase_GDSL_2, InterPro: IPR013830) present in CelX is characteristic
      of serine esterases, not glycoside hydrolases. SGNH hydrolases catalyze ester bond cleavage using a
      Ser-His-Asp catalytic triad, while glycoside hydrolases use different mechanisms (retaining/inverting)
      and have distinct catalytic residues. The annotation is based on the dockerin domain which only indicates
      cellulosome association, not enzymatic activity. The original publication (PMID:3066698) does not
      provide experimental evidence for glycoside hydrolase activity of CelX.
    supported_by:
    - reference_id: PMID:3066698
      supporting_text: >-
        A second ORF which ends 349 bp 5' to the GTG start codon of the celE gene has also been identified.
        The encoded product contains a C terminus homologous to other C. thermocellum endoglucanases.
        [Note: This only refers to the dockerin domain homology, not the catalytic domain]

- term:
    id: GO:0004622
    label: phosphatidylcholine lysophospholipase A1 activity
  evidence_type: IEA
  original_reference_id: GO_REF:0000118
  review:
    summary: >-
      This annotation is derived from TreeGrafter phylogenetic inference based on the SGNH hydrolase domain
      (PANTHER family PTN002411393 - lysophospholipase L1 family). While the domain architecture is consistent
      with the SGNH hydrolase superfamily that includes lysophospholipases, this specific activity annotation
      is likely too specific for a bacterial cellulosome component. SGNH hydrolases in cellulosome contexts
      typically function as carbohydrate esterases (feruloyl esterases, acetylxylan esterases) rather than
      phospholipid-cleaving enzymes.
    action: MODIFY
    reason: >-
      The SGNH hydrolase domain correctly identifies the enzyme family, but lysophospholipase activity is
      unlikely for a cellulosome-associated enzyme. The biological context (cellulosome, plant cell wall
      degradation) strongly suggests this enzyme functions as a carbohydrate esterase involved in lignocellulose
      degradation. SGNH hydrolases in this context typically remove ester-linked substituents (ferulic acid,
      acetyl groups) from plant cell wall polysaccharides. A more appropriate annotation would reflect
      general esterase activity or, more specifically, carbohydrate esterase activity.
    proposed_replacement_terms:
    - id: GO:0016788
      label: hydrolase activity, acting on ester bonds
    - id: GO:0030600
      label: feruloyl esterase activity
    - id: GO:0046555
      label: acetylxylan esterase activity

- term:
    id: GO:0008810
    label: cellulase activity
  evidence_type: IEA
  original_reference_id: GO_REF:0000003
  review:
    summary: >-
      This annotation is based on EC:3.2.1.4 mapping. However, the EC number assignment appears to be
      erroneous. The protein's catalytic domain is an SGNH hydrolase (serine esterase fold), not a glycoside
      hydrolase fold. True cellulases (EC 3.2.1.4) belong to glycoside hydrolase families (GH5, GH6, GH7,
      GH8, GH9, GH12, GH44, GH45, GH48, etc.) and have completely different structural folds and catalytic
      mechanisms from SGNH hydrolases. The "putative endoglucanase" annotation in UniProt appears to be
      based on genomic context and dockerin domain presence rather than biochemical characterization or
      sequence analysis of the catalytic domain.
    action: REMOVE
    reason: >-
      The protein structure is definitively SGNH hydrolase (PDB:2VPT at 1.40 Å resolution shows the SGNH
      fold for residues 9-149). SGNH hydrolases are serine esterases with a catalytic mechanism involving
      a Ser-His-Asp triad and do not possess the active site architecture required for glycoside bond
      cleavage. The cellulase annotation is contradicted by the solved crystal structure. The original
      name "celX" and "putative endoglucanase" designation appear to be historical artifacts from the
      genomic context of discovery (adjacent to celE) rather than functional characterization.
      PMID:3066698 does not provide experimental evidence for cellulase activity of the celX gene product.
    supported_by:
    - reference_id: PMID:3066698
      supporting_text: >-
        The complete nucleotide sequence of the Clostridium thermocellum celE gene, coding for an
        endo-beta-1,4-glucanase (endoglucanase E; EGE) with xylan-hydrolysing activity has been determined.
        [Note: This describes CelE, not CelX. The paper only mentions celX as a separate ORF upstream.]

- term:
    id: GO:0016787
    label: hydrolase activity
  evidence_type: IEA
  original_reference_id: GO_REF:0000043
  review:
    summary: >-
      This broad annotation based on UniProtKB keyword KW-0378 (Hydrolase) is correct. The SGNH hydrolase
      domain is indeed a hydrolase that catalyzes ester bond hydrolysis. This is the most appropriate
      of the molecular function annotations for this protein.
    action: ACCEPT
    reason: >-
      The SGNH hydrolase domain definitively encodes hydrolase activity. The broad term is appropriate
      given uncertainty about the specific substrate. This correctly captures the enzymatic function
      without the problematic over-specificity of the glycoside hydrolase or lysophospholipase annotations.

- term:
    id: GO:0016798
    label: hydrolase activity, acting on glycosyl bonds
  evidence_type: IEA
  original_reference_id: GO_REF:0000043
  review:
    summary: >-
      This annotation is based on UniProtKB keyword KW-0326 (Glycosidase), which is incorrectly applied
      to this protein. The SGNH hydrolase fold is characteristic of esterases, not glycosidases. The
      keyword appears to be propagated from the erroneous "cellulase/endoglucanase" annotation rather
      than from structural or sequence evidence of glycosidase activity.
    action: REMOVE
    reason: >-
      The protein contains an SGNH hydrolase domain (serine esterase superfamily), not a glycoside
      hydrolase domain. The catalytic mechanism of SGNH hydrolases involves ester bond cleavage,
      not glycosidic bond cleavage. The crystal structure (PDB:2VPT) confirms the SGNH fold.
      This annotation should be removed as it is structurally and mechanistically incorrect.

- term:
    id: GO:0030245
    label: cellulose catabolic process
  evidence_type: IEA
  original_reference_id: GO_REF:0000043
  review:
    summary: >-
      This annotation is based on UniProtKB keyword KW-0136 (Cellulose degradation). While the protein
      is likely part of the cellulosome complex and may contribute to overall cellulose degradation
      through removal of ester-linked substituents that impede access to cellulose, it does not directly
      catabolize cellulose (break down the cellulose polymer). This annotation conflates participation
      in a cellulose-degrading system with direct cellulose catabolism.
    action: MODIFY
    reason: >-
      CelX likely functions as an accessory esterase in lignocellulose degradation rather than a true
      cellulase. SGNH hydrolases in cellulosome contexts typically remove ester-linked ferulic acid or
      acetyl groups from hemicellulose, which facilitates access to cellulose but does not constitute
      cellulose catabolism. A more accurate annotation would reflect involvement in plant cell wall
      degradation or, more specifically, hemicellulose modification.
    proposed_replacement_terms:
    - id: GO:0000272
      label: polysaccharide catabolic process

- term:
    id: GO:0043263
    label: cellulosome
  evidence_type: IEA
  original_reference_id: file:ACET2/P15329/P15329-deep-research-falcon.md
  review:
    summary: >-
      The presence of a type I dockerin domain (aa 162-224) strongly indicates that CelX is a component
      of the cellulosome, the extracellular multi-enzyme complex that degrades plant cell walls. This
      is the most well-supported annotation for this protein based on domain architecture. The deep
      research confirms that dockerin-bearing enzymes are secreted and assembled on scaffoldins via
      cohesin-dockerin binding (file:ACET2/P15329/P15329-deep-research-falcon.md).
    action: NEW
    reason: >-
      The dockerin domain (InterPro: IPR002105, IPR016134, IPR036439; PROSITE: PS00448, PS51766) is the
      signature module for cellulosomal proteins. It binds to cohesin domains on the scaffoldin protein,
      anchoring CelX within the cellulosome complex. This cellular component annotation is strongly
      supported by the domain architecture.
    supported_by:
    - reference_id: file:ACET2/P15329/P15329-deep-research-falcon.md
      supporting_text: >-
        Dockerin‑bearing CAZymes are secreted and assembled on non‑catalytic scaffoldins via cohesin–dockerin
        binding; CBMs tether the complex to cellulose... CipA (primary scaffoldin) organizes multiple type‑I
        cohesins for CAZyme recruitment

- term:
    id: GO:0016788
    label: hydrolase activity, acting on ester bonds
  evidence_type: IEA
  original_reference_id: file:ACET2/P15329/P15329-deep-research-falcon.md
  review:
    summary: >-
      Based on the SGNH hydrolase domain (Pfam: Lipase_GDSL_2, InterPro: IPR013830, IPR051532), CelX is
      predicted to have esterase activity. The UniProt domain annotation includes Ester_Hydrolysis_Enzymes
      (IPR051532), which is characteristic of the SGNH hydrolase superfamily.
    action: NEW
    reason: >-
      The SGNH hydrolase superfamily (Gene3D: 3.40.50.1110, SUPFAM: SSF52266) consists of serine
      esterases/lipases that hydrolyze ester bonds. The catalytic domain structure (residues 9-149,
      solved at 1.40 Å in PDB:2VPT) confirms this fold. In the context of the cellulosome, this
      esterase activity likely targets ester-linked substituents on plant cell wall polysaccharides.
    supported_by:
    - reference_id: file:ACET2/P15329/P15329-deep-research-falcon.md
      supporting_text: >-
        Key Domains: Dockerin_1_rpt. (IPR002105); Dockerin_dom. (IPR016134); Dockerin_dom_sf. (IPR036439);
        EF_Hand_1_Ca_BS. (IPR018247); Ester_Hydrolysis_Enzymes. (IPR051532)

references:
- id: file:ACET2/P15329/P15329-deep-research-falcon.md
  title: Deep research review of CelX (P15329) from Acetivibrio thermocellus
  findings:
  - statement: >-
      The protein contains dockerin repeats and EF-hand Ca2+-binding motifs characteristic of
      cellulosomal enzymes, as well as the Ester_Hydrolysis_Enzymes domain (IPR051532).
    supporting_text: >-
      Key Domains: Dockerin_1_rpt. (IPR002105); Dockerin_dom. (IPR016134); Dockerin_dom_sf. (IPR036439);
      EF_Hand_1_Ca_BS. (IPR018247); Ester_Hydrolysis_Enzymes. (IPR051532)
  - statement: >-
      No experimental paper was found that explicitly characterizes CelX/EGX in C. thermocellum.
      The protein function is inferred from domain architecture.
    supporting_text: >-
      No explicit primary literature found that experimentally characterizes a protein named "celX"
      or "endoglucanase X (EGX)" in C. thermocellum; annotation appears based on sequence/domain
      evidence and database curation
  - statement: >-
      Dockerin-bearing enzymes are secreted and assembled on scaffoldins via Ca2+-dependent
      cohesin-dockerin binding.
    supporting_text: >-
      Dockerin‑bearing CAZymes are secreted and assembled on non‑catalytic scaffoldins via cohesin–dockerin
      binding; CBMs tether the complex to cellulose... CipA (primary scaffoldin) organizes multiple type‑I
      cohesins for CAZyme recruitment
- id: PMID:3066698
  title: >-
    Conserved reiterated domains in Clostridium thermocellum endoglucanases are not essential for
    catalytic activity.
  findings:
  - statement: >-
      CelX was identified as a secondary ORF in the genomic region encoding CelE, with a C-terminus
      homologous to other C. thermocellum endoglucanases (the dockerin domain).
    supporting_text: >-
      A second ORF which ends 349 bp 5' to the GTG start codon of the celE gene has also been
      identified. The encoded product contains a C terminus homologous to other C. thermocellum
      endoglucanases.
  - statement: >-
      The paper primarily characterizes CelE (endoglucanase E with xylanase activity), not CelX.
      No experimental characterization of CelX enzymatic activity is provided.
    supporting_text: >-
      The complete nucleotide sequence of the Clostridium thermocellum celE gene, coding for an
      endo-beta-1,4-glucanase (endoglucanase E; EGE) with xylan-hydrolysing activity has been determined.
- id: GO_REF:0000002
  title: Gene Ontology annotation through association of InterPro records with GO terms
  findings: []
- id: GO_REF:0000003
  title: Gene Ontology annotation based on Enzyme Commission mapping
  findings: []
- id: GO_REF:0000043
  title: Gene Ontology annotation based on UniProtKB/Swiss-Prot keyword mapping
  findings: []
- id: GO_REF:0000118
  title: TreeGrafter-generated GO annotations
  findings: []
- id: GO_REF:0000120
  title: Combined Automated Annotation using Multiple IEA Methods
  findings: []

core_functions:
- description: >-
    CelX functions as a cellulosome-associated esterase that likely removes ester-linked substituents
    from plant cell wall polysaccharides, facilitating lignocellulose degradation by the cellulosome
    complex. The dockerin domain anchors the enzyme to the scaffoldin, while the SGNH hydrolase domain
    provides the catalytic esterase activity.
  molecular_function:
    id: GO:0016788
    label: hydrolase activity, acting on ester bonds
  directly_involved_in:
  - id: GO:0000272
    label: polysaccharide catabolic process
  locations:
  - id: GO:0043263
    label: cellulosome

proposed_new_terms: []

suggested_questions:
- question: >-
    What is the actual enzymatic activity of CelX? Is it a feruloyl esterase, acetylxylan esterase,
    or does it have some other esterase substrate specificity?
  experts:
  - Gilbert HJ
  - Hazlewood GP
- question: >-
    Has CelX been biochemically characterized with purified protein? The current annotations appear
    to be based on genomic context and domain predictions rather than experimental evidence.
  experts: []

suggested_experiments:
- hypothesis: CelX has esterase activity rather than cellulase activity
  description: >-
    Express and purify recombinant CelX (without the dockerin domain if stability is an issue) and
    test for activity against: (1) generic esterase substrates (p-nitrophenyl acetate), (2) cellulase
    substrates (CMC, filter paper, cellooligosaccharides), and (3) hemicellulose-associated ester
    substrates (methyl ferulate, acetylated xylan). The SGNH fold seen in the crystal structure
    (PDB:2VPT) predicts esterase activity and no cellulase activity.
  experiment_type: biochemical assay
- hypothesis: CelX is incorporated into the cellulosome via its dockerin domain
  description: >-
    Test binding of the CelX dockerin domain to recombinant cohesin domains from the A. thermocellus
    scaffoldin using isothermal titration calorimetry or surface plasmon resonance. Measurable binding
    would support cellulosome association.
  experiment_type: protein-protein interaction assay