ClpP2 (Rv2460c) is the proteolytic subunit of the essential ATP-dependent Clp protease in M. tuberculosis. Unlike most bacteria with a single ClpP homooligomer, Mtb requires both ClpP1 and ClpP2 to form a catalytically active heterotetradecameric complex (7 ClpP1 + 7 ClpP2 subunits). The protease uses a Ser-His-Asp catalytic triad (Ser110, His135 in ClpP2) for serine-type endopeptidase activity, with a preference for cleavage after leucine or methionine. ClpP1P2 collaborates with AAA+ unfoldases ClpC1 and ClpX, which unfold and translocate substrates into the proteolytic chamber. The complex degrades misfolded proteins for quality control and selectively degrades regulatory anti-sigma factors (RseA via ClpC1P2, RsdA via ClpP1P2-ClpX) to activate stress-responsive transcription. Both clpP1 and clpP2 are essential for Mtb viability, and ClpP1P2 is a validated drug target for tuberculosis therapy. Overexpression of clpP2 (but not clpP1) is toxic, suggesting distinct roles for the two subunits.
| GO Term | Evidence | Action | Reason |
|---|---|---|---|
| GO:0009368 endopeptidase Clp complex | IBA GO_REF:0000033 | ACCEPT | Summary: ClpP2 is a core subunit of the Clp protease complex. In Mtb, this is specifically the heterotetradecameric ClpP1P2 complex. The IBA annotation is well-supported by extensive structural and biochemical evidence. Reason: ClpP2 forms the ClpP1P2 heterocomplex, confirmed by crystal structure (PDB: 5DZK, 5E0S) and biochemical studies. Akopian et al. (2012) showed that only the mixed ClpP1-ClpP2 complex is proteolytically active. Supporting Evidence: PMID:22286948: "reconstitute an active, mixed ClpP1P2 complex after identifying N-blocked dipeptides that stimulate dramatically (>1000-fold) ClpP1P2 activity"; file:MYCTU/clpP2/clpP2-deep-research-bioreason-sft.md: "A cytosolic protease that assembles into a tetradecameric barrel and, when engaged by AAA+ ATPase partners, degrades misfolded, damaged, or regulatory proteins" |
| GO:0006515 protein quality control for misfolded or incompletely synthesized proteins | IBA GO_REF:0000033 | ACCEPT | Summary: Protein quality control is a well-established core function of Clp proteases. The ClpP1P2 complex, in collaboration with AAA+ ATPases, degrades misfolded and damaged proteins. UniProt function description states ClpP2 "plays a major role in the degradation of misfolded proteins." Reason: This is a conserved core function of ClpP proteases across bacteria. The Mtb ClpP1P2 complex specifically degrades misfolded proteins, consistent with IBA inference from E. coli ClpP (P0A6G7). Supported by biochemical evidence of processive degradation with AAA+ partners. Supporting Evidence: PMID:24976069: "Mycobacterial Clp-family proteases function via collaboration of the heteromeric ClpP1P2 peptidase with a AAA+ partner, ClpX or ClpC1. These enzymes are essential for M. tuberculosis viability"; PMID:22123255: "the proteolytic activity of ClpP in concert with the ATPase activity of the ClpX/ClpC subunits results in degradation of misfolded or damaged proteins" |
| GO:0004252 serine-type endopeptidase activity | IBA GO_REF:0000033 | ACCEPT | Summary: ClpP2 is a serine protease with a Ser-His-Asp catalytic triad. The active site serine (Ser110) acts as the nucleophile for peptide bond hydrolysis. This is a core molecular function annotation. Reason: Serine-type endopeptidase activity is confirmed by the catalytic triad (Ser110, His135), structural data, and enzymatic activity studies. ClpP1P2 prefers cleavage after leucine or methionine at the P1 position. Supporting Evidence: PMID:25759383: "we show that ClpP1P2 prefers Met ≫ Leu > Phe > Ala in the X1 position"; PMID:22286948: "ClpP1P2 possesses chymotrypsin and caspase-like activities, and ClpP1 and ClpP2 differ in cleavage preferences" |
| GO:0004176 ATP-dependent peptidase activity | IBA GO_REF:0000033 | ACCEPT | Summary: The ClpP1P2 complex carries out ATP-dependent proteolysis, but importantly the ATPase activity resides in the AAA+ partner unfoldases (ClpC1, ClpX), not in ClpP2 itself. ClpP2 contributes the peptidase activity to a complex whose overall function is ATP-dependent. The qualifier should be contributes_to rather than enables, since the ATP-dependent activity is a property of the holoenzyme complex (ClpXP1P2 or ClpC1P1P2). Reason: ClpP2 provides the proteolytic component of an ATP-dependent protease complex. ATP hydrolysis is performed by ClpX or ClpC1, not by ClpP2 itself. The annotation correctly reflects that the holoenzyme complex has ATP-dependent peptidase activity, and ClpP2 contributes to this. Supporting Evidence: PMID:24976069: "Both ClpX and ClpC1 catalyse ATP-dependent unfolding and degradation of native protein substrates in conjunction with ClpP1P2" |
| GO:0051117 ATPase binding | IBA GO_REF:0000033 | ACCEPT | Summary: ClpP2 directly binds the AAA+ ATPases ClpX and ClpC1. The interaction is asymmetric: per the cited study, both chaperones require the intact interaction face of ClpP2 and bind only the ClpP2 side of the proteolytic core. This binding is essential for substrate delivery and activation of the protease. Reason: Direct binding of ClpP2 to its AAA+ partners is well established. Leodolter et al. (2015) demonstrated that ClpX and ClpC1 both dock on the ClpP2 ring. Multiple structural studies confirm the ClpP-ATPase interface. Supporting Evidence: PMID:25933022: "both Mycobacterium tuberculosis Clp chaperones require the intact interaction face of ClpP2 to support degradation, resulting in an asymmetric complex where chaperones only bind to the ClpP2 side of the proteolytic core" |
| GO:0004176 ATP-dependent peptidase activity | IEA GO_REF:0000002 | ACCEPT | Summary: InterPro2GO annotation from IPR001907 (ATP-dependent Clp protease proteolytic subunit). Correctly captures the functional family assignment. Redundant with the IBA annotation above. Reason: Correct InterPro2GO mapping. IPR001907 is the ClpP protease family, and ATP-dependent peptidase activity is the appropriate molecular function for this family. Supported by the same biochemical evidence as the IBA annotation. |
| GO:0004252 serine-type endopeptidase activity | IEA GO_REF:0000120 | ACCEPT | Summary: Combined IEA annotation from InterPro:IPR001907 and EC:3.4.21.92. Redundant with the IBA annotation but correctly assigned. Reason: Correct and well-supported annotation. EC 3.4.21.92 is the specific EC number for endopeptidase Clp, and IPR001907 maps to the ClpP family. |
| GO:0005737 cytoplasm | IEA GO_REF:0000120 | ACCEPT | Summary: Cytoplasmic localization is correct for ClpP2. The protein lacks signal peptides and transmembrane domains and functions as a soluble protease in the cytoplasm, consistent with UniProt annotation based on HAMAP-Rule MF_00444. Reason: Cytoplasmic localization is appropriate for a soluble protease subunit. No signal peptide or transmembrane domains are present. Consistent with the known biology of Clp proteases as cytoplasmic quality-control enzymes. |
| GO:0006508 proteolysis | IEA GO_REF:0000120 | KEEP AS NON CORE | Summary: Proteolysis is correct but very general. The more specific annotations for protein quality control (GO:0006515) and the specific molecular function annotations provide better functional resolution. This parent term is acceptable as a broad IEA annotation. Reason: Technically correct but too general; the more informative annotations for protein quality control and specific substrate processing better describe ClpP2 function. Retained as a broad supporting annotation. |
| GO:0005515 protein binding | IPI PMID:16844784 Dissecting virulence pathways of Mycobacterium tuberculosis ... | REMOVE | Summary: This annotation derives from the M-PFC protein-protein interaction study by Singh et al. (2006). The "protein binding" term is too vague and uninformative per curation guidelines. ClpP2 has specific well-characterized protein interactions (with ClpP1, ClpX, ClpC1) that are better captured by more specific terms. The ATPase binding annotation (GO:0051117) already covers the key functional interaction. Reason: Per curation guidelines, GO:0005515 (protein binding) should be avoided as it does not convey meaningful functional information. The biologically relevant interactions of ClpP2 are captured by GO:0051117 (ATPase binding) and GO:0009368 (endopeptidase Clp complex). The IPI evidence from this high-throughput interaction screen does not add specificity. Supporting Evidence: PMID:16844784: "Rv3596c (ClpC1), an AAA-ATPase chaperone involved in protein translocation and quality control" |
| GO:0005886 plasma membrane | HDA PMID:14532352 Comprehensive proteomic profiling of the membrane constituen... | MARK AS OVER ANNOTATED | Summary: This annotation derives from a large-scale membrane proteomics study (Gu et al. 2003) that identified 739 proteins in the membrane fraction. The study notes that many identified proteins are soluble cytoplasmic proteins that copurify with membranes. ClpP2 lacks transmembrane domains and a signal peptide. UniProt annotates cytoplasm, not membrane. This is likely a copurification artifact rather than true membrane localization. Reason: ClpP2 is a soluble cytoplasmic protein (no TM domains, no signal peptide). The membrane proteomics study identified it as part of 739 proteins in the membrane fraction, many of which are known cytoplasmic contaminants. UniProt assigns cytoplasm via HAMAP-Rule MF_00444. The primary localization is cytoplasmic/cytosolic. Supporting Evidence: PMID:14532352: "739 proteins have been identified by two or more distinct peptide sequences...approximately 450 proteins represent novel identifications, 79 of which are membrane proteins and more than 100 of which are membrane-associated proteins"; file:MYCTU/clpP2/clpP2-deep-research-bioreason-sft.md: "Although primarily soluble, it likely interfaces with membrane- and cell wall-proximal quality-control pathways during envelope stress" |
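As a quick cross-check of the table above, the review actions can be tallied in a few lines. This is only a sketch: the `REVIEWS` list is transcribed by hand from the table, and the names `REVIEWS`, `tally_actions`, and `redundant_terms` are illustrative rather than part of any curation tool.

```python
from collections import Counter

# (GO id, evidence code, review action), transcribed by hand from the table above.
REVIEWS = [
    ("GO:0009368", "IBA", "ACCEPT"),
    ("GO:0006515", "IBA", "ACCEPT"),
    ("GO:0004252", "IBA", "ACCEPT"),
    ("GO:0004176", "IBA", "ACCEPT"),
    ("GO:0051117", "IBA", "ACCEPT"),
    ("GO:0004176", "IEA", "ACCEPT"),
    ("GO:0004252", "IEA", "ACCEPT"),
    ("GO:0005737", "IEA", "ACCEPT"),
    ("GO:0006508", "IEA", "KEEP AS NON CORE"),
    ("GO:0005515", "IPI", "REMOVE"),
    ("GO:0005886", "HDA", "MARK AS OVER ANNOTATED"),
]


def tally_actions(reviews):
    """Count how many annotations received each review action."""
    return Counter(action for _go_id, _evidence, action in reviews)


def redundant_terms(reviews):
    """Return GO ids that appear in more than one row of the table."""
    counts = Counter(go_id for go_id, _evidence, _action in reviews)
    return sorted(go_id for go_id, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(tally_actions(REVIEWS))
    print(redundant_terms(REVIEWS))
```

The second helper surfaces the two terms that the table itself flags as redundant between the IBA and IEA rows.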
Q: What are the specific in vivo substrates of ClpP1P2 beyond the anti-sigma factors RseA and RsdA? A comprehensive substrate identification study would illuminate the full scope of ClpP1P2 regulatory functions.
Q: What is the molecular basis for the toxicity of clpP2 overexpression but not clpP1 overexpression? Does excess ClpP2 form aberrant complexes or titrate away essential partners?
Suggested experts: Ollinger J, Parish T
Q: How does the functional asymmetry between ClpP1 and ClpP2 catalytic sites affect substrate processing in vivo? Are there substrates preferentially processed by one active site over the other?
Suggested experts: Nagpal J
Experiment: Trap-and-release proteomics using catalytically inactive ClpP1P2 mutants to identify the full set of in vivo substrates in different growth conditions (exponential, stationary, stress).
Hypothesis: ClpP1P2 has a broader substrate repertoire than currently known, including regulators of virulence and persistence programs.
Experiment: Characterization of clpP2 overexpression toxicity mechanism using proteomics and genetic suppressor analysis.
Hypothesis: Excess ClpP2 may form aberrant homomeric complexes or deplete ClpP1 from functional heterocomplexes, disrupting proteostasis.
I begin with the InterPro architecture, which is dominated by signatures of the ATP-dependent ClpP protease core. The central scaffold is IPR029045 (ClpP/crotonase-like domain superfamily, residues 24–208), a double β-β-α sandwich that forms the proteolytic barrel of ClpP-like peptidases. Nested within this fold are multiple family-level matches to the proteolytic subunit: IPR001907 (ATP-dependent Clp protease proteolytic subunit, residues 15–207; 31–46; 33–203; 71–91; 102–119; 182–201) and IPR023562 (Clp protease proteolytic subunit / Translocation-enhancing protein TepA, residues 17–208; 30–206). These overlapping entries collectively define the canonical ClpP protease subunit that assembles into a tetradecameric barrel with active sites facing the chamber interior. Two catalytic-site annotations pinpoint the catalytic machinery: IPR018215 (ClpP, Ser active site, residues 102–113) and IPR033135 (ClpP, histidine active site, residues 124–137). Together with the conserved acidic residue in the same family signatures, these features create a Ser-His-Asp triad that drives peptide-bond hydrolysis. The repeated IPR001907 entries across distinct spans indicate that multiple conserved submotifs are present, consistent with a proteolytically active barrel that requires partner ATPases for substrate delivery.
This architecture causes serine-type endopeptidase activity: the catalytic serine within the ClpP fold performs nucleophilic attack on peptide bonds, while the histidine and aspartate complete the charge-relay system. Because ClpP barrels lack ATPase modules, the protease itself does not hydrolyze ATP; instead, it binds AAA+ ATPase partners (e.g., ClpX or ClpC) that unfold substrates and translocate them through axial pores into the proteolytic chamber. The presence of multiple ClpP family signatures and the catalytic-site annotations therefore specify the molecular functions as serine-type endopeptidase activity and protein binding (the latter arising from obligatory assembly with AAA+ unfoldases and adaptor proteins).
At the process level, a ClpP barrel that is activated by ATP-dependent unfoldases is central to protein quality control. Under heat stress, misfolded and damaged proteins accumulate; AAA+ partners recognize degrons and feed substrates into the ClpP chamber for processive degradation. This mechanistic coupling places the enzyme in the response to heat, where it eliminates aggregation-prone proteins and restores proteostasis. More broadly, the same chemistry underlies cellular protein catabolic processes, including turnover of regulatory factors and removal of damaged proteins.
The cellular context follows from the protease's solubility and its reliance on cytosolic AAA+ partners. The absence of transmembrane segments and the soluble ClpP fold indicate a cytosolic enzyme, aligning with cytosol as a primary location. In bacteria, AAA+ proteases can transiently associate with the inner membrane and the cell wall interface during stress responses or when processing envelope-associated substrates; this explains observed associations with the plasma membrane and peptidoglycan-based cell wall as peripheral or transiently engaged locales rather than integral residence.
Mechanistically, the protein assembles into a tetradecameric barrel that binds ClpX or ClpC via their N-terminal domains. ATP hydrolysis by the AAA+ partners drives substrate unfolding and axial translocation, positioning polypeptides into the ClpP chamber where the Ser-His-Asp triad executes endoproteolysis. Chaperones such as trigger factor and GroES/GroEL and disaggregase ClpB likely hand off or remodel substrates to the AAA+–ClpP complexes. The interaction list supports this model: ClpP1 suggests hetero-oligomeric pairing of ClpP paralogs; ClpX and ClpC are the ATPase unfoldases that gate substrate entry; ClpB, Tig (trigger factor), and GroES cooperate upstream to triage misfolded proteins; and additional partners (e.g., ClpB protein F84.1, GroES, and uncharacterized large proteins annotated as possible ClpC-like ATPases) suggest a network of proteostasis factors. Association with the 50S ribosomal protein L11 (RplK) hints at cotranslational quality control or turnover of ribosome-associated factors during stress. Overall, the domain architecture dictates a serine endopeptidase that, when coupled to AAA+ ATPases, degrades misfolded and regulatory proteins to maintain proteostasis during heat stress in the cytosol, with transient membrane/cell wall engagement for envelope-related quality control.
## Functional Summary
A cytosolic protease that assembles into a tetradecameric barrel and, when engaged by AAA+ ATPase partners, degrades misfolded, damaged, or regulatory proteins to maintain proteostasis during heat stress. It recognizes substrates via partner ATPases, uses ATP hydrolysis by those partners to unfold and translocate polypeptides into its proteolytic chamber, and employs a catalytic serine-histidine-aspartate triad to cleave internal peptide bonds. Although primarily soluble, it likely interfaces with membrane- and cell wall-proximal quality-control pathways during envelope stress, coordinating with chaperones and disaggregases to triage substrates.
## UniProt Summary
Component of the ClpC-ClpP protease. Involved in degradation of denatured proteins. Has serine protease activity.
## InterPro Domains
- IPR001907: ATP-dependent Clp protease proteolytic subunit (family) [15-207]
- IPR023562: Clp protease proteolytic subunit /Translocation-enhancing protein TepA (family) [17-208]
- IPR029045: ClpP/crotonase-like domain superfamily (homologous_superfamily) [24-208]
- IPR023562: Clp protease proteolytic subunit /Translocation-enhancing protein TepA (family) [30-206]
- IPR001907: ATP-dependent Clp protease proteolytic subunit (family) [31-46]
- IPR001907: ATP-dependent Clp protease proteolytic subunit (family) [33-203]
- IPR001907: ATP-dependent Clp protease proteolytic subunit (family) [71-91]
- IPR001907: ATP-dependent Clp protease proteolytic subunit (family) [102-119]
- IPR018215: ClpP, Ser active site (active_site) [102-113]
- IPR001907: ATP-dependent Clp protease proteolytic subunit (family) [123-142]
- IPR033135: ClpP, histidine active site (active_site) [124-137]
- IPR001907: ATP-dependent Clp protease proteolytic subunit (family) [182-201]
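The spans listed above can be checked against the catalytic residues named in this review (Ser110 and His135 in ClpP2). The following is a minimal sketch, assuming 1-based inclusive coordinates as InterPro reports them; the helper names `within`, `nested`, and `check_active_sites` are hypothetical and chosen only for illustration.

```python
# InterPro spans (1-based, inclusive) copied from the list above.
SUPERFAMILY = (24, 208)  # IPR029045, ClpP/crotonase-like domain superfamily
ACTIVE_SITE_SPANS = {
    "IPR018215": (102, 113),  # ClpP, Ser active site
    "IPR033135": (124, 137),  # ClpP, histidine active site
}
# Catalytic residue positions as stated in this review for ClpP2.
CATALYTIC_RESIDUES = {"IPR018215": ("Ser", 110), "IPR033135": ("His", 135)}


def within(position, span):
    """True if a residue position falls inside an inclusive span."""
    start, end = span
    return start <= position <= end


def nested(inner, outer):
    """True if the inner span lies entirely inside the outer span."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def check_active_sites():
    """Map each active-site entry to (residue in signature, signature in superfamily)."""
    results = {}
    for entry, span in ACTIVE_SITE_SPANS.items():
        _name, position = CATALYTIC_RESIDUES[entry]
        results[entry] = (within(position, span), nested(span, SUPERFAMILY))
    return results


if __name__ == "__main__":
    for entry, (residue_ok, nested_ok) in check_active_sites().items():
        print(entry, "residue in signature:", residue_ok, "| inside superfamily:", nested_ok)
```

Both active-site signatures contain their stated catalytic residue and sit inside the superfamily span, which is consistent with the triad description used throughout this review.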
## GO Term Predictions
### Molecular Function
### Biological Process
### Cellular Component
Both clpP1 and clpP2 are essential for M. tuberculosis viability, a property that distinguishes Mtb from organisms like E. coli where clpP deletion is tolerated. Ollinger et al. (2012) demonstrated this through genetic approaches: "ClpP1 is essential for viability in this organism in culture, since the gene could only be deleted from the chromosome when a second functional copy was provided. Overexpression of clpP1 had no effect on growth in aerobic culture... In contrast, clpP2 overexpression was toxic, suggesting different roles for the two homologs" [PMID:22123255, "clpP2 overexpression was toxic, suggesting different roles for the two homologs"]. This essentiality is the basis for treating ClpP1P2 as a drug target.
Unlike most bacteria that have a single ClpP forming a homocomplex, M. tuberculosis requires both ClpP1 and ClpP2 to form the active protease. Akopian et al. (2012) demonstrated: "The active ClpP protease from M. tuberculosis is a complex composed of a heptameric ClpP1 and a ClpP2 ring" [PMID:22286948, "Although each formed a tetradecameric structure in vitro, neither was active alone... only when ClpP1 and ClpP2 were mixed together did a proteolytically active complex form"]. This hetero-complex requires the dipeptide activator Z-Leu-Leu (or similar N-blocked peptides) for in vitro activation.
Raju et al. (2014) characterized the cleavage specificity: "Using a fluorogenic tripeptide library... and by determining kinetic parameters for single fluorogenic substrates, we found that ClpP1P2 has a marked preference for cleavage after leucine or methionine at the P1 position" PMID:25759383. The protease uses a catalytic Ser-His-Asp triad (Ser110, His135 in ClpP2).
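To make the P1 preference concrete, the sketch below scans a peptide for positions where the residue before the scissile bond is Leu or Met. It is a simplification: the source describes a preference, not an absolute rule, and the function name `candidate_cleavage_sites` and the example peptide are hypothetical.

```python
# P1 residues preferred by ClpP1P2 according to the text above (Leu, Met).
P1_PREFERRED = {"L", "M"}


def candidate_cleavage_sites(peptide):
    """Return 1-based positions of preferred P1 residues in a one-letter sequence.

    Cleavage would occur after each reported position. The final residue is
    skipped because no peptide bond follows it.
    """
    return [
        index + 1
        for index, residue in enumerate(peptide[:-1])
        if residue in P1_PREFERRED
    ]


if __name__ == "__main__":
    example = "AKLGMSTL"  # hypothetical peptide, not a known ClpP1P2 substrate
    print(candidate_cleavage_sites(example))
```

For the example peptide, the positions flagged are Leu3 and Met5; the C-terminal Leu8 is ignored because there is nothing downstream to cleave.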
ClpP1P2 collaborates with two AAA+ unfoldases, ClpC1 and ClpX. Sauer et al. (2014) showed: "Substrate delivery by the AAA+ ClpX and ClpC1 unfoldases activates the mycobacterial ClpP1P2 peptidase" PMID:24976069. The interaction is asymmetric: "The Mycobacterium tuberculosis ClpP1P2 Protease Interacts Asymmetrically with Its ATPase Partners ClpX and ClpC1" PMID:25933022. Leodolter et al. (2015) reported that "both Mycobacterium tuberculosis Clp chaperones require the intact interaction face of ClpP2 to support degradation, resulting in an asymmetric complex where chaperones only bind to the ClpP2 side of the proteolytic core."
Nagpal et al. (2022) demonstrated: "Unlike in E. coli, the Mtb Clp protease consists of two distinct proteolytic subunits, ClpP1 and ClpP2, which hydrolyze substrates cooperatively but have different catalytic-site specificities" PMID:35507665. They showed the complex is functionally asymmetric in vivo.
Vahidi et al. (2020) solved the cryo-EM structure and identified: "An allosteric switch regulates Mtb ClpP1P2 protease function" PMID:32123115. The N-blocked dipeptide activators bind in the active site clefts and trigger a conformational change from an inactive compressed form to an extended active form.
Ingvarsson et al. (2014) solved the crystal structure: "Crystal structure of Mycobacterium tuberculosis ClpP1P2 suggests a model for peptidase activation by AAA+ partner binding and substrate delivery" PMID:25267638. The structure reveals the heterotetradecameric barrel with 7 ClpP1 and 7 ClpP2 subunits forming two stacked heptameric rings.
ClpP2 (in complex with ClpP1 and partner ATPases) degrades specific regulatory substrates:
- Anti-sigma-E factor RseA: degraded by ClpC1P2 complex upon PknB-dependent phosphorylation, activating the SigE regulon under envelope stress [PMID:20025669, "PknB-dependent phosphorylation of RseA on T39 is required for its cleavage by ClpC1P2 thereby activating the SigE regulon"]
- Anti-sigma-D factor RsdA: degraded when in complex with ClpP1 and ClpX, providing selective regulation of sigma factor activity PMID:23314154
- Does NOT act on anti-sigma-L factor RslA (selectivity among anti-sigma factors)
Multiple classes of compounds target ClpP1P2:
- Acyldepsipeptides (ADEPs) dysregulate ClpP activity [PMID:26919556, PMID:32083462]
- Bortezomib (proteasome inhibitor) inhibits ClpP1P2 [PMID:25944857, PMID:28193668]
- Beta-lactones and pyrrole-based compounds PMID:38088921
- Ilamycins/rufomycins target ClpC1 PMID:36286456
ClpP2 is primarily a cytoplasmic/cytosolic protein. The plasma membrane identification in Gu et al. (2003) PMID:14532352 is from a large-scale membrane fraction proteomics study that identified 739 proteins, many of which are soluble cytoplasmic proteins co-purifying with membranes. This is likely a contaminant rather than true membrane localization. UniProt assigns cytoplasm based on HAMAP rule MF_00444.
PMID:16844784 (Singh et al. 2006) developed the M-PFC system and identified ClpC1 as interacting with Cfp-10. The protein binding annotation for ClpP2 from this reference likely reflects the broader Clp network interactions. The direct physically validated partners are ClpP1 (heterooligomerization) and ClpX/ClpC1 (AAA+ partner binding).
ClpP2 is the proteolytic subunit of the essential ClpP1P2 heterotetradecameric protease in M. tuberculosis. It forms a barrel-like structure with ClpP1 (7+7 subunits). The protease requires AAA+ ATPase partners (ClpC1, ClpX) for substrate unfolding and delivery. Key functions include:
1. Serine-type endopeptidase activity (Ser-His-Asp catalytic triad)
2. ATP-dependent proteolysis (through partner ATPases)
3. Protein quality control (degradation of misfolded proteins)
4. Regulated proteolysis of anti-sigma factors (RseA, RsdA) for stress response
5. Essential for viability - validated drug target
Source: clpP2-deep-research-bioreason-sft.md
The BioReason functional summary describes clpP2 as:
A cytosolic protease that assembles into a tetradecameric barrel and, when engaged by AAA+ ATPase partners, degrades misfolded, damaged, or regulatory proteins to maintain proteostasis during heat stress. It recognizes substrates via partner ATPases, uses ATP hydrolysis by those partners to unfold and translocate polypeptides into its proteolytic chamber, and employs a catalytic serine-histidine-aspartate triad to cleave internal peptide bonds. Although primarily soluble, it likely interfaces with membrane- and cell wall-proximal quality-control pathways during envelope stress, coordinating with chaperones and disaggregases to triage substrates.
This is a largely accurate summary of ClpP2 function. The core biology is correct: tetradecameric barrel assembly, collaboration with AAA+ ATPases, Ser-His-Asp catalytic triad, and role in protein quality control. However, there are several issues:
Correctness issues (minor):
The summary overemphasizes "heat stress" as the primary context. While Clp proteases do function during heat stress, the Mtb ClpP1P2 system is essential for viability under all conditions, not just heat stress. The essentiality of clpP2 for normal growth (PMID:22123255) is a critical point that the summary misses.
The summary states ClpP2 "assembles into a tetradecameric barrel" without mentioning the essential heteromeric nature of the complex. The unique Mtb biology -- that ClpP1P2 is a HETERO-tetradecamer requiring both ClpP1 and ClpP2 (PMID:22286948) -- is completely absent. This is the single most important distinguishing feature of the Mtb Clp system.
The claim that ClpP2 "likely interfaces with membrane- and cell wall-proximal quality-control pathways during envelope stress" is speculative. The thinking trace infers this from GOA localization annotations (plasma membrane, peptidoglycan-based cell wall), but the membrane annotation from PMID:14532352 is likely a copurification artifact from a large-scale proteomics study. The actual envelope stress connection is through RseA degradation activating the SigE regulon (PMID:20025669), which the summary does not mention.
The mention of "coordinating with chaperones and disaggregases to triage substrates" including "GroES/GroEL and disaggregase ClpB" in the thinking trace is speculative and not specific to ClpP2 function. No evidence supports direct coordination between ClpP2 and GroEL/ES in Mtb.
Completeness issues:
Comparison with interpro2go:
The interpro2go annotation (GO_REF:0000002) maps IPR001907 to GO:0004176 (ATP-dependent peptidase activity). This is a correct and straightforward mapping. The BioReason summary recapitulates the same information that would be inferred from interpro2go -- the ClpP family membership, serine-type endopeptidase activity, and ATP-dependent proteolysis. The narrative adds some mechanistic context (AAA+ partner requirement, barrel assembly, catalytic triad) that goes modestly beyond a simple interpro2go mapping, but the biological insight is largely domain-architecture-driven rather than literature-informed.
Critically, the BioReason SFT trace does not capture the organism-specific biology that makes Mtb ClpP2 distinctive: the ClpP1-ClpP2 heteromer requirement, the essentiality, the specific substrates, or the drug target relevance. These aspects require literature knowledge beyond what can be inferred from protein domains alone. The functional summary reads like a generic ClpP description that could apply to any bacterial ClpP rather than specifically to the Mtb enzyme.
The interpro2go annotations are correct but minimal. BioReason provides a more narrative treatment but does not substantially surpass interpro2go in biological accuracy or specificity for this particular protein.
The thinking trace is methodical and follows a domain-architecture-first reasoning approach. It correctly identifies all InterPro entries and builds upward from the catalytic site annotations (IPR018215, IPR033135) to the family-level (IPR001907) and superfamily (IPR029045). The reasoning about the Ser-His-Asp triad is correct.
The trace is weakest where it attempts to infer biological context without literature support. The claims about "response to heat" being the primary context, about "transient membrane/cell wall engagement for envelope-related quality control," and about cooperation with "trigger factor and GroES/GroEL" are all either speculative or generic rather than Mtb-specific. The mention of "50S ribosomal protein L11 (RplK)" interaction suggesting "cotranslational quality control" appears to come from the interaction data but is not validated for ClpP2.
The trace notably fails to identify the heterocomplex requirement with ClpP1, which is arguably the most important biological insight about this protein. This is understandable since the domain architecture alone cannot reveal this -- it requires organism-specific biochemical knowledge.
id: P9WPC3
gene_symbol: clpP2
product_type: PROTEIN
status: DRAFT
taxon:
id: NCBITaxon:83332
label: Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv)
description: >-
ClpP2 (Rv2460c) is the proteolytic subunit of the essential ATP-dependent
Clp protease in M. tuberculosis. Unlike most bacteria with a single ClpP
homooligomer, Mtb requires both ClpP1 and ClpP2 to form a catalytically
active heterotetradecameric complex (7 ClpP1 + 7 ClpP2 subunits). The
protease uses a Ser-His-Asp catalytic triad (Ser110, His135 in ClpP2) for
serine-type endopeptidase activity, with a preference for cleavage after
leucine or methionine. ClpP1P2 collaborates with AAA+ unfoldases ClpC1
and ClpX, which unfold and translocate substrates into the proteolytic
chamber. The complex degrades misfolded proteins for quality control and
selectively degrades regulatory anti-sigma factors (RseA via ClpC1P2,
RsdA via ClpP1P2-ClpX) to activate stress-responsive transcription.
Both clpP1 and clpP2 are essential for Mtb viability, and ClpP1P2 is a
validated drug target for tuberculosis therapy. Overexpression of clpP2
(but not clpP1) is toxic, suggesting distinct roles for the two subunits.
existing_annotations:
- term:
id: GO:0009368
label: endopeptidase Clp complex
evidence_type: IBA
original_reference_id: GO_REF:0000033
review:
summary: >-
ClpP2 is a core subunit of the Clp protease complex. In Mtb, this is
specifically the heterotetradecameric ClpP1P2 complex. The IBA annotation
is well-supported by extensive structural and biochemical evidence.
action: ACCEPT
reason: >-
ClpP2 forms the ClpP1P2 heterocomplex, confirmed by crystal structure
(PDB: 5DZK, 5E0S) and biochemical studies. Akopian et al. (2012) showed
that only the mixed ClpP1-ClpP2 complex is proteolytically active.
supported_by:
- reference_id: PMID:22286948
supporting_text: "reconstitute an active, mixed ClpP1P2 complex after identifying N-blocked dipeptides that stimulate dramatically (>1000-fold) ClpP1P2 activity"
- reference_id: file:MYCTU/clpP2/clpP2-deep-research-bioreason-sft.md
supporting_text: "A cytosolic protease that assembles into a tetradecameric barrel and, when engaged by AAA+ ATPase partners, degrades misfolded, damaged, or regulatory proteins"
- term:
id: GO:0006515
label: protein quality control for misfolded or incompletely synthesized proteins
evidence_type: IBA
original_reference_id: GO_REF:0000033
review:
summary: >-
Protein quality control is a well-established core function of Clp proteases.
The ClpP1P2 complex, in collaboration with AAA+ ATPases, degrades misfolded
and damaged proteins. UniProt function description states ClpP2 "plays a
major role in the degradation of misfolded proteins."
action: ACCEPT
reason: >-
This is a conserved core function of ClpP proteases across bacteria. The
Mtb ClpP1P2 complex specifically degrades misfolded proteins, consistent
with IBA inference from E. coli ClpP (P0A6G7). Supported by biochemical
evidence of processive degradation with AAA+ partners.
supported_by:
- reference_id: PMID:24976069
supporting_text: "Mycobacterial Clp-family proteases function via collaboration of the heteromeric ClpP1P2 peptidase with a AAA+ partner, ClpX or ClpC1. These enzymes are essential for M. tuberculosis viability"
- reference_id: PMID:22123255
supporting_text: "the proteolytic activity of ClpP in concert with the ATPase activity of the ClpX/ClpC subunits results in degradation of misfolded or damaged proteins"
- term:
id: GO:0004252
label: serine-type endopeptidase activity
evidence_type: IBA
original_reference_id: GO_REF:0000033
review:
summary: >-
ClpP2 is a serine protease with a Ser-His-Asp catalytic triad. The
active site serine (Ser110) acts as the nucleophile for peptide bond
hydrolysis. This is a core molecular function annotation.
action: ACCEPT
reason: >-
Serine-type endopeptidase activity is confirmed by the catalytic triad
(Ser110, His135), structural data, and enzymatic activity studies. ClpP1P2
prefers cleavage after leucine or methionine at the P1 position.
supported_by:
- reference_id: PMID:25759383
supporting_text: "we show that ClpP1P2 prefers Met ≫ Leu > Phe > Ala in the X1 position"
- reference_id: PMID:22286948
supporting_text: "ClpP1P2 possesses chymotrypsin and caspase-like activities, and ClpP1 and ClpP2 differ in cleavage preferences"
- term:
id: GO:0004176
label: ATP-dependent peptidase activity
evidence_type: IBA
original_reference_id: GO_REF:0000033
review:
summary: >-
The ClpP1P2 complex carries out ATP-dependent proteolysis, but importantly
the ATPase activity resides in the AAA+ partner unfoldases (ClpC1, ClpX),
not in ClpP2 itself. ClpP2 contributes the peptidase activity to a complex
whose overall function is ATP-dependent. The qualifier should be
contributes_to rather than enables, since the ATP-dependent activity
is a property of the holoenzyme complex (ClpXP1P2 or ClpC1P1P2).
action: ACCEPT
reason: >-
ClpP2 provides the proteolytic component of an ATP-dependent protease
complex. ATP hydrolysis is performed by ClpX or ClpC1, not by ClpP2
itself. The annotation correctly reflects that the holoenzyme complex
has ATP-dependent peptidase activity, and ClpP2 contributes to this.
supported_by:
- reference_id: PMID:24976069
supporting_text: "Both ClpX and ClpC1 catalyse ATP-dependent unfolding and degradation of native protein substrates in conjunction with ClpP1P2"
- term:
id: GO:0051117
label: ATPase binding
evidence_type: IBA
original_reference_id: GO_REF:0000033
review:
summary: >-
ClpP2 directly binds the AAA+ ATPases ClpX and ClpC1. The interaction
is asymmetric: both ClpX and ClpC1 require the intact interaction face
of ClpP2 and bind only the ClpP2 side of the proteolytic core. This
binding is essential for substrate delivery and activation of the protease.
action: ACCEPT
reason: >-
Direct binding of ClpP2 to ClpX and ClpC1 is well established. Leodolter
et al. (2015) demonstrated that both chaperones bind only the ClpP2 side
of the proteolytic core. Multiple structural studies confirm the
ClpP-ATPase interface.
supported_by:
- reference_id: PMID:25933022
supporting_text: "both Mycobacterium tuberculosis Clp chaperones require the intact interaction face of ClpP2 to support degradation, resulting in an asymmetric complex where chaperones only bind to the ClpP2 side of the proteolytic core"
- term:
id: GO:0004176
label: ATP-dependent peptidase activity
evidence_type: IEA
original_reference_id: GO_REF:0000002
review:
summary: >-
InterPro2GO annotation from IPR001907 (ATP-dependent Clp protease
proteolytic subunit). Correctly captures the functional family assignment.
Redundant with the IBA annotation above.
action: ACCEPT
reason: >-
Correct InterPro2GO mapping. IPR001907 is the ClpP protease family,
and ATP-dependent peptidase activity is the appropriate molecular
function for this family. Supported by the same biochemical evidence
as the IBA annotation.
- term:
id: GO:0004252
label: serine-type endopeptidase activity
evidence_type: IEA
original_reference_id: GO_REF:0000120
review:
summary: >-
Combined IEA annotation from InterPro:IPR001907 and EC:3.4.21.92.
Redundant with the IBA annotation but correctly assigned.
action: ACCEPT
reason: >-
Correct and well-supported annotation. EC 3.4.21.92 is the specific
EC number for endopeptidase Clp, and IPR001907 maps to the ClpP family.
- term:
id: GO:0005737
label: cytoplasm
evidence_type: IEA
original_reference_id: GO_REF:0000120
review:
summary: >-
Cytoplasmic localization is correct for ClpP2. The protein lacks
signal peptides and transmembrane domains and functions as a soluble
protease in the cytoplasm, consistent with UniProt annotation based
on HAMAP-Rule MF_00444.
action: ACCEPT
reason: >-
Cytoplasmic localization is appropriate for a soluble protease subunit.
No signal peptide or transmembrane domains are present. Consistent with
the known biology of Clp proteases as cytoplasmic quality-control enzymes.
- term:
id: GO:0006508
label: proteolysis
evidence_type: IEA
original_reference_id: GO_REF:0000120
review:
summary: >-
Proteolysis is correct but very general. The more specific annotations
for protein quality control (GO:0006515) and the specific molecular
function annotations provide better functional resolution. This parent
term is acceptable as a broad IEA annotation.
action: KEEP_AS_NON_CORE
reason: >-
Technically correct but too general; the more informative annotations
for protein quality control and specific substrate processing better
describe ClpP2 function. Retained as a broad supporting annotation.
- term:
id: GO:0005515
label: protein binding
evidence_type: IPI
original_reference_id: PMID:16844784
review:
summary: >-
This annotation derives from the M-PFC protein-protein interaction study
by Singh et al. (2006). The "protein binding" term is too vague and
uninformative per curation guidelines. ClpP2 has specific well-characterized
protein interactions (with ClpP1, ClpX, ClpC1) that are better captured by
more specific terms. The ATPase binding annotation (GO:0051117) already
covers the key functional interaction.
action: REMOVE
reason: >-
Per curation guidelines, GO:0005515 (protein binding) should be avoided
as it does not convey meaningful functional information. The biologically
relevant interactions of ClpP2 are captured by GO:0051117 (ATPase binding)
and GO:0009368 (endopeptidase Clp complex). The IPI evidence from this
high-throughput interaction screen does not add specificity.
supported_by:
- reference_id: PMID:16844784
supporting_text: "Rv3596c (ClpC1), an AAA-ATPase chaperone involved in protein translocation and quality control"
- term:
id: GO:0005886
label: plasma membrane
evidence_type: HDA
original_reference_id: PMID:14532352
review:
summary: >-
This annotation derives from a large-scale membrane proteomics study
(Gu et al. 2003) that identified 739 proteins in the membrane fraction.
The study notes that many identified proteins are soluble cytoplasmic
proteins that copurify with membranes. ClpP2 lacks transmembrane domains
and a signal peptide. UniProt annotates cytoplasm, not membrane. This is
likely a copurification artifact rather than true membrane localization.
action: MARK_AS_OVER_ANNOTATED
reason: >-
ClpP2 is a soluble cytoplasmic protein (no TM domains, no signal peptide).
The membrane proteomics study identified it among 739 proteins in
the membrane fraction, many of which are known cytoplasmic contaminants.
UniProt assigns cytoplasm via HAMAP-Rule MF_00444. The primary localization
is cytoplasmic/cytosolic.
supported_by:
- reference_id: PMID:14532352
supporting_text: "739 proteins have been identified by two or more distinct peptide sequences...approximately 450 proteins represent novel identifications, 79 of which are membrane proteins and more than 100 of which are membrane-associated proteins"
- reference_id: file:MYCTU/clpP2/clpP2-deep-research-bioreason-sft.md
supporting_text: "Although primarily soluble, it likely interfaces with membrane- and cell wall-proximal quality-control pathways during envelope stress"
references:
- id: GO_REF:0000002
title: Gene Ontology annotation through association of InterPro records with GO terms
findings: []
- id: GO_REF:0000033
title: Annotation inferences using phylogenetic trees
findings: []
- id: GO_REF:0000120
title: Combined Automated Annotation using Multiple IEA Methods
findings: []
- id: PMID:9634230
title: Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence.
findings:
- statement: Complete genome sequence of M. tuberculosis H37Rv, identifying Rv2460c (clpP2).
supporting_text: "The genome comprises 4,411,529 base pairs, contains around 4,000 genes"
- id: PMID:14532352
title: Comprehensive proteomic profiling of the membrane constituents of a Mycobacterium tuberculosis strain.
findings:
- statement: Large-scale membrane proteomics; identified ClpP2 in membrane fraction (likely contaminant).
supporting_text: "739 proteins have been identified by two or more distinct peptide sequences...approximately 450 proteins represent novel identifications, 79 of which are membrane proteins and more than 100 of which are membrane-associated proteins"
- id: PMID:16844784
title: Dissecting virulence pathways of Mycobacterium tuberculosis through protein-protein association.
findings:
- statement: Developed M-PFC interaction system; identified ClpC1 interactions in Mtb.
supporting_text: "Rv3596c (ClpC1), an AAA-ATPase chaperone involved in protein translocation and quality control"
- id: PMID:20025669
title: RseA, the SigE specific anti-sigma factor of Mycobacterium tuberculosis, is inactivated by phosphorylation-dependent ClpC1P2 proteolysis.
findings:
- statement: ClpC1P2 degrades anti-sigma-E factor RseA upon PknB-dependent phosphorylation at T39, activating the SigE regulon under envelope stress.
supporting_text: "PknB-dependent phosphorylation of RseA on T39 is required for its cleavage by ClpC1P2 thereby activating the SigE regulon"
- id: PMID:21969609
title: Proteogenomic analysis of Mycobacterium tuberculosis by high resolution mass spectrometry.
findings:
- statement: ClpP2 identified by mass spectrometry, confirming expression at the protein level.
supporting_text: "we identified 3176 proteins from Mycobacterium tuberculosis representing ~80% of its total predicted gene count"
- id: PMID:22123255
title: Validation of the essential ClpP protease in Mycobacterium tuberculosis as a novel drug target.
findings:
- statement: Both clpP1 and clpP2 are essential for viability. clpP2 overexpression is toxic. ADEPs are active against Mtb.
supporting_text: "clpP2 overexpression was toxic, suggesting different roles for the two homologs"
- id: PMID:22286948
title: The active ClpP protease from M. tuberculosis is a complex composed of a heptameric ClpP1 and a ClpP2 ring.
findings:
- statement: Active protease requires both ClpP1 and ClpP2 mixed together; neither homomeric form is active alone. Requires dipeptide activator for in vitro activity.
supporting_text: "We could, however, reconstitute an active, mixed ClpP1P2 complex after identifying N-blocked dipeptides that stimulate dramatically (>1000-fold) ClpP1P2 activity"
- id: PMID:23314154
title: "Mycobacterium tuberculosis RsdA provides a conformational rationale for selective regulation of \u03C3-factor activity by proteolysis."
findings:
- statement: ClpP1P2-ClpX complex degrades anti-sigma-D factor RsdA, providing selective regulation of sigma factor activity.
supporting_text: "the M. tuberculosis ClpX-ClpP1-ClpP2 proteolytic complex could specifically degrade the ASD of RsdA"
- id: PMID:24976069
title: Substrate delivery by the AAA+ ClpX and ClpC1 unfoldases activates the mycobacterial ClpP1P2 peptidase.
findings:
- statement: AAA+ unfoldases ClpX and ClpC1 deliver substrates to ClpP1P2 for degradation. Substrate delivery itself activates the peptidase.
supporting_text: "Substrate delivery by the AAA+ ClpX and ClpC1 unfoldases activates the mycobacterial ClpP1P2 peptidase"
- id: PMID:25267638
title: Crystal structure of Mycobacterium tuberculosis ClpP1P2 suggests a model for peptidase activation by AAA+ partner binding and substrate delivery.
full_text_unavailable: true
findings:
- statement: Crystal structure reveals heterotetradecameric barrel with 7 ClpP1 and 7 ClpP2 subunits.
full_text_unavailable: true
- id: PMID:25759383
title: "Cleavage Specificity of Mycobacterium tuberculosis ClpP1P2 Protease and Identification of Novel Peptide Substrates and Boronate Inhibitors with Anti-bacterial Activity."
findings:
- statement: ClpP1P2 has a marked preference for cleavage after leucine or methionine at the P1 position.
supporting_text: "we show that ClpP1P2 prefers Met ≫ Leu > Phe > Ala in the X1 position"
- id: PMID:25933022
title: The Mycobacterium tuberculosis ClpP1P2 Protease Interacts Asymmetrically with Its ATPase Partners ClpX and ClpC1.
findings:
- statement: Both ClpX and ClpC1 require the intact interaction face of ClpP2, so the chaperones bind only the ClpP2 side of the ClpP1P2 core.
supporting_text: "both Mycobacterium tuberculosis Clp chaperones require the intact interaction face of ClpP2 to support degradation"
- id: PMID:26919556
title: Acyldepsipeptide antibiotics kill mycobacteria by preventing the physiological functions of the ClpP1P2 protease.
findings:
- statement: ADEPs kill mycobacteria by preventing the physiological functions of the ClpP1P2 protease, providing chemical validation of ClpP1P2 as a drug target.
supporting_text: "mycobacteria are killed by ADEP through inhibition of ClpP function"
- id: PMID:32123115
title: An allosteric switch regulates Mycobacterium tuberculosis ClpP1P2 protease function as established by cryo-EM and methyl-TROSY NMR.
full_text_unavailable: true
findings:
- statement: Cryo-EM structure reveals allosteric activation mechanism of ClpP1P2 triggered by dipeptide activators and substrate delivery.
full_text_unavailable: true
- id: PMID:35507665
title: The essential M. tuberculosis Clp protease is functionally asymmetric in vivo.
full_text_unavailable: true
findings:
- statement: ClpP1 and ClpP2 have different catalytic-site specificities and the complex is functionally asymmetric in vivo.
full_text_unavailable: true
core_functions:
- description: >-
Serine-type endopeptidase that forms the proteolytic core of the essential
ClpP1P2 heterotetradecameric protease complex. ClpP2 provides one of the
two heptameric rings (with ClpP1 providing the other) of the barrel-shaped
proteolytic chamber. Uses a Ser-His-Asp catalytic triad (Ser110, His135)
to cleave peptide bonds, with preference for leucine and methionine at the
P1 position. The complex requires AAA+ ATPase partners (ClpC1 or ClpX) for
substrate unfolding and delivery. ClpX binds specifically to the ClpP2 ring.
Functions in protein quality control (degradation of misfolded proteins) and
regulated proteolysis of anti-sigma factors (RseA, RsdA) to activate
stress-responsive transcriptional programs. Essential for M. tuberculosis viability.
supported_by:
- reference_id: PMID:22286948
supporting_text: "reconstitute an active, mixed ClpP1P2 complex after identifying N-blocked dipeptides that stimulate dramatically (>1000-fold) ClpP1P2 activity"
- reference_id: PMID:25759383
supporting_text: "we show that ClpP1P2 prefers Met ≫ Leu > Phe > Ala in the X1 position"
- reference_id: PMID:25933022
supporting_text: "both Mycobacterium tuberculosis Clp chaperones require the intact interaction face of ClpP2 to support degradation"
- reference_id: PMID:22123255
supporting_text: "clpP2 overexpression was toxic, suggesting different roles for the two homologs"
molecular_function:
id: GO:0004252
label: serine-type endopeptidase activity
contributes_to_molecular_function:
id: GO:0004176
label: ATP-dependent peptidase activity
directly_involved_in:
- id: GO:0006515
label: protein quality control for misfolded or incompletely synthesized proteins
locations:
- id: GO:0005737
label: cytoplasm
in_complex:
id: GO:0009368
label: endopeptidase Clp complex
- description: >-
Regulated proteolysis of anti-sigma-E factor RseA via the ClpC1-ClpP1P2
complex. PknB-dependent phosphorylation of RseA at T39 triggers its
degradation, releasing SigE to activate the envelope stress regulon.
This represents a specific regulatory function beyond general protein
quality control.
supported_by:
- reference_id: PMID:20025669
supporting_text: "PknB-dependent phosphorylation of RseA on T39 is required for its cleavage by ClpC1P2 thereby activating the SigE regulon"
molecular_function:
id: GO:0004252
label: serine-type endopeptidase activity
directly_involved_in:
- id: GO:0006515
label: protein quality control for misfolded or incompletely synthesized proteins
locations:
- id: GO:0005737
label: cytoplasm
in_complex:
id: GO:0009368
label: endopeptidase Clp complex
proposed_new_terms: []
suggested_questions:
- question: >-
What are the specific in vivo substrates of ClpP1P2 beyond the anti-sigma
factors RseA and RsdA? A comprehensive substrate identification study would
illuminate the full scope of ClpP1P2 regulatory functions.
- question: >-
What is the molecular basis for the toxicity of clpP2 overexpression but
not clpP1 overexpression? Does excess ClpP2 form aberrant complexes or
titrate away essential partners?
experts:
- Ollinger J
- Parish T
- question: >-
How does the functional asymmetry between ClpP1 and ClpP2 catalytic sites
affect substrate processing in vivo? Are there substrates preferentially
processed by one active site over the other?
experts:
- Nagpal J
suggested_experiments:
- description: >-
Trap-and-release proteomics using catalytically inactive ClpP1P2 mutants
to identify the full set of in vivo substrates in different growth conditions
(exponential, stationary, stress).
hypothesis: >-
ClpP1P2 has a broader substrate repertoire than currently known, including
regulators of virulence and persistence programs.
- description: >-
Characterization of clpP2 overexpression toxicity mechanism using
proteomics and genetic suppressor analysis.
hypothesis: >-
Excess ClpP2 may form aberrant homomeric complexes or deplete ClpP1
from functional heterocomplexes, disrupting proteostasis.