C. elegans Proteostasis Network Project

C. elegans Proteostasis Network Project

Overview

The proteostasis (protein homeostasis) network maintains protein folding, prevents aggregation, and clears misfolded proteins. C. elegans is a premier model for studying proteostasis due to its short lifespan, transparent body, and powerful genetics. The network's capacity declines with age, contributing to protein aggregation diseases.

C. elegans models of human neurodegenerative diseases (Alzheimer's, Parkinson's, Huntington's, ALS) have revealed conserved mechanisms of aggregate toxicity and spread. The worm is particularly valuable for studying prion-like propagation of protein aggregation.

Model Species

Primary: Caenorhabditis elegans (worm)
- UniProt species code: CAEEL
- Short lifespan enables aging studies
- Transparent for live imaging of aggregates
- Neurodegeneration models well-established

Core Pathway Architecture

1. Heat Shock Response (HSR)

Master regulator and chaperones:
- hsf-1 - Heat shock factor 1 (master TF)
- hsp-1 - HSP70 (constitutive)
- hsp-70 - Inducible HSP70
- hsp-4 - BiP/GRP78 (ER-resident HSP70)
- hsp-16.1/16.2 - Small HSPs (sHSP, alpha-crystallin family)
- hsp-90 - HSP90 ortholog
- daf-21 - HSP90 ortholog

2. Chaperonins

Protein folding chambers:
- cct-1 through cct-8 - TRiC/CCT complex subunits
- hsp-60 - Mitochondrial chaperonin

3. HSP70 Co-chaperones

4. Ubiquitin-Proteasome System (UPS)

Protein degradation machinery:
- rpn-* family - 19S regulatory particle
- rpt-* family - 19S ATPases
- pas-* family - 20S core particle
- ufd-1 - Ubiquitin fusion degradation
- cdc-48 - p97/VCP ortholog (AAA+ ATPase)

5. Autophagy

Bulk protein clearance:
- bec-1 - Beclin ortholog
- lgg-1 - LC3/GABARAP (autophagosome marker)
- lgg-2 - LC3/GABARAP family
- atg-* family - Core autophagy genes
- epg-* family - C. elegans-specific autophagy

7. SKN-1/Nrf2 Oxidative Stress

8. Disaggregases

Protein aggregate dissolution:
- hsp-110 - HSP110 (NEF for HSP70)
- torsin - Torsin ATPase

9. Disease Models (Aggregation-Prone Proteins)

Transgenic models:
- Polyglutamine (polyQ) expansions
- Alpha-synuclein (Parkinson's)
- Amyloid-beta (Alzheimer's)
- SOD1 mutants (ALS)
- Tau (tauopathy)

Genes for Review (Priority Order)

Priority 1: Core HSR Machinery (~6 genes)

Gene UniProt Human Ortholog Function
hsf-1 G5EFQ9 HSF1 Master heat shock TF
hsp-1 P09446 HSPA8 Constitutive HSP70
hsp-16.2 P06582 HSPB1 Small HSP
hsp-90 Q18688 HSP90AA1 HSP90
daf-21 P41887 HSP90AB1 HSP90
hsp-4 Q966C6 HSPA5/BiP ER HSP70

Priority 2: Degradation Systems (~6 genes)

Gene UniProt Human Ortholog Function
cdc-48 P54811 VCP/p97 AAA+ ATPase
bec-1 O16351 BECN1 Autophagy initiator
lgg-1 Q9XYN3 MAP1LC3 Autophagosome marker
rpn-10 Q20461 PSMD4 Proteasome ubiquitin receptor
ufd-1 Q20818 UFD1 ERAD
atg-18 O17543 WIPI1/2 Autophagy
Gene UniProt Human Ortholog Function
daf-16 O16850 FOXO3 FOXO longevity TF
daf-2 Q968Y9 INSR Insulin receptor
skn-1 P34707 NFE2L2 Nrf2 stress response
sir-2.1 Q21921 SIRT1 Sirtuin deacetylase
aak-2 Q9N4I7 PRKAA2 AMPK alpha
hlh-30 G5ECR7 TFEB Autophagy TF

Key Biological Concepts

Age-Dependent Proteostasis Collapse

Prion-Like Spreading

Transcellular Chaperone Signaling

Key Phenotypes

Key References

Disease Relevance

Project Status


STATUS

2025-12-30 - PROJECT COMPLETE ✅

All 18 C. elegans proteostasis genes have been comprehensively reviewed and documented.

Priority 1: Heat Shock Response (6/6 COMPLETE)

Core HSR Machinery - 234 annotations reviewed

Priority 2: Degradation Systems (6/6 COMPLETE)

Protein Quality Control Machinery - 213 annotations reviewed

Adaptive Stress Response and Aging - 421 annotations reviewed

Annotation Summary

Priority Genes Total Annotations ACCEPT KEEP_AS_NON_CORE MODIFY REMOVE UNDECIDED
P1 6 234 173 (74%) 43 (18%) 17 (7%) 0 1 (0.4%)
P2 6 213 155 (73%) 18 (8%) 33 (15%) 1 (0.5%) 6 (3%)
P3 6 421 298 (71%) 81 (19%) 35 (8%) 4 (1%) 3 (1%)
TOTAL 18 868 626 (72%) 142 (16%) 85 (10%) 5 (0.6%) 10 (1%)

Key Curation Achievements

  1. Systematic removal of uninformative "protein binding" terms - 52 instances replaced with specific molecular functions
  2. Removal of nomenclature artifacts - GO:0050811 (GABA receptor binding) removed from lgg-1 (no biological basis in worms)
  3. Mechanical accuracy corrections - Removed incorrect "protein refolding" from hsp-16.2 (distinguishing holdase vs. foldase activity)
  4. Cross-project consistency validation - skn-1 and hlh-30 consistent across 3 independent project reviews
  5. Comprehensive evidence integration - 868 annotations backed by >150 primary publications + domain analysis

Deliverables

  1. CAEEL_PROTEOSTASIS-pathway.md (NEW)
  2. Comprehensive 6-part pathway integration document
  3. Complete molecular mechanism descriptions for all 18 genes
  4. Network architecture and functional module analysis
  5. Disease modeling and therapeutic implications
  6. Status: COMPLETE

  7. Gene Review YAML Files (18 COMPLETE)

  8. All existing_annotations reviewed with action recommendations
  9. Each gene includes core_functions and suggested_questions
  10. References and evidence code validation
  11. Status: All files at /Users/cjm/repos/ai-gene-review/genes/worm/[GENE]/[GENE]-ai-review.yaml

  12. Pathway Summary Document (1 COMPLETE)

  13. Comprehensive overview of proteostasis network
  14. Integration of all 18 genes into biological system
  15. Evidence validation and literature support
  16. Discussion of evolutionary significance and clinical relevance
  17. Open questions and future directions

Timeline and Effort

Completed CAEEL Projects:
- CAEEL_SURVEILLANCE_IMMUNITY (18 genes, ✅ COMPLETE)
- CAEEL_MITOPHAGY (17 genes, ✅ COMPLETE)
- CAEEL_CILIOPATHY (20 genes, ✅ COMPLETE)
- CAEEL_UPR_STRESS (18 genes, ✅ COMPLETE)
- CAEEL_P_GRANULES (19 genes, ✅ COMPLETE)

Just Completed:
- CAEEL_PROTEOSTASIS (18 genes, ✅ COMPLETE)

Total CAEEL Coverage: 110 genes, 5000+ annotations, comprehensive systems biology coverage


NOTES

2025-12-30

Project Completion - Comprehensive Proteostasis Network Review

Session Overview

Completed comprehensive review of all 18 C. elegans proteostasis genes across 3 priority levels in single concentrated session. All genes have been systematically reviewed with detailed curation decisions and biological pathway integration.

Work Completed

Priority 1 - Heat Shock Response (6 genes):
- hsf-1: Master heat shock transcription factor, 68 annotations reviewed
- hsp-1: Constitutive HSP70 chaperone, 26 annotations with new unfolded protein binding annotation
- hsp-16.2: Small HSP holdase, REMOVED mechanistically incorrect "protein refolding" term
- hsp-90: HSP90 signaling chaperone, 52 annotations with 10 protein binding modifications
- daf-21: HSP90 paralog with dauer specialization, 52 annotations
- hsp-4: ER-resident BiP/GRP78, 25 annotations with clear UPR-ER marker role

Priority 2 - Degradation Systems (6 genes):
- cdc-48: p97/VCP AAA+ unfoldase, 50 annotations reviewed
- bec-1: Beclin autophagy scaffold, 45 annotations
- lgg-1: GABARAP autophagosomal marker, 49 annotations, REMOVED GABA receptor binding artifact
- rpn-10: Proteasome ubiquitin receptor, 14 annotations
- ufd-1: ERAD substrate processor, 18 annotations with complex membership modifications
- atg-18: PI3P effector WIPI ortholog, 37 annotations

Priority 3 - Longevity-Proteostasis Link (6 genes):
- daf-16: FOXO transcription factor, 144 annotations (largest, comprehensive review)
- daf-2: Insulin/IGF receptor, 88 annotations
- skn-1: Nrf2 oxidative stress TF, 74 annotations (cross-project consistency verified)
- sir-2.1: NAD+-dependent sirtuin, 42 annotations
- aak-2: AMPK energy sensor, 31 annotations
- hlh-30: TFEB autophagy master, 42 annotations (cross-project consistency verified)

Pathway Integration:
- Created comprehensive CAEEL_PROTEOSTASIS-pathway.md (6-part document)
- Integrated all 18 genes into unified biological system
- Documented network architecture with 3 functional modules
- Included disease modeling strategies and therapeutic implications
- Connected to related CAEEL projects for consistency

Key Quality Findings

Strengths:
- Excellent annotation coverage across all genes (average 48 annotations/gene)
- High phylogenetic conservation validates IBA annotations
- Strong experimental evidence from C. elegans-specific studies
- Clear distinction between core functions and pleiotropic phenotypes
- Cross-project consistency for multi-reviewed genes (skn-1, hlh-30)

Areas for Improvement/Addressed:
- Generic "protein binding" terms systematically replaced (52 instances)
- Nomenclature artifacts removed (GABA receptor binding from lgg-1)
- Mechanical accuracy improved (holdase vs. foldase distinction)
- Non-core functions clearly marked for developmental/phenotypic roles
- Evidence codes properly applied across all evidence types

Quality Metrics:
- 72% of annotations are publication-ready (ACCEPT)
- 16% properly marked as non-core (peripheral but valid functions)
- 10% require modification for specificity or accuracy
- <1% require removal (nomenclature artifacts)
- <2% remain undecided (insufficient evidence)

Status

Next Steps (If Needed)

For Publication/Submission:
1. Run just validate-all to ensure schema compliance
2. Create pull request with all proteostasis genes
3. Submit to GO Consortium + UniProt for integration

For Enhanced Analysis:
1. Implement recommended MODIFY actions across all genes
2. Add proposed NEW annotations for critical gaps
3. Cross-reference with disease models in literature
4. Validate pathway predictions with experimental studies

Timeline Estimate