Gene	UniProt	Protein
cipA	Q06851	Cellulosomal-scaffolding protein A
cipB	Q01866	Cellulosomal-scaffolding protein B
sdbA	P71143	Scaffolding dockerin binding protein A
ancA	Q06848	Cellulosome-anchoring protein

Gene	UniProt	Protein
celS	-	Cellulose 1,4-beta-cellobiosidase CelS (GH48)
celK	-	Cellulose 1,4-beta-cellobiosidase CelK
celA / celC / celD	-	Endoglucanases A, C, D
celX	P15329	Putative endoglucanase X

Gene	UniProt	Protein
xynY	P51584	Endo-1,4-beta-xylanase Y
xynX	P38535	Exoglucanase XynX
xghA	Q70DK5	Xyloglucanase Xgh74A
licB	Q84C00	Beta-glucanase (lichenase)

Gene	UniProt	Protein
celE	P10477	Cellulase/esterase CelE
xynZ	P10478	Xylanase Z (with feruloyl esterase domain)
celM	P55742	Putative aminopeptidase

Gene	UniProt	Protein
cbpA	P38058	Cellulose-binding protein A (scaffoldin)
engK	Q9RGE8	Glucanase/Endoglucanase K (GH9)
engL	Q9RGE6	Endoglucanase L (GH9)
engO	Q6DTY2	Endoglucanase O (GH9, non-cellulosomal)
hbpA	Q9RGE7	Hydrophobic protein A (accessory scaffoldin)

Finding: cellulosomal vs free — EngO is the contrast case

EngO (Q6DTY2) is a GH9 endoglucanase that is NOT part of the cellulosome:

Lacks a dockerin; instead carries its own CBM for substrate targeting
Functions as a free secreted enzyme that complements the cellulosome
Added GO:0030248 (cellulose binding) for its CBM function
No cellulosome annotations, unlike the dockerin-bearing EngK / EngL

This distinction, between dockerin-anchored enzymes and free CBM-targeted enzymes, is what the GO annotations must capture.
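The dockerin/CBM rule can be sketched in a few lines of Python. This is only an illustration of the reasoning on this slide, not the review pipeline itself: the `Enzyme` record, its field names, and the plain-text label "cellulosome" are hypothetical, and the CBM value given for EngK is a placeholder because the slides do not state it.

```python
from __future__ import annotations

from dataclasses import dataclass


# Hypothetical record type: field names are illustrative, not a real schema.
@dataclass(frozen=True)
class Enzyme:
    name: str
    has_dockerin: bool
    has_cbm: bool


def expected_annotations(enzyme: Enzyme) -> set[str]:
    """Return the annotation labels predicted by the dockerin/CBM rule."""
    labels: set[str] = set()
    if enzyme.has_dockerin:
        # Dockerin-anchored: the enzyme is a component of the cellulosome.
        labels.add("cellulosome")
    elif enzyme.has_cbm:
        # Free enzyme that targets substrate through its own CBM.
        labels.add("GO:0030248")  # cellulose binding
    return labels


eng_o = Enzyme("EngO", has_dockerin=False, has_cbm=True)
# has_cbm for EngK is a placeholder; the rule ignores it once a dockerin is present.
eng_k = Enzyme("EngK", has_dockerin=True, has_cbm=False)

print(expected_annotations(eng_o))
print(expected_annotations(eng_k))
```

Checking the dockerin first mirrors the slide's framing: the presence of a dockerin decides cellulosome membership, and the CBM only matters for enzymes that act outside the complex.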

HbpA (Q9RGE7) is an accessory scaffoldin, and its annotations were revised accordingly:

GO:0000272 (polysaccharide catabolic process) marked for REMOVAL because HbpA is non-catalytic
GO:0030246 (carbohydrate binding), a CBM-style term, marked UNDECIDED as a possible artifact
Added type-I dockerin binding and cellulosome assembly
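The HbpA decisions can be written down as data and applied to an annotation set, as in the minimal sketch below. The action names, the `apply_review` helper, and the starting set `before` are assumptions made for illustration. The two added terms are kept as plain-text labels because the slide gives no GO identifiers for them.

```python
# Review decisions for HbpA (Q9RGE7) as listed on the slide.
# The action vocabulary and tuple layout are illustrative, not a real tool's format.
HBPA_REVIEW = [
    ("GO:0000272", "REMOVE"),      # polysaccharide catabolic process; HbpA is non-catalytic
    ("GO:0030246", "UNDECIDED"),   # CBM-style term, possible artifact
    ("type-I dockerin binding", "ADD"),
    ("cellulosome assembly", "ADD"),
]


def apply_review(existing, review):
    """Apply review actions; return (resulting terms, terms still pending)."""
    kept = set(existing)
    pending = set()
    for term, action in review:
        if action == "REMOVE":
            kept.discard(term)
        elif action == "ADD":
            kept.add(term)
        elif action == "UNDECIDED":
            # Leave the term in place, but flag it for a later decision.
            pending.add(term)
        else:
            raise ValueError(f"unknown action: {action}")
    return kept, pending


# Assumed starting annotations: only the two terms the slide says were reviewed.
before = {"GO:0000272", "GO:0030246"}
after, pending = apply_review(before, HBPA_REVIEW)
print(sorted(after))
print(sorted(pending))
```

Keeping UNDECIDED terms in the result while flagging them separately means an unresolved call never silently drops an annotation.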

The Cellulosome

A giant extracellular multi-enzyme complex for degrading crystalline cellulose

Why the cellulosome?

Cellulosome architecture

The approach: AI-assisted gene review

Genes reviewed — A. thermocellus (1/2)

Genes reviewed — A. thermocellus (2/2)

Second model — Clostridium cellulovorans

Finding: scaffoldins are non-catalytic — fix the annotations

Finding: cellulases — right activity, sharpen the context

Finding: cellulosomal vs free — EngO is the contrast case

Challenges

Status and future directions