TreeGrafter Failure Modes & Tree Placement

This sub-page drills into the 159 down-graded TreeGrafter annotations
(REMOVE / MARK_AS_OVER_ANNOTATED / MODIFY from the main evaluation) and asks
directly: did TreeGrafter place the protein sensibly on the PANTHER tree?

We answer it without re-running TreeGrafter, because the placement is already
recorded for every protein.

The join (graft node + family + subfamily + propagated term + reviewer action)
is computed by analyze_placement.py into treegrafter_placement.tsv. All 159
cases recover a subfamily; 156 recover a graft node.
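
As an illustration of what such a join involves, here is a minimal sketch using only the standard library. It is not the code in analyze_placement.py: the column names, the inline sample rows, and the identifiers (`FAM1:SF1`, `AN12`, and so on) are all hypothetical stand-ins for the real inputs.

```python
import csv
import io

# Hypothetical inputs: the column names and rows are illustrative only,
# not the schema actually used by analyze_placement.py.
PLACEMENTS = (
    "protein\tfamily\tsubfamily\tgraft_node\n"
    "P1\tFAM1\tFAM1:SF1\tAN12\n"
    "P2\tFAM2\tFAM2:SF7\t\n"  # subfamily recovered, graft node missing
)
REVIEWS = (
    "protein\tterm\taction\n"
    "P1\tsuccinate dehydrogenase activity\tREMOVE\n"
    "P2\tfatty acid synthase activity\tMODIFY\n"
    "P3\tkinase activity\tACCEPT\n"
)

# The three reviewer actions the page treats as down-grades.
DOWNGRADES = {"REMOVE", "MARK_AS_OVER_ANNOTATED", "MODIFY"}


def read_tsv(text):
    """Parse tab-separated text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def join_placements(placements, reviews):
    """Attach the recorded placement to every down-graded annotation."""
    by_protein = {row["protein"]: row for row in placements}
    joined = []
    for review in reviews:
        if review["action"] not in DOWNGRADES:
            continue  # only down-graded annotations are of interest
        placement = by_protein.get(review["protein"], {})
        joined.append({
            "protein": review["protein"],
            "term": review["term"],
            "action": review["action"],
            "family": placement.get("family", ""),
            "subfamily": placement.get("subfamily", ""),
            "graft_node": placement.get("graft_node", ""),
        })
    return joined


rows = join_placements(read_tsv(PLACEMENTS), read_tsv(REVIEWS))
with_subfamily = sum(1 for row in rows if row["subfamily"])
with_graft_node = sum(1 for row in rows if row["graft_node"])
print(len(rows), with_subfamily, with_graft_node)
```

The two counters correspond to the "recover a subfamily" and "recover a graft node" tallies reported above, computed here on the toy rows only.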

Headline: placement is usually fine — the term is the problem

Reading the propagated GO term against the PANTHER subfamily name shows that
TreeGrafter rarely puts a protein in a grossly wrong part of the tree. The
fold/family assignment is almost always reasonable. What fails is the GO term
that rides along with the graft. Four failure modes account for nearly all the
down-grades:

1. Family-level over-generalisation (term granularity)

The protein is placed in the right family, but the propagated term is the
broad family function while the subfamily already knows something more
specific (or divergent).

| Gene | Propagated term (down-graded) | PANTHER subfamily (where it landed) |
|---|---|---|
| eryAI / eryAII / eryAIII | fatty acid synthase activity (MODIFY) | INACTIVE PHENOLPHTHIOCEROL SYNTHESIS POLYKETIDE SYNTHASE |
| mcr-1 / mcr2 / mcr-3 / mcr-4 | LPS core region biosynthetic process (OVER), phosphotransferase activity, phosphate group as acceptor (MODIFY) | PHOSPHOETHANOLAMINE TRANSFERASE EptA |
| ahpC | thioredoxin peroxidase activity (MODIFY) | ALKYL HYDROPEROXIDE REDUCTASE C |

The erythromycin PKS modules are the clearest case: the family is named
"fatty acid synthase" (PTHR43775) and the family-level GO term fatty acid synthase activity was propagated — but the subfamily correctly identifies a
polyketide synthase. The mcr colistin-resistance enzymes are placed in exactly
the right subfamily (phosphoethanolamine transferase / EptA) yet inherited a
generic phosphotransferase MF and a lipid-A-vs-LPS-core process error. The
graft point is right; the inherited term is too coarse.

2. Pseudo-enzyme / loss of activity

Correct fold family, but the protein has lost catalytic activity or been
co-opted to a non-enzymatic role — something a tree graft cannot detect.

| Gene | Propagated term | Subfamily | Reality |
|---|---|---|---|
| OCTS1 | glutathione transferase activity (OVER) | GLUTATHIONE S-TRANSFERASE | Octopus S-crystallin: GST fold, ~1/700–1/6000 of authentic GST activity; a structural eye-lens protein (PMID:7639695, PMID:27499004, PMID:8587103) |
| TFP | enzyme regulator activity (MODIFY) | EPITHIOSPECIFIER PROTEIN | |
| IRE1 (T. reesei) | unfolded protein binding (OVER) | NON-SPECIFIC SER/THR KINASE | UPR sensor kinase/RNase, not a general chaperone |

3. Generic / out-of-context localization and process

Low-information CC terms (cytoplasm, cytosol, plasma membrane, nucleus)
or organism-context BP terms inherited from a distant ancestor: roughly
correct, but non-core or absent from the host. Examples: relA (plasma
membrane), dinB (cytosol), zwf (pentose-phosphate shunt), several
Pseudomonas putida genes, and the mcr LPS-core process call above.

4. True within-superfamily mis-placement (the cases a re-run would change)

A minority are placement errors: the protein lands in a structurally related
but functionally distinct subfamily of a shared fold superfamily. These are the
only down-grades where re-examining the actual graft point (or an independent
bioinformatic/structural check) would change the call.

| Gene | Propagated term (down-graded) | Landed in subfamily | Should be |
|---|---|---|---|
| aprA (Desulfovibrio) | succinate dehydrogenase activity (REMOVE), electron transfer activity, anaerobic respiration | SUCCINATE DEHYDROGENASE [UBIQUINONE] FLAVOPROTEIN (PTHR11632:SF51) | Adenylylsulfate (APS) reductase α-subunit; shares the FAD fumarate-reductase/SDH flavoprotein fold but reduces APS, not succinate |
| fcs (P. putida) | medium-chain fatty acid-CoA ligase activity (MODIFY), fatty acid metabolic process | 2-SUCCINYLBENZOATE–CoA LIGASE | Feruloyl-CoA synthetase; adjacent ANL adenylating-enzyme superfamily, wrong specific subfamily |

Lightweight graft check (PANTHER vs InterPro, no re-run)

Rather than re-install InterProScan, we read the two classifications UniProt
already stores for each exemplar: the PANTHER family/subfamily (the
TreeGrafter graft point that propagated the term) and the InterPro signature
entries (an independent, signature-based opinion). Computed live by
graft_check.py into treegrafter_graft_check.tsv.
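
A minimal sketch of the comparison step, assuming the entry is available as UniProtKB-style JSON in which cross-references sit under `uniProtKBCrossReferences` with `database`, `id`, and `properties` fields. That layout is an assumption here, the sample entry is hand-built from the aprA values quoted on this page rather than fetched, and none of this is taken from graft_check.py itself.

```python
def classifications(entry):
    """Collect (accession, name) pairs for the PANTHER and InterPro cross-references."""
    found = {"PANTHER": [], "InterPro": []}
    for xref in entry.get("uniProtKBCrossReferences", []):
        database = xref.get("database")
        if database not in found:
            continue  # ignore every other cross-referenced database
        properties = {p["key"]: p["value"] for p in xref.get("properties", [])}
        found[database].append((xref["id"], properties.get("EntryName", "")))
    return found


def panther_subfamilies(found):
    """PANTHER subfamily accessions carry an ':SF' suffix (e.g. PTHR11632:SF51)."""
    return [accession for accession, _ in found["PANTHER"] if ":SF" in accession]


# Hand-built sample: field layout assumed, values copied from the aprA example.
SAMPLE_ENTRY = {
    "uniProtKBCrossReferences": [
        {"database": "PANTHER", "id": "PTHR11632"},
        {
            "database": "PANTHER",
            "id": "PTHR11632:SF51",
            "properties": [{
                "key": "EntryName",
                "value": "SUCCINATE DEHYDROGENASE [UBIQUINONE] FLAVOPROTEIN",
            }],
        },
        {
            "database": "InterPro",
            "id": "IPR011803",
            "properties": [{"key": "EntryName", "value": "AprA"}],
        },
        {"database": "Pfam", "id": "PF00000"},  # placeholder accession, ignored
    ]
}

calls = classifications(SAMPLE_ENTRY)
print(panther_subfamilies(calls), calls["InterPro"])
```

Putting the two lists side by side is enough to spot a case like aprA, where the PANTHER subfamily name and the InterPro entry name disagree about what the protein is.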

| Gene | Propagated term (down-graded) | PANTHER subfamily (graft) | InterPro's specific call | Diagnosis |
|---|---|---|---|---|
| aprA | succinate dehydrogenase activity | SDH flavoprotein, mitochondrial | IPR011803 AprA | PANTHER subfamily-resolution gap: no AprA subfamily exists, so it grafted onto the nearest (mito SDH), but InterPro does identify AprA. An InterPro2GO call would beat TreeGrafter here. |
| OCTS1 | glutathione transferase activity | GLUTATHIONE S-TRANSFERASE | IPR003083 S-crystallin | InterPro names the lens crystallin; PANTHER subfamily stays "GST". Pseudo-enzyme invisible to the graft. |
| mcr-1 | phosphotransferase, phosphate acceptor | PHOSPHOETHANOLAMINE TRANSFERASE EptA | IPR058128 Mcr1; IPR040423 PEA_transferase | Placement is right (both methods); the propagated GO term is just a poor ancestral-node mapping. |
| eryAIII | fatty acid synthase activity | INACTIVE…POLYKETIDE SYNTHASE | PKS ketoacyl-synthase / acyltransferase domains | Placement is right (both methods say PKS); the propagated term is the family-level FAS term. |
| fcs | medium-chain fatty acid-CoA ligase | 2-succinylbenzoate–CoA ligase | only generic AMP-dependent ligase domain | Correct superfamily; no database has a feruloyl-CoA-synthetase-specific entry. Substrate unresolved by either method. |

Two conclusions from the graft check:

  1. In 4 of 5 cases the placement is sound: the error is the GO term bound to
    the graft node, not the tree position. Re-running TreeGrafter would
    reproduce the same (reasonable) subfamily. The fix belongs upstream in
    PAINT/PANTHER's node-to-GO mapping (propagate the subfamily-specific term;
    don't attach a catalytic MF to a pseudo-enzyme subfamily).
  2. InterPro frequently has resolution PANTHER lacks (AprA, S-crystallin,
    Mcr1). For these proteins a signature/InterPro2GO annotation would have been
    more specific or correct than the phylogenetic graft — a concrete argument
    for cross-checking TreeGrafter against InterPro2GO (see next steps).

OpenScientist blinded verification

To test the graft-check conclusions independently, each propagated term was
re-posed to OpenScientist as a blinded function-assignment hypothesis. The
agent saw only "GENE has <propagated term>" (never the reviewer's action or
the PANTHER subfamily name), using the dedicated
treegrafter_function_hypothesis.md prompt, which asks it to actively test the
three failure modes. Reports and provenance are committed under each gene's
*-hypotheses/ directory.

| Gene | Blinded verdict | Failure mode the agent assigned | Decisive evidence it found | Held-out reviewer action |
|---|---|---|---|---|
| OCTS1 | REFUTED | pseudo-enzyme / activity lost (#2) | S-crystallin (IPR003083); lost catalytic Trp39; ~1000× lower kcat; PDB 5B7C | MARK_AS_OVER_ANNOTATED |
| mcr-1 | REFUTED | wrong node term (placement right) (#3→node) | EC 2.7.8.43 ⇒ correct MF is GO:0016780, not sibling GO:0016776; EC2GO mapping; error traced to a TAS annotation on EptA at the PTHR30443:SF0 node | MODIFY |
| eryAIII | REFUTED | granularity, family-vs-subfamily (#1) | DEBS3 type-I modular PKS (EC 2.3.1.94); family-level FAS term propagated over the PKS subfamily | MODIFY |
| aprA | REFUTED | within-superfamily mis-placement (#3) | APS reductase α (EC 1.8.99.2); absent covalent FAD-binding His required by all SDH/FRD catalysis | REMOVE |
| fcs | REFUTED | within-superfamily mis-placement (#3) | feruloyl-CoA synthetase (EC 6.2.1.34 ⇒ GO:0050563); aromatic-phenylpropanoid vs aliphatic-C6–C12 substrate, grafted onto the wrong acyl-CoA-synthetase branch | MODIFY |

All five blinded runs refuted the TreeGrafter term and independently
recovered (a) the correct specific function, (b) the failure mode, and (c) in
two cases the node-level provenance of the error (mcr-1's EptA/SF0 TAS source;
OCTS1's root-node propagation). All of this matches both the held-out reviewer
action and the graft check above. The aprA run even executed the template's
active-site test, pinning the refutation on a single missing catalytic
histidine. (fcs required a scope-narrowed re-run at max_iterations=2 after the
first attempt hit the 7200 s API ceiling.) On these five exemplars, this is
good evidence that a blinded LLM-agent function check is a viable QC layer
over automated phylogenetic annotations: it caught every case TreeGrafter got
wrong, without being told the answer.

Scale-up: 10 exemplars, two providers, family heterogeneity

The exemplar set was extended to 10 genes across 10 distinct enzyme families
(adding NaPMT3, ADAR2, NaUGT1, aceK, ahpC) and each was run blinded on two
independent providers (OpenScientist autonomous-compute + Falcon/Edison
literature). The blinded conclusion is now cited back in each source review as
a file:…/openscientist.md reference (in references and the annotation's
supported_by).

| Gene | Held-out action | OpenScientist | Falcon | Independent specific call |
|---|---|---|---|---|
| OCTS1 | OVER_ANNOTATED | REFUTED (pseudo-enzyme) | pseudo-enzyme | S-crystallin, structural (GO:0005212) |
| mcr-1 | MODIFY | REFUTED (wrong node term) | too general | pEtN transferase ⇒ GO:0016780/GO:0043838 |
| eryAIII | MODIFY | REFUTED (granularity) | mis-placed | DEBS3 modular PKS (EC 2.3.1.94) |
| aprA | REMOVE | REFUTED (mis-placed) | mis-placed | APS reductase α (EC 1.8.99.2) |
| fcs | MODIFY | REFUTED (mis-placed) | mis-placed | feruloyl-CoA synthetase (GO:0050563) |
| NaPMT3 | REMOVE | REFUTED (granularity) | mis-placed | putrescine N-MTase (EC 2.1.1.53, GO:0030750) |
| ADAR2 | REMOVE | REFUTED (mis-placed) | mis-placed | dsRNA/mRNA ADAR (not tRNA ADAT) |
| NaUGT1 | REMOVE | REFUTED (granularity) | mis-placed | UGT85A clade, wrong substrate |
| aceK | MODIFY | too general | too general | IDH kinase/phosphatase ⇒ GO:0101014 |
| ahpC | MODIFY | too general | too general | AhpF-dependent peroxiredoxin |

Both providers down-graded the propagated term on all 10 genes (8 REFUTED + 2
"too general" for OpenScientist; 7 REFUTED + 3 "too general" for Falcon), agreeing
on the failure mode in 8/10 and differing only in severity label on mcr-1 and
eryAIII (both still say the term must change). Cross-provider agreement on a
blinded task is strong evidence the QC signal is real, not a single-model artifact.
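
The per-provider counts quoted above can be re-derived from the verdict columns with a short tally. This sketch assumes that any label other than "too general" counts as a refutation (so Falcon's "pseudo-enzyme" and "mis-placed" are treated as REFUTED-equivalent); that grouping rule is an assumption about how the counts were made, not something stated by either provider.

```python
from collections import Counter

# (gene, OpenScientist verdict, Falcon verdict), copied from the table above.
VERDICTS = [
    ("OCTS1", "REFUTED (pseudo-enzyme)", "pseudo-enzyme"),
    ("mcr-1", "REFUTED (wrong node term)", "too general"),
    ("eryAIII", "REFUTED (granularity)", "mis-placed"),
    ("aprA", "REFUTED (mis-placed)", "mis-placed"),
    ("fcs", "REFUTED (mis-placed)", "mis-placed"),
    ("NaPMT3", "REFUTED (granularity)", "mis-placed"),
    ("ADAR2", "REFUTED (mis-placed)", "mis-placed"),
    ("NaUGT1", "REFUTED (granularity)", "mis-placed"),
    ("aceK", "too general", "too general"),
    ("ahpC", "too general", "too general"),
]


def severity(label):
    """Assumed grouping: 'too general' is the mild verdict, anything else a refutation."""
    return "too general" if label.strip() == "too general" else "refuted"


openscientist = Counter(severity(os_label) for _, os_label, _ in VERDICTS)
falcon = Counter(severity(falcon_label) for _, _, falcon_label in VERDICTS)

print("OpenScientist:", dict(openscientist))
print("Falcon:", dict(falcon))
```

Under that grouping every gene falls into one of the two down-grade buckets for both providers, which is what "down-graded on all 10 genes" means here.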

Family-level companion runs (Falcon) explain why. Each PANTHER family was
characterized for whether one GO MF term is safe to propagate. 9 of 10
families are HETEROGENEOUS: a single substrate-specific term mis-annotates
some branches:

| Family | Verdict | Why |
|---|---|---|
| PTHR11632 (aprA) | HETEROGENEOUS | SDH + fumarate reductase; GO:0000104 over-annotates FRD |
| PTHR43775 (eryAIII) | HETEROGENEOUS | FAS + modular PKS share the ketosynthase fold |
| PTHR11571 (OCTS1) | HETEROGENEOUS | active GSTs + co-opted crystallins |
| PTHR11558 (NaPMT3) | HETEROGENEOUS | spermidine synthases + neofunctionalized PMTs |
| PTHR11926 (NaUGT1) | HETEROGENEOUS | 14 plant UGT groups, divergent substrates |
| PTHR43201, PTHR30443, PTHR10910, PTHR10681 | HETEROGENEOUS | substrate/reductant/subfamily divergence |
| PTHR39559 (aceK) | HOMOGENEOUS | single IDH kinase/phosphatase function; the only family where the propagated term was right-but-too-general, not wrong |

The single homogeneous family is exactly the one whose blinded verdict was "too
general" rather than "wrong": family heterogeneity predicts the failure
severity. The actionable upstream signal: gate single-term GO propagation on a
family homogeneity check, and split or subfamily-annotate the heterogeneous
ones in PAINT.
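
One way to picture that gate is the sketch below. The verdicts are copied from the family table, but the function, its decision labels, and the idea of passing in an optional subfamily-specific term are hypothetical illustrations, not an existing PAINT or TreeGrafter interface.

```python
# Family verdicts from the table above (a subset, for illustration).
FAMILY_VERDICTS = {
    "PTHR11632": "HETEROGENEOUS",
    "PTHR43775": "HETEROGENEOUS",
    "PTHR11571": "HETEROGENEOUS",
    "PTHR39559": "HOMOGENEOUS",
}


def propagation_decision(family_id, subfamily_term=None):
    """Decide how to treat a family-level GO MF term (hypothetical decision labels).

    - HOMOGENEOUS family: the single family-level term is safe to propagate.
    - HETEROGENEOUS family: prefer a subfamily-specific term when one exists,
      otherwise hold the annotation for curation.
    - Unknown family: hold, since nothing establishes that propagation is safe.
    """
    verdict = FAMILY_VERDICTS.get(family_id)
    if verdict == "HOMOGENEOUS":
        return "propagate_family_term"
    if verdict == "HETEROGENEOUS" and subfamily_term:
        return "use_subfamily_term"
    return "hold_for_curation"


print(propagation_decision("PTHR39559"))
print(propagation_decision("PTHR43775", subfamily_term="polyketide synthase activity"))
print(propagation_decision("PTHR11632"))
```

Defaulting unknown families to "hold" keeps the gate conservative: a family-level term is propagated only when its homogeneity has been checked.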

What this means for the "re-run TreeGrafter" question

For the bulk of failures (modes 1–3, ~90% of the down-grades), re-running
TreeGrafter would reproduce the same, sensible placement — the fix belongs at
the annotation level (propagate the subfamily-specific term, gate
catalysis-implying MF behind active-site checks, suppress generic CC), not at
the placement level. For the handful in mode 4 (aprA, fcs, candidate
tyrB), the placement itself is the suspect, and an independent check — a
TreeGrafter re-run inspecting the graft branch, or a structural/active-site
analysis via OpenScientist — would be decisive.

Suggested follow-ups