Rule predicting thioredoxin-disulfide reductase (NADPH) activity based on InterPro domains and CATH FunFam classifications. The rule uses three alternative condition sets targeting the same enzymatic function.
Interactive prediction matrix showing how row entries PREDICT column entries. Cell (i,j) shows what fraction of proteins with row domain i also have column domain j.
Legend: Each cell shows PREDICTS % (fraction of row entry proteins that also have column entry - row PREDICTS column), Jaccard similarity (J:%), and intersection count. CS = Condition Set(s), TGT = GO annotation target.
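The two measures named in the legend can be sketched in a few lines. This is a minimal illustration, assuming each entry is represented as a set of protein accessions; the entry names, accessions, and the `cell` helper are hypothetical and not part of the rule data.

```python
def predicts(row: set, col: set) -> float:
    """Fraction of row-entry proteins that also carry the column entry."""
    return len(row & col) / len(row) if row else 0.0


def jaccard(a: set, b: set) -> float:
    """Symmetric overlap: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical entries and accessions, for illustration only.
entries = {
    "ENTRY_A": {"P1", "P2", "P3", "P4"},
    "ENTRY_B": {"P1", "P2"},
}


def cell(row_id: str, col_id: str) -> str:
    """Format one matrix cell: PREDICTS %, Jaccard %, intersection count."""
    row, col = entries[row_id], entries[col_id]
    return f"{predicts(row, col):.0%} J:{jaccard(row, col):.0%} n={len(row & col)}"


print("A -> B:", cell("ENTRY_A", "ENTRY_B"))
print("B -> A:", cell("ENTRY_B", "ENTRY_A"))
```

PREDICTS is directional while Jaccard is symmetric, which is why a small entry nested inside a large one can show 100% in one direction and a low percentage in the other.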
This rule is COMPLETELY REDUNDANT with existing InterPro2GO mappings. Condition set 1 (IPR005982-based) duplicates the existing IPR005982 → GO:0004791 mapping. Condition set 2 is a complete subset of CS1 with zero unique coverage. Critically, condition set 3 (Rattus FunFams), which initially appeared to provide unique value by capturing ~19 proteins DISJOINT from IPR005982, is actually COMPLETELY COVERED by IPR006338 - another InterPro entry that maps to the same GO term via ipr2go. Quantitative analysis shows CS3 FunFams have 100% containment in IPR006338 (Jaccard 0.895-0.947). Therefore ALL condition sets in this rule are redundant with existing ipr2go mappings and the rule should be DEPRECATED.
Complete analysis including external ipr2go mappings reveals this rule is entirely redundant. CS1 duplicates IPR005982 → GO:0004791 (already in ipr2go). CS2 is a subset of CS1. CS3, which captures ~19 proteins disjoint from IPR005982, appeared unique until examining IPR006338 - an external InterPro that also maps to GO:0004791 via ipr2go. Quantitative analysis proves CS3 FunFams are COMPLETELY COVERED by IPR006338: 3.30.390.30:FF:000004 has 100% containment in IPR006338 (17/17 proteins, Jaccard 0.895), and 3.50.50.60:FF:000190 has 100% containment (18/18 proteins, Jaccard 0.947, interpretation: REDUNDANT). The rule provides zero unique annotations beyond what ipr2go already covers via IPR005982 and IPR006338.
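The containment and Jaccard figures quoted above can be re-derived from set sizes alone. The sketch below assumes the SwissProt counts reported in this rule's data (17 and 18 proteins for the two CS3 FunFams, 19 for IPR006338); the `Overlap` class and `overlap_from_counts` function are hypothetical helper names, not part of any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Overlap:
    jaccard: float
    a_in_b: float  # containment of A in B
    b_in_a: float  # containment of B in A


def overlap_from_counts(count_a: int, count_b: int, intersection: int) -> Overlap:
    """Derive overlap metrics from set sizes, without needing the members."""
    union = count_a + count_b - intersection
    return Overlap(
        jaccard=intersection / union,
        a_in_b=intersection / count_a,
        b_in_a=intersection / count_b,
    )


# A = CS3 FunFam, B = IPR006338 (19 proteins in the rule's entries data).
ff4_vs_ipr006338 = overlap_from_counts(17, 19, 17)
ff190_vs_ipr006338 = overlap_from_counts(18, 19, 18)

print(round(ff4_vs_ipr006338.jaccard, 3), ff4_vs_ipr006338.a_in_b)
print(round(ff190_vs_ipr006338.jaccard, 3), ff190_vs_ipr006338.a_in_b)
```

A containment of 1.0 for A in B is what supports the redundancy argument: every FunFam protein is already inside IPR006338, even though the Jaccard values stay below 1 because IPR006338 holds one or two additional proteins.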
These InterPro domains also map to the rule's GO term(s) via InterPro2GO but are not part of any condition set in this rule. They may represent alternative domain signatures that predict the same function.
Primary condition set requiring three InterPro domain matches, but quantitative analysis reveals this reduces to a simple IPR005982 => GO:0004791 rule. All 65 SwissProt proteins with IPR005982 also have both IPR008255 and IPR023753 (complete subset, |A-B| = 0 for both pairs). The AND logic requiring all three domains is therefore redundant - IPR005982 alone is sufficient. Furthermore, this IPR005982 => GO:0004791 mapping already exists in the official InterPro2GO file, making this entire condition set completely redundant with existing manual curation.
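Why the three-way AND collapses to a single condition can be shown with nested sets. This is a sketch under stated assumptions: the accession IDs are synthetic placeholders, and only the set sizes (65, 84, 869) and the nesting come from the overlap data for this condition set.

```python
# Synthetic accessions; only the sizes and the nesting mirror the rule data.
ipr005982 = {f"P{i:05d}" for i in range(65)}
ipr008255 = {f"P{i:05d}" for i in range(84)}
ipr023753 = {f"P{i:05d}" for i in range(869)}

# The overlap data reports each entry as a complete subset of the next.
assert ipr005982 <= ipr008255 <= ipr023753

# AND across all three conditions is a set intersection.
matches_all_three = ipr005982 & ipr008255 & ipr023753

print(matches_all_three == ipr005982)   # the AND selects exactly the IPR005982 set
print(len(ipr005982 - ipr008255))       # proteins removed by requiring IPR008255
print(len(ipr008255 - ipr005982))       # IPR008255 proteins lacking IPR005982
```

When one set is contained in the others, intersecting with the larger sets cannot remove any member, so the extra conditions change nothing for proteins in this database.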
Alternative condition set using CATH FunFam classification with eukaryotic taxonomic restriction, but quantitative analysis reveals this provides ZERO additional coverage. All 18 proteins matching this FunFam also possess IPR005982, IPR008255, AND IPR023753 (complete subset, containment = 1.0). This condition set is therefore completely redundant with condition set 1, adding no unique protein coverage beyond the InterPro-based conditions.
| Condition A | Condition B | Count A | Count B | Intersection | Jaccard | A in B | B in A | Interpretation |
|---|---|---|---|---|---|---|---|---|
| 3.30.390.30:FF:000004 | 3.50.50.60:FF:000190 | 17 | 18 | 16 | 0.842 | 0.941 | 0.889 | HIGH_OVERLAP |
Highly specific condition set for Rattus (rat) cytoplasmic thioredoxin reductase 1. Relative to condition set 1, quantitative analysis shows unique coverage: both FunFams (3.30.390.30:FF:000004 with 17 proteins and 3.50.50.60:FF:000190 with 18 proteins) are COMPLETELY DISJOINT from IPR005982 (0 overlap), so together they capture 19 proteins (17 + 18 - 16 shared) that condition set 1 misses. While all CS3 proteins possess the broad IPR023753 domain (FAD/NAD(P)-binding, 869 proteins), they LACK the specific IPR005982 thioredoxin reductase domain. These are proteins that may have thioredoxin reductase activity based on FunFam homology but are not annotated with IPR005982. The two FunFams overlap heavily with each other (16/17 and 16/18 shared proteins), suggesting they target the same protein set. The Rattus taxonomic restriction may be overly narrow. This coverage is unique only with respect to the InterPro conditions inside this rule: the rule-level review finds every CS3 FunFam protein also covered by IPR006338, which maps to GO:0004791 via ipr2go.
| Condition A | Condition B | Count A | Count B | Intersection | Jaccard | A in B | B in A | Interpretation |
|---|---|---|---|---|---|---|---|---|
| 3.30.390.30:FF:000004 | 3.50.50.60:FF:000190 | 17 | 18 | 16 | 0.842 | 0.941 | 0.889 | HIGH_OVERLAP |
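The 19-protein figure quoted for condition set 3 follows from inclusion-exclusion on the counts in the table above. As a worked check, writing A for 3.30.390.30:FF:000004 and B for 3.50.50.60:FF:000190 (symbols introduced here only for illustration):

```latex
\begin{align*}
|A \cup B| &= |A| + |B| - |A \cap B| = 17 + 18 - 16 = 19 \\
J(A, B) &= \frac{|A \cap B|}{|A \cup B|} = \frac{16}{19} \approx 0.842 \\
\frac{|A \cap B|}{|A|} &= \frac{16}{17} \approx 0.941, \qquad
\frac{|A \cap B|}{|B|} = \frac{16}{18} \approx 0.889
\end{align*}
```

Assuming conditions within a condition set are combined with AND, as the condition set 1 notes describe, a protein must carry both FunFams to match. Under that assumption at most the 16 shared SwissProt proteins can satisfy condition set 3 before the Rattus restriction is applied; 19 is the size of the union, not of the matching set.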
This rule is COMPLETELY REDUNDANT with existing InterPro2GO mappings. All three condition sets are covered by ipr2go. CS1 duplicates IPR005982 → GO:0004791, CS2 is a subset of CS1, and CS3 - which appeared to provide unique value for ~19 proteins disjoint from IPR005982 - is actually 100% covered by IPR006338, another InterPro that maps to GO:0004791 via ipr2go. The CS3 FunFams (3.30.390.30:FF:000004 and 3.50.50.60:FF:000190) have complete containment in IPR006338 (Jaccard 0.895-0.947). The rule provides zero annotations beyond what ipr2go already covers and should be deprecated.
Extensive structural and mechanistic literature spanning multiple decades supports this annotation rule. X-ray crystallography of rat TrxR at 3.0 Angstrom resolution (PDB 1H6V) reveals the FAD binding domain (residues 1-163, 297-367), NADPH binding domain (residues 164-296), and interface domain (residues 368-499) with the C-terminal selenocysteine redox center. The C-terminal extension of ~16 residues containing the Gly-Cys-Sec-Gly motif serves dual functions: (1) extends the electron transport chain to the protein surface for thioredoxin reduction, and (2) sterically blocks the glutathione binding site. Mammalian TrxRs require selenocysteine for efficient catalysis - cysteine substitution reduces activity to ~1% of wild-type. Prokaryotic enzymes use a distinct 67-degree domain rotation mechanism. Recent work (2024) has identified the "doorstop pocket" at the re-face of FAD as a druggable regulatory site, with inhibitors showing efficacy against parasitic TGR.
Quantitative analysis reveals IPR005982 (Thioredoxin reductase) is a complete subset of both IPR008255 and IPR023753. Among 65 SwissProt proteins with IPR005982, all 65 (100%) also have IPR008255 (class-II active site), and all 65 also have IPR023753 (FAD/NAD(P)-binding). The three entries are nested by specificity: IPR005982 provides TrxR family-level specificity (65 proteins), IPR008255 marks the class-II active site motif (84 proteins, 19 of which lack IPR005982; the motif is shared with related enzymes such as glutathione reductase and lipoamide dehydrogenase), and IPR023753 captures the broader FAD/NAD(P)-binding architecture (869 proteins across many flavoenzyme families). Because of this nesting, the AND requirement removes no proteins that IPR005982 alone would match in SwissProt (|IPR005982 - IPR008255| = 0 and |IPR005982 - IPR023753| = 0); it acts only as a safeguard against related oxidoreductases, not as a coverage expansion. Jaccard similarity of 0.774 between IPR005982 and IPR008255 indicates high but not complete overlap. Of the 84 IPR008255 proteins, 72 are annotated with GO:0004791 and 12 are not, so 7 of the 19 proteins lacking IPR005982 still carry the target term and cannot all be treated as non-TrxR enzymes.
GO:0004791 is at the correct level of specificity. It captures the precise enzymatic function (NADPH-dependent thioredoxin reduction) without overgeneralizing to broader terms like GO:0016651 (oxidoreductase activity, acting on NAD(P)H) or GO:0016668 (oxidoreductase activity, acting on a sulfur group of donors, NAD(P) as acceptor). While mammalian TrxRs exhibit remarkable substrate promiscuity (selenite, lipid hydroperoxides, H2O2, vitamin C, alpha-lipoic acid), thioredoxin remains the physiologically primary substrate. The explicit NADPH specification is correct as mammalian TrxRs specifically utilize NADPH (not NADH) as electron donor. A selenoprotein-specific term would be inappropriately narrow as it would exclude functional variants like the nematode mitochondrial TrxR that lacks selenocysteine but retains full catalytic competence with its specific substrate.
Set 1 (InterPro-based) has no taxonomic restriction, appropriate as TrxR is found across all domains of life. Thioredoxin reductases exist as two evolutionarily distinct classes: low-MW (~35 kDa) in bacteria, archaea, plants, fungi, and protists that use a 67-degree domain rotation mechanism; and high-MW (~55 kDa) in animals that evolved from glutathione reductase (not from bacterial TrxR) with a C-terminal selenocysteine extension. Set 2's eukaryotic restriction is justified as it captures both eukaryotic enzyme classes while excluding prokaryotic enzymes. Set 3's Rattus restriction reflects the structural model organism but should be expanded to Mammalia since TrxR1 function is highly conserved across mammals with identical catalytic mechanisms.
3.0 Angstrom X-ray structure of rat TrxR (PDB 1H6V) reveals FAD binding domain (residues 1-163, 297-367), NADPH binding domain (residues 164-296), and interface domain (residues 368-499) with C-terminal selenocysteine redox center. The C-terminal 16-residue extension containing the Gly-Cys-Sec-Gly motif extends the electron transport chain to the surface and blocks glutathione binding.
Comprehensive review describing NADPH-dependent reduction mechanism, three mammalian isozymes (TrxR1 cytosolic, TrxR2 mitochondrial, TrxR3 testis-specific), and therapeutic targeting in cancer.
Comprehensive literature review validating the scientific soundness of this annotation rule. Confirms GO:0004791 is appropriately specific and domain combination is diagnostic. Details the doorstop pocket mechanism and the selenocysteine requirement for full catalytic activity.
Nematode H. contortus mitochondrial TrxR lacks selenocysteine but retains full catalytic competence, demonstrating that GO:0004791 appropriately does not require selenoprotein specification.
Recent 2024 research on the "doorstop pocket" as a druggable regulatory site (DOI: 10.1021/acs.jmedchem.4c00669). Universal presence across the family with isoform-specific shape/electrostatics.
Caution about TR-like FNRs and YpdA/Bdr potentially causing false positives if only generic family motifs are used without TrxR-specific signatures.
Two evolutionarily distinct enzyme classes: low-MW (~35 kDa) in bacteria/plants/fungi using 67-degree domain rotation, and high-MW (~55 kDa) in animals derived from glutathione reductase with C-terminal selenocysteine extension.
The C-terminal extension serves dual functions: (1) extends electron transport to the surface for thioredoxin reduction, and (2) sterically blocks the glutathione binding site preventing GR activity despite the shared catalytic core.
Animals have lost the low-MW form entirely; plants and fungi have lost the high-MW form. This mutually exclusive distribution reflects functional redundancy - both classes effectively catalyze NADPH-dependent thioredoxin reduction.
Domain signature specific to the thioredoxin reductase family. The primary determinant for TrxR specificity when combined with active site and cofactor-binding signatures.
id: ARBA00026249
description: Rule predicting thioredoxin-disulfide reductase (NADPH) activity based
on InterPro domains and CATH FunFam classifications. The rule uses three alternative
condition sets targeting the same enzymatic function.
status: COMPLETE
rule_type: ARBA
rule:
rule_id: ARBA00026249
condition_sets:
- number: 1
conditions:
- condition_type: INTERPRO
value: IPR005982
curie: InterPro:IPR005982
label: Thioredoxin reductase
negated: false
- condition_type: INTERPRO
value: IPR008255
curie: InterPro:IPR008255
label: Pyridine nucleotide-disulphide oxidoreductase, class-II, active site
negated: false
- condition_type: INTERPRO
value: IPR023753
curie: InterPro:IPR023753
label: FAD/NAD(P)-binding domain
negated: false
notes: Primary condition set requiring three InterPro domain matches, but quantitative
analysis reveals this reduces to a simple IPR005982 => GO:0004791 rule. All
65 SwissProt proteins with IPR005982 also have both IPR008255 and IPR023753
(complete subset, |A-B| = 0 for both pairs). The AND logic requiring all three
domains is therefore redundant - IPR005982 alone is sufficient. Furthermore,
this IPR005982 => GO:0004791 mapping already exists in the official InterPro2GO
file, making this entire condition set completely redundant with existing manual
curation.
pairwise_overlap:
- condition_a: IPR005982
condition_b: IPR008255
protein_database: SWISSPROT
count_a: 65
count_b: 84
intersection_count: 65
a_minus_b_count: 0
b_minus_a_count: 19
jaccard_similarity: 0.7738095238095238
containment_a_in_b: 1.0
containment_b_in_a: 0.7738095238095238
interpretation: SUBSET
- condition_a: IPR005982
condition_b: IPR023753
protein_database: SWISSPROT
count_a: 65
count_b: 869
intersection_count: 65
a_minus_b_count: 0
b_minus_a_count: 804
jaccard_similarity: 0.07479861910241657
containment_a_in_b: 1.0
containment_b_in_a: 0.07479861910241657
interpretation: SUBSET
- condition_a: IPR008255
condition_b: IPR023753
protein_database: SWISSPROT
count_a: 84
count_b: 869
intersection_count: 84
a_minus_b_count: 0
b_minus_a_count: 785
jaccard_similarity: 0.09666283084004602
containment_a_in_b: 1.0
containment_b_in_a: 0.09666283084004602
interpretation: SUBSET
- number: 2
conditions:
- condition_type: FUNFAM
value: 3.50.50.60:FF:000064
curie: CATH.FunFam:3.50.50.60:FF:000064
label: Thioredoxin reductase
negated: false
- condition_type: TAXON
value: '2759'
curie: NCBITaxon:2759
label: Eukaryota
negated: false
notes: Alternative condition set using CATH FunFam classification with eukaryotic
taxonomic restriction, but quantitative analysis reveals this provides ZERO
additional coverage. All 18 proteins matching this FunFam also possess IPR005982,
IPR008255, AND IPR023753 (complete subset, containment = 1.0). This condition
set is therefore completely redundant with condition set 1, adding no unique
protein coverage beyond the InterPro-based conditions.
pairwise_overlap:
- condition_a: 3.30.390.30:FF:000004
condition_b: 3.50.50.60:FF:000190
protein_database: SWISSPROT
count_a: 17
count_b: 18
intersection_count: 16
a_minus_b_count: 1
b_minus_a_count: 2
jaccard_similarity: 0.8421052631578947
containment_a_in_b: 0.9411764705882353
containment_b_in_a: 0.8888888888888888
interpretation: HIGH_OVERLAP
- number: 3
conditions:
- condition_type: FUNFAM
value: 3.30.390.30:FF:000004
curie: CATH.FunFam:3.30.390.30:FF:000004
label: Thioredoxin reductase 1, cytoplasmic
negated: false
- condition_type: FUNFAM
value: 3.50.50.60:FF:000190
curie: CATH.FunFam:3.50.50.60:FF:000190
label: Thioredoxin reductase
negated: false
- condition_type: TAXON
value: '10114'
curie: NCBITaxon:10114
label: Rattus
negated: false
notes: 'Highly specific condition set for Rattus (rat) cytoplasmic thioredoxin
reductase 1. Relative to condition set 1, quantitative analysis shows unique
coverage: both FunFams (3.30.390.30:FF:000004 with 17 proteins and 3.50.50.60:FF:000190
with 18 proteins) are COMPLETELY DISJOINT from IPR005982 (0 overlap), so together
they capture 19 proteins (17 + 18 - 16 shared) that condition set 1 misses. While
all CS3 proteins possess the broad IPR023753 domain (FAD/NAD(P)-binding, 869 proteins),
they LACK the specific IPR005982 thioredoxin reductase domain. These are proteins
that may have thioredoxin reductase activity based on FunFam homology but are
not annotated with IPR005982. The two FunFams overlap heavily with each other
(16/17 and 16/18 shared proteins), suggesting they target the same protein set.
The Rattus taxonomic restriction may be overly narrow. This coverage is unique
only with respect to the InterPro conditions inside this rule: the rule-level
review finds every CS3 FunFam protein also covered by IPR006338, which maps to
GO:0004791 via ipr2go.'
pairwise_overlap:
- condition_a: 3.30.390.30:FF:000004
condition_b: 3.50.50.60:FF:000190
protein_database: SWISSPROT
count_a: 17
count_b: 18
intersection_count: 16
a_minus_b_count: 1
b_minus_a_count: 2
jaccard_similarity: 0.8421052631578947
containment_a_in_b: 0.9411764705882353
containment_b_in_a: 0.8888888888888888
interpretation: HIGH_OVERLAP
go_annotations:
- go_id: GO:0004791
go_label: thioredoxin-disulfide reductase (NADPH) activity
aspect: MF
reviewed_protein_count: 0
unreviewed_protein_count: 0
created_date: '2021-10-20'
modified_date: '2025-03-21'
entries:
- id: 3.30.390.30:FF:000004
type: FUNFAM
label: Thioredoxin reductase 1, cytoplasmic
appears_in_condition_sets:
- 3
protein_count: 17
related_entries:
- relationship: EQUIV
target_id: IPR005982
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 17
- relationship: EQUIV
target_id: IPR008255
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 17
- relationship: PREDICTS
target_id: IPR023753
containment: 0.02
jaccard_similarity: 0.02
intersection_count: 17
exclusive_count: 852
- relationship: EQUIV
target_id: 3.50.50.60:FF:000064
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 17
- relationship: PREDICTS
target_id: 3.50.50.60:FF:000190
containment: 0.941
jaccard_similarity: 0.842
intersection_count: 16
exclusive_count: 1
- relationship: PREDICTS
target_id: IPR006338
containment: 1.0
jaccard_similarity: 0.895
intersection_count: 17
exclusive_count: 0
- relationship: EQUIV
target_id: IPR045870
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 17
- relationship: PREDICTS
target_id: GO:0004791
containment: 1.0
jaccard_similarity: 0.123
intersection_count: 17
exclusive_count: 0
- id: 3.50.50.60:FF:000064
type: FUNFAM
label: Thioredoxin reductase
appears_in_condition_sets:
- 2
protein_count: 18
related_entries:
- relationship: PREDICTS
target_id: IPR005982
containment: 0.277
jaccard_similarity: 0.277
intersection_count: 18
exclusive_count: 47
- relationship: PREDICTS
target_id: IPR008255
containment: 0.214
jaccard_similarity: 0.214
intersection_count: 18
exclusive_count: 66
- relationship: PREDICTS
target_id: IPR023753
containment: 0.021
jaccard_similarity: 0.021
intersection_count: 18
exclusive_count: 851
- relationship: EQUIV
target_id: 3.30.390.30:FF:000004
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 18
- relationship: EQUIV
target_id: 3.50.50.60:FF:000190
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 18
- relationship: EQUIV
target_id: IPR006338
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 18
- relationship: EQUIV
target_id: IPR045870
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 18
- relationship: PREDICTS
target_id: GO:0004791
containment: 1.0
jaccard_similarity: 0.13
intersection_count: 18
exclusive_count: 0
- id: 3.50.50.60:FF:000190
type: FUNFAM
label: Thioredoxin reductase
appears_in_condition_sets:
- 3
protein_count: 18
related_entries:
- relationship: EQUIV
target_id: IPR005982
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 18
- relationship: EQUIV
target_id: IPR008255
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 18
- relationship: PREDICTS
target_id: IPR023753
containment: 0.021
jaccard_similarity: 0.021
intersection_count: 18
exclusive_count: 851
- relationship: EQUIV
target_id: 3.50.50.60:FF:000064
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 18
- relationship: PREDICTED_BY
target_id: 3.30.390.30:FF:000004
containment: 0.889
jaccard_similarity: 0.842
intersection_count: 16
exclusive_count: 2
- relationship: PREDICTS
target_id: IPR006338
containment: 1.0
jaccard_similarity: 0.947
intersection_count: 18
exclusive_count: 0
- relationship: EQUIV
target_id: IPR045870
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 18
- relationship: PREDICTS
target_id: GO:0004791
containment: 1.0
jaccard_similarity: 0.13
intersection_count: 18
exclusive_count: 0
- id: IPR005982
type: INTERPRO
label: Thioredoxin reductase
appears_in_condition_sets:
- 1
protein_count: 65
related_entries:
- relationship: PREDICTS
target_id: IPR008255
containment: 1.0
jaccard_similarity: 0.774
intersection_count: 65
exclusive_count: 0
- relationship: PREDICTS
target_id: IPR023753
containment: 1.0
jaccard_similarity: 0.075
intersection_count: 65
exclusive_count: 0
- relationship: PREDICTED_BY
target_id: 3.50.50.60:FF:000064
containment: 1.0
jaccard_similarity: 0.277
intersection_count: 18
exclusive_count: 0
- relationship: EQUIV
target_id: 3.30.390.30:FF:000004
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 65
- relationship: EQUIV
target_id: 3.50.50.60:FF:000190
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 65
- relationship: EQUIV
target_id: IPR006338
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 65
- relationship: EQUIV
target_id: IPR045870
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 65
- relationship: PREDICTS
target_id: GO:0004791
containment: 1.0
jaccard_similarity: 0.471
intersection_count: 65
exclusive_count: 0
- id: IPR006338
type: INTERPRO
source: ipr2go
protein_count: 19
asserted_predicted_go_terms:
- GO:0004791
related_entries:
- relationship: EQUIV
target_id: IPR005982
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 19
- relationship: EQUIV
target_id: IPR008255
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 19
- relationship: PREDICTS
target_id: IPR023753
containment: 0.022
jaccard_similarity: 0.022
intersection_count: 19
exclusive_count: 850
- relationship: EQUIV
target_id: 3.50.50.60:FF:000064
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 19
- relationship: PREDICTED_BY
target_id: 3.30.390.30:FF:000004
containment: 0.895
jaccard_similarity: 0.895
intersection_count: 17
exclusive_count: 2
- relationship: PREDICTED_BY
target_id: 3.50.50.60:FF:000190
containment: 0.947
jaccard_similarity: 0.947
intersection_count: 18
exclusive_count: 1
- relationship: EQUIV
target_id: IPR045870
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 19
- relationship: PREDICTS
target_id: GO:0004791
containment: 1.0
jaccard_similarity: 0.138
intersection_count: 19
exclusive_count: 0
- id: IPR008255
type: INTERPRO
label: Pyridine nucleotide-disulphide oxidoreductase, class-II, active site
appears_in_condition_sets:
- 1
protein_count: 84
related_entries:
- relationship: PREDICTED_BY
target_id: IPR005982
containment: 0.774
jaccard_similarity: 0.774
intersection_count: 65
exclusive_count: 19
- relationship: PREDICTS
target_id: IPR023753
containment: 1.0
jaccard_similarity: 0.097
intersection_count: 84
exclusive_count: 0
- relationship: PREDICTED_BY
target_id: 3.50.50.60:FF:000064
containment: 1.0
jaccard_similarity: 0.214
intersection_count: 18
exclusive_count: 0
- relationship: EQUIV
target_id: 3.30.390.30:FF:000004
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 84
- relationship: EQUIV
target_id: 3.50.50.60:FF:000190
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 84
- relationship: EQUIV
target_id: IPR006338
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 84
- relationship: EQUIV
target_id: IPR045870
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 84
- relationship: PREDICTS
target_id: GO:0004791
containment: 0.857
jaccard_similarity: 0.48
intersection_count: 72
exclusive_count: 12
- id: IPR023753
type: INTERPRO
label: FAD/NAD(P)-binding domain
appears_in_condition_sets:
- 1
protein_count: 869
related_entries:
- relationship: PREDICTED_BY
target_id: IPR005982
containment: 0.075
jaccard_similarity: 0.075
intersection_count: 65
exclusive_count: 804
- relationship: PREDICTED_BY
target_id: IPR008255
containment: 0.097
jaccard_similarity: 0.097
intersection_count: 84
exclusive_count: 785
- relationship: PREDICTED_BY
target_id: 3.50.50.60:FF:000064
containment: 1.0
jaccard_similarity: 0.021
intersection_count: 18
exclusive_count: 0
- relationship: PREDICTED_BY
target_id: 3.30.390.30:FF:000004
containment: 1.0
jaccard_similarity: 0.02
intersection_count: 17
exclusive_count: 0
- relationship: PREDICTED_BY
target_id: 3.50.50.60:FF:000190
containment: 1.0
jaccard_similarity: 0.021
intersection_count: 18
exclusive_count: 0
- relationship: PREDICTED_BY
target_id: IPR006338
containment: 1.0
jaccard_similarity: 0.022
intersection_count: 19
exclusive_count: 0
- relationship: EQUIV
target_id: IPR045870
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 869
- relationship: PREDICTED_BY
target_id: GO:0004791
containment: 0.775
jaccard_similarity: 0.119
intersection_count: 107
exclusive_count: 31
- id: IPR045870
type: INTERPRO
source: ipr2go
protein_count: 11
asserted_predicted_go_terms:
- GO:0004791
related_entries:
- relationship: EQUIV
target_id: IPR005982
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 11
- relationship: EQUIV
target_id: IPR008255
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 11
- relationship: EQUIV
target_id: IPR023753
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 11
- relationship: EQUIV
target_id: 3.50.50.60:FF:000064
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 11
- relationship: EQUIV
target_id: 3.30.390.30:FF:000004
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 11
- relationship: EQUIV
target_id: 3.50.50.60:FF:000190
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 11
- relationship: EQUIV
target_id: IPR006338
containment: 0.0
jaccard_similarity: 0.0
intersection_count: 0
exclusive_count: 11
- relationship: PREDICTS
target_id: GO:0004791
containment: 1.0
jaccard_similarity: 0.08
intersection_count: 11
exclusive_count: 0
review_summary: This rule is COMPLETELY REDUNDANT with existing InterPro2GO mappings.
Condition set 1 (IPR005982-based) duplicates the existing IPR005982 → GO:0004791 mapping.
Condition set 2 is a complete subset of CS1 with zero unique coverage. Critically,
condition set 3 (Rattus FunFams), which initially appeared to provide unique value
by capturing ~19 proteins DISJOINT from IPR005982, is actually COMPLETELY COVERED
by IPR006338 - another InterPro entry that maps to the same GO term via ipr2go.
Quantitative analysis shows CS3 FunFams have 100% containment in IPR006338 (Jaccard
0.895-0.947). Therefore ALL condition sets in this rule are redundant with existing
ipr2go mappings and the rule should be DEPRECATED.
action: DEPRECATE
action_rationale: 'Complete analysis including external ipr2go mappings reveals this
rule is entirely redundant. CS1 duplicates IPR005982 → GO:0004791 (already in ipr2go).
CS2 is a subset of CS1. CS3, which captures ~19 proteins disjoint from IPR005982,
appeared unique until examining IPR006338 - an external InterPro that also maps
to GO:0004791 via ipr2go. Quantitative analysis proves CS3 FunFams are COMPLETELY
COVERED by IPR006338: 3.30.390.30:FF:000004 has 100% containment in IPR006338 (17/17
proteins, Jaccard 0.895), and 3.50.50.60:FF:000190 has 100% containment (18/18
proteins, Jaccard 0.947, interpretation: REDUNDANT). The rule provides zero unique
annotations beyond what ipr2go already covers via IPR005982 and IPR006338.'
suggested_modifications:
- Deprecate the entire rule as completely redundant with existing InterPro2GO mappings
- No condition sets provide unique value - all proteins are covered by either IPR005982
or IPR006338, both of which map to GO:0004791 via ipr2go
- The apparent gap-filling value of CS3 was an artifact of only considering rule-internal
InterPro domains, not the complete ipr2go mapping space
parsimony:
assessment: REDUNDANT
notes: This rule is COMPLETELY REDUNDANT with existing InterPro2GO mappings. All three
condition sets are covered by ipr2go. CS1 duplicates IPR005982 → GO:0004791, CS2 is
a subset of CS1, and CS3 - which appeared to provide unique value for ~19 proteins
disjoint from IPR005982 - is actually 100% covered by IPR006338, another InterPro
that maps to GO:0004791 via ipr2go. The CS3 FunFams (3.30.390.30:FF:000004 and
3.50.50.60:FF:000190) have complete containment in IPR006338 (Jaccard 0.895-0.947).
The rule provides zero annotations beyond what ipr2go already covers and should
be deprecated.
literature_support:
assessment: STRONG
notes: 'Extensive structural and mechanistic literature spanning multiple decades
supports this annotation rule. X-ray crystallography of rat TrxR at 3.0 Angstrom
resolution (PDB 1H6V) reveals the FAD binding domain (residues 1-163, 297-367),
NADPH binding domain (residues 164-296), and interface domain (residues 368-499)
with the C-terminal selenocysteine redox center. The C-terminal extension of ~16
residues containing the Gly-Cys-Sec-Gly motif serves dual functions: (1) extends
the electron transport chain to the protein surface for thioredoxin reduction,
and (2) sterically blocks the glutathione binding site. Mammalian TrxRs require
selenocysteine for efficient catalysis - cysteine substitution reduces activity
to ~1% of wild-type. Prokaryotic enzymes use a distinct 67-degree domain rotation
mechanism. Recent work (2024) has identified the "doorstop pocket" at the re-face
of FAD as a druggable regulatory site, with inhibitors showing efficacy against
parasitic TGR.'
supported_by:
- reference_id: PMID:11481439
supporting_text: electron transfer from NADPH to the disulfide of the substrate
is possible without large conformational changes
- reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-cyberian.md
supporting_text: High-resolution crystal structures are available for bacterial
thioredoxin reductases from E. coli and mammalian enzymes from rat
- reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-falcon.md
supporting_text: 'Doorstop pocket as a druggable site: Medicinal chemistry now
targets the doorstop pocket, first demonstrated in Schistosoma mansoni TGR'
condition_overlap:
assessment: MINOR
notes: 'Quantitative analysis reveals IPR005982 (Thioredoxin reductase) is a complete
subset of both IPR008255 and IPR023753. Among 65 SwissProt proteins with IPR005982,
all 65 (100%) also have IPR008255 (class-II active site), and all 65 also have
IPR023753 (FAD/NAD(P)-binding). The three entries are nested by specificity:
IPR005982 provides TrxR family-level specificity (65 proteins), IPR008255 marks
the class-II active site motif (84 proteins, 19 of which lack IPR005982; the
motif is shared with related enzymes such as glutathione reductase and lipoamide
dehydrogenase), and IPR023753 captures the broader FAD/NAD(P)-binding architecture
(869 proteins across many flavoenzyme families). Because of this nesting, the
AND requirement removes no proteins that IPR005982 alone would match in SwissProt
(|IPR005982 - IPR008255| = 0 and |IPR005982 - IPR023753| = 0); it acts only as
a safeguard against related oxidoreductases, not as a coverage expansion. Jaccard
similarity of 0.774 between IPR005982 and IPR008255 indicates high but not complete
overlap. Of the 84 IPR008255 proteins, 72 are annotated with GO:0004791 and 12
are not, so 7 of the 19 proteins lacking IPR005982 still carry the target term
and cannot all be treated as non-TrxR enzymes.'
supported_by:
- reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-falcon.md
supporting_text: IPR008255 encodes the canonical class-II dithiol/disulfide redox
chemistry of the family. Found across TrxR, glutathione reductase, lipoamide
dehydrogenase, and related enzymes
go_specificity:
assessment: APPROPRIATE
notes: GO:0004791 is at the correct level of specificity. It captures the precise
enzymatic function (NADPH-dependent thioredoxin reduction) without overgeneralizing
to broader terms like GO:0016651 (oxidoreductase activity, acting on NAD(P)H)
or GO:0016668 (oxidoreductase activity, acting on a sulfur group of donors, NAD(P)
as acceptor). While mammalian TrxRs exhibit remarkable substrate promiscuity (selenite,
lipid hydroperoxides, H2O2, vitamin C, alpha-lipoic acid), thioredoxin remains
the physiologically primary substrate. The explicit NADPH specification is correct
as mammalian TrxRs specifically utilize NADPH (not NADH) as electron donor. A
selenoprotein-specific term would be inappropriately narrow as it would exclude
functional variants like the nematode mitochondrial TrxR that lacks selenocysteine
but retains full catalytic competence with its specific substrate.
supported_by:
- reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-cyberian.md
supporting_text: This GO term is highly appropriate for the function of thioredoxin
reductases. The reaction it describes is precisely and specifically what thioredoxin
reductases do
- reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-perplexity.md
supporting_text: 'The nematode Haemonchus contortus contains two distinct TrxRs
with notable structural differences: a cytoplasmic enzyme with a selenocysteine
active site motif similar to mammalian TrxR, and a mitochondrial enzyme with
a unique Gly-Cys-Cys-Gly active site lacking selenocysteine'
taxonomic_scope:
assessment: APPROPRIATE
notes: 'Set 1 (InterPro-based) has no taxonomic restriction, appropriate as TrxR
is found across all domains of life. Thioredoxin reductases exist as two evolutionarily
distinct classes: a low-MW form (~35 kDa) in bacteria, archaea, plants, fungi, and protists
that uses a 67-degree domain rotation mechanism; and a high-MW form (~55 kDa) in animals
that evolved from glutathione reductase (not from bacterial TrxR) with a C-terminal
selenocysteine extension. Set 2''s eukaryotic restriction is justified as it captures
both eukaryotic enzyme classes while excluding prokaryotic enzymes. Set 3''s Rattus
restriction reflects the structural model organism but should be expanded to Mammalia
since TrxR1 function is highly conserved across mammals with identical catalytic
mechanisms.'
supported_by:
- reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-cyberian.md
supporting_text: The high molecular weight form did not evolve from bacterial
thioredoxin reductases but rather from glutathione reductase in lower eukaryotes
- reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-falcon.md
supporting_text: Bacterial proteins with TR-like folds (e.g., YpdA/Bdr; TR-like
FNRs) should be excluded unless the TrxR-specific signature is present
confidence: 0.92
references:
- id: PMID:11481439
title: 'Three-dimensional structure of a mammalian thioredoxin reductase: implications
for mechanism and evolution of a selenocysteine-dependent enzyme'
findings:
- statement: 3.0 Angstrom X-ray structure of rat TrxR (PDB 1H6V) reveals FAD binding
domain (residues 1-163, 297-367), NADPH binding domain (residues 164-296), and
interface domain (residues 368-499) with C-terminal selenocysteine redox center.
The C-terminal 16-residue extension containing the Gly-Cys-Sec-Gly motif extends
the electron transport chain to the surface and blocks glutathione binding.
- id: PMID:31367788
title: 'The thioredoxin system and cancer therapy: a review'
findings:
- statement: Comprehensive review describing NADPH-dependent reduction mechanism,
three mammalian isozymes (TrxR1 cytosolic, TrxR2 mitochondrial, TrxR3 testis-specific),
and therapeutic targeting in cancer.
- id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-perplexity.md
title: Deep research analysis via Perplexity (51 citations)
findings:
- statement: Comprehensive literature review validating the scientific soundness
of this annotation rule. Confirms GO:0004791 is appropriately specific and domain
combination is diagnostic. Details the doorstop pocket mechanism and the selenocysteine
requirement for full catalytic activity.
- statement: Nematode H. contortus mitochondrial TrxR lacks selenocysteine but retains
full catalytic competence, demonstrating that GO:0004791 appropriately does
not require selenoprotein specification.
- id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-falcon.md
title: Deep research analysis via Falcon (15 citations, 2023-2024 focus)
findings:
- statement: '2024 research characterizes the "doorstop pocket" as a druggable regulatory
site (DOI: 10.1021/acs.jmedchem.4c00669). The pocket is present across the family,
with isoform-specific shape and electrostatics.'
- statement: Caution about TR-like FNRs and YpdA/Bdr potentially causing false positives
if only generic family motifs are used without TrxR-specific signatures.
- id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-cyberian.md
title: Deep research analysis via Cyberian (14 citations, detailed evolutionary
analysis)
findings:
- statement: 'Two evolutionarily distinct enzyme classes: low-MW (~35 kDa) in bacteria,
plants, and fungi using 67-degree domain rotation, and high-MW (~55 kDa) in animals
derived from glutathione reductase with C-terminal selenocysteine extension.'
- statement: 'The C-terminal extension serves dual functions: (1) extends electron
transport to the surface for thioredoxin reduction, and (2) sterically blocks
the glutathione binding site preventing GR activity despite the shared catalytic
core.'
- statement: Animals have lost the low-MW form entirely; plants and fungi have lost
the high-MW form. This mutually exclusive distribution reflects functional redundancy,
as both classes effectively catalyze NADPH-dependent thioredoxin reduction.
- id: InterPro:IPR005982
title: Thioredoxin reductase InterPro entry
findings:
- statement: Domain signature specific to the thioredoxin reductase family. It is the
primary determinant for TrxR specificity when combined with active site and
cofactor-binding signatures.
supported_by:
- reference_id: PMID:31367788
supporting_text: Thioredoxin (Trx), thioredoxin reductase (TrxR), and NADPH are
key members of the Trx system that is involved in redox regulation and antioxidant
defense