ARBA00026249 thioredoxin-disulfide reductase (NADPH) activity (GO:0004791)

Type: ARBA
Status: COMPLETE
Action: DEPRECATE
Confidence: 0.92

Description

Rule predicting thioredoxin-disulfide reductase (NADPH) activity based on InterPro domains and CATH FunFam classifications. The rule uses three alternative condition sets targeting the same enzymatic function.

Analysis Summary

Domain Pairs Analyzed: 5
Condition Sets: 3
Subset Relationships: 3
Redundant Annotations: 0

Domain Overlap Analysis Table

Prediction matrix showing how row entries PREDICT column entries. Cell (i,j) shows what fraction of proteins with row domain i also have column domain j.

Entries (group: ID [type] name (protein count)):

CS 1: IPR005982 [F] Thioredoxin reductase (65)
CS 1: IPR008255 [AS] Pyridine nucleotide-disulphide oxidoreductase, class-II, active site (84)
CS 1: IPR023753 [D] FAD/NAD(P)-binding domain (869)
CS 2 (Eukaryota): 3.50.50.60:FF:000064 Thioredoxin reductase (18)
CS 3 (Rattus): 3.30.390.30:FF:000004 Thioredoxin reductase 1, cytoplasmic (17)
CS 3 (Rattus): 3.50.50.60:FF:000190 Thioredoxin reductase (18)
TGT: GO:0004791 thioredoxin-disulfide reductase (NADPH) activity (138)
EXT (ipr2go): IPR006338 (19)
EXT (ipr2go): IPR045870 (11)

Row \ Column           IPR005982        IPR008255        IPR023753        3.50.50.60:FF:000064   3.30.390.30:FF:000004  3.50.50.60:FF:000190   GO:0004791       IPR006338        IPR045870
IPR005982              100%             100% J:77% (65)  100% J:7% (65)   28% J:28% (18)         0% J:0% (0)            0% J:0% (0)            100% J:47% (65)  0% J:0% (0)      0% J:0% (0)
IPR008255              77% J:77% (65)   100%             100% J:10% (84)  21% J:21% (18)         0% J:0% (0)            0% J:0% (0)            86% J:48% (72)   0% J:0% (0)      0% J:0% (0)
IPR023753              7% J:7% (65)     10% J:10% (84)   100%             2% J:2% (18)           2% J:2% (17)           2% J:2% (18)           12% J:12% (107)  2% J:2% (19)     0% J:0% (0)
3.50.50.60:FF:000064   100% J:28% (18)  100% J:21% (18)  100% J:2% (18)   100%                   0% J:0% (0)            0% J:0% (0)            100% J:13% (18)  0% J:0% (0)      0% J:0% (0)
3.30.390.30:FF:000004  0% J:0% (0)      0% J:0% (0)      100% J:2% (17)   0% J:0% (0)            100%                   94% J:84% (16)         100% J:12% (17)  100% J:89% (17)  0% J:0% (0)
3.50.50.60:FF:000190   0% J:0% (0)      0% J:0% (0)      100% J:2% (18)   0% J:0% (0)            89% J:84% (16)         100%                   100% J:13% (18)  100% J:95% (18)  0% J:0% (0)
GO:0004791             47% J:47% (65)   52% J:48% (72)   78% J:12% (107)  13% J:13% (18)         12% J:12% (17)         13% J:13% (18)         100%             14% J:14% (19)   8% J:8% (11)
IPR006338              0% J:0% (0)      0% J:0% (0)      100% J:2% (19)   0% J:0% (0)            89% J:89% (17)         95% J:95% (18)         100% J:14% (19)  100%             0% J:0% (0)
IPR045870              0% J:0% (0)      0% J:0% (0)      0% J:0% (0)      0% J:0% (0)            0% J:0% (0)            0% J:0% (0)            100% J:8% (11)   0% J:0% (0)      100%

Legend: Each cell shows PREDICTS % (fraction of row entry proteins that also have column entry - row PREDICTS column), Jaccard similarity (J:%), and intersection count. CS = Condition Set(s), TGT = GO annotation target.
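
As a minimal sketch of how one cell in the legend could be computed, the function below takes two sets of protein accessions and returns the PREDICTS fraction, the Jaccard similarity, and the intersection count. The function name `overlap_cell` and the accessions are hypothetical and chosen only for illustration; the page's actual implementation is not shown in the source.

```python
def overlap_cell(row_proteins: set[str], col_proteins: set[str]) -> tuple[float, float, int]:
    """Return (predicts, jaccard, intersection_count) for one matrix cell."""
    shared = row_proteins & col_proteins
    union = row_proteins | col_proteins
    # PREDICTS: fraction of the row entry's proteins that also have the column entry.
    predicts = len(shared) / len(row_proteins) if row_proteins else 0.0
    # Jaccard: shared proteins divided by all proteins seen in either entry.
    jaccard = len(shared) / len(union) if union else 0.0
    return predicts, jaccard, len(shared)


# Hypothetical accessions mimicking a small entry nested inside a larger one.
small = {"P1", "P2", "P3"}
large = {"P1", "P2", "P3", "P4"}

print(overlap_cell(small, large))  # row = small entry
print(overlap_cell(large, small))  # row = large entry
```

PREDICTS is asymmetric (it depends on which entry is the row), while Jaccard is symmetric, which is why mirrored cells in the matrix share a J value but can differ in their leading percentage.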

Review Summary

This rule is COMPLETELY REDUNDANT with existing InterPro2GO mappings. Condition set 1 (IPR005982-based) duplicates the existing IPR005982 → GO:0004791 mapping. Condition set 2 is a complete subset of CS1 with zero unique coverage. Critically, condition set 3 (Rattus FunFams), which initially appeared to provide unique value by capturing ~19 proteins DISJOINT from IPR005982, is actually COMPLETELY COVERED by IPR006338 - another InterPro entry that maps to the same GO term via ipr2go. Quantitative analysis shows CS3 FunFams have 100% containment in IPR006338 (Jaccard 0.895-0.947). Therefore ALL condition sets in this rule are redundant with existing ipr2go mappings and the rule should be DEPRECATED.

Action Rationale

Analysis that includes the external ipr2go mappings shows this rule is entirely redundant. CS1 duplicates IPR005982 → GO:0004791 (already in ipr2go). CS2 is a subset of CS1. CS3, which captures ~19 proteins disjoint from IPR005982, appeared unique until IPR006338 was examined: this InterPro entry lies outside the rule but also maps to GO:0004791 via ipr2go. The CS3 FunFams are COMPLETELY COVERED by IPR006338: 3.30.390.30:FF:000004 has 100% containment in IPR006338 (17/17 proteins, Jaccard 0.895), and 3.50.50.60:FF:000190 has 100% containment (18/18 proteins, Jaccard 0.947, interpretation: REDUNDANT). The rule provides zero unique annotations beyond what ipr2go already covers via IPR005982 and IPR006338.

GO Annotations

GO:0004791 - thioredoxin-disulfide reductase (NADPH) activity
Aspect: MF
⚠️ Redundant with InterPro2GO: IPR005982 → GO:0004791
This annotation already exists in the manual InterPro2GO mapping file

External Mappings (ipr2go)

These InterPro domains also map to the rule's GO term(s) via InterPro2GO but are not part of any condition set in this rule. They may represent alternative domain signatures that predict the same function.

IPR006338 (ipr2go): maps to GO:0004791; 19 proteins in Swiss-Prot
IPR045870 (ipr2go): maps to GO:0004791; 11 proteins in Swiss-Prot

Rule Definition

Condition Sets

Condition Set 1

3 condition(s)
Notes:

Primary condition set requiring three InterPro domain matches, but quantitative analysis reveals this reduces to a simple IPR005982 => GO:0004791 rule. All 65 SwissProt proteins with IPR005982 also have both IPR008255 and IPR023753 (complete subset, |A-B| = 0 for both pairs). The AND logic requiring all three domains is therefore redundant - IPR005982 alone is sufficient. Furthermore, this IPR005982 => GO:0004791 mapping already exists in the official InterPro2GO file, making this entire condition set completely redundant with existing manual curation.

Pairwise Overlap Analysis

Condition A Condition B Count A Count B Intersection Jaccard A in B B in A Interpretation
IPR005982 IPR008255 65 84 65 0.774 1.000 0.774 SUBSET
IPR005982 IPR023753 65 869 65 0.075 1.000 0.075 SUBSET
IPR008255 IPR023753 84 869 84 0.097 1.000 0.097 SUBSET
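
The derived columns in this table follow from three counts per row (Count A, Count B, Intersection). The sketch below recomputes them under that assumption. The class name `PairwiseOverlap`, the 0.8 threshold, and the `DISJOINT` and `PARTIAL_OVERLAP` labels are illustrative assumptions; the source does not state how its Interpretation column is assigned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairwiseOverlap:
    condition_a: str
    condition_b: str
    count_a: int
    count_b: int
    intersection_count: int

    @property
    def a_minus_b_count(self) -> int:
        return self.count_a - self.intersection_count

    @property
    def b_minus_a_count(self) -> int:
        return self.count_b - self.intersection_count

    @property
    def jaccard_similarity(self) -> float:
        union = self.count_a + self.count_b - self.intersection_count
        return self.intersection_count / union if union else 0.0

    @property
    def containment_a_in_b(self) -> float:
        return self.intersection_count / self.count_a if self.count_a else 0.0

    @property
    def containment_b_in_a(self) -> float:
        return self.intersection_count / self.count_b if self.count_b else 0.0

    def interpretation(self, high_overlap: float = 0.8) -> str:
        # ASSUMPTION: the threshold and label rules here are hypothetical.
        if self.intersection_count == 0:
            return "DISJOINT"
        if self.a_minus_b_count == 0 or self.b_minus_a_count == 0:
            return "SUBSET"
        if self.jaccard_similarity >= high_overlap:
            return "HIGH_OVERLAP"
        return "PARTIAL_OVERLAP"


# Counts copied from the three table rows above.
rows = [
    PairwiseOverlap("IPR005982", "IPR008255", 65, 84, 65),
    PairwiseOverlap("IPR005982", "IPR023753", 65, 869, 65),
    PairwiseOverlap("IPR008255", "IPR023753", 84, 869, 84),
]
for row in rows:
    print(row.condition_a, row.condition_b, row.b_minus_a_count,
          round(row.jaccard_similarity, 3), row.interpretation())
```

Deriving the values from counts rather than storing them makes it harder for a stored Jaccard or containment figure to drift out of sync with the counts it summarizes.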

Condition Set 2

2 condition(s)
Notes:

Alternative condition set using CATH FunFam classification with eukaryotic taxonomic restriction, but quantitative analysis reveals this provides ZERO additional coverage. All 18 proteins matching this FunFam also possess IPR005982, IPR008255, AND IPR023753 (complete subset, containment = 1.0). This condition set is therefore completely redundant with condition set 1, adding no unique protein coverage beyond the InterPro-based conditions.

Pairwise Overlap Analysis

Condition A Condition B Count A Count B Intersection Jaccard A in B B in A Interpretation
3.30.390.30:FF:000004 3.50.50.60:FF:000190 17 18 16 0.842 0.941 0.889 HIGH_OVERLAP

Condition Set 3

3 condition(s)
Notes:

Highly specific condition set for Rattus (rat) cytoplasmic thioredoxin reductase 1. Within the rule, quantitative analysis suggests this provides SUBSTANTIAL unique coverage: both FunFams (3.30.390.30:FF:000004 with 17 proteins and 3.50.50.60:FF:000190 with 18 proteins) are COMPLETELY DISJOINT from IPR005982 (0 overlap), meaning they capture ~19 proteins that condition set 1 completely misses. While all CS3 proteins possess the broad IPR023753 domain (FAD/NAD(P)-binding, 869 proteins), they LACK the specific IPR005982 thioredoxin reductase domain. These are proteins that may have thioredoxin reductase activity based on FunFam homology but do not match the specific InterPro entry. The two FunFams overlap heavily with each other (16 shared proteins, i.e. 16/17 and 16/18), suggesting they target the same protein set. The Rattus taxonomic restriction may be overly narrow, but within the rule this condition set covers proteins that lack the IPR005982 annotation. Note that this comparison covers only the InterPro entries used inside the rule; the Review Summary shows that IPR006338, which maps to GO:0004791 via ipr2go, already contains both FunFams.

Pairwise Overlap Analysis

Condition A Condition B Count A Count B Intersection Jaccard A in B B in A Interpretation
3.30.390.30:FF:000004 3.50.50.60:FF:000190 17 18 16 0.842 0.941 0.889 HIGH_OVERLAP
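
The "~19" figure in the notes can be reproduced from this row by inclusion-exclusion. The sketch below splits the two FunFams into their Venn regions; the helper name `venn_counts` is hypothetical.

```python
def venn_counts(count_a: int, count_b: int, intersection: int) -> dict[str, int]:
    """Split two overlapping sets into their Venn regions plus the union size."""
    only_a = count_a - intersection
    only_b = count_b - intersection
    return {
        "only_a": only_a,
        "only_b": only_b,
        "both": intersection,
        "union": only_a + only_b + intersection,
    }


# Counts from the row above: 3.30.390.30:FF:000004 (17), 3.50.50.60:FF:000190 (18), 16 shared.
regions = venn_counts(17, 18, 16)
print(regions)
```

The union is 19 proteins, matching the notes. If the conditions within a set are combined with AND, as the notes for condition set 1 describe, only the 16 proteins in the "both" region can satisfy the two FunFam conditions together, before the Rattus restriction is applied.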

Assessments

REDUNDANT

This rule is COMPLETELY REDUNDANT with existing InterPro2GO mappings. All three condition sets are covered by ipr2go. CS1 duplicates IPR005982 → GO:0004791, CS2 is a subset of CS1, and CS3 - which appeared to provide unique value for ~19 proteins disjoint from IPR005982 - is actually 100% covered by IPR006338, another InterPro that maps to GO:0004791 via ipr2go. The CS3 FunFams (3.30.390.30:FF:000004 and 3.50.50.60:FF:000190) have complete containment in IPR006338 (Jaccard 0.895-0.947). The rule provides zero annotations beyond what ipr2go already covers and should be deprecated.
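
The redundancy argument amounts to a set-coverage check: a condition set adds nothing if every protein it matches is already matched by some ipr2go entry for the same GO term. The sketch below illustrates the idea with invented accessions. The function name `uncovered_proteins` and all data are hypothetical, not taken from the review pipeline.

```python
def uncovered_proteins(
    condition_sets: dict[str, set[str]],
    ipr2go_entries: dict[str, set[str]],
) -> dict[str, set[str]]:
    """For each condition set, return the proteins no ipr2go entry already covers."""
    covered = set().union(*ipr2go_entries.values())
    return {name: proteins - covered for name, proteins in condition_sets.items()}


# Toy data: A* proteins stand in for the IPR005982 family, B* for the IPR006338 family.
condition_sets = {
    "CS1": {"A1", "A2"},
    "CS2": {"A1"},
    "CS3": {"B1", "B2"},
}
ipr2go_entries = {
    "IPR005982": {"A1", "A2", "A3"},
    "IPR006338": {"B1", "B2", "B3"},
}

gaps = uncovered_proteins(condition_sets, ipr2go_entries)
print(gaps)

# Considering only the rule-internal entry reproduces the apparent CS3 gap.
internal_only = {"IPR005982": ipr2go_entries["IPR005982"]}
print(uncovered_proteins(condition_sets, internal_only)["CS3"])
```

The second call shows why CS3 can look valuable when only IPR005982 is considered and redundant once IPR006338 is included.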

STRONG

Extensive structural and mechanistic literature spanning multiple decades supports this annotation rule. X-ray crystallography of rat TrxR at 3.0 Angstrom resolution (PDB 1H6V) reveals the FAD binding domain (residues 1-163, 297-367), NADPH binding domain (residues 164-296), and interface domain (residues 368-499) with the C-terminal selenocysteine redox center. The C-terminal extension of ~16 residues containing the Gly-Cys-Sec-Gly motif serves dual functions: (1) extends the electron transport chain to the protein surface for thioredoxin reduction, and (2) sterically blocks the glutathione binding site. Mammalian TrxRs require selenocysteine for efficient catalysis - cysteine substitution reduces activity to ~1% of wild-type. Prokaryotic enzymes use a distinct 67-degree domain rotation mechanism. Recent work (2024) has identified the "doorstop pocket" at the re-face of FAD as a druggable regulatory site, with inhibitors showing efficacy against parasitic TGR.

Supporting Evidence:

  • PMID:11481439: electron transfer from NADPH to the disulfide of the substrate is possible without large conformational changes
  • file:rules/arba/ARBA00026249/ARBA00026249-deep-research-cyberian.md: High-resolution crystal structures are available for bacterial thioredoxin reductases from E. coli and mammalian enzymes from rat
  • file:rules/arba/ARBA00026249/ARBA00026249-deep-research-falcon.md: Doorstop pocket as a druggable site: Medicinal chemistry now targets the doorstop pocket, first demonstrated in Schistosoma mansoni TGR

MINOR

Quantitative analysis reveals IPR005982 (Thioredoxin reductase) is a complete subset of both IPR008255 and IPR023753. Among 65 Swiss-Prot proteins with IPR005982, all 65 (100%) also have IPR008255 (class-II active site), and all 65 also have IPR023753 (FAD/NAD(P)-binding). This represents a hierarchy of specificity: IPR005982 provides TrxR family-level specificity (65 proteins), IPR008255 marks the class-II active site motif (84 proteins total, including 19 that lack IPR005982), and IPR023753 captures the broader FAD/NAD(P)-binding architecture (869 proteins across many flavoenzyme families). Because of this nesting, the AND requirement matches exactly the IPR005982 proteins: the specificity comes from IPR005982 itself, and the two broader entries neither add nor remove Swiss-Prot proteins. Set difference analysis shows IPR005982 adds zero unique proteins beyond IPR008255 (|IPR005982 - IPR008255| = 0). Jaccard similarity of 0.774 between IPR005982 and IPR008255 indicates high but not complete overlap: IPR008255 covers 19 additional proteins without the TrxR-specific signature, of which 7 are nonetheless annotated with GO:0004791 (72 of the 84 IPR008255 proteins carry the term).
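
The nesting of the three InterPro entries can be checked with plain set operations. The sketch below uses synthetic integer identifiers sized to match the reported counts (65, 84, 869). It is an illustration of the set logic only and does not use real Swiss-Prot accessions.

```python
# Synthetic identifiers only: integers stand in for Swiss-Prot accessions.
ipr005982 = set(range(65))    # 65 proteins
ipr008255 = set(range(84))    # 84 proteins, containing all of IPR005982
ipr023753 = set(range(869))   # 869 proteins, containing all of IPR008255

# Proteins satisfying all three conditions at once (the AND in condition set 1).
matches_all_three = ipr005982 & ipr008255 & ipr023753

print(len(ipr005982 - ipr008255))      # proteins IPR005982 has that IPR008255 lacks
print(len(ipr008255 - ipr005982))      # proteins with IPR008255 but not IPR005982
print(matches_all_three == ipr005982)  # whether the AND reduces to IPR005982 alone
```

When one set is contained in the others, intersecting all three returns the smallest set unchanged, which is the situation the pairwise counts describe for these entries.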

APPROPRIATE

GO:0004791 is at the correct level of specificity. It captures the precise enzymatic function (NADPH-dependent thioredoxin reduction) without overgeneralizing to broader terms like GO:0016651 (oxidoreductase activity, acting on NAD(P)H) or GO:0016668 (oxidoreductase activity, acting on a sulfur group of donors, NAD(P) as acceptor). While mammalian TrxRs exhibit remarkable substrate promiscuity (selenite, lipid hydroperoxides, H2O2, vitamin C, alpha-lipoic acid), thioredoxin remains the physiologically primary substrate. The explicit NADPH specification is correct as mammalian TrxRs specifically utilize NADPH (not NADH) as electron donor. A selenoprotein-specific term would be inappropriately narrow as it would exclude functional variants like the nematode mitochondrial TrxR that lacks selenocysteine but retains full catalytic competence with its specific substrate.

APPROPRIATE

Set 1 (InterPro-based) has no taxonomic restriction, which is appropriate because TrxR is found across all domains of life. Thioredoxin reductases exist as two evolutionarily distinct classes: low-MW (~35 kDa) enzymes in bacteria, archaea, plants, fungi, and protists, which use a 67-degree domain rotation mechanism; and high-MW (~55 kDa) enzymes in animals, which evolved from glutathione reductase (not from bacterial TrxR) and carry a C-terminal selenocysteine extension. Set 2's eukaryotic restriction is justified as it captures both eukaryotic enzyme classes while excluding prokaryotic enzymes. Set 3's Rattus restriction reflects the structural model organism but should be expanded to Mammalia, since TrxR1 function is highly conserved across mammals with identical catalytic mechanisms.

References (6)

Raw YAML

id: ARBA00026249
description: Rule predicting thioredoxin-disulfide reductase (NADPH) activity based
  on InterPro domains and CATH FunFam classifications. The rule uses three alternative
  condition sets targeting the same enzymatic function.
status: COMPLETE
rule_type: ARBA
rule:
  rule_id: ARBA00026249
  condition_sets:
  - number: 1
    conditions:
    - condition_type: INTERPRO
      value: IPR005982
      curie: InterPro:IPR005982
      label: Thioredoxin reductase
      negated: false
    - condition_type: INTERPRO
      value: IPR008255
      curie: InterPro:IPR008255
      label: Pyridine nucleotide-disulphide oxidoreductase, class-II, active site
      negated: false
    - condition_type: INTERPRO
      value: IPR023753
      curie: InterPro:IPR023753
      label: FAD/NAD(P)-binding domain
      negated: false
    notes: Primary condition set requiring three InterPro domain matches, but quantitative
      analysis reveals this reduces to a simple IPR005982 => GO:0004791 rule. All
      65 SwissProt proteins with IPR005982 also have both IPR008255 and IPR023753
      (complete subset, |A-B| = 0 for both pairs). The AND logic requiring all three
      domains is therefore redundant - IPR005982 alone is sufficient. Furthermore,
      this IPR005982 => GO:0004791 mapping already exists in the official InterPro2GO
      file, making this entire condition set completely redundant with existing manual
      curation.
    pairwise_overlap:
    - condition_a: IPR005982
      condition_b: IPR008255
      protein_database: SWISSPROT
      count_a: 65
      count_b: 84
      intersection_count: 65
      a_minus_b_count: 0
      b_minus_a_count: 19
      jaccard_similarity: 0.7738095238095238
      containment_a_in_b: 1.0
      containment_b_in_a: 0.7738095238095238
      interpretation: SUBSET
    - condition_a: IPR005982
      condition_b: IPR023753
      protein_database: SWISSPROT
      count_a: 65
      count_b: 869
      intersection_count: 65
      a_minus_b_count: 0
      b_minus_a_count: 804
      jaccard_similarity: 0.07479861910241657
      containment_a_in_b: 1.0
      containment_b_in_a: 0.07479861910241657
      interpretation: SUBSET
    - condition_a: IPR008255
      condition_b: IPR023753
      protein_database: SWISSPROT
      count_a: 84
      count_b: 869
      intersection_count: 84
      a_minus_b_count: 0
      b_minus_a_count: 785
      jaccard_similarity: 0.09666283084004602
      containment_a_in_b: 1.0
      containment_b_in_a: 0.09666283084004602
      interpretation: SUBSET
  - number: 2
    conditions:
    - condition_type: FUNFAM
      value: 3.50.50.60:FF:000064
      curie: CATH.FunFam:3.50.50.60:FF:000064
      label: Thioredoxin reductase
      negated: false
    - condition_type: TAXON
      value: '2759'
      curie: NCBITaxon:2759
      label: Eukaryota
      negated: false
    notes: Alternative condition set using CATH FunFam classification with eukaryotic
      taxonomic restriction, but quantitative analysis reveals this provides ZERO
      additional coverage. All 18 proteins matching this FunFam also possess IPR005982,
      IPR008255, AND IPR023753 (complete subset, containment = 1.0). This condition
      set is therefore completely redundant with condition set 1, adding no unique
      protein coverage beyond the InterPro-based conditions.
    pairwise_overlap:
    - condition_a: 3.30.390.30:FF:000004
      condition_b: 3.50.50.60:FF:000190
      protein_database: SWISSPROT
      count_a: 17
      count_b: 18
      intersection_count: 16
      a_minus_b_count: 1
      b_minus_a_count: 2
      jaccard_similarity: 0.8421052631578947
      containment_a_in_b: 0.9411764705882353
      containment_b_in_a: 0.8888888888888888
      interpretation: HIGH_OVERLAP
  - number: 3
    conditions:
    - condition_type: FUNFAM
      value: 3.30.390.30:FF:000004
      curie: CATH.FunFam:3.30.390.30:FF:000004
      label: Thioredoxin reductase 1, cytoplasmic
      negated: false
    - condition_type: FUNFAM
      value: 3.50.50.60:FF:000190
      curie: CATH.FunFam:3.50.50.60:FF:000190
      label: Thioredoxin reductase
      negated: false
    - condition_type: TAXON
      value: '10114'
      curie: NCBITaxon:10114
      label: Rattus
      negated: false
    notes: 'Highly specific condition set for Rattus (rat) cytoplasmic thioredoxin
      reductase 1. Quantitative analysis reveals this provides SUBSTANTIAL unique
      coverage: both FunFams (3.30.390.30:FF:000004 with 17 proteins and 3.50.50.60:FF:000190
      with 18 proteins) are COMPLETELY DISJOINT from IPR005982 (0 overlap), meaning
      they capture ~19 unique proteins that condition set 1 completely misses. While
      all CS3 proteins possess the broad IPR023753 domain (FAD/NAD binding, 869 proteins),
      they LACK the specific IPR005982 thioredoxin reductase domain. This represents
      proteins that may have thioredoxin reductase activity based on FunFam homology
      but are not annotated with the specific InterPro domain. The two FunFams overlap
      heavily with each other (16/17 and 16/18 shared proteins), suggesting they target
      the same protein set. The Rattus taxonomic restriction may be overly narrow,
      but this condition set provides valuable coverage for proteins lacking canonical
      InterPro domain annotations.'
    pairwise_overlap:
    - condition_a: 3.30.390.30:FF:000004
      condition_b: 3.50.50.60:FF:000190
      protein_database: SWISSPROT
      count_a: 17
      count_b: 18
      intersection_count: 16
      a_minus_b_count: 1
      b_minus_a_count: 2
      jaccard_similarity: 0.8421052631578947
      containment_a_in_b: 0.9411764705882353
      containment_b_in_a: 0.8888888888888888
      interpretation: HIGH_OVERLAP
  go_annotations:
  - go_id: GO:0004791
    go_label: thioredoxin-disulfide reductase (NADPH) activity
    aspect: MF
  reviewed_protein_count: 0
  unreviewed_protein_count: 0
  created_date: '2021-10-20'
  modified_date: '2025-03-21'
  entries:
  - id: 3.30.390.30:FF:000004
    type: FUNFAM
    label: Thioredoxin reductase 1, cytoplasmic
    appears_in_condition_sets:
    - 3
    protein_count: 17
    related_entries:
    - relationship: EQUIV
      target_id: IPR005982
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 17
    - relationship: EQUIV
      target_id: IPR008255
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 17
    - relationship: PREDICTS
      target_id: IPR023753
      containment: 0.02
      jaccard_similarity: 0.02
      intersection_count: 17
      exclusive_count: 852
    - relationship: EQUIV
      target_id: 3.50.50.60:FF:000064
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 17
    - relationship: PREDICTS
      target_id: 3.50.50.60:FF:000190
      containment: 0.941
      jaccard_similarity: 0.842
      intersection_count: 16
      exclusive_count: 1
    - relationship: PREDICTS
      target_id: IPR006338
      containment: 1.0
      jaccard_similarity: 0.895
      intersection_count: 17
      exclusive_count: 0
    - relationship: EQUIV
      target_id: IPR045870
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 17
    - relationship: PREDICTS
      target_id: GO:0004791
      containment: 1.0
      jaccard_similarity: 0.123
      intersection_count: 17
      exclusive_count: 0
  - id: 3.50.50.60:FF:000064
    type: FUNFAM
    label: Thioredoxin reductase
    appears_in_condition_sets:
    - 2
    protein_count: 18
    related_entries:
    - relationship: PREDICTS
      target_id: IPR005982
      containment: 0.277
      jaccard_similarity: 0.277
      intersection_count: 18
      exclusive_count: 47
    - relationship: PREDICTS
      target_id: IPR008255
      containment: 0.214
      jaccard_similarity: 0.214
      intersection_count: 18
      exclusive_count: 66
    - relationship: PREDICTS
      target_id: IPR023753
      containment: 0.021
      jaccard_similarity: 0.021
      intersection_count: 18
      exclusive_count: 851
    - relationship: EQUIV
      target_id: 3.30.390.30:FF:000004
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 18
    - relationship: EQUIV
      target_id: 3.50.50.60:FF:000190
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 18
    - relationship: EQUIV
      target_id: IPR006338
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 18
    - relationship: EQUIV
      target_id: IPR045870
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 18
    - relationship: PREDICTS
      target_id: GO:0004791
      containment: 1.0
      jaccard_similarity: 0.13
      intersection_count: 18
      exclusive_count: 0
  - id: 3.50.50.60:FF:000190
    type: FUNFAM
    label: Thioredoxin reductase
    appears_in_condition_sets:
    - 3
    protein_count: 18
    related_entries:
    - relationship: EQUIV
      target_id: IPR005982
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 18
    - relationship: EQUIV
      target_id: IPR008255
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 18
    - relationship: PREDICTS
      target_id: IPR023753
      containment: 0.021
      jaccard_similarity: 0.021
      intersection_count: 18
      exclusive_count: 851
    - relationship: EQUIV
      target_id: 3.50.50.60:FF:000064
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 18
    - relationship: PREDICTED_BY
      target_id: 3.30.390.30:FF:000004
      containment: 0.889
      jaccard_similarity: 0.842
      intersection_count: 16
      exclusive_count: 2
    - relationship: PREDICTS
      target_id: IPR006338
      containment: 1.0
      jaccard_similarity: 0.947
      intersection_count: 18
      exclusive_count: 0
    - relationship: EQUIV
      target_id: IPR045870
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 18
    - relationship: PREDICTS
      target_id: GO:0004791
      containment: 1.0
      jaccard_similarity: 0.13
      intersection_count: 18
      exclusive_count: 0
  - id: IPR005982
    type: INTERPRO
    label: Thioredoxin reductase
    appears_in_condition_sets:
    - 1
    protein_count: 65
    related_entries:
    - relationship: PREDICTS
      target_id: IPR008255
      containment: 1.0
      jaccard_similarity: 0.774
      intersection_count: 65
      exclusive_count: 0
    - relationship: PREDICTS
      target_id: IPR023753
      containment: 1.0
      jaccard_similarity: 0.075
      intersection_count: 65
      exclusive_count: 0
    - relationship: PREDICTED_BY
      target_id: 3.50.50.60:FF:000064
      containment: 1.0
      jaccard_similarity: 0.277
      intersection_count: 18
      exclusive_count: 0
    - relationship: EQUIV
      target_id: 3.30.390.30:FF:000004
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 65
    - relationship: EQUIV
      target_id: 3.50.50.60:FF:000190
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 65
    - relationship: EQUIV
      target_id: IPR006338
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 65
    - relationship: EQUIV
      target_id: IPR045870
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 65
    - relationship: PREDICTS
      target_id: GO:0004791
      containment: 1.0
      jaccard_similarity: 0.471
      intersection_count: 65
      exclusive_count: 0
  - id: IPR006338
    type: INTERPRO
    source: ipr2go
    protein_count: 19
    asserted_predicted_go_terms:
    - GO:0004791
    related_entries:
    - relationship: EQUIV
      target_id: IPR005982
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 19
    - relationship: EQUIV
      target_id: IPR008255
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 19
    - relationship: PREDICTS
      target_id: IPR023753
      containment: 0.022
      jaccard_similarity: 0.022
      intersection_count: 19
      exclusive_count: 850
    - relationship: EQUIV
      target_id: 3.50.50.60:FF:000064
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 19
    - relationship: PREDICTED_BY
      target_id: 3.30.390.30:FF:000004
      containment: 0.895
      jaccard_similarity: 0.895
      intersection_count: 17
      exclusive_count: 2
    - relationship: PREDICTED_BY
      target_id: 3.50.50.60:FF:000190
      containment: 0.947
      jaccard_similarity: 0.947
      intersection_count: 18
      exclusive_count: 1
    - relationship: EQUIV
      target_id: IPR045870
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 19
    - relationship: PREDICTS
      target_id: GO:0004791
      containment: 1.0
      jaccard_similarity: 0.138
      intersection_count: 19
      exclusive_count: 0
  - id: IPR008255
    type: INTERPRO
    label: Pyridine nucleotide-disulphide oxidoreductase, class-II, active site
    appears_in_condition_sets:
    - 1
    protein_count: 84
    related_entries:
    - relationship: PREDICTED_BY
      target_id: IPR005982
      containment: 0.774
      jaccard_similarity: 0.774
      intersection_count: 65
      exclusive_count: 19
    - relationship: PREDICTS
      target_id: IPR023753
      containment: 1.0
      jaccard_similarity: 0.097
      intersection_count: 84
      exclusive_count: 0
    - relationship: PREDICTED_BY
      target_id: 3.50.50.60:FF:000064
      containment: 1.0
      jaccard_similarity: 0.214
      intersection_count: 18
      exclusive_count: 0
    - relationship: EQUIV
      target_id: 3.30.390.30:FF:000004
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 84
    - relationship: EQUIV
      target_id: 3.50.50.60:FF:000190
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 84
    - relationship: EQUIV
      target_id: IPR006338
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 84
    - relationship: EQUIV
      target_id: IPR045870
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 84
    - relationship: PREDICTS
      target_id: GO:0004791
      containment: 0.857
      jaccard_similarity: 0.48
      intersection_count: 72
      exclusive_count: 12
  - id: IPR023753
    type: INTERPRO
    label: FAD/NAD(P)-binding domain
    appears_in_condition_sets:
    - 1
    protein_count: 869
    related_entries:
    - relationship: PREDICTED_BY
      target_id: IPR005982
      containment: 0.075
      jaccard_similarity: 0.075
      intersection_count: 65
      exclusive_count: 804
    - relationship: PREDICTED_BY
      target_id: IPR008255
      containment: 0.097
      jaccard_similarity: 0.097
      intersection_count: 84
      exclusive_count: 785
    - relationship: PREDICTED_BY
      target_id: 3.50.50.60:FF:000064
      containment: 1.0
      jaccard_similarity: 0.021
      intersection_count: 18
      exclusive_count: 0
    - relationship: PREDICTED_BY
      target_id: 3.30.390.30:FF:000004
      containment: 1.0
      jaccard_similarity: 0.02
      intersection_count: 17
      exclusive_count: 0
    - relationship: PREDICTED_BY
      target_id: 3.50.50.60:FF:000190
      containment: 1.0
      jaccard_similarity: 0.021
      intersection_count: 18
      exclusive_count: 0
    - relationship: PREDICTED_BY
      target_id: IPR006338
      containment: 1.0
      jaccard_similarity: 0.022
      intersection_count: 19
      exclusive_count: 0
    - relationship: EQUIV
      target_id: IPR045870
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 869
    - relationship: PREDICTED_BY
      target_id: GO:0004791
      containment: 0.775
      jaccard_similarity: 0.119
      intersection_count: 107
      exclusive_count: 31
  - id: IPR045870
    type: INTERPRO
    source: ipr2go
    protein_count: 11
    asserted_predicted_go_terms:
    - GO:0004791
    related_entries:
    - relationship: EQUIV
      target_id: IPR005982
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 11
    - relationship: EQUIV
      target_id: IPR008255
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 11
    - relationship: EQUIV
      target_id: IPR023753
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 11
    - relationship: EQUIV
      target_id: 3.50.50.60:FF:000064
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 11
    - relationship: EQUIV
      target_id: 3.30.390.30:FF:000004
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 11
    - relationship: EQUIV
      target_id: 3.50.50.60:FF:000190
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 11
    - relationship: EQUIV
      target_id: IPR006338
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 11
    - relationship: PREDICTS
      target_id: GO:0004791
      containment: 1.0
      jaccard_similarity: 0.08
      intersection_count: 11
      exclusive_count: 0
review_summary: This rule is COMPLETELY REDUNDANT with existing InterPro2GO mappings.
  Condition set 1 (IPR005982-based) duplicates the existing IPR005982 → GO:0004791 mapping.
  Condition set 2 is a complete subset of CS1 with zero unique coverage. Critically,
  condition set 3 (Rattus FunFams), which initially appeared to provide unique value
  by capturing ~19 proteins DISJOINT from IPR005982, is actually COMPLETELY COVERED
  by IPR006338 - another InterPro entry that maps to the same GO term via ipr2go.
  Quantitative analysis shows CS3 FunFams have 100% containment in IPR006338 (Jaccard
  0.895-0.947). Therefore ALL condition sets in this rule are redundant with existing
  ipr2go mappings and the rule should be DEPRECATED.
action: DEPRECATE
action_rationale: 'Complete analysis including external ipr2go mappings reveals this
  rule is entirely redundant. CS1 duplicates IPR005982 → GO:0004791 (already in ipr2go).
  CS2 is a subset of CS1. CS3, which captures ~19 proteins disjoint from IPR005982,
  appeared unique until examining IPR006338 - an external InterPro that also maps
  to GO:0004791 via ipr2go. Quantitative analysis proves CS3 FunFams are COMPLETELY
  COVERED by IPR006338: 3.30.390.30:FF:000004 has 100% containment in IPR006338 (17/17
  proteins, Jaccard 0.895), and 3.50.50.60:FF:000190 has 100% containment (18/18
  proteins, Jaccard 0.947, interpretation: REDUNDANT). The rule provides zero unique
  annotations beyond what ipr2go already covers via IPR005982 and IPR006338.'
suggested_modifications:
- Deprecate the entire rule as completely redundant with existing InterPro2GO mappings
- No condition sets provide unique value - all proteins are covered by either IPR005982
  or IPR006338, both of which map to GO:0004791 via ipr2go
- The apparent gap-filling value of CS3 was an artifact of only considering rule-internal
  InterPro domains, not the complete ipr2go mapping space
parsimony:
  assessment: REDUNDANT
  notes: This rule is COMPLETELY REDUNDANT with existing InterPro2GO mappings. All three
    condition sets are covered by ipr2go. CS1 duplicates IPR005982 → GO:0004791, CS2 is
    a subset of CS1, and CS3 - which appeared to provide unique value for ~19 proteins
    disjoint from IPR005982 - is actually 100% covered by IPR006338, another InterPro
    entry that maps to GO:0004791 via ipr2go. The CS3 FunFams (3.30.390.30:FF:000004 and
    3.50.50.60:FF:000190) have complete containment in IPR006338 (Jaccard 0.895-0.947).
    The rule provides zero annotations beyond what ipr2go already covers and should
    be deprecated.
literature_support:
  assessment: STRONG
  notes: 'Extensive structural and mechanistic literature spanning multiple decades
    supports this annotation rule. X-ray crystallography of rat TrxR at 3.0 Angstrom
    resolution (PDB 1H6V) reveals the FAD binding domain (residues 1-163, 297-367),
    NADPH binding domain (residues 164-296), and interface domain (residues 368-499)
    with the C-terminal selenocysteine redox center. The C-terminal extension of ~16
    residues containing the Gly-Cys-Sec-Gly motif serves dual functions: (1) extends
    the electron transport chain to the protein surface for thioredoxin reduction,
    and (2) sterically blocks the glutathione binding site. Mammalian TrxRs require
    selenocysteine for efficient catalysis - cysteine substitution reduces activity
    to ~1% of wild-type. Low-MW (bacterial-type) enzymes use a distinct 67-degree domain rotation
    mechanism. Recent work (2024) has identified the "doorstop pocket" at the re-face
    of FAD as a druggable regulatory site, with inhibitors showing efficacy against
    parasitic TGR.'
  supported_by:
  - reference_id: PMID:11481439
    supporting_text: electron transfer from NADPH to the disulfide of the substrate
      is possible without large conformational changes
  - reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-cyberian.md
    supporting_text: High-resolution crystal structures are available for bacterial
      thioredoxin reductases from E. coli and mammalian enzymes from rat
  - reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-falcon.md
    supporting_text: 'Doorstop pocket as a druggable site: Medicinal chemistry now
      targets the doorstop pocket, first demonstrated in Schistosoma mansoni TGR'
condition_overlap:
  assessment: MINOR
  notes: 'Quantitative analysis reveals IPR005982 (Thioredoxin reductase) is a complete
    subset of both IPR008255 and IPR023753. Among 65 SwissProt proteins with IPR005982,
    all 65 (100%) also have IPR008255 (class-II active site), and all 65 also have
    IPR023753 (FAD/NAD(P)-binding). This represents appropriate hierarchical specificity:
    IPR005982 provides TrxR family-level specificity (65 proteins), IPR008255 adds
    the class-II active site motif (84 proteins total, including 19 non-TrxR enzymes
    like glutathione reductase and lipoamide dehydrogenase), and IPR023753 captures
    the broader FAD/NAD(P)-binding architecture (869 proteins across many flavoenzyme
    families). The AND requirement creates a specificity filter that prevents false
    positives from related oxidoreductases. Set difference analysis shows IPR005982
    adds zero unique proteins beyond IPR008255 (|IPR005982 - IPR008255| = 0), but
    this is appropriate - the goal is specificity, not coverage expansion. Jaccard
    similarity of 0.774 between IPR005982 and IPR008255 indicates high but not complete
    overlap, with IPR008255 covering 19 additional class-II enzymes that correctly
    lack the TrxR-specific signature.'
  supported_by:
  - reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-falcon.md
    supporting_text: IPR008255 encodes the canonical class-II dithiol/disulfide redox
      chemistry of the family. Found across TrxR, glutathione reductase, lipoamide
      dehydrogenase, and related enzymes
go_specificity:
  assessment: APPROPRIATE
  notes: GO:0004791 is at the correct level of specificity. It captures the precise
    enzymatic function (NADPH-dependent thioredoxin reduction) without overgeneralizing
    to broader terms like GO:0016651 (oxidoreductase activity, acting on NAD(P)H)
    or GO:0016668 (oxidoreductase activity, acting on a sulfur group of donors, NAD(P)
    as acceptor). While mammalian TrxRs exhibit remarkable substrate promiscuity (selenite,
    lipid hydroperoxides, H2O2, vitamin C, alpha-lipoic acid), thioredoxin remains
    the physiologically primary substrate. The explicit NADPH specification is correct
    as mammalian TrxRs specifically utilize NADPH (not NADH) as electron donor. A
    selenoprotein-specific term would be inappropriately narrow as it would exclude
    functional variants like the nematode mitochondrial TrxR that lacks selenocysteine
    but retains full catalytic competence with its specific substrate.
  supported_by:
  - reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-cyberian.md
    supporting_text: This GO term is highly appropriate for the function of thioredoxin
      reductases. The reaction it describes is precisely and specifically what thioredoxin
      reductases do
  - reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-perplexity.md
    supporting_text: 'The nematode Haemonchus contortus contains two distinct TrxRs
      with notable structural differences: a cytoplasmic enzyme with a selenocysteine
      active site motif similar to mammalian TrxR, and a mitochondrial enzyme with
      a unique Gly-Cys-Cys-Gly active site lacking selenocysteine'
taxonomic_scope:
  assessment: APPROPRIATE
  notes: 'Set 1 (InterPro-based) has no taxonomic restriction, appropriate as TrxR
    is found across all domains of life. Thioredoxin reductases exist as two evolutionarily
    distinct classes: a low-MW class (~35 kDa) in bacteria, archaea, plants, fungi, and
    protists, which uses a 67-degree domain rotation mechanism; and a high-MW class (~55 kDa)
    in animals, which evolved from glutathione reductase (not from bacterial TrxR) and has a C-terminal
    selenocysteine extension. Set 2''s eukaryotic restriction is justified as it captures
    both eukaryotic enzyme classes while excluding prokaryotic enzymes. Set 3''s Rattus
    restriction reflects the structural model organism but should be expanded to Mammalia
    since TrxR1 function is highly conserved across mammals with identical catalytic
    mechanisms.'
  supported_by:
  - reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-cyberian.md
    supporting_text: The high molecular weight form did not evolve from bacterial
      thioredoxin reductases but rather from glutathione reductase in lower eukaryotes
  - reference_id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-falcon.md
    supporting_text: Bacterial proteins with TR-like folds (e.g., YpdA/Bdr; TR-like
      FNRs) should be excluded unless the TrxR-specific signature is present
confidence: 0.92
references:
- id: PMID:11481439
  title: 'Three-dimensional structure of a mammalian thioredoxin reductase: implications
    for mechanism and evolution of a selenocysteine-dependent enzyme'
  findings:
  - statement: 3.0 Angstrom X-ray structure of rat TrxR (PDB 1H6V) reveals FAD binding
      domain (residues 1-163, 297-367), NADPH binding domain (residues 164-296), and
      interface domain (residues 368-499) with C-terminal selenocysteine redox center.
      The C-terminal 16-residue extension containing the Gly-Cys-Sec-Gly motif extends
      the electron transport chain to the surface and blocks glutathione binding.
- id: PMID:31367788
  title: 'The thioredoxin system and cancer therapy: a review'
  findings:
  - statement: Comprehensive review describing NADPH-dependent reduction mechanism,
      three mammalian isozymes (TrxR1 cytosolic, TrxR2 mitochondrial, TrxR3 testis-specific),
      and therapeutic targeting in cancer.
- id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-perplexity.md
  title: Deep research analysis via Perplexity (51 citations)
  findings:
  - statement: Comprehensive literature review validating the scientific soundness
      of this annotation rule. Confirms GO:0004791 is appropriately specific and domain
      combination is diagnostic. Details the doorstop pocket mechanism and the selenocysteine
      requirement for full catalytic activity.
  - statement: Nematode H. contortus mitochondrial TrxR lacks selenocysteine but retains
      full catalytic competence, demonstrating that GO:0004791 appropriately does
      not require selenoprotein specification.
- id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-falcon.md
  title: Deep research analysis via Falcon (15 citations, 2023-2024 focus)
  findings:
  - statement: 'Recent 2024 research on the "doorstop pocket" as a druggable regulatory
      site (DOI: 10.1021/acs.jmedchem.4c00669). Universal presence across the family
      with isoform-specific shape/electrostatics.'
  - statement: Caution about TR-like FNRs and YpdA/Bdr potentially causing false positives
      if only generic family motifs are used without TrxR-specific signatures.
- id: file:rules/arba/ARBA00026249/ARBA00026249-deep-research-cyberian.md
  title: Deep research analysis via Cyberian (14 citations, detailed evolutionary
    analysis)
  findings:
  - statement: 'Two evolutionarily distinct enzyme classes: low-MW (~35 kDa) in bacteria,
      plants, and fungi using 67-degree domain rotation, and high-MW (~55 kDa) in animals
      derived from glutathione reductase with C-terminal selenocysteine extension.'
  - statement: 'The C-terminal extension serves dual functions: (1) extends electron
      transport to the surface for thioredoxin reduction, and (2) sterically blocks
      the glutathione binding site preventing GR activity despite the shared catalytic
      core.'
  - statement: Animals have lost the low-MW form entirely; plants and fungi have lost
      the high-MW form. This mutually exclusive distribution reflects functional redundancy
      - both classes effectively catalyze NADPH-dependent thioredoxin reduction.
- id: InterPro:IPR005982
  title: Thioredoxin reductase InterPro entry
  findings:
  - statement: Domain signature specific to the thioredoxin reductase family. The
      primary determinant for TrxR specificity when combined with active site and
      cofactor-binding signatures.
supported_by:
- reference_id: PMID:31367788
  supporting_text: Thioredoxin (Trx), thioredoxin reductase (TrxR), and NADPH are
    key members of the Trx system that is involved in redox regulation and antioxidant
    defense