ARBA00047244 amide catabolic process (GO:0043605)

Type: ARBA
Status: COMPLETE
Action: DEPRECATE
Confidence: 0.80

Description

Rule predicting amide catabolic process (GO:0043605) based on four condition sets: (1) urease domains (beta/gamma subunits and active site) across all organisms, (2) PM20D1 FunFams in Eukaryota, (3) ureide pathway hydrolases in Viridiplantae, and (4) (S)-ureidoglycine aminohydrolase in Streptophyta. The rule captures mechanistically diverse enzymes that share the common chemistry of amide bond hydrolysis but serve distinct biological functions.

Analysis Summary

Domain Pairs Analyzed: 5
Condition Sets: 4
Subset Relationships: 1
Redundant Annotations: 0

Domain Overlap Analysis Table

Prediction matrix showing how row entries PREDICT column entries. Cell (i,j) shows what fraction of proteins with row domain i also have column domain j.

Columns, grouped by condition set (protein counts in parentheses):
CS 1: IPR002019 Urease, beta subunit-like (279); IPR002026 Urease, gamma/gamma-beta subunit (308); IPR017950 Urease active site (264)
CS 2 (Eukaryota): 1.10.150.900:FF:000003 N-fatty-acyl-amino acid synthase/hydrolase PM20D1 (10); 3.40.630.10:FF:000027 N-fatty-acyl-amino acid synthase/hydrolase PM20D1 (10)
CS 3 (Viridiplantae): 3.30.70.360:FF:000012 Putative ureidoglycolate hydrolase (2); 3.40.630.10:FF:000044 Allantoate amidohydrolase (7)
CS 4 (Streptophyta): 2.60.120.10:FF:000137 (S)-ureidoglycine aminohydrolase (2)
TGT: GO:0043605 amide catabolic process (2230)

Row entry (proteins) | IPR002019 | IPR002026 | IPR017950 | 1.10.150.900:FF:000003 | 3.40.630.10:FF:000027 | 3.30.70.360:FF:000012 | 3.40.630.10:FF:000044 | 2.60.120.10:FF:000137 | GO:0043605
CS 1 IPR002019 (279) | 100% | 10% J:5% (29) | 4% J:2% (10) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 100% J:13% (279)
CS 1 IPR002026 (308) | 9% J:5% (29) | 100% | 3% J:2% (10) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 100% J:14% (308)
CS 1 IPR017950 (264) | 4% J:2% (10) | 4% J:2% (10) | 100% | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 100% J:12% (264)
CS 2 1.10.150.900:FF:000003 (10) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 100% | 100% J:100% (10) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 70% J:0% (7)
CS 2 3.40.630.10:FF:000027 (10) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 100% J:100% (10) | 100% | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 70% J:0% (7)
CS 3 3.30.70.360:FF:000012 (2) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 100% | 100% J:29% (2) | 0% J:0% (0) | 100% J:0% (2)
CS 3 3.40.630.10:FF:000044 (7) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 29% J:29% (2) | 100% | 0% J:0% (0) | 57% J:0% (4)
CS 4 2.60.120.10:FF:000137 (2) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 0% J:0% (0) | 100% | 100% J:0% (2)
TGT GO:0043605 (2230) | 13% J:13% (279) | 14% J:14% (308) | 12% J:12% (264) | 0% J:0% (7) | 0% J:0% (7) | 0% J:0% (2) | 0% J:0% (4) | 0% J:0% (2) | 100%

Legend: Each cell shows PREDICTS % (fraction of row entry proteins that also have column entry - row PREDICTS column), Jaccard similarity (J:%), and intersection count. CS = Condition Set(s), TGT = GO annotation target.
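The two numbers in each cell can be reproduced from the protein counts alone. The following is a minimal sketch, assuming PREDICTS is the intersection divided by the row count and Jaccard is the intersection divided by the union; the function name `overlap_metrics` is hypothetical and not part of the report's tooling.

```python
def overlap_metrics(count_row, count_col, intersection):
    """Return (predicts, jaccard) for one matrix cell.

    predicts: fraction of row-entry proteins that also carry the column entry.
    jaccard:  intersection size divided by union size.
    """
    predicts = intersection / count_row
    union = count_row + count_col - intersection
    jaccard = intersection / union if union else 0.0
    return predicts, jaccard


# Cell (IPR002019, IPR002026): 279 and 308 proteins, 29 shared.
predicts, jaccard = overlap_metrics(279, 308, 29)
print(f"{predicts:.0%} J:{jaccard:.0%}")
```

With the counts from the matrix this gives about 0.104 and 0.052, matching the "10% J:5% (29)" cell and the pairwise overlap values reported for condition set 1.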

Review Summary

This rule captures four mechanistically distinct enzyme families that share the common chemistry of amide bond hydrolysis: (1) ureases with dinuclear Ni(II) centers hydrolyzing urea for nitrogen mobilization and pH modulation, (2) PM20D1 bidirectional synthase/hydrolases regulating thermogenic N-acyl amino acids, (3-4) plant ureide pathway Mn2+-dependent hydrolases (AAH, UAH, UGlyAH) liberating nitrogen from purine catabolites. Condition sets 1, 3, and 4 are strongly supported by extensive structural, biochemical, and genetic evidence demonstrating amide catabolism as the core or primary function. Condition set 2 (PM20D1) is technically accurate but potentially misleading—PM20D1's primary role is thermogenic lipid signaling, not general amide catabolism. While GO:0043605 is biochemically correct for all four sets, more specific annotations would improve biological utility: GO:0043419 (urea catabolic process) for urease, pathway-specific terms for ureide enzymes, and lipid metabolism terms for PM20D1. The rule demonstrates both strength (mechanistic accuracy) and weakness (biological context obscured by overly general annotation).

Action Rationale

Rule has fundamental logical errors that invalidate it: (1) CS1 uses AND logic requiring all 3 urease signatures, yet these have LOW pairwise overlap (3-10% containment), so the triple intersection is at most 10 proteins. The three signatures together cover 802 to 812 distinct proteins (851 signature matches in total, minus the pairwise overlaps), so the AND condition misses about 99% of them. Each signature individually predicts GO:0043605 with 100% containment, which indicates that each is sufficient on its own and that AND logic is incorrect. (2) CS2 contains perfect redundancy: two PM20D1 FunFams with identical 10-protein sets. (3) CS3 uses AND logic with a perfect subset (UAH ⊆ AAH), missing 5 AAH-only proteins. Additionally, PM20D1's primary biological role is thermogenic lipid signaling, not nitrogen catabolism; the 70% containment to GO:0043605 reflects that 30% are correctly annotated to more specific metabolic/thermogenic terms.
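The size of the CS1 false-negative problem can be bounded from the pairwise counts by inclusion-exclusion. This is a sketch under the assumption that only the per-signature counts and pairwise intersections from the report are known; the triple intersection is not reported, so the code derives bounds rather than an exact value. All variable names are illustrative.

```python
counts = {"IPR002019": 279, "IPR002026": 308, "IPR017950": 264}
pairwise = {
    ("IPR002019", "IPR002026"): 29,
    ("IPR002019", "IPR017950"): 10,
    ("IPR002026", "IPR017950"): 10,
}

# The triple intersection cannot exceed the smallest pairwise intersection.
triple_max = min(pairwise.values())

# Lower bound: for each set X, |X∩Y| + |X∩Z| - |X| proteins must sit in all three.
triple_min = 0
for name, size in counts.items():
    shared = sum(n for pair, n in pairwise.items() if name in pair)
    triple_min = max(triple_min, shared - size)

# Inclusion-exclusion: |A∪B∪C| = Σ|X| - Σ|X∩Y| + |A∩B∩C|.
base = sum(counts.values()) - sum(pairwise.values())
union_min = base + triple_min
union_max = base + triple_max

# Best case for AND logic: the triple intersection is as large as allowed.
max_and_coverage = triple_max / union_max
min_false_negative_rate = 1 - max_and_coverage

print(union_min, union_max, f"{min_false_negative_rate:.1%}")
```

Under these counts the union lies between 802 and 812 proteins, so requiring all three signatures can recover at most 10 of them.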

GO Annotations

GO:0043605 - amide catabolic process
Aspect: BP

Rule Definition

Condition Sets

Condition Set 1

3 condition(s)
Notes:

Urease catalyzes the hydrolysis of urea (H2N-CO-NH2) to ammonia and CO2 via a carbamate intermediate, using a conserved dinuclear Ni(II) active site bridged by a carbamylated lysine. This represents a quintessential amide catabolic reaction. Ureases are found across bacteria and plants (but absent from animals) and serve multiple biological roles including nitrogen mobilization, environmental pH modulation, and bacterial virulence. The combination of beta/gamma subunit domains and active site signature provides high specificity for ureolytic activity.

Pairwise Overlap Analysis

Condition A | Condition B | Count A | Count B | Intersection | Jaccard | A in B | B in A | Interpretation
IPR002019 | IPR002026 | 279 | 308 | 29 | 0.052 | 0.104 | 0.094 | LOW
IPR002019 | IPR017950 | 279 | 264 | 10 | 0.019 | 0.036 | 0.038 | LOW
IPR002026 | IPR017950 | 308 | 264 | 10 | 0.018 | 0.032 | 0.038 | LOW

Condition Set 2

3 condition(s)
Notes:

PM20D1 functions as a bidirectional enzyme catalyzing both synthesis and hydrolysis of N-acyl amino acids from fatty acids and amino acids. While PM20D1 does catalyze amide hydrolysis (94% conversion in hydrolase direction), its primary biological role centers on thermogenic lipid signaling rather than general amide catabolism. The enzyme regulates N-acyl amino acid levels for thermogenic and metabolic effects, not nitrogen mobilization. This condition set raises questions about annotation specificity—while technically correct, GO:0043605 may obscure PM20D1's specialized metabolic function. The Eukaryota scope may also be overly broad.

Pairwise Overlap Analysis

Condition A | Condition B | Count A | Count B | Intersection | Jaccard | A in B | B in A | Interpretation
1.10.150.900:FF:000003 | 3.40.630.10:FF:000027 | 10 | 10 | 10 | 1.000 | 1.000 | 1.000 | REDUNDANT

Condition Set 3

3 condition(s)
Notes:

These enzymes function in the plant ureide pathway for purine catabolism. Allantoate amidohydrolase (AAH) hydrolyzes allantoate to (S)-ureidoglycine, CO2, and NH3; ureidoglycolate hydrolase (UAH) cleaves ureidoglycolate to glyoxylate, CO2, and NH3. Both use Mn2+ cofactors and are ER-localized. These are well-established amide catabolic enzymes in nitrogen remobilization. The Viridiplantae restriction is correct for the plant enzymes but may be too narrow: the ureide pathway also exists in some bacteria and fungi.

Pairwise Overlap Analysis

Condition A | Condition B | Count A | Count B | Intersection | Jaccard | A in B | B in A | Interpretation
3.30.70.360:FF:000012 | 3.40.630.10:FF:000044 | 2 | 7 | 2 | 0.286 | 1.000 | 0.286 | SUBSET
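The effect of the SUBSET relationship on an AND condition can be illustrated with plain set operations. This is a sketch only: the protein identifiers are placeholders invented for the example (the report gives counts, not accessions), sized to match the 7 AAH and 2 UAH FunFam members above.

```python
# Placeholder identifiers; only the set sizes come from the report.
aah_funfam = {f"protein_{i}" for i in range(1, 8)}   # 3.40.630.10:FF:000044, 7 proteins
uah_funfam = {"protein_1", "protein_2"}               # 3.30.70.360:FF:000012, 2 proteins

assert uah_funfam <= aah_funfam  # SUBSET: every UAH match is also an AAH match

and_matches = aah_funfam & uah_funfam   # what an AND condition selects
or_matches = aah_funfam | uah_funfam    # what an OR condition would select
missed_by_and = or_matches - and_matches

print(len(and_matches), len(or_matches), len(missed_by_and))
```

Because the smaller set is fully contained in the larger one, AND collapses to the smaller set and the 5 proteins matching only the AAH FunFam are never reached.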

Condition Set 4

2 condition(s)
Notes:

(S)-Ureidoglycine aminohydrolase (UGlyAH) catalyzes the enantioselective hydrolysis of (S)-ureidoglycine to (S)-ureidoglycolate and NH3 using Mn2+ as a cofactor. The enzyme adopts a bicupin fold and functions as an octamer. Crystal structures demonstrate that Mn2+ acts as a molecular anchor dictating enantioselectivity. UGlyAH is a well-characterized amide catabolic enzyme in the plant ureide pathway. The Streptophyta restriction appropriately captures the distribution of this enzyme family.

Assessments

OVERLY_COMPLEX

The rule's AND logic creates massive false negatives. CS1 urease signatures have only 3-10% pairwise overlap but are combined with AND logic, leaving a triple intersection of at most 10 proteins out of 802 to 812 distinct urease proteins (851 signature matches in total). Each signature individually has 100% containment in GO:0043605, which indicates that each is sufficient, so CS1 should use OR logic or be split into 3 separate condition sets. CS2 has redundant FunFams. CS3's subset relationship means 5 AAH-only proteins are excluded. The rule also groups mechanistically diverse enzymes under an overly broad GO term, obscuring biological specificity.
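The difference between the current CS1 and the suggested split can be sketched with a small evaluator. It assumes, as the assessment describes, that conditions inside a condition set are combined with AND and that condition sets are alternatives (OR); the function `rule_fires` and the example proteins are hypothetical and do not represent the actual ARBA engine. Taxon conditions are left out for brevity.

```python
# Current CS1: one condition set requiring all three urease signatures.
CURRENT_CS1 = [frozenset({"IPR002019", "IPR002026", "IPR017950"})]

# Suggested alternative: three condition sets, one signature each.
SPLIT_CS1 = [
    frozenset({"IPR002019"}),
    frozenset({"IPR002026"}),
    frozenset({"IPR017950"}),
]


def rule_fires(protein_signatures, condition_sets):
    """True if any condition set is fully satisfied (AND within, OR across)."""
    return any(required <= protein_signatures for required in condition_sets)


# Hypothetical proteins: a standalone beta subunit and a fused single-chain urease.
beta_subunit_only = frozenset({"IPR002019"})
fused_urease = frozenset({"IPR002019", "IPR002026", "IPR017950"})

for name, sigs in [("beta only", beta_subunit_only), ("fused", fused_urease)]:
    print(name, rule_fires(sigs, CURRENT_CS1), rule_fires(sigs, SPLIT_CS1))
```

In this sketch the standalone beta subunit is annotated only under the split form, while a protein carrying all three signatures is annotated under both.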

STRONG

Condition sets 1, 3, and 4 have extensive experimental support spanning structural crystallography, enzyme kinetics, mechanistic studies, and genetic analyses.

Urease: High-resolution structures of bacterial (K. aerogenes, B. pasteurii, H. pylori) and plant (jack bean) ureases reveal conserved dinuclear Ni(II) active sites bridged by carbamylated lysine. Mechanistic studies demonstrate bridging hydroxide nucleophilic attack with extraordinary catalytic proficiency (10^14-fold rate enhancement). Urease is essential for H. pylori gastric colonization and pH-dependent bacterial survival.

Plant ureide enzymes: Crystal structures of AtUGlyAH and AtUAH in substrate-bound forms demonstrate Mn2+-dependent catalysis and explain enantioselectivity. Genetic studies show AAH knockout causes seed dormancy, ureide accumulation, and amino acid depletion; UAH-URE double mutants show synthetic lethality, confirming physiological importance. The ureide pathway is conserved across Viridiplantae including mosses and algae.

PM20D1: Biochemical assays demonstrate bidirectional NAA synthesis/hydrolysis (1.2% synthase, 94% hydrolase conversion). However, genetic and physiological studies emphasize thermogenic/metabolic roles rather than nitrogen catabolism: PM20D1 knockout causes glucose intolerance and metabolic dysfunction; overexpression increases energy expenditure. Natural promoter variants explain mouse strain cold tolerance differences. This represents strong evidence for catalytic activity but raises questions about functional annotation as general amide catabolism.

Supporting Evidence:

  • file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md: Extensive biochemical literature supports the conclusion that proteins matching condition sets 1, 3, and 4 catalyze amide hydrolysis. For urease enzymes, the chemical mechanism has been elucidated through structural crystallography, kinetic analysis, and spectroscopic investigation
  • file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md: The structure of (S)-ureidoglycine aminohydrolase from Arabidopsis thaliana in complex with its substrate (S)-ureidoglycine directly demonstrated substrate binding at the Mn2+ coordination site and the catalytic geometry supporting C-N bond hydrolysis
  • file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md: Analysis of Ataah mutant seeds revealed pleiotropic effects of allantoate amidohydrolase disruption, including seed dormancy increase, ureide accumulation, and amino acid depletion
  • file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md: Mice with PM20D1 deletion exhibit metabolic dysfunction characterized by glucose intolerance and decreased insulin sensitivity, indicating disrupted signaling through N-acyl amino acids rather than impaired amide degradation per se
  • file:rules/arba/ARBA00047244/ARBA00047244-deep-research-falcon.md: Dinuclear Ni(II) center bridged by a carbamylated lysine is conserved; accessory proteins UreD/E/F/G assemble and metallate the site; catalytic "bridging hydroxide" mechanism supported by structural and inhibitor studies
  • file:rules/arba/ARBA00047244/ARBA00047244-deep-research-falcon.md: In Arabidopsis, soybean and rice, allantoin degradation employs one aminohydrolase (UGlyAH) and three amidohydrolases (ALN upstream ring opening; AAH; UAH) to convert (S)-allantoin to glyoxylate, CO2 and NH3

NONE

The four condition sets capture entirely distinct protein families with no sequence or structural overlap. Condition set 1 targets ureases (Ni-dependent, bacterial/plant, multi-subunit assemblies). Condition sets 3-4 target plant ureide pathway enzymes (Mn-dependent, bicupin or related folds, ER-localized). Condition set 2 targets PM20D1 (M20 peptidase family, secreted, bidirectional). These represent independent evolutionary solutions to amide hydrolysis chemistry. Mechanistically, urease uses dinuclear Ni(II) with bridging hydroxide; ureide enzymes use Mn2+ coordination; PM20D1 mechanism is less characterized but involves M20 peptidase chemistry. Functionally, ureases serve nitrogen mobilization and pH modulation; ureide enzymes serve purine catabolism/nitrogen remobilization; PM20D1 serves thermogenic lipid signaling. The lack of overlap underscores that the rule groups mechanistically and functionally diverse enzymes solely on the basis of shared chemistry (amide bond cleavage), which may obscure biologically relevant distinctions.

TOO_BROAD

GO:0043605 (amide catabolic process) is technically accurate for all four condition sets but too broad to capture biological context. The term describes a general chemical mechanism shared by functionally diverse enzymes, potentially misleading users about actual biological roles. For urease (condition set 1), GO:0043419 (urea catabolic process) would be more specific—it captures the unique substrate (urea) rather than the chemical class (amides). Urease's biological roles in nitrogen cycling, bacterial virulence, and pH modulation are obscured by generic amide catabolism annotation. For ureide enzymes (condition sets 3-4), pathway-specific terms like "allantoin catabolic process" or "ureide catabolic process" would better capture their role in purine nitrogen remobilization. For PM20D1 (condition set 2), GO:0043605 is particularly problematic—while the enzyme catalyzes amide hydrolysis, its primary biological role involves thermogenic lipid signaling and metabolic regulation, not nitrogen catabolism. More appropriate annotations might include "N-acyl amino acid metabolic process", "lipid biosynthetic process" (for synthase activity), or "positive regulation of adaptive thermogenesis". The breadth of GO:0043605 enables computational grouping of chemically similar enzymes but sacrifices biological interpretability. A hierarchical annotation strategy (specific + general terms) would preserve both granularity and computational utility.

TOO_BROAD

Condition set 1 (urease) has no taxonomic restriction, which is appropriate given urease's broad distribution across bacteria and plants (but absence from animals). However, the rule could be made more explicit by adding "not Animalia" as an exclusion, providing logical clarity. Condition set 2 (PM20D1) restricts to Eukaryota (NCBITaxon:2759), which is appropriate for PM20D1 presence but may be too broad for the specific annotation. If GO:0043605 is retained, the scope might be narrowed to Mammalia or Vertebrata where thermogenic/metabolic context is relevant—PM20D1 orthologs in plants, fungi, or invertebrates may serve unrelated functions. Condition set 3 (ureide enzymes) restricts to Viridiplantae, which is too narrow—the ureide pathway exists in bacteria and fungi, creating false negatives. The scope should expand to "Viridiplantae, selected bacteria, and selected fungi" or implement granular inclusion of known lineages. Condition set 4 (UGlyAH) restricts to Streptophyta (land plants + streptophyte algae), which appropriately captures this enzyme's distribution and is consistent with experimental evidence. Overall, taxonomic scopes show mixed appropriateness—some too broad (PM20D1 in Eukaryota), some too narrow (ureide enzymes excluding prokaryotes), some appropriate (urease unrestricted, UGlyAH in Streptophyta).
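The taxonomic scopes discussed above can be checked against an organism's lineage. The sketch below uses the NCBITaxon identifiers from the rule (2759 Eukaryota, 33090 Viridiplantae, 35493 Streptophyta); the lineages are abbreviated, hand-written examples for illustration rather than output from a taxonomy service, and `in_scope` is a hypothetical helper.

```python
# Taxon restriction per condition set; None means unrestricted.
SCOPES = {1: None, 2: 2759, 3: 33090, 4: 35493}

# Abbreviated lineages (illustrative subsets of each organism's NCBI lineage).
LINEAGES = {
    "Arabidopsis thaliana": {2759, 33090, 35493, 3702},
    "Mus musculus": {2759, 33208, 40674, 10090},
    "Escherichia coli": {2, 562},
}


def in_scope(condition_set, organism):
    """True if the organism's lineage satisfies the condition set's taxon scope."""
    taxon = SCOPES[condition_set]
    return taxon is None or taxon in LINEAGES[organism]


for organism in LINEAGES:
    allowed = [cs for cs in SCOPES if in_scope(cs, organism)]
    print(organism, allowed)
```

A bacterial protein can only ever satisfy condition set 1 here, which is the false-negative concern raised for ureide pathway enzymes outside Viridiplantae.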

Raw YAML

id: ARBA00047244
description: 'Rule predicting amide catabolic process (GO:0043605) based on four condition
  sets: (1) urease domains (beta/gamma subunits and active site) across all organisms,
  (2) PM20D1 FunFams in Eukaryota, (3) ureide pathway hydrolases in Viridiplantae,
  and (4) (S)-ureidoglycine aminohydrolase in Streptophyta. The rule captures mechanistically
  diverse enzymes that share the common chemistry of amide bond hydrolysis but serve
  distinct biological functions.'
status: COMPLETE
rule_type: ARBA
rule:
  rule_id: ARBA00047244
  condition_sets:
  - number: 1
    conditions:
    - condition_type: INTERPRO
      value: IPR002019
      curie: InterPro:IPR002019
      label: Urease, beta subunit-like
      negated: false
    - condition_type: INTERPRO
      value: IPR002026
      curie: InterPro:IPR002026
      label: Urease, gamma/gamma-beta subunit
      negated: false
    - condition_type: INTERPRO
      value: IPR017950
      curie: InterPro:IPR017950
      label: Urease active site
      negated: false
    notes: Urease catalyzes the hydrolysis of urea (H2N-CO-NH2) to ammonia and CO2
      via a carbamate intermediate, using a conserved dinuclear Ni(II) active site
      bridged by a carbamylated lysine. This represents a quintessential amide catabolic
      reaction. Ureases are found across bacteria and plants (but absent from animals)
      and serve multiple biological roles including nitrogen mobilization, environmental
      pH modulation, and bacterial virulence. The combination of beta/gamma subunit
      domains and active site signature provides high specificity for ureolytic activity.
    pairwise_overlap:
    - condition_a: IPR002019
      condition_b: IPR002026
      protein_database: SWISSPROT
      count_a: 279
      count_b: 308
      intersection_count: 29
      a_minus_b_count: 250
      b_minus_a_count: 279
      jaccard_similarity: 0.05197132616487455
      containment_a_in_b: 0.1039426523297491
      containment_b_in_a: 0.09415584415584416
      interpretation: LOW
    - condition_a: IPR002019
      condition_b: IPR017950
      protein_database: SWISSPROT
      count_a: 279
      count_b: 264
      intersection_count: 10
      a_minus_b_count: 269
      b_minus_a_count: 254
      jaccard_similarity: 0.01876172607879925
      containment_a_in_b: 0.035842293906810034
      containment_b_in_a: 0.03787878787878788
      interpretation: LOW
    - condition_a: IPR002026
      condition_b: IPR017950
      protein_database: SWISSPROT
      count_a: 308
      count_b: 264
      intersection_count: 10
      a_minus_b_count: 298
      b_minus_a_count: 254
      jaccard_similarity: 0.017793594306049824
      containment_a_in_b: 0.032467532467532464
      containment_b_in_a: 0.03787878787878788
      interpretation: LOW
  - number: 2
    conditions:
    - condition_type: FUNFAM
      value: 1.10.150.900:FF:000003
      curie: CATH.FunFam:1.10.150.900:FF:000003
      label: N-fatty-acyl-amino acid synthase/hydrolase PM20D1
      negated: false
    - condition_type: FUNFAM
      value: 3.40.630.10:FF:000027
      curie: CATH.FunFam:3.40.630.10:FF:000027
      label: N-fatty-acyl-amino acid synthase/hydrolase PM20D1
      negated: false
    - condition_type: TAXON
      value: '2759'
      curie: NCBITaxon:2759
      label: Eukaryota
      negated: false
    notes: PM20D1 functions as a bidirectional enzyme catalyzing both synthesis and
      hydrolysis of N-acyl amino acids from fatty acids and amino acids. While PM20D1
      does catalyze amide hydrolysis (94% conversion in hydrolase direction), its
      primary biological role centers on thermogenic lipid signaling rather than general
      amide catabolism. The enzyme regulates N-acyl amino acid levels for thermogenic
      and metabolic effects, not nitrogen mobilization. This condition set raises
      questions about annotation specificity—while technically correct, GO:0043605
      may obscure PM20D1's specialized metabolic function. The Eukaryota scope may
      also be overly broad.
    pairwise_overlap:
    - condition_a: 1.10.150.900:FF:000003
      condition_b: 3.40.630.10:FF:000027
      protein_database: SWISSPROT
      count_a: 10
      count_b: 10
      intersection_count: 10
      a_minus_b_count: 0
      b_minus_a_count: 0
      jaccard_similarity: 1.0
      containment_a_in_b: 1.0
      containment_b_in_a: 1.0
      interpretation: REDUNDANT
  - number: 3
    conditions:
    - condition_type: FUNFAM
      value: 3.30.70.360:FF:000012
      curie: CATH.FunFam:3.30.70.360:FF:000012
      label: Putative ureidoglycolate hydrolase
      negated: false
    - condition_type: FUNFAM
      value: 3.40.630.10:FF:000044
      curie: CATH.FunFam:3.40.630.10:FF:000044
      label: Allantoate amidohydrolase
      negated: false
    - condition_type: TAXON
      value: '33090'
      curie: NCBITaxon:33090
      label: Viridiplantae
      negated: false
    notes: These enzymes function in the plant ureide pathway for purine catabolism.
      Allantoate amidohydrolase (AAH) hydrolyzes allantoate to (S)-ureidoglycine,
      CO2, and NH3; ureidoglycolate hydrolase (UAH) cleaves ureidoglycolate to glyoxylate,
      CO2, and NH3. Both use Mn2+ cofactors and are ER-localized. These are well-established
      amide catabolic enzymes in nitrogen remobilization. The Viridiplantae restriction
      is appropriate but may be too narrow—the ureide pathway also exists in some
      bacteria and fungi.
    pairwise_overlap:
    - condition_a: 3.30.70.360:FF:000012
      condition_b: 3.40.630.10:FF:000044
      protein_database: SWISSPROT
      count_a: 2
      count_b: 7
      intersection_count: 2
      a_minus_b_count: 0
      b_minus_a_count: 5
      jaccard_similarity: 0.2857142857142857
      containment_a_in_b: 1.0
      containment_b_in_a: 0.2857142857142857
      interpretation: SUBSET
  - number: 4
    conditions:
    - condition_type: FUNFAM
      value: 2.60.120.10:FF:000137
      curie: CATH.FunFam:2.60.120.10:FF:000137
      label: (S)-ureidoglycine aminohydrolase
      negated: false
    - condition_type: TAXON
      value: '35493'
      curie: NCBITaxon:35493
      label: Streptophyta
      negated: false
    notes: (S)-Ureidoglycine aminohydrolase (UGlyAH) catalyzes the enantioselective
      hydrolysis of (S)-ureidoglycine to (S)-ureidoglycolate and NH3 using Mn2+ as
      a cofactor. The enzyme adopts a bicupin fold and functions as an octamer. Crystal
      structures demonstrate that Mn2+ acts as a molecular anchor dictating enantioselectivity.
      UGlyAH is a well-characterized amide catabolic enzyme in the plant ureide pathway.
      The Streptophyta restriction appropriately captures the distribution of this
      enzyme family.
  go_annotations:
  - go_id: GO:0043605
    go_label: amide catabolic process
    aspect: BP
  reviewed_protein_count: 0
  unreviewed_protein_count: 0
  created_date: '2024-08-14'
  modified_date: '2025-03-21'
  entries:
  - id: 1.10.150.900:FF:000003
    type: FUNFAM
    label: N-fatty-acyl-amino acid synthase/hydrolase PM20D1
    appears_in_condition_sets:
    - 2
    protein_count: 10
    related_entries:
    - relationship: EQUIV
      target_id: IPR002019
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 10
    - relationship: EQUIV
      target_id: IPR002026
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 10
    - relationship: EQUIV
      target_id: IPR017950
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 10
    - relationship: EQUIV
      target_id: 3.40.630.10:FF:000027
      containment: 1.0
      jaccard_similarity: 1.0
      intersection_count: 10
      exclusive_count: 0
    - relationship: EQUIV
      target_id: 3.30.70.360:FF:000012
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 10
    - relationship: EQUIV
      target_id: 3.40.630.10:FF:000044
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 10
    - relationship: EQUIV
      target_id: 2.60.120.10:FF:000137
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 10
    - relationship: PREDICTS
      target_id: GO:0043605
      containment: 0.7
      jaccard_similarity: 0.003
      intersection_count: 7
      exclusive_count: 3
  - id: 2.60.120.10:FF:000137
    type: FUNFAM
    label: (S)-ureidoglycine aminohydrolase
    appears_in_condition_sets:
    - 4
    protein_count: 2
    related_entries:
    - relationship: EQUIV
      target_id: IPR002019
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 2
    - relationship: EQUIV
      target_id: IPR002026
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 2
    - relationship: EQUIV
      target_id: IPR017950
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 2
    - relationship: EQUIV
      target_id: 1.10.150.900:FF:000003
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 2
    - relationship: EQUIV
      target_id: 3.40.630.10:FF:000027
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 2
    - relationship: EQUIV
      target_id: 3.30.70.360:FF:000012
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 2
    - relationship: EQUIV
      target_id: 3.40.630.10:FF:000044
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 2
    - relationship: PREDICTS
      target_id: GO:0043605
      containment: 1.0
      jaccard_similarity: 0.001
      intersection_count: 2
      exclusive_count: 0
  - id: 3.30.70.360:FF:000012
    type: FUNFAM
    label: Putative ureidoglycolate hydrolase
    appears_in_condition_sets:
    - 3
    protein_count: 2
    related_entries:
    - relationship: EQUIV
      target_id: IPR002019
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 2
    - relationship: EQUIV
      target_id: IPR002026
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 2
    - relationship: EQUIV
      target_id: IPR017950
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 2
    - relationship: EQUIV
      target_id: 1.10.150.900:FF:000003
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 2
    - relationship: EQUIV
      target_id: 3.40.630.10:FF:000027
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 2
    - relationship: PREDICTS
      target_id: 3.40.630.10:FF:000044
      containment: 1.0
      jaccard_similarity: 0.286
      intersection_count: 2
      exclusive_count: 0
    - relationship: EQUIV
      target_id: 2.60.120.10:FF:000137
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 2
    - relationship: PREDICTS
      target_id: GO:0043605
      containment: 1.0
      jaccard_similarity: 0.001
      intersection_count: 2
      exclusive_count: 0
  - id: 3.40.630.10:FF:000027
    type: FUNFAM
    label: N-fatty-acyl-amino acid synthase/hydrolase PM20D1
    appears_in_condition_sets:
    - 2
    protein_count: 10
    related_entries:
    - relationship: EQUIV
      target_id: IPR002019
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 10
    - relationship: EQUIV
      target_id: IPR002026
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 10
    - relationship: EQUIV
      target_id: IPR017950
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 10
    - relationship: EQUIV
      target_id: 1.10.150.900:FF:000003
      containment: 1.0
      jaccard_similarity: 1.0
      intersection_count: 10
      exclusive_count: 0
    - relationship: EQUIV
      target_id: 3.30.70.360:FF:000012
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 10
    - relationship: EQUIV
      target_id: 3.40.630.10:FF:000044
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 10
    - relationship: EQUIV
      target_id: 2.60.120.10:FF:000137
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 10
    - relationship: PREDICTS
      target_id: GO:0043605
      containment: 0.7
      jaccard_similarity: 0.003
      intersection_count: 7
      exclusive_count: 3
  - id: 3.40.630.10:FF:000044
    type: FUNFAM
    label: Allantoate amidohydrolase
    appears_in_condition_sets:
    - 3
    protein_count: 7
    related_entries:
    - relationship: EQUIV
      target_id: IPR002019
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 7
    - relationship: EQUIV
      target_id: IPR002026
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 7
    - relationship: EQUIV
      target_id: IPR017950
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 7
    - relationship: EQUIV
      target_id: 1.10.150.900:FF:000003
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 7
    - relationship: EQUIV
      target_id: 3.40.630.10:FF:000027
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 7
    - relationship: PREDICTED_BY
      target_id: 3.30.70.360:FF:000012
      containment: 0.286
      jaccard_similarity: 0.286
      intersection_count: 2
      exclusive_count: 5
    - relationship: EQUIV
      target_id: 2.60.120.10:FF:000137
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 7
    - relationship: PREDICTS
      target_id: GO:0043605
      containment: 0.571
      jaccard_similarity: 0.002
      intersection_count: 4
      exclusive_count: 3
  - id: IPR002019
    type: INTERPRO
    label: Urease, beta subunit-like
    appears_in_condition_sets:
    - 1
    protein_count: 279
    related_entries:
    - relationship: PREDICTS
      target_id: IPR002026
      containment: 0.104
      jaccard_similarity: 0.052
      intersection_count: 29
      exclusive_count: 250
    - relationship: EQUIV
      target_id: IPR017950
      containment: 0.038
      jaccard_similarity: 0.019
      intersection_count: 10
      exclusive_count: 269
    - relationship: EQUIV
      target_id: 1.10.150.900:FF:000003
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 279
    - relationship: EQUIV
      target_id: 3.40.630.10:FF:000027
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 279
    - relationship: EQUIV
      target_id: 3.30.70.360:FF:000012
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 279
    - relationship: EQUIV
      target_id: 3.40.630.10:FF:000044
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 279
    - relationship: EQUIV
      target_id: 2.60.120.10:FF:000137
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 279
    - relationship: PREDICTS
      target_id: GO:0043605
      containment: 1.0
      jaccard_similarity: 0.125
      intersection_count: 279
      exclusive_count: 0
  - id: IPR002026
    type: INTERPRO
    label: Urease, gamma/gamma-beta subunit
    appears_in_condition_sets:
    - 1
    protein_count: 308
    related_entries:
    - relationship: PREDICTED_BY
      target_id: IPR002019
      containment: 0.094
      jaccard_similarity: 0.052
      intersection_count: 29
      exclusive_count: 279
    - relationship: EQUIV
      target_id: IPR017950
      containment: 0.038
      jaccard_similarity: 0.018
      intersection_count: 10
      exclusive_count: 298
    - relationship: EQUIV
      target_id: 1.10.150.900:FF:000003
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 308
    - relationship: EQUIV
      target_id: 3.40.630.10:FF:000027
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 308
    - relationship: EQUIV
      target_id: 3.30.70.360:FF:000012
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 308
    - relationship: EQUIV
      target_id: 3.40.630.10:FF:000044
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 308
    - relationship: EQUIV
      target_id: 2.60.120.10:FF:000137
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 308
    - relationship: PREDICTS
      target_id: GO:0043605
      containment: 1.0
      jaccard_similarity: 0.138
      intersection_count: 308
      exclusive_count: 0
  - id: IPR017950
    type: INTERPRO
    label: Urease active site
    appears_in_condition_sets:
    - 1
    protein_count: 264
    related_entries:
    - relationship: EQUIV
      target_id: IPR002019
      containment: 0.038
      jaccard_similarity: 0.019
      intersection_count: 10
      exclusive_count: 254
    - relationship: EQUIV
      target_id: IPR002026
      containment: 0.038
      jaccard_similarity: 0.018
      intersection_count: 10
      exclusive_count: 254
    - relationship: EQUIV
      target_id: 1.10.150.900:FF:000003
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 264
    - relationship: EQUIV
      target_id: 3.40.630.10:FF:000027
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 264
    - relationship: EQUIV
      target_id: 3.30.70.360:FF:000012
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 264
    - relationship: EQUIV
      target_id: 3.40.630.10:FF:000044
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 264
    - relationship: EQUIV
      target_id: 2.60.120.10:FF:000137
      containment: 0.0
      jaccard_similarity: 0.0
      intersection_count: 0
      exclusive_count: 264
    - relationship: PREDICTS
      target_id: GO:0043605
      containment: 1.0
      jaccard_similarity: 0.118
      intersection_count: 264
      exclusive_count: 0
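# A minimal worked check of the overlap metrics above. It assumes that
# containment = intersection_count / protein_count and that
# jaccard_similarity = intersection_count / |A ∪ B|. This reading is inferred from
# the numbers themselves, not from a documented definition.
#   IPR002019 vs IPR002026:  containment = 29 / 279 ≈ 0.104
#                            jaccard     = 29 / (279 + 308 - 29) = 29 / 558 ≈ 0.052
#   IPR002019 vs GO:0043605: containment = 279 / 279 = 1.0
#                            jaccard     = 279 / 2230 ≈ 0.125
#   (2230 is the GO:0043605 protein count shown in the overlap table.)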
review_summary: 'This rule captures four mechanistically distinct enzyme families
  that share the common chemistry of amide bond hydrolysis: (1) ureases with dinuclear
  Ni(II) centers hydrolyzing urea for nitrogen mobilization and pH modulation, (2)
  PM20D1 bidirectional synthase/hydrolases regulating thermogenic N-acyl amino acids,
  (3-4) plant ureide pathway Mn2+-dependent hydrolases (AAH, UAH, UGlyAH) liberating
  nitrogen from purine catabolites. Condition sets 1, 3, and 4 are strongly supported
  by extensive structural, biochemical, and genetic evidence demonstrating amide catabolism
  as the core or primary function. Condition set 2 (PM20D1) is technically accurate
  but potentially misleading—PM20D1''s primary role is thermogenic lipid signaling,
  not general amide catabolism. While GO:0043605 is biochemically correct for all
  four sets, more specific annotations would improve biological utility: GO:0043419
  (urea catabolic process) for urease, pathway-specific terms for ureide enzymes,
  and lipid metabolism terms for PM20D1. The rule demonstrates both strength (mechanistic
  accuracy) and weakness (biological context obscured by overly general annotation).'
action: DEPRECATE
action_rationale: 'The rule has logical errors that invalidate it: (1) CS1 uses AND
  logic across three urease domains whose pairwise overlap is low (4-10%), so the
  triple intersection is at most 10 proteins. The three domains have 851 matches
  in total (279 + 308 + 264), or roughly 800 distinct proteins once the pairwise
  overlaps are subtracted, so about 99% of urease proteins are missed. Each domain
  individually predicts GO:0043605 with 100% containment, which shows that each
  is sufficient on its own and that AND logic is the wrong choice. (2) CS2 is fully
  redundant: its two PM20D1 FunFams match the same 10 proteins. (3) CS3 uses AND
  logic on a perfect subset (UAH ⊆ AAH), which drops the 5 AAH-only proteins. In
  addition, PM20D1''s primary biological role is thermogenic lipid signaling, not
  nitrogen catabolism; its 70% containment in GO:0043605 may reflect that the
  remaining 30% carry more specific metabolic or thermogenic annotations instead.'
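# Worked arithmetic for point (1), as a sketch. It assumes the pairwise
# intersection counts listed under related_entries (29, 10, 10) and an unknown
# triple intersection t with 0 <= t <= 10:
#   sum of domain matches  = 279 + 308 + 264 = 851
#   distinct proteins      = 851 - (29 + 10 + 10) + t = 802 + t   (802 to 812)
#   proteins passing AND   = t <= 10
#   fraction missed by AND = 1 - t / (802 + t) >= 1 - 10 / 812 ≈ 0.988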
suggested_modifications:
- For condition set 1 (urease), use GO:0043419 (urea catabolic process) instead of
  or in addition to GO:0043605
- For condition sets 3-4 (ureide enzymes), use pathway-specific terms like "allantoin
  catabolic process" or "ureide catabolic process" in addition to GO:0043605
- For condition set 2 (PM20D1), consider alternative annotations such as "N-acyl amino
  acid metabolic process" or split into separate synthesis and hydrolysis terms
- Expand condition set 3 taxonomic scope from Viridiplantae to include known bacterial
  and fungal lineages with ureide pathway
- Add hierarchical GO annotation structure where proteins receive both specific (substrate/pathway)
  and general (amide catabolic) terms
- Document biological context differences across condition sets to prevent misinterpretation
  of GO:0043605 as indicating general amide-degrading capacity
parsimony:
  assessment: OVERLY_COMPLEX
  notes: The rule's AND logic creates massive false negatives. CS1 urease domains
    have only 4-10% pairwise overlap, so requiring all three leaves a triple intersection
    of at most 10 proteins out of 851 domain matches (279 + 308 + 264). Each domain
    individually has 100% containment in GO:0043605, so each is sufficient on its
    own; CS1 should use OR logic or be split into 3 separate condition sets. CS2
    has redundant FunFams. CS3 requires both AAH and UAH although UAH is a subset
    of AAH, so the 5 AAH-only proteins are excluded. The rule also groups mechanistically
    diverse enzymes under an overly broad GO term, obscuring biological specificity.
  supported_by:
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
    supporting_text: The rule employs specific combinations of domains and families
      to predict GO:0043605 annotation. This multi-component approach increases specificity
      by preventing false positive annotations.
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-falcon.md
    supporting_text: Condition Set 1 (urease beta, gamma, and active site) provides
      a strong, mechanistically coherent signature for urease; together these should
      be highly specific for ureolysis and thus for amide catabolism.
literature_support:
  assessment: STRONG
  notes: 'Condition sets 1, 3, and 4 have extensive experimental support spanning
    structural crystallography, enzyme kinetics, mechanistic studies, and genetic
    analyses. Urease: High-resolution structures of bacterial (K. aerogenes, B. pasteurii,
    H. pylori) and plant (jack bean) ureases reveal conserved dinuclear Ni(II) active
    sites bridged by carbamylated lysine. Mechanistic studies demonstrate bridging
    hydroxide nucleophilic attack with extraordinary catalytic proficiency (10^14-fold
    rate enhancement). Urease is essential for H. pylori gastric colonization and
    pH-dependent bacterial survival. Plant ureide enzymes: Crystal structures of AtUGlyAH
    and AtUAH in substrate-bound forms demonstrate Mn2+-dependent catalysis and explain
    enantioselectivity. Genetic studies show AAH knockout causes seed dormancy, ureide
    accumulation, and amino acid depletion; UAH-URE double mutants show synthetic
    lethality, confirming physiological importance. The ureide pathway is conserved
    across Viridiplantae including mosses and algae. PM20D1: Biochemical assays demonstrate
    bidirectional N-acyl amino acid (NAA) synthesis/hydrolysis (1.2% synthase, 94%
    hydrolase conversion). However, genetic and physiological studies emphasize thermogenic/metabolic
    roles rather than nitrogen catabolism. PM20D1 knockout causes glucose intolerance
    and metabolic dysfunction, and overexpression increases energy expenditure. Natural promoter
    variants explain mouse strain cold tolerance differences. This represents strong
    evidence for catalytic activity but raises questions about functional annotation
    as general amide catabolism.'
  supported_by:
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
    supporting_text: Extensive biochemical literature supports the conclusion that
      proteins matching condition sets 1, 3, and 4 catalyze amide hydrolysis. For
      urease enzymes, the chemical mechanism has been elucidated through structural
      crystallography, kinetic analysis, and spectroscopic investigation
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
    supporting_text: The structure of (S)-ureidoglycine aminohydrolase from Arabidopsis
      thaliana in complex with its substrate (S)-ureidoglycine directly demonstrated
      substrate binding at the Mn2+ coordination site and the catalytic geometry supporting
      C-N bond hydrolysis
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
    supporting_text: Analysis of Ataah mutant seeds revealed pleiotropic effects of
      allantoate amidohydrolase disruption, including seed dormancy increase, ureide
      accumulation, and amino acid depletion
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
    supporting_text: Mice with PM20D1 deletion exhibit metabolic dysfunction characterized
      by glucose intolerance and decreased insulin sensitivity, indicating disrupted
      signaling through N-acyl amino acids rather than impaired amide degradation
      per se
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-falcon.md
    supporting_text: Dinuclear Ni(II) center bridged by a carbamylated lysine is conserved;
      accessory proteins UreD/E/F/G assemble and metallate the site; catalytic "bridging
      hydroxide" mechanism supported by structural and inhibitor studies
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-falcon.md
    supporting_text: In Arabidopsis, soybean and rice, allantoin degradation employs
      one aminohydrolase (UGlyAH) and three amidohydrolases (ALN upstream ring opening;
      AAH; UAH) to convert (S)-allantoin to glyoxylate, CO2 and NH3
condition_overlap:
  assessment: NONE
  notes: The four condition sets match disjoint protein sets (all cross-set intersections
    in the overlap data are 0). Condition set 1 targets ureases (Ni-dependent,
    bacterial/plant, multi-subunit assemblies). Condition sets 3-4 target plant ureide
    pathway enzymes (Mn-dependent, bicupin or related folds, ER-localized). Condition
    set 2 targets PM20D1 (M20 peptidase family, secreted, bidirectional). The PM20D1
    FunFam 3.40.630.10:FF:000027 and the AAH FunFam 3.40.630.10:FF:000044 do share
    a CATH superfamily (3.40.630.10), so the families are not all structurally unrelated,
    but the three enzyme groups otherwise represent
    independent evolutionary solutions to amide hydrolysis chemistry. Mechanistically,
    urease uses dinuclear Ni(II) with bridging hydroxide; ureide enzymes use Mn2+
    coordination; PM20D1 mechanism is less characterized but involves M20 peptidase
    chemistry. Functionally, ureases serve nitrogen mobilization and pH modulation;
    ureide enzymes serve purine catabolism/nitrogen remobilization; PM20D1 serves
    thermogenic lipid signaling. The lack of overlap underscores that the rule groups
    mechanistically and functionally diverse enzymes solely on the basis of shared
    chemistry (amide bond cleavage), which may obscure biologically relevant distinctions.
  supported_by:
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
    supporting_text: Urease enzymes from plants and bacteria maintain nearly identical
      active site architectures despite differences in subunit composition (single
      polypeptide vs. multi-subunit), oligomeric state, and overall fold contexts
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
    supporting_text: Manganese-dependent ureide pathway enzymes employ metal coordination
      chemistry fundamentally similar to nickel-dependent ureases, suggesting that
      nature has converged on metal-dependent catalysis as the solution to rapid amide
      hydrolysis
go_specificity:
  assessment: TOO_BROAD
  notes: GO:0043605 (amide catabolic process) is technically accurate for all four
    condition sets but too broad to capture biological context. The term describes
    a general chemical mechanism shared by functionally diverse enzymes, potentially
    misleading users about actual biological roles. For urease (condition set 1),
    GO:0043419 (urea catabolic process) would be more specific—it captures the unique
    substrate (urea) rather than the chemical class (amides). Urease's biological
    roles in nitrogen cycling, bacterial virulence, and pH modulation are obscured
    by generic amide catabolism annotation. For ureide enzymes (condition sets 3-4),
    pathway-specific terms like "allantoin catabolic process" or "ureide catabolic
    process" would better capture their role in purine nitrogen remobilization. For
    PM20D1 (condition set 2), GO:0043605 is particularly problematic—while the enzyme
    catalyzes amide hydrolysis, its primary biological role involves thermogenic lipid
    signaling and metabolic regulation, not nitrogen catabolism. More appropriate
    annotations might include "N-acyl amino acid metabolic process", "lipid biosynthetic
    process" (for synthase activity), or "positive regulation of adaptive thermogenesis".
    The breadth of GO:0043605 enables computational grouping of chemically similar
    enzymes but sacrifices biological interpretability. A hierarchical annotation
    strategy (specific + general terms) would preserve both granularity and computational
    utility.
  supported_by:
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
    supporting_text: For condition set 1 (urease enzymes), GO:0043605 is appropriate
      but potentially insufficiently specific. Ureases catalyze hydrolysis of urea,
      the simplest amide, and thus undoubtedly perform amide catabolism. However,
      the specific biological roles of urease in nitrogen mobilization, pH modulation,
      and virulence are better captured by more specific annotations
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
    supporting_text: For condition set 2 (PM20D1 and related N-acyl amino acid synthase/hydrolases),
      the appropriateness of GO:0043605 is questionable. While PM20D1 does catalyze
      hydrolysis of amide bonds, its primary biological role involves regulation of
      N-acyl amino acid biosynthesis and catabolism for thermogenic purposes
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
    supporting_text: For condition sets 3 and 4 (ureide pathway enzymes), GO:0043605
      is highly appropriate, as these enzymes specifically function to liberate ammonia
      from ureide substrates as part of purine catabolism. Alternative more specific
      annotations might reference "purine catabolism" or "nitrogen remobilization
      from purines," which would capture additional biological context
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-falcon.md
    supporting_text: GO:0043605 is broad. Where the rule has strong evidence for specific
      substrates (urea, allantoate, (S)-ureidoglycine, (S)-ureidoglycolate), more
      specific BP annotations would reduce ambiguity and improve downstream utility
taxonomic_scope:
  assessment: TOO_BROAD
  notes: Condition set 1 (urease) has no taxonomic restriction, which is appropriate
    given urease's broad distribution across bacteria, fungi, and plants (but absence
    from animals). However, the rule could be made more explicit by adding "not Animalia"
    as an exclusion, providing logical clarity. Condition set 2 (PM20D1) restricts
    to Eukaryota (NCBITaxon:2759), which is appropriate for PM20D1 presence but may
    be too broad for the specific annotation. If GO:0043605 is retained, the scope
    might be narrowed to Mammalia or Vertebrata where thermogenic/metabolic context
    is relevant—PM20D1 orthologs in plants, fungi, or invertebrates may serve unrelated
    functions. Condition set 3 (ureide enzymes) restricts to Viridiplantae, which
    is too narrow—the ureide pathway exists in bacteria and fungi, creating false
    negatives. The scope should expand to "Viridiplantae, selected bacteria, and selected
    fungi" or implement granular inclusion of known lineages. Condition set 4 (UGlyAH)
    restricts to Streptophyta (land plants + streptophyte algae), which appropriately
    captures this enzyme's distribution and is consistent with experimental evidence.
    Overall, taxonomic scopes show mixed appropriateness—some too broad (PM20D1 in
    Eukaryota), some too narrow (ureide enzymes excluding prokaryotes), some appropriate
    (urease unrestricted, UGlyAH in Streptophyta).
  supported_by:
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
    supporting_text: Urease enzymes are ubiquitous across plants, fungi, and bacteria
      but completely absent from animals. The universal distribution of urease across
      these kingdoms reflects the fundamental importance of urea as a nitrogen source
      in nature
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
    supporting_text: Condition set 2 restricts the annotation to eukaryotes (NCBITaxon:2759),
      which is appropriate given that PM20D1 represents a eukaryotic protein present
      in circulation in both mice and humans
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
    supporting_text: Condition set 3 currently restricts annotation to Viridiplantae,
      but the ureide pathway and its constituent enzymes are known to exist in bacteria
      and fungi
  - reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-falcon.md
    supporting_text: AAH, UGlyAH, and UAH orthologs are broadly conserved in Viridiplantae
      (including soybean, rice, Arabidopsis; also reported in mosses/algae). Streptophyta-restricted
      conditions (e.g., for (S)-ureidoglycine aminohydrolase) are consistent with
      the distribution of the ureide catabolic phase
confidence: 0.8
references:
- id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
  title: Deep research analysis via Perplexity (48 citations, comprehensive)
  findings:
  - statement: Urease catalyzes hydrolysis of urea to ammonia and CO2 via carbamate
      intermediate using conserved dinuclear Ni(II) active site bridged by carbamylated
      lysine. Catalytic proficiency is extraordinary (10^14-fold rate enhancement).
      Conserved across bacteria and plants despite quaternary structure variation.
  - statement: Plant ureide pathway enzymes (AAH, UGlyAH, UAH) use Mn2+ cofactors
      to sequentially degrade allantoin to glyoxylate, liberating ammonia and CO2.
      Crystal structures demonstrate metal-dependent catalysis and substrate specificity
      determinants. Genetic studies confirm physiological importance in nitrogen remobilization
      and seed development.
  - statement: PM20D1 is bidirectional synthase/hydrolase for N-acyl amino acids,
      showing 1.2% synthase and 94% hydrolase conversion in vitro. Primary biological
      role is thermogenic lipid signaling—knockout causes metabolic dysfunction, overexpression
      increases energy expenditure. Natural variants explain mouse strain cold tolerance
      differences.
  - statement: 'GO:0043605 is appropriate but insufficiently specific. More specific
      annotations: GO:0043419 (urea catabolism) for urease, pathway-specific terms
      for ureide enzymes, metabolic/thermogenic terms for PM20D1. Current annotation
      obscures biological context.'
  - statement: 'Taxonomic scope issues: Viridiplantae restriction for ureide enzymes
      excludes known bacterial/fungal systems (false negatives); Eukaryota scope for
      PM20D1 may be too broad given thermogenesis context.'
- id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-falcon.md
  title: Deep research analysis via Falcon (17 citations)
  findings:
  - statement: Urease subunit/active site signatures provide strong, mechanistically
      coherent signature for ureolysis. Dinuclear Ni(II) center bridged by carbamylated
      lysine conserved across bacteria and plants. Accessory proteins UreD/E/F/G assemble
      and metallate the site.
  - statement: Plant ureide-pathway hydrolases (AAH, UGlyAH, UAH) are close homologs
      with distinct active-site determinants enforcing narrow substrate specificity
      (e.g., AtUAH Tyr-423 vs. Gly in AAH; QXR motif in AAH). Specialized roles in
      purine/ureide catabolism, not broadly multifunctional.
  - statement: GO:0043605 appropriate for urease and ureide enzymes but broad. More
      specific BP annotations would reduce ambiguity. PM20D1 condition set lacks strong
      supporting evidence in provided context—should be reassessed.
  - statement: 'Rule logic assessment: Urease and ureide enzyme condition sets are
      well-supported by conserved structure and pathway biochemistry. Multi-domain
      requirements reduce false positives. PM20D1 branch needs validation.'
- id: file:rules/arba/ARBA00047244/ARBA00047244.enriched.json
  title: Enriched rule JSON with condition sets and annotations
  findings:
  - statement: Rule created 2024-08-14, modified 2025-03-21. Four condition sets targeting
      urease domains, PM20D1 FunFams, and plant ureide pathway hydrolases. Predicts
      GO:0043605 (amide catabolic process). Currently annotates 0 proteins (0 reviewed,
      0 unreviewed).
supported_by:
- reference_id: file:rules/arba/ARBA00047244/ARBA00047244-deep-research-perplexity.md
  supporting_text: The UniProt rule ARBA00047244 presents a mixed picture of annotation
    accuracy and appropriateness. The rule correctly identifies that proteins matching
    these signatures do possess amide catabolic activity, as all condition sets target
    enzymes known to catalyze the hydrolysis of amide bonds