# GENERATED FILE - do not edit by hand.
# Source of truth: projects/NCBIFam/ncbifam2go.sssom.yaml
# Regenerate: uv run python projects/NCBIFam/sssom_to_terms.py projects/NCBIFam/ncbifam2go.sssom.yaml -o projects/NCBIFam/ncbifam2go.terms.yaml
# Validate:   uv run linkml-term-validator validate-data projects/NCBIFam/ncbifam2go.terms.yaml -s src/ai_gene_review/schema/ncbifam_go_mapping.yaml -t NCBIFAMGOMappingSet --labels -c conf/oak_config.yaml
mappings:
- subject:
    id: NCBIFAM:NF009803
    label: formamidase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0004328
    label: formamidase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge supported: hmm_PGAP gives EC 3.5.1.49 and ec2go maps EC 3.5.1.49 -> exactly GO:0004328.
    equivalog family. Propagation gain: gain_rev=0 (27/27 reviewed already have it) but gain_all=54 --
    the gap is entirely in unreviewed TrEMBL entries carrying NF009803.'
- subject:
    id: NCBIFAM:NF005824
    label: acetolactate synthase large subunit (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0003984
    label: acetolactate synthase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: The large (catalytic) subunit carries the activity. equivalog. gain_rev=0 (1/1), gain_all=6.
    Clean propagation case -- nearly fully covered already, included as an exactMatch exemplar.
- subject:
    id: NCBIFAM:NF045700
    label: AttM family quorum-quenching N-acyl homoserine lactonase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0102007
    label: acyl-L-homoserine-lactone lactonohydrolase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge supported: EC 3.1.1.81 -> GO:0102007 in ec2go. equivalog, PMID:11930013. Quorum
    quenching. gain_rev=0 (4/4), gain_all=115 -- large TrEMBL gap.'
- subject:
    id: NCBIFAM:TIGR03230
    label: lipoprotein lipase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0004465
    label: lipoprotein lipase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge supported: EC 3.1.1.34 -> GO:0004465 in ec2go. equivalog, PMID:8308035. gain_rev=0
    (11/11), gain_all=9.'
- subject:
    id: NCBIFAM:NF033545
    label: IS630 family transposase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0004803
    label: transposase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'equivalog. The single largest propagation gap in the scoping sample: 18874 of the 18881
    entries carrying NF033545 lack GO:0004803 or any descendant (gain_all=18874), and gain_rev=2. Mobile-element
    families are precisely where InterPro integration lags, so NCBIFAM''s GO assignment goes unpropagated.'
- subject:
    id: NCBIFAM:NF041162
    label: family 2A encapsulin nanocompartment shell protein (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0140737
    label: encapsulin nanocompartment
  mapping_justification: semapv:ManualMappingCuration
  comment: 'Cellular-component mapping: the shell protein localises to / constitutes the encapsulin nanocompartment.
    equivalog, PMID:25024436,34362927,35146412. gain_all=897 of 901, gain_rev=0.'
- subject:
    id: NCBIFAM:NF042963
    label: anti-phage-associated DUF1156 domain-containing protein (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0051607
    label: defense response to virus
  mapping_justification: semapv:ManualMappingCuration
  comment: Biological-process mapping for an anti-phage defense family. equivalog, PMID:32855333. gain_all=151
    of 151 (0 carry the term), gain_rev=0 -- a brand-new defense family with no GO propagation at all;
    a clear gap-fill / proposed-annotation candidate.
- subject:
    id: NCBIFAM:NF000320
    label: PEN family class A beta-lactamase, Bcc-type (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0008800
    label: beta-lactamase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 3.5.2.6 -> GO:0008800 in ec2go. AMR-relevant. equivalog, PMID:19075063,9371340.
    gain_all=0 of 45 (already fully covered in this reviewed-heavy family), included as a clean AMR exemplar;
    cf. NF033105 (a different beta-lactamase family) which maps to the same GO term.'
- subject:
    id: NCBIFAM:NF033105
    label: subclass B3 metallo-beta-lactamase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0008800
    label: beta-lactamase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 3.5.2.6 -> GO:0008800. AMR-relevant (carbapenem-hydrolysing metallo class).
    equivalog. gain_all=165 of 1170 -- a substantial TrEMBL gap. Two distinct NCBIFAM families (NF000320
    serine class A, NF033105 metallo B3) legitimately share one GO term: family -> GO is many-to-one,
    the structural analog of RHEA''s many-reactions-to-one-activity finding.'
- subject:
    id: NCBIFAM:NF002525
    label: D-alanine--D-alanine ligase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0008716
    label: D-alanine-D-alanine ligase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 6.3.2.4 -> GO:0008716. Peptidoglycan biosynthesis (Ddl, a vancomycin-resistance
    locus). equivalog. gain_all=3 of 777, gain_rev=0.'
- subject:
    id: NCBIFAM:NF003009
    label: 5'-deoxynucleotidase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0002953
    label: 5'-deoxynucleotidase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 3.1.3.89 -> GO:0002953. equivalog. gain_all=170 of 1714, gain_rev=0.'
- subject:
    id: NCBIFAM:NF004018
    label: uridine kinase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0004849
    label: uridine kinase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 2.7.1.48 -> GO:0004849. Pyrimidine salvage. equivalog. gain_all=329 of 15080
    -- the largest TrEMBL gap among the enzyme rows in this batch; gain_rev=0.'
- subject:
    id: NCBIFAM:NF006707
    label: class I fructose-bisphosphate aldolase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0004332
    label: fructose-bisphosphate aldolase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 4.1.2.13 -> GO:0004332. Glycolysis. equivalog. gain_all=4 of 1476, gain_rev=0.'
- subject:
    id: NCBIFAM:NF007054
    label: alpha-amylase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0004556
    label: alpha-amylase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 3.2.1.1 -> GO:0004556. equivalog. gain_all=1 of 37, gain_rev=0.'
- subject:
    id: NCBIFAM:NF011000
    label: acylphosphatase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0003998
    label: acylphosphatase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 3.6.1.7 -> GO:0003998. equivalog. gain_all=4 of 1135, gain_rev=0.'
- subject:
    id: NCBIFAM:NF040791
    label: glycerate 2-kinase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0043798
    label: glycerate 2-kinase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 2.7.1.165 -> GO:0043798. equivalog, PMID:16684110. gain_all=6 of 7 -- small
    but nearly total gap for this family.'
- subject:
    id: NCBIFAM:NF045654
    label: acid phosphatase PhoC (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0003993
    label: acid phosphatase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 3.1.3.2 -> GO:0003993. equivalog, PMID:10877772,8081499. gain_all=1 of 80.'
- subject:
    id: NCBIFAM:TIGR02321
    label: phosphonopyruvate hydrolase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0033978
    label: phosphonopyruvate hydrolase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 3.11.1.3 -> GO:0033978. Phosphonate catabolism. equivalog, PMID:12697754. gain_all=71
    of 137, gain_rev=0.'
- subject:
    id: NCBIFAM:TIGR02694
    label: arsenate reductase (azurin) small subunit (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0050611
    label: arsenate reductase (azurin) activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 1.20.9.1 -> GO:0050611. Arsenic detoxification/respiration. equivalog, PMID:12679550.
    gain_all=144 of 355, and gain_rev=2 -- reviewed entries are missing it too.'
- subject:
    id: NCBIFAM:TIGR03828
    label: 1-phosphofructokinase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0008662
    label: 1-phosphofructokinase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 2.7.1.56 -> GO:0008662. Fructose catabolism. equivalog. gain_all=16 of 5974,
    gain_rev=0.'
- subject:
    id: NCBIFAM:NF001277
    label: adenosylcobinamide-GDP ribazoletransferase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0051073
    label: adenosylcobinamide-GDP ribazoletransferase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: 'EC-bridge: EC 2.7.8.26 -> GO:0051073. Cobalamin (B12) biosynthesis (CobS). equivalog. gain_all=4
    of 1293, gain_rev=0.'
- subject:
    id: NCBIFAM:NF002326
    label: deoxyguanosinetriphosphate triphosphohydrolase / dGTPase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0008832
    label: dGTPase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: Our proposed term. NCBI's go_terms gave only the broad parent GO:0016793 (triphosphoric monoester
    hydrolase activity); the specific GO:0008832 (dGTPase activity) is correct (product name = dGTPase,
    EC 3.1.5.1; ec2go EC 3.1.5.1 -> GO:0008832). equivalog. Against the specific term gain_all=456 of
    3719 and gain_rev=13 -- a substantial gap among reviewed entries that the broad parent (gain~0) hid entirely.
- subject:
    id: NCBIFAM:NF005804
    label: enoyl-CoA hydratase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0004300
    label: enoyl-CoA hydratase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: Our proposed term. NCBI assigned the ontology near-root GO:0003824 (catalytic activity) to
    this EC 4.2.1.17 enzyme; ec2go maps EC 4.2.1.17 -> GO:0004300 directly. equivalog. Against the specific
    term gain_all=184 of 485, gain_rev=1 (the GO:0003824 parent gain was ~0 -- altitude masking).
- subject:
    id: NCBIFAM:NF006559
    label: dihydroorotase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0004151
    label: dihydroorotase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: Our proposed term. NCBI assigned the broad parent GO:0016810; the specific GO:0004151 is the
    ec2go target of EC 3.5.2.3. Pyrimidine biosynthesis (PyrC). equivalog. Against the specific term gain_all=491
    of 1354 (vs 4 for the broad parent), gain_rev=0.
- subject:
    id: NCBIFAM:TIGR00417
    label: spermidine synthase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0004766
    label: spermidine synthase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: Our proposed term. NCBI assigned the ontology near-root GO:0003824 (catalytic activity) to
    this EC 2.5.1.16 enzyme; ec2go maps EC 2.5.1.16 -> GO:0004766 directly. Polyamine biosynthesis. equivalog.
    Against the specific term gain_all=575 of 9260 and gain_rev=1 (the GO:0003824 parent gain was ~0).
- subject:
    id: NCBIFAM:TIGR03542
    label: LL-diaminopimelate aminotransferase (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0010285
    label: L,L-diaminopimelate:2-oxoglutarate transaminase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: Our proposed term. NCBI assigned the broad class GO:0008483 (transaminase activity); the specific
    GO:0010285 is the ec2go target of EC 2.6.1.83. Lysine/peptidoglycan biosynthesis. equivalog, PMID:17093042,17583737.
    Against the specific term gain_all=1185 of 2649 (vs 77 for the broad class), gain_rev=2.
- subject:
    id: NCBIFAM:TIGR02791
    label: P-type DNA transfer protein VirB5 (equivalog)
  predicate:
    id: skos:broadMatch
    label: broad match
  object:
    id: GO:0043684
    label: type IV secretion system complex
  mapping_justification: semapv:ManualMappingCuration
  comment: VirB5 is a minor pilus subunit and a component (part_of) of the T4SS; the GO term denotes the whole complex,
    so the relation is part_of (expressed as broadMatch), not exact, and there is no VirB5-specific CC term to propose.
    equivalog, PMID:12855161,14673074,15901731. gain_all=298 of 298 (0 carry it), gain_rev=4.
- subject:
    id: NCBIFAM:TIGR00439
    label: permease-like cell division protein FtsX (equivalog)
  predicate:
    id: skos:exactMatch
    label: exact match
  object:
    id: GO:0051301
    label: cell division
  mapping_justification: semapv:ManualMappingCuration
  comment: 'Our proposed term, chosen against NCBI''s GO:0000910 (cytokinesis) on empirical and ontological
    grounds. Ontology: GO:0000910 cytokinesis is part_of GO:0051301 cell division, so NCBI''s term is
    actually NARROWER, not broader. Empirically all 7 reviewed FtsX proteins carry GO:0051301 (cell division)
    but only 2/7 carry cytokinesis or GO:0043093 (FtsZ-dependent cytokinesis). FtsX/FtsEX regulates septal
    peptidoglycan hydrolysis and divisome assembly, so the conservative participation term (cell division)
    is the curator consensus. gain_all=22 of 3306 (already near-universal -- a confirmatory, low-gap mapping).
    NOTE: mapping to the specific GO:0043093 instead would show a huge apparent gain (3304) but curators
    apply it to only 2/7 entries, so blanket propagation would be OVER-ANNOTATION -- we deliberately keep
    the safe consensus term. This is the opposite call from the five altitude rows above, where the specific
    term WAS the correct, universally applicable one.'
