# ARO -> GO mapping (SSSOM)
#
# CARD/ARO carries no native GO cross-references (verified against aro.owl: hasDbXref
# namespaces are ARO/PMID/PubChem/ChEBI/ChEMBL/CAS/PDB/ISBN/RO/DOID, zero GO). This file is the
# hand-curated ARO->GO bridge described in projects/ANTIMICROBIAL_RESISTANCE.md, seeded from the
# macrolide phosphotransferase (MPH) family.
#
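# Every entity is given as a CURIE + label pair (subject_id/subject_label,
# predicate_id/predicate_label, object_id/object_label) so the rows are self-describing. The
# no-match rows at the end of the file are the exception: their object is sssom:NoTermFound, so
# they carry no predicate or object label.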
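#
# Row template for a new mapping (illustrative only; the angle-bracket placeholders are
# hypothetical and must be replaced with ids and labels verified against aro.owl and QuickGO):
#   - subject_id: ARO:<7-digit id>
#     subject_label: <ARO label>
#     predicate_id: RO:0002327
#     predicate_label: enables
#     object_id: GO:<7-digit id>
#     object_label: <GO label>
#     mapping_justification: semapv:ManualMappingCuration
#     comment: >-
#       <why this predicate and GO term were chosen>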
#
# Validate with linkml-validate (ltv) -- either via the just recipe:
#   just validate-mappings
# or directly (schema ships with the sssom_schema pip package):
#   uv run linkml-validate \
#     -s "$(uv run python -c 'import sssom_schema,os;print(os.path.join(os.path.dirname(sssom_schema.__file__),"schema","sssom_schema.yaml"))')" \
#     -C "mapping set" projects/ANTIMICROBIAL_RESISTANCE/aro2go.sssom.yaml

mapping_set_id: https://w3id.org/ai4curation/ai-gene-review/mappings/aro2go
mapping_set_title: ARO to GO mapping (AMR gene families and resistance mechanisms)
mapping_set_description: >-
  Curated mappings from Antibiotic Resistance Ontology (ARO / CARD) determinant, AMR-gene-family,
  and resistance-mechanism terms to Gene Ontology molecular-function and biological-process terms.
  CARD provides no native ARO->GO cross-references, so these are manually asserted to seed and
  quality-check GO annotations of AMR genes (e.g. MphA Q47396 / ARO:3000316, MphB A0A0H3EUF3 /
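  ARO:3000318). Coverage: the macrolide phosphotransferase (MPH) family plus beta-lactamase,
  chloramphenicol acetyltransferase, aminoglycoside-modifying enzymes (APH/AAC/ANT), Erm rRNA
  methyltransferases, trimethoprim-resistant DHFR and sulfonamide-resistant DHPS, FosA-type
  glutathione transferases, colistin-resistance phosphoethanolamine transferases, G1405 16S rRNA
  methyltransferases, and the six ARO resistance-mechanism classes, plus nine families recorded as
  gaps with no suitable GO term. Every ARO id is verified against aro.owl; every GO id/label against
  QuickGO. Gene/family -> MF mappings use 'enables' (RO:0002327), or skos:relatedMatch where only a
  broader GO term exists; mechanism -> BP mappings use skos:relatedMatch; gaps use sssom:NoTermFound.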
license: https://creativecommons.org/licenses/by/4.0/
creator_label:
- AI Gene Review project
mapping_date: "2026-06-12"
curie_map:
  ARO: http://purl.obolibrary.org/obo/ARO_
  GO: http://purl.obolibrary.org/obo/GO_
  RO: http://purl.obolibrary.org/obo/RO_
  skos: http://www.w3.org/2004/02/skos/core#
  semapv: https://w3id.org/semapv/vocab/
  sssom: https://w3id.org/sssom/
  obo: http://purl.obolibrary.org/obo/

mappings:
# --- Determinant (gene) -> molecular function ---
- subject_id: ARO:3000316
  subject_label: mphA
  predicate_id: RO:0002327
  predicate_label: enables
  object_id: GO:0050073
  object_label: macrolide 2'-kinase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    mph(A) / macrolide 2'-phosphotransferase I enables the GO macrolide 2'-kinase activity (the
    determinant is not a GO term, so this is an enables relation, not a term equivalence).

- subject_id: ARO:3000318
  subject_label: mphB
  predicate_id: RO:0002327
  predicate_label: enables
  object_id: GO:0050073
  object_label: macrolide 2'-kinase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    mph(B) / macrolide 2'-phosphotransferase II enables the same GO MF; substrate-scope differences
    (16-membered macrolides) are finer than the GO term and recorded separately as ChEBI substrates.

# --- AMR gene family -> molecular function ---
- subject_id: ARO:3000333
  subject_label: macrolide phosphotransferase (MPH)
  predicate_id: RO:0002327
  predicate_label: enables
  object_id: GO:0050073
  object_label: macrolide 2'-kinase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    Family-level mapping: every member of the ARO macrolide phosphotransferase (MPH) family is a
    macrolide 2'-kinase. Useful for propagating a candidate GO MF to other MPH members (mphC/E/G).

# --- Resistance mechanism -> biological process ---
- subject_id: ARO:0001004
  subject_label: antibiotic inactivation
  predicate_id: skos:relatedMatch
  predicate_label: related match
  object_id: GO:0046677
  object_label: response to antibiotic
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    Mechanism axis (ARO 'antibiotic inactivation') relates to the GO biological process 'response to
    antibiotic'; relatedMatch (not exact) because the two terms sit on different axes. This row is the
    mechanism->BP QC prior: an inactivation-mechanism determinant should carry GO:0046677.

# --- Other resistance mechanisms -> biological process ---
# All ARO mechanism-axis terms relate to GO:0046677 'response to antibiotic' (relatedMatch; the BP an
# AMR determinant participates in, regardless of the specific mechanism). Verified ARO ids from aro.owl.
- subject_id: ARO:0010000
  subject_label: antibiotic efflux
  predicate_id: skos:relatedMatch
  predicate_label: related match
  object_id: GO:0046677
  object_label: response to antibiotic
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    Efflux determinants additionally have a transmembrane-transporter MF, but GO lacks a specific
    'antibiotic/efflux transmembrane transporter activity' term, so only the BP prior is asserted here.

- subject_id: ARO:0001001
  subject_label: antibiotic target alteration
  predicate_id: skos:relatedMatch
  predicate_label: related match
  object_id: GO:0046677
  object_label: response to antibiotic
  mapping_justification: semapv:ManualMappingCuration

- subject_id: ARO:0001003
  subject_label: antibiotic target protection
  predicate_id: skos:relatedMatch
  predicate_label: related match
  object_id: GO:0046677
  object_label: response to antibiotic
  mapping_justification: semapv:ManualMappingCuration

- subject_id: ARO:0001002
  subject_label: antibiotic target replacement
  predicate_id: skos:relatedMatch
  predicate_label: related match
  object_id: GO:0046677
  object_label: response to antibiotic
  mapping_justification: semapv:ManualMappingCuration

- subject_id: ARO:3000244
  subject_label: reduced permeability to antibiotic
  predicate_id: skos:relatedMatch
  predicate_label: related match
  object_id: GO:0046677
  object_label: response to antibiotic
  mapping_justification: semapv:ManualMappingCuration

# --- Other AMR gene families -> molecular function (enables, or relatedMatch) ---
# Family-level mappings. With 'enables', every member of the ARO family carries the named GO MF, which
# is useful for propagating a candidate GO MF to family members. Rows that use skos:relatedMatch
# instead explain the choice in their comment. Verified: ARO ids from aro.owl, GO ids/labels from QuickGO.
- subject_id: ARO:3000001
  subject_label: beta-lactamase
  predicate_id: RO:0002327
  predicate_label: enables
  object_id: GO:0008800
  object_label: beta-lactamase activity
  mapping_justification: semapv:ManualMappingCuration

- subject_id: ARO:3000122
  subject_label: chloramphenicol acetyltransferase (CAT)
  predicate_id: RO:0002327
  predicate_label: enables
  object_id: GO:0008811
  object_label: chloramphenicol O-acetyltransferase activity
  mapping_justification: semapv:ManualMappingCuration

- subject_id: ARO:3000114
  subject_label: aminoglycoside phosphotransferase (APH)
  predicate_id: RO:0002327
  predicate_label: enables
  object_id: GO:0034071
  object_label: aminoglycoside phosphotransferase activity
  mapping_justification: semapv:ManualMappingCuration

- subject_id: ARO:3000121
  subject_label: aminoglycoside acetyltransferase (AAC)
  predicate_id: RO:0002327
  predicate_label: enables
  object_id: GO:0034069
  object_label: aminoglycoside N-acetyltransferase activity
  mapping_justification: semapv:ManualMappingCuration

- subject_id: ARO:3000218
  subject_label: aminoglycoside nucleotidyltransferase (ANT)
  predicate_id: RO:0002327
  predicate_label: enables
  object_id: GO:0034068
  object_label: aminoglycoside nucleotidyltransferase activity
  mapping_justification: semapv:ManualMappingCuration

- subject_id: ARO:3000560
  subject_label: Erm 23S ribosomal RNA methyltransferase
  predicate_id: RO:0002327
  predicate_label: enables
  object_id: GO:0008988
  object_label: rRNA (adenine-N6-)-methyltransferase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    Family-safe term: all Erm enzymes methylate N6 of adenine A2058 in 23S rRNA, so every member
    enables GO:0008988 (rRNA adenine-N6-methyltransferase). The N6,N6-dimethyltransferase term
    GO:0052910 (a descendant of GO:0008988) is correct only for the di-methylating majority and would
    over-annotate mono-methylating Erm variants, so it is applied at the allele level, not the family node.

- subject_id: ARO:3001218
  subject_label: trimethoprim resistant dihydrofolate reductase dfr
  predicate_id: RO:0002327
  predicate_label: enables
  object_id: GO:0004146
  object_label: dihydrofolate reductase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    Target-replacement resistance: a drug-insensitive Dfr variant that retains dihydrofolate reductase
    activity (the resistance is to trimethoprim, not loss of the GO MF).

- subject_id: ARO:3003415
  subject_label: sulfonamide resistant dihydropteroate synthase folP
  predicate_id: RO:0002327
  predicate_label: enables
  object_id: GO:0004156
  object_label: dihydropteroate synthase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    Target-replacement resistance: a sulfonamide-insensitive Sul/FolP variant that retains dihydropteroate
    synthase activity.

- subject_id: ARO:3000149
  subject_label: FosA
  predicate_id: skos:relatedMatch
  predicate_label: related match
  object_id: GO:0004364
  object_label: glutathione transferase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    Re-anchored from the heterogeneous parent ARO:3000133 'fosfomycin thiol transferase' to the
    glutathione-specific FosA: FosA adds glutathione to fosfomycin's epoxide and genuinely has GST
    activity. The parent node also contains FosB (bacillithiol/L-cysteine thiol transferase, NOT
    glutathione) and FosX (an epoxide hydrolase), for which GO:0004364 would be wrong. relatedMatch
    because GO:0004364 is the general GST activity (no fosfomycin-specific term; GO:0034797/0034798 are
    obsolete). The other FosA-type GSTs (FosA2/FosA3/fosA5) are ARO siblings (not descendants) and are
    mapped separately below; FosB/FosX are deliberately excluded.

- subject_id: ARO:3002804
  subject_label: FosA2
  predicate_id: skos:relatedMatch
  predicate_label: related match
  object_id: GO:0004364
  object_label: glutathione transferase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    FosA-type Mn2+/K+-dependent glutathione S-transferase (chromosomal FosA of Enterobacter cloacae);
    same fosfomycin-GSH conjugation activity as FosA. relatedMatch to the general GST term.

- subject_id: ARO:3002872
  subject_label: FosA3
  predicate_id: skos:relatedMatch
  predicate_label: related match
  object_id: GO:0004364
  object_label: glutathione transferase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    FosA-type glutathione S-transferase; the most widespread plasmid-borne FosA in ESBL-producing
    Enterobacterales. relatedMatch to the general GST term.

- subject_id: ARO:3003209
  subject_label: fosA5
  predicate_id: skos:relatedMatch
  predicate_label: related match
  object_id: GO:0004364
  object_label: glutathione transferase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    FosA-type glutathione S-transferase (K. pneumoniae origin); same fosfomycin-GSH conjugation
    activity. relatedMatch to the general GST term.

- subject_id: ARO:3004112
  subject_label: phosphoethanolamine transferase conferring colistin resistance
  predicate_id: RO:0002327
  predicate_label: enables
  object_id: GO:0043838
  object_label: phosphatidylethanolamine:Kdo2-lipid A phosphoethanolamine transferase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    Resistance to colistin (a last-resort polymyxin): MCR/EptA-type enzymes transfer phosphoethanolamine
    onto lipid A, reducing the net negative charge that polymyxins bind. This ARO family node covers the
    mobile MCR alleles plus pmr and intrinsic chromosomal phosphoethanolamine transferases (>110
    descendant ARO terms), all of which enable this exact GO MF.

- subject_id: ARO:3004271
  subject_label: 16S rRNA methyltransferase (G1405)
  predicate_id: skos:relatedMatch
  predicate_label: related match
  object_id: GO:0070043
  object_label: rRNA (guanine-N7-)-methyltransferase activity
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    High-level aminoglycoside resistance: ArmA/RmtA-H-type 16S rRNA methyltransferases methylate N7 of
    G1405 in the aminoglycoside binding site. relatedMatch (not enables) because GO:0070043 is the general
    rRNA guanine-N7-methyltransferase activity, which also covers housekeeping enzymes such as RsmG. The
    G1405-specific resistance activity is narrower than this term, so GO:0070043 can still be propagated
    to family members as a broader MF.

# --- Gaps: ARO families with NO suitable GO molecular-function term ---
# Recorded with the SSSOM no-match convention (object_id: sssom:NoTermFound, object_source: obo:go.owl)
# so the gap lives in the mapping file itself rather than only in prose. These are candidates for GO
# new-term requests. predicate_id is the relation we WOULD use (skos:exactMatch) had a term existed.
- subject_id: ARO:3000221
  subject_label: lincosamide nucleotidyltransferase (LNU)
  predicate_id: skos:exactMatch
  object_id: sssom:NoTermFound
  object_source: obo:go.owl
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    GO has no lincosamide (O-)nucleotidyltransferase MF term (Lnu adenylylates lincosamides). GO term
    request candidate.
- subject_id: ARO:3000453
  subject_label: streptogramin vat acetyltransferase
  predicate_id: skos:exactMatch
  object_id: sssom:NoTermFound
  object_source: obo:go.owl
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    GO has no streptogramin A O-acetyltransferase MF term (Vat). GO term request candidate.
- subject_id: ARO:3000376
  subject_label: streptogramin vgb lyase
  predicate_id: skos:exactMatch
  object_id: sssom:NoTermFound
  object_source: obo:go.owl
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    GO has no streptogramin B (virginiamycin B) lyase MF term (Vgb). GO term request candidate.
- subject_id: ARO:3000320
  subject_label: macrolide esterase
  predicate_id: skos:exactMatch
  object_id: sssom:NoTermFound
  object_source: obo:go.owl
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    GO has no macrolide/erythromycin esterase MF term (Ere). GO term request candidate.
- subject_id: ARO:3000390
  subject_label: rifampin ADP-ribosyltransferase (Arr)
  predicate_id: skos:exactMatch
  object_id: sssom:NoTermFound
  object_source: obo:go.owl
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    GO's nearest term GO:0003950 is NAD+ poly-ADP-ribosyltransferase; Arr is a mono-ADP-ribosyltransferase
    acting on rifampin. No suitable mono term. GO term request candidate.
- subject_id: ARO:3000445
  subject_label: rifampin monooxygenase
  predicate_id: skos:exactMatch
  object_id: sssom:NoTermFound
  object_source: obo:go.owl
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    GO has no rifampin/rifamycin monooxygenase MF term (only the general GO:0004497 monooxygenase activity).
    GO term request candidate.
- subject_id: ARO:3000036
  subject_label: tetracycline inactivation enzyme
  predicate_id: skos:exactMatch
  object_id: sssom:NoTermFound
  object_source: obo:go.owl
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    GO has no Tet(X)/tetracycline-destructase (flavin-dependent tetracycline monooxygenase) MF term.
    GO term request candidate.
- subject_id: ARO:3000202
  subject_label: Cfr 23S ribosomal RNA methyltransferase
  predicate_id: skos:exactMatch
  object_id: sssom:NoTermFound
  object_source: obo:go.owl
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    GO:0070040 is the C2 (RlmN housekeeping) 23S rRNA methyltransferase; Cfr methylates C8 of A2503. No
    C8-specific GO term. GO term request candidate.
- subject_id: ARO:3002978
  subject_label: D-Ala-D-Lac ligase
  predicate_id: skos:exactMatch
  object_id: sssom:NoTermFound
  object_source: obo:go.owl
  mapping_justification: semapv:ManualMappingCuration
  comment: >-
    VanA/B-type ligase makes D-Ala-D-Lactate; GO:0008716 is D-Ala-D-Ala ligase (wrong product). No
    D-alanine-D-lactate ligase GO term. GO term request candidate.
