Pfam2GO vs InterPro2GO: Precision-Gap Analysis Results

Auto-generated by analyze_pfam_go_gaps.py. Do not edit by hand; re-run the script to refresh. See parent project.

Provenance

Headline

Of 9,871 pfam2go assertions on integrated families, 9,844 (99.7%) are byte-identical to a GO id already on the parent InterPro entry, and none is more specific. As its own header states, pfam2go is generated from InterPro2GO, so it provides no precision gain over InterPro2GO.
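To make the notion of a pfam2go "assertion" concrete, here is a minimal parsing sketch. It assumes the usual external2go line shape (`Pfam:<accession> <name> > GO:<term name> ; GO:<id>`, with comment lines starting with `!`). The function name `parse_pfam2go`, the sample comment line, and the family name `example_name` are hypothetical; this is not the code used by analyze_pfam_go_gaps.py.

```python
import re

# Assumed line shape: "Pfam:PF00000 some_name > GO:<term name> ; GO:0000000"
LINE_RE = re.compile(r"^Pfam:(PF\d{5})\s+\S+\s+>\s+GO:(.+?)\s+;\s+(GO:\d{7})$")


def parse_pfam2go(lines):
    """Return (header_comments, assertions) from pfam2go-style lines.

    Each assertion is a (pfam_accession, go_id, go_term_name) tuple.
    """
    header, assertions = [], []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("!"):
            # Comment lines carry the file's own provenance statement.
            header.append(line.lstrip("!").strip())
            continue
        match = LINE_RE.match(line)
        if match:
            pfam, name, go_id = match.groups()
            assertions.append((pfam, go_id, name))
    return header, assertions


# Illustrative input only: the comment line and "example_name" are made up.
SAMPLE = """\
! illustrative comment line, not the real file header
Pfam:PF08214 example_name > GO:histone acetyltransferase activity ; GO:0004402
Pfam:PF08214 example_name > GO:regulation of DNA-templated transcription ; GO:0006355
"""

header, assertions = parse_pfam2go(SAMPLE.splitlines())
```

Lines that do not match the assumed shape are silently skipped here; a real run would probably want to count or log them.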

Coverage

Classification of pfam2go assertions for integrated families

Each pfam2go (Pfam, GO) assertion compared to the GO terms of the Pfam family's parent InterPro entry:

Category | Assertions | Meaning
SAME | 9,844 | identical GO id already on the InterPro entry
MORE_SPECIFIC | 0 | GO descendant of an InterPro-entry term (would be a precision gain)
MORE_GENERAL | 1 | GO ancestor of an InterPro-entry term (Pfam less specific)
DISJOINT_PARENT_HAS_GO | 1 | unrelated to the entry's terms, entry does have GO (genuine difference)
DISJOINT_PARENT_NO_GO | 25 | parent entry has no GO at all (InterPro release-skew artifact)
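The five categories above can be sketched as a small decision function. This is an illustration under stated assumptions, not the script's actual logic: the names `ancestors` and `classify`, the `parents` map (child GO id to its direct parent ids), and the order in which the checks are applied are all hypothetical. The toy graph treats GO:0010484 as a direct child of GO:0004402, whereas the report only states that it is a descendant.

```python
def ancestors(term, parents):
    """All strict ancestors of `term`, given a child -> set-of-parents map."""
    seen, stack = set(), list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def classify(pfam_go, entry_gos, parents):
    """Classify one pfam2go term against the parent InterPro entry's GO terms."""
    if not entry_gos:
        return "DISJOINT_PARENT_NO_GO"
    if pfam_go in entry_gos:
        return "SAME"
    # The Pfam term is a descendant of some entry term.
    if ancestors(pfam_go, parents) & entry_gos:
        return "MORE_SPECIFIC"
    # The Pfam term is an ancestor of some entry term.
    if any(pfam_go in ancestors(go, parents) for go in entry_gos):
        return "MORE_GENERAL"
    return "DISJOINT_PARENT_HAS_GO"


# Toy GO fragment (assumed direct edge, see the note above).
PARENTS = {"GO:0010484": {"GO:0004402"}}
ENTRY = {"GO:0010484"}  # toy stand-in for the entry's GO terms

labels = [
    classify("GO:0010484", ENTRY, PARENTS),
    classify("GO:0004402", ENTRY, PARENTS),
    classify("GO:0006355", ENTRY, PARENTS),
    classify("GO:0006741", set(), PARENTS),
    classify("GO:0010484", {"GO:0004402"}, PARENTS),
]
```

A production version would load the ancestor relation from a GO release rather than a hand-written map, and would need to decide which GO relation types count as ancestry.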

Examples — increased precision (MORE_SPECIFIC)

Pfam family maps to a GO term strictly more specific than anything on its parent InterPro entry. These would be the gap-filling candidates.

None found — there is no case where pfam2go is more specific than its parent InterPro entry.

Examples — Pfam less specific than InterPro (MORE_GENERAL)

The reverse of the hypothesis: InterPro2GO carries the more precise term.

Pfam | InterPro | Pfam GO (general) | InterPro has descendant
PF08214 | IPR016849 | GO:0004402 histone acetyltransferase activity | GO:0010484 histone H3 acetyltransferase activity

Examples — genuine disjoint difference (DISJOINT_PARENT_HAS_GO)

Parent InterPro entry has GO terms, but the pfam2go term is neither an ancestor nor a descendant of any of them. Such cases are typically a generic process term on the Pfam side paired with a more specific InterPro molecular-function term for the same family.

Pfam | InterPro | Pfam GO (unrelated to entry's terms)
PF08214 | IPR016849 | GO:0006355 regulation of DNA-templated transcription

Examples — release-skew disjointness (DISJOINT_PARENT_NO_GO)

Parent InterPro entry has no GO terms at all, typically because the entry was created after the GO snapshot was taken. pfam2go merely retains the older term; this is not a precision gain, but a small recall advantage until InterPro2GO catches up.

Pfam | InterPro (no GO yet) | Pfam GO (retained)
PF01513 | IPR064509 | GO:0006741 NADP+ biosynthetic process
PF02346 | IPR063475 | GO:0019031 viral envelope
PF02346 | IPR063475 | GO:0019064 fusion of virus membrane with host plasma membrane
PF03588 | IPR063612 | GO:0008914 leucyl-tRNA--protein transferase activity
PF03588 | IPR063612 | GO:0030163 protein catabolic process
PF04258 | IPR064098 | GO:0016020 membrane
PF04258 | IPR064098 | GO:0042500 aspartic endopeptidase activity, intramembrane cleaving
PF04350 | IPR061922 | GO:0043107 type IV pilus-dependent motility
PF04350 | IPR061922 | GO:0043683 type IV pilus assembly
PF04612 | IPR061921 | GO:0015627 type II protein secretion system complex
PF04612 | IPR061921 | GO:0015628 protein secretion by the type II secretion system
PF06213 | IPR063680 | GO:0009236 cobalamin biosynthetic process
PF08551 | IPR061914 | GO:0006890 retrograde vesicle-mediated transport, Golgi to endoplasmic reticulum
PF08551 | IPR061914 | GO:0016020 membrane
PF10156 | IPR063477 | GO:0003712 transcription coregulator activity
PF10156 | IPR063477 | GO:0006357 regulation of transcription by RNA polymerase II
PF10156 | IPR063477 | GO:0016592 mediator complex
PF10272 | IPR063936 | GO:0016020 membrane
PF10272 | IPR063936 | GO:0016567 protein ubiquitination
PF10272 | IPR063936 | GO:0061630 ubiquitin protein ligase activity
PF10995 | IPR063588 | GO:0035438 cyclic-di-GMP binding
PF17659 | IPR061920 | GO:0003697 single-stranded DNA binding
PF17825 | IPR063559 | GO:0000712 resolution of meiotic recombination intermediates
PF17825 | IPR063559 | GO:0000794 condensed nuclear chromosome
PF17825 | IPR063559 | GO:0016887 ATP hydrolysis activity

Examples — unintegrated Pfam families with GO terms

These Pfam families carry pfam2go terms but are not part of any InterPro entry, so InterPro2GO has no equivalent.

Pfam | Pfam GO
PF04715 | GO:0009058 biosynthetic process
PF06009 | GO:0007155 cell adhesion
PF13929 | GO:0048255 mRNA stabilization
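Finding such families amounts to a lookup against the Pfam-to-InterPro integration map. The sketch below assumes two in-memory dictionaries whose names (`pfam2go`, `pfam_to_interpro`) and the function `unintegrated_with_go` are hypothetical; the toy data reuses accessions that appear in this report.

```python
def unintegrated_with_go(pfam2go, pfam_to_interpro):
    """Pfam families that carry GO terms but have no parent InterPro entry."""
    return {
        pfam: sorted(go_ids)
        for pfam, go_ids in pfam2go.items()
        if pfam not in pfam_to_interpro
    }


# Toy inputs: PF08214 is integrated into IPR016849, PF04715 is not integrated.
pfam2go = {
    "PF08214": {"GO:0004402", "GO:0006355"},
    "PF04715": {"GO:0009058"},
}
pfam_to_interpro = {"PF08214": "IPR016849"}

result = unintegrated_with_go(pfam2go, pfam_to_interpro)
```

Only the unintegrated family is kept, since it is the only one without an InterPro2GO equivalent.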

Output files