Philosophy

Why does ai-blame exist, and what problems does it solve?

The Problem: AI-Assisted Curation at Scale

AI agents are increasingly being used to assist with knowledge base curation, documentation maintenance, and code generation. Tools like Claude Code can create and modify dozens of files in a single session.

But this raises a critical question: who made this change?

Traditional version control (git) tells you what changed and when, but in an AI-assisted workflow, the commit author is a human even when an AI made the actual edit. This creates an attribution gap:

$ git log --pretty=format:"%h %s (%an)"
a1b2c3d Add disease definitions (Alice)
d4e5f6a Update phenotype mappings (Bob)

Both commits might have been AI-generated, but git offers no way to tell, let alone answer:

  • Which AI model was used?
  • Was it created from scratch or edited from existing content?
  • What version of the AI tool made the change?

The Solution: Embedded Provenance

ai-blame extracts provenance information from AI agent execution traces and embeds it directly in the affected files:

# disease.yaml
name: Asthma
definition: A chronic respiratory condition...

edit_history:
  - timestamp: "2025-12-01T08:03:42+00:00"
    model: claude-opus-4-5-20251101
    agent_tool: claude-code
    agent_version: "2.0.75"
    action: CREATED

This provides:

  1. Attribution — Know which model made each change
  2. Traceability — Track the evolution of AI-curated content
  3. Reproducibility — Record the exact tool versions used
  4. Transparency — Make AI involvement visible and auditable
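As a rough illustration of how embedded history could be consumed, the sketch below treats edit_history entries as plain Python dictionaries with the same fields as the YAML example. The function name summarize_history, the second entry, its model name, and the "EDITED" action value are all assumptions made for the example, not part of ai-blame.

```python
from datetime import datetime

# First entry mirrors the YAML example; the second is invented for illustration.
edit_history = [
    {
        "timestamp": "2025-12-01T08:03:42+00:00",
        "model": "claude-opus-4-5-20251101",
        "agent_tool": "claude-code",
        "agent_version": "2.0.75",
        "action": "CREATED",
    },
    {
        "timestamp": "2025-12-03T10:15:00+00:00",
        "model": "example-model-v2",   # hypothetical model name
        "agent_tool": "claude-code",
        "agent_version": "2.0.75",
        "action": "EDITED",            # assumed action value
    },
]


def summarize_history(history):
    """Return (creating_model, most_recent_model) for a list of edit entries."""
    ordered = sorted(history, key=lambda e: datetime.fromisoformat(e["timestamp"]))
    created = next((e["model"] for e in ordered if e["action"] == "CREATED"), None)
    latest = ordered[-1]["model"] if ordered else None
    return created, latest


creator, latest = summarize_history(edit_history)
print(f"created by {creator}, last touched by {latest}")
```

Because the provenance travels with the file, a query like this needs only the file itself, not access to the original agent session.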

Design Principles

1. Non-Invasive

ai-blame works with existing files and doesn't require changes to your workflow. It reads execution traces that are already being generated by tools like Claude Code.

2. Flexible Output

Different file types need different approaches:

  • Structured data (YAML, JSON) — Append an edit_history key
  • Code files — Use sidecar files or embedded comments
  • Documentation — Often best skipped

The configuration system lets you define policies per file type.
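To make the idea of per-file-type policies concrete, here is a minimal sketch of such a lookup. The policy names ("append", "sidecar", "skip"), the POLICIES table, and the policy_for helper are hypothetical; ai-blame's actual configuration keys and format may differ.

```python
from pathlib import PurePath

# Hypothetical mapping from file extension to provenance policy.
POLICIES = {
    ".yaml": "append",   # structured data: add an edit_history key
    ".yml": "append",
    ".json": "append",
    ".py": "sidecar",    # code: record provenance next to the file
    ".md": "skip",       # documentation: often best skipped
}


def policy_for(path, default="skip"):
    """Look up the policy for a path by its (case-insensitive) extension."""
    suffix = PurePath(path).suffix.lower()
    return POLICIES.get(suffix, default)


for name in ["kb/disease.yaml", "src/tool.py", "README.md", "Makefile"]:
    print(name, "->", policy_for(name))
```

Defaulting unknown file types to "skip" is a design choice in this sketch: it keeps the tool from touching files it has no explicit policy for.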

3. Dry-Run by Default

The tool never modifies files unless explicitly told to. This lets you preview the proposed changes and confirm they are correct before applying them.
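The dry-run-by-default pattern can be sketched in a few lines. This is an illustration of the general technique, not ai-blame's implementation: the function names plan_update and update_file and the apply parameter are assumptions made for the example.

```python
import difflib


def plan_update(original, updated, name="file"):
    """Return a unified diff of the proposed change without touching disk."""
    return "".join(
        difflib.unified_diff(
            original.splitlines(keepends=True),
            updated.splitlines(keepends=True),
            fromfile=name,
            tofile=name + " (proposed)",
        )
    )


def update_file(path, updated, apply=False):
    """Preview a change; write it only when apply=True is passed explicitly."""
    with open(path, encoding="utf-8") as fh:
        original = fh.read()
    diff = plan_update(original, updated, name=str(path))
    if apply:
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(updated)
    return diff
```

Making the destructive path opt-in means a forgotten argument results in a preview rather than an unintended write.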

4. Minimal Footprint

With --initial-and-recent, you can keep only the first and most recent edits, which avoids bloated history sections while preserving the essential provenance information.
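The trimming idea behind this option can be sketched as follows. This is a minimal illustration that assumes the history is already in chronological order; the function name initial_and_recent is hypothetical, and how the real flag handles edge cases is not specified here.

```python
def initial_and_recent(history):
    """Keep only the first and last entries of a chronologically ordered history."""
    if len(history) <= 2:
        return list(history)  # nothing to trim
    return [history[0], history[-1]]


# Placeholder entries standing in for edit_history records.
edits = ["created", "edit 1", "edit 2", "edit 3"]
print(initial_and_recent(edits))
```

However many edits accumulate, the stored history stays at two entries: who created the content and who touched it last.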

Read More

For a deeper dive into the motivation and real-world use cases, see:

"Whose Code Is This, Anyway? Tracking AI Agent Provenance" on Medium

This article explores why provenance tracking matters for knowledge bases, code auditing, model comparison, and regulatory compliance—and includes examples of how ai-blame solves the "git blame won't tell you anymore" problem.