Verifying Agent Handoffs

When to use

Trigger any time the user hands you work claimed to be done by another agent (Codex CLI, a different Claude session, a coworker's AI, a Hermes subagent that ran in a prior turn) and asks you to "pick up", "land", "review", "deploy", or "continue" it. Also use when the user pastes a long status report ("Implemented X. Verified Y. Did not commit Z.") and asks you to take the next step.

The core risk: agent self-reports are not verified facts. "tsc passes", "vite build passes", "smoke tested", "I did not commit" are all claims that need to be checked against the file system and git before you act on them.

The one rule

Before you touch anything: reconcile the claim against the repo.

Three checks, in this order:

  1. git status -sb && git log --oneline -15 in every repo the claim touches. Is there a commit? A stash? An uncommitted diff? Or nothing?
  2. For each file the claim names, diff -q <handoff-source> <repo-path> (or git log --all -- <path>). Does the file exist? Is it identical to the handoff artifact? Does it differ?
  3. For each behavior the claim names, grep the repo for the exact mount point / route / table / function name. Is it actually wired up?

Only after these three pass do you trust the claim enough to build on it.

Common claim → check mapping

| Agent said | What you actually verify |
| --- | --- |
| "Added route /cp/v1" | grep -n "app.use.*'/cp/v1'" server.ts and check it's before express.static |
| "Applied migration" | psql -c "\d <table>" or query information_schema.tables |
| "tsc && vite build passes" | Run it yourself: npm run build in the repo |
| "Did not commit" | git status -sb, git stash list, git log @{u}..HEAD |
| "Mirrored in both repos" | diff -u <repo-a>/<file> <repo-b>/<file> |
| "Routes return JSON" | curl -sS -H 'Accept: application/json' <url> and check Content-Type |
| "Tables created" | \d+ <table> or grep the migration file for CREATE TABLE IF NOT EXISTS |

The pitfall that cost time this session

Before asking the user to upload, paste, or re-send artifacts another agent referenced — check the repo first. If the agent says "I edited src/pages/config/ConfigClassPass.tsx", that file is almost certainly already in the repo. Read it directly.

This applies doubly when the Hermes uploader rejects an extension (.tsx, .sql, anything not in the supported list). Do not loop telling the user to zip or rename. One pass of find <app> -name '*classpass*' -o -name 'ConfigClassPass*' in the relevant repos usually surfaces every file the agent referenced. Only if the file genuinely isn't in any repo (e.g. it's a new file the other agent generated but never wrote) do you need an upload — and even then, prefer asking the user to git add -A && git stash the working tree so you can git stash show -p it.
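A minimal sketch of that single search pass, assuming a find that supports -iname (GNU and BSD find both do). The helper name find_referenced is hypothetical; it matches basenames case-insensitively and prunes node_modules and .git so the output stays readable.

```shell
# find_referenced ROOT PATTERN...
# Hypothetical helper: list files under ROOT whose basename matches any
# PATTERN, case-insensitively, skipping node_modules and .git.
find_referenced() {
  root=$1
  shift
  for pat in "$@"; do
    find "$root" \( -name node_modules -o -name .git \) -prune \
      -o -type f -iname "$pat" -print
  done | LC_ALL=C sort -u
}
```

Because the match is case-insensitive, a single '*classpass*' pattern also catches ConfigClassPass.tsx, so one pattern per feature name is usually enough.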

Workflow

1. Inventory the claim

Parse the handoff message into a checklist:

- Files claimed to be added / modified (with paths)
- Routes / endpoints claimed to be live
- Migrations claimed to be applied
- Commits/pushes/deploys claimed to have happened (or explicitly NOT happened)
- Smoke-test results claimed
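For the file part of that checklist, a rough filter can pull path-like tokens out of a pasted status report. This is a sketch only: extract_paths is a hypothetical name, the extension list (ts, tsx, sql) is an assumption to adjust per project, and grep -o is a GNU/BSD extension. Routes, migrations, and commit claims still have to be read out by hand.

```shell
# extract_paths: read handoff text on stdin, print unique path-like tokens.
# The extension list is an assumption; extend it for the project at hand.
extract_paths() {
  grep -oE '[A-Za-z0-9_./-]+\.(tsx|ts|sql)' | LC_ALL=C sort -u
}

# Example: pbpaste | extract_paths    (or: extract_paths < handoff.txt)
```

Each printed path then becomes one row to locate and diff in the steps that follow.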

2. Locate the artifacts

# In every repo the claim touches:
cd <repo>
git status -sb
git stash list
git log --oneline -20
git log @{u}..HEAD  # commits ahead of remote
git log HEAD..@{u}  # commits behind remote
find . -name '<claimed-file>' -not -path './node_modules/*'

If the user also sent a zip / pasted code, unzip to a sandbox dir (e.g. ~/<task>-handoff/) and do not move files into the repo yet.
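When the claim spans several repos, the git commands above can be wrapped so they run identically in each one. A hedged sketch: inventory_repo is a hypothetical name, and the upstream guard is there because the @{u} comparisons error out on a branch that has no upstream configured.

```shell
# inventory_repo REPO
# Hypothetical wrapper around the per-repo git checks. Skips the @{u}
# comparisons when the current branch has no upstream.
inventory_repo() {
  repo=$1
  echo "== $repo"
  git -C "$repo" status -sb
  git -C "$repo" stash list
  git -C "$repo" log --oneline -20
  if git -C "$repo" rev-parse --abbrev-ref '@{u}' >/dev/null 2>&1; then
    echo "-- ahead of upstream:"
    git -C "$repo" log --oneline '@{u}..HEAD'
    echo "-- behind upstream:"
    git -C "$repo" log --oneline 'HEAD..@{u}'
  else
    echo "-- no upstream configured"
  fi
}
```

The ahead/behind lists compare against the remote-tracking ref as of the last fetch, so running git fetch first gives a more trustworthy answer.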

3. Diff each claimed file

# For each file in the handoff:
diff -q <handoff>/<file> <repo>/<path>   # identical / differs / missing
diff -u <handoff>/<file> <repo>/<path> | head -60   # if differs, see how

Classify each as:

- Already committed (identical) → no work needed
- Already committed (differs) → which is newer / correct? Ask if unclear
- In repo uncommitted → there's a working tree change to preserve
- Not in repo at all → the agent never wrote it; you'd be applying fresh
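The four classes can be told apart mechanically. The sketch below is one way to do it, not a definitive implementation: classify_file is a hypothetical name, "committed" here means tracked with a working tree that matches HEAD, and untracked files are lumped in with uncommitted changes.

```shell
# classify_file REPO REL_PATH HANDOFF_FILE
# Hypothetical helper: prints which of the four classes the file falls into.
classify_file() {
  repo=$1
  rel=$2
  handoff=$3
  if [ ! -e "$repo/$rel" ]; then
    echo "not in repo"
  elif ! git -C "$repo" ls-files --error-unmatch -- "$rel" >/dev/null 2>&1 ||
       ! git -C "$repo" diff --quiet HEAD -- "$rel" 2>/dev/null; then
    # Untracked, or tracked but different from HEAD (staged or unstaged)
    echo "in repo, uncommitted"
  elif cmp -s "$handoff" "$repo/$rel"; then
    echo "committed (identical)"
  else
    echo "committed (differs)"
  fi
}
```

It only checks the current branch; a file committed elsewhere would still need git log --all -- <path> to find.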

4. Verify wired-up behavior, not just file presence

A file existing on disk doesn't mean it's mounted. Common gotchas:

- Express router file exists but never app.use()d → route 404s
- Migration file exists in migrations/ but db_migrate was never run → table missing
- Component file exists but never imported in App.tsx → not in the UI
- Both repos have the same file but only one ran the migration → schema drift

For each behavior claim, grep for the wiring:

grep -nE "app\.use|router\.use|import .* from .*classpass" server.ts
grep -rn "import ConfigClassPass" src/
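Presence of an app.use() line is not the whole story for Express routes: the mount also has to come before express.static. A small sketch of that ordering check, assuming the mount and the static middleware each sit on a single line of the same file; mounted_before_static is a hypothetical name.

```shell
# mounted_before_static FILE ROUTE
# Hypothetical helper: compares the line number of the first app.use line
# mentioning ROUTE with the first line mentioning express.static.
mounted_before_static() {
  file=$1
  route=$2
  route_line=$(grep -nF -- "$route" "$file" | grep -F 'app.use' | head -n 1 | cut -d: -f1)
  static_line=$(grep -nF 'express.static' "$file" | head -n 1 | cut -d: -f1)
  if [ -z "$route_line" ]; then
    echo "NOT MOUNTED"
    return 1
  fi
  if [ -n "$static_line" ] && [ "$route_line" -gt "$static_line" ]; then
    echo "MOUNTED AFTER STATIC"
    return 1
  fi
  echo "OK"
}
```

Something like mounted_before_static server.ts /cp/v1 covers the "Added route" claim; a multi-line app.use( call would need a smarter match than this line-based one.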

5. Run the build / health check yourself

Never trust "build passes" from a handoff. Run it:

npm run build   # or tsc -b && vite build, whatever the project uses
curl -sS -o /dev/null -w '%{http_code} %{content_type}\n' <live-url>

If the agent claimed a smoke test, reproduce it byte-for-byte.
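The curl line above prints a status code and a content type; the sketch below turns that pair into a pass/fail so a 200 that serves HTML instead of JSON does not slip through. check_json_response is a hypothetical name, and treating only 2xx plus application/json as a pass is an assumption about what the route should return.

```shell
# check_json_response "<http_code> <content_type>"
# Hypothetical helper: passes only for a 2xx status with a JSON content type.
check_json_response() {
  code=${1%% *}   # text before the first space
  ctype=${1#* }   # text after the first space (may be empty)
  case "$code" in
    2??) ;;
    *) echo "FAIL: HTTP $code"; return 1 ;;
  esac
  case "$ctype" in
    application/json*) echo "OK: $code $ctype" ;;
    *) echo "FAIL: content type '$ctype'"; return 1 ;;
  esac
}

# Usage, with URL set to the live endpoint:
# check_json_response "$(curl -sS -o /dev/null -w '%{http_code} %{content_type}' "$URL")"
```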

6. Report the reconciliation

Tell the user the truth, not the agent's truth:

Agent claimed X, Y, Z done.

- X: ✅ committed on origin/main, byte-identical to handoff
- Y: ❌ never written to disk; exists only in the handoff zip
- Z: ⚠️ file is there but never wired into app.use()

Then ask what to land vs. stage on a branch.

Pitfalls

Quick script: full reconciliation pass

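# Adjust REPO, then run from the handoff sandbox dir.
# Files are matched by basename, so one handoff file can hit several repo paths.
REPO=/home/avalon/apps/<app>
for f in *; do
  [ -f "$f" ] || continue   # skip subdirectories
  hits=$(find "$REPO" -type f -name "$f" -not -path '*/node_modules/*' -not -path '*/.git/*')
  if [ -z "$hits" ]; then
    echo "MISSING: $f (not in repo)"
  else
    # Read hits line by line so paths containing spaces stay intact
    printf '%s\n' "$hits" | while IFS= read -r h; do
      if diff -q "$f" "$h" >/dev/null 2>&1; then
        echo "OK    : $f == $h"
      else
        echo "DIFFER: $f vs $h"
      fi
    done
  fi
done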