---
name: verifying-agent-handoffs
description: "Verify another agent's handoff claims (Codex, prior session, subagent) against the actual repo before trusting or building on them. Diff zip/pasted artifacts vs. on-disk files, check git status/log, grep for claimed mount points, never take 'verified' / 'implemented' / 'committed' at face value."
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
  hermes:
    tags: [verification, handoff, multi-agent, git, audit]
    related_skills: [subagent-driven-development, systematic-debugging, github-pr-workflow]
---

# Verifying Agent Handoffs

## When to use

Trigger any time the user hands you work claimed to be done by **another agent** (Codex CLI, a different Claude session, a coworker's AI, a Hermes subagent that ran in a prior turn) and asks you to "pick up", "land", "review", "deploy", or "continue" it. Also use when the user pastes a long status report ("Implemented X. Verified Y. Did not commit Z.") and asks you to take the next step.

The core risk: **agent self-reports are not verified facts.** "tsc passes", "vite build passes", "smoke tested", "I did not commit" are all claims that need to be checked against the file system and git before you act on them.

## The one rule

**Before you touch anything: reconcile the claim against the repo.** Three checks, in this order:

1. `git status -sb && git log --oneline -15` in every repo the claim touches. Is there a commit? A stash? An uncommitted diff? Or nothing?
2. For each file the claim names, `diff -q ` (or `git log --all -- `). Does the file exist? Is it identical to the handoff artifact? Does it differ?
3. For each behavior the claim names, `grep` the repo for the exact mount point / route / table / function name. Is it actually wired up?

Only after these three pass do you trust the claim enough to build on it.

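
Checks 2 and 3 of the rule above can be wrapped in two small helpers so every claimed file and behavior gets the same treatment. This is a minimal sketch, not part of the skill itself: the function names `file_state` and `wiring_state` and the labels they print are hypothetical, and check 1 (git state) still has to be run by hand in each repo.

```bash
# Sketch of checks 2 and 3 for a single claim. Function names and the
# IDENTICAL / DIFFERS / MISSING / WIRED / UNWIRED labels are hypothetical.

# file_state <handoff_copy> <repo_copy>
# Prints whether the repo copy is missing, identical to, or different from
# the handoff copy.
file_state() {
  if [ ! -e "$2" ]; then
    echo "MISSING"
  elif diff -q "$1" "$2" >/dev/null 2>&1; then
    echo "IDENTICAL"
  else
    echo "DIFFERS"
  fi
}

# wiring_state <extended_regex> <file_to_search>
# Prints WIRED if the pattern occurs in the file, UNWIRED otherwise
# (including when the file does not exist).
wiring_state() {
  if grep -qE -- "$1" "$2" 2>/dev/null; then
    echo "WIRED"
  else
    echo "UNWIRED"
  fi
}
```

For example, `wiring_state "app\.use.*'/cp/v1'" server.ts` prints `WIRED` only when `server.ts` contains a matching `app.use` line. A match shows the mount exists, not that it sits before `express.static`, so ordering still needs a manual look.
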
## Common claim → check mapping

| Agent said | What you actually verify |
|---|---|
| "Added route `/cp/v1`" | `grep -n "app.use.*'/cp/v1'" server.ts` and check it's before `express.static` |
| "Applied migration" | `psql -c "\d "` or query `information_schema.tables` |
| "tsc && vite build passes" | Run it yourself: `npm run build` in the repo |
| "Did not commit" | `git status -sb`, `git stash list`, `git log @{u}..HEAD` |
| "Mirrored in both repos" | `diff -u / /` |
| "Routes return JSON" | `curl -sS -H 'Accept: application/json' ` and check `Content-Type` |
| "Tables created" | `\d+ ` or grep the migration file for `CREATE TABLE IF NOT EXISTS` |

## The pitfall that cost time this session

Before asking the user to upload, paste, or re-send artifacts another agent referenced, **check the repo first**. If the agent says "I edited `src/pages/config/ConfigClassPass.tsx`", that file is almost certainly already in the repo. Read it directly.

This applies doubly when the Hermes uploader rejects an extension (`.tsx`, `.sql`, anything not in the supported list). Do not loop telling the user to zip or rename. One pass of `find -name '*classpass*' -o -name 'ConfigClassPass*'` in the relevant repos usually surfaces every file the agent referenced.

Only if the file genuinely isn't in any repo (e.g. it's a new file the other agent generated but never wrote) do you need an upload. Even then, prefer asking the user to `git add -A && git stash` the working tree so you can `git stash show -p` it.

## Workflow

### 1. Inventory the claim

Parse the handoff message into a checklist:

- Files claimed to be added / modified (with paths)
- Routes / endpoints claimed to be live
- Migrations claimed to be applied
- Commits/pushes/deploys claimed to have happened (or explicitly NOT happened)
- Smoke-test results claimed

### 2. Locate the artifacts

```bash
# In every repo the claim touches:
cd 
git status -sb
git stash list
git log --oneline -20
git log @{u}..HEAD    # commits ahead of remote
git log HEAD..@{u}    # commits behind remote
find . -name '' -not -path './node_modules/*'
```

If the user also sent a zip / pasted code, unzip to a sandbox dir (e.g. `~/-handoff/`) and **do not move files into the repo yet**.

### 3. Diff each claimed file

```bash
# For each file in the handoff:
diff -q / /              # identical / differs / missing
diff -u / / | head -60   # if differs, see how
```

Classify each as:

- **Already committed (identical)** → no work needed
- **Already committed (differs)** → which is newer / correct? Ask if unclear
- **In repo uncommitted** → there's a working tree change to preserve
- **Not in repo at all** → the agent never wrote it; you'd be applying fresh

### 4. Verify wired-up behavior, not just file presence

A file existing on disk doesn't mean it's mounted. Common gotchas:

- Express router file exists but never `app.use()`d → route 404s
- Migration file exists in `migrations/` but `db_migrate` was never run → table missing
- Component file exists but never imported in App.tsx → not in the UI
- Both repos have the same file but only one ran the migration → schema drift

For each behavior claim, grep for the wiring:

```bash
grep -nE "app\.use|router\.use|import .* from .*classpass" server.ts
grep -rn "import ConfigClassPass" src/
```

### 5. Run the build / health check yourself

Never trust "build passes" from a handoff. Run it:

```bash
npm run build   # or tsc -b && vite build, whatever the project uses
curl -sS -o /dev/null -w '%{http_code} %{content_type}\n' 
```

If the agent claimed a smoke test, reproduce it byte-for-byte.

### 6. Report the reconciliation

Tell the user the truth, not the agent's truth:

> Agent claimed X, Y, Z done.
> - X: ✅ committed on origin/main, byte-identical to handoff
> - Y: ❌ never written to disk; exists only in the handoff zip
> - Z: ⚠️ file is there but never wired into `app.use()`

Then ask what to land vs. stage on a branch.

## Pitfalls

- **Trusting "tsc passes" / "build passes" / "smoke tested".** Agents hallucinate these constantly. Re-run.
- **Trusting "did not commit, did not push".** Sometimes they did. `git log @{u}..HEAD` is fast, so just run it.
- **Assuming "mirrored in both repos" means identical.** Diff them.
- **Asking the user to re-upload files that are already in the repo.** Check the repo first. If Hermes rejects an extension, the file almost always lives on disk under a different name or another path, so `find` it before bothering the user.
- **Treating a handoff zip as authoritative.** It's a snapshot of what the previous agent *thinks* it did. The repo is ground truth.
- **Building on top of unverified claims.** If you skip verification and the next step fails, you'll waste twice the time backing out.

## Quick script: full reconciliation pass

```bash
# Adjust REPO and the file glob; run in the handoff sandbox dir
REPO=/home/avalon/apps/
for f in *; do
  [ -f "$f" ] || continue   # skip subdirectories in the sandbox
  hits=$(find "$REPO" -name "$f" -not -path '*/node_modules/*')
  if [ -z "$hits" ]; then
    echo "MISSING: $f (not in repo)"
  else
    # Read one path per line so repo paths containing spaces are not split
    printf '%s\n' "$hits" | while IFS= read -r h; do
      if diff -q "$f" "$h" >/dev/null 2>&1; then
        echo "OK    : $f == $h"
      else
        echo "DIFFER: $f vs $h"
      fi
    done
  fi
done
```

## Related skills

- **subagent-driven-development**: when *you* dispatch the subagent, you control the review loop. This skill is for when someone else's agent already ran and you inherit the result.
- **systematic-debugging**: if a claim doesn't reconcile, treat the discrepancy as a bug to root-cause, not a thing to paper over.
- **github-pr-workflow**: after reconciling and landing, branch + PR per the user's normal flow.
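
## Quick script: report lines

The report format from step 6 of the workflow can be produced from the labels the reconciliation pass prints. This is a minimal sketch under stated assumptions: the function name `report_line` is hypothetical, it expects the `OK` / `DIFFER` / `MISSING` labels from the quick script, and the wording of each line is illustrative only.

```bash
# report_line <label> <file>
# Turns one reconciliation result into a bullet for the user-facing report.
# The function name and message wording are hypothetical.
report_line() {
  case "$1" in
    OK)      echo "- $2: ✅ in repo, identical to handoff" ;;
    DIFFER)  echo "- $2: ⚠️ in repo, differs from handoff" ;;
    MISSING) echo "- $2: ❌ not in repo, exists only in the handoff" ;;
    *)       echo "- $2: unrecognized state '$1'" ;;
  esac
}
```

A `DIFFER` line only says the two copies are not byte-identical. Which copy is newer or correct still has to be decided from `diff -u` and, if unclear, by asking the user.
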