repo-claim-verifier
Verifies claims from repo-bfs-architecture. Three passes: (A) code-first grep of all claims, (B) DFS traversal of ARCH_GRAPH to verify component relationships, (C) targeted web searches for unresolved claims. Never re-clones the repo.
Install
mkdir -p .claude/skills/repo-claim-verifier && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/16478" && unzip -o skill.zip -d .claude/skills/repo-claim-verifier && rm skill.zip
Installs to .claude/skills/repo-claim-verifier
Repo Claim Verifier
Purpose
Verify claims from the main agent's analysis and confirm the accuracy of the ARCH_GRAPH component relationship model. Three passes in strict order:
- Step A -- Code pass: grep all claims against the repo source
- Step B -- DFS graph check: depth-first traversal of ARCH_GRAPH edges to confirm component relationships are accurate
- Step C -- Web pass: targeted searches for claims that code alone cannot resolve
Never re-clone the repo. Use repo_root from the VERIFY_PACKET throughout.
Token Budget
| Variable | Default | Notes |
|---|---|---|
| VERIFIER_TOKEN_BUDGET | 15000 | Hard cap for this entire run |
| VERIFIER_TOKENS_USED | 0 | Track after every action |
| MAX_FILE_LINES_READ | 80 | Max lines from any single new file read |
| CODE_BUDGET_FRACTION | 0.60 | 60% for code pass (~9,000 tokens) |
| GRAPH_BUDGET_FRACTION | 0.25 | 25% for DFS graph check (~3,750 tokens) |
| WEB_BUDGET_FRACTION | 0.15 | 15% for web searches (~2,250 tokens) |
| MAX_WEB_SEARCHES | 10 | Hard ceiling regardless of claim count |
| WEB_SEARCHES_USED | 0 | Track after every search |
| MAX_READS_PER_CLAIM | 1 | At most one new file read per claim |
Check budget before every action. Complete Step A before spending any graph budget.
Complete Steps A+B before spending any web budget.
If the budget runs out mid-step, emit the line ~ c{id}: BUDGET-EXHAUSTED -- not checked, then stop.
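The check-before-every-action rule can be sketched as a small tracker. This is a minimal illustration, not part of the skill itself: the class and method names are hypothetical, and it assumes the three fractions act as hard per-phase caps on top of the overall cap.

```python
from dataclasses import dataclass


@dataclass
class VerifierBudget:
    """Tracks token spend per phase; defaults mirror the table above."""
    total: int = 15000  # VERIFIER_TOKEN_BUDGET
    used: int = 0       # VERIFIER_TOKENS_USED
    # CODE / GRAPH / WEB budget fractions from the table.
    fractions: tuple = (("code", 0.60), ("graph", 0.25), ("web", 0.15))

    def __post_init__(self):
        self.phase_used = {name: 0 for name, _ in self.fractions}

    def phase_cap(self, phase: str) -> int:
        return round(self.total * dict(self.fractions)[phase])

    def can_spend(self, phase: str, cost: int) -> bool:
        # Assumption: a phase may not exceed its own share or the total.
        within_phase = self.phase_used[phase] + cost <= self.phase_cap(phase)
        within_total = self.used + cost <= self.total
        return within_phase and within_total

    def spend(self, phase: str, cost: int) -> bool:
        """Check first; record the spend only if the action fits."""
        if not self.can_spend(phase, cost):
            return False
        self.phase_used[phase] += cost
        self.used += cost
        return True


def exhausted_line(claim_id: str) -> str:
    """Line to emit for a claim that could not be checked."""
    return f"~ c{claim_id}: BUDGET-EXHAUSTED -- not checked"
```

A caller would run `budget.spend("code", 50)` before each grep and emit `exhausted_line(...)` on the first `False`.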
Input: VERIFY_PACKET
VERIFY_PACKET
=============================================================
repo_root: /tmp/repo-analysis-XXXXXX <- use this path directly
repo_structure: <directory listing>
files_read: <list of files main agent already read, with strategies>
arch_graph: <ARCH_GRAPH JSON -- nodes[] and edges[] arrays>
claims: <claim-manifest XML>
=============================================================
On receipt:
- Record repo_root. NEVER re-clone.
- Parse arch_graph into working GRAPH state.
- Parse claim-manifest and partition: CONFIRMED (spot-check), INFERRED/SPECULATIVE (full).
- Log: Verifier received N claims, M graph edges (confirmed: X, unconfirmed: Y)
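The intake steps might look like the sketch below. The claim-manifest schema is not given in this document, so the `<claim id="..." status="...">` shape is purely an assumption, as is the per-edge `confirmed` flag and the reading of X and Y in the log line as edge counts.

```python
import json
import xml.etree.ElementTree as ET


def intake(arch_graph_json: str, claim_manifest_xml: str):
    """Parse the packet and partition claims (hypothetical schema)."""
    graph = json.loads(arch_graph_json)
    edges = graph.get("edges", [])
    confirmed_edges = sum(1 for e in edges if e.get("confirmed"))

    spot_check, full_check = [], []
    root = ET.fromstring(claim_manifest_xml)
    for claim in root.iter("claim"):
        status = claim.get("status", "").upper()
        # CONFIRMED claims get a spot-check; everything else a full check.
        bucket = spot_check if status == "CONFIRMED" else full_check
        bucket.append(claim.get("id"))

    total_claims = len(spot_check) + len(full_check)
    log = (
        f"Verifier received {total_claims} claims, {len(edges)} graph edges "
        f"(confirmed: {confirmed_edges}, "
        f"unconfirmed: {len(edges) - confirmed_edges})"
    )
    return spot_check, full_check, log
```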
Step A -- Code Pass (all claims, no web or graph budget)
Process ALL claims against repo source before any graph or web work.
CONFIRMED claims:
Grep the cited source file/line
Hit -> mark SPOT-CHECKED
Miss -> mark NEEDS-WEB
INFERRED / SPECULATIVE claims:
STEP 1 -- Grep/ls from hint (~30-50 tokens)
Hit -> mark CODE-CONFIRMED
Miss -> STEP 2
STEP 2 -- Was hinted file in main agent's files_read?
YES -> grep only (never re-fetch). Still miss -> mark NEEDS-WEB
NO -> check length: wc -l
<= 80 lines: read hint range (sed -n)
> 80 lines: grep-only
Hit -> mark CODE-CONFIRMED; miss -> mark NEEDS-WEB
Re-fetch rule: if a file is in files_read, NEVER re-fetch it. Grep the local clone.
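The INFERRED / SPECULATIVE branch above reduces to a small decision function. The sketch below takes the outcomes of the shell commands as plain booleans, so it models only the branching; the function and parameter names are hypothetical.

```python
def code_pass_mark(grep_hit: bool, in_files_read: bool, line_count: int,
                   followup_hit: bool, max_lines: int = 80) -> tuple:
    """Return (follow-up action, mark) for one INFERRED/SPECULATIVE claim.

    grep_hit      -- result of the STEP 1 grep/ls from the hint
    in_files_read -- whether the hinted file is in the main agent's files_read
    line_count    -- output of wc -l on the hinted file
    followup_hit  -- result of the STEP 2 grep or targeted read
    """
    if grep_hit:
        return ("none", "CODE-CONFIRMED")
    if in_files_read:
        action = "grep-only"      # never re-fetch a file already read
    elif line_count <= max_lines:
        action = "sed-range"      # short new file: read the hinted range
    else:
        action = "grep-only"      # long new file: grep only
    mark = "CODE-CONFIRMED" if followup_hit else "NEEDS-WEB"
    return (action, mark)
```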
Step B -- DFS Graph Check (ARCH_GRAPH edge verification)
This step verifies the structural accuracy of the component relationship graph. It is the primary mechanism for catching diagram errors like co-location vs. separation, call direction, and which components actually share process space.
B1. Prioritise unconfirmed edges
Process edges in this order:
1. confirmed: false edges first -- these are the main agent's guesses
2. confirmed: true edges with high architectural significance:
   - contains edges (co-location / process boundaries)
   - guards edges (security layer coverage)
   - Any edge involving a security-layer node
Skip confirmed: true non-security edges if budget is tight.
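One way to express this ordering is a three-tier sort key. The sketch assumes each edge is a dict with `from`, `to`, `type` and `confirmed` keys, and that node types are available as a separate id-to-type mapping; both shapes are guesses at the ARCH_GRAPH layout, not taken from its schema.

```python
def edge_priority(edge: dict, node_types: dict) -> int:
    """Lower values are checked first."""
    if not edge.get("confirmed", False):
        return 0  # unconfirmed guesses come first
    touches_security = (
        node_types.get(edge["from"]) == "security-layer"
        or node_types.get(edge["to"]) == "security-layer"
    )
    if edge.get("type") in ("contains", "guards") or touches_security:
        return 1  # confirmed, but architecturally significant
    return 2      # confirmed non-security edges


def order_edges(edges: list, node_types: dict, budget_tight: bool = False) -> list:
    """Sort edges by priority; drop the last tier when budget is tight."""
    ranked = sorted(edges, key=lambda e: edge_priority(e, node_types))
    if budget_tight:
        ranked = [e for e in ranked if edge_priority(e, node_types) < 2]
    return ranked
```

Because `sorted` is stable, edges within a tier keep the order they had in the input.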
B2. DFS traversal procedure
For each edge { from: A, to: B, type: T, label: L }:
1. IDENTIFY the source files for nodes A and B
- Use node.file if present
- Otherwise: grep -r "class {label}\|const {label}\|export.*{label}" src/ -l | head -3
2. GREP for the relationship in A's file:
grep -n "{B.label}\|{B's exported name}" {A.file} | head -20
3. INTERPRET the grep result against the claimed edge type:
Type = "contains":
Look for: B instantiated inside A's class/constructor, B imported and held
as a class property, B's start() called from A's start().
Counter-evidence: B has its own process.argv / main() entrypoint,
B spawned via child_process.spawn or subprocess.
Type = "calls":
Look for: direct function/method call from A to B, import of B in A,
B's API used in A's request handler or dispatch logic.
Counter-evidence: A only receives callbacks FROM B, not calling B.
Type = "two-way":
Look for: A calls B AND B calls A (may need to grep B's file too).
Counter-evidence: traffic is one-directional.
Type = "guards":
Look for: B's processing ONLY proceeds after A returns/resolves,
middleware chain where A is before B, all entry paths to B pass through A.
Counter-evidence: direct calls to B that bypass A, optional/configurable guard.
Type = "depends-on":
Look for: A reads B's config/state file, A imports B's exported constants,
A makes a request to B on startup.
Type = "publishes":
Look for: A emits events that B's listener/subscriber handles.
4. EMIT result:
Confirmed by grep -> ✔ EDGE-CONFIRMED: {A}->{B} ({type}) -- {file:line}
Contradicted -> ❗ EDGE-WRONG: {A}->{B} claimed {type} but evidence shows {X}
Emit correction with proposed replacement edge
Insufficient evidence -> ⚠ EDGE-UNVERIFIED: {A}->{B} -- could not confirm from {file}
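The procedure above describes what to do per edge but not the visiting order. A minimal sketch of one depth-first ordering follows, together with a formatter for the three result lines. It assumes edges are dicts with `from`, `to` and `type` keys, and the choice of starting nodes (those with no incoming edge) is an illustration rather than something the skill specifies.

```python
def dfs_edges(edges: list, roots=None) -> list:
    """Return every edge exactly once, in depth-first order."""
    adjacency = {}
    for e in edges:
        adjacency.setdefault(e["from"], []).append(e)
    if roots is None:
        targets = {e["to"] for e in edges}
        roots = [n for n in adjacency if n not in targets]

    seen, order = set(), []

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for e in adjacency.get(node, []):
            order.append(e)
            visit(e["to"])  # follow the edge before moving to siblings

    # Also sweep all source nodes so cycles without a root are covered.
    for node in list(roots) + list(adjacency):
        visit(node)
    return order


def emit(edge: dict, verdict: str, detail: str) -> str:
    """Format one result line; verdict is confirmed, wrong or unverified."""
    a, b, t = edge["from"], edge["to"], edge["type"]
    if verdict == "confirmed":
        return f"✔ EDGE-CONFIRMED: {a}->{b} ({t}) -- {detail}"
    if verdict == "wrong":
        return f"❗ EDGE-WRONG: {a}->{b} claimed {t} but evidence shows {detail}"
    return f"⚠ EDGE-UNVERIFIED: {a}->{b} -- could not confirm from {detail}"
```

The recursive form is fine for architecture graphs with tens of nodes; a very deep graph would call for an explicit stack.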
B3. Relationship-specific deep dives
Beyond individual edge checks, perform these targeted relationship investigations:
Process boundary check (contains vs. separate process):
For any contains edge where one node is a major component (gateway, agent, ACP):
# Check if the contained component has its own process entry
grep -rn "process.argv\|yargs\|commander\|parseArgs\|__main__\|if __name__" \
{component_dir}/ --include="*.ts" --include="*.py" | head -10
# Check for spawn/fork calls from the container
grep -n "spawn\|fork\|child_process\|subprocess\|exec(" {container.file} | head -20
If the component has its own CLI entry AND is spawned from the container: edge type
should be calls not contains. Emit ❗ EDGE-WRONG correction.
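The same two pattern sets can be applied to file contents already in hand. The sketch below mirrors the grep patterns with Python's `re` module and applies the "own CLI entry AND spawned" rule. It is a heuristic only: a spawn hit in the container does not prove that this particular component is what gets spawned, so any hit still needs a look at the matching line. The function name is hypothetical.

```python
import re

# Mirrors the first grep: signs that the component has its own entrypoint.
ENTRY_RE = re.compile(r"process\.argv|yargs|commander|parseArgs|__main__|if __name__")
# Mirrors the second grep: signs that the container spawns a process.
SPAWN_RE = re.compile(r"spawn|fork|child_process|subprocess|exec\(")


def suggested_edge_type(component_source: str, container_source: str) -> str:
    """Suggest 'calls' when both signals are present, else keep 'contains'."""
    has_entry = bool(ENTRY_RE.search(component_source))
    is_spawned = bool(SPAWN_RE.search(container_source))
    return "calls" if has_entry and is_spawned else "contains"
```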
Auth flow check (who calls whom for authentication):
For any guards edge involving a security-layer node:
# Find where the security layer is invoked relative to the guarded component
grep -n "use(\|app\.use\|router\.use\|middleware\|before\|intercept" \
{guarded_component.file} | head -20
grep -n "{security_layer_name}\|{security_layer_import}" \
{guarded_component.file} | head -20
Verify the layer sits before the guarded component in the call chain, not after or beside.
Shared state check (store edges):
For any store node, verify which components actually write to it vs. read from it:
grep -rn "\.write\|\.set\|\.save\|INSERT\|UPDATE\|fs\.write" \
--include="*.ts" --include="*.rs" src/ | grep "{store_name}" | head -20
Adjust edge direction if write/read direction is backwards.
B4. Emit corrections
For each wrong or unverified edge, produce a GRAPH_CORRECTION:
GRAPH_CORRECTION:
Edge: {from} -> {to} (type: {original_type})
Finding: {what the grep showed}
Action: REPLACE with: {from} -> {to} (type: {corrected_type}, label: {new_label})
OR: REMOVE (relationship does not exist)
OR: REVERSE: {to} -> {from} (direction was backwards)
OR: SPLIT: {from} -> {intermediate} -> {to} (missing intermediate node)
Basis: {file:line}
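The correction record can be modelled as a small data class that renders the template above. Field names here are hypothetical; only the rendered text follows the template.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GraphCorrection:
    src: str
    dst: str
    original_type: str
    finding: str
    action: str  # REPLACE, REMOVE, REVERSE or SPLIT
    basis: str   # file:line
    corrected_type: Optional[str] = None  # REPLACE only
    new_label: Optional[str] = None       # REPLACE only
    intermediate: Optional[str] = None    # SPLIT only

    def render(self) -> str:
        if self.action == "REPLACE":
            act = (f"REPLACE with: {self.src} -> {self.dst} "
                   f"(type: {self.corrected_type}, label: {self.new_label})")
        elif self.action == "REMOVE":
            act = "REMOVE (relationship does not exist)"
        elif self.action == "REVERSE":
            act = f"REVERSE: {self.dst} -> {self.src} (direction was backwards)"
        elif self.action == "SPLIT":
            act = (f"SPLIT: {self.src} -> {self.intermediate} -> {self.dst} "
                   f"(missing intermediate node)")
        else:
            raise ValueError(f"unknown action: {self.action}")
        return "\n".join([
            "GRAPH_CORRECTION:",
            f"Edge: {self.src} -> {self.dst} (type: {self.original_type})",
            f"Finding: {self.finding}",
            f"Action: {act}",
            f"Basis: {self.basis}",
        ])
```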
Step C -- Web Pass (NEEDS-WEB + high-value claims only)
Spend web budget only on:
- Claims marked NEEDS-WEB from Step A
- High-value categories even if CODE-CONFIRMED: CVE status, vendor guarantees, shipped-vs-announced features, package behaviour differences
Web check cap: MAX_WEB_SEARCHES = 10. Once reached, remaining NEEDS-WEB claims
get ~ c{id}: WEB-BUDGET-EXHAUSTED -- code check only.
Run cheapest sources first: changelog/releases, package registry, project docs, GitHub issues. Never more than 1 web search per claim. Never search for things findable in the repo.
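Selecting which claims get a search can be sketched as below. The claim shape (`id`, `mark`, `category`) and the category slugs are hypothetical, and putting NEEDS-WEB claims ahead of high-value ones is an assumption; the text above only lists the two groups.

```python
# Hypothetical slugs for the high-value categories listed above.
HIGH_VALUE = {"cve-status", "vendor-guarantee",
              "shipped-vs-announced", "package-behaviour"}


def plan_web_pass(claims: list, max_searches: int = 10):
    """Return (ids to search, exhausted lines), one search per claim."""
    needs_web = [c for c in claims if c["mark"] == "NEEDS-WEB"]
    high_value = [c for c in claims
                  if c["mark"] != "NEEDS-WEB" and c.get("category") in HIGH_VALUE]
    queue = needs_web + high_value
    to_search = [c["id"] for c in queue[:max_searches]]
    # Only NEEDS-WEB claims past the cap get an exhausted line.
    skipped = [
        f"~ c{c['id']}: WEB-BUDGET-EXHAUSTED -- code check only"
        for c in queue[max_searches:]
        if c["mark"] == "NEEDS-WEB"
    ]
    return to_search, skipped
```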
Cost reference
| Action | Approx tokens | When used |
|---|---|---|
| grep on cached file (in files_read) | ~50 | Steps A and B |
| ls on directory | ~30 | Adapter/file existence checks |
| sed -n line range on new file (<= 80 ln) | ~400 | Step A targeted read |
| head -80 on new file | ~500 | Step A when no range hint |
| grep for edge verification | ~50 | Step B per edge |
| head -20 for process boundary check | ~200 | Step B deep dive |
| Web search | ~300 | Step C only |
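These approximate costs make it possible to estimate a plan before running it. The sketch below uses hypothetical action keys for the table rows. Note what the arithmetic implies: at roughly 300 tokens per search, the default 2,250-token web share covers about 7 searches, so the token budget is likely to bind before MAX_WEB_SEARCHES = 10 does.

```python
# Approximate per-action token costs from the table (keys are hypothetical).
COST = {
    "grep_cached": 50,
    "ls": 30,
    "sed_range": 400,
    "head_80": 500,
    "edge_grep": 50,
    "boundary_head_20": 200,
    "web_search": 300,
}


def plan_cost(actions: dict) -> int:
    """Estimated tokens for a plan given as {action: count}."""
    return sum(COST[name] * count for name, count in actions.items())


def affordable_web_searches(total_budget: int = 15000,
                            web_fraction: float = 0.15,
                            max_searches: int = 10) -> int:
    """Searches that fit in the web share, capped at MAX_WEB_SEARCHES."""
    web_budget = round(total_budget * web_fraction)
    return min(web_budget // COST["web_search"], max_searches)
```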
Verifier Report Format
Block 1 -- Per-claim results
✔ c{id}: [CODE] {<=12 word confirmation} -- {file:line}
✔ c{id}: [CODE+WEB] {<=12 word confirmation} -- {file:line; URL}
✔ c{id}: [WEB] {<=12 word confirmation} -- {URL}
❗ c{id}: {<=12 word error} -- {correction}
⚠ c{id}: {partial
---
*Content truncated.*