
autonomous-dev


Install

mkdir -p .claude/skills/autonomous-dev && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/16032" && unzip -o skill.zip -d .claude/skills/autonomous-dev && rm skill.zip

Installs to .claude/skills/autonomous-dev

Activation

This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.

Use to develop a feature or bug fix end-to-end through a TDD git-worktree workflow — interactively (developer-led) or unattended (autonomous-mode, driven by the dispatcher). Triggers on phrases like "implement issue #N", "fix this bug", "add a feature", "create a worktree", "write test cases", "push and open a PR", "check CI", "address review comments", "resolve review threads", "/q review", "/codex review", "implement this autonomously", or any partial step in the design → worktree → tests → implement → verify → review → PR → CI → E2E lifecycle. Interactive mode asks for decisions; autonomous mode makes decisions per autonomous-mode.md and posts progress comments to the GitHub issue.

About this skill

TDD Development Workflow

A complete development workflow enforcing test-driven development, git worktree isolation, code review, CI verification, and E2E testing. Works in two modes: interactive (default) for human-guided sessions, and autonomous for fully unattended GitHub issue implementation.

NON-NEGOTIABLE RULES — every step marked MANDATORY is required. Do not skip, defer, or ask the user whether to run these steps. Execute them automatically as part of the workflow. This covers creating PRs, waiting for CI, running E2E tests, and addressing reviewer findings.


Mode Detection

Interactive Mode (default)

Used when a developer is present. The workflow:

  • Asks the user for design approval before proceeding to implementation
  • Presents design canvases and waits for feedback
  • Pauses at key decision points for user input
  • Reports final status and lets the user decide when to merge

Autonomous Mode

Triggered when running inside the scripts/autonomous-dev.sh wrapper. The workflow:

  • Makes all decisions autonomously (see "Decision Making Guidelines" below)
  • Posts progress comments to the GitHub issue instead of asking questions
  • Creates design docs but skips interactive approval
  • Stops after verification -- does not merge (the review agent handles that)
  • Marks requirement checkboxes in the issue body as work progresses
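
The source does not say how the wrapper signals autonomous mode. As a minimal sketch, assume scripts/autonomous-dev.sh exports a hypothetical AUTONOMOUS_DEV=1 variable and a hypothetical ISSUE_NUMBER; the helper names are illustrative as well. It shows one way to branch between asking the developer and commenting on the issue:

```shell
# Assumption: the wrapper exports AUTONOMOUS_DEV=1 (hypothetical variable name).
detect_mode() {
  if [ "${AUTONOMOUS_DEV:-0}" = "1" ]; then
    echo "autonomous"
  else
    echo "interactive"
  fi
}

# Report progress: comment on the GitHub issue when unattended,
# print to the terminal when a developer is present.
# ISSUE_NUMBER is a hypothetical variable assumed to be set by the wrapper.
report_progress() {
  if [ "$(detect_mode)" = "autonomous" ]; then
    gh issue comment "$ISSUE_NUMBER" --body "$1"
  else
    printf '%s\n' "$1"
  fi
}
```

With these helpers, a call such as report_progress "Test cases written" behaves correctly in either mode without the caller checking the mode itself.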

Cross-Platform Notes

This skill works across IDEs that support skills.sh. Map generic verbs in this doc to your IDE's tools (Claude Code's Bash → terminal in Cursor, etc.). Hook-based enforcement is available on Claude Code and Kiro CLI; on Cursor / Windsurf / Gemini CLI, follow each step manually — the discipline is the same.

For the full IDE table + verb-to-tool map, see references/cross-platform.md.


Development Workflow Overview

Follow this workflow for all feature development and bug fixes:

Step 1:  DESIGN CANVAS (Pencil MCP, if available)
Step 2:  CREATE GIT WORKTREE (MANDATORY)
Step 3:  WRITE TEST CASES (TDD)
Step 4:  IMPLEMENT CHANGES
Step 5:  LOCAL VERIFICATION
Step 6:  CODE SIMPLIFICATION
Step 7:  COMMIT AND CREATE PR          -- MANDATORY
Step 8:  PR REVIEW AGENT               -- MANDATORY
Step 9:  WAIT FOR ALL CI CHECKS        -- MANDATORY
Step 10: ADDRESS REVIEWER BOT FINDINGS -- MANDATORY
Step 11: ITERATE UNTIL NO FINDINGS
Step 12: E2E TESTS & READY FOR MERGE   -- MANDATORY
Step 13: CLEANUP WORKTREE

Step 1: Design Canvas

Create a design canvas for new UI work, user-facing features, architecture decisions, and complex data flows. Skip for trivial fixes or refactors that don't change behavior.

  • IDEs with Pencil MCP: create docs/designs/<feature>.pen.
  • IDEs without Pencil MCP: create docs/designs/<feature>.md.

For the full Pencil MCP call sequence, the markdown canvas template, and the per-mode (interactive vs autonomous) approval gate, see references/design-canvas.md.


Step 2: Create Git Worktree (MANDATORY)

Every change MUST be developed in an isolated git worktree. Never develop directly on the main workspace.

Enforced by the block-commit-outside-worktree.sh hook (if hooks are installed), which automatically blocks commits made outside a worktree. Direct pushes to main are blocked by block-push-to-main.sh.

Why Worktrees?

  • Isolation: Each feature/fix gets its own directory, preventing cross-contamination
  • Parallel work: Multiple features can be in progress simultaneously
  • Clean main workspace: The main checkout stays on main, ready for quick checks
  • Safe rollback: Discard a worktree without affecting the main workspace

Worktree Creation Process

Execute in your terminal:

# 1. Determine branch name based on change type
#    feat/<name>, fix/<name>, refactor/<name>, etc.
BRANCH_NAME="feat/my-feature"

# 2. Create worktree with new branch from main
#    (naming main as the start point keeps the branch off whatever HEAD happens to be)
git worktree add -b "$BRANCH_NAME" ".worktrees/$BRANCH_NAME" main

# 3. Enter the worktree
cd ".worktrees/$BRANCH_NAME"

# 4. Install dependencies (use your project's package manager)
npm install  # or: bun install, yarn install, pnpm install

# 5. Verify clean baseline
npm run build && npm test

Directory Convention

| Item | Value |
| --- | --- |
| Worktree root | .worktrees/ (project-local, gitignored) |
| Path pattern | .worktrees/<branch-name> |
| Example | .worktrees/feat/user-authentication |
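
The naming convention can be sketched as two small helpers. The function names and the slug rules (lowercase, with runs of non-alphanumeric characters collapsed to a hyphen) are assumptions for illustration, not part of the skill:

```shell
# Build "<type>/<slug>" from a change type and a free-text name.
branch_name() {
  slug=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//')
  printf '%s/%s\n' "$1" "$slug"
}

# Map a branch name onto the project-local worktree path.
worktree_path() {
  printf '.worktrees/%s\n' "$1"
}

branch=$(branch_name feat "User Authentication")
worktree_path "$branch"   # prints .worktrees/feat/user-authentication
```

Deriving both values from one input keeps the branch name and the directory name from drifting apart.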

Safety Checks

Before creating any worktree, verify .worktrees/ is in .gitignore:

git check-ignore -q .worktrees 2>/dev/null || echo "WARNING: .worktrees not in .gitignore!"
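
If the check warns, the ignore rule can be added automatically. The helper below is a sketch under that assumption; its name is hypothetical, and it appends to the .gitignore in the current directory, so run it from the repository root:

```shell
# Make sure .worktrees/ is ignored, appending a rule only when needed.
ensure_worktrees_ignored() {
  # Create the directory first so a directory-only pattern (".worktrees/")
  # has an actual directory to match against.
  mkdir -p .worktrees
  if git check-ignore -q .worktrees 2>/dev/null; then
    echo ".worktrees is already ignored"
  else
    # Start a fresh line if the file does not end with a newline.
    if [ -s .gitignore ] && [ -n "$(tail -c 1 .gitignore)" ]; then
      echo >> .gitignore
    fi
    printf '.worktrees/\n' >> .gitignore
    echo "added .worktrees/ to .gitignore"
  fi
}
```

Commit the updated .gitignore before creating the first worktree so that other checkouts share the rule.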

All Subsequent Steps Run INSIDE the Worktree

After creating the worktree, all development commands (test, lint, build, commit, push) are executed from within the worktree directory. The main workspace is not touched until cleanup.


Step 3: Write Test Cases (TDD)

Before writing any implementation code:

  1. Read the design canvas and requirements
  2. Identify all user scenarios, edge cases, and error handling paths
  3. Create or edit the test case document: docs/test-cases/<feature>.md
    • List all test scenarios (happy path, edge cases, error handling)
    • Assign test IDs (e.g., TC-AUTH-001)
    • Define expected results and acceptance criteria
  4. Create unit test skeletons
  5. Create E2E test cases if applicable
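
A skeleton for the test case document can be generated rather than typed by hand. This is a sketch only: the function names, the three-digit ID width, and the section headings are assumptions modelled on the TC-AUTH-001 example above.

```shell
# Format a test ID such as TC-AUTH-001 from an area prefix and a sequence number.
test_id() {
  printf 'TC-%s-%03d\n' "$1" "$2"
}

# Print a skeleton for docs/test-cases/<feature>.md with one entry per scenario kind.
test_case_doc() {
  feature="$1"
  area="$2"
  printf '# Test cases: %s\n\n' "$feature"
  n=1
  for kind in "Happy path" "Edge case" "Error handling"; do
    printf '## %s: %s\n' "$(test_id "$area" "$n")" "$kind"
    printf '%s\n' '- Steps:' '- Expected result:' '- Acceptance criteria:' ''
    n=$((n + 1))
  done
}
```

For example, mkdir -p docs/test-cases followed by test_case_doc user-authentication AUTH > docs/test-cases/user-authentication.md creates a starting file to fill in before any implementation code is written.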

Step 4: Implement Changes

  • Write code following the test cases (inside the worktree)
  • Write new unit tests for new functionality
  • Update existing tests if behavior changed
  • Ensure implementation covers all test scenarios

Step 5: Local Verification

Execute in your terminal:

timeout 1800 bash -lc 'npm run build && npm run test' > /tmp/verify.log 2>&1; rc=$?; [ "$rc" -ne 0 ] && tail -n 100 /tmp/verify.log; exit "$rc"

Fix any failures before proceeding. Deploy and verify locally if applicable.

How to run long verification

Run your project's build/test suite as one synchronous command with a generous timeout — never background it and poll across turns:

  1. Run the top-level suite command synchronously with an explicit generous timeout. One blocking call (or a few sequential blocking calls, e.g. build then test) that returns the full result within the turn. Capture output and replay only the tail on failure:
    timeout 1800 bash -lc '<your build-and-test command>' > /tmp/verify.log 2>&1; rc=$?; [ "$rc" -ne 0 ] && tail -n 100 /tmp/verify.log; exit "$rc"
    
  2. Never background the top-level suite (no &, no background task mode — whatever the host CLI calls it, e.g. run_in_background) and then poll its log across agent turns. Each poll is a full model round-trip; collective polling cost can exceed the suite's own runtime by orders of magnitude.
  3. If the tool's max timeout genuinely cannot cover the suite, split by directory/prefix into a few sequential synchronous calls — still no polling.
  4. Prefer a project-provided parallel runner when one exists.

Scope: the ban is on backgrounding the TOP-LEVEL verification command. Tests/scripts that internally spawn child processes or local servers are unaffected, as are genuinely event-driven waits (CI checks in Step 9, bot reviews in Steps 10-11).
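
When the suite has to be split into a few sequential synchronous calls, a small wrapper keeps each call blocking and replays only the log tail on failure. This is a sketch: run_step and the STEP_TIMEOUT override are hypothetical names, and it assumes the GNU coreutils timeout command already used above.

```shell
# Run one verification step synchronously; on failure, replay only the log tail.
# STEP_TIMEOUT (seconds) is a hypothetical override; 1800 matches the command above.
run_step() {
  name="$1"
  shift
  log="${TMPDIR:-/tmp}/verify-$name.log"
  rc=0
  timeout "${STEP_TIMEOUT:-1800}" "$@" > "$log" 2>&1 || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "step '$name' failed (exit $rc); tail of $log:"
    tail -n 100 "$log"
  fi
  return "$rc"
}

# Example (not executed here): two blocking calls, no backgrounding, no polling.
#   run_step build npm run build && run_step test npm run test
```

GNU timeout exits with status 124 when the limit is hit, so a 124 from run_step points to a timeout rather than a test failure.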


Step 6: Code Simplification

  1. Use a subagent if your IDE supports them (e.g., code-simplifier:code-simplifier), otherwise review the code manually for unnecessary complexity.
  2. Address simplification suggestions.
  3. Mark complete (if hooks are installed):
    hooks/state-manager.sh mark code-simplifier
    

Step 7: Commit and Create PR (MANDATORY)

Commit

Execute in your terminal:

git add <files>
git commit -m "type(scope): description"
git push -u origin <branch-name>
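
The "type(scope): description" shape can be checked before committing. The sketch below is illustrative only: the function name, the list of allowed types, and the optional scope are assumptions, since the skill names the format but not its exact rules.

```shell
# Check a commit subject against the "type(scope): description" shape.
# Assumption: these six types are allowed and the scope is optional.
valid_subject() {
  printf '%s\n' "$1" \
    | grep -Eq '^(feat|fix|refactor|docs|test|chore)(\([a-z0-9-]+\))?: .+'
}

if valid_subject "feat(auth): add login form"; then
  echo "subject ok"
fi
```

Adjust the type list to whatever your project actually accepts.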

Create PR

gh pr create --title "type(scope): description" --body "$(cat <<'EOF'
## Summary
<1-3 bullet points describing the change>

## Design
- [ ] Design canvas created (`docs/designs/<feature>.pen` or `.md`)
- [ ] Design approved

## Test Plan
- [ ] Test cases documented (`docs/test-cases/<feature>.md`)
- [ ] Build passes (`npm run build`)
- [ ] Unit tests pass (`npm run test`)
- [ ] CI checks pass
- [ ] Code simplification review passed
- [ ] PR review agent review passed
- [ ] Reviewer bot findings addressed (no new findings)
- [ ] E2E tests pass

## Checklist
- [ ] New unit tests written for new functionality
- [ ] E2E test cases updated if needed
- [ ] Documentation updated if needed
EOF
)"

Update PR Checklist

After completing each step, update the PR description:

gh pr view {pr_number} --json body --jq '.body' > /tmp/pr_body.md
# Edit the checklist (mark items as [x])
gh pr edit {pr_number} --body-file /tmp/pr_body.md
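
The editing step between the two gh calls can be scripted. As a sketch, the hypothetical helper below ticks the first unchecked item whose line contains the given text; it uses awk's index() so the text is matched literally rather than as a pattern.

```shell
# Mark the first unchecked checklist item containing the given text as done.
# $1 = file holding the PR body, $2 = literal text to look for.
mark_done() {
  awk -v needle="$2" '
    !done && index($0, "- [ ] ") == 1 && index($0, needle) > 0 {
      sub(/^- \[ \] /, "- [x] ")
      done = 1
    }
    { print }
  ' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Example (not executed here), applied to the saved PR body:
#   mark_done /tmp/pr_body.md "Build passes"
```

Run it once per completed step, then push the file back with gh pr edit.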

Step 8: PR Review Agent (MANDATORY)

  1. Use a subagent if your IDE supports them (e.g., /pr-review-toolkit:review-pr), otherwise perform a self-review against the PR diff.
  2. Address findings by severity:
    • Critical/Severe: Must fix
    • High: Must fix
    • Medium: Should fix
    • Low: Optional
  3. Mark complete (if hooks are installed):
    hooks/state-manager.sh mark pr-review
    

Step 9: Wait for All CI Checks (MANDATORY -- DO NOT SKIP)

Execute in your terminal:

# Watch all checks until completion
gh pr checks {pr_number} --watch --interval 30

ALL checks must pass: Lint, Unit Tests, Build, Deploy Preview, E2E Tests.

If ANY check fails: analyze the logs, fix the problem, push, and watch again. DO NOT proceed until every check shows "pass".

Checks to Monitor

| Check | Description | Action if Failed |
| --- | --- | --- |
| CI / build-and-test | Build + unit tests | Fix code or update snapshots |
| Security Scan | SAST, npm audit | Fix security issues |
| Configured REVIEW_BOTS | Per-project bot reviewers (q, codex, claude, custom) | Address findings, retrigger via gh-as-user.sh |
| Other review bots | Various checks | Address findings, re |

Content truncated.
