Back to Skills

Subagent Driven Development

Use when executing implementation plans with independent tasks in the current session

agent
By Houseofmvps
10913Updated 1 day agoJavaScriptMIT

Skill Content

# Subagent-Driven Development

Execute plan by dispatching fresh subagent per task, with two-stage review after each: spec compliance review first, then code quality review.

**Why subagents:** You delegate tasks to specialized agents with isolated context. By precisely crafting their instructions and context, you ensure they stay focused and succeed at their task. They should never inherit your session's context or history — you construct exactly what they need. This also preserves your own context for coordination work.

**Core principle:** Fresh subagent per task + two-stage review (spec then quality) = high quality, fast iteration

## When to Use

```dot
digraph when_to_use {
    "Have implementation plan?" [shape=diamond];
    "Tasks mostly independent?" [shape=diamond];
    "Stay in this session?" [shape=diamond];
    "subagent-driven-development" [shape=box];
    "executing-plans" [shape=box];
    "Manual execution or brainstorm first" [shape=box];

    "Have implementation plan?" -> "Tasks mostly independent?" [label="yes"];
    "Have implementation plan?" -> "Manual execution or brainstorm first" [label="no"];
    "Tasks mostly independent?" -> "Stay in this session?" [label="yes"];
    "Tasks mostly independent?" -> "Manual execution or brainstorm first" [label="no - tightly coupled"];
    "Stay in this session?" -> "subagent-driven-development" [label="yes"];
    "Stay in this session?" -> "executing-plans" [label="no - parallel session"];
}
```

**vs. Executing Plans (parallel session):**
- Same session (no context switch)
- Fresh subagent per task (no context pollution)
- Two-stage review after each task: spec compliance first, then code quality
- Faster iteration (no human-in-loop between tasks)

## Task Sizing

Before dispatching, classify each task:

| Size | Criteria | Process |
|------|----------|---------|
| **Small** | 1-2 files, < 50 lines changed, clear spec | Implementer only → mark complete |
| **Medium** | 2-4 files, clear spec, some integration | Implementer → spec review → mark complete |
| **Large** | 4+ files, cross-cutting, architectural decisions | Implementer → spec review → code quality review → mark complete |

**Most tasks are Small or Medium.** Only Large tasks need the full 3-agent review cycle. Skipping unnecessary reviews for small tasks saves 2 subagent invocations per task.

## The Process

```dot
digraph process {
    rankdir=TB;

    subgraph cluster_per_task {
        label="Per Task";
        "Classify task size (Small/Medium/Large)" [shape=diamond];
        "Dispatch implementer subagent (./implementer-prompt.md)" [shape=box];
        "Implementer subagent asks questions?" [shape=diamond];
        "Answer questions, provide context" [shape=box];
        "Implementer subagent implements, tests, commits, self-reviews" [shape=box];
        "Mark task complete (Small)" [shape=box];
        "Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" [shape=box];
        "Spec reviewer subagent confirms code matches spec?" [shape=diamond];
        "Implementer subagent fixes spec gaps" [shape=box];
        "Mark task complete (Medium)" [shape=box];
        "Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" [shape=box];
        "Code quality reviewer subagent approves?" [shape=diamond];
        "Implementer subagent fixes quality issues" [shape=box];
        "Mark task complete (Large)" [shape=box];
    }

    "Read plan, extract all tasks with full text, note context, create TodoWrite" [shape=box];
    "More tasks remain?" [shape=diamond];
    "Dispatch final code reviewer subagent for entire implementation" [shape=box];
    "Use ultraship:finishing-a-development-branch" [shape=box style=filled fillcolor=lightgreen];

    "Read plan, extract all tasks with full text, note context, create TodoWrite" -> "Classify task size (Small/Medium/Large)";
    "Classify task size (Small/Medium/Large)" -> "Dispatch implementer subagent (./implementer-prompt.md)";
    "Dispatch implementer subagent (./implementer-prompt.md)" -> "Implementer subagent asks questions?";
    "Implementer subagent asks questions?" -> "Answer questions, provide context" [label="yes"];
    "Answer questions, provide context" -> "Dispatch implementer subagent (./implementer-prompt.md)";
    "Implementer subagent asks questions?" -> "Implementer subagent implements, tests, commits, self-reviews" [label="no"];
    "Implementer subagent implements, tests, commits, self-reviews" -> "Mark task complete (Small)" [label="Small"];
    "Implementer subagent implements, tests, commits, self-reviews" -> "Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" [label="Medium/Large"];
    "Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" -> "Spec reviewer subagent confirms code matches spec?";
    "Spec reviewer subagent confirms code matches spec?" -> "Implementer subagent fixes spec gaps" [label="no"];
    "Implementer subagent fixes spec gaps" -> "Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" [label="re-review"];
    "Spec reviewer subagent confirms code matches spec?" -> "Mark task complete (Medium)" [label="Medium"];
    "Spec reviewer subagent confirms code matches spec?" -> "Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" [label="Large"];
    "Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" -> "Code quality reviewer subagent approves?";
    "Code quality reviewer subagent approves?" -> "Implementer subagent fixes quality issues" [label="no"];
    "Implementer subagent fixes quality issues" -> "Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" [label="re-review"];
    "Code quality reviewer subagent approves?" -> "Mark task complete (Large)" [label="yes"];
    "Mark task complete (Small)" -> "More tasks remain?";
    "Mark task complete (Medium)" -> "More tasks remain?";
    "Mark task complete (Large)" -> "More tasks remain?";
    "More tasks remain?" -> "Classify task size (Small/Medium/Large)" [label="yes"];
    "More tasks remain?" -> "Dispatch final code reviewer subagent for entire implementation" [label="no"];
    "Dispatch final code reviewer subagent for entire implementation" -> "Use ultraship:finishing-a-development-branch";
}
```

## Model Selection

Use the least powerful model that can handle each role. This prevents timeouts and saves cost.

| Role | Model | Why |
|------|-------|-----|
| Implementer (Small task) | sonnet | Mechanical work with clear spec — sonnet is fast and accurate |
| Implementer (Medium/Large task) | opus | Cross-file coordination needs deep reasoning |
| Spec reviewer | sonnet | Checklist comparison — no deep reasoning needed |
| Code quality reviewer (Large only) | opus | Spotting subtle bugs, architecture issues, security flaws needs best judgment |
| Final reviewer | opus | Reviewing entire implementation requires holistic understanding |

**Rules:**
- Use opus for anything requiring judgment, strategy, or cross-file reasoning.
- Use sonnet for mechanical tasks with clear specs and tool-running.
- If a sonnet implementer reports BLOCKED, re-dispatch with opus.
- Never downgrade opus agents to sonnet — quality is non-negotiable.
- **Default to opus for implementers** — sonnet failures that the main agent has to fix cost more than opus would have.

## Handling Implementer Status

Implementer subagents report one of four statuses. Handle each appropriately:

**DONE:** Proceed to spec compliance review.

**DONE_WITH_CONCERNS:** The implementer completed the work but flagged doubts. Read the concerns before proceeding. If the concerns are about correctness or scope, address them before review. If they're observations (e.g., "this file is getting large"), note them and proceed to review.

**NEEDS_CONTEXT:** The implementer needs information that wasn't provided. Provide the missing context and re-dispatch.

**BLOCKED:** The implementer cannot complete the task. Assess the blocker:
1. If it's a context problem, provide more context and re-dispatch with the same model
2. If the task requires more reasoning, re-dispatch with a more capable model
3. If the task is too large, break it into smaller pieces
4. If the plan itself is wrong, escalate to the human

**Never** ignore an escalation or force the same model to retry without changes. If the implementer said it's stuck, something needs to change.

## Prompt Templates

- `./implementer-prompt.md` - Dispatch implementer subagent
- `./spec-reviewer-prompt.md` - Dispatch spec compliance reviewer subagent
- `./code-quality-reviewer-prompt.md` - Dispatch code quality reviewer subagent

## Example Workflow

```
You: I'm using Subagent-Driven Development to execute this plan.

[Read plan file once: docs/ultraship/plans/feature-plan.md]
[Extract all 5 tasks with full text and context]
[Create TodoWrite with all tasks]

Task 1: Hook installation script

[Get Task 1 text and context (already extracted)]
[Dispatch implementation subagent with full task text + context]

Implementer: "Before I begin - should the hook be installed at user or system level?"

You: "User level (~/.config/ultraship/hooks/)"

Implementer: "Got it. Implementing now..."
[Later] Implementer:
  - Implemented install-hook command
  - Added tests, 5/5 passing
  - Self-review: Found I missed --force flag, added it
  - Committed

[Dispatch spec compliance reviewer]
Spec reviewer: ✅ Spec compliant - all requirements met, nothing extra

[Get git SHAs, dispatch code quality reviewer]
Code reviewer: Strengths: Good test coverage, clean. Issues: None. Approved.

[Mark Task 1 complete]

Task 2: Recovery modes

[Get Task 2 text and context (already extracted)]
[Dispatch implementation subagent with full task text + context]

Implementer: [No questions, proceeds]
Implementer:
  - Added verify/repair modes
  - 8/8 tests passing
  - Self-review: All good
  - Committed

[Dispatch spec compliance reviewer]
Spec reviewer: ❌ Issues:
  - Missing: Progress reporting (spec says "report every 100 items")
  - Extra: Added --json flag (not requested)

[Implementer fixes issues]
Implementer: Removed --json flag, added progress reporting

[Spec reviewer reviews again]
Spec reviewer: ✅ Spec compliant now

[Dispatch code quality reviewer]
Code reviewer: Strengths: Solid. Issues (Important): Magic number (100)

[Implementer fixes]
Implementer: Extracted PROGRESS_INTERVAL constant

[Code reviewer reviews again]
Code reviewer: ✅ Approved

[Mark Task 2 complete]

...

[After all tasks]
[Dispatch final code-reviewer]
Final reviewer: All requirements met, ready to merge

Done!
```

## Advantages

**vs. Manual execution:**
- Subagents follow TDD naturally
- Fresh context per task (no confusion)
- Parallel-safe (subagents don't interfere)
- Subagent can ask questions (before AND during work)

**vs. Executing Plans:**
- Same session (no handoff)
- Continuous progress (no waiting)
- Review checkpoints automatic

**Efficiency gains:**
- No file reading overhead (controller provides full text)
- Controller curates exactly what context is needed
- Subagent gets complete information upfront
- Questions surfaced before work begins (not after)

**Quality gates:**
- Self-review catches issues before handoff
- Two-stage review: spec compliance, then code quality
- Review loops ensure fixes actually work
- Spec compliance prevents over/under-building
- Code quality ensures implementation is well-built

**Cost:**
- More subagent invocations (implementer + 2 reviewers per task)
- Controller does more prep work (extracting all tasks upfront)
- Review loops add iterations
- But catches issues early (cheaper than debugging later)

## Red Flags

**Never:**
- Start implementation on main/master branch without explicit user consent
- Skip reviews (spec compliance OR code quality)
- Proceed with unfixed issues
- Dispatch multiple implementation subagents in parallel (conflicts)
- Make subagent read plan file (provide full text instead)
- Skip scene-setting context (subagent needs to understand where task fits)
- Ignore subagent questions (answer before letting them proceed)
- Accept "close enough" on spec compliance (spec reviewer found issues = not done)
- Skip review loops (reviewer found issues = implementer fixes = review again)
- Let implementer self-review replace actual review (both are needed)
- **Start code quality review before spec compliance is ✅** (wrong order)
- Move to next task while either review has open issues

**If subagent asks questions:**
- Answer clearly and completely
- Provide additional context if needed
- Don't rush them into implementation

**If reviewer finds issues:**
- Implementer (same subagent) fixes them
- Reviewer reviews again
- Repeat until approved
- Don't skip the re-review

**If subagent fails task:**
- Dispatch fix subagent with specific instructions
- Don't try to fix manually (context pollution)

## Integration

**Required workflow skills:**
- **ultraship:using-git-worktrees** - REQUIRED: Set up isolated workspace before starting
- **ultraship:writing-plans** - Creates the plan this skill executes
- **ultraship:requesting-code-review** - Code review template for reviewer subagents
- **ultraship:finishing-a-development-branch** - Complete development after all tasks

**Subagents should use:**
- **ultraship:test-driven-development** - Subagents follow TDD for each task

**Alternative workflow:**
- **ultraship:executing-plans** - Use for parallel session instead of same-session execution

How to use

  1. Copy the skill content above
  2. Create a .claude/skills directory in your project
  3. Save as .claude/skills/ultraship-subagent-driven-development.md
  4. Use /ultraship-subagent-driven-development in Claude Code to invoke this skill
<div align="center"> <img src="assets/hero-banner.jpg" alt="Ultraship — Claude Code Plugin" width="100%"/>

Claude Code plugin. 43 expert-level skills for building, shipping, and scaling production software. 37 audit tools (accessibility, vibe-coding security, AI evals, pentest, code quality, bundle size, SEO + AI Readiness check) plus a blocking ship-gate close the loop before deploy. A built-in Currency Guard keeps Claude on current docs, not stale training data.

npm version npm downloads npm total GitHub stars License: MIT CI Sponsor


Follow @kaileskkhumar LinkedIn houseofmvps.com kailxlabs.co

Built by Kaileskkhumar, founder of HouseofMVPs and Kailxlabs

</div>
0 dependencies · 274 tests · Node.js ESM · MIT

Install

# Claude Code plugin
claude plugin marketplace add Houseofmvps/ultraship
claude plugin install ultraship

# Or standalone via npx
npx ultraship ship .
npx ultraship seo .
npx ultraship security .

How It Works

flowchart LR
    U["You type a<br/>slash command"] --> S["Skill<br/>(markdown instructions)"]
    S --> A["Agent<br/>(dispatched worker)"]
    S --> T["Tools<br/>(Node.js scripts)"]
    A --> T
    T --> O["JSON Results"]
    O --> R["Scorecard / Report /<br/>Actionable Fixes"]

    style U fill:#f59e0b,stroke:#d97706,color:#000
    style S fill:#8b5cf6,stroke:#7c3aed,color:#fff
    style A fill:#3b82f6,stroke:#2563eb,color:#fff
    style T fill:#10b981,stroke:#059669,color:#000
    style R fill:#ef4444,stroke:#dc2626,color:#fff
flowchart TD
    subgraph Lifecycle["Full Lifecycle Coverage"]
        direction LR
        I["Idea<br/>/brainstorm"] --> B["Build<br/>/sprint"]
        B --> AU["Audit<br/>/ship /seo /secure"]
        AU --> D["Ship<br/>/deploy"]
        D --> L["Launch<br/>/launch /compete"]
        L --> G["Grow<br/>/grow /cost"]
        G --> RE["Rescue<br/>/rescue /canary"]
    end

    style I fill:#8b5cf6,stroke:#7c3aed,color:#fff
    style B fill:#3b82f6,stroke:#2563eb,color:#fff
    style AU fill:#f59e0b,stroke:#d97706,color:#000
    style D fill:#10b981,stroke:#059669,color:#000
    style L fill:#06b6d4,stroke:#0891b2,color:#000
    style G fill:#84cc16,stroke:#65a30d,color:#000
    style RE fill:#ef4444,stroke:#dc2626,color:#fff

What /ship Does

/ship runs 6 tools in parallel and outputs a scorecard:

flowchart LR
    SHIP["/ship"] --> SEO["seo-scanner<br/>63 rules"]
    SHIP --> A11Y["a11y-scanner<br/>WCAG 2.2"]
    SHIP --> SEC["secret-scanner<br/>+ npm audit"]
    SHIP --> CODE["code-profiler<br/>N+1, leaks, ReDoS"]
    SHIP --> BUNDLE["bundle-tracker<br/>JS/CSS/images"]
    SHIP --> ENV["env-validator<br/>+ migration-checker"]

    SEO --> SC["Scorecard<br/>READY TO SHIP"]
    A11Y --> SC
    SEC --> SC
    CODE --> SC
    BUNDLE --> SC
    ENV --> SC

    style SHIP fill:#f59e0b,stroke:#d97706,color:#000
    style SC fill:#10b981,stroke:#059669,color:#000
    style SEO fill:#3b82f6,stroke:#2563eb,color:#fff
    style SEC fill:#3b82f6,stroke:#2563eb,color:#fff
    style CODE fill:#3b82f6,stroke:#2563eb,color:#fff
    style BUNDLE fill:#3b82f6,stroke:#2563eb,color:#fff
    style ENV fill:#3b82f6,stroke:#2563eb,color:#fff
+===========================================+
|      U L T R A S H I P   S C O R E       |
+===========================================+
|  SEO + AI Vis.  92/100  ############-    |
|  Security        95/100  ############-    |
|  Code Quality    88/100  ###########--    |
|  Bundle Size     97/100  ############-    |
+===========================================+
|   OVERALL         90/100                  |
|   STATUS          READY TO SHIP           |
+===========================================+
<details> <summary>Demo</summary> <img src="assets/demo.gif" alt="Ultraship — SEO audit, secret scanning, scorecard" width="100%"/> </details>

Tools (40)

Each tool is a standalone Node.js script (node tools/<name>.mjs). JSON output. Exit 0 always. No build step.

Auditing

ToolWhat it checks
seo-scanner63 rules: 39 SEO (meta tags, canonicals, headings, OG tags, structured data, sitemap, cross-page duplicate/orphan detection), 20 GEO (AI bot access in robots.txt, snippet restrictions, llms.txt, structured data for AI extraction), 4 AEO (FAQPage/HowTo/speakable schema)
a11y-scannerWCAG 2.2 A/AA static checks: missing alt text, unlabeled form controls, icon-only buttons, missing lang/title/main, heading order, positive tabindex, zoom disabled, duplicate ids, broken aria references. Zero false positives.
ship-gateBlocking quality gate — scores all auditors (shared math with /ship), compares to .ultraship/ship-gate.json thresholds, hard-fails on leaked secrets / critical findings, exits 1 on fail. Generates a pre-push hook + GitHub Actions workflow.
secret-scannerAWS keys, Stripe keys, JWT secrets, database URLs, private keys. Redacts values in output.
vibe-security-scannerVibe-Coding Security Sentinel — context secret-scanner misses: server-only secrets behind a NEXT_PUBLIC_/VITE_ prefix, a decoded Supabase service_role key exposed to the client, service_role in a "use client" file, Supabase tables with no RLS. Zero false positives.
eval-scannerLocates every LLM call site (Anthropic, OpenAI, Gemini, Mistral, Cohere, Ollama, Vercel AI SDK, LangChain) by provider + model id, detects the test runner and whether an eval suite exists. Flags AI features shipping with no evals. Seeds /evals. Zero false positives.
code-profilerN+1 queries, sync I/O in handlers, unbounded queries, missing indexes, memory leaks, sequential awaits, ReDoS risk
bundle-trackerJS/CSS/image sizes in build output. Detects heavy deps (momentdayjs, lodash→native). History for before/after. Monorepo-aware.
dep-doctorUnused dependencies via import graph analysis (not just grep). Dead wrapper files. Outdated packages.
content-scorerFlesch-Kincaid readability, keyword density, thin content detection, GEO heading analysis
lighthouse-runnerLighthouse via headless Chrome. Core Web Vitals, render-blocking resources, diagnostics.

Validation

ToolWhat it checks
health-checkHTTP status, response time, SSL certificate (issuer, expiry), 6 security headers
env-validatorCompares .env.example against actual .env. Catches missing/empty/placeholder vars.
migration-checkerPending DB migrations for Drizzle, Prisma, Knex
og-validatorOpen Graph tags, image reachability, size validation
redirect-checkerRedirect chains, loops, mixed HTTP/HTTPS. Sitemap-based bulk check.
api-smoke-testHit API endpoints, check status codes, response times, CORS headers

Generators

ToolWhat it creates
sitemap-generatorsitemap.xml from HTML files and routes
robots-generatorAI-friendly robots.txt (allows GPTBot, PerplexityBot, ClaudeBot)
llms-txt-generatorllms.txt for AI assistant discoverability
structured-data-generatorJSON-LD schema markup

Competitive & Launch

ToolWhat it does
compete-analyzerCompares two URLs: tech stack, SEO score, security headers, response time. ASCII comparison card.
launch-prepReads project, generates PH/Twitter/LinkedIn/HN copy, 14-item checklist, press kit
demo-prepFinds console.logs, TODOs, placeholder text, missing favicons. Scores demo readiness.

Operations

ToolWhat it does
incident-commanderHealth check + git culprit analysis + error patterns + rollback commands + post-mortem template
growth-trackerUptime, git velocity, SEO trajectory, dep health. Stores snapshots for week-over-week comparison.
cost-trackerLog AI token usage per feature/model. Built-in pricing for Claude, GPT-4o, Gemini. Daily trends.
pentest-scannerAutomated penetration testing: XSS, SQLi, SSTI, command injection, path traversal, CORS, JWT, GraphQL introspection, prototype pollution, race conditions, request smuggling. Zero false positives, every finding has proof-of-concept.
canary-monitorPost-deploy canary monitoring: HTTP status, response time, error patterns, baseline regression detection. Auto-saves baselines for future comparison.
retro-analyzerSprint retrospective: git velocity, commit patterns (features vs fixes), test health, hot files, shipping cadence. Generates insights and recommendations.
learnings-managerProject learnings CRUD: save, search, list, prune, export. Structured knowledge that compounds across sessions.

Project Analysis

ToolWhat it does
onboard-generatorAuto-generates developer guide: stack, directory tree, routes, schema, env vars, Mermaid diagram
architecture-mapper4 Mermaid diagrams: system overview, route tree, DB ER, data flow. Circular dependency + orphan detection.
pattern-analyzerAnalyzes testing, error handling, TypeScript usage, CI/CD, git practices. Cross-repo comparison.
audit-historySaves/compares audit scores over time

Integrations (optional)

ToolWhat it does
gsc-clientGoogle Search Console: submit sitemaps, inspect URLs, query rankings (requires ULTRASHIP_GSC_CREDENTIALS)
bing-webmasterBing Webmaster: submit sitemaps/URLs, IndexNow instant push, keyword research, backlinks, site-scan, URL inspection (requires ULTRASHIP_BING_KEY). Powers ChatGPT Search + Microsoft Copilot.
ga4-clientGoogle Analytics 4: overview, top-pages, landing-pages, traffic-sources, conversions, user-journey, devices, realtime, ai-traffic (ChatGPT/Perplexity/Copilot tracking), organic (search-only). --organic flag.
keyword-intelligence12-command keyword engine: analyze, quick-wins, cannibalization, content-gaps, intent-map, trending, high-intent, page-keywords, content-decay, difficulty, anomalies (CTR anomalies), cross-reference (GSC↔GA4). --brand flag for non-brand filtering.
index-doctorIndex diagnosis: inspect URLs via GSC URL Inspection API, diagnose 15+ coverage states, auto-fix and submit to Bing.

View source on GitHub