
Verification Before Completion

Use when about to claim that work is complete, fixed, or passing, and before committing or creating PRs. Requires running verification commands and confirming their output before making any success claim: evidence before assertions, always.

ai
By Houseofmvps
10913 · Updated 1 day ago · JavaScript · MIT

Skill Content

# Verification Before Completion

## Overview

Claiming work is complete without verification is dishonesty, not efficiency.

**Core principle:** Evidence before claims, always.

**Violating the letter of this rule is violating the spirit of this rule.**

## The Iron Law

```
NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
```

If you haven't run the verification command in this message, you cannot claim it passes.

## The Gate Function

```
BEFORE claiming any status or expressing satisfaction:

1. IDENTIFY: What command proves this claim?
2. RUN: Execute the FULL command (fresh, complete)
3. READ: Full output, check exit code, count failures
4. VERIFY: Does output confirm the claim?
   - If NO: State actual status with evidence
   - If YES: State claim WITH evidence
5. ONLY THEN: Make the claim

Skip any step = lying, not verifying
```
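
The five steps above can be sketched as a small function. This is a minimal illustration, not part of the skill: `gate`, its result shape, and the failure-count pattern are hypothetical, and `run` stands for whatever actually executes the command in your environment and returns its exit code and output.

```javascript
// Hypothetical helper illustrating the gate; names and shapes are assumptions.
// `run` is any function that executes a command and returns { exitCode, output }.
function gate(claim, command, run) {
  // Steps 1-2 (IDENTIFY, RUN): execute the full command now; never reuse an old result.
  const { exitCode, output } = run(command);

  // Step 3 (READ): check the exit code and count failures reported in the output.
  // Assumed output style: "<n> failed". Adjust the pattern to your test runner.
  const match = /(\d+)\s+fail/i.exec(output);
  const failures = match ? Number(match[1]) : 0;

  // Step 4 (VERIFY): the claim holds only on exit 0 with zero reported failures.
  const verified = exitCode === 0 && failures === 0;

  // Step 5 (ONLY THEN): state the claim or the actual status, always with evidence.
  return {
    verified,
    statement: verified ? claim : `NOT verified: ${claim}`,
    evidence: { command, exitCode, failures, output },
  };
}
```

Passing the runner in as an argument keeps the sketch independent of any particular shell or test framework. The point is that the statement is derived from the fresh result and never written before it.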

## Common Failures

| Claim | Requires | Not Sufficient |
|-------|----------|----------------|
| Tests pass | Test command output: 0 failures | Previous run, "should pass" |
| Linter clean | Linter output: 0 errors | Partial check, extrapolation |
| Build succeeds | Build command: exit 0 | Linter passing, logs look good |
| Bug fixed | Test original symptom: passes | Code changed, assumed fixed |
| Regression test works | Red-green cycle verified | Test passes once |
| Agent completed | VCS diff shows changes | Agent reports "success" |
| Requirements met | Line-by-line checklist | Tests passing |

## Red Flags - STOP

- Using "should", "probably", "seems to"
- Expressing satisfaction before verification ("Great!", "Perfect!", "Done!", etc.)
- About to commit/push/PR without verification
- Trusting agent success reports
- Relying on partial verification
- Thinking "just this once"
- Being tired and wanting the work to be over
- **ANY wording implying success without having run verification**
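
As a rough illustration of the wording checks in this list, a draft message can be scanned for red-flag phrases before it is sent. The pattern list and the `findRedFlags` name are assumptions made for this sketch. The list covers only the literal phrases quoted above, so a clean result does not mean the message is safe to send.

```javascript
// Hypothetical pattern list built from the phrases quoted in the list above.
const RED_FLAG_PATTERNS = [
  /\bshould\b/i,
  /\bprobably\b/i,
  /\bseems to\b/i,
  /\b(great|perfect|done)!/i, // satisfaction expressed before verification
];

// Returns the red-flag phrases found in a draft message, lowercased.
function findRedFlags(message) {
  const hits = [];
  for (const pattern of RED_FLAG_PATTERNS) {
    const match = pattern.exec(message);
    if (match) hits.push(match[0].toLowerCase());
  }
  return hits;
}

console.log(findRedFlags("Great! Should pass now.")); // [ 'should', 'great!' ]
console.log(findRedFlags("Ran npm test: 34/34 pass, exit 0.")); // []
```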

## Rationalization Prevention

| Excuse | Reality |
|--------|---------|
| "Should work now" | RUN the verification |
| "I'm confident" | Confidence ≠ evidence |
| "Just this once" | No exceptions |
| "Linter passed" | Linter ≠ compiler |
| "Agent said success" | Verify independently |
| "I'm tired" | Exhaustion ≠ excuse |
| "Partial check is enough" | Partial proves nothing |
| "Different words so rule doesn't apply" | Spirit over letter |

## Key Patterns

**Tests:**
```
✅ [Run test command] [See: 34/34 pass] "All tests pass"
❌ "Should pass now" / "Looks correct"
```

**Regression tests (TDD Red-Green):**
```
✅ Write → Run (pass) → Revert fix → Run (MUST FAIL) → Restore → Run (pass)
❌ "I've written a regression test" (without red-green verification)
```
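
The red-green cycle can be expressed as a short check. This is a sketch under stated assumptions: `verifyRegressionTest` and its three callbacks are hypothetical names, `runTest` is assumed to return `true` when the regression test passes, and `revertFix` / `restoreFix` are assumed to undo and reapply the fix (for example by stashing and restoring the change).

```javascript
// Hypothetical sketch of the red-green cycle; all names are illustrative.
function verifyRegressionTest({ runTest, revertFix, restoreFix }) {
  const passedWithFix = runTest(); // green: must pass with the fix in place

  revertFix();
  let passedWithoutFix;
  try {
    passedWithoutFix = runTest(); // red: MUST FAIL once the fix is gone
  } finally {
    restoreFix(); // always put the fix back, even if the run throws
  }

  const passedAfterRestore = runTest(); // green again after restoring

  return {
    valid: passedWithFix && !passedWithoutFix && passedAfterRestore,
    passedWithFix,
    failedWithoutFix: !passedWithoutFix,
    passedAfterRestore,
  };
}
```

A test that still passes after the fix is reverted yields `valid: false`, because it does not exercise the bug it is meant to guard against.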

**Build:**
```
✅ [Run build] [See: exit 0] "Build passes"
❌ "Linter passed" (linter doesn't check compilation)
```

**Requirements:**
```
✅ Re-read plan → Create checklist → Verify each → Report gaps or completion
❌ "Tests pass, phase complete"
```
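
One possible shape for the checklist step is shown below. The `checkRequirements` function and the `id` / `text` fields are hypothetical. The sketch assumes each requirement from the plan is matched against a piece of recorded evidence, and that anything without evidence is reported as a gap.

```javascript
// Hypothetical sketch: requirements come from the plan, evidence from verification runs.
function checkRequirements(requirements, evidenceById) {
  const results = requirements.map((req) => ({
    id: req.id,
    text: req.text,
    evidence: evidenceById[req.id] ?? null, // null means nothing proves this item yet
  }));
  const gaps = results.filter((r) => r.evidence === null).map((r) => r.id);
  return { complete: gaps.length === 0, gaps, results };
}

const report = checkRequirements(
  [
    { id: "R1", text: "Login form validates email" },
    { id: "R2", text: "Errors are shown inline" },
  ],
  { R1: "test login.validates-email passed" }
);
console.log(report.complete, report.gaps); // false [ 'R2' ]
```

Passing tests can serve as evidence for individual items, but completion is reported only when every item has evidence.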

**Agent delegation:**
```
✅ Agent reports success → Check VCS diff → Verify changes → Report actual state
❌ Trust agent report
```
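
The VCS check can be sketched by parsing the output of `git status --porcelain`, where each line is a two-character status, a space, and a path. The function names below are hypothetical, and the sketch assumes the agent's work is still uncommitted. If the agent committed its changes, compare against the previous commit with `git diff` instead.

```javascript
// Extracts changed paths from `git status --porcelain` output ("XY path" per line).
// The output is deliberately not trimmed: a leading space is part of the status column.
function changedPaths(porcelainOutput) {
  return porcelainOutput
    .split("\n")
    .filter((line) => line.length > 3)
    .map((line) => line.slice(3));
}

// Hypothetical helper: the agent's report is only a hint; the diff is the evidence.
function assessAgentReport(agentSaysSuccess, porcelainOutput) {
  const paths = changedPaths(porcelainOutput);
  if (paths.length === 0) {
    return { state: "no changes found", needsReview: false, paths };
  }
  return {
    state: agentSaysSuccess
      ? "changes present: review and verify before claiming success"
      : "changes present: agent reported failure",
    needsReview: true,
    paths,
  };
}
```

Even when changes are present, the result is never reported as success directly. The changed files still have to be reviewed and verified with the relevant commands.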

## Why This Matters

From 24 failure memories:
- Your human partner said "I don't believe you": trust was broken
- Undefined functions shipped that would have crashed
- Missing requirements shipped, leaving features incomplete
- Time wasted on false completion → redirect → rework
- Violates: "Honesty is a core value. If you lie, you'll be replaced."

## When To Apply

**ALWAYS before:**
- ANY variation of success/completion claims
- ANY expression of satisfaction
- ANY positive statement about work state
- Committing, PR creation, task completion
- Moving to next task
- Delegating to agents

**Rule applies to:**
- Exact phrases
- Paraphrases and synonyms
- Implications of success
- ANY communication suggesting completion/correctness

## The Bottom Line

**No shortcuts for verification.**

Run the command. Read the output. THEN claim the result.

This is non-negotiable.

How to use

  1. Copy the skill content above
  2. Create a .claude/skills directory in your project
  3. Save as .claude/skills/ultraship-verification-before-completion.md
  4. Use /ultraship-verification-before-completion in Claude Code to invoke this skill
<div align="center"> <img src="assets/hero-banner.jpg" alt="Ultraship — Claude Code Plugin" width="100%"/>

Claude Code plugin. 43 expert-level skills for building, shipping, and scaling production software. 37 audit tools (accessibility, vibe-coding security, AI evals, pentest, code quality, bundle size, SEO + AI Readiness check) plus a blocking ship-gate close the loop before deploy. A built-in Currency Guard keeps Claude on current docs, not stale training data.



Follow @kaileskkhumar LinkedIn houseofmvps.com kailxlabs.co

Built by Kaileskkhumar, founder of HouseofMVPs and Kailxlabs

</div>
0 dependencies · 274 tests · Node.js ESM · MIT

Install

# Claude Code plugin
claude plugin marketplace add Houseofmvps/ultraship
claude plugin install ultraship

# Or standalone via npx
npx ultraship ship .
npx ultraship seo .
npx ultraship security .

How It Works

flowchart LR
    U["You type a<br/>slash command"] --> S["Skill<br/>(markdown instructions)"]
    S --> A["Agent<br/>(dispatched worker)"]
    S --> T["Tools<br/>(Node.js scripts)"]
    A --> T
    T --> O["JSON Results"]
    O --> R["Scorecard / Report /<br/>Actionable Fixes"]

    style U fill:#f59e0b,stroke:#d97706,color:#000
    style S fill:#8b5cf6,stroke:#7c3aed,color:#fff
    style A fill:#3b82f6,stroke:#2563eb,color:#fff
    style T fill:#10b981,stroke:#059669,color:#000
    style R fill:#ef4444,stroke:#dc2626,color:#fff

flowchart TD
    subgraph Lifecycle["Full Lifecycle Coverage"]
        direction LR
        I["Idea<br/>/brainstorm"] --> B["Build<br/>/sprint"]
        B --> AU["Audit<br/>/ship /seo /secure"]
        AU --> D["Ship<br/>/deploy"]
        D --> L["Launch<br/>/launch /compete"]
        L --> G["Grow<br/>/grow /cost"]
        G --> RE["Rescue<br/>/rescue /canary"]
    end

    style I fill:#8b5cf6,stroke:#7c3aed,color:#fff
    style B fill:#3b82f6,stroke:#2563eb,color:#fff
    style AU fill:#f59e0b,stroke:#d97706,color:#000
    style D fill:#10b981,stroke:#059669,color:#000
    style L fill:#06b6d4,stroke:#0891b2,color:#000
    style G fill:#84cc16,stroke:#65a30d,color:#000
    style RE fill:#ef4444,stroke:#dc2626,color:#fff

What /ship Does

/ship runs 6 tools in parallel and outputs a scorecard:

flowchart LR
    SHIP["/ship"] --> SEO["seo-scanner<br/>63 rules"]
    SHIP --> A11Y["a11y-scanner<br/>WCAG 2.2"]
    SHIP --> SEC["secret-scanner<br/>+ npm audit"]
    SHIP --> CODE["code-profiler<br/>N+1, leaks, ReDoS"]
    SHIP --> BUNDLE["bundle-tracker<br/>JS/CSS/images"]
    SHIP --> ENV["env-validator<br/>+ migration-checker"]

    SEO --> SC["Scorecard<br/>READY TO SHIP"]
    A11Y --> SC
    SEC --> SC
    CODE --> SC
    BUNDLE --> SC
    ENV --> SC

    style SHIP fill:#f59e0b,stroke:#d97706,color:#000
    style SC fill:#10b981,stroke:#059669,color:#000
    style SEO fill:#3b82f6,stroke:#2563eb,color:#fff
    style SEC fill:#3b82f6,stroke:#2563eb,color:#fff
    style CODE fill:#3b82f6,stroke:#2563eb,color:#fff
    style BUNDLE fill:#3b82f6,stroke:#2563eb,color:#fff
    style ENV fill:#3b82f6,stroke:#2563eb,color:#fff

+===========================================+
|      U L T R A S H I P   S C O R E       |
+===========================================+
|  SEO + AI Vis.  92/100  ############-    |
|  Security        95/100  ############-    |
|  Code Quality    88/100  ###########--    |
|  Bundle Size     97/100  ############-    |
+===========================================+
|   OVERALL         90/100                  |
|   STATUS          READY TO SHIP           |
+===========================================+
<details> <summary>Demo</summary> <img src="assets/demo.gif" alt="Ultraship — SEO audit, secret scanning, scorecard" width="100%"/> </details>

Tools (40)

Each tool is a standalone Node.js script (node tools/<name>.mjs) that outputs JSON and needs no build step. Tools exit 0 regardless of findings; the exception is ship-gate, which exits 1 when the gate fails.

Auditing

| Tool | What it checks |
|------|----------------|
| seo-scanner | 63 rules: 39 SEO (meta tags, canonicals, headings, OG tags, structured data, sitemap, cross-page duplicate/orphan detection), 20 GEO (AI bot access in robots.txt, snippet restrictions, llms.txt, structured data for AI extraction), 4 AEO (FAQPage/HowTo/speakable schema) |
| a11y-scanner | WCAG 2.2 A/AA static checks: missing alt text, unlabeled form controls, icon-only buttons, missing lang/title/main, heading order, positive tabindex, zoom disabled, duplicate ids, broken aria references. Zero false positives. |
| ship-gate | Blocking quality gate: scores all auditors (shared math with /ship), compares to .ultraship/ship-gate.json thresholds, hard-fails on leaked secrets / critical findings, exits 1 on fail. Generates a pre-push hook + GitHub Actions workflow. |
| secret-scanner | AWS keys, Stripe keys, JWT secrets, database URLs, private keys. Redacts values in output. |
| vibe-security-scanner | Vibe-Coding Security Sentinel. Catches context that secret-scanner misses: server-only secrets behind a NEXT_PUBLIC_/VITE_ prefix, a decoded Supabase service_role key exposed to the client, service_role in a "use client" file, Supabase tables with no RLS. Zero false positives. |
| eval-scanner | Locates every LLM call site (Anthropic, OpenAI, Gemini, Mistral, Cohere, Ollama, Vercel AI SDK, LangChain) by provider + model id, detects the test runner and whether an eval suite exists. Flags AI features shipping with no evals. Seeds /evals. Zero false positives. |
| code-profiler | N+1 queries, sync I/O in handlers, unbounded queries, missing indexes, memory leaks, sequential awaits, ReDoS risk |
| bundle-tracker | JS/CSS/image sizes in build output. Detects heavy deps (moment→dayjs, lodash→native). History for before/after. Monorepo-aware. |
| dep-doctor | Unused dependencies via import graph analysis (not just grep). Dead wrapper files. Outdated packages. |
| content-scorer | Flesch-Kincaid readability, keyword density, thin content detection, GEO heading analysis |
| lighthouse-runner | Lighthouse via headless Chrome. Core Web Vitals, render-blocking resources, diagnostics. |

Validation

| Tool | What it checks |
|------|----------------|
| health-check | HTTP status, response time, SSL certificate (issuer, expiry), 6 security headers |
| env-validator | Compares .env.example against actual .env. Catches missing/empty/placeholder vars. |
| migration-checker | Pending DB migrations for Drizzle, Prisma, Knex |
| og-validator | Open Graph tags, image reachability, size validation |
| redirect-checker | Redirect chains, loops, mixed HTTP/HTTPS. Sitemap-based bulk check. |
| api-smoke-test | Hit API endpoints, check status codes, response times, CORS headers |
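
To illustrate the kind of comparison env-validator is described as doing, here is a minimal sketch of checking a .env file against .env.example. It is not the tool's source. `parseEnv`, `compareEnv`, and the placeholder pattern are assumptions made for this example, and the parser handles only simple KEY=VALUE lines.

```javascript
// Illustrative only: not the env-validator implementation.
// Parses simple KEY=VALUE lines, skipping blanks and # comments.
function parseEnv(text) {
  const vars = new Map();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not an assignment
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    vars.set(key, value);
  }
  return vars;
}

// Assumed placeholder markers; adjust to your project's conventions.
const PLACEHOLDER = /^(changeme|todo|xxx+|your[-_].*|<.*>)$/i;

// Reports every key from the example file that is missing, empty, or a placeholder.
function compareEnv(exampleText, actualText) {
  const example = parseEnv(exampleText);
  const actual = parseEnv(actualText);
  const issues = [];
  for (const key of example.keys()) {
    if (!actual.has(key)) issues.push({ key, problem: "missing" });
    else if (actual.get(key) === "") issues.push({ key, problem: "empty" });
    else if (PLACEHOLDER.test(actual.get(key))) issues.push({ key, problem: "placeholder" });
  }
  return issues;
}
```

Driving the check from the keys in the example file means that extra variables in the real .env are ignored, and only variables the project declares as required are reported.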

Generators

| Tool | What it creates |
|------|-----------------|
| sitemap-generator | sitemap.xml from HTML files and routes |
| robots-generator | AI-friendly robots.txt (allows GPTBot, PerplexityBot, ClaudeBot) |
| llms-txt-generator | llms.txt for AI assistant discoverability |
| structured-data-generator | JSON-LD schema markup |

Competitive & Launch

| Tool | What it does |
|------|--------------|
| compete-analyzer | Compares two URLs: tech stack, SEO score, security headers, response time. ASCII comparison card. |
| launch-prep | Reads project, generates PH/Twitter/LinkedIn/HN copy, 14-item checklist, press kit |
| demo-prep | Finds console.logs, TODOs, placeholder text, missing favicons. Scores demo readiness. |

Operations

| Tool | What it does |
|------|--------------|
| incident-commander | Health check + git culprit analysis + error patterns + rollback commands + post-mortem template |
| growth-tracker | Uptime, git velocity, SEO trajectory, dep health. Stores snapshots for week-over-week comparison. |
| cost-tracker | Log AI token usage per feature/model. Built-in pricing for Claude, GPT-4o, Gemini. Daily trends. |
| pentest-scanner | Automated penetration testing: XSS, SQLi, SSTI, command injection, path traversal, CORS, JWT, GraphQL introspection, prototype pollution, race conditions, request smuggling. Zero false positives, every finding has proof-of-concept. |
| canary-monitor | Post-deploy canary monitoring: HTTP status, response time, error patterns, baseline regression detection. Auto-saves baselines for future comparison. |
| retro-analyzer | Sprint retrospective: git velocity, commit patterns (features vs fixes), test health, hot files, shipping cadence. Generates insights and recommendations. |
| learnings-manager | Project learnings CRUD: save, search, list, prune, export. Structured knowledge that compounds across sessions. |

Project Analysis

| Tool | What it does |
|------|--------------|
| onboard-generator | Auto-generates developer guide: stack, directory tree, routes, schema, env vars, Mermaid diagram |
| architecture-mapper | 4 Mermaid diagrams: system overview, route tree, DB ER, data flow. Circular dependency + orphan detection. |
| pattern-analyzer | Analyzes testing, error handling, TypeScript usage, CI/CD, git practices. Cross-repo comparison. |
| audit-history | Saves/compares audit scores over time |

Integrations (optional)

| Tool | What it does |
|------|--------------|
| gsc-client | Google Search Console: submit sitemaps, inspect URLs, query rankings (requires ULTRASHIP_GSC_CREDENTIALS) |
| bing-webmaster | Bing Webmaster: submit sitemaps/URLs, IndexNow instant push, keyword research, backlinks, site-scan, URL inspection (requires ULTRASHIP_BING_KEY). Powers ChatGPT Search + Microsoft Copilot. |
| ga4-client | Google Analytics 4: overview, top-pages, landing-pages, traffic-sources, conversions, user-journey, devices, realtime, ai-traffic (ChatGPT/Perplexity/Copilot tracking), organic (search-only). --organic flag. |
| keyword-intelligence | 12-command keyword engine: analyze, quick-wins, cannibalization, content-gaps, intent-map, trending, high-intent, page-keywords, content-decay, difficulty, anomalies (CTR anomalies), cross-reference (GSC↔GA4). --brand flag for non-brand filtering. |
| index-doctor | Index diagnosis: inspect URLs via GSC URL Inspection API, diagnose 15+ coverage states, auto-fix and submit to Bing. |
