Name: Evals
Author: Houseofmvps

Claude Code plugin. 43 expert-level skills for building, shipping, and scaling production software. 37 audit tools (accessibility, vibe-coding security, AI evals, pentest, code quality, bundle size, SEO + AI Readiness check) plus a blocking ship-gate close the loop before deploy. A built-in Currency Guard keeps Claude on current docs, not stale training data.

Built by Kaileskkhumar, founder of HouseofMVPs and Kailxlabs.


0 dependencies · 274 tests · Node.js ESM · MIT

Install

```bash
# Claude Code plugin
claude plugin marketplace add Houseofmvps/ultraship
claude plugin install ultraship

# Or standalone via npx
npx ultraship ship .
npx ultraship seo .
npx ultraship security .
```

How It Works

```mermaid
flowchart LR
    U["You type a<br/>slash command"] --> S["Skill<br/>(markdown instructions)"]
    S --> A["Agent<br/>(dispatched worker)"]
    S --> T["Tools<br/>(Node.js scripts)"]
    A --> T
    T --> O["JSON Results"]
    O --> R["Scorecard / Report /<br/>Actionable Fixes"]

    style U fill:#f59e0b,stroke:#d97706,color:#000
    style S fill:#8b5cf6,stroke:#7c3aed,color:#fff
    style A fill:#3b82f6,stroke:#2563eb,color:#fff
    style T fill:#10b981,stroke:#059669,color:#000
    style R fill:#ef4444,stroke:#dc2626,color:#fff
```

```mermaid
flowchart TD
    subgraph Lifecycle["Full Lifecycle Coverage"]
        direction LR
        I["Idea<br/>/brainstorm"] --> B["Build<br/>/sprint"]
        B --> AU["Audit<br/>/ship /seo /secure"]
        AU --> D["Ship<br/>/deploy"]
        D --> L["Launch<br/>/launch /compete"]
        L --> G["Grow<br/>/grow /cost"]
        G --> RE["Rescue<br/>/rescue /canary"]
    end

    style I fill:#8b5cf6,stroke:#7c3aed,color:#fff
    style B fill:#3b82f6,stroke:#2563eb,color:#fff
    style AU fill:#f59e0b,stroke:#d97706,color:#000
    style D fill:#10b981,stroke:#059669,color:#000
    style L fill:#06b6d4,stroke:#0891b2,color:#000
    style G fill:#84cc16,stroke:#65a30d,color:#000
    style RE fill:#ef4444,stroke:#dc2626,color:#fff
```

What `/ship` Does

`/ship` runs 6 tools in parallel and prints a scorecard:

```mermaid
flowchart LR
    SHIP["/ship"] --> SEO["seo-scanner<br/>63 rules"]
    SHIP --> A11Y["a11y-scanner<br/>WCAG 2.2"]
    SHIP --> SEC["secret-scanner<br/>+ npm audit"]
    SHIP --> CODE["code-profiler<br/>N+1, leaks, ReDoS"]
    SHIP --> BUNDLE["bundle-tracker<br/>JS/CSS/images"]
    SHIP --> ENV["env-validator<br/>+ migration-checker"]

    SEO --> SC["Scorecard<br/>READY TO SHIP"]
    A11Y --> SC
    SEC --> SC
    CODE --> SC
    BUNDLE --> SC
    ENV --> SC

    style SHIP fill:#f59e0b,stroke:#d97706,color:#000
    style SC fill:#10b981,stroke:#059669,color:#000
    style SEO fill:#3b82f6,stroke:#2563eb,color:#fff
    style A11Y fill:#3b82f6,stroke:#2563eb,color:#fff
    style SEC fill:#3b82f6,stroke:#2563eb,color:#fff
    style CODE fill:#3b82f6,stroke:#2563eb,color:#fff
    style BUNDLE fill:#3b82f6,stroke:#2563eb,color:#fff
    style ENV fill:#3b82f6,stroke:#2563eb,color:#fff
```

```
+===========================================+
|       U L T R A S H I P   S C O R E       |
+===========================================+
|  SEO + AI Vis.   92/100  ############-    |
|  Security        95/100  ############-    |
|  Code Quality    88/100  ###########--    |
|  Bundle Size     97/100  ############-    |
+===========================================+
|   OVERALL         90/100                  |
|   STATUS          READY TO SHIP           |
+===========================================+
```
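
The parallel fan-out can be sketched roughly as follows. This is a minimal illustration and not Ultraship's actual implementation: the auditors are hypothetical stubs, the `readyThreshold` value and the `NOT READY` label are invented for the example, and the overall score is a plain unweighted mean. The real scoring evidently differs, since the sample scorecard reports 90 while the mean of its four rows is 93.

```javascript
// Hypothetical auditors: each resolves to { category, score } with score in 0..100.
// In Ultraship these would be the real tools; here they are stubs for illustration.
const auditors = [
  async () => ({ category: "SEO + AI Vis.", score: 92 }),
  async () => ({ category: "Security", score: 95 }),
  async () => ({ category: "Code Quality", score: 88 }),
  async () => ({ category: "Bundle Size", score: 97 }),
];

// Run every auditor concurrently and combine the results into one scorecard.
async function runShip(auditorList, readyThreshold = 80) {
  const results = await Promise.all(auditorList.map((run) => run()));
  // Assumption: overall is the unweighted mean, rounded to an integer.
  const overall = Math.round(
    results.reduce((sum, r) => sum + r.score, 0) / results.length
  );
  return {
    results,
    overall,
    status: overall >= readyThreshold ? "READY TO SHIP" : "NOT READY",
  };
}

runShip(auditors).then((card) => {
  for (const r of card.results) {
    console.log(`${r.category.padEnd(16)}${r.score}/100`);
  }
  console.log(`OVERALL ${card.overall}/100 ${card.status}`);
});
```

`Promise.all` rejects as soon as any auditor throws. A runner that should still report the remaining categories could use `Promise.allSettled` instead.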

Tools (40)

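Each tool is a standalone Node.js script (`node tools/<name>.mjs`) that prints JSON and needs no build step. Tools always exit 0, so read the JSON for results rather than the exit code; the exception is `ship-gate`, which exits 1 on failure.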
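
Because results arrive as JSON rather than through the exit code, a caller has to parse stdout. The sketch below shows one way to do that from Node.js. It assumes each tool prints exactly one JSON document on stdout and accepts the project directory as its first argument; neither detail is confirmed by this README, and `runNodeJson` and `runTool` are hypothetical helper names.

```javascript
// Run `node <args...>` and parse its stdout as a single JSON document.
// Assumption: the tool prints exactly one JSON value on stdout.
async function runNodeJson(nodeArgs) {
  // Loaded lazily so the sketch works whether the file is treated as ESM or CommonJS.
  const { execFile } = await import("node:child_process");
  return new Promise((resolve, reject) => {
    execFile(
      process.execPath,
      nodeArgs,
      { maxBuffer: 16 * 1024 * 1024 },
      (err, stdout) => {
        if (err) return reject(err); // spawn failure or non-zero exit
        try {
          resolve(JSON.parse(stdout));
        } catch (parseError) {
          reject(new Error(`tool did not print valid JSON: ${parseError.message}`));
        }
      }
    );
  });
}

// Convenience wrapper following the `node tools/<name>.mjs` convention.
// Passing the project directory as the first argument is an assumption.
function runTool(name, projectDir = ".") {
  return runNodeJson([`tools/${name}.mjs`, projectDir]);
}
```

From an ESM module inside the plugin directory, `const result = await runTool("seo-scanner");` would then return the parsed report object.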

Auditing

| Tool | What it checks |
| --- | --- |
| `seo-scanner` | 63 rules: 39 SEO (meta tags, canonicals, headings, OG tags, structured data, sitemap, cross-page duplicate/orphan detection), 20 GEO (AI bot access in robots.txt, snippet restrictions, llms.txt, structured data for AI extraction), 4 AEO (FAQPage/HowTo/speakable schema) |
| `a11y-scanner` | WCAG 2.2 A/AA static checks: missing alt text, unlabeled form controls, icon-only buttons, missing `lang`/`title`/`main`, heading order, positive tabindex, zoom disabled, duplicate ids, broken aria references. Zero false positives. |
| `ship-gate` | Blocking quality gate: scores all auditors (shared math with `/ship`), compares the scores to the thresholds in `.ultraship/ship-gate.json`, hard-fails on leaked secrets or critical findings, and exits 1 on failure. Generates a pre-push hook and a GitHub Actions workflow. |
| `secret-scanner` | AWS keys, Stripe keys, JWT secrets, database URLs, private keys. Redacts values in output. |
| `vibe-security-scanner` | Vibe-Coding Security Sentinel. Catches context-dependent leaks that `secret-scanner` misses: server-only secrets behind a `NEXT_PUBLIC_`/`VITE_` prefix, a decoded Supabase `service_role` key exposed to the client, `service_role` in a `"use client"` file, Supabase tables with no RLS. Zero false positives. |
| `eval-scanner` | Locates every LLM call site (Anthropic, OpenAI, Gemini, Mistral, Cohere, Ollama, Vercel AI SDK, LangChain) by provider and model id; detects the test runner and whether an eval suite exists. Flags AI features shipping with no evals. Seeds `/evals`. Zero false positives. |
| `code-profiler` | N+1 queries, sync I/O in handlers, unbounded queries, missing indexes, memory leaks, sequential awaits, ReDoS risk |
| `bundle-tracker` | JS/CSS/image sizes in build output. Detects heavy deps (`moment`→`dayjs`, `lodash`→native). History for before/after. Monorepo-aware. |
| `dep-doctor` | Unused dependencies via import graph analysis (not just grep). Dead wrapper files. Outdated packages. |
| `content-scorer` | Flesch-Kincaid readability, keyword density, thin content detection, GEO heading analysis |
| `lighthouse-runner` | Lighthouse via headless Chrome. Core Web Vitals, render-blocking resources, diagnostics. |
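
The `ship-gate` decision described in the table can be sketched as below. The README does not show the schema of `.ultraship/ship-gate.json`, so the threshold map, the finding shape, and the function name `evaluateGate` are all hypothetical; treat this as an illustration of the hard-fail and threshold logic, not the tool's real code.

```javascript
// Hypothetical shapes (the real ship-gate.json schema is not shown in this README):
//   thresholds: { "<category>": minimumScore }
//   scores:     { "<category>": score }
//   findings:   [{ type: "secret" | ..., severity: "critical" | "high" | ... }]
function evaluateGate(scores, thresholds, findings = []) {
  const failures = [];

  // Hard fail: leaked secrets or critical findings block regardless of scores.
  for (const f of findings) {
    if (f.type === "secret" || f.severity === "critical") {
      failures.push(`hard fail: ${f.type} (${f.severity})`);
    }
  }

  // Threshold fail: any category missing or below its configured minimum.
  for (const [category, minimum] of Object.entries(thresholds)) {
    const score = scores[category];
    if (score === undefined || score < minimum) {
      failures.push(`${category}: ${score ?? "missing"} < ${minimum}`);
    }
  }

  return { passed: failures.length === 0, failures };
}

const gate = evaluateGate(
  { seo: 92, security: 95, codeQuality: 88 },
  { seo: 80, security: 90, codeQuality: 90 },
  []
);
console.log(gate);
// A CLI wrapper would map the result to the exit code, for example:
// process.exitCode = gate.passed ? 0 : 1;
```

In this example the gate fails because `codeQuality` scores 88 against a minimum of 90, even though there are no hard-fail findings.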

Validation

| Tool | What it checks |
| --- | --- |
| `health-check` | HTTP status, response time, SSL certificate (issuer, expiry), 6 security headers |
| `env-validator` | Compares `.env.example` against the actual `.env`. Catches missing, empty, and placeholder vars. |
| `migration-checker` | Pending DB migrations for Drizzle, Prisma, Knex |
| `og-validator` | Open Graph tags, image reachability, size validation |
| `redirect-checker` | Redirect chains, loops, mixed HTTP/HTTPS. Sitemap-based bulk check. |
| `api-smoke-test` | Hits API endpoints; checks status codes, response times, CORS headers |
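
The `.env.example` versus `.env` comparison that `env-validator` performs can be illustrated with a small sketch. The parser below handles only simple `KEY=VALUE` lines, and the placeholder patterns are invented for the example; the real tool's rules are not documented in this README.

```javascript
// Parse KEY=VALUE lines, skipping blanks and comments. Surrounding quotes are stripped.
function parseEnv(text) {
  const vars = new Map();
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue;
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue;
    const key = trimmed.slice(0, eq).trim();
    const value = trimmed.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    vars.set(key, value);
  }
  return vars;
}

// Hypothetical placeholder patterns; the real tool's list is not documented here.
const PLACEHOLDER = /^(changeme|your[-_].*|xxx+|todo|<.*>)$/i;

// Report every key from the example file that is missing, empty, or a placeholder.
function validateEnv(exampleText, actualText) {
  const example = parseEnv(exampleText);
  const actual = parseEnv(actualText);
  const report = { missing: [], empty: [], placeholder: [] };
  for (const key of example.keys()) {
    if (!actual.has(key)) report.missing.push(key);
    else if (actual.get(key) === "") report.empty.push(key);
    else if (PLACEHOLDER.test(actual.get(key))) report.placeholder.push(key);
  }
  return report;
}

const envReport = validateEnv(
  "DATABASE_URL=\nSTRIPE_KEY=\nJWT_SECRET=\nPORT=3000",
  'DATABASE_URL=postgres://localhost/app\nSTRIPE_KEY=\nJWT_SECRET="changeme"'
);
console.log(envReport);
```

Here `PORT` is reported as missing, `STRIPE_KEY` as empty, and `JWT_SECRET` as a placeholder.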

Generators

| Tool | What it creates |
| --- | --- |
| `sitemap-generator` | `sitemap.xml` from HTML files and routes |
| `robots-generator` | AI-friendly `robots.txt` (allows GPTBot, PerplexityBot, ClaudeBot) |
| `llms-txt-generator` | `llms.txt` for AI assistant discoverability |
| `structured-data-generator` | JSON-LD schema markup |
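
As an illustration of what an AI-friendly `robots.txt` might look like, the sketch below emits explicit `Allow` rules for the three crawlers named in the table. The wildcard rule, the `Sitemap` line, and the function name `buildRobotsTxt` are assumptions for the example, not the generator's documented output.

```javascript
// Build a robots.txt that explicitly allows the AI crawlers named in the table.
// The wildcard rule and the Sitemap line are assumptions for illustration.
function buildRobotsTxt({
  aiBots = ["GPTBot", "PerplexityBot", "ClaudeBot"],
  sitemapUrl,
} = {}) {
  const blocks = aiBots.map((bot) => `User-agent: ${bot}\nAllow: /`);
  blocks.push("User-agent: *\nAllow: /");
  if (sitemapUrl) blocks.push(`Sitemap: ${sitemapUrl}`);
  return blocks.join("\n\n") + "\n";
}

console.log(buildRobotsTxt({ sitemapUrl: "https://example.com/sitemap.xml" }));
```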

Competitive & Launch

| Tool | What it does |
| --- | --- |
| `compete-analyzer` | Compares two URLs: tech stack, SEO score, security headers, response time. ASCII comparison card. |
| `launch-prep` | Reads the project and generates Product Hunt, Twitter, LinkedIn, and Hacker News copy, a 14-item checklist, and a press kit |
| `demo-prep` | Finds console.logs, TODOs, placeholder text, missing favicons. Scores demo readiness. |

Operations

| Tool | What it does |
| --- | --- |
| `incident-commander` | Health check + git culprit analysis + error patterns + rollback commands + post-mortem template |
| `growth-tracker` | Uptime, git velocity, SEO trajectory, dep health. Stores snapshots for week-over-week comparison. |
| `cost-tracker` | Logs AI token usage per feature/model. Built-in pricing for Claude, GPT-4o, Gemini. Daily trends. |
| `pentest-scanner` | Automated penetration testing: XSS, SQLi, SSTI, command injection, path traversal, CORS, JWT, GraphQL introspection, prototype pollution, race conditions, request smuggling. Zero false positives; every finding has a proof-of-concept. |
| `canary-monitor` | Post-deploy canary monitoring: HTTP status, response time, error patterns, baseline regression detection. Auto-saves baselines for future comparison. |
| `retro-analyzer` | Sprint retrospective: git velocity, commit patterns (features vs fixes), test health, hot files, shipping cadence. Generates insights and recommendations. |
| `learnings-manager` | Project learnings CRUD: save, search, list, prune, export. Structured knowledge that compounds across sessions. |
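
Baseline regression detection, as `canary-monitor` describes it, amounts to comparing a fresh probe with a saved one. The sketch below is one simple way to express that. The baseline shape, the `maxSlowdown` tolerance, and the function name `detectRegression` are hypothetical, since the tool's baseline format is not documented here.

```javascript
// Compare a fresh probe against a saved baseline.
// Hypothetical shapes and tolerance; the real baseline format is not documented here.
function detectRegression(baseline, current, { maxSlowdown = 1.5 } = {}) {
  const issues = [];
  if (current.status !== baseline.status) {
    issues.push(`status changed: ${baseline.status} -> ${current.status}`);
  }
  if (current.responseMs > baseline.responseMs * maxSlowdown) {
    issues.push(
      `response time ${current.responseMs}ms exceeds ${maxSlowdown}x baseline (${baseline.responseMs}ms)`
    );
  }
  return { regressed: issues.length > 0, issues };
}

const baseline = { status: 200, responseMs: 120 };
// Within tolerance: 150ms is below 1.5 x 120ms = 180ms.
console.log(detectRegression(baseline, { status: 200, responseMs: 150 }));
// Regressed: the status changed and the response is slower than 180ms.
console.log(detectRegression(baseline, { status: 500, responseMs: 400 }));
```

A relative tolerance avoids flagging normal jitter on fast endpoints. A production monitor would more likely compare several samples or percentiles than a single probe.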

Project Analysis

| Tool | What it does |
| --- | --- |
| `onboard-generator` | Auto-generates developer guide: stack, directory tree, routes, schema, env vars, Mermaid diagram |
| `architecture-mapper` | 4 Mermaid diagrams: system overview, route tree, DB ER, data flow. Circular dependency + orphan detection. |
| `pattern-analyzer` | Analyzes testing, error handling, TypeScript usage, CI/CD, git practices. Cross-repo comparison. |
| `audit-history` | Saves/compares audit scores over time |
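
Circular dependency and orphan detection of the kind `architecture-mapper` lists can be sketched with a depth-first search over an import graph. The input shape is hypothetical, and treating "orphan" as "a module nothing imports" is an interpretation rather than something this README defines.

```javascript
// Find one cycle in a module import graph using depth-first search.
// graph: { "<module>": ["<imported module>", ...] } (hypothetical input shape)
function findCycle(graph) {
  const state = new Map(); // undefined = unvisited, 1 = on current path, 2 = done
  const path = [];

  function visit(node) {
    state.set(node, 1);
    path.push(node);
    for (const dep of graph[node] ?? []) {
      if (state.get(dep) === 1) {
        // Back edge: the cycle runs from `dep` along the path and back to `dep`.
        return [...path.slice(path.indexOf(dep)), dep];
      }
      if (state.get(dep) === undefined) {
        const cycle = visit(dep);
        if (cycle) return cycle;
      }
    }
    path.pop();
    state.set(node, 2);
    return null;
  }

  for (const node of Object.keys(graph)) {
    if (state.get(node) === undefined) {
      const cycle = visit(node);
      if (cycle) return cycle;
    }
  }
  return null;
}

// Orphans: modules that nothing imports (entry points will also appear here).
function findOrphans(graph) {
  const imported = new Set(Object.values(graph).flat());
  return Object.keys(graph).filter((m) => !imported.has(m));
}

const importGraph = {
  "app.mjs": ["routes.mjs"],
  "routes.mjs": ["db.mjs"],
  "db.mjs": ["routes.mjs"],
  "legacy.mjs": [],
};
console.log(findCycle(importGraph));
console.log(findOrphans(importGraph));
```

For this graph the cycle found is `routes.mjs` → `db.mjs` → `routes.mjs`, and the modules with no importers are `app.mjs` (the entry point) and `legacy.mjs`.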

Integrations (optional)

| Tool | What it does |
| --- | --- |
| `gsc-client` | Google Search Console: submit sitemaps, inspect URLs, query rankings (requires `ULTRASHIP_GSC_CREDENTIALS`) |
| `bing-webmaster` | Bing Webmaster: submit sitemaps/URLs, IndexNow instant push, keyword research, backlinks, site-scan, URL inspection (requires `ULTRASHIP_BING_KEY`). Powers ChatGPT Search + Microsoft Copilot. |
| `ga4-client` | Google Analytics 4: overview, top-pages, landing-pages, traffic-sources, conversions, user-journey, devices, realtime, ai-traffic (ChatGPT/Perplexity/Copilot tracking), organic (search-only). `--organic` flag. |
| `keyword-intelligence` | 12-command keyword engine: analyze, quick-wins, cannibalization, content-gaps, intent-map, trending, high-intent, page-keywords, content-decay, difficulty, anomalies (CTR anomalies), cross-reference (GSC↔GA4). `--brand` flag for non-brand filtering. |
| `index-doctor` | Index diagnosis: inspect URLs via GSC URL Inspection API, diagnose 15+ coverage states, auto-fix and submit to Bing. |
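
For readers unfamiliar with IndexNow, the sketch below builds a submission following the public IndexNow protocol: a JSON POST carrying `host`, `key`, `keyLocation`, and `urlList`. Whether `bing-webmaster` uses this exact endpoint or payload is an assumption. The helper name `buildIndexNowRequest` and the key value are placeholders, and the IndexNow key is a separate value from the `ULTRASHIP_BING_KEY` credential mentioned in the table.

```javascript
// Build an IndexNow submission per the public IndexNow protocol.
// Whether bing-webmaster uses this exact endpoint or payload is an assumption.
function buildIndexNowRequest({ host, key, urls, keyLocation }) {
  // The protocol requires every submitted URL to belong to the declared host.
  const foreign = urls.filter((u) => new URL(u).host !== host);
  if (foreign.length > 0) {
    throw new Error(`URLs outside ${host}: ${foreign.join(", ")}`);
  }
  return {
    url: "https://api.indexnow.org/indexnow",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json; charset=utf-8" },
      body: JSON.stringify({
        host,
        key,
        // By convention the key file is served from the site root as <key>.txt.
        keyLocation: keyLocation ?? `https://${host}/${key}.txt`,
        urlList: urls,
      }),
    },
  };
}

const indexNowRequest = buildIndexNowRequest({
  host: "example.com",
  key: "0123456789abcdef", // placeholder key, not a real one
  urls: ["https://example.com/", "https://example.com/pricing"],
});
console.log(indexNowRequest.url, indexNowRequest.init.body);
// To send it (Node 18+ ships a global fetch):
// await fetch(indexNowRequest.url, indexNowRequest.init);
```

Keeping request construction separate from the network call makes the payload easy to inspect or test without contacting the endpoint.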

…
