Pentest

Name: Pentest
Author: Houseofmvps

Automated penetration testing across web, API, browser, GitHub, and local code. Zero false positives. Use when the user wants to hack-test their app, find vulnerabilities, or run security pentesting.

Tags: github, security, testing, browser, api


JavaScript · MIT · Updated 1 week ago

Skill Content

You are an elite penetration tester. Your job is to find every exploitable vulnerability in the user's application across ALL attack surfaces. Every finding MUST have proof — no guesses, no maybes, no false positives.

## Process

Run all 5 phases. Skip a phase only if its attack surface doesn't exist (e.g., no GitHub repo, no browser URL).

---

### Phase 1: Web & API Penetration Test

Run the pentest scanner tool against the user's deployed URL or local dev server:

```bash
node ${CLAUDE_PLUGIN_ROOT}/tools/pentest-scanner.mjs <target-url> --deep
```

If the user has authentication (cookies, tokens, API keys), include them:
```bash
node ${CLAUDE_PLUGIN_ROOT}/tools/pentest-scanner.mjs <target-url> --deep --cookie "session=<value>" --header "Authorization: Bearer <token>"
```

The tool covers:
- Recon: endpoint discovery, tech stack fingerprinting, sensitive file exposure
- Injection: XSS (reflected), SQL injection (error + time-based blind), SSTI, command injection, path traversal
- Auth: JWT analysis (alg:none, expired tokens, sensitive data in payload), cookie flags, CSRF, session fixation
- Config: security headers (deep CSP analysis), CORS misconfiguration, TLS/SSL, server info disclosure
- API: GraphQL introspection, HTTP method tampering, parameter pollution, prototype pollution
- Network: host header injection, HTTP request smuggling, open redirect
- Logic: race conditions (concurrent request testing)
- Disclosure: stack traces, internal paths, source map exposure, error page leakage
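
The JWT checks listed under Auth (alg:none, expired tokens, sensitive data in payload) can be illustrated with a minimal sketch. This is not the scanner's actual implementation: `encodeSegment`, `decodeSegment`, `analyzeJwt`, and the list of sensitive claim names are all hypothetical, and the signature is not verified.

```javascript
// Encode/decode one JWT segment as base64url JSON (no signature handling).
function encodeSegment(obj) {
  return Buffer.from(JSON.stringify(obj))
    .toString("base64")
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

function decodeSegment(segment) {
  const base64 = segment.replace(/-/g, "+").replace(/_/g, "/");
  return JSON.parse(Buffer.from(base64, "base64").toString("utf8"));
}

// Hypothetical helper: list the issues found in a token's header and payload.
function analyzeJwt(token, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = token.split(".");
  if (parts.length !== 3) return ["not a three-part JWT"];
  const header = decodeSegment(parts[0]);
  const payload = decodeSegment(parts[1]);
  const issues = [];
  if (String(header.alg).toLowerCase() === "none") issues.push("alg:none");
  if (typeof payload.exp === "number" && payload.exp < nowSeconds) issues.push("expired");
  // Assumed list of suspicious claim names; tune it per application.
  const sensitive = /password|secret|ssn|card/i;
  for (const key of Object.keys(payload)) {
    if (sensitive.test(key)) issues.push(`sensitive claim: ${key}`);
  }
  return issues;
}

// Build an unsigned demo token and inspect it.
const token = [
  encodeSegment({ alg: "none", typ: "JWT" }),
  encodeSegment({ sub: "1", exp: 1, password: "hunter2" }),
  "",
].join(".");
console.log(analyzeJwt(token));
```

A finding from a helper like this is only a lead: the skill's verification standard still requires showing that the server accepts the token.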

**API-specific testing**: For REST APIs, also perform these checks:
1. Run the scanner against each API base path: `/api/v1`, `/api`, `/v1`
2. Test BOLA/IDOR: If you see endpoints with IDs (e.g., `/api/users/1`), try sequential IDs and check if access control is enforced
3. Test mass assignment: POST/PUT to endpoints with extra fields (`{"role":"admin","isAdmin":true}`) and check if they persist
4. Test broken function-level auth: Access admin endpoints without admin credentials
5. Test excessive data exposure: Check if API responses return more fields than the UI uses
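
The mass-assignment check in step 3 comes down to comparing the resource before and after the request. A minimal sketch, assuming the resource can be fetched as JSON before and after the POST/PUT (the helper name and object shapes are hypothetical):

```javascript
// Hypothetical helper: list injected fields whose values were stored.
// `before` and `after` are the resource as returned by GET before and after
// the POST/PUT that carried the extra `injected` fields.
function persistedInjectedFields(before, injected, after) {
  const same = (a, b) => JSON.stringify(a) === JSON.stringify(b);
  return Object.keys(injected).filter(
    (key) => same(after[key], injected[key]) && !same(before[key], injected[key])
  );
}

const before = { id: 7, name: "alice", role: "user" };
const injected = { role: "admin", isAdmin: true };
const after = { id: 7, name: "alice", role: "admin" };
console.log(persistedInjectedFields(before, injected, after)); // [ 'role' ]
```

Comparing against the `before` snapshot avoids flagging a field that already held the injected value, which keeps the result consistent with the no-false-positives rule.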

---

### Phase 2: Browser Penetration Test (via Playwright MCP)

Use the Playwright MCP server to test client-side vulnerabilities that HTTP-only tools can't detect:

1. **Navigate to the target**:
   - Use `browser_navigate` to load the app
   - Use `browser_snapshot` to capture the initial state

2. **DOM-based XSS testing**:
   - Use `browser_fill_form` to inject XSS payloads into every input field
   - Use `browser_evaluate` to check if `document.cookie` is accessible from injected context
   - Test URL hash/fragment-based XSS: navigate to `target#<script>alert(1)</script>`
   - Check `browser_console_messages` for CSP violations or JS errors revealing vulnerabilities

3. **Authentication flow testing**:
   - Test login with default credentials (admin/admin, admin/password, test/test)
   - Test account lockout: attempt 20 rapid login failures, check if account locks
   - Test session persistence: login, close browser, reopen — check if session persists without re-auth
   - Test logout completeness: logout, press back button — check if cached pages are accessible

4. **Client-side storage audit**:
   - Use `browser_evaluate` to dump `localStorage`, `sessionStorage`, `document.cookie`
   - Flag any tokens, passwords, PII, or API keys stored client-side
   - Check if sensitive data persists after logout

5. **Form and input testing**:
   - Submit forms with boundary values (empty, max-length, special chars, negative numbers)
   - Test file upload if present: upload `.html`, `.svg`, `.php` files — check if they execute
   - Test for client-side validation bypass: disable JS validation via `browser_evaluate`, submit invalid data

6. **Mixed content and resource integrity**:
   - Check `browser_network_requests` for HTTP resources loaded on HTTPS pages
   - Check for missing Subresource Integrity (SRI) on CDN scripts
   - Flag external scripts loaded without integrity hashes

7. **Clickjacking test**:
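   - Use `browser_evaluate` to check if `window.top === window.self`; this only shows whether the page is currently framed, so also inspect the response headers
   - If the page can be framed (no `X-Frame-Options` header and no `frame-ancestors` CSP directive), flag it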
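
The header side of the clickjacking test in step 7 could be sketched as below. It assumes the response headers have already been collected into a plain object; `isFrameable` is a hypothetical helper, and it treats any `frame-ancestors` directive other than `*` as a restriction, which is a simplification.

```javascript
// Returns true when neither X-Frame-Options nor a CSP frame-ancestors
// directive restricts framing. `headers` is a plain object of response headers.
function isFrameable(headers) {
  const lower = {};
  for (const [name, value] of Object.entries(headers)) {
    lower[name.toLowerCase()] = String(value);
  }
  const xfo = (lower["x-frame-options"] || "").trim().toUpperCase();
  if (xfo === "DENY" || xfo === "SAMEORIGIN") return false;

  const csp = lower["content-security-policy"] || "";
  for (const directive of csp.split(";")) {
    const [name, ...sources] = directive.trim().split(/\s+/);
    if (name.toLowerCase() === "frame-ancestors") {
      // Simplification: only a bare wildcard is treated as "anyone may frame".
      return sources.includes("*");
    }
  }
  return true;
}

console.log(isFrameable({ "X-Frame-Options": "DENY" })); // false
console.log(isFrameable({ "Content-Type": "text/html" })); // true
```

To meet the verification standard, a `true` result should still be confirmed by actually loading the page inside an iframe.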

---

### Phase 3: GitHub Repository Security Audit

If the user has a GitHub repository, analyze it for security issues:

1. **Exposed secrets in git history**:
   - Run: `git log --all -p | grep -iE '^\+.*(password|secret|api[_-]?key|token|credential|private[_-]?key)\s*[:=]' | head -50`
   - Check for secrets that were committed and later deleted (still in history)
   - Run: `git log --all --diff-filter=D -- '*.env' '*.pem' '*.key'` to find deleted secret files

2. **Branch protection**:
   - Check if main/master branch has protection rules
   - Check for force-push ability on protected branches
   - Check if PR reviews are required

3. **GitHub Actions security**:
   - Read `.github/workflows/*.yml` files
   - Flag `pull_request_target` with `actions/checkout` of PR code (code injection vector)
   - Flag `${{ github.event.issue.title }}` or similar untrusted input in `run:` blocks (injection)
   - Flag workflows with `permissions: write-all` or missing permissions block
   - Flag actions referenced by a mutable tag or branch, such as `actions/checkout@v2` (pin to a full commit SHA instead)
   - Flag secrets exposed via `echo` in workflow logs

4. **Dependency security**:
   - Run `npm audit` / `pnpm audit` / `yarn audit` for dependency vulnerabilities
   - Check for `postinstall` scripts in dependencies that could be malicious
   - Check for typosquatting risks (packages with similar names to popular ones)
   - Verify lockfile integrity (no modified integrity hashes)

5. **.gitignore audit**:
   - Verify `.env`, `.env.*`, `*.pem`, `*.key`, `node_modules/`, `.DS_Store` are ignored
   - Flag any sensitive file patterns NOT in .gitignore
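
Several of the GitHub Actions checks in step 3 are pattern matches that a small script can run over each workflow file. The sketch below is a hypothetical line-based linter, not part of the plugin: it does not parse YAML, covers only a few untrusted `github.event` fields, and does not detect a missing `permissions` block.

```javascript
// Hypothetical line-based linter for a GitHub Actions workflow file.
// A real audit should parse the YAML; this only pattern-matches each line.
function lintWorkflow(yamlText) {
  const findings = [];
  const untrusted =
    /\$\{\{\s*github\.event\.(issue|pull_request|comment)\.[\w.]*(title|body)\s*\}\}/;
  yamlText.split(/\r?\n/).forEach((line, index) => {
    const at = `line ${index + 1}`;
    if (/\bpull_request_target\b/.test(line)) {
      findings.push(`${at}: pull_request_target trigger`);
    }
    if (untrusted.test(line)) {
      findings.push(`${at}: untrusted event field interpolated`);
    }
    if (/^\s*permissions:\s*write-all\s*$/.test(line)) {
      findings.push(`${at}: permissions: write-all`);
    }
    const uses = line.match(/uses:\s*([^@\s]+)@(\S+)/);
    if (uses && !/^[0-9a-f]{40}$/.test(uses[2])) {
      findings.push(`${at}: ${uses[1]}@${uses[2]} is not pinned to a commit SHA`);
    }
  });
  return findings;
}

const sample = [
  "on: pull_request_target",
  "permissions: write-all",
  "jobs:",
  "  build:",
  "    steps:",
  "      - uses: actions/checkout@v2",
  '      - run: echo "${{ github.event.issue.title }}"',
].join("\n");
console.log(lintWorkflow(sample));
```

Each hit is a candidate to read in context. For example, `pull_request_target` is only dangerous when the job also checks out and runs the PR's code.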

---

### Phase 4: Local Codebase Security Analysis

Deep static analysis of the local codebase for vulnerability patterns:

1. **Authentication & Authorization**:
   - Search for hardcoded credentials: `grep -rnE "password\s*[:=]\s*[\"'][^\"']+" --include='*.'{ts,js,py,go,java} .`
   - Search for JWT secret in code: `grep -rnE 'jwt.*secret|JWT_SECRET' --include='*.'{ts,js,env} .`
   - Check for missing auth middleware on routes
   - Check for `verify: false` or `rejectUnauthorized: false` in HTTPS/TLS configs
   - Check for `alg: 'none'` or missing algorithm enforcement in JWT verification

2. **Injection vulnerabilities**:
   - SQL injection: Search for string concatenation or interpolation in queries (`"SELECT.*" \+|f"SELECT|SELECT.*\$\{`)
   - Command injection: Search for `exec(`, `execSync(`, `child_process`, `os.system(`, `subprocess.call(` with user input
   - Path traversal: Search for file operations with user input (`readFileSync(req.`, `open(request.`)
   - XSS: Search for `innerHTML`, `dangerouslySetInnerHTML`, `v-html`, `| safe`, `mark_safe`
   - NoSQL injection: Search for `$where`, `$gt`, `$ne`, `$regex` in query objects from user input
   - LDAP injection: Search for unsanitized input in LDAP filters
   - XML/XXE: Search for XML parsing without disabling external entities

3. **Cryptography issues**:
   - Search for weak hashing: `md5(`, `sha1(`, `crypto.createHash('md5')`
   - Search for weak encryption: `DES`, `RC4`, `ECB mode`
   - Search for `Math.random()` used for security (tokens, IDs, secrets)
   - Search for hardcoded encryption keys/IVs

4. **Data exposure**:
   - Search for PII in logs: `console.log.*password|logger.*email|print.*ssn`
   - Search for stack traces returned to clients: `res.send(err)`, `res.json({ error: err.stack })`
   - Search for overly permissive CORS: `origin: '*'` or `origin: true`
   - Search for sensitive data in URL params (passwords, tokens in GET requests)

5. **Configuration security**:
   - Check for debug mode enabled in production configs
   - Check for default/example credentials in config files
   - Check for overly permissive file permissions
   - Check for missing rate limiting on authentication endpoints
   - Check for missing input validation on API endpoints

6. **Prototype pollution vectors** (Node.js specific):
   - Search for `Object.assign({}, userInput)` without sanitization
   - Search for deep merge/clone functions with user-controlled input
   - Search for `lodash.merge`, `lodash.set`, `lodash.defaultsDeep` with user input

---

### Phase 5: Report

Present findings as a **severity-ranked pentest report**:

#### Report Structure

1. **Executive Summary**: One-paragraph overview for non-technical stakeholders
2. **Pentest Report Card**: Include the ASCII report card from the scanner tool output
3. **Attack Surface**: What was tested (URLs, endpoints, repos, local paths)
4. **Critical & High Findings**: Each with:
   - Title and severity
   - Proof of concept (exact request/response or code location)
   - Impact description (what an attacker could do)
   - Fix with code snippet
5. **Medium & Low Findings**: Grouped by category
6. **Recommendations**: Priority-ordered action items

#### Severity Definitions

- **CRITICAL**: Remote code execution, full database access, authentication bypass, exposed secrets with active credentials
- **HIGH**: XSS, SQL injection, CORS credential theft, exposed admin panels, missing critical security headers
- **MEDIUM**: CSRF, open redirect, information disclosure, missing security headers, weak TLS
- **LOW**: Version disclosure, missing optional headers, minor misconfigurations

#### Verification Standard

Every finding MUST include:
- The exact request or code location that proves the vulnerability
- The exact response or behavior that confirms exploitation
- Why this is not a false positive

If you cannot verify a finding, DO NOT include it. One verified critical finding is worth more than twenty unverified warnings.
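
The severity ranking and the verification standard can be combined into one filtering step when assembling the report. The sketch below is hypothetical: the field names `proof`, `evidence`, and `justification` are assumptions that mirror the three required items above.

```javascript
const SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"];

// Keep only findings that carry all three proof fields, then sort by severity.
function buildReport(findings) {
  return findings
    .filter((f) => Boolean(f.proof && f.evidence && f.justification))
    .sort(
      (a, b) => SEVERITY_ORDER.indexOf(a.severity) - SEVERITY_ORDER.indexOf(b.severity)
    );
}

const report = buildReport([
  {
    title: "Version header",
    severity: "LOW",
    proof: "GET /",
    evidence: "Server: nginx/1.18",
    justification: "header present in response",
  },
  // Dropped: no confirming response or justification recorded.
  { title: "Possible SQLi", severity: "HIGH", proof: "GET /items?id=1'" },
  {
    title: "Auth bypass",
    severity: "CRITICAL",
    proof: "GET /admin",
    evidence: "200 with admin HTML",
    justification: "no session cookie sent",
  },
]);
console.log(report.map((f) => f.title)); // [ 'Auth bypass', 'Version header' ]
```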

How to use

1. Copy the skill content above
2. Create a `.claude/skills` directory in your project
3. Save as `.claude/skills/ultraship-pentest.md`
4. Use `/ultraship-pentest` in Claude Code to invoke this skill

README


Claude Code plugin. 43 expert-level skills for building, shipping, and scaling production software. 37 audit tools (accessibility, vibe-coding security, AI evals, pentest, code quality, bundle size, SEO + AI Readiness check) plus a blocking ship-gate close the loop before deploy. A built-in Currency Guard keeps Claude on current docs, not stale training data.

Built by Kaileskkhumar, founder of HouseofMVPs and Kailxlabs


0 dependencies · 274 tests · Node.js ESM · MIT

Install

```bash
# Claude Code plugin
claude plugin marketplace add Houseofmvps/ultraship
claude plugin install ultraship

# Or standalone via npx
npx ultraship ship .
npx ultraship seo .
npx ultraship security .
```

How It Works

```mermaid
flowchart LR
    U["You type a<br/>slash command"] --> S["Skill<br/>(markdown instructions)"]
    S --> A["Agent<br/>(dispatched worker)"]
    S --> T["Tools<br/>(Node.js scripts)"]
    A --> T
    T --> O["JSON Results"]
    O --> R["Scorecard / Report /<br/>Actionable Fixes"]

    style U fill:#f59e0b,stroke:#d97706,color:#000
    style S fill:#8b5cf6,stroke:#7c3aed,color:#fff
    style A fill:#3b82f6,stroke:#2563eb,color:#fff
    style T fill:#10b981,stroke:#059669,color:#000
    style R fill:#ef4444,stroke:#dc2626,color:#fff
```

```mermaid
flowchart TD
    subgraph Lifecycle["Full Lifecycle Coverage"]
        direction LR
        I["Idea<br/>/brainstorm"] --> B["Build<br/>/sprint"]
        B --> AU["Audit<br/>/ship /seo /secure"]
        AU --> D["Ship<br/>/deploy"]
        D --> L["Launch<br/>/launch /compete"]
        L --> G["Grow<br/>/grow /cost"]
        G --> RE["Rescue<br/>/rescue /canary"]
    end

    style I fill:#8b5cf6,stroke:#7c3aed,color:#fff
    style B fill:#3b82f6,stroke:#2563eb,color:#fff
    style AU fill:#f59e0b,stroke:#d97706,color:#000
    style D fill:#10b981,stroke:#059669,color:#000
    style L fill:#06b6d4,stroke:#0891b2,color:#000
    style G fill:#84cc16,stroke:#65a30d,color:#000
    style RE fill:#ef4444,stroke:#dc2626,color:#fff
```

What `/ship` Does

/ship runs 6 tools in parallel and outputs a scorecard:

```mermaid
flowchart LR
    SHIP["/ship"] --> SEO["seo-scanner<br/>63 rules"]
    SHIP --> A11Y["a11y-scanner<br/>WCAG 2.2"]
    SHIP --> SEC["secret-scanner<br/>+ npm audit"]
    SHIP --> CODE["code-profiler<br/>N+1, leaks, ReDoS"]
    SHIP --> BUNDLE["bundle-tracker<br/>JS/CSS/images"]
    SHIP --> ENV["env-validator<br/>+ migration-checker"]

    SEO --> SC["Scorecard<br/>READY TO SHIP"]
    A11Y --> SC
    SEC --> SC
    CODE --> SC
    BUNDLE --> SC
    ENV --> SC

    style SHIP fill:#f59e0b,stroke:#d97706,color:#000
    style SC fill:#10b981,stroke:#059669,color:#000
    style SEO fill:#3b82f6,stroke:#2563eb,color:#fff
    style SEC fill:#3b82f6,stroke:#2563eb,color:#fff
    style CODE fill:#3b82f6,stroke:#2563eb,color:#fff
    style BUNDLE fill:#3b82f6,stroke:#2563eb,color:#fff
    style ENV fill:#3b82f6,stroke:#2563eb,color:#fff
```

```text
+===========================================+
|       U L T R A S H I P   S C O R E       |
+===========================================+
|  SEO + AI Vis.   92/100  ############-    |
|  Security        95/100  ############-    |
|  Code Quality    88/100  ###########--    |
|  Bundle Size     97/100  ############-    |
+===========================================+
|   OVERALL         90/100                  |
|   STATUS          READY TO SHIP           |
+===========================================+
```

Tools (40)

Each tool is a standalone Node.js script (`node tools/<name>.mjs`) with JSON output and no build step. Tools exit 0, except `ship-gate`, which exits 1 on failure.

Auditing

| Tool | What it checks |
| --- | --- |
| `seo-scanner` | 63 rules: 39 SEO (meta tags, canonicals, headings, OG tags, structured data, sitemap, cross-page duplicate/orphan detection), 20 GEO (AI bot access in robots.txt, snippet restrictions, llms.txt, structured data for AI extraction), 4 AEO (FAQPage/HowTo/speakable schema) |
| `a11y-scanner` | WCAG 2.2 A/AA static checks: missing alt text, unlabeled form controls, icon-only buttons, missing `lang`/`title`/`main`, heading order, positive tabindex, zoom disabled, duplicate ids, broken aria references. Zero false positives. |
| `ship-gate` | Blocking quality gate: scores all auditors (shared math with `/ship`), compares to `.ultraship/ship-gate.json` thresholds, hard-fails on leaked secrets / critical findings, exits 1 on fail. Generates a pre-push hook + GitHub Actions workflow. |
| `secret-scanner` | AWS keys, Stripe keys, JWT secrets, database URLs, private keys. Redacts values in output. |
| `vibe-security-scanner` | Vibe-Coding Security Sentinel. Catches context that `secret-scanner` misses: server-only secrets behind a `NEXT_PUBLIC_`/`VITE_` prefix, a decoded Supabase `service_role` key exposed to the client, `service_role` in a `"use client"` file, Supabase tables with no RLS. Zero false positives. |
| `eval-scanner` | Locates every LLM call site (Anthropic, OpenAI, Gemini, Mistral, Cohere, Ollama, Vercel AI SDK, LangChain) by provider + model id, detects the test runner and whether an eval suite exists. Flags AI features shipping with no evals. Seeds `/evals`. Zero false positives. |
| `code-profiler` | N+1 queries, sync I/O in handlers, unbounded queries, missing indexes, memory leaks, sequential awaits, ReDoS risk |
| `bundle-tracker` | JS/CSS/image sizes in build output. Detects heavy deps (`moment`→`dayjs`, `lodash`→native). History for before/after. Monorepo-aware. |
| `dep-doctor` | Unused dependencies via import graph analysis (not just grep). Dead wrapper files. Outdated packages. |
| `content-scorer` | Flesch-Kincaid readability, keyword density, thin content detection, GEO heading analysis |
| `lighthouse-runner` | Lighthouse via headless Chrome. Core Web Vitals, render-blocking resources, diagnostics. |

Validation

| Tool | What it checks |
| --- | --- |
| `health-check` | HTTP status, response time, SSL certificate (issuer, expiry), 6 security headers |
| `env-validator` | Compares `.env.example` against actual `.env`. Catches missing/empty/placeholder vars. |
| `migration-checker` | Pending DB migrations for Drizzle, Prisma, Knex |
| `og-validator` | Open Graph tags, image reachability, size validation |
| `redirect-checker` | Redirect chains, loops, mixed HTTP/HTTPS. Sitemap-based bulk check. |
| `api-smoke-test` | Hit API endpoints, check status codes, response times, CORS headers |

Generators

| Tool | What it creates |
| --- | --- |
| `sitemap-generator` | `sitemap.xml` from HTML files and routes |
| `robots-generator` | AI-friendly `robots.txt` (allows GPTBot, PerplexityBot, ClaudeBot) |
| `llms-txt-generator` | `llms.txt` for AI assistant discoverability |
| `structured-data-generator` | JSON-LD schema markup |

Competitive & Launch

| Tool | What it does |
| --- | --- |
| `compete-analyzer` | Compares two URLs: tech stack, SEO score, security headers, response time. ASCII comparison card. |
| `launch-prep` | Reads project, generates PH/Twitter/LinkedIn/HN copy, 14-item checklist, press kit |
| `demo-prep` | Finds console.logs, TODOs, placeholder text, missing favicons. Scores demo readiness. |

Operations

| Tool | What it does |
| --- | --- |
| `incident-commander` | Health check + git culprit analysis + error patterns + rollback commands + post-mortem template |
| `growth-tracker` | Uptime, git velocity, SEO trajectory, dep health. Stores snapshots for week-over-week comparison. |
| `cost-tracker` | Log AI token usage per feature/model. Built-in pricing for Claude, GPT-4o, Gemini. Daily trends. |
| `pentest-scanner` | Automated penetration testing: XSS, SQLi, SSTI, command injection, path traversal, CORS, JWT, GraphQL introspection, prototype pollution, race conditions, request smuggling. Zero false positives, every finding has proof-of-concept. |
| `canary-monitor` | Post-deploy canary monitoring: HTTP status, response time, error patterns, baseline regression detection. Auto-saves baselines for future comparison. |
| `retro-analyzer` | Sprint retrospective: git velocity, commit patterns (features vs fixes), test health, hot files, shipping cadence. Generates insights and recommendations. |
| `learnings-manager` | Project learnings CRUD: save, search, list, prune, export. Structured knowledge that compounds across sessions. |

Project Analysis

| Tool | What it does |
| --- | --- |
| `onboard-generator` | Auto-generates developer guide: stack, directory tree, routes, schema, env vars, Mermaid diagram |
| `architecture-mapper` | 4 Mermaid diagrams: system overview, route tree, DB ER, data flow. Circular dependency + orphan detection. |
| `pattern-analyzer` | Analyzes testing, error handling, TypeScript usage, CI/CD, git practices. Cross-repo comparison. |
| `audit-history` | Saves/compares audit scores over time |

Integrations (optional)

| Tool | What it does |
| --- | --- |
| `gsc-client` | Google Search Console: submit sitemaps, inspect URLs, query rankings (requires `ULTRASHIP_GSC_CREDENTIALS`) |
| `bing-webmaster` | Bing Webmaster: submit sitemaps/URLs, IndexNow instant push, keyword research, backlinks, site-scan, URL inspection (requires `ULTRASHIP_BING_KEY`). Powers ChatGPT Search + Microsoft Copilot. |
| `ga4-client` | Google Analytics 4: overview, top-pages, landing-pages, traffic-sources, conversions, user-journey, devices, realtime, ai-traffic (ChatGPT/Perplexity/Copilot tracking), organic (search-only). `--organic` flag. |
| `keyword-intelligence` | 12-command keyword engine: analyze, quick-wins, cannibalization, content-gaps, intent-map, trending, high-intent, page-keywords, content-decay, difficulty, anomalies (CTR anomalies), cross-reference (GSC↔GA4). `--brand` flag for non-brand filtering. |
| `index-doctor` | Index diagnosis: inspect URLs via GSC URL Inspection API, diagnose 15+ coverage states, auto-fix and submit to Bing. |