Skill Seekers
<a href="https://trendshift.io/repositories/18329" target="_blank"><img src="https://trendshift.io/api/badge/repositories/18329" alt="yusufkaraaslan%2FSkill_Seekers | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
🧠 The data layer for AI systems. Skill Seekers turns documentation sites, GitHub repos, PDFs, videos, notebooks, wikis, and 10+ more source types into structured knowledge assets—ready to power AI Skills (Claude, Gemini, OpenAI), RAG pipelines (LangChain, LlamaIndex, Pinecone), and AI coding assistants (Cursor, Windsurf, Cline) in minutes, not hours.
🌐 Visit SkillSeekersWeb.com - Browse 24+ preset configs, share your configs, and access complete documentation!
📋 View Development Roadmap & Tasks - 134 tasks across 10 categories, pick any to contribute!
🌐 Ecosystem
Skill Seekers is a multi-repo project. Here's where everything lives:
| Repository | Description | Links |
|---|---|---|
| Skill_Seekers | Core CLI & MCP server (this repo) | PyPI |
| skillseekersweb | Website & documentation | Live |
| skill-seekers-configs | Community config repository | |
| skill-seekers-action | GitHub Action for CI/CD | |
| skill-seekers-plugin | Claude Code plugin | |
| homebrew-skill-seekers | Homebrew tap for macOS | |
Want to contribute? The website and configs repos are great starting points for new contributors!
🧠 The Data Layer for AI Systems
Skill Seekers is the universal preprocessing layer that sits between raw documentation and every AI system that consumes it. Whether you are building Claude skills, a LangChain RAG pipeline, or a Cursor .cursorrules file — the data preparation is identical. You do it once, and export to all targets.
# One command → structured knowledge asset
skill-seekers create https://docs.react.dev/
# or: skill-seekers create facebook/react
# or: skill-seekers create ./my-project
# Export to any AI system
skill-seekers package output/react --target claude # → Claude AI Skill (ZIP)
skill-seekers package output/react --target langchain # → LangChain Documents
skill-seekers package output/react --target llama-index # → LlamaIndex TextNodes
skill-seekers package output/react --target cursor # → .cursorrules
skill-seekers package output/react --target ibm-bob # → IBM Bob skill directory
What gets built
| Output | Target | What it powers |
|---|---|---|
| Claude Skill (ZIP + YAML) | --target claude | Claude Code, Claude API |
| Gemini Skill (tar.gz) | --target gemini | Google Gemini |
| OpenAI / Custom GPT (ZIP) | --target openai | GPT-4o, custom assistants |
| LangChain Documents | --target langchain | QA chains, agents, retrievers |
| LlamaIndex TextNodes | --target llama-index | Query engines, chat engines |
| Haystack Documents | --target haystack | Enterprise RAG pipelines |
| Pinecone-ready (Markdown) | --target markdown | Vector upsert |
| ChromaDB / FAISS / Qdrant | --target chroma/faiss/qdrant | Local vector DBs |
| IBM Bob Skill (directory) | --target ibm-bob | IBM Bob project/global skills |
| Cursor .cursorrules | --target markdown → copy SKILL.md | Cursor IDE .cursorrules |
| Windsurf / Cline / Continue | --target claude → copy | VS Code, IntelliJ, Vim |
Why it matters
- ⚡ 99% faster — Days of manual data prep → 15–45 minutes
- 🎯 AI Skill quality — 500+ line SKILL.md files with examples, patterns, and guides
- 📊 RAG-ready chunks — Smart chunking preserves code blocks and maintains context
- 🎬 Videos — Extract code, transcripts, and structured knowledge from YouTube and local videos
- 🔄 Multi-source — Combine 18 source types (docs, GitHub, PDFs, videos, notebooks, wikis, and more) into one knowledge asset
- 🌐 One prep, every target — Export the same asset to 21 platforms without re-scraping
- ✅ Battle-tested — 3,700+ tests, 24+ framework presets, production-ready
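The "RAG-ready chunks" point above says chunking preserves code blocks. How Skill Seekers does this internally is not shown here; the following is only a minimal sketch of the general idea, assuming Markdown input with triple-backtick fences. The function name `chunk_markdown` and the `max_chars` parameter are hypothetical and not part of the Skill Seekers API.

```python
FENCE = "`" * 3  # Markdown code-fence marker


def chunk_markdown(text: str, max_chars: int = 200) -> list[str]:
    """Split Markdown into chunks without cutting inside a fenced code block.

    Hypothetical helper for illustration; not Skill Seekers' implementation.
    """
    # Step 1: group lines into atomic blocks (a paragraph or a whole fenced block).
    blocks: list[str] = []
    current: list[str] = []
    in_fence = False
    for line in text.splitlines():
        if line.startswith(FENCE):
            in_fence = not in_fence
            current.append(line)
            if not in_fence:  # a closing fence ends the atomic block
                blocks.append("\n".join(current))
                current = []
            continue
        if in_fence or line.strip():
            current.append(line)  # blank lines inside a fence are kept
        elif current:  # a blank line outside a fence ends a paragraph
            blocks.append("\n".join(current))
            current = []
    if current:
        blocks.append("\n".join(current))

    # Step 2: greedily pack whole blocks into chunks of at most max_chars.
    # A single block larger than max_chars becomes its own oversized chunk.
    chunks: list[str] = []
    buf = ""
    for block in blocks:
        candidate = f"{buf}\n\n{block}" if buf else block
        if buf and len(candidate) > max_chars:
            chunks.append(buf)
            buf = block
        else:
            buf = candidate
    if buf:
        chunks.append(buf)
    return chunks
```

The design choice in this sketch is to treat a fenced block as indivisible, so a chunk may exceed `max_chars` rather than split code mid-block.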
🚀 Quick Start (3 Commands)
# 1. Install
pip install skill-seekers
# 2. Create skill from any source
skill-seekers create https://docs.django.com/
# 3. Package for your AI platform
skill-seekers package output/django --target claude
That's it! You now have output/django-claude.zip ready to use.
# Use a different AI agent for enhancement (default: claude)
skill-seekers create https://docs.django.com/ --agent kimi
skill-seekers create https://docs.django.com/ --agent codex
skill-seekers create https://docs.django.com/ --agent-cmd "my-custom-agent run"
🛰️ AI-driven project scan (new)
Point scan at any project and an AI agent reads its manifests, README, Dockerfile/CI, and sampled source imports, then emits one config per detected framework plus a <project>-codebase.json for your own code. It pins each detected version, so re-running the scan reports version bumps:
skill-seekers scan ./my-react-app --out ./configs/scanned/
# → react.json, vite.json, tailwind.json, jest.json, my-react-app-codebase.json
# Then build any of them
skill-seekers create ./configs/scanned/react.json
If a detection has no existing preset, the AI generates a fresh config; on exit you can optionally publish it back to the community registry.
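The scanner's internals and the AI agent's prompts are not described here. As a rough illustration of the "detect frameworks from a manifest, pin the version, report bumps on re-run" idea, here is a minimal sketch that assumes a Node.js package.json as the only manifest. `KNOWN_FRAMEWORKS`, `detect_frameworks`, and `report_bumps` are hypothetical names, and the real tool relies on an AI agent rather than a fixed lookup table.

```python
import json

# Hypothetical mapping from npm package name to config name (illustration only).
KNOWN_FRAMEWORKS = {
    "react": "react",
    "vite": "vite",
    "tailwindcss": "tailwind",
    "jest": "jest",
}


def detect_frameworks(package_json: str) -> dict[str, str]:
    """Return {config_name: pinned_version} for known frameworks in a manifest."""
    manifest = json.loads(package_json)
    deps = {
        **manifest.get("dependencies", {}),
        **manifest.get("devDependencies", {}),
    }
    return {
        KNOWN_FRAMEWORKS[name]: version.lstrip("^~")  # drop range prefixes
        for name, version in deps.items()
        if name in KNOWN_FRAMEWORKS
    }


def report_bumps(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """List frameworks whose pinned version changed between two scans."""
    return [
        f"{name}: {previous[name]} -> {current[name]}"
        for name in sorted(current)
        if name in previous and previous[name] != current[name]
    ]
```

Comparing the result of an earlier scan with a later one via `report_bumps` yields one line per framework whose version changed.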
Other Sources (18 Supported)
# GitHub repository
skill-seekers create facebook/react
# Local project
skill-seekers create ./my-project
# PDF document
skill-seekers create manual.pdf
# Word document
skill-seekers create report.docx
# EPUB e-book
skill-seekers create book.epub
# Jupyter Notebook
skill-seekers create notebook.ipynb
# OpenAPI spec
skill-seekers create openapi.yaml
# PowerPoint presentation
skill-seekers create presentation.pptx
# AsciiDoc document
skill-seekers create guide.adoc
# Local HTML file (auto-detected by extension)
skill-seekers create page.html
# Whole directory of HTML files (auto-detected for HTML-dominant dirs)
skill-seekers create ./mirror_output/site/
# Force HTML mode on a mixed/code-heavy directory
skill-seekers create ./repo/ --html-path ./repo/docs/build/html/
# RSS/Atom feed
skill-seekers create feed.rss
# Man page
skill-seekers create curl.1
# Video (YouTube, Vimeo, or local file — requires skill-seekers[video])
skill-seekers create --video-url https://www.youtube.com/watch?v=... --name mytutorial
# First time? Auto-install GPU-aware visual deps:
skill-seekers create --setup
# Confluence wiki
skill-seekers create --space-key TEAM --name wiki
# Notion pages
skill-seekers create --database-id ... --name docs
# Slack/Discord chat export
skill-seekers create --chat-export-path ./slack-export --name team-chat
Export Everywhere
# Package for multiple platforms
for platform in claude gemini openai langchain; do
  skill-seekers package output/django --target "$platform"
done
What is Skill Seekers?
Skill Seekers is the data layer for AI systems. It transforms 18 source types—documentation websites, GitHub repositories, PDFs, videos, Jupyter Notebooks, Word/EPUB/AsciiDoc documents, OpenAPI specs, PowerPoint presentations, RSS feeds, man pages, Confluence wikis, Notion pages, Slack/Discord exports, and more—into structured knowledge assets for every AI target:
| Use Case | What you get | Examples |
|---|---|---|
| AI Skills | Comprehensive SKILL.md + references | Claude Code, Gemini, GPT |
| RAG Pipelines | Chunked documents with rich metadata | LangChain, LlamaIndex, Haystack |
| Vector Databases | Pre-formatted data ready for upsert | Pinecone, Chroma, Weaviate, FAISS |
| AI Coding Assistants | Context files your IDE AI reads automatically | Cursor, Windsurf, Cline, Continue.dev |
📚 Documentation
| I want to... | Read this |
|---|---|
| Get started quickly | Quick Start - 3 commands to first skill |
| Understand concepts | Core Concepts - How it works |
| Scrape sources | Scraping Guide - All source types |
| Enhance skills | Enhancement Guide - AI enhancement |
| Export skills | Packaging Guide - Platform export |
| Look up commands | CLI Reference - All 20 commands |
| Configure | Config Format - JSON specification |
| Fix issues | Troubleshooting - Common problems |
Complete documentation: docs/README.md
Instead of spending days on manual preprocessing, Skill Seekers:
- Ingests — docs, GitHub repos, local codebases, PDFs, videos, notebooks, wikis, and 10+ more source types
- Analyzes — deep AST parsing, pattern detection, API extraction
- Structures — categorized reference files with metadata
- Enhances — AI-powered SKILL.md generation (Claude, Gemini, or local)
- Exports — 16 platform-specific formats from one asset
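The "Analyzes" step above mentions AST parsing and API extraction. As a minimal sketch of what that can look like for Python sources, the example below uses the standard-library `ast` module. It is an illustration under stated assumptions, not Skill Seekers' actual analyzer: `extract_api` and the shape of its output are hypothetical.

```python
import ast


def extract_api(source: str) -> list[dict]:
    """Collect public top-level functions and classes from Python source.

    Hypothetical helper for illustration; not part of Skill Seekers.
    """
    api = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            continue  # skip imports, assignments, and other statements
        if node.name.startswith("_"):
            continue  # treat underscore-prefixed names as private
        entry = {"kind": kind, "name": node.name, "doc": ast.get_docstring(node)}
        if kind == "function":
            # Positional parameter names only, to keep the sketch short.
            entry["args"] = [arg.arg for arg in node.args.args]
        api.append(entry)
    return api
```

Each entry could then be written into a categorized reference file with its metadata, in the spirit of the "Structures" step.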
Why Use This?
For AI Skill Builders (Claude, Gemini, OpenAI)
- 🎯 Production-grade Skills — 500+ line SKILL.md files with code examples, patterns, and guides
- 🔄 Enhancement Workflows — Apply security-focus, architecture-comprehensive, or custom YAML presets
- 🎮 Any Domain — Game engines (Go
…