
Skill Seekers

Transform 17 source types (docs, GitHub repos, PDFs, videos, Jupyter, Confluence, Notion, Slack/Discord) into AI-ready skills and RAG knowledge. 35 MCP tools for scraping, packaging, enhancing, and exporting to vector databases (Weaviate, Chroma, FAISS, Qdrant). Supports 16+ tar…

Tags: knowledge-memory, github, slack, discord, scraping, api, ai, rag
By yusufkaraaslan
14k · 1.5k · Updated 1 week ago · Python · MIT

Installation

npx -y Skill_Seekers

Configuration

{
  "mcpServers": {
    "Skill_Seekers": {
      "command": "npx",
      "args": ["-y", "Skill_Seekers"]
    }
  }
}

How to use

  1. Run the installation command above (if needed)
  2. Open your Claude Code settings file (~/.claude/settings.json)
  3. Add the configuration to the mcpServers section
  4. Restart Claude Code to apply changes
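
Steps 2 and 3 can also be scripted. The following is a minimal sketch, assuming the settings file is plain JSON with a top-level mcpServers object as in the configuration above. The helper names add_mcp_server and update_settings_file are hypothetical and are not part of Skill Seekers or Claude Code.

```python
import json
from pathlib import Path


def add_mcp_server(settings: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of `settings` with one entry added under "mcpServers".

    Hypothetical helper for illustration; existing keys and servers are kept.
    """
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


def update_settings_file(path: Path) -> None:
    """Read the JSON settings file (if present), add the entry, write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    updated = add_mcp_server(current, "Skill_Seekers", "npx", ["-y", "Skill_Seekers"])
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Calling update_settings_file(Path.home() / ".claude" / "settings.json") would apply the change in place. Back up the file first, since the sketch rewrites it with its own indentation.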

License: MIT · Python 3.10+


🧠 The data layer for AI systems. Skill Seekers turns documentation sites, GitHub repos, PDFs, videos, notebooks, wikis, and 10+ more source types into structured knowledge assets—ready to power AI Skills (Claude, Gemini, OpenAI), RAG pipelines (LangChain, LlamaIndex, Pinecone), and AI coding assistants (Cursor, Windsurf, Cline) in minutes, not hours.

🌐 Visit SkillSeekersWeb.com - Browse 24+ preset configs, share your configs, and access complete documentation!

📋 View Development Roadmap & Tasks - 134 tasks across 10 categories, pick any to contribute!

🌐 Ecosystem

Skill Seekers is a multi-repo project. Here's where everything lives:

| Repository | Description | Links |
|---|---|---|
| Skill_Seekers | Core CLI & MCP server (this repo) | PyPI |
| skillseekersweb | Website & documentation | Live |
| skill-seekers-configs | Community config repository | |
| skill-seekers-action | GitHub Action for CI/CD | |
| skill-seekers-plugin | Claude Code plugin | |
| homebrew-skill-seekers | Homebrew tap for macOS | |

Want to contribute? The website and configs repos are great starting points for new contributors!

🧠 The Data Layer for AI Systems

Skill Seekers is the universal preprocessing layer that sits between raw documentation and every AI system that consumes it. Whether you are building Claude skills, a LangChain RAG pipeline, or a Cursor .cursorrules file, the data preparation is identical: you do it once and export to every target.

# One command → structured knowledge asset
skill-seekers create https://react.dev/
# or: skill-seekers create facebook/react
# or: skill-seekers create ./my-project

# Export to any AI system
skill-seekers package output/react --target claude      # → Claude AI Skill (ZIP)
skill-seekers package output/react --target langchain   # → LangChain Documents
skill-seekers package output/react --target llama-index # → LlamaIndex TextNodes
skill-seekers package output/react --target cursor      # → .cursorrules
skill-seekers package output/react --target ibm-bob     # → IBM Bob skill directory

What gets built

| Output | Target | What it powers |
|---|---|---|
| Claude Skill (ZIP + YAML) | --target claude | Claude Code, Claude API |
| Gemini Skill (tar.gz) | --target gemini | Google Gemini |
| OpenAI / Custom GPT (ZIP) | --target openai | GPT-4o, custom assistants |
| LangChain Documents | --target langchain | QA chains, agents, retrievers |
| LlamaIndex TextNodes | --target llama-index | Query engines, chat engines |
| Haystack Documents | --target haystack | Enterprise RAG pipelines |
| Pinecone-ready (Markdown) | --target markdown | Vector upsert |
| ChromaDB / FAISS / Qdrant | --target chroma/faiss/qdrant | Local vector DBs |
| IBM Bob Skill (directory) | --target ibm-bob | IBM Bob project/global skills |
| Cursor .cursorrules | --target markdown → copy SKILL.md | Cursor IDE .cursorrules |
| Windsurf / Cline / Continue | --target claude → copy | VS Code, IntelliJ, Vim |

Why it matters

  • 99% faster — Days of manual data prep → 15–45 minutes
  • 🎯 AI Skill quality — 500+ line SKILL.md files with examples, patterns, and guides
  • 📊 RAG-ready chunks — Smart chunking preserves code blocks and maintains context
  • 🎬 Videos — Extract code, transcripts, and structured knowledge from YouTube and local videos
  • 🔄 Multi-source — Combine 18 source types (docs, GitHub, PDFs, videos, notebooks, wikis, and more) into one knowledge asset
  • 🌐 One prep, every target — Export the same asset to 21 platforms without re-scraping
  • Battle-tested — 3,700+ tests, 24+ framework presets, production-ready
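
To make the "chunking preserves code blocks" point concrete, here is a minimal sketch of one way to split Markdown without cutting through a fenced code block. It is illustrative only and is not Skill Seekers' actual chunker; the function name chunk_markdown and the max_chars parameter are assumptions.

```python
FENCE = "`" * 3  # Markdown code-fence marker


def chunk_markdown(text: str, max_chars: int = 800) -> list[str]:
    """Split Markdown into chunks of about max_chars, keeping fenced code whole.

    Illustrative sketch only; not the project's real chunking logic.
    """
    # Pass 1: group lines into blocks. A fenced code block is one indivisible
    # block; prose is split into paragraphs at blank lines.
    blocks: list[str] = []
    current: list[str] = []
    in_fence = False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            if not in_fence and current:
                blocks.append("\n".join(current))  # flush prose before the fence
                current = []
            current.append(line)
            if in_fence:
                blocks.append("\n".join(current))  # closing fence ends the block
                current = []
            in_fence = not in_fence
            continue
        if not in_fence and not line.strip():
            if current:
                blocks.append("\n".join(current))  # blank line ends a paragraph
                current = []
            continue
        current.append(line)
    if current:
        blocks.append("\n".join(current))

    # Pass 2: pack blocks greedily. An oversized block becomes its own chunk
    # rather than being cut in the middle.
    chunks: list[str] = []
    buf = ""
    for block in blocks:
        candidate = f"{buf}\n\n{block}" if buf else block
        if buf and len(candidate) > max_chars:
            chunks.append(buf)
            buf = block
        else:
            buf = candidate
    if buf:
        chunks.append(buf)
    return chunks
```

The design choice worth noting is that the size limit is soft: a code block longer than max_chars is emitted as a single oversized chunk instead of being split, which keeps examples runnable at the cost of uneven chunk sizes.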

🚀 Quick Start (3 Commands)

# 1. Install
pip install skill-seekers

# 2. Create skill from any source
skill-seekers create https://docs.djangoproject.com/

# 3. Package for your AI platform
skill-seekers package output/django --target claude

That's it! You now have output/django-claude.zip ready to use.

# Use a different AI agent for enhancement (default: claude)
skill-seekers create https://docs.djangoproject.com/ --agent kimi
skill-seekers create https://docs.djangoproject.com/ --agent codex
skill-seekers create https://docs.djangoproject.com/ --agent-cmd "my-custom-agent run"

🛰️ AI-driven project scan (new)

Point scan at any project and an AI agent reads its manifests, README, Dockerfile/CI files, and sampled source imports, then emits one config per detected framework plus a <project>-codebase.json for your own code. It pins each detected version, so re-running the scan reports version bumps:

skill-seekers scan ./my-react-app --out ./configs/scanned/
# → react.json, vite.json, tailwind.json, jest.json, my-react-app-codebase.json

# Then build any of them
skill-seekers create ./configs/scanned/react.json

If a detection has no existing preset, the AI generates a fresh config; on exit you can optionally publish it back to the community registry.
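
The version-pinning idea can be sketched as a comparison between two scan results. This assumes each scan can be reduced to a mapping of framework name to pinned version; the README does not show the config schema, so that mapping, the function name report_bumps, and the version numbers in the usage note are all hypothetical.

```python
def report_bumps(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Compare two {framework: pinned_version} mappings and describe changes.

    Hypothetical illustration; the real scan output format may differ.
    """
    lines = []
    for name in sorted(previous.keys() | current.keys()):
        old, new = previous.get(name), current.get(name)
        if old is None:
            lines.append(f"{name}: newly detected at {new}")
        elif new is None:
            lines.append(f"{name}: no longer detected (was {old})")
        elif old != new:
            lines.append(f"{name}: {old} -> {new}")
        # Unchanged versions produce no line.
    return lines
```

For example, comparing {"react": "18.2.0"} against {"react": "18.3.1"} yields a single line describing the react bump, while identical mappings yield an empty list.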

Other Sources (18 Supported)

# GitHub repository
skill-seekers create facebook/react

# Local project
skill-seekers create ./my-project

# PDF document
skill-seekers create manual.pdf

# Word document
skill-seekers create report.docx

# EPUB e-book
skill-seekers create book.epub

# Jupyter Notebook
skill-seekers create notebook.ipynb

# OpenAPI spec
skill-seekers create openapi.yaml

# PowerPoint presentation
skill-seekers create presentation.pptx

# AsciiDoc document
skill-seekers create guide.adoc

# Local HTML file (auto-detected by extension)
skill-seekers create page.html

# Whole directory of HTML files (auto-detected for HTML-dominant dirs)
skill-seekers create ./mirror_output/site/

# Force HTML mode on a mixed/code-heavy directory
skill-seekers create ./repo/ --html-path ./repo/docs/build/html/

# RSS/Atom feed
skill-seekers create feed.rss

# Man page
skill-seekers create curl.1

# Video (YouTube, Vimeo, or local file — requires skill-seekers[video])
skill-seekers create --video-url https://www.youtube.com/watch?v=... --name mytutorial
# First time? Auto-install GPU-aware visual deps:
skill-seekers create --setup

# Confluence wiki
skill-seekers create --space-key TEAM --name wiki

# Notion pages
skill-seekers create --database-id ... --name docs

# Slack/Discord chat export
skill-seekers create --chat-export-path ./slack-export --name team-chat
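
Several of the sources above are described as auto-detected. The sketch below shows one plausible way to classify a source argument by its shape and file extension. It is an assumption for illustration, not the tool's real detection logic; the type labels and the function name guess_source_type are made up, and formats that need content inspection (for example, telling an OpenAPI YAML file from any other YAML file) are deliberately left out.

```python
import re
from pathlib import PurePosixPath

# Extension -> label, taken from the example commands above (labels are hypothetical).
EXTENSION_TYPES = {
    ".pdf": "pdf",
    ".docx": "word",
    ".epub": "epub",
    ".ipynb": "jupyter",
    ".pptx": "powerpoint",
    ".adoc": "asciidoc",
    ".html": "html",
    ".rss": "feed",
}


def guess_source_type(source: str) -> str:
    """Guess a source type from the argument's shape; illustrative only."""
    if source.startswith(("http://", "https://")):
        return "docs-site"
    suffix = PurePosixPath(source).suffix.lower()
    if suffix in EXTENSION_TYPES:
        return EXTENSION_TYPES[suffix]
    if re.fullmatch(r"\.[1-9]", suffix):
        return "man-page"  # e.g. curl.1
    if source.startswith((".", "/", "~")):
        return "local"  # explicit filesystem path
    if re.fullmatch(r"[\w.-]+/[\w.-]+", source):
        return "github"  # owner/repo shorthand
    return "local"
```

Checking explicit path prefixes before the owner/repo pattern matters here: without that ordering, ./my-project would be mistaken for a GitHub shorthand.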

Export Everywhere

# Package for multiple platforms
for platform in claude gemini openai langchain; do
  skill-seekers package output/django --target "$platform"
done
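
The same loop can be driven from Python, for example inside a build script. This is a small sketch that only assembles the command lines, using the package subcommand and --target flag exactly as shown above; the helper name package_commands is hypothetical.

```python
import shlex


def package_commands(skill_dir: str, targets: list[str]) -> list[list[str]]:
    """Build one `skill-seekers package` argument list per target platform."""
    return [["skill-seekers", "package", skill_dir, "--target", t] for t in targets]


commands = package_commands("output/django", ["claude", "gemini", "openai", "langchain"])
for argv in commands:
    # Print the shell-safe form; nothing is executed here.
    print(shlex.join(argv))
```

To actually run the commands, pass each list to subprocess.run(argv, check=True), which stops at the first target that fails.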

What is Skill Seekers?

Skill Seekers is the data layer for AI systems. It transforms 18 source types—documentation websites, GitHub repositories, PDFs, videos, Jupyter Notebooks, Word/EPUB/AsciiDoc documents, OpenAPI specs, PowerPoint presentations, RSS feeds, man pages, Confluence wikis, Notion pages, Slack/Discord exports, and more—into structured knowledge assets for every AI target:

| Use Case | What you get | Examples |
|---|---|---|
| AI Skills | Comprehensive SKILL.md + references | Claude Code, Gemini, GPT |
| RAG Pipelines | Chunked documents with rich metadata | LangChain, LlamaIndex, Haystack |
| Vector Databases | Pre-formatted data ready for upsert | Pinecone, Chroma, Weaviate, FAISS |
| AI Coding Assistants | Context files your IDE AI reads automatically | Cursor, Windsurf, Cline, Continue.dev |

📚 Documentation

| I want to... | Read this |
|---|---|
| Get started quickly | Quick Start - 3 commands to first skill |
| Understand concepts | Core Concepts - How it works |
| Scrape sources | Scraping Guide - All source types |
| Enhance skills | Enhancement Guide - AI enhancement |
| Export skills | Packaging Guide - Platform export |
| Look up commands | CLI Reference - All 20 commands |
| Configure | Config Format - JSON specification |
| Fix issues | Troubleshooting - Common problems |

Complete documentation: docs/README.md

Instead of spending days on manual preprocessing, Skill Seekers:

  1. Ingests — docs, GitHub repos, local codebases, PDFs, videos, notebooks, wikis, and 10+ more source types
  2. Analyzes — deep AST parsing, pattern detection, API extraction
  3. Structures — categorized reference files with metadata
  4. Enhances — AI-powered SKILL.md generation (Claude, Gemini, or local)
  5. Exports — 16 platform-specific formats from one asset

Why Use This?

For AI Skill Builders (Claude, Gemini, OpenAI)

  • 🎯 Production-grade Skills — 500+ line SKILL.md files with code examples, patterns, and guides
  • 🔄 Enhancement Workflows — Apply security-focus, architecture-comprehensive, or custom YAML presets
  • 🎮 Any Domain — Game engines (Go
