
Devrag

Lightweight local RAG MCP server for semantic vector search over markdown documents. Reduces token consumption by 40x with sqlite-vec and multilingual-e5-small embeddings. Supports filtered search by directory and filename patterns.

Tags: knowledge-memory, sqlite, embedding, rag
By tomohiro-owada
6114 · Updated 2 months ago · Go

Installation

npx -y devrag

Configuration

{
  "mcpServers": {
    "devrag": {
      "command": "npx",
      "args": ["-y", "devrag"]
    }
  }
}

How to use

  1. Run the installation command above (if needed)
  2. Open your Claude Code settings file (~/.claude/settings.json)
  3. Add the configuration to the mcpServers section
  4. Restart Claude Code to apply changes

DevRag

Free Local RAG for Claude Code - Save Tokens & Time

Japanese Version

DevRag is a lightweight RAG (Retrieval-Augmented Generation) system designed specifically for developers using Claude Code. Stop wasting tokens by reading entire documents - let vector search find exactly what you need.

Why DevRag?

When using Claude Code, reading documents with the Read tool consumes massive amounts of tokens:

  • Wasting Context: Reading entire docs every time (3,000+ tokens per file)
  • Poor Searchability: Claude doesn't know which file contains what
  • Repetitive: Same documents read multiple times across sessions

With DevRag:

  • 40x Fewer Tokens: Vector search retrieves only relevant chunks (~200 tokens)
  • Hundreds of Times Faster: Search in about 100ms vs. 25 to 30 seconds of reading
  • Auto-Discovery: Claude Code finds documents without knowing file names

Features

  • 🤖 Simple RAG - Retrieval-Augmented Generation for Claude Code
  • 📝 Markdown Support - Auto-indexes .md files
  • 🔍 Semantic Search - Natural language queries like "JWT authentication method"
  • 🚀 Single Binary - No Python, models auto-download on first run
  • 💻 CLI & MCP - Use as MCP server or standalone CLI commands
  • 🖥️ Cross-Platform - macOS / Linux / Windows
  • ⚡ Fast - Auto GPU/CPU detection, incremental sync
  • 🌐 Multilingual - Supports 100+ languages including Japanese & English

Quick Start

1. Download Binary

Get the appropriate binary from Releases:

Platform                 File
macOS (Apple Silicon)    devrag-macos-apple-silicon.tar.gz
macOS (Intel)            devrag-macos-intel.tar.gz
Linux (x64)              devrag-linux-x64.tar.gz
Linux (ARM64)            devrag-linux-arm64.tar.gz
Windows (x64)            devrag-windows-x64.zip

macOS/Linux:

tar -xzf devrag-*.tar.gz
chmod +x devrag-*
sudo mv devrag-* /usr/local/bin/

Note: macOS releases include libonnxruntime.dylib for CoreML GPU acceleration. Keep it in the same directory as the devrag binary.

Windows:

  • Extract the zip file
  • Place in your preferred location (e.g., C:\Program Files\devrag\)

2. Configure Claude Code

Add to ~/.claude.json or .mcp.json:

{
  "mcpServers": {
    "devrag": {
      "type": "stdio",
      "command": "/usr/local/bin/devrag"
    }
  }
}

Using a custom config file:

{
  "mcpServers": {
    "devrag": {
      "type": "stdio",
      "command": "/usr/local/bin/devrag",
      "args": ["--config", "/path/to/custom-config.json"]
    }
  }
}
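
The two configurations above can also be written programmatically. The following is a minimal sketch, not part of DevRag: the helper name add_devrag_server is hypothetical, and it assumes the target file is a JSON document with a top-level mcpServers object, as shown above. Existing entries are preserved.

```python
import json
from pathlib import Path


def add_devrag_server(settings_path, command="/usr/local/bin/devrag", config_file=None):
    """Merge a devrag entry into the mcpServers section of a JSON settings file.

    Hypothetical helper for illustration; DevRag does not ship it.
    """
    path = Path(settings_path)
    # Start from the existing settings so other servers and keys are preserved.
    settings = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    entry = {"type": "stdio", "command": command}
    if config_file is not None:
        # Same shape as the "custom config file" example above.
        entry["args"] = ["--config", str(config_file)]
    settings.setdefault("mcpServers", {})["devrag"] = entry
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

For example, add_devrag_server(".mcp.json", config_file="/path/to/custom-config.json") would write the second configuration shown above into a project-level .mcp.json.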

3. Add Your Documents

mkdir documents
cp your-notes.md documents/

That's it! Documents are automatically indexed on startup.

4. Search with Claude Code

In Claude Code:

"Search for JWT authentication methods"

Configuration

Create config.json:

{
  "document_patterns": [
    "./documents",
    "./notes/**/*.md",
    "./projects/backend/**/*.md"
  ],
  "db_path": "./vectors.db",
  "chunk_size": 500,
  "search_top_k": 5,
  "compute": {
    "device": "auto",
    "fallback_to_cpu": true
  },
  "model": {
    "name": "multilingual-e5-small",
    "dimensions": 384
  }
}

Configuration Options

  • document_patterns: Array of document paths and glob patterns
    • Supports directory paths: "./documents"
    • Supports glob patterns: "./docs/**/*.md" (recursive)
    • Multiple patterns: Index files from different locations
    • Note: Old documents_dir field is still supported (automatically migrated)
  • db_path: Vector database file path
  • chunk_size: Document chunk size in characters
  • search_top_k: Number of search results to return
  • compute.device: Compute device (auto, cpu, gpu)
  • compute.fallback_to_cpu: Fallback to CPU if GPU unavailable
  • model.name: Embedding model name
  • model.dimensions: Vector dimensions
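
To make the options above concrete, here is a rough sketch of how a config file could be normalized: missing keys are filled in and the legacy documents_dir field is turned into document_patterns. This is an illustration only. The default values are copied from the example config.json, and both the function name normalize_config and the exact migration rule are assumptions, not DevRag's actual loader.

```python
import copy

# Defaults copied from the example config.json above; whether DevRag uses
# exactly these fallbacks internally is an assumption.
DEFAULTS = {
    "document_patterns": ["./documents"],
    "db_path": "./vectors.db",
    "chunk_size": 500,
    "search_top_k": 5,
    "compute": {"device": "auto", "fallback_to_cpu": True},
    "model": {"name": "multilingual-e5-small", "dimensions": 384},
}


def normalize_config(raw):
    """Fill in missing keys and migrate the legacy documents_dir field."""
    cfg = copy.deepcopy(DEFAULTS)
    raw = dict(raw)
    # Assumed migration rule: a legacy directory becomes a one-element pattern list.
    legacy_dir = raw.pop("documents_dir", None)
    if legacy_dir is not None and "document_patterns" not in raw:
        raw["document_patterns"] = [legacy_dir]
    for key, value in raw.items():
        if isinstance(value, dict) and isinstance(cfg.get(key), dict):
            cfg[key].update(value)  # merge nested sections such as "compute"
        else:
            cfg[key] = value
    if cfg["compute"]["device"] not in ("auto", "cpu", "gpu"):
        raise ValueError(f"unknown compute.device: {cfg['compute']['device']!r}")
    return cfg
```

Calling normalize_config({"documents_dir": "./docs"}) returns a full configuration whose document_patterns is ["./docs"], with every other option at the values listed above.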

Command-Line Options

  • --config <path>: Specify a custom configuration file path (default: config.json)

Example:

devrag --config /path/to/custom-config.json

This is useful for:

  • Running multiple instances with different configurations
  • Testing different models or chunk sizes
  • Maintaining separate dev/test/prod configurations

Pattern Examples

{
  "document_patterns": [
    "./documents",                    // All .md files in documents/
    "./notes/**/*.md",                // Recursive search in notes/
    "./projects/*/docs/*.md",         // docs/ in each project
    "/path/to/external/docs"          // Absolute path
  ]
}
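
The pattern rules above can be approximated with Python's standard glob module. This is a sketch under stated assumptions rather than DevRag's own matcher: a bare directory is taken to mean the .md files directly inside it, and "**" is taken to match nested directories. The function name expand_patterns is hypothetical.

```python
import glob
import os


def expand_patterns(patterns):
    """Return the sorted, de-duplicated .md files matched by document_patterns."""
    found = set()
    for pattern in patterns:
        if os.path.isdir(pattern):
            # Bare directory: the .md files directly inside it (assumed non-recursive).
            matches = glob.glob(os.path.join(pattern, "*.md"))
        else:
            # Glob pattern: "**" matches zero or more directories when recursive=True.
            matches = glob.glob(pattern, recursive=True)
        for match in matches:
            if match.endswith(".md") and os.path.isfile(match):
                found.add(os.path.normpath(match))
    return sorted(found)
```

A file matched by two overlapping patterns appears only once in the result, which is the behavior you would want before indexing.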

MCP Tools

DevRag provides the following tools via Model Context Protocol:

search

Perform semantic vector search with optional filtering

Parameters:

  • query (string, required): Search query in natural language
  • top_k (number, optional): Maximum number of results (default: 5)
  • directory (string, optional): Filter to specific directory (e.g., "docs/api")
  • file_pattern (string, optional): Glob pattern for filename (e.g., "api-*.md", "*.md")

Returns: Array of search results with filename, chunk content, and similarity score

Examples:

// Basic search
search(query: "JWT authentication")

// Search only in docs/api directory
search(query: "user endpoints", directory: "docs/api")

// Search only files matching pattern
search(query: "deployment", file_pattern: "guide-*.md")

// Combined filters
search(query: "authentication", directory: "docs/api", file_pattern: "auth*.md")
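
As a rough illustration of what a filtered search like the ones above does, the sketch below ranks pre-embedded chunks by cosine similarity and applies the directory and file_pattern filters. It is not DevRag's implementation: the real server embeds text with multilingual-e5-small and stores vectors in sqlite-vec, whereas this example uses hand-written toy vectors. Treating directory as a path prefix and matching file_pattern against the base filename are assumptions.

```python
import fnmatch
import math
import posixpath


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(chunks, query_vec, top_k=5, directory=None, file_pattern=None):
    """Rank chunks (dicts with "filename", "content", "vector") against query_vec."""
    results = []
    for chunk in chunks:
        path = posixpath.normpath(chunk["filename"].replace("\\", "/"))
        # Assumed semantics: directory is a path prefix of the file.
        if directory is not None and not path.startswith(posixpath.normpath(directory) + "/"):
            continue
        # Assumed semantics: file_pattern is matched against the base filename only.
        if file_pattern is not None and not fnmatch.fnmatchcase(posixpath.basename(path), file_pattern):
            continue
        results.append({
            "filename": chunk["filename"],
            "content": chunk["content"],
            "score": cosine(query_vec, chunk["vector"]),
        })
    results.sort(key=lambda r: r["score"], reverse=True)
    return results[:top_k]
```

Filtering before scoring keeps the similarity computation limited to the files the caller actually cares about.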

index_markdown

Index a markdown file

Parameters:

  • filepath (string): Path to the file to index

list_documents

List all indexed documents

Returns: Document list with filenames and timestamps

delete_document

Remove a document from the index

Parameters:

  • filepath (string): Path to the file to delete

reindex_document

Re-index a document

Parameters:

  • filepath (string): Path to the file to re-index

CLI Usage

DevRag can also be used as a standalone CLI tool. All MCP tools are available as CLI commands.

# Start MCP server (default)
devrag
devrag serve

# Search documents
devrag search "JWT authentication"
devrag search --top-k 10 --directory docs/api "deployment"

# Index files
devrag index ./docs/api-spec.md
devrag index-code --directory ./src

# List indexed documents
devrag list
devrag list --fields filename

# Delete / Reindex
devrag delete --dry-run ./docs/old-spec.md
devrag reindex ./docs/updated-spec.md

# Code symbol relations
devrag search-relations --type calls handleAuth

# Build dictionary (Japanese-English mapping)
devrag build-dictionary

# Show CLI schema (machine-readable)
devrag schema

Output Format

All commands output JSON by default. Use --output text for human-readable output.

# JSON (default, suitable for scripts and AI agents)
devrag search "authentication"

# Text (human-readable)
devrag search --output text "authentication"

MCP Tool Name Compatibility

CLI commands also accept MCP tool names with underscores:

devrag index_markdown ./docs/api.md    # same as: devrag index
devrag list_documents                  # same as: devrag list
devrag delete_document ./docs/old.md   # same as: devrag delete
devrag reindex_document ./docs/api.md  # same as: devrag reindex
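
The alias pairs listed above amount to a small lookup table. The sketch below mirrors only the four pairs shown here. The name canonical_command is hypothetical, and how DevRag resolves aliases internally is not documented in this README.

```python
# MCP tool name -> CLI command, copied from the four examples above.
ALIASES = {
    "index_markdown": "index",
    "list_documents": "list",
    "delete_document": "delete",
    "reindex_document": "reindex",
}


def canonical_command(name):
    """Map an MCP tool name to its CLI command; other names pass through unchanged."""
    return ALIASES.get(name, name)
```

A wrapper script that receives MCP tool names could call canonical_command first and then use the short CLI form everywhere else.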

Flag Syntax

Flags must be placed before positional arguments:

# Correct
devrag delete --dry-run file.md

# Incorrect (--dry-run is ignored)
devrag delete file.md --dry-run
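
When calling the CLI from a script, it is easy to append flags after the file name by accident. The sketch below builds an argument list that always puts flags first, following the rule above. The helper build_argv is hypothetical, and the flag names in the example are the ones used elsewhere in this README.

```python
def build_argv(command, positionals, flags=None, binary="devrag"):
    """Build a devrag command line with every flag placed before the positionals."""
    argv = [binary, command]
    for name, value in (flags or {}).items():
        if value is True:
            argv.append(f"--{name}")             # boolean switch, e.g. --dry-run
        elif value is not False and value is not None:
            argv += [f"--{name}", str(value)]    # valued flag, e.g. --top-k 10
    argv += list(positionals)
    return argv
```

build_argv("delete", ["file.md"], {"dry-run": True}) yields ["devrag", "delete", "--dry-run", "file.md"], which matches the correct form above. The list can be passed to subprocess.run(argv, capture_output=True, text=True), and because commands output JSON by default, the captured stdout can then be parsed with json.loads.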

Team Development

Perfect for teams with large documentation repositories:

  1. Manage docs in Git: Normal Git workflow
  2. Each developer runs DevRag: Local setup on each machine
  3. Search via Claude Code: Everyone can search all docs
  4. Auto-sync: git pull automatically updates the index

Configure for your project's docs directory:

{
  "document_patterns": [
    "./docs",
    "./api-docs/**/*.md",
    "./wiki/**/*.md"
  ],
  "db_path": "./.devrag/vectors.db"
}

Performance

Environment: MacBook Pro M2, 100 files (1MB total)

Operation          Time    Tokens
Startup            2.3s    -
Indexing           8.5s    -
Search (1 query)   95ms    ~300
Traditional Read   25s     ~12,000

260x faster search, 40x fewer tokens
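
The two headline ratios follow directly from the Search and Traditional Read rows of the benchmark. A quick check of the arithmetic, using only the figures reported above:

```python
# Figures taken from the benchmark rows above.
search_seconds = 0.095   # Search (1 query): 95ms
read_seconds = 25.0      # Traditional Read: 25s
search_tokens = 300
read_tokens = 12_000

speedup = read_seconds / search_seconds    # about 263, rounded down to "260x" above
token_ratio = read_tokens / search_tokens  # exactly 40

print(f"{speedup:.0f}x faster, {token_ratio:.0f}x fewer tokens")
```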

Development

Run Tests

# All tests
go test ./...

# Specific packages
go test ./internal/config -v
go test ./internal/indexer -v
go test ./internal/embedder -v
go test ./internal/vectordb -v

# Integration tests
go test . -v -run TestEndToEnd

Build

# Using build script
./build.sh

# Direct build
go build -o devrag cmd/main.go

# Cross-platform release build
./scripts/build-release.sh

Creating a Release

# Create version tag
git tag v1.0.1

# Push tag
git push origin v1.0.1

GitHub Actions automatically:

  1. Builds for all platforms
  2. Creates GitHub Release
  3. Uploads binaries
  4. Generates checksums
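
If you download a release archive, you can compare it against the published checksum before extracting it. The sketch below assumes the checksums are SHA-256 hex digests, which this README does not state, so adjust the hash function if the release uses a different one. The helper names are hypothetical.

```python
import hashlib


def sha256_of(path, block_size=1 << 20):
    """Hash a file in blocks so large archives are never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True when the file's SHA-256 digest equals the published value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Call verify with the path to the downloaded archive and the digest string copied from the release page.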

Project Structure

devrag/
├── cmd/
│   └── main.go              # Entry point
├── internal/
│   ├── cli/                 # CLI commands
│   ├── config/              # Configuration
│   ├── embedder/            # Vector embeddings
│   ├── indexer/             # Indexing logic
│   ├── mcp/                 # MCP server
│   └── vectordb/            # Vector database
├── models/                  # ONNX models
├── build.sh                 # Build script
└── integration_test.go      # Integration tests

Troubleshooting

Model Download Fails

Cause: Internet connection or Hugging Face server issues

Solutions:

  1. Check internet connection
  2. For proxy environments:
    export HTTP_PROXY=http://your-proxy:port
    export HTTPS_PROXY=http://your-proxy:port
  3. Manual download (see models/DOWNLOAD.md)
  4. Retry (incomplete files are auto-removed)

GPU / CoreML Not Working

On macOS, DevRag uses Apple CoreML for GPU/Neural Engine acceleration. Requirements:

  • libonnxruntime.dylib must be in the same directory as the devrag binary
  • macOS releases from GitHub include this file automatically
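
A quick way to check the first requirement is to look for the library next to the binary. This is a small sketch with a hypothetical helper name. It only tests that the file exists beside the given path and says nothing about whether CoreML actually loads.

```python
from pathlib import Path


def onnxruntime_beside(binary_path, lib_name="libonnxruntime.dylib"):
    """Return True if lib_name sits in the same directory as the given binary."""
    return (Path(binary_path).parent / lib_name).is_file()
```

For the install location used in Quick Start, that would be onnxruntime_beside("/usr/local/bin/devrag").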

If CoreML is not available, DevRag falls back to CPU automatically. To tune performance:

# Adjust CPU thread count (default: 4)
DEVRAG_THREADS=4 devrag

To explicitly force CPU mode:

{
  "compute": {
    "device": "cpu",
    "fallback_to_cpu": true
  }
}

Won't Start

  • Ensure Go 1.21+ is installed (for building)
  • Check CGO is enabled: go env CGO_ENABLED
  • Verify dependencies are installed
  • Internet required for first run (model download)

Unexpected Search Results

  • Adjust chunk_size (default: 500)
  • Rebuild index (delete vectors.db and restart)
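
To see why chunk_size affects results, it helps to look at how a document splits at different sizes. The sketch below uses plain fixed-width character slicing, which is an assumption: DevRag's indexer may split on headings or sentences, or overlap chunks. The function name chunk_text is hypothetical.

```python
def chunk_text(text, chunk_size=500):
    """Split text into consecutive pieces of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```

Under this scheme a 1,200-character document becomes three chunks at the default size of 500 and five chunks at 250. Smaller chunks give more focused matches but less surrounding context per result, so the index should be rebuilt after changing the value.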

High Memory Usage

  • GPU mode loads model into VRAM
  • Switch to CPU mode for lower memory usage

Requirements

  • Go 1.21+ (for building from source)
  • CGO enabled (for sqlite-vec)
  • macOS, Linux, or Windows

License

MIT License

Credits

Contributing

Issues and Pull Requests are welcome!

Contributors

Special thanks to all contributors who helped improve DevRag:

  • @badri - Multiple document paths with glob patterns (#2), --config CLI flag (#3)
  • @io41 - Project cleanup and documentation improvements (#4)

Your contributions make DevRag better for everyone!

Author

towada

