
Gxtract

GXtract is an MCP server designed to integrate with VS Code and other compatible editors. It provides a suite of tools for interacting with the GroundX platform, so you can use its document understanding capabilities directly within your development environment.

search-data-extraction · rag
By sascharo
13 · Updated 1 year ago · Python · NOASSERTION

Installation

npx -y gxtract

Configuration

{
  "mcpServers": {
    "gxtract": {
      "command": "npx",
      "args": ["-y", "gxtract"]
    }
  }
}

How to use

  1. Run the installation command above (if needed)
  2. Open your Claude Code settings file (~/.claude/settings.json)
  3. Add the configuration to the mcpServers section
  4. Restart Claude Code to apply changes

GXtract MCP Server


License: GPL v3

GXtract is a Model Context Protocol (MCP) server designed to integrate with VS Code and other compatible editors. It provides a suite of tools for interacting with the GroundX platform, enabling you to leverage its powerful document understanding capabilities directly within your development environment.


Features

  • GroundX Integration: Access GroundX functionalities like document search, querying, and semantic object explanation.
  • MCP Compliant: Built for use with VS Code's MCP client and other MCP-compatible systems.
  • Efficient and Modern: Developed with Python 3.12+ and FastMCP v2 for performance.
  • Easy to Configure: Simple setup for VS Code.
  • Caching: In-memory cache for GroundX metadata to improve performance and reduce API calls.

Architecture

The high-level system architecture of GXtract illustrates how the components interact:

graph TB
    subgraph "Client"
        VSC[VS Code / Editor]
    end

    subgraph "GXtract MCP Server"
        MCP[MCP Interface<br>stdio/http]
        Server[GXtract Server]
        Cache[Metadata Cache]
        Tools[Tool Implementations]
    end

    subgraph "External Services"
        GXAPI[GroundX API]
    end

    VSC -->|MCP Protocol| MCP
    MCP --> Server
    Server --> Tools
    Tools -->|Query| GXAPI
    Tools -->|Read/Write| Cache
    Cache -.->|Refresh| GXAPI

This diagram shows:

  1. Client Integration: VS Code communicates with GXtract over MCP
  2. Transport Layer: Supports both stdio (for direct VS Code integration) and HTTP transport
  3. Core Components: Server manages tool registration and requests
  4. Caching Layer: Maintains metadata to reduce API calls
  5. Tool Implementation: Provides specialized functions for interacting with GroundX
  6. API Communication: Secure connection to GroundX platform

For more detailed architecture information, see the full documentation.
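The caching layer in the diagram can be pictured as a small in-memory store with a time-to-live. The following is a minimal sketch of that idea, not GXtract's actual implementation: the class name `MetadataCache`, its methods, and the `fetch` callback are hypothetical. Only the 3600-second default mirrors the documented `--cache-ttl` default.

```python
import time
from typing import Any, Callable


class MetadataCache:
    """Hypothetical in-memory cache with per-entry expiry (illustration only)."""

    def __init__(self, ttl_seconds: float = 3600,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value, calling fetch() only on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = fetch()  # stands in for a GroundX API call
        self._entries[key] = (now, value)
        return value

    def refresh(self) -> None:
        """Drop all entries so the next lookup goes back to the API."""
        self._entries.clear()
```

With a cache like this, repeated lookups of the same key inside the TTL window reach the API only once, which is the reduction in API calls the list above refers to.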

Prerequisites

  • Python 3.12 or higher.
  • UV (Python package manager): Version 0.7.6 or higher. You can install it from astral.sh/uv.
  • GroundX API Key: You need a valid API key from the GroundX Dashboard.

Installing UV

Before you can use GXtract, you need to install UV (version 0.7.6 or higher), a modern Python package manager written in Rust that offers significant performance improvements over traditional tools.

Quick Installation Methods

Windows (PowerShell 7):

powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

macOS and Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh

Alternative Installation Methods

Using pip:

pip install --upgrade uv

Using Homebrew (macOS):

brew install uv

Using pipx (isolated environment):

pipx install uv

After installation, verify that UV is working correctly:

uv --version

This should display version 0.7.6 or higher. For more information about UV, visit the official documentation.
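If you want to automate the version check, for example in a setup script, the comparison can be sketched as follows. This is an illustration only: the helper names are hypothetical, and the sketch assumes the output of `uv --version` contains a dotted `major.minor.patch` number.

```python
import re
import shutil
import subprocess

MIN_UV = (0, 7, 6)  # minimum version stated in the prerequisites


def parse_uv_version(output: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from text such as 'uv 0.7.6 (...)'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version number found in {output!r}")
    return tuple(int(part) for part in match.groups())


def uv_is_recent_enough(output: str, minimum: tuple[int, int, int] = MIN_UV) -> bool:
    """Compare numerically so that 0.10.0 counts as newer than 0.7.6."""
    return parse_uv_version(output) >= minimum


if __name__ == "__main__":
    if shutil.which("uv") is None:
        print("uv was not found on PATH")
    else:
        result = subprocess.run(["uv", "--version"], capture_output=True,
                                text=True, check=True)
        verdict = "OK" if uv_is_recent_enough(result.stdout) else "too old"
        print(result.stdout.strip(), "->", verdict)
```

Comparing integer tuples rather than strings avoids the common mistake of treating "0.10.0" as older than "0.7.6".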

Quick Start: VS Code Integration

  1. Clone the GXtract Repository:

    git clone <repository_url>  # Replace <repository_url> with the actual URL
    cd gxtract
  2. Install Dependencies using UV: Open a terminal in the gxtract project directory and run:

    uv sync

    This command creates a virtual environment (if one doesn't exist or isn't active) and installs all necessary dependencies specified in pyproject.toml and uv.lock.

  3. Set GroundX API Key: The GXtract server requires your GroundX API key. You need to make this key available as an environment variable named GROUNDX_API_KEY. VS Code will pass this environment variable to the server based on the configuration below. Ensure GROUNDX_API_KEY is set in the environment where VS Code is launched, or configure your shell profile (e.g., .bashrc, .zshrc, PowerShell Profile) to set it.

    Option 1: Using Environment Variables (as shown in step 4)

    This approach reads the API key from your system environment variables:

    "env": {
        "GROUNDX_API_KEY": "${env:GROUNDX_API_KEY}"
    }

    Option 2: Using VS Code's Secure Inputs

    VS Code can prompt for your API key and store it securely. Add this to your settings.json:

    "inputs": [
      {
        "type": "promptString",
        "id": "groundx-api-key",
        "description": "GroundX API Key",
        "password": true
      }
    ]

    Then reference it in your server configuration:

    "env": {
        "GROUNDX_API_KEY": "${input:groundx-api-key}"
    }

    With this approach, VS Code will prompt you for the API key the first time it launches the server, then store it securely in your system's credential manager (Windows Credential Manager, macOS Keychain, or similar).

  4. Configure VS Code settings.json: Open your VS Code settings.json file (Ctrl+Shift+P, then search for "Preferences: Open User Settings (JSON)"). Add or update the mcp.servers configuration:

    "mcp": {
        "servers": {
            "gxtract": { // You can name this server entry as you like, e.g. GXtract
                "command": "uv",
                "type": "stdio", // 💡 GXtract also supports http, but VS Code currently supports only stdio
                "args": [
                    // Adjust the path to your gxtract project directory if it's different
                    "--directory", 
                    "DRIVE:\\path\\to\\your\\gxtract", // Example: C:\\Users\\yourname\\projects\\gxtract
                    "--project",
                    "DRIVE:\\path\\to\\your\\gxtract", // Example: C:\\Users\\yourname\\projects\\gxtract
                    "run",
                    "gxtract", // This matches the script name in pyproject.toml
                    "--transport",
                    "stdio" // 💡 Ensure this matches the "type" above
                ],
                "env": {
                    // Option 1: Using environment variables (system-wide)
                    "GROUNDX_API_KEY": "${env:GROUNDX_API_KEY}"
    
                    // Option 2: Using secure VS Code input (uncomment to use)
                    // "GROUNDX_API_KEY": "${input:groundx-api-key}"
                }
            }
        }
    }

    If using Option 2 (secure inputs), add this section (settings.json):

    // 💡 Only needed for Option 2 (secure inputs)
    "inputs": [
        {
            "type": "promptString",
            "id": "groundx-api-key",
            "description": "GroundX API Key",
            "password": true
        }
    ]

    Important:

    • Replace "DRIVE:\\path\\to\\your\\gxtract" with the absolute path to the gxtract directory on your system.
    • The "command": "uv" assumes uv is in your system's PATH. If not, you might need to provide the full path to the uv executable.
    • The server entry name ("gxtract" in the example above) is how it will appear in VS Code's MCP interface.
  5. Reload VS Code: After saving settings.json, you might need to reload VS Code (Ctrl+Shift+P, "Developer: Reload Window") for the changes to take effect.

  6. Using GXtract Tools: Once configured, you can access GXtract's tools through VS Code's MCP features (e.g., via chat @ mentions if your VS Code version supports it, or other MCP integrations).
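Hand-editing Windows paths in JSON is error-prone because every backslash must be doubled. As a convenience, the server entry from step 4 can be generated with a few lines of Python. This is a sketch, not part of GXtract: the function names are hypothetical and the example path is a placeholder. The keys and arguments mirror the configuration shown in step 4.

```python
import json


def build_server_entry(project_dir: str, use_secure_input: bool = False) -> dict:
    """Build the "gxtract" entry from step 4 for a given absolute project path."""
    key_ref = "${input:groundx-api-key}" if use_secure_input else "${env:GROUNDX_API_KEY}"
    return {
        "command": "uv",
        "type": "stdio",
        "args": [
            "--directory", project_dir,
            "--project", project_dir,
            "run", "gxtract",
            "--transport", "stdio",
        ],
        "env": {"GROUNDX_API_KEY": key_ref},
    }


def build_settings_fragment(project_dir: str, use_secure_input: bool = False) -> str:
    """Return JSON text for the "mcp" block; json.dumps escapes the backslashes."""
    entry = build_server_entry(project_dir, use_secure_input)
    return json.dumps({"mcp": {"servers": {"gxtract": entry}}}, indent=4)


if __name__ == "__main__":
    # Placeholder path; replace it with the location of your own checkout.
    print(build_settings_fragment(r"C:\Users\yourname\projects\gxtract"))
```

The printed fragment can be pasted into settings.json. If you choose the secure input option, the "inputs" section from step 4 still has to be added by hand.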

Available Tools

GXtract provides the following tools for interacting with GroundX:

  • groundx/searchDocuments: Search for documents within your GroundX projects.
  • groundx/queryDocument: Ask specific questions about a document in GroundX.
  • groundx/explainSemanticObject: Get explanations for diagrams, tables, or other semantic objects within documents.
  • cache/refreshMetadataCache: Manually refresh the GroundX metadata cache.
  • cache/refreshCachedResources: Manually refresh the GroundX projects and buckets cache.
  • cache/getCacheStatistics: Get statistics about the cached metadata.
  • cache/listCachedResources: List all currently cached GroundX resources (projects, buckets).
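VS Code normally invokes these tools for you, but it can help to see what a call looks like on the wire. MCP clients invoke tools with the JSON-RPC 2.0 method `tools/call`. The sketch below builds such a request. The tool names come from the list above, while the helper name and the `query` argument are assumptions; check each tool's published schema for its real parameters.

```python
import json
from itertools import count

_request_ids = count(1)  # each JSON-RPC request on a connection needs its own id


def make_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# The "query" argument is a placeholder; the real tool schema may differ.
request = make_tool_call("groundx/searchDocuments", {"query": "quarterly revenue"})
print(json.dumps(request, indent=2))
```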

Configuration

The server can be configured via command-line arguments when run directly. When used via VS Code, these are typically set in the args array in settings.json.

  • --transport {stdio|http}: Communication transport type (default: http; use stdio for VS Code).
  • --host TEXT: Host address for HTTP transport (default: 127.0.0.1).
  • --port INTEGER: Port for HTTP transport (default: 8080).
  • --log-level {DEBUG|INFO|WARNING|ERROR|CRITICAL}: Logging level (default: INFO).
  • --log-format {text|json}: Log output format (default: text).
  • --disable-cache: Disable the GroundX metadata cache.
  • --cache-ttl INTEGER: Cache Time-To-Live in seconds (default: 3600).
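To make the options and their defaults concrete, here is an illustrative `argparse` parser that mirrors the list above. It is a sketch for experimentation only. GXtract's real command-line interface may be built with a different library and may validate values differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(prog="gxtract")
    parser.add_argument("--transport", choices=["stdio", "http"], default="http")
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=8080)
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"])
    parser.add_argument("--log-format", choices=["text", "json"], default="text")
    parser.add_argument("--disable-cache", action="store_true")
    parser.add_argument("--cache-ttl", type=int, default=3600)
    return parser


# Mirrors the arguments in: uv run gxtract --transport stdio --cache-ttl 600
args = build_parser().parse_args(["--transport", "stdio", "--cache-ttl", "600"])
print(args.transport, args.cache_ttl, args.disable_cache)
```

In the VS Code setup, the same flags go at the end of the args array in settings.json, after "run" and "gxtract".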

API Key Security

The GroundX API key is sensitive information that should be handled securely. GXtract supports several approaches to provide this key:

  1. Environment Variables (recommended for development):

    • Set GROUNDX_API_KEY in your system or shell environment
    • VS Code will pass it to the server using ${env:GROUNDX_API_KEY} in settings.json
  2. VS Code Secure Storage (recommended for shared workstations):

    • Configure VS Code to prompt for the key and store it securely
    • Uses your system's credential manager (Windows Credential Manager, macOS Keychain)
    • Setup using the inputs section in settings.json as shown in the Quick Start
  3. Direct Environment Variable in VS Code settings (not recommended):

    • It's possible to set the key directly in settings.json: "GROUNDX_API_KEY": "your-api-key-here"
    • This is not recommended as it stores the key in plaintext in your settings.json file

Always ensure your API key is not committed to source control or shared with unauthorized users.
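One way to guard against the third approach slipping into a shared configuration is a small check that flags literal secrets. The sketch below is a hypothetical helper, not part of GXtract. It assumes you have already loaded the server's "env" mapping as a Python dict (settings.json may contain comments, so a plain `json.loads` on the whole file can fail), and it treats any variable whose name ends in `_KEY`, `_TOKEN`, or `_SECRET` as sensitive.

```python
import re

# Matches VS Code variable references such as ${env:NAME} or ${input:id}.
_VARIABLE_REF = re.compile(r"^\$\{(env|input):[^}]+\}$")


def find_plaintext_secrets(server_env: dict[str, str]) -> list[str]:
    """Return the names of sensitive env entries whose value is a literal."""
    return [
        name
        for name, value in server_env.items()
        if name.upper().endswith(("_KEY", "_TOKEN", "_SECRET"))
        and not _VARIABLE_REF.match(value)
    ]
```

An empty result means every sensitive entry is a `${env:...}` or `${input:...}` reference rather than a key stored in plaintext.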

Development

To set up for development:

  1. Clone the repository.
  2. Navigate
