
Context First

Session memory, context health monitoring, reasoning quality, and truthfulness verification MCP server with 37 tools and tiered memory storage. `npx -y context-first-mcp`

Tags: knowledge-memory, monitoring, rag
By XJTLUmedia
0 · Updated 2 months ago · TypeScript · MIT

Installation

npx -y context-first-mcp

Configuration

{
  "mcpServers": {
    "Context-First-MCP": {
      "command": "npx",
      "args": ["-y", "context-first-mcp"]
    }
  }
}

How to use

  1. Run the installation command above (if needed)
  2. Open your Claude Code settings file (~/.claude/settings.json)
  3. Add the configuration to the mcpServers section
  4. Restart Claude Code to apply changes
<div align="center">

Context-First MCP

The MCP server that keeps your AI grounded, coherent, and honest — across every turn.

npm version · npm downloads · License: MIT · MCP Compatible · Smithery · Glama · Node ≥18 · TypeScript

npx context-first-mcp

Works instantly with Claude Desktop · Cursor · VS Code · any MCP client · Vercel remote — zero API keys needed.

</div>

37 research-backed tools across 7 layers, spanning context health, state, sandboxing, persistent memory, advanced reasoning, truthfulness verification, orchestration, structured research, and autonomous file export. One `context_loop` call replaces 6–7 individual tools and returns a unified action directive.


Why Your AI Conversations Break Down

Long AI conversations fail in four predictable ways. Context-First addresses each of them:

| Failure Mode | What Goes Wrong | Context-First Solution |
| --- | --- | --- |
| Context Drift | AI forgets earlier decisions and intent as the conversation grows | `context_loop` + `detect_drift` continuously re-anchor every turn |
| Silent Contradiction | New inputs silently overrule established facts and the AI doesn't notice | `detect_conflicts` compares every input against locked ground truth |
| Vague Execution | AI proceeds on underspecified requirements, producing misaligned output | `check_ambiguity` + `abstention_check` ask clarifying questions instead of guessing |
| Hallucinated Success | Tool outputs look successful but didn't actually achieve the goal | `verify_execution` rechecks whether the outcome matches the stated intent |

What You Get

37 production-ready tools grouped into 7 layers — plus 1 orchestrator that runs them all:

context_loop  ─────────────────────────────────────────────────────────────────
  ├─ Layer 1 · Context Health      (9 tools)   recap, conflict, ambiguity, depth …
  ├─ Layer 2 · Sandbox             (3 tools)   discover_tools, quarantine, merge
  ├─ Layer 3 · Persistent Memory   (6 tools)   store, recall, compact, graph …
  ├─ Layer 4 · Advanced Reasoning  (5 tools)   InftyThink, Coconut, KAG, MindEvo …
  ├─ Layer 5 · Truthfulness        (7 tools)   NCB, IOE, verify_first, self_critique …
  └─ State + Research Pipeline + Export  (7 tools)

One call. One directive. One score.

{
  "directive": {
    "action": "clarify",
    "contextHealth": 0.62,
    "instruction": "Resolve with the user: (1) Is this a firm requirement? (2) Which framework?",
    "autoExtractedFacts": { "deploy_to": "Vercel" },
    "suggestedNextTools": ["verify_execution", "quarantine_context"]
  }
}

Quick Start

npx — zero install

npx context-first-mcp

Claude Desktop

{
  "mcpServers": {
    "context-first": {
      "command": "npx",
      "args": ["-y", "context-first-mcp"]
    }
  }
}

Cursor / VS Code

{
  "mcp": {
    "servers": {
      "context-first": {
        "command": "npx",
        "args": ["-y", "context-first-mcp"]
      }
    }
  }
}

Remote (Streamable HTTP)

{
  "mcpServers": {
    "context-first": {
      "url": "https://context-first-mcp.vercel.app/api/mcp"
    }
  }
}

Deploy your own Vercel instance

Deploy with Vercel


Tool Reference

Layer 1: Core Context Health (9 tools)

| Tool | Purpose |
| --- | --- |
| `context_loop` | One-call orchestrator. Runs 8 stages (ingest → recap → conflict → ambiguity → entropy → abstention → discovery → synthesis) and returns a single directive with action, contextHealth score, extracted facts, and suggested next tools |
| `recap_conversation` | Extracts hidden intent, key decisions, and produces consolidated state summaries |
| `detect_conflicts` | Compares new input against ground truth; surfaces contradictions |
| `check_ambiguity` | Identifies underspecified requirements and generates clarifying questions |
| `verify_execution` | Validates whether tool outputs actually achieved the stated goal |
| `entropy_monitor` | Proxy-entropy scoring via lexical diversity, contradiction density, hedge frequency, and n-gram repetition (ERGO) |
| `abstention_check` | 5-dimension confidence scoring; abstains with questions rather than hallucinating (RLAAR) |
| `detect_drift` | Detects conversation drift from the original intent |
| `check_depth` | Evaluates response depth against question complexity |
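
To make the `entropy_monitor` signals more concrete, here is a minimal sketch of how response-level proxy signals could be computed and combined. It is an illustration under stated assumptions, not the server's implementation: the function names, the hedge-word list, and the equal weighting are all hypothetical, and contradiction density is passed in as a number because detecting contradictions is out of scope for this sketch.

```typescript
// Hypothetical sketch of response-level proxy signals.
// Names, hedge list, and weights are assumptions, not the server's code.
const HEDGES = ["maybe", "perhaps", "possibly", "might", "probably"];

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
}

// Share of distinct tokens (type-token ratio); low values suggest repetitive text.
function lexicalDiversity(tokens: string[]): number {
  return tokens.length === 0 ? 1 : new Set(tokens).size / tokens.length;
}

// Share of tokens that are hedge words.
function hedgeFrequency(tokens: string[]): number {
  if (tokens.length === 0) return 0;
  return tokens.filter((t) => HEDGES.indexOf(t) !== -1).length / tokens.length;
}

// Share of n-grams that repeat an n-gram seen earlier in the same text.
function ngramRepetition(tokens: string[], n = 3): number {
  const total = tokens.length - n + 1;
  if (total <= 0) return 0;
  const seen = new Set<string>();
  let repeats = 0;
  for (let i = 0; i < total; i++) {
    const gram = tokens.slice(i, i + n).join(" ");
    if (seen.has(gram)) repeats++;
    else seen.add(gram);
  }
  return repeats / total;
}

// Composite score in [0, 1]; higher means more degraded output.
// Equal weighting of the four signals is an arbitrary choice for this sketch.
function proxyEntropy(text: string, contradictionDensity = 0): number {
  const tokens = tokenize(text);
  const signals = [
    1 - lexicalDiversity(tokens),
    contradictionDensity,
    hedgeFrequency(tokens),
    ngramRepetition(tokens),
  ];
  return signals.reduce((a, b) => a + b, 0) / signals.length;
}
```

A caller would compare the composite against a threshold of its own choosing; the threshold value itself is not specified in this README.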

Layer 1b: State Management (4 tools)

| Tool | Purpose |
| --- | --- |
| `get_state` | Retrieve confirmed facts and task status |
| `set_state` | Lock in ground truth; subsequent conflict checks run against these values |
| `clear_state` | Reset specific keys or all state |
| `get_history_summary` | Compressed conversation history with intent annotations |
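
The relationship between locked state and conflict detection can be sketched as a small in-memory store. This is a hypothetical model for illustration only: the class name `GroundTruth`, its methods, and the exact-match comparison are assumptions, and the real tools may compare values in a richer way.

```typescript
// Hypothetical in-memory ground-truth store; not the server's actual implementation.
type Facts = Record<string, string>;

interface Conflict {
  key: string;
  locked: string;
  incoming: string;
}

class GroundTruth {
  private facts = new Map<string, string>();

  // Lock a confirmed fact (similar in spirit to set_state).
  set(key: string, value: string): void {
    this.facts.set(key, value);
  }

  // Return a plain-object copy of the confirmed facts (similar to get_state).
  get(): Facts {
    const copy: Facts = {};
    this.facts.forEach((value, key) => {
      copy[key] = value;
    });
    return copy;
  }

  // Remove the given keys, or everything when no keys are passed (similar to clear_state).
  clear(keys?: string[]): void {
    if (!keys) {
      this.facts.clear();
      return;
    }
    for (const key of keys) this.facts.delete(key);
  }

  // Report every incoming fact whose value differs from a locked one.
  // Keys that were never locked are not treated as conflicts.
  detectConflicts(incoming: Facts): Conflict[] {
    const conflicts: Conflict[] = [];
    for (const key of Object.keys(incoming)) {
      const locked = this.facts.get(key);
      if (locked !== undefined && locked !== incoming[key]) {
        conflicts.push({ key, locked, incoming: incoming[key] });
      }
    }
    return conflicts;
  }
}

const truth = new GroundTruth();
truth.set("deploy_to", "Vercel");
console.log(truth.detectConflicts({ deploy_to: "Netlify", framework: "React" }));
```

Only `deploy_to` is reported here, because `framework` was never locked and so has nothing to contradict.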

Layer 2: Sandbox & Discovery (3 tools)

| Tool | Method | Purpose |
| --- | --- | --- |
| `discover_tools` | MCP-Zero + ScaleMCP | Natural-language tool routing that returns only semantically relevant tools, reducing context bloat by up to 98% |
| `quarantine_context` | Multi-Agent Quarantine | Create isolated memory silos for sub-tasks, preventing intent dilution |
| `merge_quarantine` | Multi-Agent Quarantine | Merge silo results with noise filtering; only promoted keys return to main context |
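
The quarantine-then-merge idea can be illustrated with a small sketch. The `Silo` shape, the function names, and the explicit `promote` list are assumptions made for this example; the actual tools may select promoted keys differently.

```typescript
// Hypothetical silo model; field and function names are assumptions for illustration.
type Context = Record<string, unknown>;

interface Silo {
  id: string;
  data: Context;
}

function has(obj: Context, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Start an isolated silo seeded with a shallow copy of selected keys from the main context.
function quarantine(main: Context, id: string, seedKeys: string[] = []): Silo {
  const data: Context = {};
  for (const key of seedKeys) {
    if (has(main, key)) data[key] = main[key];
  }
  return { id, data };
}

// Merge back only the promoted keys; everything else in the silo is dropped as noise.
// The main context is not mutated; a new object is returned.
function mergeQuarantine(main: Context, silo: Silo, promote: string[]): Context {
  const merged: Context = { ...main };
  for (const key of promote) {
    if (has(silo.data, key)) merged[key] = silo.data[key];
  }
  return merged;
}

const main: Context = { goal: "ship the landing page" };
const silo = quarantine(main, "benchmark-subtask", ["goal"]);
silo.data.scratchNotes = "tried three bundlers";
silo.data.result = "bundle size reduced";
const merged = mergeQuarantine(main, silo, ["result"]);
console.log(Object.keys(merged));
```

Because `scratchNotes` is not in the promote list, it never reaches the main context, which is the point of keeping sub-task work in a silo.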

Layer 3: Persistent Memory (6 tools)

| Tool | Purpose |
| --- | --- |
| `memory_store` | Store findings, decisions, and intermediate results with metadata |
| `memory_recall` | Retrieve relevant memories by semantic query |
| `memory_compact` | Compress and consolidate memory entries |
| `memory_graph` | Build and query a knowledge graph from stored memories |
| `memory_inspect` | Inspect memory store contents and statistics |
| `memory_curate` | Deduplicate and organize memory entries |

Layer 4: Advanced Reasoning (5 tools)

| Tool | Method | Purpose |
| --- | --- | --- |
| `inftythink_reason` | InftyThink | Infinite-depth reasoning with adaptive stopping |
| `coconut_reason` | Coconut | Chain-of-Continuous-Thought in latent space |
| `extracot_compress` | ExtraCoT | Compress chain-of-thought while preserving reasoning fidelity |
| `mindevolution_solve` | MindEvolution | Evolutionary search over the solution space |
| `kagthinker_solve` | KAG-Thinker | Knowledge-augmented generation with structured thinking |

Layer 5: Truthfulness & Verification (7 tools)

| Tool | Purpose |
| --- | --- |
| `probe_internal_state` | Probe model consistency across paraphrased prompts |
| `detect_truth_direction` | Detect whether model reasoning is trending toward or away from truth |
| `ncb_check` | Neighborhood consistency check across semantically equivalent inputs |
| `check_logical_consistency` | Verify logical coherence of reasoning chains |
| `verify_first` | Pre-verification before committing to claims |
| `ioe_self_correct` | Intrinsic-extrinsic self-correction |
| `self_critique` | Structured self-critique with improvement suggestions |

Research Pipeline & Export (2 tools)

| Tool | Purpose |
| --- | --- |
| `research_pipeline` | Structured research orchestration across init → gather → analyze → verify → finalize. Covers all 34 underlying tool-equivalents (state, sandboxing, memory, reasoning, truthfulness, context health). Writes files autonomously to disk as the pipeline runs; no LLM cooperation needed for file output. |
| `export_research_files` | Writes every verified report chunk and/or every raw evidence batch to disk in a single call. |

Built on Peer-Reviewed Research

Every core algorithm traces back to a published paper:

| Algorithm | Paper | arXiv | Tool |
| --- | --- | --- | --- |
| MCP-Zero | Active Tool Request | 2506.01056 | `discover_tools` |
| ScaleMCP | Semantic Tool Grouping | 2505.06416 | `discover_tools` registry |
| ERGO | Entropy-based Quality | 2510.14077 | `entropy_monitor` |
| RLAAR | Calibrated Abstention | 2510.18731 | `abstention_check` |

Implementation highlights:

  • Proxy Entropy (ERGO): 4 response-level proxy signals (lexical diversity, contradiction density, hedge-word frequency, n-gram repetition) replace inaccessible token-level logprobs. A composite score above the threshold triggers an adaptive context reset.
  • TF-IDF Discovery (MCP-Zero): Pure TypeScript, zero external dependencies. Indexes all tool descriptions at startup; cosine similarity routes each query to the top-k relevant tools only.
  • Inference-Time Abstention (RLAAR): 5-dimension confidence scoring replaces the RL training loop. Abstains with targeted questions when confidence falls below the threshold, with no hallucination fallback.
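
As an illustration of the TF-IDF discovery bullet, the sketch below indexes a few tool descriptions and routes a query by cosine similarity. It is a generic textbook TF-IDF variant written for this README, not the server's code: the `ToolIndex` class, the smoothing formula, and the tokenizer are assumptions, and the tool descriptions are copied from the tables above.

```typescript
// Minimal TF-IDF + cosine-similarity router (assumed textbook variant, no dependencies).
interface Tool {
  name: string;
  description: string;
}

type Vector = Map<string, number>;

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

function norm(v: Vector): number {
  let sum = 0;
  v.forEach((w) => {
    sum += w * w;
  });
  return Math.sqrt(sum);
}

function cosine(a: Vector, b: Vector): number {
  let dot = 0;
  a.forEach((w, term) => {
    dot += w * (b.get(term) ?? 0);
  });
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

class ToolIndex {
  private idf = new Map<string, number>();
  private vectors: { tool: Tool; vector: Vector }[];

  // Index all descriptions once, up front.
  constructor(tools: Tool[]) {
    // Document frequency: number of descriptions containing each term.
    const df = new Map<string, number>();
    for (const tool of tools) {
      new Set(tokenize(tool.description)).forEach((term) => {
        df.set(term, (df.get(term) ?? 0) + 1);
      });
    }
    // Smoothed inverse document frequency.
    df.forEach((count, term) => {
      this.idf.set(term, Math.log((1 + tools.length) / (1 + count)) + 1);
    });
    this.vectors = tools.map((tool) => ({
      tool,
      vector: this.vectorize(tokenize(tool.description)),
    }));
  }

  // Term frequency times IDF; terms unseen at index time are ignored.
  private vectorize(tokens: string[]): Vector {
    const vector: Vector = new Map();
    for (const term of tokens) {
      const idf = this.idf.get(term);
      if (idf !== undefined) vector.set(term, (vector.get(term) ?? 0) + idf);
    }
    return vector;
  }

  // Return the names of the top-k tools with non-zero similarity to the query.
  search(query: string, k = 3): string[] {
    const q = this.vectorize(tokenize(query));
    return this.vectors
      .map(({ tool, vector }) => ({ name: tool.name, score: cosine(q, vector) }))
      .filter((r) => r.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, k)
      .map((r) => r.name);
  }
}

const index = new ToolIndex([
  { name: "detect_conflicts", description: "Compares new input against ground truth; surfaces contradictions" },
  { name: "memory_recall", description: "Retrieve relevant memories by semantic query" },
  { name: "check_ambiguity", description: "Identifies underspecified requirements and generates clarifying questions" },
]);
console.log(index.search("find contradictions with ground truth", 1));
```

Returning only the matches with a non-zero score keeps unrelated tools out of the result, which is how a router of this kind limits what is loaded into context.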

Export Helper (1 tool)

| Tool | Description |
| --- | --- |
| `export_research_files` | Writes research artifacts directly to disk. It can automatically expand and write every verified report chunk without asking the LLM to loop finalize manually, and it can also write every gathered raw-evidence batch even when verify has not passed. |

context_loop Pipeline

context_loop (single MCP tool call)
├── Stage 1: INGEST     — Store messages to session history
├── Stage 2: RECAP      — Extract intents, decisions, summaries
├── Stage 3: CONFLICT   — Detect contradictions against ground truth
├── Stage 4: AMBIGUITY  — Check for underspecified requirements
├── Stage 5: ENTROPY    — Monitor output quality degradation (ERGO)
├── Stage 6: ABSTENTION — Multi-dimensional confidence check (RLAAR)
├── Stage 7: DISCOVERY  — Suggest relevant next tools (MCP-Zero)
└── Stage 8: SYNTHESIS  — Combine signals → action recommendation + LLM directive

Synthesis Priority: abstain > reset > clarify > proceed
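
One way to read the priority line is that the highest-priority action raised by any stage wins. The sketch below shows that reading; the idea of stages casting "votes" and the `synthesize` function are assumptions for illustration, while the four action names and their order come from the line above.

```typescript
// Action names and order follow the priority line; the voting model is an assumption.
type Action = "abstain" | "reset" | "clarify" | "proceed";

const PRIORITY: Action[] = ["abstain", "reset", "clarify", "proceed"];

// Pick the highest-priority action among those raised by the stages.
// With no votes at all, default to "proceed".
function synthesize(votes: Action[]): Action {
  for (const action of PRIORITY) {
    if (votes.indexOf(action) !== -1) return action;
  }
  return "proceed";
}

console.log(synthesize(["clarify", "reset"])); // "reset" outranks "clarify"
```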

Each stage runs with independent error isolation — a failure in one stage doesn't block the others. The result includes per-stage timing, status, and detailed results for observability.
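
Per-stage error isolation with timing can be sketched as a runner that wraps each stage in its own try/catch. This is a simplified, synchronous illustration: the `Stage` and `StageResult` shapes are hypothetical, and the real pipeline may run stages asynchronously and report different fields.

```typescript
// Hypothetical stage runner; result fields are assumptions, not the server's schema.
interface StageResult {
  name: string;
  status: "ok" | "error";
  durationMs: number;
  result?: unknown;
  error?: string;
}

interface Stage {
  name: string;
  run: () => unknown;
}

// Run every stage in order. A throwing stage is recorded as "error"
// and the remaining stages still run.
function runStages(stages: Stage[]): StageResult[] {
  return stages.map(({ name, run }): StageResult => {
    const start = Date.now();
    try {
      const result = run();
      return { name, status: "ok", durationMs: Date.now() - start, result };
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      return { name, status: "error", durationMs: Date.now() - start, error };
    }
  });
}

const results = runStages([
  { name: "recap", run: () => "summary" },
  { name: "conflict", run: () => { throw new Error("boom"); } },
  { name: "ambiguity", run: () => [] },
]);
console.log(results.map((r) => `${r.name}:${r.status}`));
```

The `ambiguity` stage still runs and reports `ok` even though `conflict` threw, which is the behavior the paragraph above describes.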

LLM Directive (NEW)

The context_loop response includes a top-level directive object designed for LLM consumption — a compact, actionable instruction that replaces the need to parse nested stage results:

{
  "directive": {
    "action": "clarify",
    "instruction": "Before proceeding, resolve these issues with the user:\n1. Could you specify exactly what you mean?\n2. Is this a firm requirement or still open for discussion?",
    "questions": ["Could you specify exactly what you mean?", "Is this a firm requirement?"],
    "contextHealth": 0.62,
    "autoExtractedFacts": { "framework": "React", "deploy_to": "Vercel" },
    "suggestedNextTools": ["verify_execution", "quarantine_context"]
  }
}
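
A host application might consume the directive along these lines. The field names mirror the example JSON above; which fields are optional, the full set of `action` values (taken from the synthesis priority list), and the `nextStep` helper with its messages are assumptions for this sketch.

```typescript
// Field names mirror the example JSON; optionality and the handler are assumptions.
interface Directive {
  action: "abstain" | "reset" | "clarify" | "proceed";
  instruction: string;
  contextHealth: number;
  questions?: string[];
  autoExtractedFacts?: Record<string, string>;
  suggestedNextTools?: string[];
}

// Turn a directive into the next step a host application might take.
function nextStep(directive: Directive): string {
  switch (directive.action) {
    case "clarify":
      // Prefer the structured questions; fall back to the free-text instruction.
      return `Ask the user: ${(directive.questions ?? [directive.instruction]).join(" | ")}`;
    case "abstain":
      return "Decline to answer until more information is available.";
    case "reset":
      return "Start a fresh context from the confirmed facts.";
    case "proceed":
      return "Continue with the task.";
  }
}

const sample: Directive = {
  action: "clarify",
  instruction: "Before proceeding, resolve these issues with the user.",
  questions: ["Could you specify exactly what you mean?", "Is this a firm requirement?"],
  contextHealth: 0.62,
  autoExtractedFacts: { framework: "React", deploy_to: "Vercel" },
  suggestedNextTools: ["verify_execution", "quarantine_context"],
};
console.log(nextStep(sample));
```

Branching on the single `action` field is what lets a caller avoid parsing the nested per-stage results.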
