New MCP Server Gives AI Agents Instant Access to Private Documentation They Were Never Trained On

Revolutionary Sandboxed Tool Solves LLM Knowledge Gap for Proprietary Libraries

A new open-source tool, docs-mcpserver, enables AI coding agents to accurately use internal frameworks, version-specific libraries, and large specifications without any additional training data. The server acts as a secure, queryable documentation layer between large language models (LLMs) and codebases they were never exposed to.

Source: dev.to

The Core Problem

Developers using LLMs for coding often struggle because models haven't been trained on private or heavily customized libraries. “LLMs guess at API signatures, waste tokens by parsing entire files, and can't follow version-locked specifications,” explains a senior engineer familiar with the project. “docs-mcpserver gives agents the real names, real signatures, and real shapes from documentation alone.”

How It Works

The MCP server ingests three documentation types: Markdown files (like *.md), API references (C# XML or TypeDoc JSON), and schema definitions (JSON Schema, OpenAPI 3.x, Swagger 2.0). It exposes a single tool that agents call to discover available libraries and fetch only the necessary snippets.
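Under the Model Context Protocol, an agent invokes a server's tool through a standard JSON-RPC `tools/call` request. The tool name and arguments below are illustrative placeholders, not docs-mcpserver's actual schema (which the agent discovers via `tools/list`); a discovery call might look like:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "docs",
    "arguments": { "action": "list_libraries" }
  }
}
```

The server's response would enumerate the exposed libraries, letting the agent decide which documentation to drill into.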

Instead of loading a 4000-line reference file, the agent requests a table of contents and pulls just one chapter. The server is sandboxed with path-traversal protection, ensuring agents read only what's explicitly exposed. “It's like giving the agent a well-indexed, secure library card rather than dumping the whole bookshelf,” the engineer added.
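The chapter-level fetch described above could be sketched as a second call against the same hypothetical tool, this time requesting one section of one library. Again, the argument names (`action`, `library`, `section`) are assumptions for illustration:

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "docs",
    "arguments": {
      "action": "get_section",
      "library": "acme-core",
      "section": "authentication"
    }
  }
}
```

Only the requested section's text comes back in the tool result, which is what keeps token usage proportional to what the agent actually needs.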

Background: The Long Struggle with Untrained Knowledge

LLMs are trained on public data up to a cutoff date, leaving them blind to internal enterprise frameworks, client-specific APIs, and pinned versions of well-known libraries. Previously, developers either fed entire repositories as context (costly and slow) or relied on the agent's improvisation—often resulting in hallucinated method names or outdated parameters.

Raw file access also posed security risks and wasted tokens. docs-mcpserver changes this by ingesting documentation into a queryable index, serving agents only the snippets they request, and sandboxing all file access behind path-traversal protection.

What This Means for Developers and AI Agents

This tool effectively plugs a critical gap in LLM-based coding. Developers can now confidently generate code against their own internal frameworks, client specifications, or even massive public specs like the DATEX traffic information standard (which the creator personally used).


The multi-library support is a game-changer: a single server can host an in-house framework, a client's framework, and a specific version of a public library side by side. “Instead of running five separate documentation servers, you get one with a small, focused toolset,” the engineer noted. “No token waste, no guesswork.”

For enterprises with proprietary codebases, this means AI agents can finally contribute meaningfully without compromising security or accuracy. The open-source nature invites community improvements, and the setup is straightforward: install, build, and point to a config file or a single folder.

Setting It Up in Under a Minute

Getting started requires Node.js and npm:

npm install @docs-mcpserver/core
npm run build

To serve a single folder:

docs-mcpserver ./docs --name "My Docs"

For multiple libraries, use a JSON config file:

{
  "name": "dev-docs",
  "description": "Private frameworks",
  "cacheDir": "./cache",
  "libraries": [
    {
      "name": "acme-core",
      "description": "Our internal app framework",
      "sources": [...]
    }
  ]
}
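Once built, the server can be registered with any MCP-capable client through the client's standard server configuration (the shape below follows the common `mcpServers` convention used by desktop MCP clients). The command name and `--config` flag are placeholders for your local install, not confirmed docs-mcpserver options:

```json
{
  "mcpServers": {
    "dev-docs": {
      "command": "docs-mcpserver",
      "args": ["--config", "./docs-config.json"]
    }
  }
}
```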

Immediate Impact on AI-Assisted Development

The tool removes a major barrier to using LLMs in production environments. Enterprises that have hesitated to adopt AI code generators due to lack of internal knowledge can now integrate them securely. Developers report significant time savings—no more manually pasting docs or fighting hallucinations.

“It's the difference between a model that guesses and one that verifies,” concluded the engineer. “We expect docs-mcpserver to become a standard component in AI coding workflows.”

For full documentation, visit the project's GitHub repository.
