MCP for documents: what it is and how OKF fits

What MCP actually is

MCP, the Model Context Protocol, is a standard interface that lets an AI agent talk to tools and data sources without a bespoke integration for each one. Before a standard like this, every connection was hand-built: this agent to that database, that agent to this file store, each with its own glue code, its own auth, its own quirks. The number of integrations grew with the number of agents multiplied by the number of tools, and almost none of it was reusable.

MCP collapses that. Instead of one integration per pairing, each tool or data source exposes a single MCP server that speaks the protocol. Any MCP-aware agent (a coding agent, a chat client, a custom loop) can then connect to that server and use what it offers, without knowing anything tool-specific in advance. Write the server once; every compliant client can use it. The protocol defines a small, typed vocabulary: the server advertises what it can do, the agent discovers those capabilities, and calls happen over a consistent channel.
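To put rough numbers on that collapse, here is a small arithmetic sketch. The counts (5 agents, 8 tools) are hypothetical, picked only to show how the two approaches scale.

```shell
# Hypothetical counts, chosen only to illustrate the scaling argument.
agents=5
tools=8

# Bespoke world: one hand-built integration per (agent, tool) pairing.
pairwise=$((agents * tools))

# Shared protocol: one client implementation per agent, one server per tool.
shared=$((agents + tools))

echo "bespoke integrations: $pairwise"      # prints: bespoke integrations: 40
echo "protocol implementations: $shared"    # prints: protocol implementations: 13
```

Adding a ninth tool costs five more integrations in the bespoke world, and one more server with a shared protocol.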

For documents, that is an attractive shape. A document source can present itself as a server, and any agent that speaks MCP can query it the same way it queries everything else.
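As a rough sketch of what "query it the same way" looks like on the wire: MCP messages are JSON-RPC 2.0, and the specification defines methods such as resources/list and resources/read. The helper name mcp_request and the document URI below are hypothetical, the initialize handshake is omitted, and nothing here describes a specific shipped server.

```shell
# mcp_request ID METHOD PARAMS_JSON
# Print a JSON-RPC 2.0 request of the kind an MCP client sends.
# Hypothetical helper for illustration; real clients also perform an
# initialize handshake first and handle the transport.
mcp_request() {
  printf '{"jsonrpc":"2.0","id":%s,"method":"%s","params":%s}\n' "$1" "$2" "$3"
}

# Discover which documents the server offers.
mcp_request 1 "resources/list" '{}'

# Read one of them by URI (the URI is an invented example).
mcp_request 2 "resources/read" '{"uri":"file:///docs/contract.md"}'
```

The same two request shapes apply whether the server fronts a database, a file store, or a set of documents, which is the point of the standard.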

How an OKF bundle fits the picture

An OKF bundle is already the right raw material. pdf2okf turns a PDF into a directory of plain Markdown files, one per concept, plus a lightweight metadata index. The result is structured, self-describing, and sits on a local filesystem. Wrapping that in an MCP server is a natural fit: the server would expose the bundle's concepts as something an agent can list and read through the protocol, rather than as raw files the agent has to know how to walk.
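To make the shape concrete, the sketch below builds a toy bundle by hand and lists its concepts. Only the concepts/ directory name comes from the commands in this article; the index file name (index.tsv), its format, and the concept text are invented for illustration and may not match what pdf2okf actually writes.

```shell
# Build a toy bundle in a temp directory. File contents are invented.
bundle="$(mktemp -d)/my-bundle"
mkdir -p "$bundle/concepts"

# One plain Markdown file per concept.
printf '# Indemnification\n\nEach party indemnifies the other against third-party claims.\n' \
  > "$bundle/concepts/indemnification.md"
printf '# Termination\n\nEither party may terminate with 30 days written notice.\n' \
  > "$bundle/concepts/termination.md"

# Hypothetical metadata index: concept name, then relative path.
printf 'indemnification\tconcepts/indemnification.md\ntermination\tconcepts/termination.md\n' \
  > "$bundle/index.tsv"

# "Listing the concepts" is just listing the directory and dropping the extension.
concepts="$(ls "$bundle/concepts" | sed 's/\.md$//')"
echo "$concepts"
```

An MCP server over this layout would mostly be a thin translation from protocol requests to these same directory listings and file reads.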

The honest status: a pdf2okf MCP server is on the roadmap, not shipped. It is a planned piece, not something you can install today. When it lands, the idea is straightforward: point any MCP-aware agent at the bundle through a server entry, and the agent reads the concepts natively. The file path still doubles as a citation, and the same grounded text sits underneath. The interface gets higher-level; the facts stay the same plain files.

You do not have to wait for it

Here is the part developers tend to miss: you do not need MCP to use an OKF bundle from an agent today. The bundle is just files, and reading files is something every shell-capable agent already supports.

The conversion runs once:

pdf2okf convert my-document.pdf --output ./my-bundle/

From then on, any agent that can run a shell command can query it:

grep -ri "indemnification" ./my-bundle/concepts/

cat ./my-bundle/concepts/indemnification.md

The agent gets back the exact text, with a file path it can cite. There is no retrieval layer to trust and no service to keep running: the "tool" is the filesystem, which every agent already knows. This works identically across Hermes Agent, Odysseus, OpenClaw, Claude Code, Cursor, Codex CLI, and any loop you write yourself. See "Read your documents from any agent" for the full pattern.
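A minimal sketch of that search-then-cite pattern, using only standard grep. The function names find_concepts and cite are hypothetical, and the toy bundle is built by hand with invented text so the example runs without pdf2okf installed.

```shell
# Toy bundle standing in for real pdf2okf output; contents are invented.
bundle="$(mktemp -d)/my-bundle"
mkdir -p "$bundle/concepts"
printf '# Indemnification\n\nThe supplier shall provide indemnification for third-party claims.\n' \
  > "$bundle/concepts/indemnification.md"
printf '# Termination\n\nEither party may terminate on 30 days notice.\n' \
  > "$bundle/concepts/termination.md"

# find_concepts TERM BUNDLE
# Print the paths of concept files that mention TERM (case-insensitive).
find_concepts() {
  grep -ril -- "$1" "$2/concepts" | sort
}

# cite TERM BUNDLE
# Print each matching line as path:line-number:text, usable as a citation.
cite() {
  grep -rin -- "$1" "$2/concepts"
}

find_concepts "indemnification" "$bundle"
cite "third-party" "$bundle"
```

The path:line prefix on each hit is what lets an agent quote the passage and point back to the exact file it came from.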

CLI today, MCP later: same bundle

It helps to see the two paths as layers over one artifact rather than as competitors:

  • The CLI / plain-file path (available now): maximum compatibility, zero new dependencies. If an agent can grep and cat, it can read your knowledge. This is the floor, and it is already enough.
  • The MCP path (roadmap): a typed, discoverable interface for agents that prefer to call a server rather than shell out. It is more ergonomic for MCP-native clients, but it reads the same OKF bundle and produces the same cited answers.

Crucially, neither path changes the data. The bundle is plain Markdown you own either way. MCP, when it arrives, is a nicer front door to a house you already live in, not a new house, and not a lock-in. For the broader integration story, see "Integrating OKF bundles into agentic tools".

Where pdf2okf fits

pdf2okf produces the artifact both paths depend on: a portable, OKF-compatible bundle of plain files on your own machine. Today you read it with the CLI and ordinary file tools from any agent. Tomorrow, a planned MCP server will let MCP-aware agents read the very same bundle natively. You are not betting on a protocol. You own the files, and the interfaces are free to improve around them. Join the waitlist at pdf2okf.com to follow the CLI release and the MCP server as it ships.
