pdf2okf

Coming soon · self-hosted · GDPR

AI that reads your PDFs, without the cloud.

pdf2okf turns any PDF into an exact, cited knowledge bundle your AI agent can read. Run it self-hosted on your own hardware, or with your own key. Your data never leaves. GDPR-compliant by design.

Early access. No spam: one email when it's live.

an agent reading a bundle, on your hardware
Exact, cited, and it ran on your own hardware. Nothing left the building.

the problem

Your best documents are too sensitive for the cloud.

Contracts, patient records, engineering IP, financials: you can't upload them to OpenAI, Google or AWS. So your AI never gets to read the documents that matter most.

where it runs

Your documents never leave the building.

Self-hosted

Runs on your own hardware with a local model, so nothing leaves. Sovereign, GDPR-compliant by design.

Bring your own key

Or point it at your own endpoint: Azure EU, Mistral, your vLLM. You decide where processing happens.

Agent-native reading: your agent reads the bundle via CLI or Open WebUI, with your model, never ours.

the shape of it

PDF in. Knowledge an agent trusts, out.

$ build

Turn a PDF into an OKF bundle

Text, tables and diagrams become small, linked concept files: the structure made explicit.

bundle/

A bundle you own

Plain markdown + frontmatter. Versionable, portable, no vector database to babysit.
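
To make "plain markdown + frontmatter" concrete, here is a minimal sketch of reading one concept file. The field names (id, source, page), the [[link]] syntax and the parse_concept helper are assumptions for illustration only; the real OKF layout is not published here and may differ.

```python
from dataclasses import dataclass


@dataclass
class Concept:
    meta: dict[str, str]  # frontmatter fields, kept as plain strings
    body: str             # the markdown text after the frontmatter


def parse_concept(text: str) -> Concept:
    """Split a '---' delimited frontmatter header from the markdown body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        # No frontmatter at all: treat the whole file as body.
        return Concept(meta={}, body=text)
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[i + 1:]).strip()
            return Concept(meta=meta, body=body)
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    raise ValueError("frontmatter is not closed with '---'")


# Hypothetical concept file; every field name here is an assumption.
SAMPLE = """---
id: release-schedule
source: handbook.pdf
page: 12
---
# Release schedule

Releases ship every two weeks. See [[release-checklist]].
"""

concept = parse_concept(SAMPLE)
print(concept.meta["source"], concept.meta["page"])
```

Because the file is plain text, it diffs cleanly in git and needs nothing beyond a text editor to inspect.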

$ read

Your agent reads it exactly

It greps the concepts it needs and answers with the source: every figure and number accounted for.
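
A rough sketch of what "greps the concepts and answers with the source" could look like. The bundle is modelled as an in-memory dict of file name to text, and grep_bundle plus the file names are hypothetical, not pdf2okf's actual interface.

```python
import re


def grep_bundle(files: dict[str, str], pattern: str) -> list[tuple[str, int, str]]:
    """Return (file, line number, line) for every line matching the pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    hits = []
    for name in sorted(files):
        for line_no, line in enumerate(files[name].splitlines(), start=1):
            if rx.search(line):
                hits.append((name, line_no, line.strip()))
    return hits


# Hypothetical two-file bundle, standing in for files on disk.
BUNDLE = {
    "concepts/pricing.md": "# Pricing\n\nThe base plan costs 12 EUR per seat.\n",
    "concepts/support.md": "# Support\n\nSupport hours are 9 to 17 CET.\n",
}

hits = grep_bundle(BUNDLE, r"per seat")
for name, line_no, line in hits:
    # Each hit carries its own citation: file and line.
    print(f"{line} [{name}:{line_no}]")
```

Only the matching lines, with their locations, would need to reach the model; the rest of the bundle stays on disk.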

the economics

You don't re-read the whole PDF every time.

Cloud AI shoves your entire document into a giant context on every question, and bills you for every token, every time. pdf2okf greps only the few concepts an answer needs: a fraction of the tokens, a fraction of the cost, no million-token window to shove around.
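
The arithmetic behind that claim, as a sketch. The token counts below are illustrative placeholders, not measurements of pdf2okf or of any provider; plug in your own document size and question volume.

```python
def tokens_per_session(tokens_per_question: int, questions: int) -> int:
    """Total input tokens when every question sends the same amount of context."""
    return tokens_per_question * questions


# Illustrative assumptions only: a 200,000-token PDF, 50 questions,
# and roughly 3,000 tokens of grepped concepts per question.
QUESTIONS = 50
full_document = tokens_per_session(200_000, QUESTIONS)
grepped = tokens_per_session(3_000, QUESTIONS)

print(f"whole PDF every time: {full_document:,} tokens")
print(f"grepped concepts:     {grepped:,} tokens")
print(f"ratio: {full_document / grepped:.1f}x")
```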

why it's different

The model finds the structure. Code does the counting.

Whenever a number has to be exact, code counts it and the model only reports it, so "how many releases?" returns 40, not "around 40." That determinism is what makes a smaller, local model trustworthy, and today's open models (Gemma, Qwen, Mistral) are good enough to run the whole thing on your own box.
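
A minimal sketch of "code does the counting", assuming releases appear as "## Release x.y.z" headings in a concept file. That heading convention and the count_releases helper are hypothetical; the point is only that the number comes from code and the model just reports it.

```python
import re


def count_releases(markdown: str) -> int:
    """Count level-2 headings of the form '## Release <version>'."""
    return len(re.findall(r"^## Release \S+", markdown, flags=re.MULTILINE))


# Hypothetical concept file content.
CHANGELOG = """# Changelog

## Release 1.2.0
- fixes

## Release 1.1.0
- features

## Release 1.0.0
- initial
"""

count = count_releases(CHANGELOG)
# The model receives the exact count and only has to phrase the answer.
answer = f"The document lists {count} releases."
print(answer)
```

The same input always yields the same count, whatever model sits on top.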

pdf2okf.com

Be there when it opens.

pdf2okf is in private build: self-hosted and sovereign. Leave an email and you'll be first in.