Wiki
Everything behind sovereign document AI.
Deep, sourced explainers on the ideas pdf2okf is built on: the Open Knowledge Format, local open models, data sovereignty, and retrieval without a vector database.
Open Knowledge Format & OKFZ
The portable file format your agent reads.
RAG without a vector database
Grep-based, agentic retrieval over plain files.
- RAG without a vector database: grep-based retrieval
- The LLM-wiki pattern: a markdown knowledge base for agents
- Agentic retrieval vs. traditional RAG
- Context engineering: the discipline that contains RAG
- The hidden cost of RAG: re-embedding, hosting, and token bills
- Long context vs. retrieval: should you just paste the whole PDF?
Sovereignty, GDPR & the EU
Why local inference is the clean answer.
Industry & compliance
Sovereign document AI for regulated fields: law, health, finance.
Local & open models
Run it on your own hardware in 2026.
- Running AI locally in 2026: a guide for document Q&A
- Open weights vs. open source vs. fully open
- Which open model for your documents? Gemma 4, Qwen, Mistral, OLMo, EuroLLM
- Local document AI on a Mac: Apple Silicon, MLX & oMLX
- Ollama vs. llama.cpp vs. LM Studio vs. vLLM
- What hardware do you need for local document AI?
Exactness & determinism
Cited, auditable, hallucination-free answers.
Agents, CLI & integration
Read your documents from any agent.
Comparisons & alternatives
Self-hosted alternatives to the cloud tools.