Building a Chief of Staff Agent with Claude Code and Obsidian: The LLM Wiki Pattern for Client Work
Most of the writing about Andrej Karpathy's LLM wiki pattern frames it as a personal productivity play. Drop your notes into a folder, let Claude Code compile them into a living markdown wiki, query it like you would a smart colleague. It's a genuinely good idea, and if you're a researcher or a knowledge worker drowning in your own notes, you should probably try it this weekend.
By ProxyClaw Nashville · April 23, 2026 · LLM wiki for business
At ProxyClaw we've been running the same pattern for something different: client-facing AI agents that need to know a specific business inside and out. When we're building a chief of staff agent for a professional services operator managing dozens of clients across multiple service lines, the wiki pattern is the part of the stack that actually makes the agent useful instead of generic.
This post walks through how we use it, what we changed for client work, and why Claude Code plus Obsidian is the setup we keep landing on.
What the LLM wiki pattern actually is
Karpathy's pattern, laid out in a GitHub gist in April 2026, boils down to this: instead of using retrieval-augmented generation to search raw documents every time someone asks a question, you have the LLM pre-compile those sources into a structured set of markdown files. Each file is a wiki-style entity page with cross-links. The LLM reads the wiki, not the raw sources, when answering.
The three operations are ingest, query, and lint. Ingest takes a new source, reads it, writes summary and entity pages, and updates anything that connects. Query answers questions by reading the already-synthesized wiki. Lint is periodic maintenance: the LLM audits its own wiki for contradictions, orphan pages, and missing concepts.
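As a rough sketch, the three operations can be thought of as functions over a folder of markdown files. Everything below is illustrative -- the folder names, the entity-matching, and the orphan check are our own stand-ins for decisions the LLM actually makes:

```python
from pathlib import Path

# Hypothetical vault layout: raw/ holds untouched sources, wiki/ holds
# the compiled pages. In the real pattern the LLM does the reading and
# deciding; this glue code just shows the shape of each operation.

def ingest(vault: Path, source: Path) -> list[Path]:
    """Read one raw source and return the wiki pages it should update.
    Faked here by matching entity page names mentioned in the source."""
    text = source.read_text().lower()
    return [page for page in (vault / "wiki").rglob("*.md")
            if page.stem.replace("-", " ") in text]

def query(vault: Path, question: str) -> list[Path]:
    """Answer from the compiled wiki only -- never the raw sources."""
    q = question.lower()
    return [p for p in (vault / "wiki").rglob("*.md")
            if p.stem.replace("-", " ") in q]

def lint(vault: Path) -> list[Path]:
    """Flag orphan pages: wiki pages no other page links to."""
    pages = list((vault / "wiki").rglob("*.md"))
    return [p for p in pages
            if not any(f"[[{p.stem}]]" in q.read_text()
                       for q in pages if q != p)]
```

The point of the sketch is the data flow, not the logic: ingest writes into wiki/, query and lint only ever read it.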
The shift from standard RAG is real. In a retrieval system, the model rediscovers knowledge from scratch every query. In a wiki, knowledge compounds. Ingest a new source about a client's pricing structure and the LLM doesn't just file it away -- it updates the pricing page, the competitive positioning page, the FAQ page, and anywhere else that concept lives. You end up with a knowledge layer that gets sharper the more sources flow through it.
Why this works for a client engagement, not just personal notes
The critique of the wiki pattern for individuals is fair enough: you could argue a good RAG pipeline gives you most of the benefit with less babysitting. For personal use, that's probably right.
For client work, the math flips. Here's why.
When we're building an agent for a business, the value isn't in answering one question well. It's in the agent behaving consistently across hundreds of interactions with different people in the same company. A support agent that synthesizes customer context differently every time it gets asked is a liability. A chief of staff agent that gives the founder a different version of the Q3 revenue story on Tuesday than it gave on Friday is worse than no agent at all.
The wiki pattern forces a single source of truth. Once a concept is written into the wiki, every query that touches that concept reads the same page. Behavior is consistent because the knowledge layer is consistent. That matters more when you're serving a team than when you're serving yourself.
The second reason is auditability. A client can open the vault, see exactly what the agent believes about their business, and flag anything wrong. Compare that to a RAG system where the knowledge lives in embeddings nobody can read. For clients in regulated industries, that readable-by-humans layer is what makes the whole approach viable.
The Chief of Staff Agent build
To make this concrete, picture a professional services operator running multiple service lines: dozens of clients, multiple calendars to coordinate, multiple inboxes, a practice management system, and leadership to stay aligned with. That's a profile we see constantly in Nashville. The request in these engagements is always some version of the same thing: tell me what I should be focused on each morning, catch things that are falling through the cracks, draft the follow-ups I keep meaning to send, and know enough about my clients that I don't have to re-brief you every session.
The chief of staff agent we build for that kind of client sits on top of a wiki vault with roughly this structure:
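A sketch of that layout, with folder and file names drawn from the description that follows (the exact names vary by engagement):

```
vault/
├── CLAUDE.md          # schema and conventions for the agent
├── raw/               # untouched sources: transcripts, notes, exports
└── wiki/              # compiled knowledge layer -- the only thing queried
    ├── clients/
    ├── businesses/
    ├── people/
    ├── processes/
    └── concepts/
```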
The raw/ folder is the landing pad for any new source -- client intake notes, meeting transcripts, signed engagement letters, email exports. The wiki/ folder is the compiled knowledge layer, organized into clients, businesses, people, processes, and concepts. That's the only thing the agent reads when answering a query.
The CLAUDE.md file is the schema: it tells Claude Code what kind of wiki this is, what page types exist, what conventions to follow, and which sources to treat as authoritative when two sources disagree.
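A trimmed, hypothetical example of what that schema file might contain -- the headings and rules here are ours, not a standard format:

```markdown
# CLAUDE.md -- wiki schema (illustrative)

This vault is a client-knowledge wiki. Answer from wiki/ only;
never answer directly from raw/.

## Page types
- clients/<name>.md: one page per client. Sections: Overview,
  Commitments, Open Items, Log.
- people/<name>.md: one page per named contact, linked from clients.
- processes/<name>.md: how-we-do-it pages, linked to the people
  and clients each step touches.

## Conventions
- Cross-link entities with [[wikilinks]].
- Date every log entry (YYYY-MM-DD).

## Source precedence
Signed engagement letters > meeting transcripts > email threads.
When sources disagree, note the winning source on the page.
```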
Obsidian sits on top of the same folder as the human interface. The client can browse the vault, see the graph view of how everything connects, edit pages directly if the agent got something wrong, and drop new files into raw/ without touching the terminal.
When a new meeting transcript lands in raw/, the agent reads it, decides which of the client's own customers it touches, updates that customer's page with the new discussion points and commitments, appends an entry to the log, and flags any point where the meeting contradicts something already on file. A single ingest might touch 8 or 10 wiki pages. None of that logic lives in a vector database. It lives in plain markdown files the operator can read over coffee if they want to.
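A toy version of the page-update step, assuming each customer page is plain markdown with a Commitments section and a Log section (the section names are our invention, not a fixed convention):

```python
from datetime import date

def apply_ingest(page: str, commitments: list[str],
                 meeting_date: date, summary: str) -> str:
    """Prepend new commitments and a dated log entry to a customer page.
    Assumes the page already has '## Commitments' and '## Log' headings;
    newest items land directly under each heading."""
    out = []
    for line in page.splitlines():
        out.append(line)
        if line.strip() == "## Commitments":
            out.extend(f"- {c}" for c in commitments)
        if line.strip() == "## Log":
            out.append(f"- {meeting_date.isoformat()}: {summary}")
    return "\n".join(out)
```

In production the LLM writes this prose itself; the useful part of the sketch is that the update is a plain text edit anyone can inspect afterward.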
What changes when the client is the user
The Karpathy gist is written for someone maintaining their own wiki. For a client engagement, we modify the pattern in a few important ways.
The schema does more work. In a personal wiki you can be loose with page types and conventions because you remember them. For a client, the CLAUDE.md file becomes the contract that keeps the agent consistent across sessions and across different team members querying it. We invest a lot of time early in the engagement writing a schema that reflects the client's actual business vocabulary, not a generic template.
Human-in-the-loop is mandatory, not optional. Karpathy flags the risk of the LLM baking a small misunderstanding into the wiki where it then propagates. For personal use, you catch it later. For a client, you build in explicit review gates before a new ingest writes to the wiki. Our agents draft the proposed changes, show them in a diff, and wait for approval for anything structural.
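A minimal sketch of that review gate using Python's standard difflib; the approval mechanism itself (Slack button, CLI prompt) is out of scope here:

```python
import difflib

def propose_change(current: str, proposed: str, page_name: str) -> str:
    """Render the agent's proposed edit as a unified diff for human review.
    Nothing is written to the wiki until someone approves the diff."""
    diff = difflib.unified_diff(
        current.splitlines(keepends=True),
        proposed.splitlines(keepends=True),
        fromfile=f"wiki/{page_name}",
        tofile=f"wiki/{page_name} (proposed)",
    )
    return "".join(diff)
```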
The lint step runs on a schedule, not when the user remembers. Weekly audit jobs check for contradictions, stale pages, orphan concepts, and places where the wiki has drifted from the raw sources. The output goes to the client in a short Slack summary: here's what I updated, here's what looks stale, here are three things I'd like you to resolve.
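The staleness half of that audit is simple enough to sketch with file timestamps (the threshold and the folder layout are assumptions, not a fixed rule):

```python
import time
from pathlib import Path

def stale_pages(vault: Path, max_age_days: int = 30) -> list[Path]:
    """List wiki pages whose file hasn't been touched in max_age_days.
    A scheduled job can post these to Slack for the client to review."""
    cutoff = time.time() - max_age_days * 86400
    return sorted(p for p in (vault / "wiki").rglob("*.md")
                  if p.stat().st_mtime < cutoff)
```

The contradiction and drift checks are LLM work, not timestamp work, so they don't reduce to a one-liner like this.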
Raw stays raw. For compliance reasons, we don't delete or modify raw source files. The wiki is the working memory. The raw folder is the record. If the agent ever writes something into the wiki that turns out to be wrong, the client can always go back to the source that caused it and correct both.
The interface layer is separate from the wiki layer. The client doesn't talk to Claude Code directly. They talk to the agent through Slack or text, and the agent queries the wiki behind the scenes. This matters because the wiki layer is the stable thing. We can swap the interface, change which LLM is doing the reasoning, or move where the agent runs, and the knowledge base stays intact.
Who this is the right fit for
Not every client needs an LLM wiki. For a business with simple workflows and thin institutional knowledge, we'll build a more conventional agent with straightforward integrations and skip the wiki layer entirely.
The wiki pattern earns its keep when the client has one or more of these:
- A lot of domain-specific context that keeps showing up in conversations. CPAs, attorneys, consultants, fractional executives, agency operators. Every client touchpoint references entities, concepts, and history that a generic agent would have to ask about every time.
- Multi-location or multi-entity complexity. Restaurant groups with institutional knowledge varying by location. Service businesses with different procedures per office. Anything where "how we do it here" is actually "how we do it across several heres."
- Long-running client relationships where the agent needs to build context over months, not minutes. Chief of staff agents, customer success agents, account management agents. The continuity is the product.
- Regulated industries where the client needs to audit what the agent knows. CPAs, financial advisors, healthcare-adjacent businesses. A readable knowledge layer is a feature, not a nice-to-have.
The short version
Karpathy's LLM wiki pattern gives you a knowledge base that an agent can actually reason over -- one that gets smarter with every source you feed it, and one that a human can read and correct directly. Pair it with Claude Code as the engine and Obsidian as the interface, and you have a setup that holds up under the weight of real client data.
For ProxyClaw, this is the part of the stack that turns a generic AI agent into one that belongs to a specific business. The agents we deploy don't just respond to prompts. They carry the context of the business they work for, and they get better the longer they run.
If you're building an agent for your business and institutional knowledge keeps coming up as the blocker, this is probably the shape of the solution. Book a kickoff call and we're happy to talk through whether it's the right fit for what you're trying to do.
Frequently asked questions
What is Andrej Karpathy's LLM wiki pattern?
It's a method for building a personal or organizational knowledge base where an LLM actively reads, compiles, and maintains the content as structured markdown files. Karpathy described it in a GitHub gist in April 2026. Instead of using retrieval to search raw documents, the LLM pre-compiles sources into a wiki of cross-linked entity pages, and queries read the wiki rather than the raw sources.
How is this different from standard RAG?
RAG systems retrieve relevant chunks of raw documents at query time. The LLM wiki pattern pre-compiles those documents into a synthesized knowledge layer, so answers compound over time instead of being rediscovered from scratch. For small-to-medium knowledge bases, that tends to pay off in both consistency and auditability.
Why Claude Code and Obsidian specifically?
Claude Code can read and write files on a local filesystem, maintain context across long tasks, and follow the persistent instructions in a CLAUDE.md schema file. Obsidian is a free local-first markdown editor that treats a folder of markdown files as a vault, with backlinks, a graph view, and no proprietary format. Together they give you an AI engine plus a human interface that both operate on the same plain files.
Can a small business actually use this, or is it only for technical teams?
The setup needs someone technical to stand up, but once it's running, the day-to-day use is conversational. A client interacts with their agent through Slack, text, or email. The vault itself can be browsed in Obsidian like any other notes app if the client wants to inspect what the agent knows. ProxyClaw handles the setup, schema, and ongoing care so non-technical teams get the benefit without needing to operate the infrastructure.
What's the risk of the LLM getting something wrong and baking it into the wiki?
Real, and worth planning for. We mitigate it with human-in-the-loop review on ingests that change structural pages, scheduled lint jobs that flag contradictions, and a strict policy of never deleting raw source files. If the wiki ever drifts from the truth, the raw sources are always available to correct it.
Does this work for regulated industries like accounting, legal, or healthcare?
The readable markdown layer is one of the reasons the pattern is viable in regulated contexts. A client can see exactly what the agent believes about their business, audit it, and correct it. That's harder to do with a vector-database RAG system. For compliance-heavy engagements we typically run the whole stack on client-owned infrastructure so data stays in their environment.
ProxyClaw Nashville
Ready to deploy your first AI agent?
We handle the full OpenClaw setup on-site at your Nashville or Middle Tennessee office. Free 30-minute strategy call — no technical knowledge required.
Book a free strategy call