I don’t usually do New Year’s resolutions, but this year I made an exception. It’s not a goal but a boundary: I want to stop spending thinking capacity on problems AI can handle well enough.

Things like passport renewals, car maintenance, and health insurance paperwork aren’t intellectually hard problems for me. They’re high-friction re-entries into suspended projects that steal the same executive function I need for creative work. So I’m experimenting with an approach where Claude Code does the thinking and I become the hands and feet that execute.

You might be wondering why I’d give up that much control. But I’m not giving up control over outcomes, just over the tedious context reconstruction that precedes every action.

The Problem Isn’t Time, It’s Context

The standard productivity narrative says “do things faster.” But I’ve realized that for life admin, speed isn’t the bottleneck. The bottleneck is cognitive re-entry.

Every time I return to a bureaucratic project (health insurance setup, car inspection scheduling, government ID renewal), I pay a tax: not a time tax, a thinking tax. I have to reload the entire context: What did I do last time? Where’s that document? What’s the phone number? What did the person say?

This context reconstruction drains the exact same mental resource I need for interesting problems like creative work and engineering challenges, the stuff that actually benefits from my personal attention.

The insight is simple: most life admin problems are not hard. They’re just high-friction re-entries. The thinking required isn’t creative; it’s reconstructive. And in my experience, Claude Code handles reconstructive thinking reasonably well (though I still verify critical details).

Flipping the Collaboration

For knowledge work, everyone talks about AI as a copilot where the human captain does the thinking and the AI assistant helps with execution.

For life admin, I flipped this around so that Claude does the thinking while I handle the execution.

In my early experiments, this architecture shows promise:

  • Claude does the thinking (loads context, synthesizes history, generates call scripts, decides what to say)
  • Human does the grunt work (physically holds the phone, drives to the government office, signs the document)

It might sound like I’m demoting myself to grunt work, but for me the hard part of life admin was never the physical execution. It was always the mental overhead of figuring out where I left off and what to do next. Once I know what to say and have my documents ready, making the call is easy (though I still need to think on my feet when conversations go sideways).

Ghost in the Shell

Now let’s talk about how this actually works. For Claude to be useful, it needs context. It can’t just be a chatbot I explain things to each time. It needs to temporarily inhabit the project’s accumulated knowledge.

I call this approach “Ghost in the Shell” (borrowing from the anime, where consciousness can inhabit different bodies). In my version, the “shell” is the project’s accumulated data and the “ghost” is Claude’s reasoning capability that temporarily inhabits it.

Each project has a brain: notes, meeting transcripts, task history, linked documents. When I trigger a task, Claude loads this brain and reasons from that context.

A note on the technical setup: I use Claude with MCP integrations that can read from my document database and task manager. I use Mac-specific tools (DEVONthink for documents, OmniFocus for tasks, Hookmark for cross-app links), but the principle should be tool-agnostic: any setup that can aggregate project context and feed it to an LLM would work.

When I select a task in OmniFocus, Claude follows the project hierarchy upward to find the parent project, then loads all linked documents and notes. It builds a timeline of what happened. Then it can act with full context, not just respond to isolated prompts. Sure, Claude doesn’t truly “understand” the project, but it synthesizes the available information well enough for bureaucratic tasks.
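
To make that concrete, here’s a minimal sketch in Python of what “load the brain” amounts to. The data structures and the load_project_brain function are made up for illustration; the real context lives in OmniFocus and DEVONthink and is pulled in through MCP tools, not in-memory objects.

    from dataclasses import dataclass, field
    from datetime import date

    # Hypothetical in-memory stand-ins; the real data lives in OmniFocus and
    # DEVONthink and is reached through MCP tools, not through these classes.

    @dataclass
    class Note:
        created: date
        text: str

    @dataclass
    class Project:
        name: str
        notes: list[Note] = field(default_factory=list)
        parent: "Project | None" = None

    @dataclass
    class Task:
        title: str
        project: Project

    def load_project_brain(task: Task) -> str:
        """Walk the project hierarchy upward and build a chronological context dump."""
        notes: list[Note] = []
        project = task.project
        while project is not None:
            notes.extend(project.notes)
            project = project.parent
        notes.sort(key=lambda n: n.created)  # oldest first: a timeline
        timeline = "\n".join(f"{n.created.isoformat()}: {n.text}" for n in notes)
        return f"Task: {task.title}\n\nProject history:\n{timeline}"

    # Example: the context Claude reasons from before prepping a call.
    car = Project("Car inspection", notes=[
        Note(date(2025, 1, 10), "Called the mechanic; waiting on part availability."),
    ])
    print(load_project_brain(Task("Schedule inspection", car)))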

The One Question Rule

You might ask whether managing AI becomes its own overhead, and yes, it can. I’ve found that AI assistants can become another attention sink through clarifications, follow-ups, and re-prompts, and before you know it, managing the AI becomes a new kind of bureaucracy.

So I built in what I call the “one question rule.” If Claude is missing information, it can ask one question maximum. Then it has to provide a best-effort plan with assumptions stated. If it’s still stuck, it downgrades by first moving from doing the task to proposing a plan, and then to explaining how this type of problem generally works so I can handle it myself.

This prevents Claude from becoming needy. It forces decisive action with explicit assumptions rather than endless clarification loops.
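
Sketched as code, the rule is a tiny downgrade ladder. This is illustrative Python, not something Claude actually runs; in my setup it’s plain-language instructions in the prompt.

    from enum import Enum

    class Mode(Enum):
        DO = "do the task"
        PROPOSE = "propose a plan, I execute"
        TEACH = "explain how this kind of problem works"

    MAX_QUESTIONS = 1  # hard cap: one clarifying question, then commit

    def next_mode(mode: Mode, questions_asked: int, still_stuck: bool) -> Mode:
        """If Claude is still stuck after its one question, it drops a level
        instead of asking again."""
        if not still_stuck:
            return mode
        if questions_asked < MAX_QUESTIONS:
            return mode               # spend the single question first
        if mode is Mode.DO:
            return Mode.PROPOSE       # can't do it: propose a plan instead
        return Mode.TEACH             # can't even plan: explain the domain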

I also added risk-based escalation. “Autopilot” here means Claude proceeds without asking for confirmation:

  • High risk (finance, medical, identity): never autopilot
  • Medium risk: autopilot only for prep work (drafts, research)
  • Low risk: full autopilot allowed

This doesn’t prevent all errors. I still need to review what Claude produces, and I’ve caught it confidently stating wrong details. But the guardrails prevent the worst-case scenarios while keeping friction low for routine tasks.
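
Expressed as data, the tiers look roughly like this. The domain examples are hypothetical, and as with the one question rule, the real version is plain-language instructions rather than code.

    # Illustrative only: in practice these rules live as plain language in
    # Claude's instructions, not as code.
    RISK_POLICY = {
        "high":   {"domains": ["finance", "medical", "identity"], "autopilot": "never"},
        "medium": {"domains": ["appointments", "vendors"],        "autopilot": "prep-only"},
        "low":    {"domains": ["research", "filing", "reading"],  "autopilot": "full"},
    }

    def may_autopilot(risk: str, action: str) -> bool:
        """Can Claude proceed without asking for confirmation?"""
        policy = RISK_POLICY[risk]["autopilot"]
        if policy == "full":
            return True
        if policy == "prep-only":
            return action in {"draft", "research"}  # prep work only
        return False                                # high risk: never autopilot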

Closing the Loop

Claude doesn’t just prep and disappear; it waits for me to come back with results.

Here’s a typical flow:

Claude: "Here's your call script for the mechanic. Key points to cover:
     timing, parts availability, price estimate. Come back when done."

[I make the call]

Me: "Done. He said Tuesday at 10am works, and the part needs to be ordered."

Claude: Creates follow-up task that surfaces on the appointment date,
    logs the outcome to project notes (so next time I have this
    context), links everything together.

This conversational loop keeps Claude as the orchestrator throughout. Every interaction gets logged back to the project’s brain, so next time the context is even richer.
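
Here’s a rough sketch of what that capture step amounts to, with made-up function and field names; in reality Claude writes the outcome and follow-up task to DEVONthink and OmniFocus via MCP.

    from datetime import date

    def close_loop(project_log: list[str], outcome: str,
                   follow_up_on: date | None = None) -> dict:
        """Capture step after I report back: append the outcome to the
        project's notes and, if a date came out of the call, create a
        follow-up task deferred to that date."""
        entry = f"{date.today().isoformat()}: {outcome}"
        project_log.append(entry)
        follow_up = None
        if follow_up_on is not None:
            follow_up = {"title": f"Follow up: {outcome[:40]}",
                         "defer_until": follow_up_on.isoformat()}
        return {"log_entry": entry, "follow_up_task": follow_up}

    # Example capture after the mechanic call above.
    log: list[str] = []
    print(close_loop(log, "Tuesday 10am confirmed; part on order.", date(2025, 1, 21)))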

What I Actually Delegate

The action types fall into three modes based on how much autonomy Claude gets:

Autopilot (Claude does the work, I review results): Research tasks where Claude queries Perplexity or analyzes code. Processing tasks where it extracts key points from articles or documents. Documentation tasks where it saves findings to DEVONthink and creates links between related items.

Propose (Claude preps, I decide and act): Phone calls where Claude loads my last conversation with someone, generates a script, and tells me what questions to ask. Emails where it drafts messages based on project context. Setup tasks where it researches best practices and generates step-by-step guides. Decisions where it builds comparison matrices and recommends options.

Teach (Claude explains, I learn): Learning tasks where Claude curates resources, creates a study plan, and suggests hands-on exercises, but I do the actual learning.

The key insight is that even when Claude can’t do the task, it can reduce my cognitive load significantly. For a phone call, I usually don’t have to remember anything because Claude already looked up my last interaction with that person and what we discussed. I just read the script (though I still need to adapt when the conversation goes sideways).

Action Types

I built a library of action types as markdown files that Claude reads when working on a task. Each type (research, review, process, draft, analyze, document, setup, learn, coordinate, execute, spec) defines a prep pattern for what Claude should do before I act, a capture pattern for logging results afterward, and which tools to use. For example, the “draft” type handles phone calls and emails by loading contact details and past conversations, then generating a script or message draft.
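
For illustration, here’s roughly what the “draft” type specifies, expressed as Python data rather than the markdown file it actually is; the field names are mine, not a fixed schema.

    # The real action types are markdown files Claude reads; this is just the
    # shape of one of them ("draft"), expressed as data for illustration.
    DRAFT_ACTION_TYPE = {
        "name": "draft",
        "applies_to": ["phone calls", "emails"],
        "prep": [
            "Load contact details and past conversations with this person",
            "Generate a call script or message draft with the key points to cover",
            "State assumptions explicitly where the history is ambiguous",
        ],
        "capture": [
            "Log the outcome to the project's notes",
            "Create dated follow-up tasks where relevant",
            "Link the new notes back to the task and project",
        ],
        "tools": ["DEVONthink (documents)", "OmniFocus (tasks)", "Hookmark (links)"],
        "default_mode": "propose",  # Claude preps, I decide and act
    }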

The process of building this library is ongoing. I pay attention to what types of actions I do repeatedly, then figure out how Claude could help with each one. Some actions start in Propose mode and graduate to Autopilot once I trust the pattern. The goal is to keep expanding what can be delegated as I discover new friction points.

Claude generates plans fresh each time using these patterns plus the project’s accumulated context. In theory, this gets smarter with each use because the context compounds. I haven’t measured this rigorously, but I’ve noticed Claude’s prep work improving as project histories grow richer.

The Real Risk of AI Assistants

What I’ve found is that the real risk of AI assistants isn’t that they make you lazy; it’s that they turn you into the manager of an intern that asks infinite questions.

I prevent that by design through a kind of contract with Claude that includes clear modes of operation, risk-based guardrails, explicit definitions of done, and a hard cap on questions, so it behaves like a capable assistant rather than a needy one.

Is this worth the setup time? I spent a day building the infrastructure. Whether it pays off depends on how many bureaucratic projects I run through it. Ask me in six months.

For now, I’m experimenting with reserving my thinking capacity for interesting problems. Everything else gets delegated to the ghost in the shell.