Two weeks ago I was enamored with OpenClaw, and after a week of using it I decided to write my own replacement. I’m calling it “Compendium” since it’s my personal knowledge management system and AI assistant. I’m writing this because I’ve explained the architecture to a few people now, and I think the pattern of agents, tool sandboxing, etc. is worth writing out.
So if you haven’t seen the previous post, it’s here. I still like the concept of OpenClaw, but I didn’t love how much unfettered access it needed to do most things, and it was token-hungry enough that it was either expensive with hosted models or bottlenecked on a local one.
The Principles
Four ideas have guided most of the decisions I’ve made on this project:
1. Markdown First
Everything is stored as markdown files with YAML frontmatter. I’ve built three or four versions of this kind of “second brain” system before, tried SQLite, Postgres, and pretty much every other database option, and have always settled back on files. I use YAML frontmatter liberally, do any actual writing in Obsidian, and this setup works great for me. I might eventually need a SQL-backed caching layer, but the source of truth stays plain markdown. Markdown is also, coincidentally, becoming the de facto language of LLMs, which means I can toss a pile of markdown files at one and it will do a pretty good job reasoning about them.
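To make that concrete, a note in a system like this might look something like the following. The field names and values here are illustrative, not my exact schema:

```markdown
---
title: Camera lens shipment
type: shipping-notification
created: 2026-01-14
tracking: 1Z999AA10123456784
source: email
tags: [shipping, camera]
---

Lens arrives Thursday. Signature required.
```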
2. Local AI When Possible
I want to run models locally whenever I can. I have a pretty high-end PC (most notably 128GB of RAM and an NVIDIA RTX 4080) that I built for this a while ago, so I can run decent models at acceptable speeds for most use cases.
I’m still waffling on whether everything should be local. For things like email analysis and personal notes, I make sure processing runs locally and as asynchronously as possible. But larger models run fairly slowly on my setup, which isn’t ideal for conversational use with more than a round or two of back-and-forth.
3. Self-Expanding
If you look back at my previous post, one of my favorite things about OpenClaw was its self-expansion. The time I asked it to email an ICS file and it updated its own email scripts to support attachments is still one of my favorite examples. But that’s also the scary side: if the system is checking for news and hits a prompt injection it doesn’t catch, bad things can happen. So the challenge is finding the right balance of self-expansion while keeping it controlled and auditable.
4. Accessible Everywhere
I need to reach this from anywhere, so I went with Telegram, because I’ve built a Telegram bot before and it seemed the most straightforward option. I considered Matrix to lean further into self-hosting, but my past experiences with it were… painful.
The Architecture
The Default Flow: Telegram → Local LLM → Tools
When I send a message to Compendium, it goes by default to the local LLM (currently Qwen 3 Coder Next 80B) running in a llama.cpp server inside Docker for isolation. I’m not sure this model is my favorite, but it’s relatively fast, smart, and good at tool calling, which is my main requirement. This default flow has access to roughly 60 of the ~140 MCP tools in the system.
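Since llama.cpp’s server exposes an OpenAI-compatible chat endpoint, the call itself is simple. A minimal sketch (the URL, model name, and payload details are placeholders, not my actual config):

```python
import requests

# Forward a Telegram message to the llama.cpp server, exposing only the
# tool list this context is allowed to use.
LLAMA_URL = "http://localhost:8080/v1/chat/completions"

def ask_local_llm(user_text: str, tools: list[dict]) -> dict:
    resp = requests.post(LLAMA_URL, json={
        "model": "qwen3-coder",  # whatever the server has loaded
        "messages": [{"role": "user", "content": user_text}],
        "tools": tools,          # the ~60 tools the default flow gets
        "tool_choice": "auto",
    }, timeout=120)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]
```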
The Escape Hatch: GitHub Copilot
Before the message even hits an LLM, the Telegram bot checks the beginning of the text. If it starts with a specific keyword, the rest of the message gets routed to the GitHub Copilot CLI (via the SDK) instead of the local model. It’s my escape hatch for self-expansion, but a bounded one.
An example from last week: I had added a connection to my calendar for viewing and adding events, but didn’t include editing. To add that, I just sent a message like “sudo add a feature to edit a calendar event.” Copilot has access to the full repo, and the project is set up with instructions explaining that each new capability should include a CLI command, an MCP tool, optionally a web UI, and follow the same markdown-first conventions as the rest of the system.
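The routing itself is trivial. A sketch, assuming “sudo” as the keyword, a hypothetical Copilot CLI invocation, and reusing the `ask_local_llm` helper from the earlier sketch plus a hypothetical `default_tools()`:

```python
import subprocess

ESCAPE_KEYWORD = "sudo"

def route_message(text: str) -> str:
    if text.lower().startswith(ESCAPE_KEYWORD + " "):
        task = text[len(ESCAPE_KEYWORD) + 1:]
        # Hand the task to Copilot with the full repo as its working
        # directory. The exact CLI flags here are placeholders.
        result = subprocess.run(
            ["copilot", "-p", task],
            capture_output=True, text=True, cwd="/path/to/repo",
        )
        return result.stdout
    # Everything else goes to the local model with the default tool set.
    return ask_local_llm(text, tools=default_tools())["content"]
```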
After Copilot makes the changes, an auto-commit hook pushes them, the bot restarts, and the new tools are available. This works pretty reliably, since I use Opus 4.6 for this flow and these tools are generally simple scripts with boilerplate wrappers.
Agents Are Just Different Tool Sets
People use the words “agent” and “subagent” a lot. In this system, a “subagent” is just the same model with a different tool set: same llama.cpp container, same model, but a different MCP tool configuration. When kicking off a new task, it only gets the tools necessary for the job.
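As a sketch of what that looks like in practice, with illustrative tool names and a hypothetical `full_tool_schemas()` helper standing in for the real registry:

```python
# One model endpoint, different MCP tool allowlists per context.
AGENT_TOOLS = {
    "chat": ["notes_search", "notes_write", "calendar_view", "calendar_add"],
    "news": ["summary_write"],
    "email": ["email_read", "email_move", "note_write"],
}

def tools_for(agent: str) -> list[dict]:
    allowed = set(AGENT_TOOLS[agent])
    # full_tool_schemas() would return all ~140 MCP tool definitions.
    return [t for t in full_tool_schemas()
            if t["function"]["name"] in allowed]
```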
The News Agent
One of my use cases for Compendium is news updates, which means handling thoroughly untrusted content. I have it set up to check RSS feeds and build a summary on a schedule (7 AM, noon, 5 PM, and 9 PM). The pipeline is pretty simple:
- Fetches URLs from the sources.
- Renders each page using Playwright.
- Extracts readable content via Readability.
- Summarizes each article individually via the LLM (with only a single tool to write a file).
- Aggregates all the summaries in a final LLM pass that groups and categorizes everything (using two tools: one to read all the files, one to write the overall summary).
Then it sends the overall summary with relevant links to Telegram. A web server also runs as part of this system, so I can view the full details there if I want to dig deeper into anything.
My local model isn’t multimodal, so image-heavy content (like most Reddit posts) is skipped and I just get the title.
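Sketched in code, the per-article half of that pipeline might look like the following, assuming feedparser, Playwright’s sync API, and readability-lxml, and reusing helpers from earlier sketches. My actual stack may differ in the details:

```python
import feedparser
from pathlib import Path
from playwright.sync_api import sync_playwright
from readability import Document

def summarize_feed(feed_url: str, out_dir: Path) -> None:
    entries = feedparser.parse(feed_url).entries
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        for i, entry in enumerate(entries):
            page.goto(entry.link, wait_until="networkidle")
            readable = Document(page.content())        # Readability extraction
            text = readable.summary(html_partial=True) # readable HTML fragment
            msg = ask_local_llm(
                f"Summarize this article:\n\n{text}",
                tools=tools_for("news"),  # single write-file tool
            )
            # The real flow writes via the model's file tool; saving the raw
            # response here is a simplification.
            (out_dir / f"article-{i}.md").write_text(msg.get("content") or "")
        browser.close()
```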
The Email Agent
This is the one I really wanted from OpenClaw but was never comfortable giving it access to. The email agent runs with its own restricted tool set:
- It can read emails.
- It can move emails between folders (with an audit trail I can undo via script).
- It can write markdown files to a specific folder.
- It cannot send emails.
- It cannot access the internet.
So the agent that browses the web for news can’t see my email, and the agent that reads my email can’t post anything to the internet. Even if something goes wrong in one context, the blast radius is contained.
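The audit trail on the move tool is the part worth spelling out: every move is logged with enough information to reverse it. A minimal sketch of the idea, with hypothetical names and a stubbed IMAP call:

```python
import json
import time
from pathlib import Path

AUDIT_LOG = Path("email-moves.jsonl")  # hypothetical location

def imap_move(msg_id: str, src: str, dst: str) -> None:
    ...  # stand-in for the real IMAP MOVE call

def move_email(msg_id: str, src: str, dst: str) -> None:
    """Move a message between folders, recording enough to undo it."""
    imap_move(msg_id, src, dst)
    with AUDIT_LOG.open("a") as f:
        f.write(json.dumps({"ts": time.time(), "id": msg_id,
                            "from": src, "to": dst}) + "\n")

def undo_last_move() -> None:
    """The undo script just replays the last log entry in reverse."""
    last = json.loads(AUDIT_LOG.read_text().splitlines()[-1])
    imap_move(last["id"], last["to"], last["from"])
```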
Now, every day, I get a Telegram message per email showing the sender, subject, an LLM-generated classification, and a summary, with four buttons:
- Save to Note — Parses out structured data (tracking numbers, delivery dates, etc.) and creates a markdown note with appropriate frontmatter.
- Archive — Nothing actionable, but I read it.
- Unsubscribe — Parses out the unsubscribe link and gives it to me.
- Wait — Leave it in the inbox for when I’m at a computer.
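Wiring up buttons like these is straightforward. A minimal sketch assuming python-telegram-bot, with hypothetical callback-data formats and field names:

```python
from telegram import InlineKeyboardButton, InlineKeyboardMarkup

def triage_markup(msg_id: str) -> InlineKeyboardMarkup:
    # callback_data is what the button-press handler receives back.
    return InlineKeyboardMarkup([[
        InlineKeyboardButton("Save to Note", callback_data=f"note:{msg_id}"),
        InlineKeyboardButton("Archive", callback_data=f"archive:{msg_id}"),
    ], [
        InlineKeyboardButton("Unsubscribe", callback_data=f"unsub:{msg_id}"),
        InlineKeyboardButton("Wait", callback_data=f"wait:{msg_id}"),
    ]])

async def send_triage(bot, chat_id: int, email: dict) -> None:
    text = (f"From: {email['sender']}\nSubject: {email['subject']}\n"
            f"Class: {email['classification']}\n\n{email['summary']}")
    await bot.send_message(chat_id, text,
                           reply_markup=triage_markup(email["id"]))
```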
I have no idea if I’ll stick with this, but it’s helping me keep inbox zero and unsubscribe from the far-too-many mailing lists I’ve accumulated.
The Security Model
The whole approach boils down to selective tool exposure:
| Context | Can Read Data | Can Write Data | Can Access Internet | Can Send Messages |
|---|---|---|---|---|
| Telegram Chat | Via tools | Via tools | No | To me only |
| News Agent | Feed URLs | News Folder | Yes | No |
| Email Agent | Emails | Email Folder | No | No |
| Copilot Escape Hatch | Full repo | Full repo | Yes (with permission) | To me only |
No agent (except Copilot) has unrestricted access. The normal LLM never touches files directly — everything goes through tool interfaces that enforce boundaries. The only way to bypass this would be compromising my Telegram account, and at that point I’d have bigger problems.
Is this “real” sandboxing? I don’t think so? But it’s a practical isolation model that gives me enough confidence to let an LLM handle my email.
What’s Next
The web UI needs work — right now it’s mostly default styling that Claude models generate. Lots of purple gradients that aren’t particularly useful.
I also really need to focus more (the classic problem) instead of just throwing everything at it, so I’m now honing its usefulness by pulling some things out and adding others. Right now I have:
- Books
- Bookmarks / read-it-later
- CalDAV client
- CardDAV client
- Kraken games
- Last.fm scrobbles
- News
- Notes
- Oura
- Reitti
- Immich
Whenever I add something new, I define a set of tools, give them to the model on a schedule or trigger, and constrain what it can do. Most of what I actually use it for are subcategories of notes (journal, reminders, mood check-ins, shipping notifications), and I probably need to tweak that a bit. Some of these integrations are pretty useless in practice, and I’ll probably drop them at some point.
But I’m liking this a lot. It’s super easy to expand and has some good practical uses, though I’m not sure I’ll stick with it for the long haul.