My journey trying out AI assistants keeps getting more interesting.

I started with OpenClaw, then decided to check out Hermes so I set that one up too. And what is the difference, you might ask? Both are self-hosted AI assistants you run on your own server, both connect to Telegram and other messaging platforms, and both give you a persistent agent that remembers who you are and what you are working on. But they have very different personalities. The most accurate comparison I saw was from a user on Twitter who said something like “Hermes is your Honda Accord, reliable and solid, while OpenClaw is your Ferrari, fast but you are always driving with a wrench in your hand because something is about to break.” I found that both funny and accurate for my own experience.

Image: Twitter (X)

Over the last few weeks the AI assistant I reached for most started tipping from OpenClaw to Hermes. They both have their flaws and their shining qualities but something about Hermes made me like it more. Maybe the way it remembers things across sessions. Maybe the way it quietly improves itself. Hard to pin down exactly.

But here is the thing. It was still just one assistant and I had 99 tasks and counting. Slight exaggeration. The more I used Hermes the more I wanted to do more with it. Blog content, coding projects, a Shopify store, personal life stuff. One agent can handle all of that. But there is another approach. Why do it all with one agent when you can give your agent a team of agents?

I saw people building this kind of setup so, as per usual, I decided to try it myself.


Why One Agent Starts to Feel Like Not Enough

Image: Hermes AI agent dashboard

When you first set up a self-hosted AI assistant it feels like a superpower. You ask it things, it remembers your context, it gets better over time. For a while that is more than enough.

Then you start piling things on. Research tasks, writing tasks, technical questions, personal reminders, project updates. The agent handles all of it but somewhere along the way the outputs start feeling averaged out. Not bad. Just not sharp. A writing task gets a response that feels slightly too technical. A technical question gets an answer that is a little too conversational. The agent knows too much about too many things and not enough about any one of them.

This is the generalist problem. One agent doing everything is like one employee handling sales, engineering, content, and customer support simultaneously. It works until it does not.

If you already have Hermes running and you are starting to feel this, the profile system is the next thing worth exploring.


What Hermes Profiles Actually Are

Hermes has a built-in system called profiles. Each profile is a completely isolated agent with its own configuration, memory, skills, and SOUL.md (the file that defines the agent’s personality and purpose), plus its own Telegram bot if you want one.

You install Hermes once. Then you spin up as many profiles as you need, each one focused on a specific area. They live on the same machine but they do not share context. What the content agent knows stays with the content agent. What the projects agent knows stays with the projects agent.

The command to create a new profile when working from the terminal is simply:

hermes profile create <name>

That creates a new isolated directory under your Hermes installation with its own config, memory files, and skills folder. From there you run setup on that profile to configure the model and API keys, write a focused SOUL.md that tells the agent exactly who it is and what it handles, and optionally give it its own Telegram bot.
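
If you want to see at a glance which profiles are fully configured, a small script can walk the profiles folder for you. This is a minimal sketch, not a Hermes feature. It assumes profiles live under ~/.hermes/profiles/<name>/ and that each one should end up with a SOUL.md, a .env, and a config.yaml, which are the files this post touches. The function name audit_profiles is my own, and your layout may differ depending on your Hermes version.

```python
from pathlib import Path

# Files this post mentions for each profile. Treat the list as an assumption
# about the layout, not as an official Hermes specification.
EXPECTED_FILES = ("SOUL.md", ".env", "config.yaml")


def audit_profiles(hermes_home: Path) -> dict[str, list[str]]:
    """Map each profile name to the expected files it is still missing."""
    profiles_dir = hermes_home / "profiles"
    if not profiles_dir.is_dir():
        return {}
    report = {}
    for profile in sorted(p for p in profiles_dir.iterdir() if p.is_dir()):
        report[profile.name] = [
            name for name in EXPECTED_FILES if not (profile / name).is_file()
        ]
    return report


if __name__ == "__main__":
    for name, missing in audit_profiles(Path.home() / ".hermes").items():
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{name}: {status}")
```

An empty list next to a profile name means all three files are in place. The default profile sits directly in ~/.hermes/ rather than under profiles/, so it is not covered by this check.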


How I Set Up the Team

I already had one agent running called Lollie. She had been my general assistant for a while and she was good at her job. Rather than replace her I promoted her.

Lollie is now the Commander. She holds the full picture of everything I am working on, routes requests to the right specialist, and reviews outputs before they come back to me. Think of her as chief of staff. She does not do the deep work herself. She knows who to send it to.

To set up the five specialist profiles I used Claude Code, which was already running on the same server. I gave it a detailed prompt describing each agent’s role, personality, and domain, and it created all five profiles, wrote a focused SOUL.md for each one, and flagged Lollie’s existing SOUL.md for an update rather than an overwrite. That last part matters. If you already have an agent with accumulated personality and memory you do not want to wipe that. You layer the new context on top.

Here’s a generic version of the prompt I gave to Claude Code:

I'm setting up a multi-agent architecture using Hermes Agent on this VM.
I need you to help me create and configure multiple Hermes profiles.
First, run these commands to understand the current state:
- hermes profile list
- ls ~/.hermes/
- cat ~/.hermes/SOUL.md (if it exists)
- hermes config show
- whoami && echo $HOME
Then help me set up the following profile structure:
1. Default profile (existing agent) — Commander/Orchestrator
Read the existing SOUL.md at ~/.hermes/SOUL.md first.
Update it to add the Commander/routing role and awareness of all
specialist profiles without replacing the existing personality,
name, or core identity. Preserve what is already there and layer
the new context on top.
Add a routing table mapping each domain to the right specialist
profile with a one-line trigger description.
Add a brief description of each specialist's area so the Commander
knows who to route to and when.
2. Profile: [name] — [Role Title]
Handles: [describe the area this agent owns]
Mindset: [describe how this agent thinks and approaches its work]
Hard rules: [any non-negotiables for this agent]
Write a focused SOUL.md for this role.
3. Profile: [name] — [Role Title]
Handles: [describe the area this agent owns]
Mindset: [describe how this agent thinks and approaches its work]
Hard rules: [any non-negotiables for this agent]
Write a focused SOUL.md for this role.
(repeat for each additional profile)
For each new profile:
- Create it using: hermes profile create <name>
- Write an appropriate SOUL.md to ~/.hermes/profiles/<name>/SOUL.md
- Do NOT configure API keys or Telegram tokens, I will do those manually
- Do NOT start any gateways
After creating all profiles, list them and show me the full content of
every SOUL.md file one by one without summarising so I can review
before proceeding.
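
The repeated "Profile" sections in that prompt are easy to get inconsistent when you write them by hand. Here is a minimal sketch that renders them from a list, so every specialist gets the same four fields. ProfileSpec and render_profile_sections are names I made up for illustration, and the example values for Forge are loosely based on how I describe her later in this post (the hard rule is invented).

```python
from dataclasses import dataclass


@dataclass
class ProfileSpec:
    """One specialist section of the prompt (hypothetical helper type)."""
    name: str
    role: str
    handles: str
    mindset: str
    hard_rules: str


def render_profile_sections(specs: list[ProfileSpec], start: int = 2) -> str:
    """Render numbered profile sections; numbering starts at 2 because
    section 1 of the prompt is reserved for the Commander."""
    blocks = []
    for number, spec in enumerate(specs, start=start):
        blocks.append("\n".join([
            f"{number}. Profile: {spec.name} - {spec.role}",
            f"Handles: {spec.handles}",
            f"Mindset: {spec.mindset}",
            f"Hard rules: {spec.hard_rules}",
            "Write a focused SOUL.md for this role.",
        ]))
    return "\n".join(blocks)


if __name__ == "__main__":
    forge = ProfileSpec(
        name="forge",
        role="Technical Builds",
        handles="coding work, server infrastructure, debugging",
        mindset="investigate before implementing",
        hard_rules="never change production without asking first",  # invented
    )
    print(render_profile_sections([forge]))
```

Paste the output between the Commander section and the "For each new profile" instructions.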

After the profiles were created the steps for each one were:

Run setup to configure the model and API key. The example below is for the ‘wheel’ profile and is run from the terminal; if you are using Claude Code, ask whether it can set up the AI models as well:

wheel setup

Create a new Telegram bot via BotFather for that profile. You send BotFather the command /newbot, give it a name and a username ending in “bot”, and it hands you a token. One bot per profile, one token per bot.

Add the token to that profile’s .env file:

nano ~/.hermes/profiles/wheel/.env
TELEGRAM_TOKEN=your_token_here

Install and start the gateway:

wheel gateway install
wheel gateway start

Check it is running:

wheel gateway status

Do that for each profile and you have five separate Telegram bots, each one a focused specialist you can message directly.
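
With several .env files to edit by hand, it is easy to leave the placeholder in one of them or paste the same token twice. The following is a rough sketch of a check for "one bot per profile, one token per bot". It assumes the ~/.hermes/profiles/<name>/.env layout and the TELEGRAM_TOKEN variable name used above; both helper functions are my own and not part of Hermes. If your installation expects a different variable name, pass it as key.

```python
from collections import defaultdict
from pathlib import Path
from typing import Optional


def read_env_value(env_file: Path, key: str) -> Optional[str]:
    """Return the value for key from a simple KEY=value file, if present."""
    if not env_file.is_file():
        return None
    for raw in env_file.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            return value.strip().strip("'\"")
    return None


def check_tokens(hermes_home: Path, key: str = "TELEGRAM_TOKEN") -> list[str]:
    """List problems: profiles without a token, or a token used more than once."""
    problems = []
    owners = defaultdict(list)
    profiles_dir = hermes_home / "profiles"
    if not profiles_dir.is_dir():
        return problems
    for profile in sorted(p for p in profiles_dir.iterdir() if p.is_dir()):
        token = read_env_value(profile / ".env", key)
        if not token or token == "your_token_here":
            problems.append(f"{profile.name}: no {key} set")
        else:
            owners[token].append(profile.name)
    for names in owners.values():
        if len(names) > 1:
            problems.append(f"token shared by: {', '.join(names)}")
    return problems


if __name__ == "__main__":
    for problem in check_tokens(Path.home() / ".hermes"):
        print(problem)
```

No output means every named profile has its own token. The script never prints the tokens themselves, only the profile names.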


Meet the Team

Image: Graphical representation of the Hermes AI agents

This is the part I enjoyed most. Each agent has a name, a defined personality, and a specific area it owns. Nothing bleeds across.

Lollie is the Commander. She was the original agent and she stays at the top. She knows the full picture, holds the master context, and routes work to whoever is best placed to handle it. She does not specialise. She coordinates.

Wheel handles all things content. Blog posts, newsletter, social captions, SEO. She knows the brand voice, the writing rules, the content framework. When anything content related comes up Wheel is who gets the task.

Forge handles technical builds and projects. Coding work, server infrastructure, anything that involves building or debugging something. Her mindset is to investigate before implementing and to think carefully about the knock-on effects of any change.

Engine organises projects and researches skills relevant to whatever is being built. If a new project is starting and I need to know what tools, frameworks, or approaches are worth considering, Engine is who I ask. She maps the landscape before anyone starts building.

Fro handles an ecommerce store I run. Product descriptions, store operations, the specific brand voice for that context. She is completely scoped away from everything else. What happens in Fro stays in Fro.

Muse is the personal one. Health, social life, things that have nothing to do with work or building. She knows nothing about any of the other agents or what I am working on. That separation is deliberate. Some things should not be optimised.
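
To make the division of labour concrete, here is the team written out as a routing table. This is only an illustration of the idea. In the real setup the routing lives as plain language in Lollie's SOUL.md and the model decides where a request goes; there is no keyword matching involved. The trigger words and the route function below are invented for the example.

```python
import re

# Specialist profile -> trigger words. The words are made up for illustration.
ROUTING_TABLE = {
    "wheel": ("blog", "newsletter", "caption", "seo"),
    "forge": ("code", "server", "debug", "deploy"),
    "engine": ("research", "framework", "tooling", "plan"),
    "fro": ("store", "product", "shopify"),
    "muse": ("health", "social", "personal"),
}


def route(request: str, default: str = "lollie") -> str:
    """Pick the specialist whose triggers match the most words in the request.

    Falls back to the Commander when nothing matches; ties go to the profile
    listed first in ROUTING_TABLE.
    """
    words = set(re.findall(r"[a-z]+", request.lower()))
    best, best_hits = default, 0
    for profile, triggers in ROUTING_TABLE.items():
        hits = len(words & set(triggers))
        if hits > best_hits:
            best, best_hits = profile, hits
    return best


if __name__ == "__main__":
    for text in ("Write a blog post and newsletter teaser",
                 "Debug the server crash",
                 "Good morning"):
        print(f"{text!r} -> {route(text)}")
```

A model-driven router handles ambiguity far better than this, but writing the table down first is a useful test of whether each agent's area is clearly defined.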


How They Talk to Each Other

This was the part I was most curious about when I started researching this. If you have six agents running, do you have to manually copy context between them? Do you have to be the messenger?

The answer is no, and it comes down to two native Hermes features.

The first is delegate_task. Any agent can spin up a temporary subagent to handle a specific piece of work and pass the result back. Fast, no setup needed, useful for quick in-session tasks where the result needs to come straight back into the conversation.

The second is the Kanban board. This is a shared SQLite database that all profiles on the same machine can read from and write to. Lollie can create a task, assign it to Wheel, and the Hermes dispatcher will automatically spawn Wheel as a background worker to complete it. Wheel finishes, marks it done, and the result comes back to Lollie. No manual context passing, no being the messenger yourself.

You initialise it with:

hermes kanban init
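
If "a shared SQLite database as a task queue" sounds abstract, this toy version shows the basic mechanics: one agent inserts a row, another claims it, finishes it, and writes the result back. It is a sketch of the general pattern using Python's built-in sqlite3 module. The table, columns, and function names are my own assumptions and do not reflect the schema or dispatcher Hermes actually uses.

```python
import sqlite3

# Hypothetical schema for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tasks (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    created_by TEXT NOT NULL,
    assignee TEXT NOT NULL,
    title TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'todo',
    result TEXT
)
"""


def open_board(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the board. Use a file path to share it between processes."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def create_task(conn, created_by: str, assignee: str, title: str) -> int:
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO tasks (created_by, assignee, title) VALUES (?, ?, ?)",
            (created_by, assignee, title),
        )
    return cur.lastrowid


def claim_next(conn, assignee: str):
    """Move the oldest 'todo' task for this assignee to 'doing' and return it.

    Not safe against two workers claiming at the same instant; a real queue
    would need an atomic claim.
    """
    with conn:
        row = conn.execute(
            "SELECT id, title FROM tasks "
            "WHERE assignee = ? AND status = 'todo' ORDER BY id LIMIT 1",
            (assignee,),
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE tasks SET status = 'doing' WHERE id = ?", (row[0],))
    return row


def complete(conn, task_id: int, result: str) -> None:
    with conn:
        conn.execute(
            "UPDATE tasks SET status = 'done', result = ? WHERE id = ?",
            (result, task_id),
        )


if __name__ == "__main__":
    board = open_board()
    task_id = create_task(board, "lollie", "wheel", "Draft newsletter intro")
    print(claim_next(board, "wheel"))
    complete(board, task_id, "Intro drafted")
```

Because the queue is a file on disk rather than something held in a conversation, tasks survive a restart, which is the main thing that separates this approach from delegate_task.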

I will be honest: when I watched Lollie route a task to Wheel without me doing anything in between, it felt like something clicking into place.

Image courtesy: Nous Research

The Cost Question Nobody Talks About

Here is something most multi-agent setup posts skip over entirely. More agents means more API calls, which means more cost. If you are running six agents all hitting a frontier model for every task, that adds up faster than you expect.

A few things worth thinking about before you build this out:

Match the model to the work. Not every agent needs your most capable and expensive model. An agent handling personal reminders or simple store tasks does not need the same horsepower as one doing technical research or writing long-form content. Hermes lets you set a different model per profile. Use that.

Set cheaper models for auxiliary tasks. Hermes uses your main model by default for background jobs like context compression, title generation, and session search. You can override these in each profile’s config.yaml and point them at a cheaper model like Gemini Flash. Those background tasks do not need frontier model quality and routing them to a cheaper model is one of the quickest wins for keeping costs manageable.

Be intentional about automation. Automating agent tasks is powerful but automated tasks that fire frequently on an expensive model will quietly drain your budget. Start with manual triggers and only automate once you know the task is working well and the cost per run is acceptable.

Use the Kanban board wisely. Background workers spawned by the dispatcher are full agent sessions. They cost tokens. For lightweight or frequent tasks consider whether a simpler skill or a cron job would do the same job at a fraction of the cost.

Running a multi-agent setup on a budget is absolutely doable. It just requires more intentionality than running one agent.
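
A quick back-of-the-envelope calculation makes the difference between models tangible. The sketch below multiplies tokens per run by price per million tokens, runs per day, and days per month. Every number in it is a placeholder I made up for illustration, including the prices; plug in your provider's current rates and your own usage.

```python
from dataclasses import dataclass


@dataclass
class AgentUsage:
    """Rough usage profile for one agent. All values are estimates you supply."""
    name: str
    runs_per_day: float
    input_tokens_per_run: int
    output_tokens_per_run: int
    input_price_per_mtok: float   # USD per million input tokens
    output_price_per_mtok: float  # USD per million output tokens


def monthly_cost(usage: AgentUsage, days: int = 30) -> float:
    """Estimated monthly spend in USD for one agent."""
    per_run = (
        usage.input_tokens_per_run * usage.input_price_per_mtok
        + usage.output_tokens_per_run * usage.output_price_per_mtok
    ) / 1_000_000
    return per_run * usage.runs_per_day * days


if __name__ == "__main__":
    # Placeholder prices, not real quotes from any provider.
    team = [
        AgentUsage("forge", 10, 20_000, 1_000, 3.00, 15.00),  # pricier model
        AgentUsage("muse", 10, 20_000, 1_000, 0.10, 0.40),    # cheaper model
    ]
    for agent in team:
        print(f"{agent.name}: ${monthly_cost(agent):.2f} per month")
    print(f"total: ${sum(monthly_cost(a) for a in team):.2f} per month")
```

The same workload costs very different amounts depending on the model, which is why matching the model to the work matters more as you add agents. Background jobs and Kanban workers count as runs too, so include them in runs_per_day.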


What I Would Do Differently

Start with fewer agents than you think you need. I built all five in one session, which was straightforward, but getting the SOUL.md files right for each one takes real thought. A thin or vague SOUL.md produces a thin and vague agent. Spend more time on those files than you think is necessary.

Also do not build the automation layer before the agents are stable. Spend a week just talking to each one directly first. You will learn quickly where the gaps are.


FAQ

Can you run multiple Hermes agents on the same machine?
Yes. Hermes has a native profile system that lets you run multiple isolated agents from a single installation. Each profile gets its own memory, skills, configuration, and messaging channels. You create a new profile with one command and configure it independently from there.

How do Hermes agents talk to each other without you being in the middle?
Through two built-in features. delegate_task lets a parent agent spin up a temporary subagent for a specific job and collect the result. The Kanban board is a shared SQLite task queue where any profile can create tasks, assign them to other profiles, and have the dispatcher automatically spawn the right agent as a worker to complete them without you being involved.

What is the difference between delegate_task and the Hermes Kanban board?
delegate_task is a function call. It is fast, in-session, and the result comes straight back into the parent agent’s context. The Kanban board is a durable work queue. Use it when work crosses agent boundaries, needs to survive a restart, involves dependencies between tasks, or you want a human to be able to interject at any point.


What Comes Next

The setup is done, but this is clearly the beginning, not the end. I want to build out dedicated skill files for each agent so they have deeper specialist knowledge in their areas. I want to wire the Kanban board up properly so Lollie is doing more routing automatically. And I want to see what months of accumulated memory in a focused domain actually produces in terms of output quality.

If you are building something similar or already have a multi-agent setup running I would genuinely like to hear about it. Come find me at newsletter.augustwheel.com and let me know what you are working on.

