hermes-agent

/home/avalon/.hermes/skills/autonomous-ai-agents/hermes-agent/SKILL.md · raw

Hermes Agent

Hermes Agent is an open-source AI agent framework by Nous Research that runs in your terminal, messaging platforms, and IDEs. It belongs to the same category as Claude Code (Anthropic), Codex (OpenAI), and OpenClaw — autonomous coding and task-execution agents that use tool calling to interact with your system. Hermes works with any LLM provider (OpenRouter, Anthropic, OpenAI, DeepSeek, local models, and 15+ others) and runs on Linux, macOS, and WSL.

What makes Hermes different:

Self-improving through skills — Hermes learns from experience by saving reusable procedures as skills. When it solves a complex problem, discovers a workflow, or gets corrected, it can persist that knowledge as a skill document that loads into future sessions. Skills accumulate over time, making the agent better at your specific tasks and environment.
Persistent memory across sessions — remembers who you are, your preferences, environment details, and lessons learned. Pluggable memory backends (built-in, Honcho, Mem0, and more) let you choose how memory works.
Multi-platform gateway — the same agent runs on Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Email, and 8+ other platforms with full tool access, not just chat.
Provider-agnostic — swap models and providers mid-workflow without changing anything else. Credential pools rotate across multiple API keys automatically.
Profiles — run multiple independent Hermes instances with isolated configs, sessions, skills, and memory.
Extensible — plugins, MCP servers, custom tools, webhook triggers, cron scheduling, and the full Python ecosystem.

People use Hermes for software development, research, system administration, data analysis, content creation, home automation, and anything else that benefits from an AI agent with persistent context and full system access.

This skill helps you work with Hermes Agent effectively — setting it up, configuring features, spawning additional agent instances, troubleshooting issues, finding the right commands and settings, and understanding how the system works when you need to extend or contribute to it.

Docs: https://hermes-agent.nousresearch.com/docs/

Quick Start

# Install
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

# Interactive chat (default)
hermes

# Single query
hermes chat -q "What is the capital of France?"

# Setup wizard
hermes setup

# Change model/provider
hermes model

# Check health
hermes doctor

CLI Reference

Global Flags

hermes [flags] [command]

  --version, -V             Show version
  --resume, -r SESSION      Resume session by ID or title
  --continue, -c [NAME]     Resume by name, or most recent session
  --worktree, -w            Isolated git worktree mode (parallel agents)
  --skills, -s SKILL        Preload skills (comma-separate or repeat)
  --profile, -p NAME        Use a named profile
  --yolo                    Skip dangerous command approval
  --pass-session-id         Include session ID in system prompt

No subcommand defaults to chat.

Chat

hermes chat [flags]
  -q, --query TEXT          Single query, non-interactive
  -m, --model MODEL         Model (e.g. anthropic/claude-sonnet-4)
  -t, --toolsets LIST       Comma-separated toolsets
  --provider PROVIDER       Force provider (openrouter, anthropic, nous, etc.)
  -v, --verbose             Verbose output
  -Q, --quiet               Suppress banner, spinner, tool previews
  --checkpoints             Enable filesystem checkpoints (/rollback)
  --source TAG              Session source tag (default: cli)

Configuration

hermes setup [section]      Interactive wizard (model|terminal|gateway|tools|agent)
hermes model                Interactive model/provider picker
hermes config               View current config
hermes config edit          Open config.yaml in $EDITOR
hermes config set KEY VAL   Set a config value
hermes config path          Print config.yaml path
hermes config env-path      Print .env path
hermes config check         Check for missing/outdated config
hermes model                Interactive model/provider picker
# OAuth credentials in current Hermes builds:
hermes auth add PROVIDER --type oauth --no-browser
# Example for Codex/ChatGPT subscription device auth:
hermes auth add openai-codex --type oauth --no-browser --label web-setup
hermes logout               Clear stored auth
hermes doctor [--fix]       Check dependencies and config
hermes status [--all]       Show component status

Tools & Skills

hermes tools                Interactive tool enable/disable (curses UI)
hermes tools list           Show all tools and status
hermes tools enable NAME    Enable a toolset
hermes tools disable NAME   Disable a toolset

hermes skills list          List installed skills
hermes skills search QUERY  Search the skills hub
hermes skills install ID    Install a skill
hermes skills inspect ID    Preview without installing
hermes skills config        Enable/disable skills per platform
hermes skills check         Check for updates
hermes skills update        Update outdated skills
hermes skills uninstall N   Remove a hub skill
hermes skills publish PATH  Publish to registry
hermes skills browse        Browse all available skills
hermes skills tap add REPO  Add a GitHub repo as skill source

MCP Servers

hermes mcp serve            Run Hermes as an MCP server
hermes mcp add NAME         Add an MCP server (--url or --command)
hermes mcp remove NAME      Remove an MCP server
hermes mcp list             List configured servers
hermes mcp test NAME        Test connection
hermes mcp configure NAME   Toggle tool selection

Gateway (Messaging Platforms)

hermes gateway run          Start gateway foreground
hermes gateway install      Install as background service
hermes gateway start/stop   Control the service
hermes gateway restart      Restart the service
hermes gateway status       Check status
hermes gateway setup        Configure platforms

Supported platforms: Telegram, Discord, Slack, WhatsApp, Signal, Email, SMS, Matrix, Mattermost, Home Assistant, DingTalk, Feishu, WeCom, Open WebUI.

Platform docs: https://hermes-agent.nousresearch.com/docs/user-guide/messaging/

Telegram bot provisioning

When productizing Hermes tenants with Telegram integration, use a guided @BotFather token-entry wizard as the default path. The official Telegram Bot API cannot create other bots or fetch BotFather tokens. Fully automated bot creation requires acting as a Telegram user account through MTProto/TDLib/Telethon (QR/phone login, optional 2FA, encrypted session, scripted BotFather chat), which is sensitive and can trigger anti-automation/rate limits. Treat MTProto user-account automation as an explicit advanced option only, not the MVP/default; if implemented, use short-lived encrypted sessions and delete them after bot creation where possible.

Interactive approval buttons vary by platform

When asked whether Telegram-style approval buttons exist on another messaging platform, verify the adapter rather than assuming platform parity. In current Hermes gateway code, the rich approval path is class-method based: gateway/run.py calls an adapter's send_exec_approval(...) when present and otherwise falls back to a plain text prompt instructing /approve, /approve session, /approve always, or /deny.

Known practical UX notes: - Telegram implements send_exec_approval with inline-keyboard buttons and callback queries. - Discord, Slack, Feishu, and Matrix also have send_exec_approval implementations using their native interactive/card/block mechanisms. - WhatsApp is integrated through a bridge (whatsapp-web.js / Baileys / Business API patterns), but the default WhatsApp adapter may not implement send_exec_approval; expect text-command fallback unless a backend-specific button method is added. - iMessage is via the BlueBubbles adapter. BlueBubbles supports messages, media, tapbacks, typing/read indicators, and webhook delivery, but iMessage has no Telegram-like arbitrary inline bot buttons. Use text replies, or design a careful tapback mapping (e.g. 👍 approve once, 👎 deny, ❤️ approve always) only if explicitly requested and documented.

See references/gateway-interactive-approvals.md for the verification/search workflow and iMessage visual references.

Gateway interactive approval/button support

When asked whether a messaging platform has Telegram-like approval buttons, verify the adapter implementation rather than assuming all gateway platforms support the same UI. In the Hermes codebase, dangerous-command approval prompts use an adapter method named send_exec_approval; if absent, gateway falls back to plain text instructions (/approve, /approve session, /approve always, /deny). As of the current checkout: - Telegram: send_exec_approval with inline keyboard buttons; also callback query handling. - Discord, Slack, Feishu, Matrix: implement richer interactive approval UI. - WhatsApp: gateway adapter exists (Node bridge via WhatsApp Web/Baileys/Business API patterns), but no send_exec_approval method in gateway/platforms/whatsapp.py; expect text-command fallback unless that backend is extended. - iMessage: Hermes uses the bluebubbles platform adapter (BlueBubbles macOS server). It supports text/media/tapbacks/typing/read receipts, but iMessage has no Telegram-style arbitrary inline bot buttons; approvals are command/text based.

Quick verification commands:

grep -R "def send_exec_approval" -n ~/.hermes/hermes-agent/gateway/platforms
grep -R "CallbackQueryHandler\|callback_query\|interactive" -n ~/.hermes/hermes-agent/gateway/platforms/telegram.py ~/.hermes/hermes-agent/gateway/platforms/whatsapp.py ~/.hermes/hermes-agent/gateway/platforms/bluebubbles.py

Hermes Workspace web UI deployment

When deploying hermes-workspace against a local Hermes Agent gateway:

Enable the gateway HTTP API in ~/.hermes/.env, then restart the gateway: bash API_SERVER_ENABLED=true API_SERVER_HOST=127.0.0.1 API_SERVER_PORT=8642 API_SERVER_KEY=<strong-shared-token> hermes gateway restart curl http://127.0.0.1:8642/health
In the workspace .env, point to the local API and mirror the token: bash HERMES_API_URL=http://127.0.0.1:8642 HERMES_API_TOKEN=<same-as-API_SERVER_KEY> HERMES_DASHBOARD_URL=http://127.0.0.1:9119 HERMES_PASSWORD=<strong-ui-password> COOKIE_SECURE=1 TRUST_PROXY=1 HOST=127.0.0.1
Start the dashboard under PM2 via a shell wrapper, not directly as the hermes Python entrypoint. PM2 otherwise may run the Python script with Node and crash with SyntaxError: Invalid or unexpected token: bash pm2 start /usr/bin/bash --name hermes-dashboard -- -lc 'exec /home/USER/.local/bin/hermes dashboard --port 9119 --host 127.0.0.1 --no-open'
Start the workspace on a loopback port and verify all layers: bash PORT=4024 HOST=127.0.0.1 pm2 start pnpm --name hermes-workspace --cwd /path/to/hermes-workspace -- preview --host 127.0.0.1 --port 4024 curl http://127.0.0.1:4024/api/auth-check curl http://127.0.0.1:9119/api/status
Put nginx basic auth in front if exposing publicly, but keep the workspace app password enabled too; verify unauthenticated public root returns 401, authenticated root returns 200, /api/auth-check reports authRequired:true, and posting the workspace password to /api/auth sets a valid session.

If a Nuxt/Vite workspace build dies with exit 137 or the gateway appears to stall during build, check RAM/swap and add temporary swap before rebuilding.

Sessions

hermes sessions list        List recent sessions
hermes sessions browse      Interactive picker
hermes sessions export OUT  Export to JSONL
hermes sessions rename ID T Rename a session
hermes sessions delete ID   Delete a session
hermes sessions prune       Clean up old sessions (--older-than N days)
hermes sessions stats       Session store statistics

Cron Jobs

hermes cron list            List jobs (--all for disabled)
hermes cron create SCHED    Create: '30m', 'every 2h', '0 9 * * *'
hermes cron edit ID         Edit schedule, prompt, delivery
hermes cron pause/resume ID Control job state
hermes cron run ID          Trigger on next tick
hermes cron remove ID       Delete a job
hermes cron status          Scheduler status

Cron script pitfall: Hermes cron script paths must be relative filenames under ~/.hermes/scripts/; absolute paths or ~/... paths are rejected. For project-local scripts, create a thin wrapper in ~/.hermes/scripts/ that execs the real script, then pass only the wrapper filename.

Cron reset pitfall: when a user says to cancel/recreate “all current cron jobs” in the context of a specific project, scope the destructive action to that project’s jobs unless they explicitly ask for every Hermes cron globally. List jobs first, remove only the matching project jobs, preserve unrelated watchers/digests, then create the replacement jobs and verify with hermes cron list or the cronjob tool.

Periodic progress-update pattern: keep long-running worker/orchestrator jobs deliver=local so they do useful work without spamming chat, and create a separate bounded reporter job with deliver=origin for Telegram updates. For temporary windows, set a finite repeat count (for example, every 30m for 48h = 96 repeats) and trigger both worker and reporter once with cronjob(action='run') after setup. Reporter prompts should inspect state/reports/git and produce mobile-readable bullets/headings rather than raw logs or tables.

Self-improving app pattern: when a user asks for an app/project to “keep getting better” through Hermes, make Hermes cron the intelligence layer and the app the audit/visualization layer. Create (1) a deliver=local agent worker that loads the relevant project-local skills and performs research/work, (2) a no-agent deterministic script job that writes an audit/state record even when no LLM pass runs, (3) a deliver=origin digest reporter for concise mobile updates, and (4) if the app needs cron awareness, a no-agent snapshot bridge that writes normalized hermes cron list --all output into the app DB. Verify both hermes cron list/cronjob list and the app API/UI see the jobs. Keep script paths as relative filenames under ~/.hermes/scripts/.

Webhooks

hermes webhook subscribe N  Create route at /webhooks/<name>
hermes webhook list         List subscriptions
hermes webhook remove NAME  Remove a subscription
hermes webhook test NAME    Send a test POST

Profiles

hermes profile list         List all profiles
hermes profile create NAME  Create (--clone, --clone-all, --clone-from)
hermes profile use NAME     Set sticky default
hermes profile delete NAME  Delete a profile
hermes profile show NAME    Show details
hermes profile alias NAME   Manage wrapper scripts
hermes profile rename A B   Rename a profile
hermes profile export NAME  Export to tar.gz
hermes profile import FILE  Import from archive

For hosted/productized Hermes deployments, prefer a Hermes-native profile/home-per-tenant architecture first, usually wrapped in one Docker container per tenant for filesystem isolation. Do not jump to a custom central message router/shared worker design until container economics are benchmarked. See references/hosted-hermes-profile-tenants.md for the profile/container-per-tenant pattern, BYO provider onboarding, Telegram bot-token setup, skill bundle rollout, safety defaults, and benchmark gates.

Docker tenant pitfall: the stock installer places Hermes code under $HERMES_HOME/hermes-agent (typically /home/hermes/.hermes/hermes-agent) and the hermes launcher points there. If you bind-mount a tenant home over that same path, you hide the installed code and break the launcher. In runner images, keep the installed code path inside the image and mount tenant state elsewhere, e.g. -e HERMES_HOME=/data/hermes -v /srv/astral/tenants/<id>/hermes-home:/data/hermes.

Credential Pools

hermes auth add PROVIDER    Add an API key or OAuth credential
hermes auth add openai-codex --type oauth --no-browser --label web-setup
hermes auth list [PROVIDER] List pooled credentials
hermes auth remove P INDEX  Remove by provider + index
hermes auth reset PROVIDER  Clear exhaustion status

OAuth/device-auth note: in current Hermes builds the old hermes login --provider openai-codex command may print that it has been removed. For hosted tenant/container flows, first try to run hermes auth add openai-codex --type oauth --no-browser --label <label> inside the tenant's own HERMES_HOME; relay the printed URL and code to the user. Strip ANSI escape codes before parsing the URL/code from terminal output. If the tenant/worker network receives OpenAI/Cloudflare 403 with a “Just a moment...” HTML challenge or 530 route error on the initial device-code request, do not expose the traceback to users; generate/poll the Codex device code from the control-plane backend instead, then write the completed OAuth credential into the tenant’s isolated auth.json in both shapes: credential_pool.openai-codex for auth-list/status UX and providers.openai-codex.tokens plus active_provider: openai-codex for runtime/gateway/cron resolution (auth_type: oauth, source: manual:device_code, base_url: https://chatgpt.com/backend-api/codex). Treat 403/404 during polling as “not approved yet,” not a hard failure. If a tenant was chatting successfully but later says No Codex credentials stored, inspect hermes auth list openai-codex before re-authing: the credential may be present but temporarily exhausted/rate-limited (429), or Spawn may have written only the pool shape; reset/repair the auth store and verify with a minimal hermes chat -q 'Reply with exactly: ok' -Q --provider openai-codex -m gpt-5.5 probe.

Other

hermes insights [--days N]  Usage analytics
hermes update               Update to latest version
hermes pairing list/approve/revoke  DM authorization
hermes plugins list/install/remove  Plugin management
hermes honcho setup/status  Honcho memory integration
hermes memory setup/status/off  Memory provider config
hermes completion bash|zsh  Shell completions
hermes acp                  ACP server (IDE integration)
hermes claw migrate         Migrate from OpenClaw
hermes uninstall            Uninstall Hermes

Slash Commands (In-Session)

Type these during an interactive chat session.

Session Control

/new (/reset)        Fresh session
/clear               Clear screen + new session (CLI)
/retry               Resend last message
/undo                Remove last exchange
/title [name]        Name the session
/compress            Manually compress context
/stop                Kill background processes
/rollback [N]        Restore filesystem checkpoint
/background <prompt> Run prompt in background
/queue <prompt>      Queue for next turn
/resume [name]       Resume a named session

Configuration

/config              Show config (CLI)
/model [name]        Show or change model
/provider            Show provider info
/personality [name]  Set personality
/reasoning [level]   Set reasoning (none|minimal|low|medium|high|xhigh|show|hide)
/verbose             Cycle: off → new → all → verbose
/voice [on|off|tts]  Voice mode
/yolo                Toggle approval bypass
/skin [name]         Change theme (CLI)
/statusbar           Toggle status bar (CLI)

Tools & Skills

/tools               Manage tools (CLI)
/toolsets            List toolsets (CLI)
/skills              Search/install skills (CLI)
/skill <name>        Load a skill into session
/cron                Manage cron jobs (CLI)
/reload-mcp          Reload MCP servers
/plugins             List plugins (CLI)

Info

/help                Show commands
/commands [page]     Browse all commands (gateway)
/usage               Token usage
/insights [days]     Usage analytics
/status              Session info (gateway)
/profile             Active profile info

Exit

/quit (/exit, /q)    Exit CLI

Key Paths & Config

~/.hermes/config.yaml       Main configuration
~/.hermes/.env              API keys and secrets
~/.hermes/skills/           Installed skills
~/.hermes/sessions/         Session transcripts
~/.hermes/logs/             Gateway and error logs
~/.hermes/auth.json         OAuth tokens and credential pools
~/.hermes/hermes-agent/     Source code (if git-installed)

Profiles use ~/.hermes/profiles/<name>/ with the same layout.

Config Sections

Edit with hermes config edit or hermes config set section.key value.

Section	Key options
`model`	`default`, `provider`, `base_url`, `api_key`, `context_length`
`agent`	`max_turns` (90), `tool_use_enforcement`
`terminal`	`backend` (local/docker/ssh/modal), `cwd`, `timeout` (180)
`compression`	`enabled`, `threshold` (0.50), `target_ratio` (0.20)
`display`	`skin`, `tool_progress`, `show_reasoning`, `show_cost`
`stt`	`enabled`, `provider` (local/groq/openai)
`tts`	`provider` (edge/elevenlabs/openai/kokoro/fish)
`memory`	`memory_enabled`, `user_profile_enabled`, `provider`
`security`	`tirith_enabled`, `website_blocklist`
`delegation`	`model`, `provider`, `max_iterations` (50)
`smart_model_routing`	`enabled`, `cheap_model`
`checkpoints`	`enabled`, `max_snapshots` (50)

Full config reference: https://hermes-agent.nousresearch.com/docs/user-guide/configuration

Providers

18 providers supported. Set via hermes model or hermes setup.

Provider	Auth	Key env var
OpenRouter	API key	`OPENROUTER_API_KEY`
Anthropic	API key	`ANTHROPIC_API_KEY`
Nous Portal	OAuth	`hermes login --provider nous`
OpenAI Codex	OAuth	`hermes login --provider openai-codex`
GitHub Copilot	Token	`COPILOT_GITHUB_TOKEN`
DeepSeek	API key	`DEEPSEEK_API_KEY`
Hugging Face	Token	`HF_TOKEN`
Z.AI / GLM	API key	`GLM_API_KEY`
MiniMax	API key	`MINIMAX_API_KEY`
Kimi / Moonshot	API key	`KIMI_API_KEY`
Alibaba / DashScope	API key	`DASHSCOPE_API_KEY`
Kilo Code	API key	`KILOCODE_API_KEY`
Custom endpoint	Config	`model.base_url` + `model.api_key` in config.yaml

Plus: AI Gateway, OpenCode Zen, OpenCode Go, MiniMax CN, GitHub Copilot ACP.

Full provider docs: https://hermes-agent.nousresearch.com/docs/integrations/providers

Direct OpenAI API Keys as a Custom Endpoint

In Hermes v0.13-era configs, a plain OpenAI API key is not model.provider: openai, and openai-codex is the ChatGPT/Codex OAuth provider rather than the BYO API-key provider. For a direct OPENAI_API_KEY, configure OpenAI as an OpenAI-compatible custom endpoint with api_mode: chat_completions. See references/direct-openai-api-key-provider.md for the config shape, tenant-provisioning pitfall, and verification command.

Venice as a Custom Endpoint

When asked to set up Hermes with Venice, treat Venice as an OpenAI-compatible custom provider and install the companion skills if requested. See references/venice-integration.md for the compact workflow and verification commands.

Key config shape:

model:
  default: kimi-k2-6
  provider: custom
  base_url: https://api.venice.ai/api/v1
  api_key: ${VENICE_API_KEY}

Operational notes: - Store the secret in ~/.hermes/.env as VENICE_API_KEY=...; keep config.yaml using the env-var reference. - Verify available models with GET https://api.venice.ai/api/v1/models using Authorization: Bearer $VENICE_API_KEY; known useful IDs include kimi-k2-6, zai-org-glm-5, claude-opus-4-7, z-ai-glm-5v-turbo, and venice-uncensored-1-2. - Venice's 19 skills live at https://github.com/veniceai/skills; copy each skills/<name>/SKILL.md folder under ~/.hermes/skills/venice/<name>/ for local Hermes discovery. - After changing model config for a running gateway session, restart/reset the gateway/session so Telegram picks up the new default provider.

Toolsets

Enable/disable via hermes tools (interactive) or hermes tools enable/disable NAME.

Toolset	What it provides
`web`	Web search and content extraction
`browser`	Browser automation (Browserbase, Camofox, or local Chromium)
`terminal`	Shell commands and process management
`file`	File read/write/search/patch
`code_execution`	Sandboxed Python execution
`vision`	Image analysis
`image_gen`	AI image generation
`tts`	Text-to-speech
`skills`	Skill browsing and management
`memory`	Persistent cross-session memory
`session_search`	Search past conversations
`delegation`	Subagent task delegation
`cronjob`	Scheduled task management
`clarify`	Ask user clarifying questions
`moa`	Mixture of Agents (off by default)
`homeassistant`	Smart home control (off by default)

Tool changes take effect on /reset (new session). They do NOT apply mid-conversation to preserve prompt caching.

Voice & Transcription

STT (Voice → Text)

Voice messages from messaging platforms are auto-transcribed.

Provider priority (auto-detected): 1. Local faster-whisper — free, no API key: pip install faster-whisper 2. Groq Whisper — free tier: set GROQ_API_KEY 3. OpenAI Whisper — paid: set VOICE_TOOLS_OPENAI_KEY

Config:

stt:
  enabled: true
  provider: local        # local, groq, openai
  local:
    model: base          # tiny, base, small, medium, large-v3

Practical Telegram voice-note handling

Incoming Telegram voice messages are cached locally under ~/.hermes/audio_cache/ as .ogg files. If a Telegram link or attachment UI is awkward, inspect that directory and sort by mtime to find the newest upload.
Private t.me/c/... links often do not expose the media bytes directly; the usable artifact is usually the gateway-cached .ogg, not the HTML landing page.
Long Telegram voice uploads can fail before caching with telegram.error.BadRequest: File is too big. If the expected long recording is absent from ~/.hermes/audio_cache/, inspect ~/.hermes/logs/gateway.log for [Telegram] Failed to cache voice: File is too big; then request an alternate upload route (MEGA/Drive/Dropbox/S3, SCP/rsync to an incoming folder, or split chunks) instead of continuing to search locally.
For long notes (for example ~1 hour), do not transcribe in one pass if you need higher fidelity. Split to 10–15 minute mono 16 kHz chunks with ffmpeg, transcribe each chunk, then stitch timestamped segments back together.
If disk space is tight, prefer faster-whisper small or base over larger models. large-v3-turbo can fail mid-download with No space left on device.
A common local-config mistake is setting the faster-whisper model name to whisper-1; that is an OpenAI API model name, not a valid faster-whisper local model identifier. Use values like base, small, medium, or large-v3 instead.
See references/telegram-voice-cache-transcription.md for a concrete recovery/transcription workflow.

TTS (Text → Voice)

Provider	Env var	Free?
Edge TTS	None	Yes (default)
ElevenLabs	`ELEVENLABS_API_KEY`	Free tier
OpenAI	`VOICE_TOOLS_OPENAI_KEY`	Paid
Kokoro (local)	None	Free
Fish Audio	`FISH_AUDIO_API_KEY`	Free tier

Voice commands: /voice on (voice-to-voice), /voice tts (always voice), /voice off.

Spawning Additional Hermes Instances

Run additional Hermes processes as fully independent subprocesses — separate sessions, tools, and environments.

When to Use This vs delegate_task

	`delegate_task`	Spawning `hermes` process
Isolation	Separate conversation, shared process	Fully independent process
Duration	Minutes (bounded by parent loop)	Hours/days
Tool access	Subset of parent's tools	Full tool access
Interactive	No	Yes (PTY mode)
Use case	Quick parallel subtasks	Long autonomous missions

One-Shot Mode

terminal(command="hermes chat -q 'Research GRPO papers and write summary to ~/research/grpo.md'", timeout=300)

# Background for long tasks:
terminal(command="hermes chat -q 'Set up CI/CD for ~/myapp'", background=true)

Interactive PTY Mode (via tmux)

Hermes uses prompt_toolkit, which requires a real terminal. Use tmux for interactive spawning:

# Start
terminal(command="tmux new-session -d -s agent1 -x 120 -y 40 'hermes'", timeout=10)

# Wait for startup, then send a message
terminal(command="sleep 8 && tmux send-keys -t agent1 'Build a FastAPI auth service' Enter", timeout=15)

# Read output
terminal(command="sleep 20 && tmux capture-pane -t agent1 -p", timeout=5)

# Send follow-up
terminal(command="tmux send-keys -t agent1 'Add rate limiting middleware' Enter", timeout=5)

# Exit
terminal(command="tmux send-keys -t agent1 '/exit' Enter && sleep 2 && tmux kill-session -t agent1", timeout=10)

Multi-Agent Coordination

# Agent A: backend
terminal(command="tmux new-session -d -s backend -x 120 -y 40 'hermes -w'", timeout=10)
terminal(command="sleep 8 && tmux send-keys -t backend 'Build REST API for user management' Enter", timeout=15)

# Agent B: frontend
terminal(command="tmux new-session -d -s frontend -x 120 -y 40 'hermes -w'", timeout=10)
terminal(command="sleep 8 && tmux send-keys -t frontend 'Build React dashboard for user management' Enter", timeout=15)

# Check progress, relay context between them
terminal(command="tmux capture-pane -t backend -p | tail -30", timeout=5)
terminal(command="tmux send-keys -t frontend 'Here is the API schema from the backend agent: ...' Enter", timeout=5)

Project Swarm via Kanban + Profiles

Hermes may not have a top-level hermes swarm command in the installed version. When a user asks to create a project swarm, implement it with native primitives:

Create a dedicated Kanban board: bash hermes kanban boards create project-slug --name 'Project Name' --description '...' --switch
Create dedicated role profiles with cloned config/env: bash hermes profile create projectpm --clone --no-alias hermes profile create projectbackend --clone --no-alias hermes profile create projectfrontend --clone --no-alias hermes profile create projectops --clone --no-alias hermes profile create projectqa --clone --no-alias hermes profile create projectreviewer --clone --no-alias
Verify the model/provider on every profile with hermes profile list and a smoke probe, especially when using OAuth providers.
Seed the board with fan-out/fan-in tasks assigned to the role profiles, using parent dependencies for synthesis/review gates.

OAuth auth pitfall: hermes profile create --clone clones config, .env, SOUL.md, and skills, but it may not copy OAuth credential state such as auth.json. If Kanban workers crash with No Codex credentials stored, copy the auth state into each new profile and verify:

for p in projectpm projectbackend projectfrontend projectops projectqa projectreviewer; do
  cp ~/.hermes/auth.json ~/.hermes/profiles/$p/auth.json
  chmod 600 ~/.hermes/profiles/$p/auth.json
done
hermes --profile projectpm chat -q 'Reply with exactly: ok' --provider openai-codex -m gpt-5.5 -Q

Then run a dispatcher pass or wait for the gateway dispatcher:

hermes kanban --board project-slug dispatch
hermes kanban --board project-slug stats

Session Resume

# Resume most recent session
terminal(command="tmux new-session -d -s resumed 'hermes --continue'", timeout=10)

# Resume specific session
terminal(command="tmux new-session -d -s resumed 'hermes --resume 20260225_143052_a1b2c3'", timeout=10)

Tips

Prefer delegate_task for quick subtasks — less overhead than spawning a full process
Use -w (worktree mode) when spawning agents that edit code — prevents git conflicts
Set timeouts for one-shot mode — complex tasks can take 5-10 minutes
Use hermes chat -q for fire-and-forget — no PTY needed
Use tmux for interactive sessions — raw PTY mode has \r vs \n issues with prompt_toolkit
For scheduled tasks, use the cronjob tool instead of spawning — handles delivery and retry

Troubleshooting

Voice not working

Check stt.enabled: true in config.yaml
Verify provider: pip install faster-whisper or set API key
Restart gateway: /restart

Tool not available

hermes tools — check if toolset is enabled for your platform
Some tools need env vars (check .env)
/reset after enabling tools

Web extraction service credits / no-service fallback

If web_extract or web_search reports a paid scraper error such as Firecrawl Payment Required / Insufficient credits, first identify the configured web backend from live config and env presence:

hermes config path && hermes config env-path
python - <<'PY'
import yaml, json
from pathlib import Path
cfg=yaml.safe_load(Path('/home/avalon/.hermes/config.yaml').read_text()) or {}
print(json.dumps(cfg.get('web'), indent=2))
PY

In Alex's Hermes install, Firecrawl credits are tied to the FIRECRAWL_API_KEY in ~/.hermes/.env, not OpenAI/Codex/Telegram credits. Hermes Agent's tools/web_tools.py now has a direct no-service httpx fallback for web_extract when Firecrawl scraping fails, including credit/payment failures. That fallback can fetch static HTML/text and strip simple HTML to readable text, but it cannot recover content when the origin server itself returns an error page or hides data behind JS/auth/Cloudflare. For ChatGPT share links, verify the embedded payload; a direct fallback result like ChatGPT Internal Server Error means OpenAI's share endpoint failed, not that Firecrawl is still blocking.

Checking active vision and image-generation access

When asked what vision/image model Hermes is using or whether an image model is available, verify live config instead of answering from memory:

hermes status --all
hermes config path && hermes config env-path
python - <<'PY'
from agent.auxiliary_client import resolve_vision_provider_client
provider, client, model = resolve_vision_provider_client()
print({'provider': provider, 'model': model, 'client_type': type(client).__name__ if client else None})
PY

Interpretation notes: - auxiliary.vision.provider: auto can resolve to the main provider/model; on Codex this may show openai-codex + the active chat model (e.g. gpt-5.5) through CodexAuxiliaryClient. - Built-in tools/image_generation_tool.py is the FAL-backed image tool; absent image_gen.model, it defaults to fal-ai/flux-2/klein/9b. - The OpenAI direct image plugin requires OPENAI_API_KEY; do not infer direct API access from VOICE_TOOLS_OPENAI_KEY or Codex OAuth. - The plugins/image_gen/openai-codex provider can expose gpt-image-2 tiers (gpt-image-2-low|medium|high) through ChatGPT/Codex OAuth without OPENAI_API_KEY. Probe availability with:

python - <<'PY'
import importlib.util
from pathlib import Path
p=Path('plugins/image_gen/openai-codex/__init__.py')
spec=importlib.util.spec_from_file_location('openai_codex_img', p)
m=importlib.util.module_from_spec(spec); spec.loader.exec_module(m)
prov=m.OpenAICodexImageGenProvider()
print({'provider': prov.name, 'available': prov.is_available(), 'default_model': prov.default_model(), 'api_model': m.API_MODEL})
PY

Pitfall: after changing image_gen.provider/image_gen.model, the current gateway/agent session may still have an old image tool backend cached. Verify with an explicit plugin probe or a fresh session before trusting image_generate; in one session the tool still returned a FAL URL after config was changed to openai-codex, while a direct plugin invocation correctly used gpt-image-2. Restart/reset the gateway/session for Telegram-visible tool behavior, or call the plugin directly for verification-critical tests.

GPT Image 2 / Codex aspect-ratio pitfall: the provider can return pixel dimensions that differ from the requested size even when result metadata echoes the requested aspect ratio and dimensions. Verify the saved PNG with Pillow before promising exact dimensions. Observed live on 2026-07-22: nominal square requests returned 1223x1286 and 1024x1536, while another returned an exact square. Treat the result size field as request metadata, not proof of output dimensions; crop/reframe in post-processing when exact aspect ratio is required.

Direct xAI image-plugin pitfall: when invoking plugins/image_gen/xai directly (instead of through the normal image_generate delivery wrapper), a successful generate() may return a hosted https://files-cdn.x.ai/...png URL in the image field rather than a local path. Detect http:///https:// before wrapping the value with Path or calling copy2; download and validate the URL, or pass it through as media. Treating the URL as a filesystem path collapses https:// to https:/ and produces a misleading FileNotFoundError even though generation succeeded.

Model/provider issues

hermes doctor — check config and dependencies
hermes login — re-authenticate OAuth providers
Check .env has the right API key

Changes not taking effect

Tools/skills: /reset starts a new session with updated toolset
Config changes: /restart reloads gateway config
Code changes: Restart the CLI or gateway process

Skills not showing

hermes skills list — verify installed
hermes skills config — check platform enablement
Load explicitly: /skill name or hermes -s name

Gateway issues

Check logs first:

grep -i "failed to send\|error" ~/.hermes/logs/gateway.log | tail -20

Telegram says “Gateway restarting” / service stuck in auto-restart

If Telegram shows Gateway restarting, verify the service state and logs before assuming Telegram is broken:

hermes gateway status
systemctl --user show hermes-gateway.service -p ActiveState -p SubState -p Result -p ExecMainStatus -p NRestarts --no-pager
tail -n 120 ~/.hermes/logs/gateway.log
tail -n 120 ~/.hermes/logs/errors.log 2>/dev/null || true

Interpretation and recovery pattern: - Stopping gateway for restart... followed by Gateway drain timed out ... active agent(s) means Hermes intentionally sent restart/shutdown notices, waited for active sessions, then interrupted remaining work. If systemd shows activating (auto-restart) and Restart pending, wait through RestartSec or start/restart cleanly. - If a secondary platform repeatedly fails during startup/reconnect (for example Discord 401 Unauthorized, LoginFailure: Improper token has been passed, or 30s connect timeouts), it can make gateway startup noisy/slow even though Telegram is the platform Alex cares about. Fix the credential, or temporarily disable the platform by renaming the env var in ~/.hermes/.env (for example DISCORD_BOT_TOKEN_DISABLED=<old value>), keeping a timestamped .env backup first. - After editing .env, run hermes gateway restart, then verify: hermes gateway status, curl -fsS http://127.0.0.1:8642/health if API server is enabled, and fresh gateway.log lines showing ✓ telegram connected and no repeated failed-platform reconnect loop. - Do not retry identical blocked/safety-denied service commands. Switch to hermes gateway restart or systemctl --user start --no-block hermes-gateway.service and verify state after the restart delay. - Telegram image-send errors such as webpage_curl_failed, Wrong type of the web page content, or flood-control retries are usually delivery issues for generated media URLs, not gateway health failures.

Telegram + OpenAI Codex/OpenAI provider shows `Invalid API response shape`

If Telegram replies with: - ⚠️ Invalid API response shape. Likely rate limited or malformed provider response.

and logs show repeated lines like: - response.output is empty - Invalid API response after 3 retries

then the root cause may be a stale gateway process still running older Codex Responses handling code, not Telegram itself.

Use this checklist:

Confirm the active provider/config:

hermes gateway status
hermes config path
hermes config env-path

Inspect recent logs:

tail -n 80 ~/.hermes/logs/gateway.log
tail -n 80 ~/.hermes/logs/errors.log

Look specifically for: - response.output is empty - Invalid API response after 3 retries - Telegram response ready lines showing the error text was sent back to chat
Verify whether the current code path is already fixed by running a direct CLI probe with the same provider/model:

hermes chat -q "Reply with exactly: ok" --provider openai-codex -m gpt-5.4 -v

If that succeeds and debug output mentions backfilling/synthesizing Codex stream output, the code on disk is likely fine.

Refresh the running gateway process and systemd unit:

hermes gateway restart

This is especially important if hermes gateway status says the installed service definition is outdated.

Interpretation: - CLI probe fails too -> active code path is still broken; inspect run_agent.py / provider handling. - CLI probe works but Telegram had failed earlier -> gateway was likely stale and needed restart.

Known implementation detail: - Hermes may recover empty Codex response.output by backfilling from streamed response.output_item.done events or synthesizing from text deltas. - If those recoveries are present in run_agent.py, but Telegram still shows the old error, restart the gateway before changing code.

Where to Find Things

Looking for...	Location
Config options	`hermes config edit` or Configuration docs
Available tools	`hermes tools list` or Tools reference
Slash commands	`/help` in session or Slash commands reference
Skills catalog	`hermes skills browse` or Skills catalog
Provider setup	`hermes model` or Providers guide
Platform setup	`hermes gateway setup` or Messaging docs
MCP servers	`hermes mcp list` or MCP guide
Profiles	`hermes profile list` or Profiles docs
Cron jobs	`hermes cron list` or Cron docs
Memory	`hermes memory status` or Memory docs
Env variables	`hermes config env-path` or Env vars reference
CLI commands	`hermes --help` or CLI reference
Gateway logs	`~/.hermes/logs/gateway.log`
Session files	`~/.hermes/sessions/` or `hermes sessions browse`
Source code	`~/.hermes/hermes-agent/`

Contributor Quick Reference

For occasional contributors and PR authors. Full developer docs: https://hermes-agent.nousresearch.com/docs/developer-guide/

Project Layout

hermes-agent/
├── run_agent.py          # AIAgent — core conversation loop
├── model_tools.py        # Tool discovery and dispatch
├── toolsets.py           # Toolset definitions
├── cli.py                # Interactive CLI (HermesCLI)
├── hermes_state.py       # SQLite session store
├── agent/                # Prompt builder, compression, display, adapters
├── hermes_cli/           # CLI subcommands, config, setup, commands
│   ├── commands.py       # Slash command registry (CommandDef)
│   ├── config.py         # DEFAULT_CONFIG, env var definitions
│   └── main.py           # CLI entry point and argparse
├── tools/                # One file per tool
│   └── registry.py       # Central tool registry
├── gateway/              # Messaging gateway
│   └── platforms/        # Platform adapters (telegram, discord, etc.)
├── cron/                 # Job scheduler
├── tests/                # ~3000 pytest tests
└── website/              # Docusaurus docs site

Config: ~/.hermes/config.yaml (settings), ~/.hermes/.env (API keys).

Adding a Tool (3 files)

1. Create tools/your_tool.py:

import json, os
from tools.registry import registry

def check_requirements() -> bool:
    return bool(os.getenv("EXAMPLE_API_KEY"))

def example_tool(param: str, task_id: str = None) -> str:
    return json.dumps({"success": True, "data": "..."})

registry.register(
    name="example_tool",
    toolset="example",
    schema={"name": "example_tool", "description": "...", "parameters": {...}},
    handler=lambda args, **kw: example_tool(
        param=args.get("param", ""), task_id=kw.get("task_id")),
    check_fn=check_requirements,
    requires_env=["EXAMPLE_API_KEY"],
)

2. Add import in model_tools.py → _discover_tools() list.

3. Add to toolsets.py → _HERMES_CORE_TOOLS list.

All handlers must return JSON strings. Use get_hermes_home() for paths, never hardcode ~/.hermes.

Adding a Slash Command

Add CommandDef to COMMAND_REGISTRY in hermes_cli/commands.py
Add handler in cli.py → process_command()
(Optional) Add gateway handler in gateway/run.py

All consumers (help text, autocomplete, Telegram menu, Slack mapping) derive from the central registry automatically.

Agent Loop (High Level)

run_conversation():
  1. Build system prompt
  2. Loop while iterations < max:
     a. Call LLM (OpenAI-format messages + tool schemas)
     b. If tool_calls → dispatch each via handle_function_call() → append results → continue
     c. If text response → return
  3. Context compression triggers automatically near token limit

Testing

source venv/bin/activate  # or .venv/bin/activate
python -m pytest tests/ -o 'addopts=' -q   # Full suite
python -m pytest tests/tools/ -q            # Specific area

Tests auto-redirect HERMES_HOME to temp dirs — never touch real ~/.hermes/
Run full suite before pushing any change
Use -o 'addopts=' to clear any baked-in pytest flags

Commit Conventions

type: concise subject line

Optional body.

Types: fix:, feat:, refactor:, docs:, chore:

Key Rules

Never break prompt caching — don't change context, tools, or system prompt mid-conversation
Message role alternation — never two assistant or two user messages in a row
Use get_hermes_home() from hermes_constants for all paths (profile-safe)
Config values go in config.yaml, secrets go in .env
New tools need a check_fn so they only appear when requirements are met

hermes-agent

Hermes Agent

Quick Start

CLI Reference

Global Flags

Chat

Configuration

Tools & Skills

MCP Servers

Gateway (Messaging Platforms)

Telegram bot provisioning

Interactive approval buttons vary by platform

Gateway interactive approval/button support

Hermes Workspace web UI deployment

Sessions

Cron Jobs

Webhooks

Profiles

Credential Pools

Other

Slash Commands (In-Session)

Session Control

Configuration

Tools & Skills

Info

Exit

Key Paths & Config

Config Sections

Providers

Direct OpenAI API Keys as a Custom Endpoint

Venice as a Custom Endpoint

Toolsets

Voice & Transcription

STT (Voice → Text)

Practical Telegram voice-note handling

TTS (Text → Voice)

Spawning Additional Hermes Instances

When to Use This vs delegate_task

One-Shot Mode

Interactive PTY Mode (via tmux)

Multi-Agent Coordination

Project Swarm via Kanban + Profiles

Session Resume

Tips

Troubleshooting

Voice not working

Tool not available

Web extraction service credits / no-service fallback

Checking active vision and image-generation access

Model/provider issues

Changes not taking effect

Skills not showing

Gateway issues

Telegram says “Gateway restarting” / service stuck in auto-restart

Telegram + OpenAI Codex/OpenAI provider shows Invalid API response shape

Where to Find Things

Contributor Quick Reference

Project Layout

Adding a Tool (3 files)

Adding a Slash Command

Agent Loop (High Level)

Testing

Commit Conventions

Key Rules

Telegram + OpenAI Codex/OpenAI provider shows `Invalid API response shape`