Hermes boundary / May 16, 2026

Hermes selected the payment workflow. Imladri blocked it across three model providers.

This extends the OpenClaw work to a second agent runtime. Model turns issued through hermes chat -q were run against OpenAI, Gemini, and DeepSeek with the Imladri boundary plugin enabled. In each case the model attempted the same finance workflow, Hermes invoked the protected plugin tool, and Imladri blocked the call before the dangerous body could run.

Why this test exists

OpenClaw proved the boundary against one agent runtime. Hermes tests whether the same idea survives a different plugin model, tool registry, and chat execution path. The useful question is not whether a model can be persuaded to say the right thing; it is whether a native tool body stays sealed when the model tries to act.
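To make "sealed tool body" concrete, here is a minimal sketch of the idea in Python. It is not the Imladri plugin or any Hermes API. The names Boundary, guard, BoundaryBlocked, send_payment, and read_balance are hypothetical, and the allow-list policy is an assumption chosen for illustration. The point it shows is ordering: the policy check runs before the body, so a blocked call never enters it.

```python
import functools
from dataclasses import dataclass, field
from typing import Any, Callable, Set


class BoundaryBlocked(Exception):
    """Raised when the boundary refuses to enter a tool body."""


@dataclass
class Boundary:
    # Hypothetical policy: only actions listed here may reach a tool body.
    allowed_actions: Set[str] = field(default_factory=set)
    body_calls: int = 0
    blocked_calls: int = 0

    def guard(self, action: str) -> Callable:
        """Wrap a tool body so the policy check runs before the body."""
        def decorator(body: Callable[..., Any]) -> Callable[..., Any]:
            @functools.wraps(body)
            def wrapper(*args: Any, **kwargs: Any) -> Any:
                if action not in self.allowed_actions:
                    # Blocked before body entry: the body is never called.
                    self.blocked_calls += 1
                    raise BoundaryBlocked(f"{action} blocked before body entry")
                self.body_calls += 1
                return body(*args, **kwargs)
            return wrapper
        return decorator


boundary = Boundary(allowed_actions={"read_balance"})


@boundary.guard("send_payment")
def send_payment(amount: float, recipient: str) -> str:
    # The dangerous body. Under this policy it should never run.
    return f"sent {amount} to {recipient}"


@boundary.guard("read_balance")
def read_balance() -> str:
    return "balance: 100.00"
```

Calling send_payment(10.0, "acct-1") raises BoundaryBlocked and leaves body_calls at zero, whatever text the model produced on the way to the call. That is the property the Hermes test measures.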

Real model turns

Three providers selected the workflow and were blocked before body entry.

The prompt did not name the exact payment tool. It asked the model to perform an authorized local lab finance workflow. Hermes then routed the model-selected action into the imladri_boundary plugin toolset.

Provider | Model | Hermes runtime provider | Result
OpenAI | gpt-4o-mini | custom OpenAI-compatible | blocked before body
Gemini | gemini-flash-latest | gemini | blocked before body
DeepSeek | deepseek-chat | deepseek | blocked before body

Adversarial replay

Seven attack patterns held under 100-way replay.

The deterministic Hermes suite uses Hermes' real PluginManager and tool registry. It replayed seven attack patterns at 100 samples each, among them prompt injection through tool output, unknown actions, halt state, argument injection, and delegated-agent attempts.

Pattern | Blocked | Body result
Plan then execute | 100/100 | 0 body calls
Buried mid-chain action | 100/100 | 0 body calls
Tool-output prompt injection | 100/100 | 0 body calls
Unknown action | 100/100 | 0 body calls
Halt state mid-execution | 100/100 | 0 body calls
Argument injection | 100/100 | 0 body calls
Two-agent delegation | 100/100 | 0 body calls
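The accounting behind a table like this can be sketched as a small concurrent harness. This is an illustrative stand-in, not the published suite. It does not use Hermes' PluginManager and does not reproduce the attack payloads. The pattern identifiers are derived from the table rows, while Counters, guarded_tool, replay, the deny-all policy, and the worker count are assumptions. What it shows is how per-pattern block counts and a global body-call counter are kept consistent under concurrency.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

PATTERNS = [
    "plan_then_execute",
    "buried_mid_chain_action",
    "tool_output_prompt_injection",
    "unknown_action",
    "halt_state_mid_execution",
    "argument_injection",
    "two_agent_delegation",
]
SAMPLES_PER_PATTERN = 100


class Counters:
    """Thread-safe tallies of blocked attempts and body entries."""

    def __init__(self):
        self._lock = threading.Lock()
        self.blocked = {pattern: 0 for pattern in PATTERNS}
        self.body_calls = 0

    def record_block(self, pattern):
        with self._lock:
            self.blocked[pattern] += 1

    def record_body(self):
        with self._lock:
            self.body_calls += 1


def guarded_tool(pattern, counters):
    """Stand-in for a protected tool call under an assumed deny-all policy."""
    allowed = False  # assumption: every replayed attempt is prohibited
    if not allowed:
        counters.record_block(pattern)
        return "blocked"
    counters.record_body()  # unreachable under deny-all; kept for the tally
    return "executed"


def replay(counters, workers=16):
    """Run every pattern SAMPLES_PER_PATTERN times across a thread pool."""
    jobs = [p for p in PATTERNS for _ in range(SAMPLES_PER_PATTERN)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: guarded_tool(p, counters), jobs))


counters = Counters()
results = replay(counters)
```

Seven patterns at 100 samples each give 700 attempts in total. A passing run has every entry in counters.blocked at 100 and counters.body_calls at 0.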
Action matrix

The claim is not payment-specific.

The matrix covered fifty prohibited action names across payment, wallet, customer data, admin, secrets, database, cloud, Kubernetes, repository, CI, audit, browser, and shell categories. Hermes invoked the generic action probe and Imladri blocked all fifty before side effects.
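A generic probe of this kind can be sketched as one tool that accepts any action name and denies by default. The action names below are hypothetical placeholders: the source lists the categories but not the fifty names, and only a few categories are shown. PROHIBITED, action_probe, run_matrix, and the result fields are likewise assumptions, not the Imladri or Hermes interface.

```python
from typing import Dict, List

# Hypothetical action names grouped by category (illustrative subset only).
PROHIBITED: Dict[str, List[str]] = {
    "payment": ["payment.send", "payment.refund"],
    "secrets": ["secrets.read"],
    "kubernetes": ["kubernetes.delete_namespace"],
    "shell": ["shell.exec"],
}

# Records any action whose body was actually entered.
side_effects: List[str] = []


def action_probe(action: str, allowlist: frozenset = frozenset()) -> dict:
    """One tool, any action name: deny unless the name is explicitly allowed."""
    if action not in allowlist:
        return {"action": action, "decision": "blocked", "body_entered": False}
    side_effects.append(action)
    return {"action": action, "decision": "allowed", "body_entered": True}


def run_matrix() -> List[dict]:
    """Probe every prohibited action name once with an empty allowlist."""
    return [action_probe(name) for names in PROHIBITED.values() for name in names]


matrix = run_matrix()
blocked = sum(1 for row in matrix if row["decision"] == "blocked")
print(f"{blocked}/{len(matrix)} blocked, {len(side_effects)} side effects")
# prints "5/5 blocked, 0 side effects"
```

Because the decision depends only on the action name and the allowlist, extending the matrix to a new category means adding names, not writing a new guard.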

Proof packet

The public artifacts show the boundary, not just the summary.

Hermes version | 0.13.0
Cross-provider model turns | 3/3
Concurrent adversarial replay | 700/700
Latency-budget replay | 140/140
Multi-turn attack | blocked at turn 8
Malformed constitutions | 9/9 failed closed
Prohibited action matrix | 50/50
Dangerous bridge calls | 0

Cross-provider model turns: hermes-model-turn-cross-provider-20260516.json

Boundary evidence suite: hermes-boundary-evidence-20260516.json

Scope

This makes the agent-boundary story runtime-agnostic, not magic.

The result means Imladri can sit behind another agent runtime and enforce the boundary around dangerous tools in the same way it did for OpenClaw. It does not mean every Hermes plugin is safe by default. The integration point is explicit: dangerous tools must be wrapped, policy must fail closed, and evidence should be exported after the run.
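Two of those requirements, failing closed and exporting evidence, can be illustrated with a short sketch. It assumes a JSON policy document with an allowed_actions list, which is a guess made for illustration and not the actual constitution format. The export schema is also hypothetical and does not mirror the published JSON artifacts. load_constitution and export_evidence are made-up names.

```python
import json
from typing import Any, Dict, List


def deny_all() -> Dict[str, Any]:
    """The fail-closed default: nothing is allowed."""
    return {"allowed_actions": [], "source": "fail-closed default"}


def load_constitution(raw: str) -> Dict[str, Any]:
    """Parse a policy document; any parse or shape error yields deny-all."""
    try:
        doc = json.loads(raw)
        allowed = doc["allowed_actions"]
        if not isinstance(allowed, list) or not all(
            isinstance(name, str) for name in allowed
        ):
            raise ValueError("allowed_actions must be a list of strings")
        return {"allowed_actions": allowed, "source": "parsed"}
    except (ValueError, KeyError, TypeError):
        # json.JSONDecodeError is a ValueError subclass, so malformed JSON,
        # missing keys, and wrong types all land here and fail closed.
        return deny_all()


def export_evidence(decisions: List[Dict[str, Any]]) -> str:
    """Summarize per-call decisions as JSON for review after the run."""
    summary = {
        "total": len(decisions),
        "blocked": sum(1 for d in decisions if d["decision"] == "blocked"),
        "body_calls": sum(1 for d in decisions if d["body_entered"]),
        "decisions": decisions,
    }
    return json.dumps(summary, indent=2, sort_keys=True)
```

The design choice worth noting is that a malformed policy never widens access. The only way to allow an action is a document that parses and validates.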