Stop Routers from Sneaking Bad Commands into Your Agent

At a Glance

Make the router unable to see plaintext: running only the request/response path inside a provable enclave prevents tool-call tampering and secret exfiltration while keeping provider routing intact.

What They Found

Putting just the data path of an API router inside a verifiable hardware enclave, and requiring the client to check that enclave before sending plaintext, removes the operator’s ability to rewrite tool calls or scan requests for secrets. The small, auditable enclave image forwards bytes verbatim to a pinned provider and reports only usage to the host, shrinking the trusted code to what is necessary. The system, called Aegis, was implemented on a mainstream cloud enclave platform, formally analyzed, and empirically shown to block all four malicious-router attack classes with only millisecond-level overhead. For related guidance on routing tasks securely, see Dynamic Task Routing Pattern.
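The client-side check described above can be sketched roughly as follows. This is a minimal illustration, not Aegis's actual client: `AttestationDocument`, `release_body_if_verified`, and the `signature_valid` flag are hypothetical names, and a real verifier would parse and validate the platform's signed attestation document and certificate chain rather than trust a boolean.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AttestationDocument:
    """Hypothetical, simplified stand-in for a platform attestation document."""
    image_measurement: bytes  # hash of the enclave image as reported by the hardware
    signature_valid: bool     # assumed outcome of certificate-chain verification


class AttestationError(Exception):
    """Raised when the enclave cannot be verified."""


def release_body_if_verified(
    doc: AttestationDocument,
    expected_measurement: bytes,
    body: bytes,
    send: Callable[[bytes], object],
) -> object:
    """Hand `body` to `send` only after both attestation checks pass."""
    if not doc.signature_valid:
        raise AttestationError("attestation signature or certificate rejected")
    # Constant-time comparison against the measurement of the audited image.
    if not hmac.compare_digest(doc.image_measurement, expected_measurement):
        raise AttestationError("measurement does not match the audited image")
    return send(body)


# Usage: the plaintext is released for the audited image and withheld otherwise.
expected = hashlib.sha384(b"audited-enclave-image").digest()
good = AttestationDocument(expected, True)
bad = AttestationDocument(hashlib.sha384(b"tampered-image").digest(), True)

sent = []
release_body_if_verified(good, expected, b'{"prompt": "hi"}', sent.append)
try:
    release_body_if_verified(bad, expected, b'{"prompt": "secret"}', sent.append)
except AttestationError:
    pass  # the secret-bearing body never left the client
```

The point of the design is ordering: the body is an argument that is only passed to `send` after verification, so a failed check leaves no path by which plaintext reaches the router.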

Key Data

1. Median relay overhead for small request bodies: ~5.7 ms through the enclave path, versus ~3.7 ms for the identical relay running as a plain host process.

2. End-to-end median latencies for real provider workloads: 902 ms to 1,175 ms across providers in tests.

3. Per-request streaming overhead: median ~5.3 ms to first byte and ~19.5 ms to first token compared to direct upstream.

Implications

Engineers building agents that execute external tools (such as coding assistants) can use this design to stop dangerous command tampering and supply-chain swaps. Security and platform teams that host or evaluate API routers can offer provable trust without forcing provider changes. Teams that audit agent behavior can use enclave-backed routers to get stronger integrity guarantees for agent interactions. For governance and collaboration considerations, see Handoff Pattern and Human-in-the-Loop Pattern.

Key Figures

Figure 4: Steady-state relay overhead against a local upstream, for the protected path and for the identical relay running as a plain host process. (a) Overhead for a small body, medians 5.7 and 3.7 ms. (b) Median overhead against body size; the curve gap is the enclave term. (c) Its decomposition; the enclave term overtakes the relay machinery between 10 and 100 KB.

Figure 5: Real-provider workloads and streaming through the verified path. (a) End-to-end latency CDFs, 100 provider-native requests per provider, medians 902 to 1175 ms. (b) Median (solid) and 95th-percentile (dashed) latency against concurrency; all 600 requests succeed, each median within 20% of its single-request value. (c) Paired per-request overhead on a fixed 256-token streamed completion, median 5.3 ms to the first byte and 19.5 ms to the first token. (d) Inter-chunk intervals through Aegis and direct coincide (medians 3.5 and 3.0 ms).

Yes, But...

Some metadata still leaks: sizes, timing, cadence, account, provider, and usage remain observable and can be exploited (record-size reconstruction is possible). The current prototype relies on the cloud platform’s networking model, in which the host performs outbound networking, so control over network egress below the enclave image is limited. Attestation is not absolute: hardware vulnerabilities, certificate authority compromise, or gaps in machine-checked non-interference could weaken the guarantees and would require follow-up hardening such as key pinning or traffic padding. See also Model Context Protocol (MCP) Pattern.
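To make the traffic-padding idea concrete, here is a minimal sketch of bucketed padding, which hides exact record sizes by rounding every message up to one of a few fixed lengths. This is not part of the Aegis prototype as described; the bucket sizes, the 4-byte length prefix, and the function names `pad_to_bucket` and `unpad` are all illustrative assumptions.

```python
import struct

# Hypothetical bucket sizes in bytes; a real deployment would tune these.
BUCKETS = (256, 1024, 4096, 16384, 65536)
HEADER = struct.Struct(">I")  # 4-byte big-endian length prefix


def pad_to_bucket(payload: bytes) -> bytes:
    """Prefix the true length, then zero-pad up to the smallest fitting bucket."""
    framed_len = HEADER.size + len(payload)
    for bucket in BUCKETS:
        if framed_len <= bucket:
            padding = b"\x00" * (bucket - framed_len)
            return HEADER.pack(len(payload)) + payload + padding
    raise ValueError("payload exceeds the largest bucket")


def unpad(padded: bytes) -> bytes:
    """Recover the original payload from a padded record."""
    (length,) = HEADER.unpack(padded[:HEADER.size])
    if length > len(padded) - HEADER.size:
        raise ValueError("length prefix exceeds record size")
    return padded[HEADER.size:HEADER.size + length]


# Usage: a 10-byte and a 200-byte message become indistinguishable by size.
short = pad_to_bucket(b"a" * 10)
longer = pad_to_bucket(b"b" * 200)
```

An observer on the host then sees only the bucket, not the exact length. Padding addresses size leakage only; timing and cadence would need separate shaping, and padding must be applied before encryption to be useful.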

Methodology & More

Aegis changes who can see and modify agent-provider traffic by splitting the router. Everything that touches plaintext (the data path) runs inside a small, reproducibly built enclave; authentication, account selection, billing, and scheduling stay on the untrusted host but never carry the plaintext. A client-side verifier refuses to release the request body until it has cryptographically verified the enclave’s measurement and certificate, so as long as the client enforces attestation, the operator cannot read, alter, or redirect requests. The design pins the destination set inside the enclave image so the host cannot change request routing. Blackboard Pattern can illustrate how different components contribute to a shared state without leaking plaintext, and Agentic RAG Pattern can guide how agent capabilities are integrated with trusted routing.

The team formalized the confidentiality, faithful relay, and destination confinement claims in game-based proofs and a symbolic model; implemented the split on a widely available cloud enclave platform; and tested Aegis against the four classes of malicious-router attacks (rewrites, typosquat swaps, trigger-gated behavior, and passive secret exfiltration). All four attacks succeed against a plaintext router but are blocked under Aegis. Performance costs are modest (single-digit milliseconds added for relay start and small increases in streaming latency), and provider-native routing is preserved.

Remaining work includes machine-checking the full non-interference property, stronger defenses against size and timing leakage (padding or shaping), and hardening against certificate authority or platform-level risks. Dynamic Task Routing Pattern may inform future routing refinements, and Model Context Protocol (MCP) Pattern can help coordinate contextual constraints across the system.
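The destination pinning and usage-only reporting described in this section can be sketched as below. This is an illustration under stated assumptions rather than the Aegis implementation: the hostnames, `PINNED_DESTINATIONS`, `UsageReport`, and `relay` are hypothetical, and the upstream connection is abstracted as a callable so the example runs without a network.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical destination set, fixed when the enclave image is built.
PINNED_DESTINATIONS = frozenset({"api.provider-a.example", "api.provider-b.example"})


@dataclass(frozen=True)
class UsageReport:
    """The only data handed back to the untrusted host: counts, no content."""
    destination: str
    request_bytes: int
    response_bytes: int


class DestinationNotPinned(Exception):
    """Raised when the host asks for a destination outside the pinned set."""


def relay(
    destination: str,
    request_body: bytes,
    upstream: Callable[[str, bytes], bytes],
) -> Tuple[bytes, UsageReport]:
    """Forward the body verbatim to a pinned destination and summarize usage."""
    if destination not in PINNED_DESTINATIONS:
        raise DestinationNotPinned(destination)
    response_body = upstream(destination, request_body)  # bytes pass through unmodified
    report = UsageReport(destination, len(request_body), len(response_body))
    return response_body, report


# Usage with a fake upstream that records what it receives.
received = []

def fake_upstream(host: str, body: bytes) -> bytes:
    received.append((host, body))
    return b'{"ok": true}'

response, report = relay("api.provider-a.example", b'{"tool": "ls"}', fake_upstream)
try:
    relay("api.provider-a.exampIe", b'{"tool": "ls"}', fake_upstream)  # typosquat
except DestinationNotPinned:
    pass
```

In this sketch the host can still choose among pinned providers, which matches the claim that provider routing is preserved, but a look-alike hostname is rejected before any bytes are forwarded.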

Credibility Assessment:

No affiliations provided and low author h-indices (≤4); arXiv preprint with no citations — limited signals of established credibility.
