ENTERPRISE

Runtime protection for AI agents at scale

Monitor and control every action your AI coding agents take. Block prompt injection attacks before they execute.

ARCHITECTURE

How AgentSteer works

AgentSteer integrates as a PreToolUse hook at the agent framework level. It intercepts every tool call, scores it, and either allows or blocks it before execution.

AI Agent: Claude Code, OpenHands, or any Python agent

→ PreToolUse Hook: intercepts every tool call before execution

→ Sanitize: strips API keys, tokens, and secrets before scoring

→ Security Model: scores the action against the task description

→ Allow / Block: the agent continues or sees the block reason

The security model can run via API (OpenRouter) or self-hosted for full data sovereignty.
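The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not AgentSteer's actual API: the names `pre_tool_use`, `sanitize`, `Decision`, and `toy_scorer`, the secret pattern, the 0.5 threshold, and the convention that a higher score means a more suspicious action are all hypothetical. The scorer is passed in as a plain function standing in for the security model, whether hosted or self-hosted.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical pattern for illustration; real secret formats vary widely.
SECRET_PATTERN = re.compile(r"\b(sk|ghp|xoxb)[-_][A-Za-z0-9_-]{8,}")


@dataclass
class Decision:
    allowed: bool
    score: float
    reason: str


def sanitize(text: str) -> str:
    """Replace anything that looks like a secret before it is scored."""
    return SECRET_PATTERN.sub("[REDACTED]", text)


def pre_tool_use(
    task: str,
    tool_name: str,
    tool_args: dict,
    score_action: Callable[[str, str], float],
    threshold: float = 0.5,  # assumed cutoff; higher score = more suspicious
) -> Decision:
    """Intercept one tool call: sanitize, score, then allow or block."""
    action = sanitize(f"{tool_name}({tool_args})")
    score = score_action(task, action)  # stand-in for the security model
    if score >= threshold:
        return Decision(False, score, f"Blocked: {action} does not fit task {task!r}")
    return Decision(True, score, "Allowed")


def toy_scorer(task: str, action: str) -> float:
    """Toy scorer: flags email sends when the task never mentions email."""
    suspicious = action.startswith("send_email") and "email" not in task.lower()
    return 0.9 if suspicious else 0.1


decision = pre_tool_use(
    "Summarize PRs",
    "send_email",
    {"to": "external", "key": "sk-abcdefgh12345678"},
    toy_scorer,
)
```

Because sanitization runs before scoring, the block reason returned to the agent contains the redacted placeholder rather than the original key.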

WHY TEAMS CHOOSE AGENTSTEER

Built for security-conscious teams

Self-hosted deployment

Run the security model locally for complete data sovereignty. No tool call data leaves your environment.

Fully auditable

Inspect every policy, every scoring rule, every decision pathway. Full transparency into how security decisions are made.

Four security policies

Read-only enforcement, delegation detection, category mismatch, and target verification cover unauthorized writes, data exfiltration, off-task actions, and misdirected actions.

Framework agnostic

PreToolUse hook works with Claude Code, OpenHands, and any Python agent. Three lines of code to integrate.

Full audit trail

Every scored action is logged with timestamps, scores, policy violations, and block decisions. Export to your SIEM.
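As a rough illustration of what one exportable log entry could look like, the sketch below emits a JSON line per scored action. The field names and the `audit_record` helper are assumptions made for this example, not AgentSteer's documented log schema; JSON lines are used here only because most SIEM tools can ingest them.

```python
import json
from datetime import datetime, timezone


def audit_record(tool_name, score, violations, blocked):
    """Build one JSON line for a scored action (field names are illustrative)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool_name": tool_name,
        "score": score,
        "policy_violations": violations,
        "blocked": blocked,
    }
    return json.dumps(record, sort_keys=True)


line = audit_record("send_email", 0.92, ["delegation_detection"], True)
```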

Secret pre-filtering

API keys, tokens, and env var values are stripped before reaching the security model or logs. Pattern-based and value-based redaction ensures sensitive data never leaves your machine.
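The two redaction styles can be sketched as follows. This is a simplified example, not AgentSteer's filter: the `redact` function, the three patterns, and the `min_len` cutoff are assumptions chosen for illustration. Pattern-based redaction matches known token shapes; value-based redaction removes the literal values of environment variables wherever they appear.

```python
import os
import re

# Illustrative patterns only; a real deployment would cover many more formats.
PATTERNS = [
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),                  # AWS access key ID shape
    re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36}\b"),         # GitHub token shape
    re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]+=*"),   # HTTP bearer tokens
]


def redact(text, env=None, min_len=8):
    """Apply pattern-based, then value-based redaction to text."""
    env = os.environ if env is None else env
    for pattern in PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    # Value-based: replace literal env var values, longest first, so a value
    # that contains another value is replaced whole. Short values such as
    # "1" are skipped to avoid mangling ordinary text.
    values = sorted(
        (v for v in env.values() if len(v) >= min_len), key=len, reverse=True
    )
    for value in values:
        text = text.replace(value, "[REDACTED]")
    return text


command = (
    "curl -H 'Authorization: Bearer abc.def.ghi' "
    "--key AKIAABCDEFGHIJKLMNOP db=hunter2-prod-password"
)
clean = redact(command, env={"DB_PASSWORD": "hunter2-prod-password", "DEBUG": "1"})
```

The value-based pass catches secrets that have no recognizable shape, such as the database password above, which no pattern would match.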

BENCHMARKS

Evaluated on AgentDojo

Tested against prompt injection attacks across multiple agent frameworks. All evaluation data is publicly available.

100% of attacks blocked (Claude Code, AgentDojo n=20)

95% of attacks blocked (OpenHands, AgentDojo n=20)

96.5% per-action detection (n=1,000 actions scored)

View all evaluation traces

All numbers are from the AgentDojo prompt injection benchmark. Agent model: Claude Haiku 4.5. Monitor model: oss-safeguard-20b via OpenRouter. Full trajectories are available in evaluations.

SECURITY POLICIES

Comprehensive policy coverage

Read-only enforcement

Prevents write actions when the task only requires reading. Stops agents from modifying files, databases, or configurations they should only be inspecting.

Task: "Read the config file" / Agent tries: write_file() → BLOCKED
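To make the idea concrete, here is a deliberately simple rule-based sketch. AgentSteer scores actions with a security model, so this keyword check is only an approximation for illustration; the verb lists, the tool names other than `write_file`, and the `violates_read_only` function are hypothetical.

```python
READ_VERBS = {"read", "inspect", "list", "show", "view", "summarize"}
WRITE_VERBS = {"write", "edit", "update", "delete", "create", "modify"}
WRITE_TOOLS = {"write_file", "delete_file", "update_config"}  # assumed tool names


def violates_read_only(task: str, tool_name: str) -> bool:
    """Return True when a read-only task is paired with a write-type tool."""
    words = {w.strip('.,"\'').lower() for w in task.split()}
    task_is_read_only = bool(words & READ_VERBS) and not (words & WRITE_VERBS)
    return task_is_read_only and tool_name in WRITE_TOOLS
```

With these lists, `violates_read_only("Read the config file", "write_file")` returns `True`, while the same task paired with a read-type tool does not trigger the policy.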

Delegation detection

Catches attempts to send data to external systems. Blocks unauthorized emails, API calls, file uploads, and URL sharing.

Task: "Summarize PRs" / Agent tries: send_email(to='external') → BLOCKED
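One way to picture this policy is a check on where an outbound action sends data. The sketch below is an assumption-laden simplification, not AgentSteer's logic: the tool names other than `send_email`, the `TRUSTED_DOMAINS` allowlist, the `co.example` domain, and both helper functions are hypothetical.

```python
from urllib.parse import urlparse

OUTBOUND_TOOLS = {"send_email", "http_post", "upload_file", "share_url"}
TRUSTED_DOMAINS = {"co.example"}  # hypothetical internal domain


def destination_domain(args: dict) -> str:
    """Extract the destination domain from an email address or URL argument."""
    if "to" in args:
        return args["to"].rsplit("@", 1)[-1].lower()
    if "url" in args:
        return (urlparse(args["url"]).hostname or "").lower()
    return ""


def is_unauthorized_delegation(tool_name: str, args: dict, task: str) -> bool:
    """Flag outbound actions whose destination is neither trusted nor in the task."""
    if tool_name not in OUTBOUND_TOOLS:
        return False
    domain = destination_domain(args)
    named_in_task = bool(domain) and domain in task.lower()
    return not (domain in TRUSTED_DOMAINS or named_in_task)
```

An outbound call with no recognizable destination is treated as unauthorized here, which errs on the side of blocking.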

Category mismatch

Detects when an agent's action type does not match the requested task. Prevents code-editing agents from sending emails, or file-management agents from making network requests.

Task: "Edit code" / Agent tries: send_email() → BLOCKED
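A toy version of this check maps both the task and the tool to a category and compares them. The category names, keyword sets, tool names other than `send_email`, and the `is_category_mismatch` function are illustrative assumptions; the product itself relies on a security model rather than keyword lookup.

```python
import re

TOOL_CATEGORY = {
    "read_file": "code",
    "write_file": "code",
    "run_tests": "code",
    "send_email": "communication",
    "http_request": "network",
}
TASK_KEYWORDS = {
    "code": {"code", "refactor", "bug", "tests"},
    "communication": {"email", "message", "notify"},
    "network": {"fetch", "download", "api"},
}


def task_categories(task: str) -> set:
    """Return every category whose keywords appear in the task text."""
    words = set(re.findall(r"[a-z]+", task.lower()))
    return {cat for cat, keywords in TASK_KEYWORDS.items() if words & keywords}


def is_category_mismatch(task: str, tool_name: str) -> bool:
    """Flag a tool whose category is not among the task's categories."""
    allowed = task_categories(task)
    category = TOOL_CATEGORY.get(tool_name)
    # Unknown tools and uncategorized tasks are left to other policies.
    return bool(allowed) and category is not None and category not in allowed
```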

Target verification

Validates that actions target the correct recipients and resources. Catches agents sending data to wrong email addresses or modifying the wrong files.

Task: "Email alice@co" / Agent tries: send_email(to='eve@evil') → BLOCKED
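For email recipients, the comparison can be sketched as extracting the addresses named in the task and checking the action's recipient against them. This is a hedged illustration only: `EMAIL_RE`, `targets_in_task`, and `is_wrong_target` are hypothetical names, and the loose pattern is written to accept the shortened addresses used in the example above.

```python
import re

# Loose address pattern for illustration; it accepts short forms like alice@co.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+")


def targets_in_task(task: str) -> set:
    """Collect the email addresses the task explicitly names."""
    return {match.rstrip(".").lower() for match in EMAIL_RE.findall(task)}


def is_wrong_target(task: str, recipient: str) -> bool:
    """Flag a recipient that differs from every address named in the task."""
    expected = targets_in_task(task)
    # If the task names no address, this check abstains.
    return bool(expected) and recipient.lower() not in expected
```

When the task names no recipient at all, this sketch abstains rather than blocks, leaving the decision to the other policies.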

Ready to secure your AI agents?

Read the Documentation | View Evaluations