Agentic AI is crossing a threshold. For the past two years, most enterprise AI projects were bounded: a model answered questions, a human reviewed the output, and nothing happened autonomously. That boundary is dissolving. Agents now plan multi-step workflows, invoke tools, write to systems of record, trigger approvals, and take real-world actions with or without a human in the loop for each step.
This shift is creating a new category of security problem, and the industry is beginning to respond. In early 2026, NIST's Center for AI Standards and Innovation issued a Request for Information asking developers, deployers, and security researchers how autonomous AI systems should be secured. The responses, including detailed submissions from major cloud providers, are shaping what will become the first generation of industry standards for agentic AI security.
Those responses are important and worth reading. But they are largely written from the perspective of cloud infrastructure providers: how to secure the model execution environment, the underlying compute, the network boundary, and the authentication layer. That framing is correct as far as it goes. AWS's published principles, for instance, provide sound architectural guidance for organizations building agentic services on cloud infrastructure.
The enterprise operator, however, faces a different version of the problem.
When agents move from sandboxed environments into production enterprise workflows (touching CRM records, triggering provisioning requests, coordinating across teams, handling sensitive data), the security challenge shifts from "how do we secure the infrastructure running the agent" to "how do we ensure the agent behaves correctly, consistently, and within policy across thousands of actions we can't individually review." That is an operations problem as much as an infrastructure problem. It requires what we call an agentic harness: the structured layer of governance, policy enforcement, and workflow formalization that constrains agent behavior without removing its capability, the way a harness enables work while preventing falls.
What follows is our attempt to articulate those principles: five foundations for securing agentic AI in the enterprise, grounded in the realities of production deployment rather than infrastructure design.
Before the principles, the context. Most discussions of agentic AI security focus on adversarial attacks: prompt injection, model manipulation, supply chain compromise. These are real risks. But in enterprise deployments, the failures we see most often aren't adversarial. They are structural:
The bottleneck is not model capability. It is enterprise-grade execution: the governance structures, workflow formalization, and deterministic controls that make autonomous action trustworthy at scale. In other words, it is the absence of a well-designed agentic harness.
A sound, secure development lifecycle for agentic systems must cover three categories of components, not two.
The first two are well-understood. Traditional software components (APIs, databases, orchestration logic) require the established practices: code review, static analysis, dependency scanning, and threat modeling. AI components (foundation models, prompt templates, retrieval pipelines) require additional rigor: behavioral testing, adversarial evaluation, and continuous monitoring, because probabilistic systems cannot be validated by regression testing alone.
The third category receives less attention: workflow definitions, the runbooks, playbooks, and skill templates that encode how an agent is supposed to behave in a specific context.
These are not static artifacts. They evolve as teams refine agent capabilities. They are typically informal. And they carry security implications that are invisible until something goes wrong. When workflow logic lives in informal documents, Slack threads, or institutional memory, agents fill the gaps with inference. An agent told to "handle incident escalations" without a versioned, audited playbook will improvise, and the improvisation may satisfy the model's criteria for correct behavior while violating a compliance requirement the security team cares about.
The SDL extension for agentic systems, therefore, includes version control and review processes for workflow definitions, behavioral testing against known-good workflow traces, and drift detection when agent behavior deviates from versioned baselines.
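As a minimal sketch of the drift-detection idea (workflow names, trace fields, and the fingerprinting scheme are all illustrative assumptions, not a prescribed implementation), a run's ordered sequence of tool actions can be compared against the fingerprint recorded when the workflow version was reviewed:

```python
import hashlib
import json

def trace_fingerprint(actions: list[dict]) -> str:
    """Hash the ordered (tool, action) sequence, ignoring run-specific
    arguments, so equivalent runs produce the same fingerprint."""
    canonical = [(a["tool"], a["action"]) for a in actions]
    return hashlib.sha256(json.dumps(canonical).encode()).hexdigest()

# Known-good trace recorded when workflow v3 was reviewed and approved.
BASELINE = {
    "incident-triage@v3": trace_fingerprint([
        {"tool": "logs", "action": "gather", "args": {"window": "1h"}},
        {"tool": "triage", "action": "classify_severity"},
        {"tool": "router", "action": "route_to_team"},
    ]),
}

def check_drift(workflow_id: str, observed: list[dict]) -> bool:
    """Return True if an observed run deviates from the versioned baseline."""
    return trace_fingerprint(observed) != BASELINE[workflow_id]

# A run that improvises an extra step gets flagged for review.
drifted = check_drift("incident-triage@v3", [
    {"tool": "logs", "action": "gather", "args": {"window": "6h"}},
    {"tool": "triage", "action": "classify_severity"},
    {"tool": "notify", "action": "email_customer"},  # not in the baseline
    {"tool": "router", "action": "route_to_team"},
])
print(drifted)  # True: behavior deviated from the reviewed workflow
```

Production systems would fingerprint more than step order, but the principle is the same: deviation from a versioned baseline is a detectable, reviewable event rather than a silent improvisation.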
Tribal knowledge is a security vulnerability. Formalizing organizational know-how into versioned, testable, auditable workflow definitions is security work.
Agents inherit the full attack surface of traditional software. Privilege escalation, confused deputy issues, code injection at tool boundaries, session hijacking, and supply chain compromise all extend directly into agentic systems. This is not a new insight, but it bears emphasis because agentic architectures can make these risks feel secondary to "AI-specific" concerns.
What genuinely changes in agentic contexts is blast radius.
Human operators naturally pause. They escalate when something seems unusual. They have intuition for "this doesn't feel right." Agents do not hesitate. An agent operating with excessive privileges will exercise those privileges completely, consistently, and at machine speed. The same excessive permission that a human might never use in practice becomes a reliable attack surface when an agent is operating autonomously.
Three traditional controls deserve particular emphasis:
This is the most important architectural principle for enterprise agentic AI, and it bears stating plainly.
LLMs cannot enforce security boundaries. They can be instructed to refuse certain requests, but prompt injection and adversarial inputs can override those instructions. They can be told to respect access controls, but they have no reliable mechanism to enforce them. Security policies expressed as prompts ("never take action on production systems without approval," "always confirm with the user before deleting records") are aspirations, not controls.
The failure mode here is subtle. A well-aligned model will follow these instructions in the vast majority of cases. This creates confidence. But the long tail (edge cases, adversarial inputs, novel situations the designer didn't anticipate) is precisely where security controls matter most, and it is where prompt-based constraints are least reliable.
The security enforcement mechanism must be external to the agent, deterministic in its operation, and comprehensive in its coverage. Every interaction between the agent and the outside world should pass through it. Model manipulation cannot bypass it.
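To make the distinction concrete, here is a sketch of a deterministic policy gateway (tool names, rule structure, and thresholds are illustrative assumptions). Every tool call passes through it, and the rules live outside the model, so no prompt content can rewrite them:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolCall:
    tool: str    # e.g. "crm", "prod_db" (illustrative)
    action: str  # e.g. "read", "delete"
    scope: str   # e.g. "staging", "production"

class PolicyViolation(Exception):
    pass

# Deterministic rules, versioned and reviewed like code.
# The agent never sees or edits these; they sit outside the model.
DENY = [
    lambda c: c.scope == "production" and c.action in {"delete", "write"},
]
REQUIRE_APPROVAL = [
    lambda c: c.tool == "prod_db",
]

def gateway(call: ToolCall, approved: bool = False):
    """Every agent tool call is mediated here; prompt content is irrelevant."""
    if any(rule(call) for rule in DENY):
        raise PolicyViolation(f"denied: {call}")
    if any(rule(call) for rule in REQUIRE_APPROVAL) and not approved:
        return ("pending_approval", call)
    return ("execute", call)

# No matter what the prompt says, a production delete never executes.
try:
    gateway(ToolCall("prod_db", "delete", "production"))
except PolicyViolation as e:
    print(e)
```

The point is not this particular rule syntax but the placement: denial and approval decisions are made by code the model cannot influence, evaluated identically on every call.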
In practice, this means:
The governance layer that wraps agent execution is itself a security control, and it is the core of what we mean by an agentic harness. Its integrity is as important as the model’s alignment. These are separate concerns and must be treated as such.
Most agentic security discussions focus on what agents can do. The harder problem, in practice, is what agents know: whether the procedures they follow accurately reflect how your organization wants work to be done.
An agent executing an incident triage workflow needs accurate, complete knowledge of: which logs to gather, how to classify severity, which team to route to based on affected system and time of day, when to escalate versus handle autonomously, what the documentation requirements are for compliance, and which exceptions to the standard procedure are recognized. If any of that knowledge is missing or wrong, the agent will fill the gap, and the fill may be technically coherent while being operationally, legally, or policy-wise incorrect, consistently, across thousands of executions.
The security implications of poorly-specified workflow knowledge include incorrect escalation routing, missing approval steps for exception conditions, mishandled data due to absent privacy guidance, and inconsistent behavior across runs because procedures weren't deterministic.
The control here is workflow formalization before deployment:
Agents should be taught. Teaching should be a controlled process. The organizational know-how that makes a workflow safe to automate is an asset that requires the same governance discipline as the code that runs it.
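A workflow formalized this way can be an artifact that is versioned, reviewed, and validated before deployment. The following sketch assumes a particular structure and field names purely for illustration; the substance is that under-specified steps become pre-deployment findings rather than runtime improvisation:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    requires_approval: bool = False
    compliance_notes: str = ""

@dataclass(frozen=True)
class Workflow:
    id: str
    version: int
    steps: tuple[Step, ...]
    escalation_rule: str
    data_handling: str

    def validate(self) -> list[str]:
        """Pre-deployment checks: no gaps the agent would fill by inference."""
        problems = []
        if not self.escalation_rule:
            problems.append("missing escalation rule")
        if not self.data_handling:
            problems.append("missing data-handling guidance")
        for s in self.steps:
            if s.requires_approval and not s.compliance_notes:
                problems.append(f"approval step '{s.name}' lacks compliance notes")
        return problems

triage_v3 = Workflow(
    id="incident-triage",
    version=3,
    steps=(
        Step("gather_logs"),
        Step("classify_severity"),
        Step("route_to_team"),
        Step("escalate", requires_approval=True, compliance_notes=""),
    ),
    escalation_rule="sev1 within business hours -> on-call lead",
    data_handling="redact customer PII before routing",
)
print(triage_v3.validate())  # flags the under-specified approval step
```

Because the definition is code-like, it inherits code's governance for free: pull-request review for changes, a version history for audit, and a test suite for behavior.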
Every agentic deployment faces the same calibration question: where should the agent act autonomously, and where should a human make the final call?
The right answer begins conservative and expands based on demonstrated performance. High-consequence actions such as writes to production data above a defined scope, financial transactions above a threshold, external communications containing sensitive information, and access provisioning for privileged roles start with human approval. The agent recommends; a human decides. Over time, as the evidence base shows sustained alignment between agent recommendations and human decisions, autonomy expands for those specific operation types.
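The "expand on demonstrated performance" loop can itself be deterministic. A sketch (the threshold, sample minimum, and operation names are illustrative assumptions): record whether each human review agreed with the agent's recommendation, and grant autonomy per operation type only once sustained agreement is demonstrated:

```python
from collections import defaultdict

AGREEMENT_THRESHOLD = 0.98  # illustrative bar for expanding autonomy
MIN_SAMPLES = 200           # require a real evidence base first

# operation type -> list of booleans: did the human agree with the agent?
history = defaultdict(list)

def record_review(op_type: str, agent_choice: str, human_choice: str):
    history[op_type].append(agent_choice == human_choice)

def is_autonomous(op_type: str) -> bool:
    """Autonomy is granted per operation type, based on demonstrated
    alignment between agent recommendations and human decisions."""
    outcomes = history[op_type]
    if len(outcomes) < MIN_SAMPLES:
        return False  # start conservative: human decides
    return sum(outcomes) / len(outcomes) >= AGREEMENT_THRESHOLD

# Until the evidence base exists, every high-consequence action is reviewed.
print(is_autonomous("access_provisioning"))  # False
```

A real deployment would also decay or revoke autonomy when agreement drops, and scope the evidence window in time; the essential property is that expansion is driven by recorded evidence, not by vendor claims or reviewer fatigue.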
But this progression creates a well-understood failure mode if not designed carefully. If every consequential action requires human approval, the volume of decisions overwhelms reviewers. Review becomes reflexive. Humans approve without genuine evaluation. You have not added security; you have added liability and security theater.
The design principle is: scope human oversight to decisions where human judgment genuinely adds value. This means:
These five principles translate into a specific set of architectural components for enterprise agentic AI. Together, they compose the agentic harness:
None of this is conceptually new. Isolation, least privilege, immutable audit trails, policy-as-code, change management: these are established practices. What is new is applying them to a system where the actor is an AI reasoning engine operating at machine speed, and where the workflow definitions governing behavior require their own security controls alongside the code.
The infrastructure layer matters. Model alignment matters. The cloud security primitives that providers have documented matter.
But for the enterprise operator deploying agents across real workflows, workflows that touch customer data, financial systems, access controls, and production infrastructure, the security challenge is ultimately an operations challenge.
Agents that act autonomously on enterprise systems must be taught correctly, governed consistently, and constrained deterministically. The agentic harness that makes this possible (the versioned workflow definitions, the policy enforcement layer, the approval gates, and the behavioral monitoring) is not a feature layered on top of security. It is the security.
The organizations that get agentic AI right in the near term will not be those that find the best model. They will be those that build the agentic harness with the same rigor they apply to the software it runs alongside.