Recent Notes
45.6% of teams still rely on shared API keys for agent-to-agent authentication. Only 21.9% treat #AI agents as independent identity-bearing entities.
That's Strata's 2026 research. ISACA called it "The Looming Authorization Crisis" a year earlier. The pattern is consistent: teams authenticate the agent and assume that covers scope.
It doesn't. Authentication confirms identity. Authorization defines what that identity is allowed to do. In multi-agent workflows, an authenticated agent with broad permissions can chain actions across systems that no single human would be authorised to perform. OAuth 2.0 and OIDC were designed for one principal, one session. They break when the principal spawns sub-agents with inherited permissions.
SailPoint found 80% of IT professionals have witnessed AI agents performing unauthorised actions. NIST published a concept paper in February 2026 specifically because the existing identity framework doesn't hold.
The agent was authenticated. The action wasn't authorised. They're not the same problem.
Prompt injection is not a session problem anymore.
Microsoft Threat Intelligence documented 50+ real-world examples of AI memory poisoning across 31 companies and 14 sectors in February 2026. OWASP classified it as ASI06, a top agentic risk. The attack is simple: malicious instructions get embedded in an agent's long-term memory. The agent recalls them days or weeks later. It doesn't know it's been compromised. It thinks it learned something useful.
The MemoryGraft research team calls this "implanting malicious successful experiences." The agent defends beliefs it should never have learned.
Now picture a compliance agent whose risk threshold has been silently shifting for three months. You don't get a breach notification. You get a regulatory exam where the agent's decisions don't match the policy it was supposed to enforce.
The session ended. The poisoned memory didn't.
Your #AI #agent has the same credentials as your senior analyst. It also never logs off, never triggers a session timeout, and chains API calls at machine speed across every downstream system simultaneously.
CyberArk documented a 2026 supply chain attack on the OpenAI plugin ecosystem that compromised agent credentials across 47 enterprise deployments. Attackers had access to financial records for six months. Nobody noticed, because the fraud detection stack was calibrated for human behaviour patterns. Agents don't have behaviour patterns. They have execution loops.
The security team authenticated the agent. The attacker inherited the session. The payment system saw a valid credential doing valid things at 3 AM on a Sunday at 400 requests per second, and flagged nothing.
One credential was compromised. The other was governed. Most institutions can't tell you which one their agents are running on.
88% of organizations are deploying AI. 25% have board-level policies governing that deployment. The remaining 63% are not underprepared. They are exposed.
AI-related securities class actions doubled in 2024. The first half of 2025 produced 12 filings. The legal theory is Caremark: directors breached fiduciary duty by failing to establish AI oversight controls. The SEC's 2026 examination priorities name AI governance explicitly, requiring documented inventories, risk classifications, and model lifecycle controls.
Two-thirds of board directors report limited or no knowledge of AI. 42% of those using AI to support board work are running consumer-grade tools, uploading documents with no data classification review.
The regulator sees a governance failure. The board sees a technology question. Those are not the same exposure.
Gartner projects 40% of enterprise applications will embed task-specific AI agents by 2026. Only 6% of those organizations have an advanced AI security framework in place.
That's not a lag. That's a structural gap in institutional governance.
42% of organizations have no formal agentic AI strategy. 35% have no strategy at all. What they do have: production deployments, active tool integrations, and agents operating under service accounts that weren't provisioned for autonomous decision chains.
The risk management documentation doesn't exist because the deployment happened before the governance process did. When the audit comes, the question isn't whether the agent was authorized. It's whether anyone can demonstrate what the authorization covered.
Design review passed. Risk documentation was never written.
Clear the session, clear the threat. That assumption just failed.
LangChain CVE-2025-68664 demonstrated how malicious instructions in LLM response fields persist through serialization cycles. One prompt injection in cached data becomes durable compromise. The instruction doesn't disappear when the session ends. It replays into every future context window.
Anthropic detected a Chinese state campaign where AI executed 80-90% of operations. Not because the model was compromised. Because memory poisoning turned one successful injection into persistent instruction across sessions, users, and deployments.
Security reviews focus on input validation per request. Session-level controls. Clear the context, move on.
Incident response asks: "When did the breach start?" The answer is "unknown, could be any conversation that touched this agent's persistent state." Forensic timeline reconstruction fails because the attack vector is distributed across historical context.
The security team sees prompt injection. The incident sees a supply chain problem in conversational memory.
#AI
🤙1
A model that passes your safety evaluation has been tested against a generic threat surface. Your institution does not have a generic threat surface.
Stanford HELM and MITRE ATLAS both document adversarial robustness degrading significantly outside benchmark distributions. No published safety benchmark tests against institution-specific data, internal terminology, or proprietary workflow triggers.
Your security team ran the evaluation, reviewed the results, and cleared the model for deployment. The evaluation was real. The threat surface it tested was not yours.
Your production environment has specific characteristics: internal document naming conventions, employee workflow patterns, system identifiers that appear nowhere in any benchmark dataset. An adversary who maps that structure can craft inputs the model has never encountered in testing.
The model behaves safely in the lab. It encounters your institution's specific attack surface in production.
Safety evaluation covers the general case. Production exposure is always the specific case.
#AI #AIAgent
You reviewed the tools your agent has access to. You did not review what becomes reachable when those tools are called in sequence.
OWASP's 2025 Top 10 for LLM Applications explicitly documents chained authorization escalation as the primary lateral movement pattern in agentic environments. The attack is not one malicious tool call. It is a path.
#AI #Agent calls a read-only analytics tool. That tool passes a token to a reporting service. The reporting service has write access to a data warehouse the original agent was never authorized to touch.
No single step looks suspicious in isolation. Each tool call was within scope. The authorization boundary was crossed at the chain level, not the component level.
Your security review assessed permissions per tool. Your adversary assessed permissions across the graph.
The individual actions were authorized. The cumulative access was never governed.
The human approval checkpoint is in the architecture diagram. It is not in the production latency budget.
Gartner's #agentic #AI findings documented enterprise agents executing thousands of micro-decisions per hour. The ServiceNow Virtual Agent incident showed approved diagrams with oversight checkpoints the system's throughput had already made operationally impossible.
Your compliance team documented human-in-the-loop oversight. They met the regulatory requirement on paper.
What they did not model: at what transaction volume the human checkpoint becomes a rubber stamp. At what latency threshold the approval step gets removed to keep the system functional. At what point the documented control and the production behavior diverge completely.
A regulatory examiner does not review your architecture diagram. They pull the audit log and trace the action back to an approval event that does not exist.
The control was designed. The oversight was never operational.
An #AI system that learns in production is, by definition, not the system you deployed.
EU AI Act Article 3(23) and November 2025 practice guidance are explicit: continuous fine-tuning, RAG updates, and agent memory evolution routinely qualify as substantial modification, triggering full re-conformity assessment. Early 2026 enforcement notes document widespread non-compliance in live high-risk systems.
The operational assumption is that ongoing learning is maintenance. The regulatory position is that any non-foreseen change affecting risk profile or purpose restarts the compliance clock. Agentic systems change continuously by design. That means the compliance clock is running constantly, and most institutions aren't tracking it.
Post-audit discovery of undocumented model drift in a trading or compliance agent doesn't produce a remediation notice. It produces a fine up to 6% of global turnover, or a cease-and-desist on the entire AI deployment.
The engineering team sees a system improving over time. The regulator sees a system operating without valid conformity assessment.
That distinction is measured in enforcement actions, not design reviews.
Your #AI agent isn't using its own identity. It's using yours.
CyberArk documented a 96:1 machine-to-human ratio in financial services agentic deployments. One human credential. Ninety-six agents operating under it. No session isolation. No per-action audit trail. No distinction in the access log.
IAM teams see delegation. What they're actually running is shadow machine identity at institutional scale: entitlements accumulating silently, accountability dissolving across every chained action.
When a high-value transaction executes under a "legitimate" human credential and the agent that triggered it has no discrete identity of its own, the GLBA audit doesn't find a breach. It finds a governance failure.
The security team sees an efficiency model. The OCC examiner sees an identity architecture that can't be audited.
Those aren't the same problem.
Security reviews are designed for deterministic systems where code paths are predictable.
AI agents are probabilistic interpreters where context influences behavior.
You can audit what the agent can access. You can't audit what it will interpret as instructions.