AI Workflow and Agent Security Risks: Prompt Injection, Credential Leakage, and Recommendation Poisoning

ai-platform-securitydata-exfiltration-methodidentity-authentication-vulnerabilitybuild-pipeline-compromiseleaked-secret-api-key

Updated April 28, 2026 at 03:03 PM6 sources

AI Workflow and Agent Security Risks: Prompt Injection, Credential Leakage, and Recommendation Poisoning

Get Ahead of Threats Like This

Know if you're exposed. Before adversaries strike.

Multiple reports warn that the most immediate AI security risk is attackers hijacking trusted workflows—AI copilots/agents, CI pipelines, SaaS admin planes, and identity control points—rather than “AI” being a standalone threat category. Commentary and research highlight how prompt-injection-style techniques can turn normal user actions (e.g., clicking a legitimate-looking link) into silent data exfiltration or unsafe tool use, and how autonomous agents can still complete scams even when they can correctly label a page as phishing. 1Password introduced an open-source benchmark, Security Comprehension and Awareness Measure (SCAM), to test whether AI agents behave safely in realistic workplace tasks (email triage, link clicking, retrieving credentials from a vault, and form-filling) using production-like APIs; in testing, models that could identify phishing when asked still proceeded to retrieve and submit real credentials during routine workflows.

Microsoft research described AI recommendation poisoning affecting 31 companies across 14 industries, where hidden instructions embedded in “Summarize with AI” links attempt to inject persistent directives into an assistant’s memory via URL prompt parameters, biasing future recommendations (e.g., prioritizing a specific domain/company). Separately, identity-focused analysis argues that as AI increases automation and API-driven decisioning, identity becomes the enterprise control plane, making IAM architecture and resilience (including where policy evaluation and authorization live) a central security concern at “AI scale.” Two SC Media opinion pieces broaden the theme: one ties recent supply-chain and developer-workflow compromises (e.g., malicious packages/actions and token theft) to the same “trusted workflow” abuse pattern, while another discusses mobile apps as an early-warning surface for supply-chain risk (including AI arriving via third-party SDKs), but it is more forward-looking guidance than incident reporting.

Timeline

Apr 19, 2026
Researchers disclose AI agent integration flaws and vendors dispute severity
Researchers reported that Anthropic Claude Code Security Review, Google Gemini CLI Action, and Microsoft GitHub Copilot integrations with GitHub Actions could be abused to steal API keys and access tokens. The article also describes a separate dispute over Anthropic’s Model Context Protocol design, which researchers said could expose up to 200,000 servers, while vendors reportedly paid bug bounties but did not issue CVEs or public advisories for the platform-level issues.
Feb 12, 2026
Microsoft publishes research on AI recommendation poisoning
Microsoft disclosed research showing that assistants such as ChatGPT, Claude, Grok, and Microsoft 365 Copilot can be manipulated through hidden instructions embedded in 'Summarize with AI' links and URL prompt parameters. It also shared threat-hunting guidance for detecting these links in email and Microsoft Teams messages.
Feb 12, 2026
1Password open-sources the SCAM AI agent safety benchmark
1Password released the Security Comprehension and Awareness Measure (SCAM) under the MIT License, along with tooling to replay scenarios and export video results. The benchmark is intended to help researchers and enterprises evaluate whether AI agents behave safely in realistic workflows.
Feb 12, 2026
1Password finds security guidance sharply reduces AI agent failures
In SCAM testing, 1Password found that providing a short security-skills document significantly reduced critical failures and, for several models, eliminated them across repeated runs. Some models still remained inconsistent or continued to fail specific scenarios, including forwarding notes containing embedded passwords and access keys.
Feb 12, 2026
1Password tests frontier AI agents with SCAM benchmark scenarios
1Password evaluated eight AI models across 30 realistic workplace security scenarios, finding safety scores ranging from 35% to 92% and critical failures in every model under baseline conditions. The tests showed agents could recognize phishing in isolation yet still perform unsafe actions such as entering credentials into attacker-controlled pages or forwarding secrets.
Dec 14, 2025
Microsoft observes AI recommendation-poisoning attempts across 31 companies
Over a 60-day period, Microsoft recorded 50 unique prompt-based memory-poisoning attempts tied to 31 companies, showing that hidden instructions in AI-summary links were being used to manipulate assistant recommendations. The activity was attributed largely to legitimate businesses rather than typical cybercriminal SEO operators.
Jan 17, 2025
NIST publishes AI agent hijacking evaluation research
NIST's Center for AI Standards and Innovation published research on AI agent hijacking, using the open-source AgentDojo framework and Claude 3.5 Sonnet agents to measure how indirect prompt injection attacks can drive harmful actions. The study introduced new attacks and high-impact scenarios such as remote code execution, database exfiltration, and automated phishing, and found that repeated attack attempts significantly increased compromise rates.

See the full picture in Mallory

Mallory subscribers get deeper analysis on every story, including:

Impact Assessment

Who’s affected and how

Technical Details

Deep-dive technical analysis

Response Recommendations

Actionable next steps for your team

Indicators of Compromise

IPs, domains, hashes, and more

AI Threads

Ask questions and take action on every story

Advanced Filters

Filter by topic, classification, timeframe

Scheduled Alerts

Get matching stories delivered automatically

Related Entities

Sources

AI vendors' response to security flaws: It wasn't me • The Register

April 19, 2026 at 12:00 AM

scworld

The AI threat isn’t one exploit: It’s attackers hijacking trusted workflows | SC Media

February 12, 2026 at 04:41 PM

help net security

1Password open sources a benchmark to stop AI agents from leaking credentials - Help Net Security

February 12, 2026 at 12:00 AM

symantec blog

Identity is the control plane, and AI just changed the game | SECURITY.COM

February 12, 2026 at 12:00 AM

dark reading

Those 'Summarize With AI' Buttons May Be Lying to You

February 12, 2026 at 12:00 AM

1 more from sources like nist

Security leaders are warning that **AI agents are increasingly operating as “digital employees”** inside enterprise workflows—triaging alerts, coordinating investigations, and moving work across security tools—often with **broad permissions and limited governance**. The core risk highlighted is that organizations are deploying high-authority agents like plug-ins (reused service accounts, overbroad roles, weak oversight), creating fast-acting operators that can be manipulated and that lack the contextual judgment and policy awareness expected of human staff. Related commentary also raises concerns about **AI-to-AI communication** and “non-human-readable” behaviors that could reduce auditability and complicate investigations and control enforcement. In parallel, public examples show how quickly AI can accelerate **vulnerability discovery**: Microsoft Azure CTO Mark Russinovich reported using *Claude Opus 4.6* to decompile decades-old Apple II 6502 machine code and identify multiple issues, underscoring that similar techniques could be applied to **embedded/legacy firmware at scale**. Anthropic has also cautioned that advanced models can find high-severity flaws even in heavily tested codebases, reinforcing the likelihood that both defenders and attackers will leverage AI for faster bug-finding. Separate enterprise IT coverage notes that organizations are **reallocating budgets toward AI** by consolidating tools and renegotiating contracts, which can indirectly increase security exposure if cost-cutting reduces overlapping controls or if AI adoption outpaces governance and identity/access management maturity.

Yesterday

AI’s Impact on Secure Coding, Security Operations, and Workforce Strain

Security leaders and practitioners are increasingly framing **AI** as both a force-multiplier for defenders and a risk amplifier for software and operations. Commentary and executive guidance highlighted that AI-assisted fuzzing, static analysis, and large-scale pattern recognition can surface vulnerabilities faster than traditional review, but that faster discovery does not automatically reduce enterprise risk because real-world impact depends on exposure, identity/privilege design, data flows, and business process dependencies. Separately, industry guidance on “rolling out AI” emphasized practical governance measures—knowledge-sharing, partnering, and automation—arguing that the same capabilities that make AI valuable also expand the attack surface and the speed at which threats evolve. Operational reporting also underscored how AI-related and traditional threats are converging in day-to-day security work. A monthly security briefing cited rapid weaponization of a critical BeyondTrust Remote Support pre-auth RCE (**CVE-2026-1731**) with proof-of-concept and exploitation observed shortly after disclosure, later treated as a zero-day and reportedly used in ransomware activity; it also noted emerging integrity risks such as **AI recommendation poisoning** (manipulating AI-generated outputs via hidden instructions) and an AI tooling supply-chain incident involving an unintended update to the *Cline CLI* coding assistant after a compromised token. In parallel, survey results pointed to sustained **workforce burnout**—U.S. security professionals averaging significant weekly overtime and reporting emotional exhaustion—while also indicating a skills shift toward communication and stakeholder management as AI tooling adoption increases cross-functional demands.

Today

Enterprise Security Risks From Agentic and Generative AI Deployments

Enterprises are rapidly integrating **agentic AI** assistants with high-privilege connections to ticketing systems, source code repositories, chat platforms, and cloud dashboards, enabling actions such as opening pull requests, querying internal databases, and triggering automated workflows with limited human oversight. Reporting citing Cisco’s *State of AI Security 2026* indicates many organizations are moving forward with these deployments despite low security readiness, expanding exposure across model interfaces, tool integrations, and the broader supply chain. Multiple sources highlight that attacker techniques against AI systems are maturing, particularly **prompt injection/jailbreaks** and multi-turn attacks that exploit session state, memory, and tool-calling to drive unsafe actions or data leakage. Separately, adversaries are using generative AI for **deepfake-enabled social engineering** (including video/voice impersonation to bypass identity verification and authorize sensitive actions) and for scalable brand impersonation via malicious ad campaigns; one widely cited example involved Arup, where a deepfake video call led to authorization of a fraudulent HK$200 million transfer. Overall, the material is primarily risk and threat reporting (not a single incident), emphasizing that AI systems’ contextual behavior and privileged integrations create new control gaps that traditional security testing and defenses may not detect.

1 months ago

Get Ahead of Threats Like This

Mallory continuously monitors global threat intelligence and correlates it with your attack surface. Know if you're exposed. Before adversaries strike.

AI Workflow and Agent Security Risks: Prompt Injection, Credential Leakage, and Recommendation Poisoning

Get Ahead of Threats Like This

Timeline

Researchers disclose AI agent integration flaws and vendors dispute severity

Microsoft publishes research on AI recommendation poisoning

1Password open-sources the SCAM AI agent safety benchmark

1Password finds security guidance sharply reduces AI agent failures

1Password tests frontier AI agents with SCAM benchmark scenarios

Microsoft observes AI recommendation-poisoning attempts across 31 companies

NIST publishes AI agent hijacking evaluation research

See the full picture in Mallory

Related Entities

Malware

Organizations

Affected Products

Sources

Related Stories

Security Risks and Offensive Potential of Agentic AI and Automated Vulnerability Discovery

AI’s Impact on Secure Coding, Security Operations, and Workforce Strain

Enterprise Security Risks From Agentic and Generative AI Deployments

Get Ahead of Threats Like This