Indirect Prompt Injection and Data Exfiltration Risks in Enterprise AI Agents

ai-platform-securitydata-exfiltration-methodai-enabled-threat-activityinitial-access-method

Updated April 24, 2026 at 10:01 PM11 sources

Indirect Prompt Injection and Data Exfiltration Risks in Enterprise AI Agents

Get Ahead of Threats Like This

Know if you're exposed. Before adversaries strike.

Security researchers warned that AI agents and retrieval-augmented generation (RAG) systems can be turned into data-exfiltration channels when attackers poison inputs or embed malicious instructions in content the model is expected to process. One report described a 0-click indirect prompt injection against OpenClaw agents in which hidden instructions cause the agent to generate an attacker-controlled URL containing sensitive data such as API keys or private conversations in query parameters; messaging platforms like Telegram or Discord can then automatically request that URL for link previews, silently delivering the data to the attacker. The same reporting noted concerns about insecure defaults that allow agents to browse, execute tasks, and access local files, expanding the blast radius of prompt-injection abuse.

Related analysis highlighted that the same core weakness extends beyond standalone agents to enterprise RAG deployments, where the integrity of the knowledge base becomes part of the security boundary. If attackers can poison indexed documents in systems such as SharePoint or Confluence, they can manipulate retrieval results and influence model outputs, including security workflows and analyst guidance. Broader commentary on agentic AI threat convergence reinforced that prompt engineering is no longer just a productivity technique but an emerging exploit class, with adversaries using prompt injection and context manipulation against AI-enabled security operations. Together, the reporting shows that enterprise AI risk increasingly depends on controlling untrusted content, hardening agent permissions, and treating prompts, retrieved documents, and downstream integrations as attack surfaces.

Timeline

Apr 23, 2026
Google reports rising malicious prompt injection activity on the public web
Google published an analysis of indirect prompt injection content found in Common Crawl data, concluding that most observed attacks were low-sophistication experiments, pranks, SEO manipulation, or attempts to influence AI summaries rather than mature operational campaigns. The study also identified some malicious examples involving attempted data exfiltration, destructive commands, and anti-agent traps, and reported a 32% relative increase in malicious-category detections between November 2025 and February 2026.
Apr 22, 2026
Forcepoint documents 10 indirect prompt injection payloads seen in the wild
Forcepoint X-Labs published verified examples of 10 web-based indirect prompt injection payloads embedded in webpages to manipulate AI agents. The report detailed attack goals including denial of service, output hijacking, traffic redirection, financial fraud, and destructive command execution, along with concealment methods such as CSS invisibility, HTML comments, accessibility-layer abuse, and metadata poisoning.
Mar 17, 2026
Research introduces Agent Commander prompt-based C2 for AI agents
Research published by Embrace The Red introduced 'Agent Commander,' a proof-of-concept prompt-based command-and-control framework for compromised AI agents. The work showed how agents including OpenClaw, Kimi Claw, and NanoClaw could be hijacked through indirect prompt injection and maintained through persistence mechanisms such as HEARTBEAT.md changes or scheduled tasks.
Mar 16, 2026
CNCERT warns OpenClaw's default security posture creates enterprise risk
CNCERT warned that OpenClaw's default configuration poses enterprise risk because the agents can browse, execute tasks, and access local files, increasing the impact of indirect prompt injection attacks. The warning framed the issue as an architectural problem tied to agent autonomy and integrations.
Mar 16, 2026
PromptArmor demonstrates zero-click data exfiltration via OpenClaw agents
PromptArmor demonstrated that indirect prompt injection in OpenClaw AI agents could force the agent to generate attacker-controlled links containing sensitive data, which messaging platforms such as Telegram or Discord would automatically fetch via link previews. This created a zero-click exfiltration path for data such as API keys and private conversations.

See the full picture in Mallory

Mallory subscribers get deeper analysis on every story, including:

Impact Assessment

Who’s affected and how

Technical Details

Deep-dive technical analysis

Response Recommendations

Actionable next steps for your team

Indicators of Compromise

IPs, domains, hashes, and more

AI Threads

Ask questions and take action on every story

Advanced Filters

Filter by topic, classification, timeframe

Scheduled Alerts

Get matching stories delivered automatically

Related Entities

Sources

help net security

Indirect prompt injection is taking hold in the wild - Help Net Security

April 24, 2026 at 12:00 AM

google security blog

Google Online Security Blog: AI threats in the wild: The current state of prompt injections on the web

April 23, 2026 at 12:00 AM

zdnet zero day

How indirect prompt injection attacks on AI work - and 6 ways to shut them down | ZDNET

April 23, 2026 at 12:00 AM

forcepoint

Indirect Prompt Injection in the Wild: X-Labs Finds 10 IPI Payloads

April 22, 2026 at 11:57 AM

help net security

Microsoft details AI prompt abuse techniques targeting AI assistants - Help Net Security

March 24, 2026 at 12:00 AM

5 more from sources like cyberthrone, embrace the red blog and cyber security news

Threat researchers and security experts reported that **indirect prompt injection (IDPI)** is being actively used in the wild to manipulate AI agents by embedding hidden instructions in otherwise normal-looking web content (e.g., HTML, metadata, comments, or invisible text). Reported impacts include coercing agents into leaking sensitive data, executing unauthorized actions (including server-side commands), and manipulating downstream systems such as **AI-based ad review** and search ranking workflows (e.g., SEO poisoning and phishing promotion), indicating the technique has moved from theoretical to operational abuse. Separate testing of a healthcare AI used in a prescription-management context showed how **prompt injection** can bypass safeguards to reveal system prompts, generate harmful content, and—via persistence mechanisms such as **SOAP notes**—introduce longer-lived manipulations that could influence clinical outputs (e.g., altering suggested dosages) before human approval. Other items in the set were primarily business/consumer AI commentary (data-management investment surveys, bot-ecosystem interview, and general “dark side of AI” discussion) and did not materially add incident-level or technical detail about prompt-injection exploitation beyond broad risk framing.

1 months ago

Prompt Injection Attacks Abuse AI Agent Memory and Link Previews for Manipulation and Data Exfiltration

Security researchers reported multiple **prompt-injection-driven attack paths** that exploit how AI assistants and agentic systems process untrusted content. Microsoft researchers described **AI recommendation/memory poisoning** (mapped in MITRE ATLAS as **`AML.T0080: Memory Poisoning`**) in which attackers insert instructions that cause an assistant to persistently “remember” certain companies, sites, or services as trusted or preferred, shaping future recommendations in later, unrelated conversations. Observed activity over a 60-day period included **50 distinct prompt samples** tied to **31 organizations across 14 industries**, with potential downstream impact in high-stakes domains like health, finance, and security where manipulated recommendations can mislead users without obvious signs of tampering. A separate finding highlighted how **AI agents embedded in messaging apps** can be coerced into leaking secrets via **malicious link previews**. PromptArmor demonstrated that an attacker can use chat-based prompt injection to trick an AI agent into generating an attacker-controlled URL that includes sensitive data (e.g., API keys) as parameters; when messaging platforms (e.g., Slack/Telegram) automatically fetch **link preview** metadata, the preview request can become a **zero-click exfiltration channel**—no user needs to click the link for the data-bearing request to be sent. Together, the reports underscore that agent features intended to improve usability—*persistent memory*, URL-based prompt prepopulation (e.g., “Summarize with AI” buttons), and automatic preview fetching—can be repurposed into scalable manipulation and data-loss mechanisms when untrusted prompts are processed implicitly.

1 months ago

Prompt Injection Attacks and Security Challenges in AI Systems

Prompt injection has emerged as a critical security concern in the deployment of large language models (LLMs) and AI agents, with attackers exploiting the way these systems interpret and execute instructions. Security researchers have drawn parallels between prompt injection and earlier vulnerabilities like SQL injection, highlighting its potential to undermine the intended behavior of AI models. Prompt injection involves manipulating the input prompts to override or bypass the system-level instructions set by developers, leading to unauthorized actions or data leakage. The attack surface is broad, as LLMs are increasingly integrated into applications and workflows, making them attractive targets for adversaries. Multiple organizations, including OpenAI, Microsoft, and Anthropic, have initiated efforts to address prompt injection, but the problem remains unsolved due to the complexity and adaptability of AI models. Real-world demonstrations have shown that prompt injection can be used to break out of agentic applications, bypass browser security rules, and even persistently compromise AI systems through mechanisms like memory manipulation. Security conferences such as BlackHat USA 2024 have featured research on exploiting AI-powered tools like Microsoft 365 Copilot, where attackers can escalate privileges or exfiltrate data by crafting malicious prompts or leveraging markdown image vectors. Researchers have also identified that AI agents can be tricked into ignoring browser security policies, such as CORS, leading to potential cross-origin data leaks. Defensive measures, such as intentionally limiting AI capabilities or implementing stricter input filtering, have been adopted by some vendors, but these often come at the cost of reduced functionality. The security community is actively developing standards, such as the OWASP Agent Observability Standard, to improve monitoring and detection of prompt injection attempts. Despite these efforts, adversaries continue to find novel ways to exploit prompt injection, including dynamic manipulation of tool descriptions and bypassing image filtering mechanisms. The rapid evolution of AI technologies and the proliferation of agentic applications have made it challenging to keep pace with emerging threats. Security researchers emphasize the need for ongoing vigilance, robust testing, and collaboration across the industry to mitigate the risks associated with prompt injection. The use of AI in sensitive environments, such as enterprise productivity suites and web browsers, amplifies the potential impact of successful attacks. As AI adoption accelerates, organizations must prioritize understanding and defending against prompt injection to safeguard their systems and data. The ongoing research and public disclosures serve as a call to action for both developers and defenders to address this evolving threat landscape.

1 months ago

Get Ahead of Threats Like This

Mallory continuously monitors global threat intelligence and correlates it with your attack surface. Know if you're exposed. Before adversaries strike.

Indirect Prompt Injection and Data Exfiltration Risks in Enterprise AI Agents

Get Ahead of Threats Like This

Timeline

Google reports rising malicious prompt injection activity on the public web

Forcepoint documents 10 indirect prompt injection payloads seen in the wild

Research introduces Agent Commander prompt-based C2 for AI agents

CNCERT warns OpenClaw's default security posture creates enterprise risk

PromptArmor demonstrates zero-click data exfiltration via OpenClaw agents

See the full picture in Mallory

Related Entities

Vulnerabilities

Malware

Organizations

Affected Products

Sources

Related Stories

Indirect Prompt Injection and Prompt Manipulation Risks in AI Agents

Prompt Injection Attacks Abuse AI Agent Memory and Link Previews for Manipulation and Data Exfiltration

Prompt Injection Attacks and Security Challenges in AI Systems

Get Ahead of Threats Like This