Resource Exhaustion in tritonserver-backend-vllm-cuda-12.9 | CVE-2025-48956

Q: How to fix?

Upgrade Chainguard tritonserver-backend-vllm-cuda-12.9 to version 25.7.1_git20250821-r1 or higher.

Threat Intelligence

EPSS: 0.3% (53rd percentile)

Snyk ID: SNYK-CHAINGUARDLATEST-TRITONSERVERBACKENDVLLMCUDA129-12485244
Published: 4 Sept 2025
Disclosed: 21 Aug 2025

Introduced: 21 Aug 2025

CVE-2025-48956, CWE-400

NVD Description

Note: Versions mentioned in the description apply only to the upstream tritonserver-backend-vllm-cuda-12.9 package and not the tritonserver-backend-vllm-cuda-12.9 package as distributed by Chainguard. See How to fix? for Chainguard relevant fixed versions and status.

vLLM is an inference and serving engine for large language models (LLMs). From 0.1.0 to before 0.10.1.1, a Denial of Service (DoS) vulnerability can be triggered by sending a single HTTP GET request with an extremely large header to an HTTP endpoint. This results in server memory exhaustion, potentially leading to a crash or unresponsiveness. The attack does not require authentication, making it exploitable by any remote user. This vulnerability is fixed in 0.10.1.1.
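
The affected upstream range described above (0.1.0 up to, but not including, 0.10.1.1) can be checked with a short script. This is a minimal sketch, not an official tool: the helper names parse_version and is_affected are hypothetical, and it assumes a purely numeric dotted upstream vLLM version string (no pre-release suffixes such as "rc1"). It does not apply to Chainguard package versions, which use a different scheme.

```python
def parse_version(version: str, width: int = 4) -> tuple:
    """Parse a numeric dotted version (e.g. '0.10.1.1') into a zero-padded tuple of ints.

    Assumption: the string contains only digits and dots; anything else raises ValueError.
    """
    parts = [int(part) for part in version.split(".")]
    # Pad with zeros so that '0.1' and '0.1.0' compare as equal.
    return tuple(parts + [0] * (width - len(parts)))


# Bounds taken from the NVD description: affected from 0.1.0, fixed in 0.10.1.1.
FIRST_AFFECTED = parse_version("0.1.0")
FIRST_FIXED = parse_version("0.10.1.1")


def is_affected(version: str) -> bool:
    """Return True if the upstream vLLM version lies in [0.1.0, 0.10.1.1)."""
    return FIRST_AFFECTED <= parse_version(version) < FIRST_FIXED


if __name__ == "__main__":
    for candidate in ("0.9.2", "0.10.1", "0.10.1.1"):
        print(candidate, "affected" if is_affected(candidate) else "not affected")
```

Tuples are compared element by element, so 0.10.1 (padded to 0.10.1.0) sorts below 0.10.1.1 and is reported as affected, while 0.10.1.1 itself is not.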

Resource Exhaustion Affecting tritonserver-backend-vllm-cuda-12.9 package, versions <25.7.1_git20250821-r1

Severity