vllm@0.10.2 vulnerabilities

A high-throughput and memory-efficient inference and serving engine for LLMs

Direct Vulnerabilities

Known vulnerabilities in the vllm package. This does not include vulnerabilities belonging to this package’s dependencies.

  • [Medium] Server-side Request Forgery (SSRF)

Affected versions of this package are vulnerable to Server-side Request Forgery (SSRF) via the load_from_url and load_from_url_async methods of the MediaConnector class, which fetch and process media from user-supplied URLs without sufficient restrictions on target hosts. An attacker can coerce the vLLM server into making arbitrary requests to internal network resources.

Note:

This vulnerability is particularly critical in containerized environments like llm-d, where a compromised vLLM pod could be used to scan the internal network, interact with other pods, and potentially cause denial of service or access sensitive data.

Workaround

To address this vulnerability, restrict the URLs that the MediaConnector can access, applying the principle of least privilege.

It is recommended to implement a configurable allowlist or denylist for domains and IP addresses:

  • Allowlist: The most secure approach is to allow connections only to a predefined list of trusted domains. This could be configured via a command-line argument, such as --allowed-media-domains. By default, this list could be empty, forcing administrators to explicitly enable external media fetching.

  • Denylist: Alternatively, a denylist could block access to private and loopback IP address ranges (127.0.0.0/8, 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) and other sensitive domains.

A check should be added at the beginning of the load_from_url methods to validate the parsed hostname against this list before any connection is made.
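
To make the recommended check concrete, here is a minimal sketch in Python. The --allowed-media-domains flag name comes from the advisory text above; the function and variable names are illustrative assumptions, not vLLM's actual internals, and the sketch deliberately ignores DNS resolution and redirects, which a production check would also need to handle.

```python
import ipaddress
from urllib.parse import urlparse

# Private and loopback ranges a denylist would block (illustrative).
PRIVATE_NETS = [
    ipaddress.ip_network(n)
    for n in ("127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def is_media_url_allowed(url: str, allowed_domains: set[str]) -> bool:
    """Hypothetical guard to run before MediaConnector opens a connection."""
    host = urlparse(url).hostname
    if host is None:
        return False
    # Allowlist mode: only explicitly trusted domains may be fetched.
    if allowed_domains:
        return host in allowed_domains
    # Denylist fallback: reject literal private/loopback IP addresses.
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A hostname rather than an IP literal; resolving it and re-checking
        # the resolved address (DNS rebinding) is out of scope for this sketch.
        return True
    return not any(addr in net for net in PRIVATE_NETS)
```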

How to fix Server-side Request Forgery (SSRF)?

Upgrade vllm to version 0.11.0 or higher.

Vulnerable versions: [0.5.0,0.11.0)
  • [High] Allocation of Resources Without Limits or Throttling

Affected versions of this package are vulnerable to Allocation of Resources Without Limits or Throttling through the chat_template and chat_template_kwargs parameters. An attacker can cause excessive CPU and memory consumption by submitting a crafted Jinja template via these parameters, leading to service unavailability.
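
As a rough illustration of the mechanism, consider generic Jinja behavior (this is not vLLM code, and the exact parameters a given vLLM build accepts may differ): a template of the following shape, submitted via chat_template, forces the server to do unbounded work at render time.

```python
import jinja2

# Jinja exposes range() as a template global by default, so a short
# template can demand an enormous amount of CPU and memory when rendered.
malicious = "{% for i in range(10 ** 8) %}{{ i }}{% endfor %}"

template = jinja2.Environment().from_string(malicious)
# template.render()  # left commented out: rendering would block the
#                    # worker and allocate a string hundreds of MB large
```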

How to fix Allocation of Resources Without Limits or Throttling?

Upgrade vllm to version 0.11.0 or higher.

Vulnerable versions: [,0.11.0)
  • [High] Covert Timing Channel

Affected versions of this package are vulnerable to Covert Timing Channel via the api_server component. An attacker can gain unauthorized access by exploiting differences in response times during API key validation.
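
The bug class can be sketched as follows; this is an assumed illustration of non-constant-time comparison, not vLLM's actual validation code.

```python
import hmac

def check_api_key_insecure(supplied: str, expected: str) -> bool:
    # Ordinary equality short-circuits at the first differing character,
    # so response time leaks how long a correct prefix the attacker has.
    return supplied == expected

def check_api_key(supplied: str, expected: str) -> bool:
    # hmac.compare_digest takes time independent of where inputs differ,
    # which is the standard remediation for this class of timing leak.
    return hmac.compare_digest(supplied.encode(), expected.encode())
```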

How to fix Covert Timing Channel?

Upgrade vllm to version 0.11.0 or higher.

Vulnerable versions: [,0.11.0)
  • [High] Deserialization of Untrusted Data

Affected versions of this package are vulnerable to Deserialization of Untrusted Data via the SUB ZeroMQ socket, where deserialization is performed with the unsafe pickle module. An attacker on the same cluster can execute arbitrary code on the remote machine by sending maliciously crafted serialized payloads.

Note: The V0 engine has been off by default since v0.8.0, and the V1 engine is not affected. Due to the V0 engine's deprecated status and the invasive nature of a fix, the developers recommend ensuring a secure network environment if the V0 engine with multi-host tensor parallelism is still in use.
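
For context, the following sketch shows why unpickling attacker-controlled bytes amounts to code execution. This is generic pickle behavior, not vLLM-specific code; the callable here is a harmless print, where an attacker would name os.system or similar.

```python
import pickle

class Exploit:
    def __reduce__(self):
        # pickle calls the returned callable with these arguments at load
        # time, before the caller ever sees the "deserialized" object.
        return (print, ("arbitrary call executed during unpickling",))

payload = pickle.dumps(Exploit())
pickle.loads(payload)  # prints the message: code runs on load
```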

How to fix Deserialization of Untrusted Data?

There is no fixed version for vllm.

Vulnerable versions: [0.5.2,)
  • [High] Deserialization of Untrusted Data

Affected versions of this package are vulnerable to Deserialization of Untrusted Data in the MessageQueue.dequeue() API function. An attacker can execute arbitrary code by sending a malicious payload to the message queue.
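
As a defense-in-depth illustration (a pattern adapted from the pickle module's documentation, not a patch to MessageQueue itself), a restricted Unpickler can refuse to resolve globals, which blocks payloads that rely on importing callables while still round-tripping plain data.

```python
import io
import pickle

class NoGlobalsUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Refuse to resolve any class or callable reference outright.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def safe_loads(data: bytes):
    return NoGlobalsUnpickler(io.BytesIO(data)).load()

# Primitive containers still deserialize normally:
assert safe_loads(pickle.dumps({"tokens": [1, 2, 3]})) == {"tokens": [1, 2, 3]}
```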

How to fix Deserialization of Untrusted Data?

There is no fixed version for vllm.

Vulnerable versions: [0,)