Server-Side Request Forgery (SSRF) in langchain | CVE-2024-3095

Q: How to fix?

Upgrade langchain to version 0.2.10 or higher.
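
One way to check whether an environment is still on an affected release is to compare the installed version against 0.2.10. The following is a minimal sketch, not an official tool: the helper names (parse_release, is_affected, installed_langchain_is_affected) are hypothetical, and the parser only looks at the leading numeric components, so pre-release suffixes such as "rc1" are not ordered correctly.

```python
from importlib import metadata

# First fixed release according to this advisory.
FIXED_VERSION = (0, 2, 10)


def parse_release(version: str) -> tuple:
    """Return the leading numeric components of a version string.

    Simplification (assumption): suffixes such as 'rc1' or '.post1'
    are ignored rather than ordered as PEP 440 would order them.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:
            break  # stop at the first component that carries a suffix
    return tuple(parts)


def is_affected(version: str) -> bool:
    """True if the version falls in the affected range [,0.2.10)."""
    return parse_release(version) < FIXED_VERSION


def installed_langchain_is_affected():
    """Check the installed langchain distribution; None if not installed."""
    try:
        return is_affected(metadata.version("langchain"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_langchain_is_affected())
```

For anything beyond a quick check, a full PEP 440 comparison (for example via the third-party packaging library) would be the more robust choice.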

Threat Intelligence

Proof of Concept

0.21% (44th percentile)

Snyk ID: SNYK-PYTHON-LANGCHAIN-7217837
Published: 7 Jun 2024
Disclosed: 6 Jun 2024
Credit: Elias Hohl

Introduced: 6 Jun 2024

CVE-2024-3095, CWE-918

Overview

langchain is a package for building applications with LLMs through composability.

Affected versions of this package are vulnerable to Server-Side Request Forgery (SSRF) through the Web Research Retriever component. An attacker can execute port scans, access local services, and potentially read instance metadata from cloud environments by sending crafted requests to the server.

Note: This SSRF vulnerability makes it possible to scan ports, abuse the Web Explorer server as a proxy for attacks on third parties, and interact with servers in the local network, including reading their response data, which may allow extraction of instance metadata in a cloud environment. The consequences of interacting with local services depend heavily on the nature of those services. Admin-privileged services are regularly exposed locally on servers, so the consequences can go all the way up to arbitrary code execution. Sending POST requests is not possible, only GET, but integrity may still be affected as a result of stolen credentials or because GET requests can be state-changing, especially on internal APIs. For all these reasons, the Confidentiality, Integrity, and Availability metrics are set to H, L, L; the resulting score is not uncommon for SSRF vulnerabilities.
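
The targets named above (local services, hosts on the internal network, cloud instance metadata) all sit in well-known non-public address ranges. As an illustration of what a server-side check for such destinations might look like, here is a minimal sketch using only the Python standard library. The function names are hypothetical and this is not the fix shipped in langchain 0.2.10; it only shows the general idea of classifying a destination before fetching it.

```python
import ipaddress
import socket


def is_internal_ip(address: str) -> bool:
    """True if the address is loopback, private, link-local or otherwise non-public."""
    ip = ipaddress.ip_address(address)
    # Unwrap IPv4-mapped IPv6 addresses such as ::ffff:127.0.0.1.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return (
        ip.is_loopback        # 127.0.0.0/8, ::1
        or ip.is_private      # 10/8, 172.16/12, 192.168/16, ...
        or ip.is_link_local   # 169.254.0.0/16, typical for cloud metadata endpoints
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def host_resolves_internally(host: str) -> bool:
    """True if any address the host name resolves to is internal.

    Note: this performs a DNS lookup. A resolve-then-fetch check like this
    can still be bypassed by DNS rebinding unless the fetch reuses the
    exact address that was validated.
    """
    for info in socket.getaddrinfo(host, None):
        address = info[4][0].split("%", 1)[0]  # drop any IPv6 scope id
        if is_internal_ip(address):
            return True
    return False
```

A check like this has to be applied to every URL the server is about to request, not only to the URL that came back from the search results.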

PoC

Step 1: Launch the langchain Web Explorer https://github.com/langchain-ai/web-explorer/ (which implements a user interface for the Web Research Retriever)

Step 2: Edit an attacker-owned, Google-indexed website to include code like this:

<?php
// Check for the real IP if behind Cloudflare
$realIP = $_SERVER['HTTP_CF_CONNECTING_IP'] ?? $_SERVER['REMOTE_ADDR'];

if ($realIP === 'victim_ip') {
    // Redirect to the specified URL
    header('Location: http://localhost:4321/real');
    exit; // Ensure no further execution in case of redirect
}
?>

The example code is written for an attacker website that is behind Cloudflare. If the victim IP is not known, one can also use a range allowlist / denylist instead. The IP check is only there to prevent Google and other search engines from de-indexing the site (the Googlebot should not be redirected).

Step 3: Send a query in the Web Explorer that runs a search returning the attacker website in the search results. The exact query depends on the content of your specific site. I tried this briefly with a modified subsite of my own website to confirm the vulnerability (using my hardcoded IP), but took that change offline quickly for obvious reasons. The WebResearchRetriever opens all the websites in the first few search results and follows redirects, including our redirect to the local address. This is the SSRF vulnerability. We can also instruct the Web Explorer in our LLM query to print back the response content of the websites, which lets us read response data from local services.
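
The key point in Step 3 is that the search result URL itself is harmless and only the redirect target is internal, so validating the initial URL alone is not enough. The sketch below illustrates per-hop validation under stated assumptions: follow_redirects_checked and BlockedRedirect are hypothetical names, the HTTP layer is abstracted behind a next_location callable (which would issue a request with automatic redirects disabled and return the Location header, or None), and attacker.example is a placeholder host. It is not the code used by langchain.

```python
from typing import Callable, Optional
from urllib.parse import urljoin, urlsplit


class BlockedRedirect(Exception):
    """Raised when a hop in the redirect chain is not allowed."""


def follow_redirects_checked(
    url: str,
    next_location: Callable[[str], Optional[str]],
    is_allowed: Callable[[str], bool],
    max_hops: int = 5,
) -> str:
    """Follow redirects manually, validating the host of every hop.

    next_location(url) returns the Location header for url, or None
    when the response is not a redirect.
    """
    for _ in range(max_hops + 1):
        host = urlsplit(url).hostname
        if host is None or not is_allowed(host):
            raise BlockedRedirect(url)
        location = next_location(url)
        if location is None:
            return url  # final, validated destination
        url = urljoin(url, location)  # resolve relative redirects
    raise BlockedRedirect("too many redirects")


if __name__ == "__main__":
    # Simulated redirect table standing in for real HTTP responses.
    redirects = {"http://attacker.example/page": "http://localhost:4321/real"}
    allow = lambda host: host not in {"localhost", "127.0.0.1"}
    try:
        follow_redirects_checked("http://attacker.example/page", redirects.get, allow)
    except BlockedRedirect as exc:
        print("blocked:", exc)
```

In a real fetcher, is_allowed would typically resolve the host and reject internal addresses rather than compare against a fixed set of names.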

CVSS Base Scores

version 3.1

Attack Vector (AV): Network
Attack Complexity (AC): High
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Changed
Confidentiality (C): High
Integrity (I): None
Availability (A): None
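
To show how metrics like these combine into a single number, the following sketch implements the CVSS v3.1 base score equations. The metric weights and the rounding rule are taken from the public CVSS v3.1 specification; the function and variable names are my own, and the computed value is not quoted from this advisory page.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(av, ac, pr, ui, scope_changed, c, i, a) -> float:
    """Compute the CVSS v3.1 base score from single-letter metric values."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    if scope_changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
        pr_weight = PR_CHANGED[pr]
    else:
        impact = 6.42 * iss
        pr_weight = PR_UNCHANGED[pr]
    exploitability = 8.22 * AV[av] * AC[ac] * pr_weight * UI[ui]
    if impact <= 0:
        return 0.0
    if scope_changed:
        return roundup(min(1.08 * (impact + exploitability), 10))
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    # Metrics as listed above: AV:N/AC:H/PR:L/UI:N/S:C/C:H/I:N/A:N
    print(base_score("N", "H", "L", "N", True, "H", "N", "N"))
```

With the metrics listed above the function evaluates to 6.3. With Integrity and Availability set to Low instead, as described in the Note, it evaluates to 7.7.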

Server-Side Request Forgery (SSRF) Affecting langchain package, versions [,0.2.10)