Server-Side Request Forgery (SSRF) Affecting langchain package, versions [,0.2.10)


Severity

Medium (recommended)

CVSS assessment made by Snyk's Security Team

Threat Intelligence
  • Exploit Maturity: Proof of concept
  • EPSS: 0.05% (20th percentile)

  • Snyk ID SNYK-PYTHON-LANGCHAIN-7217837
  • published 7 Jun 2024
  • disclosed 6 Jun 2024
  • credit Elias Hohl

How to fix?

Upgrade langchain to version 0.2.10 or higher.
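
To check whether an environment still falls in the affected range, something like the following sketch could be used. It assumes the installed distribution is named langchain and uses only the Python standard library; the helper names (release_tuple, is_affected, installed_langchain_affected) are illustrative and not part of langchain. Pre-release suffixes such as "rc1" are ignored, so a hypothetical "0.2.10rc1" would be treated like "0.2.10".

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# First fixed release according to this advisory.
FIXED_RELEASE = (0, 2, 10)


def release_tuple(version_string: str) -> tuple:
    """Return the leading numeric components of a version, e.g. '0.2.9' -> (0, 2, 9)."""
    parts = []
    for component in version_string.split("."):
        match = re.match(r"\d+", component)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_affected(version_string: str) -> bool:
    """True if the version is below the first fixed release."""
    return release_tuple(version_string) < FIXED_RELEASE


def installed_langchain_affected() -> Optional[bool]:
    """Check the installed langchain; None means it is not installed."""
    try:
        return is_affected(version("langchain"))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_langchain_affected())
```

If this reports True, upgrading with pip install --upgrade "langchain>=0.2.10" should move the environment out of the affected range.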

Overview

langchain is a package for building applications with LLMs through composability.

Affected versions of this package are vulnerable to Server-Side Request Forgery (SSRF) through the Web Research Retriever component. An attacker can execute port scans, access local services, and potentially read instance metadata from cloud environments by sending crafted requests to the server.

Note: This SSRF vulnerability makes it possible to scan ports, abuse the Web Explorer server as a proxy for attacks on third parties, and interact with servers in the local network, including reading their response data, which may allow extraction of instance metadata in a cloud environment. The consequences of interacting with local services depend heavily on the nature of those services. Services with admin privileges are regularly exposed locally on servers, so the consequences can go all the way up to arbitrary code execution. Only GET requests can be sent, not POST, but integrity may still be affected through stolen credentials or because GET requests can be state-changing, especially on internal APIs. For all these reasons, the Confidentiality, Integrity, and Availability metrics are set to H, L, L; the resulting score is not uncommon for SSRF vulnerabilities.
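
The target categories mentioned in the note (local services, the internal network, cloud instance metadata) correspond to well-known address ranges. As a rough illustration, the sketch below classifies an IP literal with the standard ipaddress module. The function name classify_target and the labels it returns are assumptions made for this example; 169.254.169.254 is the link-local address commonly used by cloud metadata services.

```python
import ipaddress


def classify_target(ip_literal: str) -> str:
    """Label an IP literal by the kind of SSRF target it represents."""
    ip = ipaddress.ip_address(ip_literal)
    if ip.is_loopback:
        return "loopback"    # services bound to the server itself
    if ip.is_link_local:
        return "link-local"  # includes 169.254.169.254 (cloud metadata)
    if ip.is_private:
        return "private"     # other hosts on the internal network
    return "public"


if __name__ == "__main__":
    for target in ("127.0.0.1", "169.254.169.254", "10.0.0.5", "8.8.8.8"):
        print(target, classify_target(target))
```

A check like this only covers IP literals; hostnames would first have to be resolved, and every resolved address checked.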

PoC

Step 1: Launch the langchain Web Explorer https://github.com/langchain-ai/web-explorer/ (which implements a user interface for the Web Research Retriever)

Step 2: Edit an attacker-owned, Google-indexed website to serve code like this:

<?php
// Check for the real IP if behind Cloudflare
$realIP = $_SERVER['HTTP_CF_CONNECTING_IP'] ?? $_SERVER['REMOTE_ADDR'];

if ($realIP === 'victim_ip') {
    // Redirect to the specified URL
    header('Location: http://localhost:4321/real');
    exit; // Ensure no further execution in case of redirect
}
?>

The example code is written for an attacker website that is behind Cloudflare. If the victim IP is not known, one can also use a range allowlist / denylist instead. The IP check is only there to prevent Google and other search engines from de-indexing the site (the Googlebot should not be redirected).

Step 3: Send a query in the Web Explorer that runs a search returning the attacker website in the search results. The exact query depends on the content of your specific site. I tried this briefly with a modified subpage of my own website to confirm the vulnerability (using my hardcoded IP), but took the change offline quickly for obvious reasons. The WebResearchRetriever opens all the websites in the first few search results and follows redirects, including our redirect to the local IP address. This is the SSRF vulnerability. We can also instruct the Web Explorer, in our LLM query, to print back the response content of the websites, which lets us read response data from local services.
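
Because the attack hinges on a redirect being followed to a local address, one possible mitigation is to validate every redirect target before following it. The following is a minimal sketch using only the Python standard library. It is not the patch shipped in langchain 0.2.10, and the names resolves_to_internal, NoInternalRedirects, and safe_get are hypothetical. It also does not defend against DNS rebinding, since the address is resolved again when the connection is actually made.

```python
import ipaddress
import socket
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def resolves_to_internal(url: str) -> bool:
    """True if the URL's host resolves to any address that is not globally routable."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # no host to check, so refuse
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return True  # unresolvable, so refuse
    for info in infos:
        address = info[4][0].split("%")[0]  # drop any IPv6 scope id
        if not ipaddress.ip_address(address).is_global:
            return True
    return False


class NoInternalRedirects(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses redirects to internal addresses."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if resolves_to_internal(newurl):
            raise urllib.error.URLError(f"blocked redirect to {newurl}")
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def safe_get(url: str, timeout: float = 10.0) -> bytes:
    """Fetch a URL, checking the initial target and every redirect hop."""
    if resolves_to_internal(url):
        raise urllib.error.URLError(f"blocked request to {url}")
    opener = urllib.request.build_opener(NoInternalRedirects())
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

With this handler, the redirect to http://localhost:4321/real from Step 2 would raise an error instead of being fetched.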

CVSS Scores

version 3.1

Snyk (recommended): 6.3 medium
  • Attack Vector (AV): Network
  • Attack Complexity (AC): High
  • Privileges Required (PR): Low
  • User Interaction (UI): None
  • Scope (S): Changed
  • Confidentiality (C): High
  • Integrity (I): None
  • Availability (A): None
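
The Snyk score of 6.3 can be reproduced from the metrics listed above. The sketch below implements the CVSS v3.1 base score equations with the metric weights from the CVSS v3.1 specification; the function and variable names are chosen for this example only.

```python
import math

# CVSS v3.1 metric weights.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(av, ac, pr, ui, scope_changed, c, i, a) -> float:
    """Compute the CVSS v3.1 base score from single-letter metric values."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    if scope_changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
        pr_weight = PR_CHANGED[pr]
    else:
        impact = 6.42 * iss
        pr_weight = PR_UNCHANGED[pr]
    exploitability = 8.22 * AV[av] * AC[ac] * pr_weight * UI[ui]
    if impact <= 0:
        return 0.0
    if scope_changed:
        return roundup(min(1.08 * (impact + exploitability), 10))
    return roundup(min(impact + exploitability, 10))


# AV:N / AC:H / PR:L / UI:N / S:C / C:H / I:N / A:N
print(base_score("N", "H", "L", "N", True, "H", "N", "N"))  # 6.3
```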

NVD: 7.7 high