Files or Directories Accessible to External Parties Affecting scrapy package, versions [,2.11.2)
- Snyk ID SNYK-PYTHON-SCRAPY-6841836
- published 15 May 2024
- disclosed 14 May 2024
- credit Marcos Santos
How to fix?
Upgrade Scrapy to version 2.11.2 or higher.
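To check whether an environment is affected, the installed version can be compared against 2.11.2. The following is a minimal sketch: the helper names `parse_version` and `is_patched` are hypothetical, and the parser only understands plain dotted numeric versions (anything from the first non-numeric part onward is ignored, so a pre-release of 2.11.2 is conservatively reported as unpatched).

```python
from importlib.metadata import PackageNotFoundError, version

FIXED_VERSION = (2, 11, 2)  # first release listed as fixed in this advisory


def parse_version(text):
    """Turn '2.11.1' into (2, 11, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_patched(installed):
    """Return True if the dotted version string is 2.11.2 or higher."""
    return parse_version(installed) >= FIXED_VERSION


if __name__ == "__main__":
    try:
        installed = version("scrapy")
    except PackageNotFoundError:
        print("Scrapy is not installed in this environment")
    else:
        status = "patched" if is_patched(installed) else "vulnerable"
        print(f"Scrapy {installed}: {status}")
```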
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Files or Directories Accessible to External Parties via the DOWNLOAD_HANDLERS setting. An attacker can redirect traffic to unintended protocols such as file:// or s3://, potentially accessing sensitive data or credentials by manipulating the start URLs of a spider and observing the output.
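Scrapy allows a project to disable a download handler by mapping its URI scheme to None in DOWNLOAD_HANDLERS. As a hardening sketch only (an assumption offered here, not a fix named in this advisory, and not a substitute for upgrading), a project that never needs file://, ftp://, or s3:// downloads could switch those handlers off in its settings module:

```python
# settings.py (fragment)
# Assumption: this project has no legitimate use for these schemes.
# Mapping a scheme to None disables the corresponding download handler.
DOWNLOAD_HANDLERS = {
    "file": None,
    "ftp": None,
    "s3": None,
}
```

Any scheme the project does rely on should be left out of this mapping so its default handler stays active.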
Notes:
HTTP redirects should only work between URLs that use the http:// or https:// schemes.
A malicious actor, given write access to the start requests of a spider and read access to the spider output, could exploit this vulnerability to:
a) Redirect to any local file using the file:// scheme to read its contents.
b) Redirect to an ftp:// URL of a malicious FTP server to obtain the FTP username and password configured in the spider or project.
c) Redirect to any s3:// URL to read its content using the S3 credentials configured in the spider or project.
A spider that always outputs the entire contents of a response would be completely vulnerable.
A spider that extracts only fragments of the response could significantly limit the data exposed.
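The rule stated in the notes, that redirects should stay within http:// and https://, can be illustrated with a small standalone check. This is a sketch of the idea only, not Scrapy's actual patch; `ALLOWED_REDIRECT_SCHEMES` and `is_redirect_allowed` are hypothetical names, and the example URLs are placeholders.

```python
from urllib.parse import urljoin, urlsplit

ALLOWED_REDIRECT_SCHEMES = {"http", "https"}


def is_redirect_allowed(request_url, location):
    """Return True only if both the request URL and the redirect target use http or https."""
    # Resolve relative Location values against the URL that was requested.
    target = urljoin(request_url, location)
    source_scheme = urlsplit(request_url).scheme.lower()
    target_scheme = urlsplit(target).scheme.lower()
    return (
        source_scheme in ALLOWED_REDIRECT_SCHEMES
        and target_scheme in ALLOWED_REDIRECT_SCHEMES
    )


if __name__ == "__main__":
    for location in ("/next", "file:///etc/passwd", "ftp://ftp.example/x", "s3://bucket/key"):
        print(location, is_redirect_allowed("https://example.com/start", location))
```

With this check, a relative redirect such as /next is accepted, while the file://, ftp://, and s3:// targets from cases a) to c) are rejected.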