scrapy@2.2.1 vulnerabilities

A high-level Web Crawling and Web Scraping framework

Direct Vulnerabilities

Known vulnerabilities in the scrapy package. This does not include vulnerabilities belonging to this package’s dependencies.

Vulnerability (severity: M = medium, H = high) and vulnerable versions

  • M: URL Redirection to Untrusted Site ('Open Redirect')

Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.

Affected versions of this package are vulnerable to URL Redirection to Untrusted Site ('Open Redirect') due to the improper handling of scheme-specific proxy settings during HTTP redirects. An attacker can potentially intercept sensitive information by exploiting the failure to switch proxies when redirected from HTTP to HTTPS URLs or vice versa.
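
Below is a minimal sketch, not taken from the advisory, of the affected configuration: scheme-specific proxies supplied through the standard environment variables that Scrapy's HttpProxyMiddleware reads at startup. The proxy URLs are hypothetical.

    import os

    # Hypothetical scheme-specific proxies, picked up by Scrapy's
    # HttpProxyMiddleware from the standard environment variables.
    os.environ["http_proxy"] = "http://plain-proxy.example:8080"
    os.environ["https_proxy"] = "http://tls-proxy.example:8443"

    # In affected versions (< 2.11.2), a request to an http:// URL that is
    # redirected to an https:// URL keeps using the http proxy instead of
    # switching to the https one (and vice versa), so traffic intended for
    # one proxy can be observed by the other.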

How to fix URL Redirection to Untrusted Site ('Open Redirect')?

Upgrade Scrapy to version 2.11.2 or higher.

Vulnerable versions: [,2.11.2)

  • M: Files or Directories Accessible to External Parties

Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.

Affected versions of this package are vulnerable to Files or Directories Accessible to External Parties via the DOWNLOAD_HANDLERS setting. An attacker can redirect traffic to unintended protocols such as file:// or s3://, potentially accessing sensitive data or credentials by manipulating the start URLs of a spider and observing the output.

Notes:

  1. HTTP redirects should only work between URLs that use the http:// or https:// schemes.

  2. A malicious actor, given write access to the start requests of a spider and read access to the spider output, could exploit this vulnerability to:

a) Redirect to any local file using the file:// scheme to read its contents.

b) Redirect to an ftp:// URL of a malicious FTP server to obtain the FTP username and password configured in the spider or project.

c) Redirect to any s3:// URL to read its content using the S3 credentials configured in the spider or project.

  3. A spider that always outputs the entire contents of a response would be completely vulnerable.

  4. A spider that extracted only fragments from the response could significantly limit vulnerable data.
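
For projects that cannot upgrade immediately, one possible mitigation sketch (an assumption, not part of the advisory) is to disable the non-HTTP download handlers in the project settings so a redirect to file://, ftp:// or s3:// cannot be followed:

    # settings.py (sketch): drop the non-HTTP download handlers. The scheme
    # keys below are Scrapy's defaults; assigning None to a scheme disables
    # its handler, so such URLs can no longer be downloaded.
    DOWNLOAD_HANDLERS = {
        "file": None,  # no file:// downloads
        "ftp": None,   # no ftp:// downloads
        "s3": None,    # no s3:// downloads
    }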

How to fix Files or Directories Accessible to External Parties?

Upgrade Scrapy to version 2.11.2 or higher.

Vulnerable versions: [,2.11.2)

  • M: Exposure of Sensitive Information to an Unauthorized Actor

Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.

Affected versions of this package are vulnerable to Exposure of Sensitive Information to an Unauthorized Actor due to improper handling of HTTP headers during cross-origin redirects. An attacker can intercept the Authorization header and potentially access sensitive information by exploiting this misconfiguration in redirect scenarios where the domain remains the same but the scheme or port changes.

Note: In the context of a man-in-the-middle attack, this could be used to get access to the value of that Authorization header.
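
A minimal sketch of the affected pattern, assuming a hypothetical site that redirects from HTTP to HTTPS on the same domain; the spider name and token are placeholders:

    import scrapy

    class TokenSpider(scrapy.Spider):
        name = "token-example"  # hypothetical spider

        def start_requests(self):
            yield scrapy.Request(
                "http://example.com/api",  # assume this redirects to https://example.com/api
                headers={"Authorization": "Bearer <token>"},
                callback=self.parse,
            )

        def parse(self, response):
            # In affected versions (< 2.11.2), the Authorization header is kept
            # on the redirected request even though the scheme changed, so a
            # man-in-the-middle on the plaintext hop can read the token.
            yield {"status": response.status}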

How to fix Exposure of Sensitive Information to an Unauthorized Actor?

Upgrade Scrapy to version 2.11.2 or higher.

Vulnerable versions: [,2.11.2)

  • H: Information Exposure Through Sent Data

Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.

Affected versions of this package are vulnerable to Information Exposure Through Sent Data due to the failure to remove the Authorization header when redirecting across domains. An attacker who obtains the exposed Authorization header can potentially hijack the associated account.
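
For projects that cannot upgrade right away, a possible mitigation sketch (an assumption, not an official Scrapy component) is a custom downloader middleware that strips the Authorization header from any request whose host is not the one the credentials were meant for; the trusted host name is hypothetical:

    from urllib.parse import urlparse

    class StripForeignAuthMiddleware:
        # Hypothetical host the credentials belong to.
        TRUSTED_HOST = "api.example.com"

        def process_request(self, request, spider):
            # Drop the Authorization header before any request to another host,
            # including redirected requests.
            if urlparse(request.url).hostname != self.TRUSTED_HOST:
                request.headers.pop(b"Authorization", None)

    # The middleware would be enabled through the DOWNLOADER_MIDDLEWARES setting.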

How to fix Information Exposure Through Sent Data?

Upgrade Scrapy to version 2.11.1 or higher.

Vulnerable versions: [,2.11.1)

  • H: Regular Expression Denial of Service (ReDoS)

Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.

Affected versions of this package are vulnerable to Regular Expression Denial of Service (ReDoS) when parsing content. A malicious response can cause extreme CPU and memory usage while it is being parsed.

How to fix Regular Expression Denial of Service (ReDoS)?

Upgrade Scrapy to version 2.11.1 or higher.

Vulnerable versions: [,2.11.1)

  • H: Improper Resource Shutdown or Release

Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.

Affected versions of this package are vulnerable to Improper Resource Shutdown or Release due to the enforcement of response size limits only during the download of raw, usually-compressed response bodies and not during decompression. A malicious website being scraped could send a small response that, upon decompression, could exhaust the memory available to the process, potentially affecting any other process sharing that memory, and affecting disk usage in case of uncompressed response caching.
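
For context, these are the size limits involved; the values shown are Scrapy's documented defaults. In affected versions they are only checked against the raw, usually compressed body, so a small compressed response can still expand far beyond them in memory:

    # settings.py (sketch): Scrapy's response size limits.
    DOWNLOAD_MAXSIZE = 1073741824   # 1 GiB hard limit (default)
    DOWNLOAD_WARNSIZE = 33554432    # 32 MiB warning threshold (default)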

How to fix Improper Resource Shutdown or Release?

Upgrade Scrapy to version 1.8.4, 2.11.1 or higher.

Vulnerable versions: [,1.8.4) [2.0.0,2.11.1)

  • H: Regular Expression Denial of Service (ReDoS)

Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.

Affected versions of this package are vulnerable to Regular Expression Denial of Service (ReDoS) via the XMLFeedSpider class or any subclass that uses the default node iterator, iternodes, as well as direct uses of the scrapy.utils.iterators.xmliter function. A malicious response can cause extreme CPU and memory usage while its content is parsed.

Note:

For versions 2.6.0 to 2.11.0, the vulnerable function is open_in_browser for a response without a base tag.
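
One possible mitigation sketch for the XMLFeedSpider case, assuming upgrading is not yet an option: switch from the default iternodes iterator to the xml iterator, which uses a different, lxml-based code path. The spider name, feed URL, and node tag are hypothetical.

    from scrapy.spiders import XMLFeedSpider

    class FeedSpider(XMLFeedSpider):
        name = "feed-example"                         # hypothetical spider
        start_urls = ["https://example.com/feed.xml"]
        iterator = "xml"                              # instead of the default "iternodes"
        itertag = "item"

        def parse_node(self, response, node):
            # Extract one field per <item> node in the feed.
            yield {"title": node.xpath("title/text()").get()}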

How to fix Regular Expression Denial of Service (ReDoS)?

Upgrade Scrapy to version 1.8.4, 2.11.1 or higher.

Vulnerable versions: [,1.8.4) [2.0.0,2.11.1)

  • H: Origin Validation Error

Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.

Affected versions of this package are vulnerable to Origin Validation Error due to the improper handling of the Authorization header during cross-domain redirects. An attacker can leak sensitive information by inducing the server to redirect a request with the Authorization header to a different domain.

How to fix Origin Validation Error?

Upgrade Scrapy to version 1.8.4, 2.11.1 or higher.

Vulnerable versions: [,1.8.4) [2.0.0,2.11.1)

  • M: Credential Exposure

Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.

Affected versions of this package are vulnerable to Credential Exposure via the process_request() function in downloadermiddlewares/httpproxy.py. Proxy credentials can leak to a different proxy if a third-party downloader middleware updates the proxy metadata of a request but leaves the stale Proxy-Authorization header unchanged.

NOTE: To fully mitigate this vulnerability, replacing or upgrading the affected third-party downloader middleware might be necessary after upgrading Scrapy.
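
A minimal sketch of the middleware pattern described above; the middleware name and proxy URL are hypothetical. When rotating the proxy for a request on affected versions, the stale Proxy-Authorization header must be dropped as well:

    class RotatingProxyMiddleware:
        """Hypothetical third-party-style downloader middleware that rotates proxies."""

        def process_request(self, request, spider):
            # Point the request at a different proxy.
            request.meta["proxy"] = "http://next-proxy.example:8080"
            # Without this, affected versions re-send the previous proxy's
            # credentials to the new proxy.
            request.headers.pop(b"Proxy-Authorization", None)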

How to fix Credential Exposure?

Upgrade Scrapy to version 1.8.3, 2.6.2 or higher.

Vulnerable versions: [,1.8.3) [2.0.0,2.6.2)

  • H: Information Exposure

Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.

Affected versions of this package are vulnerable to Information Exposure: responses from domain names whose public domain name suffix contains one or more periods (for example, example.co.uk, whose public suffix is co.uk) are able to set cookies that are included in requests to any other domain sharing the same domain name suffix.
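
If upgrading is not immediately possible, a blunt mitigation sketch (an assumption, not taken from the advisory text above) is to disable cookie handling entirely, which is only viable for crawls that do not need cookies:

    # settings.py (sketch): turn off the cookies middleware so no cross-suffix
    # cookie can be stored or sent by the crawl.
    COOKIES_ENABLED = False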

How to fix Information Exposure?

Upgrade Scrapy to version 1.8.2, 2.6.0 or higher.

Vulnerable versions: [,1.8.2) [2.0.0,2.6.0)

  • M: Information Exposure

Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.

Affected versions of this package are vulnerable to Information Exposure: a spider could leak Cookie headers when a request is redirected to a third-party, potentially attacker-controlled, website.

How to fix Information Exposure?

Upgrade Scrapy to version 2.6.0 or higher.

Vulnerable versions: [,2.6.0)

  • M: Information Exposure

Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.

Affected versions of this package are vulnerable to Information Exposure. If you use HttpAuthMiddleware (i.e. the http_user and http_pass spider attributes) for HTTP authentication, all requests will expose your credentials to the request target. This includes requests generated by Scrapy components, such as the robots.txt requests sent by Scrapy when the ROBOTSTXT_OBEY setting is set to True, or requests reached through redirects.
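
After upgrading, the fixed usage looks roughly like the sketch below: alongside http_user and http_pass, the spider sets http_auth_domain so the credentials are only sent to that domain. The spider name, domain, and credentials are placeholders.

    import scrapy

    class IntranetSpider(scrapy.Spider):
        name = "intranet-example"                   # hypothetical spider
        http_user = "user"                          # placeholder credentials
        http_pass = "pass"
        http_auth_domain = "intranet.example.com"   # only send credentials to this domain
        start_urls = ["https://intranet.example.com/"]

        def parse(self, response):
            yield {"status": response.status}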

How to fix Information Exposure?

Upgrade Scrapy to version 1.8.1, 2.5.1 or higher.

Vulnerable versions: [,1.8.1) [2.0.0,2.5.1)

  • M: Denial of Service (DoS)

Affected versions of this package are vulnerable to Denial of Service (DoS) via S3FilesStore. Files are stored in memory before being uploaded to S3, increasing memory usage when very large files, or many files, are uploaded at the same time.
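
For reference, a minimal sketch of the configuration that routes the files pipeline through S3FilesStore; the bucket path is hypothetical. With this store, each file body is held fully in memory until its upload to S3 completes.

    # settings.py (sketch): store FilesPipeline output in S3 via S3FilesStore.
    ITEM_PIPELINES = {"scrapy.pipelines.files.FilesPipeline": 1}
    FILES_STORE = "s3://my-bucket/scraped-files/"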

Vulnerable versions: [0,)