scrapy@1.1.0 vulnerabilities

A high-level Web Crawling and Web Scraping framework

Direct Vulnerabilities

Known vulnerabilities in the scrapy package. This does not include vulnerabilities belonging to this package’s dependencies.

Vulnerability Vulnerable Version
  • H
Regular Expression Denial of Service (ReDoS)

Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.

Affected versions of this package are vulnerable to Regular Expression Denial of Service (ReDoS) when parsing content. An attacker can cause extreme CPU and memory usage by serving a malicious response that Scrapy then parses.

How to fix Regular Expression Denial of Service (ReDoS)?

Upgrade Scrapy to version 2.11.1 or higher.

[,2.11.1)
  • H
Improper Resource Shutdown or Release

Affected versions of this package are vulnerable to Improper Resource Shutdown or Release due to the enforcement of response size limits only during the download of raw, usually-compressed response bodies and not during decompression. A malicious website being scraped could send a small response that, upon decompression, could exhaust the memory available to the process, potentially affecting any other process sharing that memory, and affecting disk usage in case of uncompressed response caching.

How to fix Improper Resource Shutdown or Release?

Upgrade Scrapy to version 1.8.4, 2.11.1 or higher.

[,1.8.4) [2.0.0,2.11.1)
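
The decompression issue described above can be illustrated with a minimal sketch that enforces a size limit on the decompressed output rather than on the compressed input. This is not Scrapy's implementation; `bounded_gunzip`, `DecompressedSizeExceeded`, and the `chunk_size` default are hypothetical names and values chosen for illustration, and only gzip bodies are handled.

```python
import gzip
import zlib


class DecompressedSizeExceeded(Exception):
    """Raised when decompressed output would exceed the allowed size."""


def bounded_gunzip(data, max_size, chunk_size=65536):
    """Decompress gzip bytes, refusing to produce more than max_size bytes."""
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)
    output = bytearray()
    pending = data
    while not decompressor.eof:
        # Ask for at most one byte over the limit, so an overflow is
        # detected without buffering much more than max_size bytes.
        budget = min(chunk_size, max_size - len(output) + 1)
        chunk = decompressor.decompress(pending, budget)
        output += chunk
        if len(output) > max_size:
            raise DecompressedSizeExceeded(f"output exceeds {max_size} bytes")
        # Input that was not consumed because of the output budget.
        pending = decompressor.unconsumed_tail
        if not chunk and not pending and not decompressor.eof:
            raise ValueError("truncated or incomplete gzip data")
    return bytes(output)


# A small compressed body that expands to 10,000,000 bytes.
bomb = gzip.compress(b"\0" * 10_000_000)
try:
    bounded_gunzip(bomb, max_size=1_000_000)
except DecompressedSizeExceeded as error:
    print("rejected:", error)
```

Because the limit is checked after every bounded `decompress` call, the process never holds more than roughly `max_size` bytes of output, however small the compressed body is.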
  • H
Regular Expression Denial of Service (ReDoS)

Affected versions of this package are vulnerable to Regular Expression Denial of Service (ReDoS) via the XMLFeedSpider class or any subclass that uses the default node iterator iternodes, as well as direct uses of the scrapy.utils.iterators.xmliter function. An attacker can cause extreme CPU and memory usage by serving a malicious response whose content Scrapy then parses.

Note:

For versions 2.6.0 to 2.11.0, the vulnerable function is open_in_browser for a response without a base tag.

How to fix Regular Expression Denial of Service (ReDoS)?

Upgrade Scrapy to version 1.8.4, 2.11.1 or higher.

[,1.8.4) [2.0.0,2.11.1)
  • H
Origin Validation Error

Affected versions of this package are vulnerable to Origin Validation Error due to the improper handling of the Authorization header during cross-domain redirects. An attacker can leak sensitive information by inducing the server to redirect a request with the Authorization header to a different domain.

How to fix Origin Validation Error?

Upgrade Scrapy to version 1.8.4, 2.11.1 or higher.

[,1.8.4) [2.0.0,2.11.1)
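
The redirect behaviour described above can be sketched as a small helper that drops the Authorization header whenever a redirect leaves the original host. This is an illustration under simplifying assumptions, not Scrapy's redirect middleware: `headers_for_redirect` is a hypothetical function, headers are modelled as a plain dict, and only hostnames are compared (a stricter policy might also compare scheme and port).

```python
from urllib.parse import urlsplit


def headers_for_redirect(headers, source_url, target_url):
    """Return the headers to send to target_url after a redirect.

    The Authorization header is kept only when the redirect stays on
    the same host as the original request.
    """
    source_host = (urlsplit(source_url).hostname or "").lower()
    target_host = (urlsplit(target_url).hostname or "").lower()
    if source_host == target_host:
        return dict(headers)
    # Header names are case-insensitive, so compare in lower case.
    return {
        name: value
        for name, value in headers.items()
        if name.lower() != "authorization"
    }


original = {"Authorization": "Basic dXNlcjpwYXNz", "Accept": "text/html"}
print(headers_for_redirect(original, "https://example.com/a", "https://attacker.test/b"))
```

The helper returns a new dict in both branches, so the headers of the original request are never modified.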
  • M
Credential Exposure

Affected versions of this package are vulnerable to Credential Exposure via the process_request() function in downloadermiddlewares/httpproxy.py. Credentials for one proxy can leak to another proxy if third-party downloader middlewares leave the Proxy-Authorization header unchanged when updating proxy metadata for a new request.

NOTE: To fully mitigate the effects of this vulnerability, replacing or upgrading the third-party downloader middleware might be necessary after upgrading.

How to fix Credential Exposure?

Upgrade Scrapy to version 1.8.3, 2.6.2 or higher.

[,1.8.3) [2.0.0,2.6.2)
  • H
Information Exposure

Affected versions of this package are vulnerable to Information Exposure. Responses from domain names whose public domain name suffix contains one or more periods (for example, co.uk) can set cookies that are included in requests to any other domain sharing the same domain name suffix.

How to fix Information Exposure?

Upgrade Scrapy to version 1.8.2, 2.6.0 or higher.

[,1.8.2) [2.0.0,2.6.0)
  • M
Information Exposure

Affected versions of this package are vulnerable to Information Exposure: a spider could leak Cookie headers when a request is redirected to a third-party, potentially attacker-controlled website.

How to fix Information Exposure?

Upgrade Scrapy to version 2.6.0 or higher.

[,2.6.0)
  • M
Information Exposure

Affected versions of this package are vulnerable to Information Exposure. If you use HttpAuthMiddleware (i.e. the http_user and http_pass spider attributes) for HTTP authentication, all requests will expose your credentials to the request target. This includes requests generated by Scrapy components, such as robots.txt requests sent by Scrapy when the ROBOTSTXT_OBEY setting is set to True, and requests reached through redirects.

How to fix Information Exposure?

Upgrade Scrapy to version 1.8.1, 2.5.1 or higher.

[,1.8.1) [2.0.0,2.5.1)
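
The exposure described above comes from attaching credentials to every request regardless of its target. A minimal sketch of the opposite policy, sending credentials only to one configured domain and its subdomains, is shown below. The function `should_send_credentials` and its `auth_domain` parameter are hypothetical names for illustration. Fixed Scrapy releases scope credentials through an `http_auth_domain` spider attribute, but this standalone function is not Scrapy's code.

```python
from urllib.parse import urlsplit


def should_send_credentials(url, auth_domain):
    """Return True only if url targets auth_domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    auth_domain = auth_domain.lower()
    # The leading dot prevents "notexample.com" from matching "example.com".
    return host == auth_domain or host.endswith("." + auth_domain)


for url in (
    "https://example.com/login",
    "https://api.example.com/items",
    "https://third-party.test/robots.txt",
):
    print(url, should_send_credentials(url, "example.com"))
```

With a check like this, a robots.txt request or a redirect that lands on another domain would be sent without the HTTP authentication credentials.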
  • M
Denial of Service (DoS)

Affected versions of this package are vulnerable to Denial of Service (DoS) via S3FilesStore. Files are held in memory before being uploaded to S3, which increases memory usage when very large files or many files are uploaded at the same time.

[0,)
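
The Vulnerable Version values above use interval notation: a square bracket includes the bound, a parenthesis excludes it, and an empty bound means unbounded. A minimal sketch of checking an installed version against such ranges follows. It assumes purely numeric dotted versions with the same number of components (such as 1.1.0), and the helper names `parse_version`, `in_range`, and `is_affected` are hypothetical.

```python
def parse_version(text):
    """Turn a dotted release string such as "2.11.1" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, interval):
    """Check a version against one interval such as "[2.0.0,2.11.1)"."""
    lower_closed = interval[0] == "["
    upper_closed = interval[-1] == "]"
    lower_text, upper_text = interval[1:-1].split(",")
    v = parse_version(version)
    if lower_text:
        lower = parse_version(lower_text)
        if v < lower or (v == lower and not lower_closed):
            return False
    if upper_text:
        upper = parse_version(upper_text)
        if v > upper or (v == upper and not upper_closed):
            return False
    return True


def is_affected(version, ranges):
    """Return True if version falls inside any space-separated interval."""
    return any(in_range(version, interval) for interval in ranges.split())


print(is_affected("1.1.0", "[,1.8.4) [2.0.0,2.11.1)"))
print(is_affected("2.11.1", "[,1.8.4) [2.0.0,2.11.1)"))
```

Pre-release or post-release version strings would need a real version parser; the tuple comparison here only covers plain numeric releases.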