Insecure Storage of Sensitive Information in scikit-learn | CVE-2024-5206

Q: How to fix?

There is no fixed version for Debian:13 scikit-learn.

Threat Intelligence

0.03% (7th percentile)

Snyk ID: SNYK-DEBIAN13-SCIKITLEARN-7219034
Published: 7 Jun 2024
Disclosed: 6 Jun 2024

Introduced: 6 Jun 2024

CVE-2024-5206, CWE-922

NVD Description

Note: Versions mentioned in the description apply only to the upstream scikit-learn package and not the scikit-learn package as distributed by Debian. See How to fix? for Debian:13 relevant fixed versions and status.

A sensitive data leakage vulnerability was identified in scikit-learn's TfidfVectorizer, specifically in versions up to and including 1.4.1.post1, which was fixed in version 1.5.0. The vulnerability arises from the unexpected storage of all tokens present in the training data within the stop_words_ attribute, rather than only storing the subset of tokens required for the TF-IDF technique to function. This behavior leads to the potential leakage of sensitive information, as the stop_words_ attribute could contain tokens that were meant to be discarded and not stored, such as passwords or keys. The impact of this vulnerability varies based on the nature of the data being processed by the vectorizer.
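Where upgrading is not yet possible, one possible mitigation is to drop the stop_words_ attribute from a fitted vectorizer before it is pickled or shared. The sketch below is only an illustration: scrub_discarded_tokens is a hypothetical helper name, not part of scikit-learn, and it assumes stop_words_ is an ordinary instance attribute set during fit, as in the affected versions. The scikit-learn documentation for those versions describes the attribute as introspection-only and safe to remove with delattr before pickling.

```python
def scrub_discarded_tokens(vectorizer):
    """Delete the introspection-only ``stop_words_`` attribute, if present.

    ``scrub_discarded_tokens`` is a hypothetical helper, not a scikit-learn API.
    It returns the number of discarded tokens that were removed, or 0 when the
    attribute is absent (for example on scikit-learn 1.5.0 or later).

    Intended use, assuming an affected scikit-learn version:

        vec = TfidfVectorizer(min_df=2).fit(corpus)
        scrub_discarded_tokens(vec)   # before pickle.dumps(vec)
    """
    discarded = getattr(vectorizer, "stop_words_", None)
    if discarded is None:
        return 0
    count = len(discarded)
    # Removing the attribute leaves vocabulary_ and the learned idf weights
    # untouched; only the set of discarded tokens is dropped.
    delattr(vectorizer, "stop_words_")
    return count
```

Calling the helper a second time is harmless and returns 0. It does not remove tokens kept in vocabulary_, so sensitive strings that pass the min_df, max_df, and max_features filters remain in the fitted object.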

CVSS Base Scores

version 3.1

Attack Vector (AV): Local
Attack Complexity (AC): High
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): High
Integrity (I): None
Availability (A): None
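
The metrics above can be combined into a numeric base score using the CVSS v3.1 base-score equations. The following is a minimal sketch that covers only the Scope: Unchanged case; the function and variable names are illustrative, and the numeric weights are the ones published in the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged only).
AV = {"Network": 0.85, "Adjacent": 0.62, "Local": 0.55, "Physical": 0.2}
AC = {"Low": 0.77, "High": 0.44}
PR = {"None": 0.85, "Low": 0.62, "High": 0.27}
UI = {"None": 0.85, "Required": 0.62}
CIA = {"High": 0.56, "Low": 0.22, "None": 0.0}


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a):
    """Compute the CVSS v3.1 base score for a vector with Scope: Unchanged."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# Metrics listed for CVE-2024-5206 on this page.
score = base_score_scope_unchanged(
    "Local", "High", "Low", "None", "High", "None", "None"
)
print(score)  # 4.7
```

Under these equations the listed metrics yield 4.7, which falls in the Medium band (4.0 to 6.9) of the CVSS v3.1 qualitative severity scale.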

Insecure Storage of Sensitive Information Affecting scikit-learn package, versions *

Severity