Insecure Storage of Sensitive Information Affecting scikit-learn package, versions *


Severity

Medium

Snyk's Security Team recommends NVD's CVSS assessment.

Threat Intelligence

EPSS
0.04% (12th percentile)

  • Snyk ID: SNYK-DEBIAN13-SCIKITLEARN-7219034
  • Published: 7 Jun 2024
  • Disclosed: 6 Jun 2024

Introduced: 6 Jun 2024

CVE-2024-5206
CWE-922

How to fix?

There is no fixed version for Debian:13 scikit-learn.

NVD Description

Note: Versions mentioned in the description apply only to the upstream scikit-learn package and not the scikit-learn package as distributed by Debian. See How to fix? for Debian:13 relevant fixed versions and status.

A sensitive data leakage vulnerability was identified in scikit-learn's TfidfVectorizer, affecting versions up to and including 1.4.1.post1; it was fixed in version 1.5.0. The vulnerability arises because the stop_words_ attribute unexpectedly stores all tokens present in the training data, rather than only the subset of tokens required for the TF-IDF technique to function. As a result, stop_words_ could leak sensitive information, since it may contain tokens that were meant to be discarded and not stored, such as passwords or keys. The impact of this vulnerability varies with the nature of the data processed by the vectorizer.
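
Where upgrading is not yet possible, one possible workaround is to strip the stop_words_ attribute from a fitted vectorizer before it is pickled or shared. The following is a minimal sketch, not an official fix: the helper name scrub_discarded_tokens is hypothetical, and the only thing it relies on is the attribute name stop_words_ from the description above. It assumes the attribute is not needed at transform time, which matches how scikit-learn's documentation for the affected releases described it (for introspection only).

```python
def scrub_discarded_tokens(vectorizer):
    """Delete the fitted ``stop_words_`` attribute from a vectorizer, if present.

    Hypothetical helper for illustration. Returns the number of tokens that
    were held in ``stop_words_`` so the caller can log how much was removed.
    """
    discarded = getattr(vectorizer, "stop_words_", None)
    if discarded is None:
        # Attribute missing (e.g. a fixed release) or already cleared.
        return 0

    count = len(discarded)
    # Remove the attribute so it is not serialized along with the model.
    delattr(vectorizer, "stop_words_")
    return count
```

The helper is meant to be called on a fitted TfidfVectorizer (or CountVectorizer) immediately before serializing it, for example before pickle.dump. Because it tolerates a missing attribute, it should be harmless on versions where the attribute no longer exists.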

CVSS Scores

version 3.1