Deadlock Affecting rtla package, versions <0:5.14.0-427.33.1.el9_4


Severity

Recommended
high

Based on Red Hat Enterprise Linux security rating.

Threat Intelligence

EPSS
0.04% (6th percentile)

  • Snyk ID: SNYK-RHEL9-RTLA-8467202
  • Published: 5 Dec 2024
  • Disclosed: 5 Jul 2024

Introduced: 5 Jul 2024

CVE-2024-39476
CWE-833

How to fix?

Upgrade RHEL:9 rtla to version 0:5.14.0-427.33.1.el9_4 or higher.
This issue was patched in RHSA-2024:5928.
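
As a rough illustration of the version check implied above, the following Python sketch compares an installed rtla version string against the fixed version. It is a simplified segment-by-segment comparison, not the full RPM version-comparison algorithm, and the helper names (`compare_evr`, `is_fixed`) are hypothetical. On a real system, the package manager's own comparison should be preferred.

```python
import re

FIXED_EVR = "0:5.14.0-427.33.1.el9_4"  # fixed version quoted in this advisory


def _split_epoch(evr):
    # "0:5.14.0-..." -> (0, "5.14.0-..."); a missing epoch is treated as 0.
    epoch, sep, rest = evr.partition(":")
    return (int(epoch), rest) if sep else (0, evr)


def _segments(text):
    # Maximal runs of digits or letters; separators such as '.', '-', '_' are dropped.
    return re.findall(r"\d+|[A-Za-z]+", text)


def compare_evr(a, b):
    """Return -1, 0 or 1. Simplified: version and release are compared as one string."""
    (epoch_a, rest_a), (epoch_b, rest_b) = _split_epoch(a), _split_epoch(b)
    if epoch_a != epoch_b:
        return -1 if epoch_a < epoch_b else 1
    seg_a, seg_b = _segments(rest_a), _segments(rest_b)
    for x, y in zip(seg_a, seg_b):
        x_num, y_num = x.isdigit(), y.isdigit()
        if x_num and y_num:
            if int(x) != int(y):
                return -1 if int(x) < int(y) else 1
        elif x_num != y_num:
            # Assumption: a numeric segment counts as newer than an alphabetic one.
            return 1 if x_num else -1
        elif x != y:
            return -1 if x < y else 1
    if len(seg_a) == len(seg_b):
        return 0
    return -1 if len(seg_a) < len(seg_b) else 1


def is_fixed(installed_evr, fixed_evr=FIXED_EVR):
    # True when the installed version is at or above the fixed version.
    return compare_evr(installed_evr, fixed_evr) >= 0
```

For example, `is_fixed("5.14.0-427.31.1.el9_4")` evaluates to `False`, while `is_fixed("0:5.14.0-427.33.1.el9_4")` evaluates to `True`. The first version string is only an illustrative input.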

NVD Description

Note: Versions mentioned in the description apply only to the upstream rtla package and not the rtla package as distributed by RHEL. See How to fix? for RHEL:9 relevant fixed versions and status.

In the Linux kernel, the following vulnerability has been resolved:

md/raid5: fix deadlock that raid5d() wait for itself to clear MD_SB_CHANGE_PENDING

Xiao reported that the lvm2 test lvconvert-raid-takeover.sh can occasionally hang; the root cause is exactly the same as in commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"").

However, Dan reported another hang after that. Junxiao investigated the problem and found that it is caused by plugged bios that cannot be issued from raid5d().

The current implementation of raid5d() has an odd dependency:

  1. md_check_recovery(), called from raid5d(), must hold 'reconfig_mutex' to clear MD_SB_CHANGE_PENDING;
  2. raid5d() handles IO in a busy loop until all IO is issued;
  3. IO from raid5d() must wait for MD_SB_CHANGE_PENDING to be cleared.

This behaviour was introduced before v2.6. As a consequence, if another context holds 'reconfig_mutex' and md_check_recovery() cannot update the superblock, raid5d() spins in this loop and keeps one CPU at 100% until 'reconfig_mutex' is released.

Following the implementation in raid1 and raid10, fix this problem by skipping IO submission if MD_SB_CHANGE_PENDING is still set after md_check_recovery(); the daemon thread will be woken up when 'reconfig_mutex' is released. This also fixes the hang.
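
The dependency and the fix described above can be illustrated with a small toy model in Python. This is not kernel code: `ToyArray`, `check_recovery`, `daemon_old` and `daemon_new` are hypothetical names, and the `max_spins` cap exists only so the old behaviour terminates in the model.

```python
from dataclasses import dataclass


@dataclass
class ToyArray:
    sb_change_pending: bool = True    # stands in for MD_SB_CHANGE_PENDING
    reconfig_mutex_held: bool = True  # another context holds 'reconfig_mutex'
    pending_io: int = 3               # IO the daemon still has to issue


def check_recovery(arr):
    # Stand-in for md_check_recovery(): the flag is cleared only if the mutex is free.
    if not arr.reconfig_mutex_held:
        arr.sb_change_pending = False


def daemon_old(arr, max_spins=1000):
    # Old behaviour: loop until all IO is issued, even if none can be issued.
    spins = 0
    while arr.pending_io > 0 and spins < max_spins:
        spins += 1
        check_recovery(arr)
        if arr.sb_change_pending:
            continue  # IO must wait for the flag, so the loop just spins
        arr.pending_io -= 1
    return spins


def daemon_new(arr):
    # Fixed behaviour: stop the pass if the flag is still set after the check.
    spins = 0
    while arr.pending_io > 0:
        spins += 1
        check_recovery(arr)
        if arr.sb_change_pending:
            break  # skip issuing IO; the daemon is woken up again later
        arr.pending_io -= 1
    return spins
```

While the mutex is held, `daemon_old` burns its whole spin budget without issuing any IO, whereas `daemon_new` returns after a single pass. Once `reconfig_mutex_held` is set to `False` and `daemon_new` runs again (modelling the wake-up), the pending IO drains normally.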

CVSS Scores

version 3.1