The probability is the direct output of the EPSS model, and conveys an overall sense of the threat of exploitation in the wild. The percentile measures the EPSS probability relative to all known EPSS scores. Note: This data is updated daily, relying on the latest available EPSS model version. Check out the EPSS documentation for more details.
In a few clicks we can analyze your entire application and see what components are vulnerable in your application, and suggest you quick fixes.
Test your applicationsUpgrade Amazon-Linux:2023 perf6.18-debuginfo to version 1:6.18.20-20.229.amzn2023 or higher.
This issue was patched in ALAS2023-2026-1596.
Note: Versions mentioned in the description apply only to the upstream perf6.18-debuginfo package and not the perf6.18-debuginfo package as distributed by Amazon-Linux.
See How to fix? for Amazon-Linux:2023 relevant fixed versions and status.
In the Linux kernel, the following vulnerability has been resolved:
perf/x86: Move event pointer setup earlier in x86_pmu_enable()
A production AMD EPYC system crashed with a NULL pointer dereference in the PMU NMI handler:
BUG: kernel NULL pointer dereference, address: 0000000000000198 RIP: x86_perf_event_update+0xc/0xa0 Call Trace: <NMI> amd_pmu_v2_handle_irq+0x1a6/0x390 perf_event_nmi_handler+0x24/0x40
The faulting instruction is cmpq $0x0, 0x198(%rdi) with RDI=0,
corresponding to the if (unlikely(!hwc->event_base)) check in
x86_perf_event_update() where hwc = &event->hw and event is NULL.
drgn inspection of the vmcore on CPU 106 showed a mismatch between cpuc->active_mask and cpuc->events[]:
active_mask: 0x1e (bits 1, 2, 3, 4) events[1]: 0xff1100136cbd4f38 (valid) events[2]: 0x0 (NULL, but active_mask bit 2 set) events[3]: 0xff1100076fd2cf38 (valid) events[4]: 0xff1100079e990a90 (valid)
The event that should occupy events[2] was found in event_list[2] with hw.idx=2 and hw.state=0x0, confirming x86_pmu_start() had run (which clears hw.state and sets active_mask) but events[2] was never populated.
Another event (event_list[0]) had hw.state=0x7 (STOPPED|UPTODATE|ARCH), showing it was stopped when the PMU rescheduled events, confirming the throttle-then-reschedule sequence occurred.
The root cause is commit 7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss") which moved the cpuc->events[idx] assignment out of x86_pmu_start() and into step 2 of x86_pmu_enable(), after the PERF_HES_ARCH check. This broke any path that calls pmu->start() without going through x86_pmu_enable() -- specifically the unthrottle path:
perf_adjust_freq_unthr_events() -> perf_event_unthrottle_group() -> perf_event_unthrottle() -> event->pmu->start(event, 0) -> x86_pmu_start() // sets active_mask but not events[]
The race sequence is:
A group of perf events overflows, triggering group throttle via perf_event_throttle_group(). All events are stopped: active_mask bits cleared, events[] preserved (x86_pmu_stop no longer clears events[] after commit 7e772a93eb61).
While still throttled (PERF_HES_STOPPED), x86_pmu_enable() runs due to other scheduling activity. Stopped events that need to move counters get PERF_HES_ARCH set and events[old_idx] cleared. In step 2 of x86_pmu_enable(), PERF_HES_ARCH causes these events to be skipped -- events[new_idx] is never set.
The timer tick unthrottles the group via pmu->start(). Since commit 7e772a93eb61 removed the events[] assignment from x86_pmu_start(), active_mask[new_idx] is set but events[new_idx] remains NULL.
A PMC overflow NMI fires. The handler iterates active counters, finds active_mask[2] set, reads events[2] which is NULL, and crashes dereferencing it.
Move the cpuc->events[hwc->idx] assignment in x86_pmu_enable() to before the PERF_HES_ARCH check, so that events[] is populated even for events that are not immediately started. This ensures the unthrottle path via pmu->start() always finds a valid event pointer.