CVE-2026-46223 - cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated

AI Predicted: 5.5 · Difficulty: Moderate · EPSS: 0.02% · P5

Affected Version Matrix (8 entries)

Vendor | Product | Version Range | Status
Linux | Linux | 1b164b876c36c3eb5561dd9b37702b04401b0166 < 33fa2e6b1507a0a377a151a8826438bedad1d0b0 | affected
Linux | Linux | 1b164b876c36c3eb5561dd9b37702b04401b0166 < 93618edf753838a727dbff63c7c291dee22d656b | affected
Linux | Linux | 78c72bce4a87819126211c0d24e18350010604fb | affected
Linux | Linux | 6.19.12 < 6.20 | affected
Linux | Linux | 7.0 | affected
Linux | Linux | < 7.0 | unaffected
Linux | Linux | 7.0.9 ≤ 7.0.* | unaffected
Linux | Linux | 7.1-rc3 ≤ * | unaffected

I. Basic Information for CVE-2026-46223

Vulnerability Information

Vulnerability Title
cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated
Source: NVD (National Vulnerability Database)
Vulnerability Description
In the Linux kernel, the following vulnerability has been resolved:

cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated

A chain of commits going back to v7.0 reworked rmdir to satisfy the controller invariant that a subsystem's ->css_offline() must not run while tasks are still doing kernel-side work in the cgroup.

[1] d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out")
[2] a72f73c4dd9b ("cgroup: Don't expose dead tasks in cgroup")
[3] 1b164b876c36 ("cgroup: Wait for dying tasks to leave on rmdir")
[4] 4c56a8ac6869 ("cgroup: Fix cgroup_drain_dying() testing the wrong condition")
[5] 13e786b64bd3 ("cgroup: Increment nr_dying_subsys_* from rmdir context")

[1] moved task cset unlink from do_exit() to finish_task_switch() so a task's cset link drops only after the task has fully stopped scheduling. That made tasks past exit_signals() linger on cset->tasks until their final context switch, which led to a series of problems as what userspace expected to see after rmdir diverged from what the kernel needs to wait for.

[2]-[5] tried to bridge that divergence: [2] filtered the exiting tasks from cgroup.procs; [3] had rmdir(2) sleep in TASK_UNINTERRUPTIBLE for them; [4] fixed the wait's condition; [5] made nr_dying_subsys_* visible synchronously.

The cgroup_drain_dying() wait in [3] turned out to be a dead end. When the rmdir caller is also the reaper of a zombie that pins a pidns teardown (e.g. host PID 1 systemd reaping orphan pids that were re-parented to it during the same teardown), rmdir blocks in TASK_UNINTERRUPTIBLE waiting for those pids to free, the pids can't free because PID 1 is the reaper and it's stuck in rmdir, and the system A-A deadlocks. No internal lock ordering breaks this; the wait itself is the bug.
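
The circular wait in the preceding paragraph can be made concrete with a small user-space toy model. This is not kernel code and not part of the NVD text: the actor labels and the `find_wait_cycle` helper are hypothetical, chosen only to mirror the description. Each entry reads "X cannot make progress until Y does", and a cycle means nobody can.

```python
from typing import Dict, List, Optional


def find_wait_cycle(waits_on: Dict[str, str], start: str) -> Optional[List[str]]:
    """Follow 'X waits on Y' edges from start; return the cycle if one is reached."""
    path: List[str] = []
    seen: Dict[str, int] = {}
    node: Optional[str] = start
    while node is not None:
        if node in seen:
            # Revisited a node: everything from its first visit onward is a cycle.
            return path[seen[node]:] + [node]
        seen[node] = len(path)
        path.append(node)
        node = waits_on.get(node)
    return None  # the chain ended at an actor that is not waiting on anything


# Hypothetical labels for the actors in the description (not kernel identifiers).
# With the synchronous wait introduced by [3]:
with_wait = {
    "pid1_rmdir": "dying_pids_freed",  # rmdir sleeps until dying tasks are gone
    "dying_pids_freed": "pid1_reaps",  # zombies are freed only once reaped
    "pid1_reaps": "pid1_rmdir",        # PID 1 cannot reap: it is stuck in rmdir
}

# With the wait removed, rmdir returns and PID 1 is free to reap.
without_wait = {
    "dying_pids_freed": "pid1_reaps",
}

print(find_wait_cycle(with_wait, "pid1_rmdir"))
# ['pid1_rmdir', 'dying_pids_freed', 'pid1_reaps', 'pid1_rmdir']
print(find_wait_cycle(without_wait, "dying_pids_freed"))
# None
```

The cycle starts and ends at the same actor, which is the sense in which the description calls this an A-A deadlock: no second lock is involved, only PID 1 waiting on work that PID 1 itself must perform.
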
The css killing side that drove the original reorder, however, can be made cleanly asynchronous: ->css_offline() is already async, run from css_killed_work_fn() driven by percpu_ref_kill_and_confirm(). The fix is to make that chain start only after all tasks have left the cgroup. rmdir's user-visible side then returns as soon as cgroup.procs and friends are empty, while ->css_offline() still runs only after the cgroup is fully drained.

Verified by the original reproducer (pidns teardown + zombie reaper, runs under vng) which hangs vanilla and succeeds here, and by per-commit deterministic repros for [2], [3], [4], [5] with a boot parameter that widens the post-exit_signals() window so each state is reliably reachable. Some stress tests on top of that.

cgroup_apply_control_disable() has the same shape of pre-existing race: when a controller is disabled via subtree_control, kill_css() ran synchronously while tasks past exit_signals() could still be linked to the cgroup's csets, and ->css_offline() could fire before they drained. This patch preserves the existing synchronous behavior at that call site (kill_css_sync() + kill_css_finish() back-to-back) and a follow-up patch will defer kill_css_finish() there using a per-css trigger.

This seems like the right approach and I don't see problems with it. The changes are somewhat invasive but not excessively so, so backporting to -stable should be okay. If something does turn out to be wrong, the fallback is to revert the entire chain ([1]-[5]) and rework in the development branch instead.

v2: Pin cgrp across the deferred destroy work with explicit cgroup_get()/cgroup_put() around queue_work() and the work_fn. v1 wasn't actually broken (ordered cgroup_offline_wq + queue_work order in cgroup_task_dead() saved it) but the explicit ref removes the dependency on those non-obvious invariants. Also note the pre-existing cgroup_apply_control_disable() race in the description; a follow-up will defer kill_css_finish() there.
Source: NVD (National Vulnerability Database)
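
As a rough illustration of the ordering the description asks for, the following user-space toy model may help. It is a sketch under stated assumptions, not the kernel patch: `ToyCgroup` and all of its methods and event names are hypothetical, and the only properties it tries to capture are that rmdir returns without waiting once no tasks are visible to userspace, and that the offline step starts only after the last task has been unlinked.

```python
from typing import List


class ToyCgroup:
    """User-space toy model; names are illustrative, not kernel APIs."""

    def __init__(self, tasks: int) -> None:
        self.visible_tasks = tasks  # what cgroup.procs would show
        self.linked_tasks = tasks   # tasks still linked, including exiting ones
        self.rmdir_requested = False
        self.kill_started = False
        self.events: List[str] = []

    def task_exits(self) -> None:
        # Past exit_signals(): hidden from userspace but still linked.
        self.visible_tasks -= 1
        self.events.append("task_exiting")

    def task_dead(self) -> None:
        # Final context switch: the task's link is dropped.
        self.linked_tasks -= 1
        self.events.append("task_unlinked")
        self._maybe_start_kill()

    def rmdir(self) -> bool:
        if self.visible_tasks > 0:
            return False  # still populated from userspace's point of view
        self.rmdir_requested = True
        self.events.append("rmdir_returns")  # never sleeps waiting for tasks
        self._maybe_start_kill()
        return True

    def _maybe_start_kill(self) -> None:
        # Deferred step: start the offline chain only once fully drained.
        if self.rmdir_requested and self.linked_tasks == 0 and not self.kill_started:
            self.kill_started = True
            self.events.append("css_offline")


cg = ToyCgroup(tasks=1)
cg.task_exits()   # task is exiting but has not yet done its final switch
cg.rmdir()        # returns immediately; offline is not started yet
cg.task_dead()    # last link dropped, so the offline chain may start
print(cg.events)
# ['task_exiting', 'rmdir_returns', 'task_unlinked', 'css_offline']
```

In this model "css_offline" can never precede the final "task_unlinked", which corresponds to the controller invariant quoted in the description, and rmdir never blocks, which is what removes the wait that caused the deadlock.
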
CVSS Information
N/A
Source: NVD (National Vulnerability Database)
Vulnerability Type
N/A
Source: NVD (National Vulnerability Database)
Vulnerability Title
Linux kernel security vulnerability
Source: CNNVD (China National Vulnerability Database)
Vulnerability Description
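The Linux kernel is the kernel used by Linux, the open-source operating system of the Linux Foundation (USA). The Linux kernel contains a security vulnerability caused by the css percpu_ref kill not being deferred long enough during rmdir in cgroup, which may lead to a race condition.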
Source: CNNVD (China National Vulnerability Database)
CVSS Information
N/A
Source: CNNVD (China National Vulnerability Database)
Vulnerability Type
N/A
Source: CNNVD (China National Vulnerability Database)

Affected Products

Vendor | Product | Affected Versions | CPE
Linux | Linux | 1b164b876c36c3eb5561dd9b37702b04401b0166 ~ 33fa2e6b1507a0a377a151a8826438bedad1d0b0 | -
Linux | Linux | 7.0 | -

II. Public POCs for CVE-2026-46223

No public POC found.

III. Intelligence Information for CVE-2026-46223

Patches & Fixes for CVE-2026-46223 (2)

Same Patch Batch · Linux · 2026-05-28 · 138 CVEs total

CVE-2026-46135 | 9.8 CRITICAL | nvmet-tcp: fix race between ICReq handling and queue teardown
CVE-2026-46137 | 9.8 CRITICAL | mptcp: pm: ADD_ADDR rtx: fix potential data-race
CVE-2026-46195 | 9.8 CRITICAL | smb: client: validate dacloffset before building DACL pointers
CVE-2026-46115 | 9.8 CRITICAL | block: add pgmap check to biovec_phys_mergeable
CVE-2026-46185 | 9.1 CRITICAL | smb/client: fix out-of-bounds read in symlink_data()
CVE-2026-46119 | 9.1 CRITICAL | libceph: Fix slab-out-of-bounds access in auth message processing
CVE-2026-46155 | 9.1 CRITICAL | smb/client: fix out-of-bounds read in smb2_compound_op()
CVE-2026-46152 | 8.8 HIGH | wifi: mac80211: drop stray 'static' from fast-RX rx_result
CVE-2026-46198 | 8.8 HIGH | batman-adv: fix integer overflow on buff_pos
CVE-2026-46125 | 8.8 HIGH | wifi: mac80211: remove station if connection prep fails
CVE-2026-46238 | 8.8 HIGH | batman-adv: stop caching unowned originator pointers in BAT IV
CVE-2026-46174 | 8.8 HIGH | x86/CPU/AMD: Prevent improper isolation of shared resources in Zen2's op cache
CVE-2026-46113 | 8.8 HIGH | KVM: x86: Fix shadow paging use-after-free due to unexpected GFN
CVE-2026-46166 | 8.8 HIGH | wifi: mac80211: use safe list iteration in radar detect work
CVE-2026-46212 | 8.8 HIGH | batman-adv: bla: prevent use-after-free when deleting claims
CVE-2026-46232 | 8.1 HIGH | HID: playstation: Clamp num_touch_reports
CVE-2026-46138 | 8.1 HIGH | Bluetooth: hci_event: Fix OOB read and infinite loop in hci_le_create_big_complete_evt
CVE-2026-46209 | 7.8 HIGH | drm/gem: Fix inconsistent plane dimension calculation in drm_gem_fb_init_with_funcs()
CVE-2026-46208 | 7.8 HIGH | batman-adv: stop tp_meter sessions during mesh teardown
CVE-2026-46227 | 7.8 HIGH | sctp: revalidate list cursor after sctp_sendmsg_to_asoc() in SCTP_SENDALL

Showing top 20 of 138 CVEs.

IV. Related Vulnerabilities
