In the Linux kernel, the following vulnerability has been resolved:
net/sched: fix lockdep splat in qdisc_tree_reduce_backlog()
qdisc_tree_reduce_backlog() is called with the qdisc lock held,
not RTNL.
We must use qdi...Show moreIn the Linux kernel, the following vulnerability has been resolved:
net/sched: fix lockdep splat in qdisc_tree_reduce_backlog()
qdisc_tree_reduce_backlog() is called with the qdisc lock held,
not RTNL.
We must use qdisc_lookup_rcu() instead of qdisc_lookup()
syzbot reported:
WARNING: suspicious RCU usage
6.1.74-syzkaller #0 Not tainted
-----------------------------
net/sched/sch_api.c:305 suspicious rcu_dereference_protected() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
3 locks held by udevd/1142:
#0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline]
#0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
#0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: net_tx_action+0x64a/0x970 net/core/dev.c:5282
#1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:350 [inline]
#1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: net_tx_action+0x754/0x970 net/core/dev.c:5297
#2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline]
#2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
#2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: qdisc_tree_reduce_backlog+0x84/0x580 net/sched/sch_api.c:792
stack backtrace:
CPU: 1 PID: 1142 Comm: udevd Not tainted 6.1.74-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call Trace:
<TASK>
[<ffffffff85b85f14>] __dump_stack lib/dump_stack.c:88 [inline]
[<ffffffff85b85f14>] dump_stack_lvl+0x1b1/0x28f lib/dump_stack.c:106
[<ffffffff85b86007>] dump_stack+0x15/0x1e lib/dump_stack.c:113
[<ffffffff81802299>] lockdep_rcu_suspicious+0x1b9/0x260 kernel/locking/lockdep.c:6592
[<ffffffff84f0054c>] qdisc_lookup+0xac/0x6f0 net/sched/sch_api.c:305
[<ffffffff84f037c3>] qdisc_tree_reduce_backlog+0x243/0x580 net/sched/sch_api.c:811
[<ffffffff84f5b78c>] pfifo_tail_enqueue+0x32c/0x4b0 net/sched/sch_fifo.c:51
[<ffffffff84fbcf63>] qdisc_enqueue include/net/sch_generic.h:833 [inline]
[<ffffffff84fbcf63>] netem_dequeue+0xeb3/0x15d0 net/sched/sch_netem.c:723
[<ffffffff84eecab9>] dequeue_skb net/sched/sch_generic.c:292 [inline]
[<ffffffff84eecab9>] qdisc_restart net/sched/sch_generic.c:397 [inline]
[<ffffffff84eecab9>] __qdisc_run+0x249/0x1e60 net/sched/sch_generic.c:415
[<ffffffff84d7aa96>] qdisc_run+0xd6/0x260 include/net/pkt_sched.h:125
[<ffffffff84d85d29>] net_tx_action+0x7c9/0x970 net/core/dev.c:5313
[<ffffffff85e002bd>] __do_softirq+0x2bd/0x9bd kernel/softirq.c:616
[<ffffffff81568bca>] invoke_softirq kernel/softirq.c:447 [inline]
[<ffffffff81568bca>] __irq_exit_rcu+0xca/0x230 kernel/softirq.c:700
[<ffffffff81568ae9>] irq_exit_rcu+0x9/0x20 kernel/softirq.c:712
[<ffffffff85b89f52>] sysvec_apic_timer_interrupt+0x42/0x90 arch/x86/kernel/apic/apic.c:1107
[<ffffffff85c00ccb>] asm_sysvec_apic_timer_interrupt+0x1b/0x20 arch/x86/include/asm/idtentry.h:656Show less |
In the Linux kernel, the following vulnerability has been resolved:
ALSA: scarlett2: Add missing mutex lock around get meter levels
As scarlett2_meter_ctl_get() uses meter_level_map[], the data_mutex
should be locked w...Show moreIn the Linux kernel, the following vulnerability has been resolved:
ALSA: scarlett2: Add missing mutex lock around get meter levels
As scarlett2_meter_ctl_get() uses meter_level_map[], the data_mutex
should be locked while accessing it.Show less |
In the Linux kernel, the following vulnerability has been resolved:
LoongArch: Define the __io_aw() hook as mmiowb()
Commit fb24ea52f78e0d595852e ("drivers: Remove explicit invocations of
mmiowb()") remove all mmiowb()...Show moreIn the Linux kernel, the following vulnerability has been resolved:
LoongArch: Define the __io_aw() hook as mmiowb()
Commit fb24ea52f78e0d595852e ("drivers: Remove explicit invocations of
mmiowb()") remove all mmiowb() in drivers, but it says:
"NOTE: mmiowb() has only ever guaranteed ordering in conjunction with
spin_unlock(). However, pairing each mmiowb() removal in this patch with
the corresponding call to spin_unlock() is not at all trivial, so there
is a small chance that this change may regress any drivers incorrectly
relying on mmiowb() to order MMIO writes between CPUs using lock-free
synchronisation."
The mmio in radeon_ring_commit() is protected by a mutex rather than a
spinlock, but in the mutex fastpath it behaves similar to spinlock. We
can add mmiowb() calls in the radeon driver but the maintainer says he
doesn't like such a workaround, and radeon is not the only example of
mutex protected mmio.
So we should extend the mmiowb tracking system from spinlock to mutex,
and maybe other locking primitives. This is not easy and error prone, so
we solve it in the architectural code, by simply defining the __io_aw()
hook as mmiowb(). And we no longer need to override queued_spin_unlock()
so use the generic definition.
Without this, we get such an error when run 'glxgears' on weak ordering
architectures such as LoongArch:
radeon 0000:04:00.0: ring 0 stalled for more than 10324msec
radeon 0000:04:00.0: ring 3 stalled for more than 10240msec
radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000001f412 last fence id 0x000000000001f414 on ring 3)
radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000000f940 last fence id 0x000000000000f941 on ring 0)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)Show less |
In the Linux kernel, the following vulnerability has been resolved:
md/dm-raid: don't call md_reap_sync_thread() directly
Currently md_reap_sync_thread() is called from raid_message() directly
without holding 'reconfig...Show moreIn the Linux kernel, the following vulnerability has been resolved:
md/dm-raid: don't call md_reap_sync_thread() directly
Currently md_reap_sync_thread() is called from raid_message() directly
without holding 'reconfig_mutex', this is definitely unsafe because
md_reap_sync_thread() can change many fields that is protected by
'reconfig_mutex'.
However, hold 'reconfig_mutex' here is still problematic because this
will cause deadlock, for example, commit 130443d60b1b ("md: refactor
idle/frozen_sync_thread() to fix deadlock").
Fix this problem by using stop_sync_thread() to unregister sync_thread,
like md/raid did.Show less |
In the Linux kernel, the following vulnerability has been resolved:
soc: fsl: qbman: Always disable interrupts when taking cgr_lock
smp_call_function_single disables IRQs when executing the callback. To
prevent deadloc...Show moreIn the Linux kernel, the following vulnerability has been resolved:
soc: fsl: qbman: Always disable interrupts when taking cgr_lock
smp_call_function_single disables IRQs when executing the callback. To
prevent deadlocks, we must disable IRQs when taking cgr_lock elsewhere.
This is already done by qman_update_cgr and qman_delete_cgr; fix the
other lockers.Show less |
In the Linux kernel, the following vulnerability has been resolved:
dm snapshot: fix lockup in dm_exception_table_exit
There was reported lockup when we exit a snapshot with many exceptions.
Fix this by adding "cond_re...Show moreIn the Linux kernel, the following vulnerability has been resolved:
dm snapshot: fix lockup in dm_exception_table_exit
There was reported lockup when we exit a snapshot with many exceptions.
Fix this by adding "cond_resched" to the loop that frees the exceptions.Show less |
In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: fix deadlock while reading mqd from debugfs
An errant disk backup on my desktop got into debugfs and triggered the
following deadlock scen...Show moreIn the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: fix deadlock while reading mqd from debugfs
An errant disk backup on my desktop got into debugfs and triggered the
following deadlock scenario in the amdgpu debugfs files. The machine
also hard-resets immediately after those lines are printed (although I
wasn't able to reproduce that part when reading by hand):
[ 1318.016074][ T1082] ======================================================
[ 1318.016607][ T1082] WARNING: possible circular locking dependency detected
[ 1318.017107][ T1082] 6.8.0-rc7-00015-ge0c8221b72c0 #17 Not tainted
[ 1318.017598][ T1082] ------------------------------------------------------
[ 1318.018096][ T1082] tar/1082 is trying to acquire lock:
[ 1318.018585][ T1082] ffff98c44175d6a0 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0x40/0x80
[ 1318.019084][ T1082]
[ 1318.019084][ T1082] but task is already holding lock:
[ 1318.020052][ T1082] ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu]
[ 1318.020607][ T1082]
[ 1318.020607][ T1082] which lock already depends on the new lock.
[ 1318.020607][ T1082]
[ 1318.022081][ T1082]
[ 1318.022081][ T1082] the existing dependency chain (in reverse order) is:
[ 1318.023083][ T1082]
[ 1318.023083][ T1082] -> #2 (reservation_ww_class_mutex){+.+.}-{3:3}:
[ 1318.024114][ T1082] __ww_mutex_lock.constprop.0+0xe0/0x12f0
[ 1318.024639][ T1082] ww_mutex_lock+0x32/0x90
[ 1318.025161][ T1082] dma_resv_lockdep+0x18a/0x330
[ 1318.025683][ T1082] do_one_initcall+0x6a/0x350
[ 1318.026210][ T1082] kernel_init_freeable+0x1a3/0x310
[ 1318.026728][ T1082] kernel_init+0x15/0x1a0
[ 1318.027242][ T1082] ret_from_fork+0x2c/0x40
[ 1318.027759][ T1082] ret_from_fork_asm+0x11/0x20
[ 1318.028281][ T1082]
[ 1318.028281][ T1082] -> #1 (reservation_ww_class_acquire){+.+.}-{0:0}:
[ 1318.029297][ T1082] dma_resv_lockdep+0x16c/0x330
[ 1318.029790][ T1082] do_one_initcall+0x6a/0x350
[ 1318.030263][ T1082] kernel_init_freeable+0x1a3/0x310
[ 1318.030722][ T1082] kernel_init+0x15/0x1a0
[ 1318.031168][ T1082] ret_from_fork+0x2c/0x40
[ 1318.031598][ T1082] ret_from_fork_asm+0x11/0x20
[ 1318.032011][ T1082]
[ 1318.032011][ T1082] -> #0 (&mm->mmap_lock){++++}-{3:3}:
[ 1318.032778][ T1082] __lock_acquire+0x14bf/0x2680
[ 1318.033141][ T1082] lock_acquire+0xcd/0x2c0
[ 1318.033487][ T1082] __might_fault+0x58/0x80
[ 1318.033814][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu]
[ 1318.034181][ T1082] full_proxy_read+0x55/0x80
[ 1318.034487][ T1082] vfs_read+0xa7/0x360
[ 1318.034788][ T1082] ksys_read+0x70/0xf0
[ 1318.035085][ T1082] do_syscall_64+0x94/0x180
[ 1318.035375][ T1082] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[ 1318.035664][ T1082]
[ 1318.035664][ T1082] other info that might help us debug this:
[ 1318.035664][ T1082]
[ 1318.036487][ T1082] Chain exists of:
[ 1318.036487][ T1082] &mm->mmap_lock --> reservation_ww_class_acquire --> reservation_ww_class_mutex
[ 1318.036487][ T1082]
[ 1318.037310][ T1082] Possible unsafe locking scenario:
[ 1318.037310][ T1082]
[ 1318.037838][ T1082] CPU0 CPU1
[ 1318.038101][ T1082] ---- ----
[ 1318.038350][ T1082] lock(reservation_ww_class_mutex);
[ 1318.038590][ T1082] lock(reservation_ww_class_acquire);
[ 1318.038839][ T1082] lock(reservation_ww_class_mutex);
[ 1318.039083][ T1082] rlock(&mm->mmap_lock);
[ 1318.039328][ T1082]
[ 1318.039328][ T1082] *** DEADLOCK ***
[ 1318.039328][ T1082]
[ 1318.040029][ T1082] 1 lock held by tar/1082:
[ 1318.040259][ T1082] #0: ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu]
[ 1318.040560][ T1082]
[ 1318.040560][ T1082] stack backtrace:
[
---truncated---Show less |
In the Linux kernel, the following vulnerability has been resolved:
btrfs: zoned: fix lock ordering in btrfs_zone_activate()
The btrfs CI reported a lockdep warning as follows by running generic
generic/129.
WARNIN...Show moreIn the Linux kernel, the following vulnerability has been resolved:
btrfs: zoned: fix lock ordering in btrfs_zone_activate()
The btrfs CI reported a lockdep warning as follows by running generic
generic/129.
WARNING: possible circular locking dependency detected
6.7.0-rc5+ #1 Not tainted
------------------------------------------------------
kworker/u5:5/793427 is trying to acquire lock:
ffff88813256d028 (&cache->lock){+.+.}-{2:2}, at: btrfs_zone_finish_one_bg+0x5e/0x130
but task is already holding lock:
ffff88810a23a318 (&fs_info->zone_active_bgs_lock){+.+.}-{2:2}, at: btrfs_zone_finish_one_bg+0x34/0x130
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&fs_info->zone_active_bgs_lock){+.+.}-{2:2}:
...
-> #0 (&cache->lock){+.+.}-{2:2}:
...
This is because we take fs_info->zone_active_bgs_lock after a block_group's
lock in btrfs_zone_activate() while doing the opposite in other places.
Fix the issue by expanding the fs_info->zone_active_bgs_lock's critical
section and taking it before a block_group's lock.Show less |
In the Linux kernel, the following vulnerability has been resolved:
debugfs: fix wait/cancellation handling during remove
Ben Greear further reports deadlocks during concurrent debugfs
remove while files are being acce...Show moreIn the Linux kernel, the following vulnerability has been resolved:
debugfs: fix wait/cancellation handling during remove
Ben Greear further reports deadlocks during concurrent debugfs
remove while files are being accessed, even though the code in
question now uses debugfs cancellations. Turns out that despite
all the review on the locking, we missed completely that the
logic is wrong: if the refcount hits zero we can finish (and
need not wait for the completion), but if it doesn't we have
to trigger all the cancellations. As written, we can _never_
get into the loop triggering the cancellations. Fix this, and
explain it better while at it.Show less |
In the Linux kernel, the following vulnerability has been resolved:
drm/nouveau: fix stale locked mutex in nouveau_gem_ioctl_pushbuf
If VM_BIND is enabled on the client the legacy submission ioctl can't be
used, howeve...Show moreIn the Linux kernel, the following vulnerability has been resolved:
drm/nouveau: fix stale locked mutex in nouveau_gem_ioctl_pushbuf
If VM_BIND is enabled on the client the legacy submission ioctl can't be
used, however if a client tries to do so regardless it will return an
error. In this case the clients mutex remained unlocked leading to a
deadlock inside nouveau_drm_postclose or any other nouveau ioctl call.Show less |
In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix deadlock with fiemap and extent locking
While working on the patchset to remove extent locking I got a lockdep
splat with fiemap and pagefa...Show moreIn the Linux kernel, the following vulnerability has been resolved:
btrfs: fix deadlock with fiemap and extent locking
While working on the patchset to remove extent locking I got a lockdep
splat with fiemap and pagefaulting with my new extent lock replacement
lock.
This deadlock exists with our normal code, we just don't have lockdep
annotations with the extent locking so we've never noticed it.
Since we're copying the fiemap extent to user space on every iteration
we have the chance of pagefaulting. Because we hold the extent lock for
the entire range we could mkwrite into a range in the file that we have
mmap'ed. This would deadlock with the following stack trace
[<0>] lock_extent+0x28d/0x2f0
[<0>] btrfs_page_mkwrite+0x273/0x8a0
[<0>] do_page_mkwrite+0x50/0xb0
[<0>] do_fault+0xc1/0x7b0
[<0>] __handle_mm_fault+0x2fa/0x460
[<0>] handle_mm_fault+0xa4/0x330
[<0>] do_user_addr_fault+0x1f4/0x800
[<0>] exc_page_fault+0x7c/0x1e0
[<0>] asm_exc_page_fault+0x26/0x30
[<0>] rep_movs_alternative+0x33/0x70
[<0>] _copy_to_user+0x49/0x70
[<0>] fiemap_fill_next_extent+0xc8/0x120
[<0>] emit_fiemap_extent+0x4d/0xa0
[<0>] extent_fiemap+0x7f8/0xad0
[<0>] btrfs_fiemap+0x49/0x80
[<0>] __x64_sys_ioctl+0x3e1/0xb50
[<0>] do_syscall_64+0x94/0x1a0
[<0>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
I wrote an fstest to reproduce this deadlock without my replacement lock
and verified that the deadlock exists with our existing locking.
To fix this simply don't take the extent lock for the entire duration of
the fiemap. This is safe in general because we keep track of where we
are when we're searching the tree, so if an ordered extent updates in
the middle of our fiemap call we'll still emit the correct extents
because we know what offset we were on before.
The only place we maintain the lock is searching delalloc. Since the
delalloc stuff can change during writeback we want to lock the extent
range so we have a consistent view of delalloc at the time we're
checking to see if we need to set the delalloc flag.
With this patch applied we no longer deadlock with my testcase.Show less |
In the Linux kernel, the following vulnerability has been resolved:
nvme: fix reconnection fail due to reserved tag allocation
We found a issue on production environment while using NVMe over RDMA,
admin_q reconnect fa...Show moreIn the Linux kernel, the following vulnerability has been resolved:
nvme: fix reconnection fail due to reserved tag allocation
We found a issue on production environment while using NVMe over RDMA,
admin_q reconnect failed forever while remote target and network is ok.
After dig into it, we found it may caused by a ABBA deadlock due to tag
allocation. In my case, the tag was hold by a keep alive request
waiting inside admin_q, as we quiesced admin_q while reset ctrl, so the
request maked as idle and will not process before reset success. As
fabric_q shares tagset with admin_q, while reconnect remote target, we
need a tag for connect command, but the only one reserved tag was held
by keep alive command which waiting inside admin_q. As a result, we
failed to reconnect admin_q forever. In order to fix this issue, I
think we should keep two reserved tags for admin queue.Show less |
In the Linux kernel, the following vulnerability has been resolved:
IB/core: Fix a nested dead lock as part of ODP flow
Fix a nested dead lock as part of ODP flow by using mmput_async().
From the below call trace [1]...Show moreIn the Linux kernel, the following vulnerability has been resolved:
IB/core: Fix a nested dead lock as part of ODP flow
Fix a nested dead lock as part of ODP flow by using mmput_async().
From the below call trace [1] can see that calling mmput() once we have
the umem_odp->umem_mutex locked as required by
ib_umem_odp_map_dma_and_lock() might trigger in the same task the
exit_mmap()->__mmu_notifier_release()->mlx5_ib_invalidate_range() which
may dead lock when trying to lock the same mutex.
Moving to use mmput_async() will solve the problem as the above
exit_mmap() flow will be called in other task and will be executed once
the lock will be available.
[1]
[64843.077665] task:kworker/u133:2 state:D stack: 0 pid:80906 ppid:
2 flags:0x00004000
[64843.077672] Workqueue: mlx5_ib_page_fault mlx5_ib_eqe_pf_action [mlx5_ib]
[64843.077719] Call Trace:
[64843.077722] <TASK>
[64843.077724] __schedule+0x23d/0x590
[64843.077729] schedule+0x4e/0xb0
[64843.077735] schedule_preempt_disabled+0xe/0x10
[64843.077740] __mutex_lock.constprop.0+0x263/0x490
[64843.077747] __mutex_lock_slowpath+0x13/0x20
[64843.077752] mutex_lock+0x34/0x40
[64843.077758] mlx5_ib_invalidate_range+0x48/0x270 [mlx5_ib]
[64843.077808] __mmu_notifier_release+0x1a4/0x200
[64843.077816] exit_mmap+0x1bc/0x200
[64843.077822] ? walk_page_range+0x9c/0x120
[64843.077828] ? __cond_resched+0x1a/0x50
[64843.077833] ? mutex_lock+0x13/0x40
[64843.077839] ? uprobe_clear_state+0xac/0x120
[64843.077860] mmput+0x5f/0x140
[64843.077867] ib_umem_odp_map_dma_and_lock+0x21b/0x580 [ib_core]
[64843.077931] pagefault_real_mr+0x9a/0x140 [mlx5_ib]
[64843.077962] pagefault_mr+0xb4/0x550 [mlx5_ib]
[64843.077992] pagefault_single_data_segment.constprop.0+0x2ac/0x560
[mlx5_ib]
[64843.078022] mlx5_ib_eqe_pf_action+0x528/0x780 [mlx5_ib]
[64843.078051] process_one_work+0x22b/0x3d0
[64843.078059] worker_thread+0x53/0x410
[64843.078065] ? process_one_work+0x3d0/0x3d0
[64843.078073] kthread+0x12a/0x150
[64843.078079] ? set_kthread_struct+0x50/0x50
[64843.078085] ret_from_fork+0x22/0x30
[64843.078093] </TASK>Show less |
In the Linux kernel, the following vulnerability has been resolved:
cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all()
syzbot is hitting percpu_rwsem_assert_held(&cpu_hotplug_lock) warning at
cpuset_attac...Show moreIn the Linux kernel, the following vulnerability has been resolved:
cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all()
syzbot is hitting percpu_rwsem_assert_held(&cpu_hotplug_lock) warning at
cpuset_attach() [1], for commit 4f7e7236435ca0ab ("cgroup: Fix
threadgroup_rwsem <-> cpus_read_lock() deadlock") missed that
cpuset_attach() is also called from cgroup_attach_task_all().
Add cpus_read_lock() like what cgroup_procs_write_start() does.Show less |
In the Linux kernel, the following vulnerability has been resolved:
media: usbtv: Remove useless locks in usbtv_video_free()
Remove locks calls in usbtv_video_free() because
are useless and may led to a deadlock as rep...Show moreIn the Linux kernel, the following vulnerability has been resolved:
media: usbtv: Remove useless locks in usbtv_video_free()
Remove locks calls in usbtv_video_free() because
are useless and may led to a deadlock as reported here:
https://syzkaller.appspot.com/x/bisect.txt?x=166dc872180000
Also remove usbtv_stop() call since it will be called when
unregistering the device.
Before 'c838530d230b' this issue would only be noticed if you
disconnect while streaming and now it is noticeable even when
disconnecting while not streaming.
[hverkuil: fix minor spelling mistake in log message]Show less |
In the Linux kernel, the following vulnerability has been resolved:
NFS: Fix nfs_netfs_issue_read() xarray locking for writeback interrupt
The loop inside nfs_netfs_issue_read() currently does not disable
interrupts wh...Show moreIn the Linux kernel, the following vulnerability has been resolved:
NFS: Fix nfs_netfs_issue_read() xarray locking for writeback interrupt
The loop inside nfs_netfs_issue_read() currently does not disable
interrupts while iterating through pages in the xarray to submit
for NFS read. This is not safe though since after taking xa_lock,
another page in the mapping could be processed for writeback inside
an interrupt, and deadlock can occur. The fix is simple and clean
if we use xa_for_each_range(), which handles the iteration with RCU
while reducing code complexity.
The problem is easily reproduced with the following test:
mount -o vers=3,fsc 127.0.0.1:/export /mnt/nfs
dd if=/dev/zero of=/mnt/nfs/file1.bin bs=4096 count=1
echo 3 > /proc/sys/vm/drop_caches
dd if=/mnt/nfs/file1.bin of=/dev/null
umount /mnt/nfs
On the console with a lockdep-enabled kernel a message similar to
the following will be seen:
================================
WARNING: inconsistent lock state
6.7.0-lockdbg+ #10 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
test5/1708 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff888127baa598 (&xa->xa_lock#4){+.?.}-{3:3}, at:
nfs_netfs_issue_read+0x1b2/0x4b0 [nfs]
{IN-SOFTIRQ-W} state was registered at:
lock_acquire+0x144/0x380
_raw_spin_lock_irqsave+0x4e/0xa0
__folio_end_writeback+0x17e/0x5c0
folio_end_writeback+0x93/0x1b0
iomap_finish_ioend+0xeb/0x6a0
blk_update_request+0x204/0x7f0
blk_mq_end_request+0x30/0x1c0
blk_complete_reqs+0x7e/0xa0
__do_softirq+0x113/0x544
__irq_exit_rcu+0xfe/0x120
irq_exit_rcu+0xe/0x20
sysvec_call_function_single+0x6f/0x90
asm_sysvec_call_function_single+0x1a/0x20
pv_native_safe_halt+0xf/0x20
default_idle+0x9/0x20
default_idle_call+0x67/0xa0
do_idle+0x2b5/0x300
cpu_startup_entry+0x34/0x40
start_secondary+0x19d/0x1c0
secondary_startup_64_no_verify+0x18f/0x19b
irq event stamp: 176891
hardirqs last enabled at (176891): [<ffffffffa67a0be4>]
_raw_spin_unlock_irqrestore+0x44/0x60
hardirqs last disabled at (176890): [<ffffffffa67a0899>]
_raw_spin_lock_irqsave+0x79/0xa0
softirqs last enabled at (176646): [<ffffffffa515d91e>]
__irq_exit_rcu+0xfe/0x120
softirqs last disabled at (176633): [<ffffffffa515d91e>]
__irq_exit_rcu+0xfe/0x120
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&xa->xa_lock#4);
<Interrupt>
lock(&xa->xa_lock#4);
*** DEADLOCK ***
2 locks held by test5/1708:
#0: ffff888127baa498 (&sb->s_type->i_mutex_key#22){++++}-{4:4}, at:
nfs_start_io_read+0x28/0x90 [nfs]
#1: ffff888127baa650 (mapping.invalidate_lock#3){.+.+}-{4:4}, at:
page_cache_ra_unbounded+0xa4/0x280
stack backtrace:
CPU: 6 PID: 1708 Comm: test5 Kdump: loaded Not tainted 6.7.0-lockdbg+
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-1.fc39
04/01/2014
Call Trace:
dump_stack_lvl+0x5b/0x90
mark_lock+0xb3f/0xd20
__lock_acquire+0x77b/0x3360
_raw_spin_lock+0x34/0x80
nfs_netfs_issue_read+0x1b2/0x4b0 [nfs]
netfs_begin_read+0x77f/0x980 [netfs]
nfs_netfs_readahead+0x45/0x60 [nfs]
nfs_readahead+0x323/0x5a0 [nfs]
read_pages+0xf3/0x5c0
page_cache_ra_unbounded+0x1c8/0x280
filemap_get_pages+0x38c/0xae0
filemap_read+0x206/0x5e0
nfs_file_read+0xb7/0x140 [nfs]
vfs_read+0x2a9/0x460
ksys_read+0xb7/0x140Show less |
In the Linux kernel, the following vulnerability has been resolved:
r8169: fix LED-related deadlock on module removal
Binding devm_led_classdev_register() to the netdev is problematic
because on module removal we get a...Show moreIn the Linux kernel, the following vulnerability has been resolved:
r8169: fix LED-related deadlock on module removal
Binding devm_led_classdev_register() to the netdev is problematic
because on module removal we get a RTNL-related deadlock. Fix this
by avoiding the device-managed LED functions.
Note: We can safely call led_classdev_unregister() for a LED even
if registering it failed, because led_classdev_unregister() detects
this and is a no-op in this case.Show less |
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Prevent deadlock while disabling aRFS
When disabling aRFS under the `priv->state_lock`, any scheduled
aRFS works are canceled using the `ca...Show moreIn the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Prevent deadlock while disabling aRFS
When disabling aRFS under the `priv->state_lock`, any scheduled
aRFS works are canceled using the `cancel_work_sync` function,
which waits for the work to end if it has already started.
However, while waiting for the work handler, the handler will
try to acquire the `state_lock` which is already acquired.
The worker acquires the lock to delete the rules if the state
is down, which is not the worker's responsibility since
disabling aRFS deletes the rules.
Add an aRFS state variable, which indicates whether the aRFS is
enabled and prevent adding rules when the aRFS is disabled.
Kernel log:
======================================================
WARNING: possible circular locking dependency detected
6.7.0-rc4_net_next_mlx5_5483eb2 #1 Tainted: G I
------------------------------------------------------
ethtool/386089 is trying to acquire lock:
ffff88810f21ce68 ((work_completion)(&rule->arfs_work)){+.+.}-{0:0}, at: __flush_work+0x74/0x4e0
but task is already holding lock:
ffff8884a1808cc0 (&priv->state_lock){+.+.}-{3:3}, at: mlx5e_ethtool_set_channels+0x53/0x200 [mlx5_core]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&priv->state_lock){+.+.}-{3:3}:
__mutex_lock+0x80/0xc90
arfs_handle_work+0x4b/0x3b0 [mlx5_core]
process_one_work+0x1dc/0x4a0
worker_thread+0x1bf/0x3c0
kthread+0xd7/0x100
ret_from_fork+0x2d/0x50
ret_from_fork_asm+0x11/0x20
-> #0 ((work_completion)(&rule->arfs_work)){+.+.}-{0:0}:
__lock_acquire+0x17b4/0x2c80
lock_acquire+0xd0/0x2b0
__flush_work+0x7a/0x4e0
__cancel_work_timer+0x131/0x1c0
arfs_del_rules+0x143/0x1e0 [mlx5_core]
mlx5e_arfs_disable+0x1b/0x30 [mlx5_core]
mlx5e_ethtool_set_channels+0xcb/0x200 [mlx5_core]
ethnl_set_channels+0x28f/0x3b0
ethnl_default_set_doit+0xec/0x240
genl_family_rcv_msg_doit+0xd0/0x120
genl_rcv_msg+0x188/0x2c0
netlink_rcv_skb+0x54/0x100
genl_rcv+0x24/0x40
netlink_unicast+0x1a1/0x270
netlink_sendmsg+0x214/0x460
__sock_sendmsg+0x38/0x60
__sys_sendto+0x113/0x170
__x64_sys_sendto+0x20/0x30
do_syscall_64+0x40/0xe0
entry_SYSCALL_64_after_hwframe+0x46/0x4e
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&priv->state_lock);
lock((work_completion)(&rule->arfs_work));
lock(&priv->state_lock);
lock((work_completion)(&rule->arfs_work));
*** DEADLOCK ***
3 locks held by ethtool/386089:
#0: ffffffff82ea7210 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40
#1: ffffffff82e94c88 (rtnl_mutex){+.+.}-{3:3}, at: ethnl_default_set_doit+0xd3/0x240
#2: ffff8884a1808cc0 (&priv->state_lock){+.+.}-{3:3}, at: mlx5e_ethtool_set_channels+0x53/0x200 [mlx5_core]
stack backtrace:
CPU: 15 PID: 386089 Comm: ethtool Tainted: G I 6.7.0-rc4_net_next_mlx5_5483eb2 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x60/0xa0
check_noncircular+0x144/0x160
__lock_acquire+0x17b4/0x2c80
lock_acquire+0xd0/0x2b0
? __flush_work+0x74/0x4e0
? save_trace+0x3e/0x360
? __flush_work+0x74/0x4e0
__flush_work+0x7a/0x4e0
? __flush_work+0x74/0x4e0
? __lock_acquire+0xa78/0x2c80
? lock_acquire+0xd0/0x2b0
? mark_held_locks+0x49/0x70
__cancel_work_timer+0x131/0x1c0
? mark_held_locks+0x49/0x70
arfs_del_rules+0x143/0x1e0 [mlx5_core]
mlx5e_arfs_disable+0x1b/0x30 [mlx5_core]
mlx5e_ethtool_set_channels+0xcb/0x200 [mlx5_core]
ethnl_set_channels+0x28f/0x3b0
ethnl_default_set_doit+0xec/0x240
genl_family_rcv_msg_doit+0xd0/0x120
genl_rcv_msg+0x188/0x2c0
? ethn
---truncated---Show less |
In the Linux kernel, the following vulnerability has been resolved:
net/sched: Fix mirred deadlock on device recursion
When the mirred action is used on a classful egress qdisc and a packet is
mirrored or redirected to...Show moreIn the Linux kernel, the following vulnerability has been resolved:
net/sched: Fix mirred deadlock on device recursion
When the mirred action is used on a classful egress qdisc and a packet is
mirrored or redirected to self we hit a qdisc lock deadlock.
See trace below.
[..... other info removed for brevity....]
[ 82.890906]
[ 82.890906] ============================================
[ 82.890906] WARNING: possible recursive locking detected
[ 82.890906] 6.8.0-05205-g77fadd89fe2d-dirty #213 Tainted: G W
[ 82.890906] --------------------------------------------
[ 82.890906] ping/418 is trying to acquire lock:
[ 82.890906] ffff888006994110 (&sch->q.lock){+.-.}-{3:3}, at:
__dev_queue_xmit+0x1778/0x3550
[ 82.890906]
[ 82.890906] but task is already holding lock:
[ 82.890906] ffff888006994110 (&sch->q.lock){+.-.}-{3:3}, at:
__dev_queue_xmit+0x1778/0x3550
[ 82.890906]
[ 82.890906] other info that might help us debug this:
[ 82.890906] Possible unsafe locking scenario:
[ 82.890906]
[ 82.890906] CPU0
[ 82.890906] ----
[ 82.890906] lock(&sch->q.lock);
[ 82.890906] lock(&sch->q.lock);
[ 82.890906]
[ 82.890906] *** DEADLOCK ***
[ 82.890906]
[..... other info removed for brevity....]
Example setup (eth0->eth0) to recreate
tc qdisc add dev eth0 root handle 1: htb default 30
tc filter add dev eth0 handle 1: protocol ip prio 2 matchall \
action mirred egress redirect dev eth0
Another example(eth0->eth1->eth0) to recreate
tc qdisc add dev eth0 root handle 1: htb default 30
tc filter add dev eth0 handle 1: protocol ip prio 2 matchall \
action mirred egress redirect dev eth1
tc qdisc add dev eth1 root handle 1: htb default 30
tc filter add dev eth1 handle 1: protocol ip prio 2 matchall \
action mirred egress redirect dev eth0
We fix this by adding an owner field (CPU id) to struct Qdisc set after
root qdisc is entered. When the softirq enters it a second time, if the
qdisc owner is the same CPU, the packet is dropped to break the loop.Show less |
In the Linux kernel, the following vulnerability has been resolved:
interconnect: Don't access req_list while it's being manipulated
The icc_lock mutex was split into separate icc_lock and icc_bw_lock
mutexes in [1] to...Show moreIn the Linux kernel, the following vulnerability has been resolved:
interconnect: Don't access req_list while it's being manipulated
The icc_lock mutex was split into separate icc_lock and icc_bw_lock
mutexes in [1] to avoid lockdep splats. However, this didn't adequately
protect access to icc_node::req_list.
The icc_set_bw() function will eventually iterate over req_list while
only holding icc_bw_lock, but req_list can be modified while only
holding icc_lock. This causes races between icc_set_bw(), of_icc_get(),
and icc_put().
Example A:
CPU0 CPU1
---- ----
icc_set_bw(path_a)
mutex_lock(&icc_bw_lock);
icc_put(path_b)
mutex_lock(&icc_lock);
aggregate_requests()
hlist_for_each_entry(r, ...
hlist_del(...
<r = invalid pointer>
Example B:
CPU0 CPU1
---- ----
icc_set_bw(path_a)
mutex_lock(&icc_bw_lock);
path_b = of_icc_get()
of_icc_get_by_index()
mutex_lock(&icc_lock);
path_find()
path_init()
aggregate_requests()
hlist_for_each_entry(r, ...
hlist_add_head(...
<r = invalid pointer>
Fix this by ensuring icc_bw_lock is always held before manipulating
icc_node::req_list. The additional places icc_bw_lock is held don't
perform any memory allocations, so we should still be safe from the
original lockdep splats that motivated the separate locks.
[1] commit af42269c3523 ("interconnect: Fix locking for runpm vs reclaim")Show less |