In the Linux kernel, the following vulnerability has been resolved:
net: microchip: sparx5: Fix potential null-ptr-deref in sparx_stats_init() and sparx5_start()
sparx_stats_init() calls create_singlethread_workqueue() and not
checked the ret value, which may return NULL. And a null-ptr-deref may
happen:
sparx_stats_init()
create_singlethread_workqueue() # failed, sparx5->stats_queue is NULL
queue_delayed_work()
queue_delayed_work_on()
__queue_delayed_work() # warning here, but continue
__queue_work() # access wq->flags, null-ptr-deref
Check the ret value and return -ENOMEM if it is NULL. So as
sparx5_start().
In the Linux kernel, the following vulnerability has been resolved:
net: lan966x: Fix potential null-ptr-deref in lan966x_stats_init()
lan966x_stats_init() calls create_singlethread_workqueue() and not
checked the ret value, which may return NULL. And a null-ptr-deref may
happen:
lan966x_stats_init()
create_singlethread_workqueue() # failed, lan966x->stats_queue is NULL
queue_delayed_work()
queue_delayed_work_on()
__queue_delayed_work() # warning here, but continue
__queue_work() # access wq->flags, null-ptr-deref
Check the ret value and return -ENOMEM if it is NULL.
In the Linux kernel, the following vulnerability has been resolved:
netdevsim: Fix memory leak of nsim_dev->fa_cookie
kmemleak reports this issue:
unreferenced object 0xffff8881bac872d0 (size 8):
comm "sh", pid 58603, jiffies 4481524462 (age 68.065s)
hex dump (first 8 bytes):
04 00 00 00 de ad be ef ........
backtrace:
[<00000000c80b8577>] __kmalloc+0x49/0x150
[<000000005292b8c6>] nsim_dev_trap_fa_cookie_write+0xc1/0x210 [netdevsim]
[<0000000093d78e77>] full_proxy_write+0xf3/0x180
[<000000005a662c16>] vfs_write+0x1c5/0xaf0
[<000000007aabf84a>] ksys_write+0xed/0x1c0
[<000000005f1d2e47>] do_syscall_64+0x3b/0x90
[<000000006001c6ec>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
The issue occurs in the following scenarios:
nsim_dev_trap_fa_cookie_write()
kmalloc() fa_cookie
nsim_dev->fa_cookie = fa_cookie
..
nsim_drv_remove()
The fa_cookie allocked in nsim_dev_trap_fa_cookie_write() is not freed. To
fix, add kfree(nsim_dev->fa_cookie) to nsim_drv_remove().
In the Linux kernel, the following vulnerability has been resolved:
ftrace: Fix null pointer dereference in ftrace_add_mod()
The @ftrace_mod is allocated by kzalloc(), so both the members {prev,next}
of @ftrace_mode->list are NULL, it's not a valid state to call list_del().
If kstrdup() for @ftrace_mod->{func|module} fails, it goes to @out_free
tag and calls free_ftrace_mod() to destroy @ftrace_mod, then list_del()
will write prev->next and next->prev, where null pointer dereference
happens.
BUG: kernel NULL pointer dereference, address: 0000000000000008
Oops: 0002 [#1] PREEMPT SMP NOPTI
Call Trace:
<TASK>
ftrace_mod_callback+0x20d/0x220
? do_filp_open+0xd9/0x140
ftrace_process_regex.isra.51+0xbf/0x130
ftrace_regex_write.isra.52.part.53+0x6e/0x90
vfs_write+0xee/0x3a0
? __audit_filter_op+0xb1/0x100
? auditd_test_task+0x38/0x50
ksys_write+0xa5/0xe0
do_syscall_64+0x3a/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Kernel panic - not syncing: Fatal exception
So call INIT_LIST_HEAD() to initialize the list member to fix this issue.
In the Linux kernel, the following vulnerability has been resolved:
tracing: Fix race where eprobes can be called before the event
The flag that tells the event to call its triggers after reading the event
is set for eprobes after the eprobe is enabled. This leads to a race where
the eprobe may be triggered at the beginning of the event where the record
information is NULL. The eprobe then dereferences the NULL record causing
a NULL kernel pointer bug.
Test for a NULL record to keep this from happening.
In the Linux kernel, the following vulnerability has been resolved:
rethook: fix a potential memleak in rethook_alloc()
In rethook_alloc(), the variable rh is not freed or passed out
if handler is NULL, which could lead to a memleak, fix it.
[Masami: Add "rethook:" tag to the title.]
Acke-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
In the Linux kernel, the following vulnerability has been resolved:
iio: adc: at91_adc: fix possible memory leak in at91_adc_allocate_trigger()
If iio_trigger_register() returns error, it should call iio_trigger_free()
to give up the reference that hold in iio_trigger_alloc(), so that it can
call iio_trig_release() to free memory when the refcount hit to 0.
In the Linux kernel, the following vulnerability has been resolved:
iio: trigger: sysfs: fix possible memory leak in iio_sysfs_trig_init()
dev_set_name() allocates memory for name, it need be freed
when device_add() fails, call put_device() to give up the
reference that hold in device_initialize(), so that it can
be freed in kobject_cleanup() when the refcount hit to 0.
Fault injection test can trigger this:
unreferenced object 0xffff8e8340a7b4c0 (size 32):
comm "modprobe", pid 243, jiffies 4294678145 (age 48.845s)
hex dump (first 32 bytes):
69 69 6f 5f 73 79 73 66 73 5f 74 72 69 67 67 65 iio_sysfs_trigge
72 00 a7 40 83 8e ff ff 00 86 13 c4 f6 ee ff ff r..@............
backtrace:
[<0000000074999de8>] __kmem_cache_alloc_node+0x1e9/0x360
[<00000000497fd30b>] __kmalloc_node_track_caller+0x44/0x1a0
[<000000003636c520>] kstrdup+0x2d/0x60
[<0000000032f84da2>] kobject_set_name_vargs+0x1e/0x90
[<0000000092efe493>] dev_set_name+0x4e/0x70
In the Linux kernel, the following vulnerability has been resolved:
io_uring: fix multishot accept request leaks
Having REQ_F_POLLED set doesn't guarantee that the request is
executed as a multishot from the polling path. Fortunately for us, if
the code thinks it's multishot issue when it's not, it can only ask to
skip completion so leaking the request. Use issue_flags to mark
multipoll issues.
In the Linux kernel, the following vulnerability has been resolved:
Input: iforce - invert valid length check when fetching device IDs
syzbot is reporting uninitialized value at iforce_init_device() [1], for
commit 6ac0aec6b0a6 ("Input: iforce - allow callers supply data buffer
when fetching device IDs") is checking that valid length is shorter than
bytes to read. Since iforce_get_id_packet() stores valid length when
returning 0, the caller needs to check that valid length is longer than or
equals to bytes to read.
In the Linux kernel, the following vulnerability has been resolved:
misc/vmw_vmci: fix an infoleak in vmci_host_do_receive_datagram()
`struct vmci_event_qp` allocated by qp_notify_peer() contains padding,
which may carry uninitialized data to the userspace, as observed by
KMSAN:
BUG: KMSAN: kernel-infoleak in instrument_copy_to_user ./include/linux/instrumented.h:121
instrument_copy_to_user ./include/linux/instrumented.h:121
_copy_to_user+0x5f/0xb0 lib/usercopy.c:33
copy_to_user ./include/linux/uaccess.h:169
vmci_host_do_receive_datagram drivers/misc/vmw_vmci/vmci_host.c:431
vmci_host_unlocked_ioctl+0x33d/0x43d0 drivers/misc/vmw_vmci/vmci_host.c:925
vfs_ioctl fs/ioctl.c:51
...
Uninit was stored to memory at:
kmemdup+0x74/0xb0 mm/util.c:131
dg_dispatch_as_host drivers/misc/vmw_vmci/vmci_datagram.c:271
vmci_datagram_dispatch+0x4f8/0xfc0 drivers/misc/vmw_vmci/vmci_datagram.c:339
qp_notify_peer+0x19a/0x290 drivers/misc/vmw_vmci/vmci_queue_pair.c:1479
qp_broker_attach drivers/misc/vmw_vmci/vmci_queue_pair.c:1662
qp_broker_alloc+0x2977/0x2f30 drivers/misc/vmw_vmci/vmci_queue_pair.c:1750
vmci_qp_broker_alloc+0x96/0xd0 drivers/misc/vmw_vmci/vmci_queue_pair.c:1940
vmci_host_do_alloc_queuepair drivers/misc/vmw_vmci/vmci_host.c:488
vmci_host_unlocked_ioctl+0x24fd/0x43d0 drivers/misc/vmw_vmci/vmci_host.c:927
...
Local variable ev created at:
qp_notify_peer+0x54/0x290 drivers/misc/vmw_vmci/vmci_queue_pair.c:1456
qp_broker_attach drivers/misc/vmw_vmci/vmci_queue_pair.c:1662
qp_broker_alloc+0x2977/0x2f30 drivers/misc/vmw_vmci/vmci_queue_pair.c:1750
Bytes 28-31 of 48 are uninitialized
Memory access of size 48 starts at ffff888035155e00
Data copied to user address 0000000020000100
Use memset() to prevent the infoleaks.
Also speculatively fix qp_notify_peer_local(), which may suffer from the
same problem.
In the Linux kernel, the following vulnerability has been resolved:
mmc: sdhci-pci: Fix possible memory leak caused by missing pci_dev_put()
pci_get_device() will increase the reference count for the returned
pci_dev. We need to use pci_dev_put() to decrease the reference count
before amd_probe() returns. There is no problem for the 'smbus_dev ==
NULL' branch because pci_dev_put() can also handle the NULL input
parameter case.
In the Linux kernel, the following vulnerability has been resolved:
blk-cgroup: properly pin the parent in blkcg_css_online
blkcg_css_online is supposed to pin the blkcg of the parent, but
397c9f46ee4d refactored things and along the way, changed it to pin the
css instead. This results in extra pins, and we end up leaking blkcgs
and cgroups.
In the Linux kernel, the following vulnerability has been resolved:
x86/sgx: Add overflow check in sgx_validate_offset_length()
sgx_validate_offset_length() function verifies "offset" and "length"
arguments provided by userspace, but was missing an overflow check on
their addition. Add it.
In the Linux kernel, the following vulnerability has been resolved:
x86/fpu: Drop fpregs lock before inheriting FPU permissions
Mike Galbraith reported the following against an old fork of preempt-rt
but the same issue also applies to the current preempt-rt tree.
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: systemd
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
Preemption disabled at:
fpu_clone
CPU: 6 PID: 1 Comm: systemd Tainted: G E (unreleased)
Call Trace:
<TASK>
dump_stack_lvl
? fpu_clone
__might_resched
rt_spin_lock
fpu_clone
? copy_thread
? copy_process
? shmem_alloc_inode
? kmem_cache_alloc
? kernel_clone
? __do_sys_clone
? do_syscall_64
? __x64_sys_rt_sigprocmask
? syscall_exit_to_user_mode
? do_syscall_64
? syscall_exit_to_user_mode
? do_syscall_64
? syscall_exit_to_user_mode
? do_syscall_64
? exc_page_fault
? entry_SYSCALL_64_after_hwframe
</TASK>
Mike says:
The splat comes from fpu_inherit_perms() being called under fpregs_lock(),
and us reaching the spin_lock_irq() therein due to fpu_state_size_dynamic()
returning true despite static key __fpu_state_size_dynamic having never
been enabled.
Mike's assessment looks correct. fpregs_lock on a PREEMPT_RT kernel disables
preemption so calling spin_lock_irq() in fpu_inherit_perms() is unsafe. This
problem exists since commit
9e798e9aa14c ("x86/fpu: Prepare fpu_clone() for dynamically enabled features").
Even though the original bug report should not have enabled the paths at
all, the bug still exists.
fpregs_lock is necessary when editing the FPU registers or a task's FP
state but it is not necessary for fpu_inherit_perms(). The only write
of any FP state in fpu_inherit_perms() is for the new child which is
not running yet and cannot context switch or be borrowed by a kernel
thread yet. Hence, fpregs_lock is not protecting anything in the new
child until clone() completes and can be dropped earlier. The siglock
still needs to be acquired by fpu_inherit_perms() as the read of the
parent's permissions has to be serialised.
[ bp: Cleanup splat. ]
In the Linux kernel, the following vulnerability has been resolved:
perf: Improve missing SIGTRAP checking
To catch missing SIGTRAP we employ a WARN in __perf_event_overflow(),
which fires if pending_sigtrap was already set: returning to user space
without consuming pending_sigtrap, and then having the event fire again
would re-enter the kernel and trigger the WARN.
This, however, seemed to miss the case where some events not associated
with progress in the user space task can fire and the interrupt handler
runs before the IRQ work meant to consume pending_sigtrap (and generate
the SIGTRAP).
syzbot gifted us this stack trace:
| WARNING: CPU: 0 PID: 3607 at kernel/events/core.c:9313 __perf_event_overflow
| Modules linked in:
| CPU: 0 PID: 3607 Comm: syz-executor100 Not tainted 6.1.0-rc2-syzkaller-00073-g88619e77b33d #0
| Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/11/2022
| RIP: 0010:__perf_event_overflow+0x498/0x540 kernel/events/core.c:9313
| <...>
| Call Trace:
| <TASK>
| perf_swevent_hrtimer+0x34f/0x3c0 kernel/events/core.c:10729
| __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
| __hrtimer_run_queues+0x1c6/0xfb0 kernel/time/hrtimer.c:1749
| hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811
| local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1096 [inline]
| __sysvec_apic_timer_interrupt+0x17c/0x640 arch/x86/kernel/apic/apic.c:1113
| sysvec_apic_timer_interrupt+0x40/0xc0 arch/x86/kernel/apic/apic.c:1107
| asm_sysvec_apic_timer_interrupt+0x16/0x20 arch/x86/include/asm/idtentry.h:649
| <...>
| </TASK>
In this case, syzbot produced a program with event type
PERF_TYPE_SOFTWARE and config PERF_COUNT_SW_CPU_CLOCK. The hrtimer
manages to fire again before the IRQ work got a chance to run, all while
never having returned to user space.
Improve the WARN to check for real progress in user space: approximate
this by storing a 32-bit hash of the current IP into pending_sigtrap,
and if an event fires while pending_sigtrap still matches the previous
IP, we assume no progress (false negatives are possible given we could
return to user space and trigger again on the same IP).
In the Linux kernel, the following vulnerability has been resolved:
perf/x86/amd: Fix crash due to race between amd_pmu_enable_all, perf NMI and throttling
amd_pmu_enable_all() does:
if (!test_bit(idx, cpuc->active_mask))
continue;
amd_pmu_enable_event(cpuc->events[idx]);
A perf NMI of another event can come between these two steps. Perf NMI
handler internally disables and enables _all_ events, including the one
which nmi-intercepted amd_pmu_enable_all() was in process of enabling.
If that unintentionally enabled event has very low sampling period and
causes immediate successive NMI, causing the event to be throttled,
cpuc->events[idx] and cpuc->active_mask gets cleared by x86_pmu_stop().
This will result in amd_pmu_enable_event() getting called with event=NULL
when amd_pmu_enable_all() resumes after handling the NMIs. This causes a
kernel crash:
BUG: kernel NULL pointer dereference, address: 0000000000000198
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
[...]
Call Trace:
<TASK>
amd_pmu_enable_all+0x68/0xb0
ctx_resched+0xd9/0x150
event_function+0xb8/0x130
? hrtimer_start_range_ns+0x141/0x4a0
? perf_duration_warn+0x30/0x30
remote_function+0x4d/0x60
__flush_smp_call_function_queue+0xc4/0x500
flush_smp_call_function_queue+0x11d/0x1b0
do_idle+0x18f/0x2d0
cpu_startup_entry+0x19/0x20
start_secondary+0x121/0x160
secondary_startup_64_no_verify+0xe5/0xeb
</TASK>
amd_pmu_disable_all()/amd_pmu_enable_all() calls inside perf NMI handler
were recently added as part of BRS enablement but I'm not sure whether
we really need them. We can just disable BRS in the beginning and enable
it back while returning from NMI. This will solve the issue by not
enabling those events whose active_masks are set but are not yet enabled
in hw pmu.
In the Linux kernel, the following vulnerability has been resolved:
scsi: target: tcm_loop: Fix possible name leak in tcm_loop_setup_hba_bus()
If device_register() fails in tcm_loop_setup_hba_bus(), the name allocated
by dev_set_name() need be freed. As comment of device_register() says, it
should use put_device() to give up the reference in the error path. So fix
this by calling put_device(), then the name can be freed in kobject_cleanup().
The 'tl_hba' will be freed in tcm_loop_release_adapter(), so it don't need
goto error label in this case.
In the Linux kernel, the following vulnerability has been resolved:
arm64/mm: fix incorrect file_map_count for non-leaf pmd/pud
The page table check trigger BUG_ON() unexpectedly when collapse hugepage:
------------[ cut here ]------------
kernel BUG at mm/page_table_check.c:82!
Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 6 PID: 68 Comm: khugepaged Not tainted 6.1.0-rc3+ #750
Hardware name: linux,dummy-virt (DT)
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : page_table_check_clear.isra.0+0x258/0x3f0
lr : page_table_check_clear.isra.0+0x240/0x3f0
[...]
Call trace:
page_table_check_clear.isra.0+0x258/0x3f0
__page_table_check_pmd_clear+0xbc/0x108
pmdp_collapse_flush+0xb0/0x160
collapse_huge_page+0xa08/0x1080
hpage_collapse_scan_pmd+0xf30/0x1590
khugepaged_scan_mm_slot.constprop.0+0x52c/0xac8
khugepaged+0x338/0x518
kthread+0x278/0x2f8
ret_from_fork+0x10/0x20
[...]
Since pmd_user_accessible_page() doesn't check if a pmd is leaf, it
decrease file_map_count for a non-leaf pmd comes from collapse_huge_page().
and so trigger BUG_ON() unexpectedly.
Fix this problem by using pmd_leaf() insteal of pmd_present() in
pmd_user_accessible_page(). Moreover, use pud_leaf() for
pud_user_accessible_page() too.
In the Linux kernel, the following vulnerability has been resolved:
Input: i8042 - fix leaking of platform device on module removal
Avoid resetting the module-wide i8042_platform_device pointer in
i8042_probe() or i8042_remove(), so that the device can be properly
destroyed by i8042_exit() on module unload.
In the Linux kernel, the following vulnerability has been resolved:
macvlan: enforce a consistent minimal mtu
macvlan should enforce a minimal mtu of 68, even at link creation.
This patch avoids the current behavior (which could lead to crashes
in ipv6 stack if the link is brought up)
$ ip link add macvlan1 link eno1 mtu 8 type macvlan # This should fail !
$ ip link sh dev macvlan1
5: macvlan1@eno1: <BROADCAST,MULTICAST> mtu 8 qdisc noop
state DOWN mode DEFAULT group default qlen 1000
link/ether 02:47:6c:24:74:82 brd ff:ff:ff:ff:ff:ff
$ ip link set macvlan1 mtu 67
Error: mtu less than device minimum.
$ ip link set macvlan1 mtu 68
$ ip link set macvlan1 mtu 8
Error: mtu less than device minimum.
In the Linux kernel, the following vulnerability has been resolved:
KVM: x86/xen: Fix eventfd error handling in kvm_xen_eventfd_assign()
Should not call eventfd_ctx_put() in case of error.
[Introduce new goto target instead. - Paolo]
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: Fix optc2_configure warning on dcn314
[Why]
dcn314 uses optc2_configure_crc() that wraps
optc1_configure_crc() + set additional registers
not applicable to dcn314.
It's not critical but when used leads to warning like:
WARNING: drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c
Call Trace:
<TASK>
generic_reg_set_ex+0x6d/0xe0 [amdgpu]
optc2_configure_crc+0x60/0x80 [amdgpu]
dc_stream_configure_crc+0x129/0x150 [amdgpu]
amdgpu_dm_crtc_configure_crc_source+0x5d/0xe0 [amdgpu]
[How]
Use optc1_configure_crc() directly
In the Linux kernel, the following vulnerability has been resolved:
ALSA: usb-audio: Drop snd_BUG_ON() from snd_usbmidi_output_open()
snd_usbmidi_output_open() has a check of the NULL port with
snd_BUG_ON(). snd_BUG_ON() was used as this shouldn't have happened,
but in reality, the NULL port may be seen when the device gives an
invalid endpoint setup at the descriptor, hence the driver skips the
allocation. That is, the check itself is valid and snd_BUG_ON()
should be dropped from there. Otherwise it's confusing as if it were
a real bug, as recently syzbot stumbled on it.
In the Linux kernel, the following vulnerability has been resolved:
dm ioctl: fix misbehavior if list_versions races with module loading
__list_versions will first estimate the required space using the
"dm_target_iterate(list_version_get_needed, &needed)" call and then will
fill the space using the "dm_target_iterate(list_version_get_info,
&iter_info)" call. Each of these calls locks the targets using the
"down_read(&_lock)" and "up_read(&_lock)" calls, however between the first
and second "dm_target_iterate" there is no lock held and the target
modules can be loaded at this point, so the second "dm_target_iterate"
call may need more space than what was the first "dm_target_iterate"
returned.
The code tries to handle this overflow (see the beginning of
list_version_get_info), however this handling is incorrect.
The code sets "param->data_size = param->data_start + needed" and
"iter_info.end = (char *)vers+len" - "needed" is the size returned by the
first dm_target_iterate call; "len" is the size of the buffer allocated by
userspace.
"len" may be greater than "needed"; in this case, the code will write up
to "len" bytes into the buffer, however param->data_size is set to
"needed", so it may write data past the param->data_size value. The ioctl
interface copies only up to param->data_size into userspace, thus part of
the result will be truncated.
Fix this bug by setting "iter_info.end = (char *)vers + needed;" - this
guarantees that the second "dm_target_iterate" call will write only up to
the "needed" buffer and it will exit with "DM_BUFFER_FULL_FLAG" if it
overflows the "needed" space - in this case, userspace will allocate a
larger buffer and retry.
Note that there is also a bug in list_version_get_needed - we need to add
"strlen(tt->name) + 1" to the needed size, not "strlen(tt->name)".
In the Linux kernel, the following vulnerability has been resolved:
gfs2: Check sb_bsize_shift after reading superblock
Fuzzers like to scribble over sb_bsize_shift but in reality it's very
unlikely that this field would be corrupted on its own. Nevertheless it
should be checked to avoid the possibility of messy mount errors due to
bad calculations. It's always a fixed value based on the block size so
we can just check that it's the expected value.
Tested with:
mkfs.gfs2 -O -p lock_nolock /dev/vdb
for i in 0 -1 64 65 32 33; do
gfs2_edit -p sb field sb_bsize_shift $i /dev/vdb
mount /dev/vdb /mnt/test && umount /mnt/test
done
Before this patch we get a withdraw after
[ 76.413681] gfs2: fsid=loop0.0: fatal: invalid metadata block
[ 76.413681] bh = 19 (type: exp=5, found=4)
[ 76.413681] function = gfs2_meta_buffer, file = fs/gfs2/meta_io.c, line = 492
and with UBSAN configured we also get complaints like
[ 76.373395] UBSAN: shift-out-of-bounds in fs/gfs2/ops_fstype.c:295:19
[ 76.373815] shift exponent 4294967287 is too large for 64-bit type 'long unsigned int'
After the patch, these complaints don't appear, mount fails immediately
and we get an explanation in dmesg.
In the Linux kernel, the following vulnerability has been resolved:
9p: trans_fd/p9_conn_cancel: drop client lock earlier
syzbot reported a double-lock here and we no longer need this
lock after requests have been moved off to local list:
just drop the lock earlier.
In the Linux kernel, the following vulnerability has been resolved:
9p/trans_fd: always use O_NONBLOCK read/write
syzbot is reporting hung task at p9_fd_close() [1], for p9_mux_poll_stop()
from p9_conn_destroy() from p9_fd_close() is failing to interrupt already
started kernel_read() from p9_fd_read() from p9_read_work() and/or
kernel_write() from p9_fd_write() from p9_write_work() requests.
Since p9_socket_open() sets O_NONBLOCK flag, p9_mux_poll_stop() does not
need to interrupt kernel_read()/kernel_write(). However, since p9_fd_open()
does not set O_NONBLOCK flag, but pipe blocks unless signal is pending,
p9_mux_poll_stop() needs to interrupt kernel_read()/kernel_write() when
the file descriptor refers to a pipe. In other words, pipe file descriptor
needs to be handled as if socket file descriptor.
We somehow need to interrupt kernel_read()/kernel_write() on pipes.
A minimal change, which this patch is doing, is to set O_NONBLOCK flag
from p9_fd_open(), for O_NONBLOCK flag does not affect reading/writing
of regular files. But this approach changes O_NONBLOCK flag on userspace-
supplied file descriptors (which might break userspace programs), and
O_NONBLOCK flag could be changed by userspace. It would be possible to set
O_NONBLOCK flag every time p9_fd_read()/p9_fd_write() is invoked, but still
remains small race window for clearing O_NONBLOCK flag.
If we don't want to manipulate O_NONBLOCK flag, we might be able to
surround kernel_read()/kernel_write() with set_thread_flag(TIF_SIGPENDING)
and recalc_sigpending(). Since p9_read_work()/p9_write_work() works are
processed by kernel threads which process global system_wq workqueue,
signals could not be delivered from remote threads when p9_mux_poll_stop()
from p9_conn_destroy() from p9_fd_close() is called. Therefore, calling
set_thread_flag(TIF_SIGPENDING)/recalc_sigpending() every time would be
needed if we count on signals for making kernel_read()/kernel_write()
non-blocking.
[Dominique: add comment at Christian's suggestion]
In the Linux kernel, the following vulnerability has been resolved:
netlink: Bounds-check struct nlmsgerr creation
In preparation for FORTIFY_SOURCE doing bounds-check on memcpy(),
switch from __nlmsg_put to nlmsg_put(), and explain the bounds check
for dealing with the memcpy() across a composite flexible array struct.
Avoids this future run-time warning:
memcpy: detected field-spanning write (size 32) of single field "&errmsg->msg" at net/netlink/af_netlink.c:2447 (size 16)
In the Linux kernel, the following vulnerability has been resolved:
net/9p: use a dedicated spinlock for trans_fd
Shamelessly copying the explanation from Tetsuo Handa's suggested
patch[1] (slightly reworded):
syzbot is reporting inconsistent lock state in p9_req_put()[2],
for p9_tag_remove() from p9_req_put() from IRQ context is using
spin_lock_irqsave() on "struct p9_client"->lock but trans_fd
(not from IRQ context) is using spin_lock().
Since the locks actually protect different things in client.c and in
trans_fd.c, just replace trans_fd.c's lock by a new one specific to the
transport (client.c's protect the idr for fid/tag allocations,
while trans_fd.c's protects its own req list and request status field
that acts as the transport's state machine)
In the Linux kernel, the following vulnerability has been resolved:
bpf: Prevent bpf program recursion for raw tracepoint probes
We got report from sysbot [1] about warnings that were caused by
bpf program attached to contention_begin raw tracepoint triggering
the same tracepoint by using bpf_trace_printk helper that takes
trace_printk_lock lock.
Call Trace:
<TASK>
? trace_event_raw_event_bpf_trace_printk+0x5f/0x90
bpf_trace_printk+0x2b/0xe0
bpf_prog_a9aec6167c091eef_prog+0x1f/0x24
bpf_trace_run2+0x26/0x90
native_queued_spin_lock_slowpath+0x1c6/0x2b0
_raw_spin_lock_irqsave+0x44/0x50
bpf_trace_printk+0x3f/0xe0
bpf_prog_a9aec6167c091eef_prog+0x1f/0x24
bpf_trace_run2+0x26/0x90
native_queued_spin_lock_slowpath+0x1c6/0x2b0
_raw_spin_lock_irqsave+0x44/0x50
bpf_trace_printk+0x3f/0xe0
bpf_prog_a9aec6167c091eef_prog+0x1f/0x24
bpf_trace_run2+0x26/0x90
native_queued_spin_lock_slowpath+0x1c6/0x2b0
_raw_spin_lock_irqsave+0x44/0x50
bpf_trace_printk+0x3f/0xe0
bpf_prog_a9aec6167c091eef_prog+0x1f/0x24
bpf_trace_run2+0x26/0x90
native_queued_spin_lock_slowpath+0x1c6/0x2b0
_raw_spin_lock_irqsave+0x44/0x50
__unfreeze_partials+0x5b/0x160
...
The can be reproduced by attaching bpf program as raw tracepoint on
contention_begin tracepoint. The bpf prog calls bpf_trace_printk
helper. Then by running perf bench the spin lock code is forced to
take slow path and call contention_begin tracepoint.
Fixing this by skipping execution of the bpf program if it's
already running, Using bpf prog 'active' field, which is being
currently used by trampoline programs for the same reason.
Moving bpf_prog_inc_misses_counter to syscall.c because
trampoline.c is compiled in just for CONFIG_BPF_JIT option.
[1] https://lore.kernel.org/bpf/YxhFe3EwqchC%2FfYf@krava/T/#t
In the Linux kernel, the following vulnerability has been resolved:
ntfs: check overflow when iterating ATTR_RECORDs
Kernel iterates over ATTR_RECORDs in mft record in ntfs_attr_find().
Because the ATTR_RECORDs are next to each other, kernel can get the next
ATTR_RECORD from end address of current ATTR_RECORD, through current
ATTR_RECORD length field.
The problem is that during iteration, when kernel calculates the end
address of current ATTR_RECORD, kernel may trigger an integer overflow bug
in executing `a = (ATTR_RECORD*)((u8*)a + le32_to_cpu(a->length))`. This
may wrap, leading to a forever iteration on 32bit systems.
This patch solves it by adding some checks on calculating end address
of current ATTR_RECORD during iteration.
In the Linux kernel, the following vulnerability has been resolved:
wifi: mac80211: Purge vif txq in ieee80211_do_stop()
After ieee80211_do_stop() SKB from vif's txq could still be processed.
Indeed another concurrent vif schedule_and_wake_txq call could cause
those packets to be dequeued (see ieee80211_handle_wake_tx_queue())
without checking the sdata current state.
Because vif.drv_priv is now cleared in this function, this could lead to
driver crash.
For example in ath12k, ahvif is store in vif.drv_priv. Thus if
ath12k_mac_op_tx() is called after ieee80211_do_stop(), ahvif->ah can be
NULL, leading the ath12k_warn(ahvif->ah,...) call in this function to
trigger the NULL deref below.
Unable to handle kernel paging request at virtual address dfffffc000000001
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
batman_adv: bat0: Interface deactivated: brbh1337
Mem abort info:
ESR = 0x0000000096000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[dfffffc000000001] address between user and kernel address ranges
Internal error: Oops: 0000000096000004 [#1] SMP
CPU: 1 UID: 0 PID: 978 Comm: lbd Not tainted 6.13.0-g633f875b8f1e #114
Hardware name: HW (DT)
pstate: 10000005 (nzcV daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : ath12k_mac_op_tx+0x6cc/0x29b8 [ath12k]
lr : ath12k_mac_op_tx+0x174/0x29b8 [ath12k]
sp : ffffffc086ace450
x29: ffffffc086ace450 x28: 0000000000000000 x27: 1ffffff810d59ca4
x26: ffffff801d05f7c0 x25: 0000000000000000 x24: 000000004000001e
x23: ffffff8009ce4926 x22: ffffff801f9c0800 x21: ffffff801d05f7f0
x20: ffffff8034a19f40 x19: 0000000000000000 x18: ffffff801f9c0958
x17: ffffff800bc0a504 x16: dfffffc000000000 x15: ffffffc086ace4f8
x14: ffffff801d05f83c x13: 0000000000000000 x12: ffffffb003a0bf03
x11: 0000000000000000 x10: ffffffb003a0bf02 x9 : ffffff8034a19f40
x8 : ffffff801d05f818 x7 : 1ffffff0069433dc x6 : ffffff8034a19ee0
x5 : ffffff801d05f7f0 x4 : 0000000000000000 x3 : 0000000000000001
x2 : 0000000000000000 x1 : dfffffc000000000 x0 : 0000000000000008
Call trace:
ath12k_mac_op_tx+0x6cc/0x29b8 [ath12k] (P)
ieee80211_handle_wake_tx_queue+0x16c/0x260
ieee80211_queue_skb+0xeec/0x1d20
ieee80211_tx+0x200/0x2c8
ieee80211_xmit+0x22c/0x338
__ieee80211_subif_start_xmit+0x7e8/0xc60
ieee80211_subif_start_xmit+0xc4/0xee0
__ieee80211_subif_start_xmit_8023.isra.0+0x854/0x17a0
ieee80211_subif_start_xmit_8023+0x124/0x488
dev_hard_start_xmit+0x160/0x5a8
__dev_queue_xmit+0x6f8/0x3120
br_dev_queue_push_xmit+0x120/0x4a8
__br_forward+0xe4/0x2b0
deliver_clone+0x5c/0xd0
br_flood+0x398/0x580
br_dev_xmit+0x454/0x9f8
dev_hard_start_xmit+0x160/0x5a8
__dev_queue_xmit+0x6f8/0x3120
ip6_finish_output2+0xc28/0x1b60
__ip6_finish_output+0x38c/0x638
ip6_output+0x1b4/0x338
ip6_local_out+0x7c/0xa8
ip6_send_skb+0x7c/0x1b0
ip6_push_pending_frames+0x94/0xd0
rawv6_sendmsg+0x1a98/0x2898
inet_sendmsg+0x94/0xe0
__sys_sendto+0x1e4/0x308
__arm64_sys_sendto+0xc4/0x140
do_el0_svc+0x110/0x280
el0_svc+0x20/0x60
el0t_64_sync_handler+0x104/0x138
el0t_64_sync+0x154/0x158
To avoid that, empty vif's txq at ieee80211_do_stop() so no packet could
be dequeued after ieee80211_do_stop() (new packets cannot be queued
because SDATA_STATE_RUNNING is cleared at this point).
In the Linux kernel, the following vulnerability has been resolved:
ASoC: Intel: avs: Fix null-ptr-deref in avs_component_probe()
devm_kasprintf() returns NULL when memory allocation fails. Currently,
avs_component_probe() does not check for this case, which results in a
NULL pointer dereference.
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: btrtl: Prevent potential NULL dereference
The btrtl_initialize() function checks that rtl_load_file() either
had an error or it loaded a zero length file. However, if it loaded
a zero length file then the error code is not set correctly. It
results in an error pointer vs NULL bug, followed by a NULL pointer
dereference. This was detected by Smatch:
drivers/bluetooth/btrtl.c:592 btrtl_initialize() warn: passing zero to 'ERR_PTR'
In the Linux kernel, the following vulnerability has been resolved:
ethtool: cmis_cdb: use correct rpl size in ethtool_cmis_module_poll()
rpl is passed as a pointer to ethtool_cmis_module_poll(), so the correct
size of rpl is sizeof(*rpl) which should be just 1 byte. Using the
pointer size instead can cause stack corruption:
Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ethtool_cmis_wait_for_cond+0xf4/0x100
CPU: 72 UID: 0 PID: 4440 Comm: kworker/72:2 Kdump: loaded Tainted: G OE 6.11.0 #24
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: Dell Inc. PowerEdge R760/04GWWM, BIOS 1.6.6 09/20/2023
Workqueue: events module_flash_fw_work
Call Trace:
<TASK>
panic+0x339/0x360
? ethtool_cmis_wait_for_cond+0xf4/0x100
? __pfx_status_success+0x10/0x10
? __pfx_status_fail+0x10/0x10
__stack_chk_fail+0x10/0x10
ethtool_cmis_wait_for_cond+0xf4/0x100
ethtool_cmis_cdb_execute_cmd+0x1fc/0x330
? __pfx_status_fail+0x10/0x10
cmis_cdb_module_features_get+0x6d/0xd0
ethtool_cmis_cdb_init+0x8a/0xd0
ethtool_cmis_fw_update+0x46/0x1d0
module_flash_fw_work+0x17/0xa0
process_one_work+0x179/0x390
worker_thread+0x239/0x340
? __pfx_worker_thread+0x10/0x10
kthread+0xcc/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2d/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
In the Linux kernel, the following vulnerability has been resolved:
net: mctp: Set SOCK_RCU_FREE
Bind lookup runs under RCU, so ensure that a socket doesn't go away in
the middle of a lookup.
In the Linux kernel, the following vulnerability has been resolved:
cxgb4: fix memory leak in cxgb4_init_ethtool_filters() error path
In the for loop used to allocate the loc_array and bmap for each port, a
memory leak is possible when the allocation for loc_array succeeds,
but the allocation for bmap fails. This is because when the control flow
goes to the label free_eth_finfo, only the allocations starting from
(i-1)th iteration are freed.
Fix that by freeing the loc_array in the bmap allocation error path.
In the Linux kernel, the following vulnerability has been resolved:
net: dsa: mv88e6xxx: avoid unregistering devlink regions which were never registered
Russell King reports that a system with mv88e6xxx dereferences a NULL
pointer when unbinding this driver:
https://lore.kernel.org/netdev/Z_lRkMlTJ1KQ0kVX@shell.armlinux.org.uk/
The crash seems to be in devlink_region_destroy(), which is not NULL
tolerant but is given a NULL devlink global region pointer.
At least on some chips, some devlink regions are conditionally registered
since the blamed commit, see mv88e6xxx_setup_devlink_regions_global():
if (cond && !cond(chip))
continue;
These are MV88E6XXX_REGION_STU and MV88E6XXX_REGION_PVT. If the chip
does not have an STU or PVT, it should crash like this.
To fix the issue, avoid unregistering those regions which are NULL, i.e.
were skipped at mv88e6xxx_setup_devlink_regions_global() time.
In the Linux kernel, the following vulnerability has been resolved:
net: ti: icss-iep: Fix possible NULL pointer dereference for perout request
The ICSS IEP driver tracks perout and pps enable state with flags.
Currently when disabling pps and perout signals during icss_iep_exit(),
results in NULL pointer dereference for perout.
To fix the null pointer dereference issue, the icss_iep_perout_enable_hw
function can be modified to directly clear the IEP CMP registers when
disabling PPS or PEROUT, without referencing the ptp_perout_request
structure, as its contents are irrelevant in this case.
In the Linux kernel, the following vulnerability has been resolved:
drm/msm/dpu: Fix error pointers in dpu_plane_virtual_atomic_check
The function dpu_plane_virtual_atomic_check was dereferencing pointers
returned by drm_atomic_get_plane_state without checking for errors. This
could lead to undefined behavior if the function returns an error pointer.
This commit adds checks using IS_ERR to ensure that plane_state is
valid before dereferencing them.
Similar to commit da29abe71e16
("drm/amd/display: Fix error pointers in amdgpu_dm_crtc_mem_type_changed").
Patchwork: https://patchwork.freedesktop.org/patch/643132/
In the Linux kernel, the following vulnerability has been resolved:
lib/iov_iter: fix to increase non slab folio refcount
When testing EROFS file-backed mount over v9fs on qemu, I encountered a
folio UAF issue. The page sanity check reports the following call trace.
The root cause is that pages in bvec are coalesced across a folio bounary.
The refcount of all non-slab folios should be increased to ensure
p9_releas_pages can put them correctly.
BUG: Bad page state in process md5sum pfn:18300
page: refcount:0 mapcount:0 mapping:00000000d5ad8e4e index:0x60 pfn:0x18300
head: order:0 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
aops:z_erofs_aops ino:30b0f dentry name(?):"GoogleExtServicesCn.apk"
flags: 0x100000000000041(locked|head|node=0|zone=1)
raw: 0100000000000041 dead000000000100 dead000000000122 ffff888014b13bd0
raw: 0000000000000060 0000000000000020 00000000ffffffff 0000000000000000
head: 0100000000000041 dead000000000100 dead000000000122 ffff888014b13bd0
head: 0000000000000060 0000000000000020 00000000ffffffff 0000000000000000
head: 0100000000000000 0000000000000000 ffffffffffffffff 0000000000000000
head: 0000000000000010 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
Call Trace:
dump_stack_lvl+0x53/0x70
bad_page+0xd4/0x220
__free_pages_ok+0x76d/0xf30
__folio_put+0x230/0x320
p9_release_pages+0x179/0x1f0
p9_virtio_zc_request+0xa2a/0x1230
p9_client_zc_rpc.constprop.0+0x247/0x700
p9_client_read_once+0x34d/0x810
p9_client_read+0xf3/0x150
v9fs_issue_read+0x111/0x360
netfs_unbuffered_read_iter_locked+0x927/0x1390
netfs_unbuffered_read_iter+0xa2/0xe0
vfs_iocb_iter_read+0x2c7/0x460
erofs_fileio_rq_submit+0x46b/0x5b0
z_erofs_runqueue+0x1203/0x21e0
z_erofs_readahead+0x579/0x8b0
read_pages+0x19f/0xa70
page_cache_ra_order+0x4ad/0xb80
filemap_readahead.isra.0+0xe7/0x150
filemap_get_pages+0x7aa/0x1890
filemap_read+0x320/0xc80
vfs_read+0x6c6/0xa30
ksys_read+0xf9/0x1c0
do_syscall_64+0x9e/0x1a0
entry_SYSCALL_64_after_hwframe+0x71/0x79
In the Linux kernel, the following vulnerability has been resolved:
ksmbd: fix the warning from __kernel_write_iter
[ 2110.972290] ------------[ cut here ]------------
[ 2110.972301] WARNING: CPU: 3 PID: 735 at fs/read_write.c:599 __kernel_write_iter+0x21b/0x280
This patch doesn't allow writing to directory.