In the Linux kernel, the following vulnerability has been resolved:
x86: stop playing stack games in profile_pc()
The 'profile_pc()' function is used for timer-based profiling, which
isn't really all that relevant any more to begin with, but it also ends
up making assumptions based on the stack layout that aren't necessarily
valid.
Basically, the code tries to account the time spent in spinlocks to the
caller rather than the spinlock, and while I support that as a concept,
it's not worth the code complexity or the KASAN warnings when no serious
profiling is done using timers anyway these days.
And the code really does depend on stack layout that is only true in the
simplest of cases. We've lost the comment at some point (I think when
the 32-bit and 64-bit code was unified), but it used to say:
Assume the lock function has either no stack frame or a copy
of eflags from PUSHF.
which explains why it just blindly loads a word or two straight off the
stack pointer and then takes a minimal look at the values to just check
if they might be eflags or the return pc:
Eflags always has bits 22 and up cleared unlike kernel addresses
but that basic stack layout assumption assumes that there isn't any lock
debugging etc going on that would complicate the code and cause a stack
frame.
It causes KASAN unhappiness reported for years by syzkaller [1] and
others [2].
With no real practical reason for this any more, just remove the code.
Just for historical interest, here's some background commits relating to
this code from 2006:
0cb91a229364 ("i386: Account spinlocks to the caller during profiling for !FP kernels")
31679f38d886 ("Simplify profile_pc on x86-64")
and a code unification from 2009:
ef4512882dbe ("x86: time_32/64.c unify profile_pc")
but the basics of this thing actually goes back to before the git tree.
In the Linux kernel, the following vulnerability has been resolved:
serial: 8250_omap: Implementation of Errata i2310
As per Errata i2310[0], Erroneous timeout can be triggered,
if this Erroneous interrupt is not cleared then it may leads
to storm of interrupts, therefore apply Errata i2310 solution.
[0] https://www.ti.com/lit/pdf/sprz536 page 23
In the Linux kernel, the following vulnerability has been resolved:
drm/xe: Check pat.ops before dumping PAT settings
We may leave pat.ops unset when running on brand new platform or
when running as a VF. While the former is unlikely, the latter
is valid (future) use case and will cause NPD when someone will
try to dump PAT settings by debugfs.
It's better to check pointer to pat.ops instead of specific .dump
hook, as we have this hook always defined for every .ops variant.
In the Linux kernel, the following vulnerability has been resolved:
pinctrl: fix deadlock in create_pinctrl() when handling -EPROBE_DEFER
In create_pinctrl(), pinctrl_maps_mutex is acquired before calling
add_setting(). If add_setting() returns -EPROBE_DEFER, create_pinctrl()
calls pinctrl_free(). However, pinctrl_free() attempts to acquire
pinctrl_maps_mutex, which is already held by create_pinctrl(), leading to
a potential deadlock.
This patch resolves the issue by releasing pinctrl_maps_mutex before
calling pinctrl_free(), preventing the deadlock.
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.
In the Linux kernel, the following vulnerability has been resolved:
ASoC: fsl-asoc-card: set priv->pdev before using it
priv->pdev pointer was set after being used in
fsl_asoc_card_audmux_init().
Move this assignment at the start of the probe function, so
sub-functions can correctly use pdev through priv.
fsl_asoc_card_audmux_init() dereferences priv->pdev to get access to the
dev struct, used with dev_err macros.
As priv is zero-initialised, there would be a NULL pointer dereference.
Note that if priv->dev is dereferenced before assignment but never used,
for example if there is no error to be printed, the driver won't crash
probably due to compiler optimisations.
In the Linux kernel, the following vulnerability has been resolved:
drm/panel: ilitek-ili9881c: Fix warning with GPIO controllers that sleep
The ilitek-ili9881c controls the reset GPIO using the non-sleeping
gpiod_set_value() function. This complains loudly when the GPIO
controller needs to sleep. As the caller can sleep, use
gpiod_set_value_cansleep() to fix the issue.
In the Linux kernel, the following vulnerability has been resolved:
usb: dwc3: core: remove lock of otg mode during gadget suspend/resume to avoid deadlock
When config CONFIG_USB_DWC3_DUAL_ROLE is selected, and trigger system
to enter suspend status with below command:
echo mem > /sys/power/state
There will be a deadlock issue occurring. Detailed invoking path as
below:
dwc3_suspend_common()
spin_lock_irqsave(&dwc->lock, flags); <-- 1st
dwc3_gadget_suspend(dwc);
dwc3_gadget_soft_disconnect(dwc);
spin_lock_irqsave(&dwc->lock, flags); <-- 2nd
This issue is exposed by commit c7ebd8149ee5 ("usb: dwc3: gadget: Fix
NULL pointer dereference in dwc3_gadget_suspend") that removes the code
of checking whether dwc->gadget_driver is NULL or not. It causes the
following code is executed and deadlock occurs when trying to get the
spinlock. In fact, the root cause is the commit 5265397f9442("usb: dwc3:
Remove DWC3 locking during gadget suspend/resume") that forgot to remove
the lock of otg mode. So, remove the redundant lock of otg mode during
gadget suspend/resume.
In the Linux kernel, the following vulnerability has been resolved:
ftruncate: pass a signed offset
The old ftruncate() syscall, using the 32-bit off_t misses a sign
extension when called in compat mode on 64-bit architectures. As a
result, passing a negative length accidentally succeeds in truncating
to file size between 2GiB and 4GiB.
Changing the type of the compat syscall to the signed compat_off_t
changes the behavior so it instead returns -EINVAL.
The native entry point, the truncate() syscall and the corresponding
loff_t based variants are all correct already and do not suffer
from this mistake.
In the Linux kernel, the following vulnerability has been resolved:
xdp: Remove WARN() from __xdp_reg_mem_model()
syzkaller reports a warning in __xdp_reg_mem_model().
The warning occurs only if __mem_id_init_hash_table() returns an error. It
returns the error in two cases:
1. memory allocation fails;
2. rhashtable_init() fails when some fields of rhashtable_params
struct are not initialized properly.
The second case cannot happen since there is a static const rhashtable_params
struct with valid fields. So, warning is only triggered when there is a
problem with memory allocation.
Thus, there is no sense in using WARN() to handle this error and it can be
safely removed.
WARNING: CPU: 0 PID: 5065 at net/core/xdp.c:299 __xdp_reg_mem_model+0x2d9/0x650 net/core/xdp.c:299
CPU: 0 PID: 5065 Comm: syz-executor883 Not tainted 6.8.0-syzkaller-05271-gf99c5f563c17 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
RIP: 0010:__xdp_reg_mem_model+0x2d9/0x650 net/core/xdp.c:299
Call Trace:
xdp_reg_mem_model+0x22/0x40 net/core/xdp.c:344
xdp_test_run_setup net/bpf/test_run.c:188 [inline]
bpf_test_run_xdp_live+0x365/0x1e90 net/bpf/test_run.c:377
bpf_prog_test_run_xdp+0x813/0x11b0 net/bpf/test_run.c:1267
bpf_prog_test_run+0x33a/0x3b0 kernel/bpf/syscall.c:4240
__sys_bpf+0x48d/0x810 kernel/bpf/syscall.c:5649
__do_sys_bpf kernel/bpf/syscall.c:5738 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5736 [inline]
__x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5736
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x6d/0x75
Found by Linux Verification Center (linuxtesting.org) with syzkaller.
In the Linux kernel, the following vulnerability has been resolved:
RDMA/restrack: Fix potential invalid address access
struct rdma_restrack_entry's kern_name was set to KBUILD_MODNAME
in ib_create_cq(), while if the module exited but forgot del this
rdma_restrack_entry, it would cause a invalid address access in
rdma_restrack_clean() when print the owner of this rdma_restrack_entry.
These code is used to help find one forgotten PD release in one of the
ULPs. But it is not needed anymore, so delete them.
In the Linux kernel, the following vulnerability has been resolved:
ocfs2: fix DIO failure due to insufficient transaction credits
The code in ocfs2_dio_end_io_write() estimates number of necessary
transaction credits using ocfs2_calc_extend_credits(). This however does
not take into account that the IO could be arbitrarily large and can
contain arbitrary number of extents.
Extent tree manipulations do often extend the current transaction but not
in all of the cases. For example if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction and eventually exhaust all the transaction credits if
the IO contains many single block extents. Once that happens a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error. This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.
To fix the issue make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().
Heming Zhao said:
------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"
PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA"
#0 machine_kexec at ffffffff8c069932
#1 __crash_kexec at ffffffff8c1338fa
#2 panic at ffffffff8c1d69b9
#3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
#4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
#5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
#6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
#7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
#8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
#9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
#11 dio_complete at ffffffff8c2b9fa7
#12 do_blockdev_direct_IO at ffffffff8c2bc09f
#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
#14 generic_file_direct_write at ffffffff8c1dcf14
#15 __generic_file_write_iter at ffffffff8c1dd07b
#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
#17 aio_write at ffffffff8c2cc72e
#18 kmem_cache_alloc at ffffffff8c248dde
#19 do_io_submit at ffffffff8c2ccada
#20 do_syscall_64 at ffffffff8c004984
#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba
In the Linux kernel, the following vulnerability has been resolved:
mlxsw: spectrum_buffers: Fix memory corruptions on Spectrum-4 systems
The following two shared buffer operations make use of the Shared Buffer
Status Register (SBSR):
# devlink sb occupancy snapshot pci/0000:01:00.0
# devlink sb occupancy clearmax pci/0000:01:00.0
The register has two masks of 256 bits to denote on which ingress /
egress ports the register should operate on. Spectrum-4 has more than
256 ports, so the register was extended by cited commit with a new
'port_page' field.
However, when filling the register's payload, the driver specifies the
ports as absolute numbers and not relative to the first port of the port
page, resulting in memory corruptions [1].
Fix by specifying the ports relative to the first port of the port page.
[1]
BUG: KASAN: slab-use-after-free in mlxsw_sp_sb_occ_snapshot+0xb6d/0xbc0
Read of size 1 at addr ffff8881068cb00f by task devlink/1566
[...]
Call Trace:
<TASK>
dump_stack_lvl+0xc6/0x120
print_report+0xce/0x670
kasan_report+0xd7/0x110
mlxsw_sp_sb_occ_snapshot+0xb6d/0xbc0
mlxsw_devlink_sb_occ_snapshot+0x75/0xb0
devlink_nl_sb_occ_snapshot_doit+0x1f9/0x2a0
genl_family_rcv_msg_doit+0x20c/0x300
genl_rcv_msg+0x567/0x800
netlink_rcv_skb+0x170/0x450
genl_rcv+0x2d/0x40
netlink_unicast+0x547/0x830
netlink_sendmsg+0x8d4/0xdb0
__sys_sendto+0x49b/0x510
__x64_sys_sendto+0xe5/0x1c0
do_syscall_64+0xc1/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
[...]
Allocated by task 1:
kasan_save_stack+0x33/0x60
kasan_save_track+0x14/0x30
__kasan_kmalloc+0x8f/0xa0
copy_verifier_state+0xbc2/0xfb0
do_check_common+0x2c51/0xc7e0
bpf_check+0x5107/0x9960
bpf_prog_load+0xf0e/0x2690
__sys_bpf+0x1a61/0x49d0
__x64_sys_bpf+0x7d/0xc0
do_syscall_64+0xc1/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 1:
kasan_save_stack+0x33/0x60
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3b/0x60
poison_slab_object+0x109/0x170
__kasan_slab_free+0x14/0x30
kfree+0xca/0x2b0
free_verifier_state+0xce/0x270
do_check_common+0x4828/0xc7e0
bpf_check+0x5107/0x9960
bpf_prog_load+0xf0e/0x2690
__sys_bpf+0x1a61/0x49d0
__x64_sys_bpf+0x7d/0xc0
do_syscall_64+0xc1/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
In the Linux kernel, the following vulnerability has been resolved:
netfilter: nf_tables: fully validate NFT_DATA_VALUE on store to data registers
register store validation for NFT_DATA_VALUE is conditional, however,
the datatype is always either NFT_DATA_VALUE or NFT_DATA_VERDICT. This
only requires a new helper function to infer the register type from the
set datatype so this conditional check can be removed. Otherwise,
pointer to chain object can be leaked through the registers.
In the Linux kernel, the following vulnerability has been resolved:
net: mana: Fix possible double free in error handling path
When auxiliary_device_add() returns error and then calls
auxiliary_device_uninit(), callback function adev_release
calls kfree(madev). We shouldn't call kfree(madev) again
in the error handling path. Set 'madev' to NULL.
In the Linux kernel, the following vulnerability has been resolved:
bpf: Take return from set_memory_ro() into account with bpf_prog_lock_ro()
set_memory_ro() can fail, leaving memory unprotected.
Check its return and take it into account as an error.
In the Linux kernel, the following vulnerability has been resolved:
bpf: Mark bpf prog stack with kmsan_unposion_memory in interpreter mode
syzbot reported uninit memory usages during map_{lookup,delete}_elem.
==========
BUG: KMSAN: uninit-value in __dev_map_lookup_elem kernel/bpf/devmap.c:441 [inline]
BUG: KMSAN: uninit-value in dev_map_lookup_elem+0xf3/0x170 kernel/bpf/devmap.c:796
__dev_map_lookup_elem kernel/bpf/devmap.c:441 [inline]
dev_map_lookup_elem+0xf3/0x170 kernel/bpf/devmap.c:796
____bpf_map_lookup_elem kernel/bpf/helpers.c:42 [inline]
bpf_map_lookup_elem+0x5c/0x80 kernel/bpf/helpers.c:38
___bpf_prog_run+0x13fe/0xe0f0 kernel/bpf/core.c:1997
__bpf_prog_run256+0xb5/0xe0 kernel/bpf/core.c:2237
==========
The reproducer should be in the interpreter mode.
The C reproducer is trying to run the following bpf prog:
0: (18) r0 = 0x0
2: (18) r1 = map[id:49]
4: (b7) r8 = 16777216
5: (7b) *(u64 *)(r10 -8) = r8
6: (bf) r2 = r10
7: (07) r2 += -229
^^^^^^^^^^
8: (b7) r3 = 8
9: (b7) r4 = 0
10: (85) call dev_map_lookup_elem#1543472
11: (95) exit
It is due to the "void *key" (r2) passed to the helper. bpf allows uninit
stack memory access for bpf prog with the right privileges. This patch
uses kmsan_unpoison_memory() to mark the stack as initialized.
This should address different syzbot reports on the uninit "void *key"
argument during map_{lookup,delete}_elem.
Twisted is an event-based framework for internet applications, supporting Python 3.6+. The `twisted.web.util.redirectTo` function contains an HTML injection vulnerability. If application code allows an attacker to control the redirect URL this vulnerability may result in Reflected Cross-Site Scripting (XSS) in the redirect response HTML body. This vulnerability is fixed in 24.7.0rc1.
In the Linux kernel, the following vulnerability has been resolved:
usb: atm: cxacru: fix endpoint checking in cxacru_bind()
Syzbot is still reporting quite an old issue [1] that occurs due to
incomplete checking of present usb endpoints. As such, wrong
endpoints types may be used at urb sumbitting stage which in turn
triggers a warning in usb_submit_urb().
Fix the issue by verifying that required endpoint types are present
for both in and out endpoints, taking into account cmd endpoint type.
Unfortunately, this patch has not been tested on real hardware.
[1] Syzbot report:
usb 1-1: BOGUS urb xfer, pipe 1 != type 3
WARNING: CPU: 0 PID: 8667 at drivers/usb/core/urb.c:502 usb_submit_urb+0xed2/0x18a0 drivers/usb/core/urb.c:502
Modules linked in:
CPU: 0 PID: 8667 Comm: kworker/0:4 Not tainted 5.14.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: usb_hub_wq hub_event
RIP: 0010:usb_submit_urb+0xed2/0x18a0 drivers/usb/core/urb.c:502
...
Call Trace:
cxacru_cm+0x3c0/0x8e0 drivers/usb/atm/cxacru.c:649
cxacru_card_status+0x22/0xd0 drivers/usb/atm/cxacru.c:760
cxacru_bind+0x7ac/0x11a0 drivers/usb/atm/cxacru.c:1209
usbatm_usb_probe+0x321/0x1ae0 drivers/usb/atm/usbatm.c:1055
cxacru_usb_probe+0xdf/0x1e0 drivers/usb/atm/cxacru.c:1363
usb_probe_interface+0x315/0x7f0 drivers/usb/core/driver.c:396
call_driver_probe drivers/base/dd.c:517 [inline]
really_probe+0x23c/0xcd0 drivers/base/dd.c:595
__driver_probe_device+0x338/0x4d0 drivers/base/dd.c:747
driver_probe_device+0x4c/0x1a0 drivers/base/dd.c:777
__device_attach_driver+0x20b/0x2f0 drivers/base/dd.c:894
bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:427
__device_attach+0x228/0x4a0 drivers/base/dd.c:965
bus_probe_device+0x1e4/0x290 drivers/base/bus.c:487
device_add+0xc2f/0x2180 drivers/base/core.c:3354
usb_set_configuration+0x113a/0x1910 drivers/usb/core/message.c:2170
usb_generic_driver_probe+0xba/0x100 drivers/usb/core/generic.c:238
usb_probe_device+0xd9/0x2c0 drivers/usb/core/driver.c:293
In the Linux kernel, the following vulnerability has been resolved:
drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_ld_modes
In nv17_tv_get_ld_modes(), the return value of drm_mode_duplicate() is
assigned to mode, which will lead to a possible NULL pointer dereference
on failure of drm_mode_duplicate(). Add a check to avoid npd.
In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: avoid using null object of framebuffer
Instead of using state->fb->obj[0] directly, get object from framebuffer
by calling drm_gem_fb_get_obj() and return error code when object is
null to avoid using null object of framebuffer.
In the Linux kernel, the following vulnerability has been resolved:
drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_hd_modes
In nv17_tv_get_hd_modes(), the return value of drm_mode_duplicate() is
assigned to mode, which will lead to a possible NULL pointer dereference
on failure of drm_mode_duplicate(). The same applies to drm_cvt_mode().
Add a check to avoid null pointer dereference.
In the Linux kernel, the following vulnerability has been resolved:
can: mcp251xfd: fix infinite loop when xmit fails
When the mcp251xfd_start_xmit() function fails, the driver stops
processing messages, and the interrupt routine does not return,
running indefinitely even after killing the running application.
Error messages:
[ 441.298819] mcp251xfd spi2.0 can0: ERROR in mcp251xfd_start_xmit: -16
[ 441.306498] mcp251xfd spi2.0 can0: Transmit Event FIFO buffer not empty. (seq=0x000017c7, tef_tail=0x000017cf, tef_head=0x000017d0, tx_head=0x000017d3).
... and repeat forever.
The issue can be triggered when multiple devices share the same SPI
interface. And there is concurrent access to the bus.
The problem occurs because tx_ring->head increments even if
mcp251xfd_start_xmit() fails. Consequently, the driver skips one TX
package while still expecting a response in
mcp251xfd_handle_tefif_one().
Resolve the issue by starting a workqueue to write the tx obj
synchronously if err = -EBUSY. In case of another error, decrement
tx_ring->head, remove skb from the echo stack, and drop the message.
[mkl: use more imperative wording in patch description]
In the Linux kernel, the following vulnerability has been resolved:
bcachefs: Fix sb_field_downgrade validation
- bch2_sb_downgrade_validate() wasn't checking for a downgrade entry
extending past the end of the superblock section
- for_each_downgrade_entry() is used in to_text() and needs to work on
malformed input; it also was missing a check for a field extending
past the end of the section
In the Linux kernel, the following vulnerability has been resolved:
net: can: j1939: enhanced error handling for tightly received RTS messages in xtp_rx_rts_session_new
This patch enhances error handling in scenarios with RTS (Request to
Send) messages arriving closely. It replaces the less informative WARN_ON_ONCE
backtraces with a new error handling method. This provides clearer error
messages and allows for the early termination of problematic sessions.
Previously, sessions were only released at the end of j1939_xtp_rx_rts().
Potentially this could be reproduced with something like:
testj1939 -r vcan0:0x80 &
while true; do
# send first RTS
cansend vcan0 18EC8090#1014000303002301;
# send second RTS
cansend vcan0 18EC8090#1014000303002301;
# send abort
cansend vcan0 18EC8090#ff00000000002301;
done
In the Linux kernel, the following vulnerability has been resolved:
nvme-fabrics: use reserved tag for reg read/write command
In some scenarios, if too many commands are issued by nvme command in
the same time by user tasks, this may exhaust all tags of admin_q. If
a reset (nvme reset or IO timeout) occurs before these commands finish,
reconnect routine may fail to update nvme regs due to insufficient tags,
which will cause kernel hang forever. In order to workaround this issue,
maybe we can let reg_read32()/reg_read64()/reg_write32() use reserved
tags. This maybe safe for nvmf:
1. For the disable ctrl path, we will not issue connect command
2. For the enable ctrl / fw activate path, since connect and reg_xx()
are called serially.
So the reserved tags may still be enough while reg_xx() use reserved tags.
In the Linux kernel, the following vulnerability has been resolved:
ila: block BH in ila_output()
As explained in commit 1378817486d6 ("tipc: block BH
before using dst_cache"), net/core/dst_cache.c
helpers need to be called with BH disabled.
ila_output() is called from lwtunnel_output()
possibly from process context, and under rcu_read_lock().
We might be interrupted by a softirq, re-enter ila_output()
and corrupt dst_cache data structures.
Fix the race by using local_bh_disable().
In the Linux kernel, the following vulnerability has been resolved:
io_uring: fix possible deadlock in io_register_iowq_max_workers()
The io_register_iowq_max_workers() function calls io_put_sq_data(),
which acquires the sqd->lock without releasing the uring_lock.
Similar to the commit 009ad9f0c6ee ("io_uring: drop ctx->uring_lock
before acquiring sqd->lock"), this can lead to a potential deadlock
situation.
To resolve this issue, the uring_lock is released before calling
io_put_sq_data(), and then it is re-acquired after the function call.
This change ensures that the locks are acquired in the correct
order, preventing the possibility of a deadlock.
In the Linux kernel, the following vulnerability has been resolved:
nvmet: always initialize cqe.result
The spec doesn't mandate that the first two double words (aka results)
for the command queue entry need to be set to 0 when they are not
used (not specified). Though, the target implemention returns 0 for TCP
and FC but not for RDMA.
Let's make RDMA behave the same and thus explicitly initializing the
result field. This prevents leaking any data from the stack.
In the Linux kernel, the following vulnerability has been resolved:
btrfs: qgroup: fix quota root leak after quota disable failure
If during the quota disable we fail when cleaning the quota tree or when
deleting the root from the root tree, we jump to the 'out' label without
ever dropping the reference on the quota root, resulting in a leak of the
root since fs_info->quota_root is no longer pointing to the root (we have
set it to NULL just before those steps).
Fix this by always doing a btrfs_put_root() call under the 'out' label.
This is a problem that exists since qgroups were first added in 2012 by
commit bed92eae26cc ("Btrfs: qgroup implementation and prototypes"), but
back then we missed a kfree on the quota root and free_extent_buffer()
calls on its root and commit root nodes, since back then roots were not
yet reference counted.
In the Linux kernel, the following vulnerability has been resolved:
null_blk: fix validation of block size
Block size should be between 512 and PAGE_SIZE and be a power of 2. The current
check does not validate this, so update the check.
Without this patch, null_blk would Oops due to a null pointer deref when
loaded with bs=1536 [1].
[axboe: remove unnecessary braces and != 0 check]
In the Linux kernel, the following vulnerability has been resolved:
NFSv4: Fix memory leak in nfs4_set_security_label
We leak nfs_fattr and nfs4_label every time we set a security xattr.
In the Linux kernel, the following vulnerability has been resolved:
cachefiles: add consistency check for copen/cread
This prevents malicious processes from completing random copen/cread
requests and crashing the system. Added checks are listed below:
* Generic, copen can only complete open requests, and cread can only
complete read requests.
* For copen, ondemand_id must not be 0, because this indicates that the
request has not been read by the daemon.
* For cread, the object corresponding to fd and req should be the same.
In the Linux kernel, the following vulnerability has been resolved:
wifi: cfg80211: wext: add extra SIOCSIWSCAN data check
In 'cfg80211_wext_siwscan()', add extra check whether number of
channels passed via 'ioctl(sock, SIOCSIWSCAN, ...)' doesn't exceed
IW_MAX_FREQUENCIES and reject invalid request with -EINVAL otherwise.
In the Linux kernel, the following vulnerability has been resolved:
s390/sclp: Fix sclp_init() cleanup on failure
If sclp_init() fails it only partially cleans up: if there are multiple
failing calls to sclp_init() sclp_state_change_event will be added several
times to sclp_reg_list, which results in the following warning:
------------[ cut here ]------------
list_add double add: new=000003ffe1598c10, prev=000003ffe1598bf0, next=000003ffe1598c10.
WARNING: CPU: 0 PID: 1 at lib/list_debug.c:35 __list_add_valid_or_report+0xde/0xf8
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.10.0-rc3
Krnl PSW : 0404c00180000000 000003ffe0d6076a (__list_add_valid_or_report+0xe2/0xf8)
R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
...
Call Trace:
[<000003ffe0d6076a>] __list_add_valid_or_report+0xe2/0xf8
([<000003ffe0d60766>] __list_add_valid_or_report+0xde/0xf8)
[<000003ffe0a8d37e>] sclp_init+0x40e/0x450
[<000003ffe00009f2>] do_one_initcall+0x42/0x1e0
[<000003ffe15b77a6>] do_initcalls+0x126/0x150
[<000003ffe15b7a0a>] kernel_init_freeable+0x1ba/0x1f8
[<000003ffe0d6650e>] kernel_init+0x2e/0x180
[<000003ffe000301c>] __ret_from_fork+0x3c/0x60
[<000003ffe0d759ca>] ret_from_fork+0xa/0x30
Fix this by removing sclp_state_change_event from sclp_reg_list when
sclp_init() fails.
In the Linux kernel, the following vulnerability has been resolved:
btrfs: scrub: handle RST lookup error correctly
[BUG]
When running btrfs/060 with forced RST feature, it would crash the
following ASSERT() inside scrub_read_endio():
ASSERT(sector_nr < stripe->nr_sectors);
Before that, we would have tree dump from
btrfs_get_raid_extent_offset(), as we failed to find the RST entry for
the range.
[CAUSE]
Inside scrub_submit_extent_sector_read() every time we allocated a new
bbio we immediately called btrfs_map_block() to make sure there was some
RST range covering the scrub target.
But if btrfs_map_block() fails, we immediately call endio for the bbio,
while the bbio is newly allocated, it's completely empty.
Then inside scrub_read_endio(), we go through the bvecs to find
the sector number (as bi_sector is no longer reliable if the bio is
submitted to lower layers).
And since the bio is empty, such bvecs iteration would not find any
sector matching the sector, and return sector_nr == stripe->nr_sectors,
triggering the ASSERT().
[FIX]
Instead of calling btrfs_map_block() after allocating a new bbio, call
btrfs_map_block() first.
Since our only objective of calling btrfs_map_block() is only to update
stripe_len, there is really no need to do that after btrfs_alloc_bio().
This new timing would avoid the problem of handling empty bbio
completely, and in fact fixes a possible race window for the old code,
where if the submission thread is the only owner of the pending_io, the
scrub would never finish (since we didn't decrease the pending_io
counter).
Although the root cause of RST lookup failure still needs to be
addressed.
In the Linux kernel, the following vulnerability has been resolved:
ibmvnic: Add tx check to prevent skb leak
Below is a summary of how the driver stores a reference to an skb during
transmit:
tx_buff[free_map[consumer_index]]->skb = new_skb;
free_map[consumer_index] = IBMVNIC_INVALID_MAP;
consumer_index ++;
Where variable data looks like this:
free_map == [4, IBMVNIC_INVALID_MAP, IBMVNIC_INVALID_MAP, 0, 3]
consumer_index^
tx_buff == [skb=null, skb=<ptr>, skb=<ptr>, skb=null, skb=null]
The driver has checks to ensure that free_map[consumer_index] pointed to
a valid index but there was no check to ensure that this index pointed
to an unused/null skb address. So, if, by some chance, our free_map and
tx_buff lists become out of sync then we were previously risking an
skb memory leak. This could then cause tcp congestion control to stop
sending packets, eventually leading to ETIMEDOUT.
Therefore, add a conditional to ensure that the skb address is null. If
not then warn the user (because this is still a bug that should be
patched) and free the old pointer to prevent memleak/tcp problems.
In the Linux kernel, the following vulnerability has been resolved:
powerpc/pseries: Whitelist dtl slub object for copying to userspace
Reading the dispatch trace log from /sys/kernel/debug/powerpc/dtl/cpu-*
results in a BUG() when the config CONFIG_HARDENED_USERCOPY is enabled as
shown below.
kernel BUG at mm/usercopy.c:102!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: xfs libcrc32c dm_service_time sd_mod t10_pi sg ibmvfc
scsi_transport_fc ibmveth pseries_wdt dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse
CPU: 27 PID: 1815 Comm: python3 Not tainted 6.10.0-rc3 #85
Hardware name: IBM,9040-MRX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NM1060_042) hv:phyp pSeries
NIP: c0000000005d23d4 LR: c0000000005d23d0 CTR: 00000000006ee6f8
REGS: c000000120c078c0 TRAP: 0700 Not tainted (6.10.0-rc3)
MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 2828220f XER: 0000000e
CFAR: c0000000001fdc80 IRQMASK: 0
[ ... GPRs omitted ... ]
NIP [c0000000005d23d4] usercopy_abort+0x78/0xb0
LR [c0000000005d23d0] usercopy_abort+0x74/0xb0
Call Trace:
usercopy_abort+0x74/0xb0 (unreliable)
__check_heap_object+0xf8/0x120
check_heap_object+0x218/0x240
__check_object_size+0x84/0x1a4
dtl_file_read+0x17c/0x2c4
full_proxy_read+0x8c/0x110
vfs_read+0xdc/0x3a0
ksys_read+0x84/0x144
system_call_exception+0x124/0x330
system_call_vectored_common+0x15c/0x2ec
--- interrupt: 3000 at 0x7fff81f3ab34
Commit 6d07d1cd300f ("usercopy: Restrict non-usercopy caches to size 0")
requires that only whitelisted areas in slab/slub objects can be copied to
userspace when usercopy hardening is enabled using CONFIG_HARDENED_USERCOPY.
Dtl contains hypervisor dispatch events which are expected to be read by
privileged users. Hence mark this safe for user access.
Specify useroffset=0 and usersize=DISPATCH_LOG_BYTES to whitelist the
entire object.
In the Linux kernel, the following vulnerability has been resolved:
powerpc/eeh: avoid possible crash when edev->pdev changes
If a PCI device is removed during eeh_pe_report_edev(), edev->pdev
will change and can cause a crash, hold the PCI rescan/remove lock
while taking a copy of edev->pdev->bus.
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: hci_core: cancel all works upon hci_unregister_dev()
syzbot is reporting that calling hci_release_dev() from hci_error_reset()
due to hci_dev_put() from hci_error_reset() can cause deadlock at
destroy_workqueue(), for hci_error_reset() is called from
hdev->req_workqueue which destroy_workqueue() needs to flush.
We need to make sure that hdev->{rx_work,cmd_work,tx_work} which are
queued into hdev->workqueue and hdev->{power_on,error_reset} which are
queued into hdev->req_workqueue are no longer running by the moment
destroy_workqueue(hdev->workqueue);
destroy_workqueue(hdev->req_workqueue);
are called from hci_release_dev().
Call cancel_work_sync() on these work items from hci_unregister_dev()
as soon as hdev->list is removed from hci_dev_list.
In the Linux kernel, the following vulnerability has been resolved:
bluetooth/l2cap: sync sock recv cb and release
The problem occurs between the system call to close the sock and hci_rx_work,
where the former releases the sock and the latter accesses it without lock protection.
CPU0 CPU1
---- ----
sock_close hci_rx_work
l2cap_sock_release hci_acldata_packet
l2cap_sock_kill l2cap_recv_frame
sk_free l2cap_conless_channel
l2cap_sock_recv_cb
If hci_rx_work processes the data that needs to be received before the sock is
closed, then everything is normal; Otherwise, the work thread may access the
released sock when receiving data.
Add a chan mutex in the rx callback of the sock to achieve synchronization between
the sock release and recv cb.
Sock is dead, so set chan data to NULL, avoid others use invalid sock pointer.
In the Linux kernel, the following vulnerability has been resolved:
drm/radeon: check bo_va->bo is non-NULL before using it
The call to radeon_vm_clear_freed might clear bo_va->bo, so
we have to check it before dereferencing it.
In the Linux kernel, the following vulnerability has been resolved:
firmware: cs_dsp: Use strnlen() on name fields in V1 wmfw files
Use strnlen() instead of strlen() on the algorithm and coefficient name
string arrays in V1 wmfw files.
In V1 wmfw files the name is a NUL-terminated string in a fixed-size
array. cs_dsp should protect against overrunning the array if the NUL
terminator is missing.
In the Linux kernel, the following vulnerability has been resolved:
mm: prevent derefencing NULL ptr in pfn_section_valid()
Commit 5ec8e8ea8b77 ("mm/sparsemem: fix race in accessing
memory_section->usage") changed pfn_section_valid() to add a READ_ONCE()
call around "ms->usage" to fix a race with section_deactivate() where
ms->usage can be cleared. The READ_ONCE() call, by itself, is not enough
to prevent NULL pointer dereference. We need to check its value before
dereferencing it.
In the Linux kernel, the following vulnerability has been resolved:
skmsg: Skip zero length skb in sk_msg_recvmsg
When running BPF selftests (./test_progs -t sockmap_basic) on a Loongarch
platform, the following kernel panic occurs:
[...]
Oops[#1]:
CPU: 22 PID: 2824 Comm: test_progs Tainted: G OE 6.10.0-rc2+ #18
Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
... ...
ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
PRMD: 0000000c (PPLV0 +PIE +PWE)
EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
BADV: 0000000000000040
PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...)
Stack : ...
Call Trace:
[<9000000004162774>] copy_page_to_iter+0x74/0x1c0
[<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
[<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
[<90000000049aae34>] inet_recvmsg+0x54/0x100
[<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
[<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
[<900000000481e27c>] sys_recvfrom+0x1c/0x40
[<9000000004c076ec>] do_syscall+0x8c/0xc0
[<9000000003731da4>] handle_syscall+0xc4/0x160
Code: ...
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception
Kernel relocated by 0x3510000
.text @ 0x9000000003710000
.data @ 0x9000000004d70000
.bss @ 0x9000000006469400
---[ end Kernel panic - not syncing: Fatal exception ]---
[...]
This crash happens every time when running sockmap_skb_verdict_shutdown
subtest in sockmap_basic.
This crash is because a NULL pointer is passed to page_address() in the
sk_msg_recvmsg(). Due to the different implementations depending on the
architecture, page_address(NULL) will trigger a panic on Loongarch
platform but not on x86 platform. So this bug was hidden on x86 platform
for a while, but now it is exposed on Loongarch platform. The root cause
is that a zero length skb (skb->len == 0) was put on the queue.
This zero length skb is a TCP FIN packet, which was sent by shutdown(),
invoked in test_sockmap_skb_verdict_shutdown():
shutdown(p1, SHUT_WR);
In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero, and no
page is put to this sge (see sg_set_page in sg_set_page), but this empty
sge is queued into ingress_msg list.
And in sk_msg_recvmsg(), this empty sge is used, and a NULL page is got by
sg_page(sge). Pass this NULL page to copy_page_to_iter(), which passes it
to kmap_local_page() and to page_address(), then kernel panics.
To solve this, we should skip this zero length skb. So in sk_msg_recvmsg(),
if copy is zero, that means it's a zero length skb, skip invoking
copy_page_to_iter(). We are using the EFAULT return triggered by
copy_page_to_iter to check for is_fin in tcp_bpf.c.
In the Linux kernel, the following vulnerability has been resolved:
i40e: Fix XDP program unloading while removing the driver
The commit 6533e558c650 ("i40e: Fix reset path while removing
the driver") introduced a new PF state "__I40E_IN_REMOVE" to block
modifying the XDP program while the driver is being removed.
Unfortunately, such a change is useful only if the ".ndo_bpf()"
callback was called out of the rmmod context because unloading the
existing XDP program is also a part of driver removing procedure.
In other words, from the rmmod context the driver is expected to
unload the XDP program without reporting any errors. Otherwise,
the kernel warning with callstack is printed out to dmesg.
Example failing scenario:
1. Load the i40e driver.
2. Load the XDP program.
3. Unload the i40e driver (using "rmmod" command).
The example kernel warning log:
[ +0.004646] WARNING: CPU: 94 PID: 10395 at net/core/dev.c:9290 unregister_netdevice_many_notify+0x7a9/0x870
[...]
[ +0.010959] RIP: 0010:unregister_netdevice_many_notify+0x7a9/0x870
[...]
[ +0.002726] Call Trace:
[ +0.002457] <TASK>
[ +0.002119] ? __warn+0x80/0x120
[ +0.003245] ? unregister_netdevice_many_notify+0x7a9/0x870
[ +0.005586] ? report_bug+0x164/0x190
[ +0.003678] ? handle_bug+0x3c/0x80
[ +0.003503] ? exc_invalid_op+0x17/0x70
[ +0.003846] ? asm_exc_invalid_op+0x1a/0x20
[ +0.004200] ? unregister_netdevice_many_notify+0x7a9/0x870
[ +0.005579] ? unregister_netdevice_many_notify+0x3cc/0x870
[ +0.005586] unregister_netdevice_queue+0xf7/0x140
[ +0.004806] unregister_netdev+0x1c/0x30
[ +0.003933] i40e_vsi_release+0x87/0x2f0 [i40e]
[ +0.004604] i40e_remove+0x1a1/0x420 [i40e]
[ +0.004220] pci_device_remove+0x3f/0xb0
[ +0.003943] device_release_driver_internal+0x19f/0x200
[ +0.005243] driver_detach+0x48/0x90
[ +0.003586] bus_remove_driver+0x6d/0xf0
[ +0.003939] pci_unregister_driver+0x2e/0xb0
[ +0.004278] i40e_exit_module+0x10/0x5f0 [i40e]
[ +0.004570] __do_sys_delete_module.isra.0+0x197/0x310
[ +0.005153] do_syscall_64+0x85/0x170
[ +0.003684] ? syscall_exit_to_user_mode+0x69/0x220
[ +0.004886] ? do_syscall_64+0x95/0x170
[ +0.003851] ? exc_page_fault+0x7e/0x180
[ +0.003932] entry_SYSCALL_64_after_hwframe+0x71/0x79
[ +0.005064] RIP: 0033:0x7f59dc9347cb
[ +0.003648] Code: 73 01 c3 48 8b 0d 65 16 0c 00 f7 d8 64 89 01 48 83
c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 35 16 0c 00 f7 d8 64 89 01 48
[ +0.018753] RSP: 002b:00007ffffac99048 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ +0.007577] RAX: ffffffffffffffda RBX: 0000559b9bb2f6e0 RCX: 00007f59dc9347cb
[ +0.007140] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000559b9bb2f748
[ +0.007146] RBP: 00007ffffac99070 R08: 1999999999999999 R09: 0000000000000000
[ +0.007133] R10: 00007f59dc9a5ac0 R11: 0000000000000206 R12: 0000000000000000
[ +0.007141] R13: 00007ffffac992d8 R14: 0000559b9bb2f6e0 R15: 0000000000000000
[ +0.007151] </TASK>
[ +0.002204] ---[ end trace 0000000000000000 ]---
Fix this by checking if the XDP program is being loaded or unloaded.
Then, block only loading a new program while "__I40E_IN_REMOVE" is set.
Also, move testing "__I40E_IN_REMOVE" flag to the beginning of XDP_SETUP
callback to avoid unnecessary operations and checks.
In the Linux kernel, the following vulnerability has been resolved:
ppp: reject claimed-as-LCP but actually malformed packets
Since 'ppp_async_encode()' assumes valid LCP packets (with code
from 1 to 7 inclusive), add 'ppp_check_packet()' to ensure that
LCP packet has an actual body beyond PPP_LCP header bytes, and
reject claimed-as-LCP but actually malformed data otherwise.
In the Linux kernel, the following vulnerability has been resolved:
firmware: cs_dsp: Prevent buffer overrun when processing V2 alg headers
Check that all fields of a V2 algorithm header fit into the available
firmware data buffer.
The wmfw V2 format introduced variable-length strings in the algorithm
block header. This means the overall header length is variable, and the
position of most fields varies depending on the length of the string
fields. Each field must be checked to ensure that it does not overflow
the firmware data buffer.
As this ia bugfix patch, the fixes avoid making any significant change to
the existing code. This makes it easier to review and less likely to
introduce new bugs.