In the Linux kernel, the following vulnerability has been resolved:
perf: arm-ni: Unregister PMUs on probe failure
When a resource allocation fails in one clock domain of an NI device,
we need to properly roll back all previously registered perf PMUs in
other clock domains of the same device.
Otherwise, it can lead to kernel panics.
Calling arm_ni_init+0x0/0xff8 [arm_ni] @ 2374
arm-ni ARMHCB70:00: Failed to request PMU region 0x1f3c13000
arm-ni ARMHCB70:00: probe with driver arm-ni failed with error -16
list_add corruption: next->prev should be prev (fffffd01e9698a18),
but was 0000000000000000. (next=ffff10001a0decc8).
pstate: 6340009 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
pc : list_add_valid_or_report+0x7c/0xb8
lr : list_add_valid_or_report+0x7c/0xb8
Call trace:
__list_add_valid_or_report+0x7c/0xb8
perf_pmu_register+0x22c/0x3a0
arm_ni_probe+0x554/0x70c [arm_ni]
platform_probe+0x70/0xe8
really_probe+0xc6/0x4d8
driver_probe_device+0x48/0x170
__driver_attach+0x8e/0x1c0
bus_for_each_dev+0x64/0xf0
driver_add+0x138/0x260
bus_add_driver+0x68/0x138
__platform_driver_register+0x2c/0x40
arm_ni_init+0x14/0x2a [arm_ni]
do_init_module+0x36/0x298
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops - BUG: Fatal exception
SMP: stopping secondary CPUs
In the Linux kernel, the following vulnerability has been resolved:
fs/ntfs3: handle hdr_first_de() return value
The hdr_first_de() function returns a pointer to a struct NTFS_DE. This
pointer may be NULL. To handle the NULL error effectively, it is important
to implement an error handler. This will help manage potential errors
consistently.
Additionally, error handling for the return value already exists at other
points where this function is called.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
In the Linux kernel, the following vulnerability has been resolved:
bpf: fix ktls panic with sockmap
[ 2172.936997] ------------[ cut here ]------------
[ 2172.936999] kernel BUG at lib/iov_iter.c:629!
......
[ 2172.944996] PKRU: 55555554
[ 2172.945155] Call Trace:
[ 2172.945299] <TASK>
[ 2172.945428] ? die+0x36/0x90
[ 2172.945601] ? do_trap+0xdd/0x100
[ 2172.945795] ? iov_iter_revert+0x178/0x180
[ 2172.946031] ? iov_iter_revert+0x178/0x180
[ 2172.946267] ? do_error_trap+0x7d/0x110
[ 2172.946499] ? iov_iter_revert+0x178/0x180
[ 2172.946736] ? exc_invalid_op+0x50/0x70
[ 2172.946961] ? iov_iter_revert+0x178/0x180
[ 2172.947197] ? asm_exc_invalid_op+0x1a/0x20
[ 2172.947446] ? iov_iter_revert+0x178/0x180
[ 2172.947683] ? iov_iter_revert+0x5c/0x180
[ 2172.947913] tls_sw_sendmsg_locked.isra.0+0x794/0x840
[ 2172.948206] tls_sw_sendmsg+0x52/0x80
[ 2172.948420] ? inet_sendmsg+0x1f/0x70
[ 2172.948634] __sys_sendto+0x1cd/0x200
[ 2172.948848] ? find_held_lock+0x2b/0x80
[ 2172.949072] ? syscall_trace_enter+0x140/0x270
[ 2172.949330] ? __lock_release.isra.0+0x5e/0x170
[ 2172.949595] ? find_held_lock+0x2b/0x80
[ 2172.949817] ? syscall_trace_enter+0x140/0x270
[ 2172.950211] ? lockdep_hardirqs_on_prepare+0xda/0x190
[ 2172.950632] ? ktime_get_coarse_real_ts64+0xc2/0xd0
[ 2172.951036] __x64_sys_sendto+0x24/0x30
[ 2172.951382] do_syscall_64+0x90/0x170
......
After calling bpf_exec_tx_verdict(), the size of msg_pl->sg may increase,
e.g., when the BPF program executes bpf_msg_push_data().
If the BPF program sets cork_bytes and sg.size is smaller than cork_bytes,
it will return -ENOSPC and attempt to roll back to the non-zero copy
logic. However, during rollback, msg->msg_iter is reset, but since
msg_pl->sg.size has been increased, subsequent executions will exceed the
actual size of msg_iter.
'''
iov_iter_revert(&msg->msg_iter, msg_pl->sg.size - orig_size);
'''
The changes in this commit are based on the following considerations:
1. When cork_bytes is set, rolling back to non-zero copy logic is
pointless and can directly go to zero-copy logic.
2. We can not calculate the correct number of bytes to revert msg_iter.
Assume the original data is "abcdefgh" (8 bytes), and after 3 pushes
by the BPF program, it becomes 11-byte data: "abc?de?fgh?".
Then, we set cork_bytes to 6, which means the first 6 bytes have been
processed, and the remaining 5 bytes "?fgh?" will be cached until the
length meets the cork_bytes requirement.
However, some data in "?fgh?" is not within 'sg->msg_iter'
(but in msg_pl instead), especially the data "?" we pushed.
So it doesn't seem as simple as just reverting through an offset of
msg_iter.
3. For non-TLS sockets in tcp_bpf_sendmsg, when a "cork" situation occurs,
the user-space send() doesn't return an error, and the returned length is
the same as the input length parameter, even if some data is cached.
Additionally, I saw that the current non-zero-copy logic for handling
corking is written as:
'''
line 1177
else if (ret != -EAGAIN) {
if (ret == -ENOSPC)
ret = 0;
goto send_end;
'''
So it's ok to just return 'copied' without error when a "cork" situation
occurs.
In the Linux kernel, the following vulnerability has been resolved:
bpf, sockmap: Fix panic when calling skb_linearize
The panic can be reproduced by executing the command:
./bench sockmap -c 2 -p 1 -a --rx-verdict-ingress --rx-strp 100000
Then a kernel panic was captured:
'''
[ 657.460555] kernel BUG at net/core/skbuff.c:2178!
[ 657.462680] Tainted: [W]=WARN
[ 657.463287] Workqueue: events sk_psock_backlog
...
[ 657.469610] <TASK>
[ 657.469738] ? die+0x36/0x90
[ 657.469916] ? do_trap+0x1d0/0x270
[ 657.470118] ? pskb_expand_head+0x612/0xf40
[ 657.470376] ? pskb_expand_head+0x612/0xf40
[ 657.470620] ? do_error_trap+0xa3/0x170
[ 657.470846] ? pskb_expand_head+0x612/0xf40
[ 657.471092] ? handle_invalid_op+0x2c/0x40
[ 657.471335] ? pskb_expand_head+0x612/0xf40
[ 657.471579] ? exc_invalid_op+0x2d/0x40
[ 657.471805] ? asm_exc_invalid_op+0x1a/0x20
[ 657.472052] ? pskb_expand_head+0xd1/0xf40
[ 657.472292] ? pskb_expand_head+0x612/0xf40
[ 657.472540] ? lock_acquire+0x18f/0x4e0
[ 657.472766] ? find_held_lock+0x2d/0x110
[ 657.472999] ? __pfx_pskb_expand_head+0x10/0x10
[ 657.473263] ? __kmalloc_cache_noprof+0x5b/0x470
[ 657.473537] ? __pfx___lock_release.isra.0+0x10/0x10
[ 657.473826] __pskb_pull_tail+0xfd/0x1d20
[ 657.474062] ? __kasan_slab_alloc+0x4e/0x90
[ 657.474707] sk_psock_skb_ingress_enqueue+0x3bf/0x510
[ 657.475392] ? __kasan_kmalloc+0xaa/0xb0
[ 657.476010] sk_psock_backlog+0x5cf/0xd70
[ 657.476637] process_one_work+0x858/0x1a20
'''
The panic originates from the assertion BUG_ON(skb_shared(skb)) in
skb_linearize(). A previous commit(see Fixes tag) introduced skb_get()
to avoid race conditions between skb operations in the backlog and skb
release in the recvmsg path. However, this caused the panic to always
occur when skb_linearize is executed.
The "--rx-strp 100000" parameter forces the RX path to use the strparser
module which aggregates data until it reaches 100KB before calling sockmap
logic. The 100KB payload exceeds MAX_MSG_FRAGS, triggering skb_linearize.
To fix this issue, just move skb_get into sk_psock_skb_ingress_enqueue.
'''
sk_psock_backlog:
sk_psock_handle_skb
skb_get(skb) <== we move it into 'sk_psock_skb_ingress_enqueue'
sk_psock_skb_ingress____________
↓
|
| → sk_psock_skb_ingress_self
| sk_psock_skb_ingress_enqueue
sk_psock_verdict_apply_________________↑ skb_linearize
'''
Note that for verdict_apply path, the skb_get operation is unnecessary so
we add 'take_ref' param to control it's behavior.
In the Linux kernel, the following vulnerability has been resolved:
f2fs: zone: fix to avoid inconsistence in between SIT and SSA
w/ below testcase, it will cause inconsistence in between SIT and SSA.
create_null_blk 512 2 1024 1024
mkfs.f2fs -m /dev/nullb0
mount /dev/nullb0 /mnt/f2fs/
touch /mnt/f2fs/file
f2fs_io pinfile set /mnt/f2fs/file
fallocate -l 4GiB /mnt/f2fs/file
F2FS-fs (nullb0): Inconsistent segment (0) type [1, 0] in SSA and SIT
CPU: 5 UID: 0 PID: 2398 Comm: fallocate Tainted: G O 6.13.0-rc1 #84
Tainted: [O]=OOT_MODULE
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Call Trace:
<TASK>
dump_stack_lvl+0xb3/0xd0
dump_stack+0x14/0x20
f2fs_handle_critical_error+0x18c/0x220 [f2fs]
f2fs_stop_checkpoint+0x38/0x50 [f2fs]
do_garbage_collect+0x674/0x6e0 [f2fs]
f2fs_gc_range+0x12b/0x230 [f2fs]
f2fs_allocate_pinning_section+0x5c/0x150 [f2fs]
f2fs_expand_inode_data+0x1cc/0x3c0 [f2fs]
f2fs_fallocate+0x3c3/0x410 [f2fs]
vfs_fallocate+0x15f/0x4b0
__x64_sys_fallocate+0x4a/0x80
x64_sys_call+0x15e8/0x1b80
do_syscall_64+0x68/0x130
entry_SYSCALL_64_after_hwframe+0x67/0x6f
RIP: 0033:0x7f9dba5197ca
F2FS-fs (nullb0): Stopped filesystem due to reason: 4
The reason is f2fs_gc_range() may try to migrate block in curseg, however,
its SSA block is not uptodate due to the last summary block data is still
in cache of curseg.
In this patch, we add a condition in f2fs_gc_range() to check whether
section is opened or not, and skip block migration for opened section.
In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix to do sanity check on sbi->total_valid_block_count
syzbot reported a f2fs bug as below:
------------[ cut here ]------------
kernel BUG at fs/f2fs/f2fs.h:2521!
RIP: 0010:dec_valid_block_count+0x3b2/0x3c0 fs/f2fs/f2fs.h:2521
Call Trace:
f2fs_truncate_data_blocks_range+0xc8c/0x11a0 fs/f2fs/file.c:695
truncate_dnode+0x417/0x740 fs/f2fs/node.c:973
truncate_nodes+0x3ec/0xf50 fs/f2fs/node.c:1014
f2fs_truncate_inode_blocks+0x8e3/0x1370 fs/f2fs/node.c:1197
f2fs_do_truncate_blocks+0x840/0x12b0 fs/f2fs/file.c:810
f2fs_truncate_blocks+0x10d/0x300 fs/f2fs/file.c:838
f2fs_truncate+0x417/0x720 fs/f2fs/file.c:888
f2fs_setattr+0xc4f/0x12f0 fs/f2fs/file.c:1112
notify_change+0xbca/0xe90 fs/attr.c:552
do_truncate+0x222/0x310 fs/open.c:65
handle_truncate fs/namei.c:3466 [inline]
do_open fs/namei.c:3849 [inline]
path_openat+0x2e4f/0x35d0 fs/namei.c:4004
do_filp_open+0x284/0x4e0 fs/namei.c:4031
do_sys_openat2+0x12b/0x1d0 fs/open.c:1429
do_sys_open fs/open.c:1444 [inline]
__do_sys_creat fs/open.c:1522 [inline]
__se_sys_creat fs/open.c:1516 [inline]
__x64_sys_creat+0x124/0x170 fs/open.c:1516
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/syscall_64.c:94
The reason is: in fuzzed image, sbi->total_valid_block_count is
inconsistent w/ mapped blocks indexed by inode, so, we should
not trigger panic for such case, instead, let's print log and
set fsck flag.
In the Linux kernel, the following vulnerability has been resolved:
netfilter: nft_set_pipapo: prevent overflow in lookup table allocation
When calculating the lookup table size, ensure the following
multiplication does not overflow:
- desc->field_len[] maximum value is U8_MAX multiplied by
NFT_PIPAPO_GROUPS_PER_BYTE(f) that can be 2, worst case.
- NFT_PIPAPO_BUCKETS(f->bb) is 2^8, worst case.
- sizeof(unsigned long), from sizeof(*f->lt), lt in
struct nft_pipapo_field.
Then, use check_mul_overflow() to multiply by bucket size and then use
check_add_overflow() to the alignment for avx2 (if needed). Finally, add
lt_size_check_overflow() helper and use it to consolidate this.
While at it, replace leftover allocation using the GFP_KERNEL to
GFP_KERNEL_ACCOUNT for consistency, in pipapo_resize().
In the Linux kernel, the following vulnerability has been resolved:
clk: bcm: rpi: Add NULL check in raspberrypi_clk_register()
devm_kasprintf() returns NULL when memory allocation fails. Currently,
raspberrypi_clk_register() does not check for this case, which results
in a NULL pointer dereference.
Add NULL check after devm_kasprintf() to prevent this issue.
In the Linux kernel, the following vulnerability has been resolved:
hisi_acc_vfio_pci: fix XQE dma address error
The dma addresses of EQE and AEQE are wrong after migration and
results in guest kernel-mode encryption services failure.
Comparing the definition of hardware registers, we found that
there was an error when the data read from the register was
combined into an address. Therefore, the address combination
sequence needs to be corrected.
Even after fixing the above problem, we still have an issue
where the Guest from an old kernel can get migrated to
new kernel and may result in wrong data.
In order to ensure that the address is correct after migration,
if an old magic number is detected, the dma address needs to be
updated.
In the Linux kernel, the following vulnerability has been resolved:
wifi: mt76: mt7996: Fix null-ptr-deref in mt7996_mmio_wed_init()
devm_ioremap() returns NULL on error. Currently, mt7996_mmio_wed_init()
does not check for this case, which results in a NULL pointer
dereference.
Prevent null pointer dereference in mt7996_mmio_wed_init()
In the Linux kernel, the following vulnerability has been resolved:
wifi: mt76: mt7915: Fix null-ptr-deref in mt7915_mmio_wed_init()
devm_ioremap() returns NULL on error. Currently, mt7915_mmio_wed_init()
does not check for this case, which results in a NULL pointer
dereference.
Prevent null pointer dereference in mt7915_mmio_wed_init().
In the Linux kernel, the following vulnerability has been resolved:
RDMA/cma: Fix hang when cma_netevent_callback fails to queue_work
The cited commit fixed a crash when cma_netevent_callback was called for
a cma_id while work on that id from a previous call had not yet started.
The work item was re-initialized in the second call, which corrupted the
work item currently in the work queue.
However, it left a problem when queue_work fails (because the item is
still pending in the work queue from a previous call). In this case,
cma_id_put (which is called in the work handler) is therefore not
called. This results in a userspace process hang (zombie process).
Fix this by calling cma_id_put() if queue_work fails.
In the Linux kernel, the following vulnerability has been resolved:
af_packet: move notifier's packet_dev_mc out of rcu critical section
Syzkaller reports the following issue:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:578
__mutex_lock+0x106/0xe80 kernel/locking/mutex.c:746
team_change_rx_flags+0x38/0x220 drivers/net/team/team_core.c:1781
dev_change_rx_flags net/core/dev.c:9145 [inline]
__dev_set_promiscuity+0x3f8/0x590 net/core/dev.c:9189
netif_set_promiscuity+0x50/0xe0 net/core/dev.c:9201
dev_set_promiscuity+0x126/0x260 net/core/dev_api.c:286 packet_dev_mc net/packet/af_packet.c:3698 [inline]
packet_dev_mclist_delete net/packet/af_packet.c:3722 [inline]
packet_notifier+0x292/0xa60 net/packet/af_packet.c:4247
notifier_call_chain+0x1b3/0x3e0 kernel/notifier.c:85
call_netdevice_notifiers_extack net/core/dev.c:2214 [inline]
call_netdevice_notifiers net/core/dev.c:2228 [inline]
unregister_netdevice_many_notify+0x15d8/0x2330 net/core/dev.c:11972
rtnl_delete_link net/core/rtnetlink.c:3522 [inline]
rtnl_dellink+0x488/0x710 net/core/rtnetlink.c:3564
rtnetlink_rcv_msg+0x7cf/0xb70 net/core/rtnetlink.c:6955
netlink_rcv_skb+0x219/0x490 net/netlink/af_netlink.c:2534
Calling `PACKET_ADD_MEMBERSHIP` on an ops-locked device can trigger
the `NETDEV_UNREGISTER` notifier, which may require disabling promiscuous
and/or allmulti mode. Both of these operations require acquiring
the netdev instance lock.
Move the call to `packet_dev_mc` outside of the RCU critical section.
The `mclist` modifications (add, del, flush, unregister) are protected by
the RTNL, not the RCU. The RCU only protects the `sklist` and its
associated `sks`. The delayed operation on the `mclist` entry remains
within the RTNL.
In the Linux kernel, the following vulnerability has been resolved:
net: phy: clear phydev->devlink when the link is deleted
There is a potential crash issue when disabling and re-enabling the
network port. When disabling the network port, phy_detach() calls
device_link_del() to remove the device link, but it does not clear
phydev->devlink, so phydev->devlink is not a NULL pointer. Then the
network port is re-enabled, but if phy_attach_direct() fails before
calling device_link_add(), the code jumps to the "error" label and
calls phy_detach(). Since phydev->devlink retains the old value from
the previous attach/detach cycle, device_link_del() uses the old value,
which accesses a NULL pointer and causes a crash. The simplified crash
log is as follows.
[ 24.702421] Call trace:
[ 24.704856] device_link_put_kref+0x20/0x120
[ 24.709124] device_link_del+0x30/0x48
[ 24.712864] phy_detach+0x24/0x168
[ 24.716261] phy_attach_direct+0x168/0x3a4
[ 24.720352] phylink_fwnode_phy_connect+0xc8/0x14c
[ 24.725140] phylink_of_phy_connect+0x1c/0x34
Therefore, phydev->devlink needs to be cleared when the device link is
deleted.
In the Linux kernel, the following vulnerability has been resolved:
net: phy: mscc: Fix memory leak when using one step timestamping
Fix memory leak when running one-step timestamping. When running
one-step sync timestamping, the HW is configured to insert the TX time
into the frame, so there is no reason to keep the skb anymore. As in
this case the HW will never generate an interrupt to say that the frame
was timestamped, then the frame will never released.
Fix this by freeing the frame in case of one-step timestamping.
In the Linux kernel, the following vulnerability has been resolved:
soc: aspeed: Add NULL check in aspeed_lpc_enable_snoop()
devm_kasprintf() returns NULL when memory allocation fails. Currently,
aspeed_lpc_enable_snoop() does not check for this case, which results in a
NULL pointer dereference.
Add NULL check after devm_kasprintf() to prevent this issue.
[arj: Fix Fixes: tag to use subject from 3772e5da4454]
In the Linux kernel, the following vulnerability has been resolved:
watchdog: lenovo_se30_wdt: Fix possible devm_ioremap() NULL pointer dereference in lenovo_se30_wdt_probe()
devm_ioremap() returns NULL on error. Currently, lenovo_se30_wdt_probe()
does not check for this case, which results in a NULL pointer
dereference.
Add NULL check after devm_ioremap() to prevent this issue.
In the Linux kernel, the following vulnerability has been resolved:
backlight: pm8941: Add NULL check in wled_configure()
devm_kasprintf() returns NULL when memory allocation fails. Currently,
wled_configure() does not check for this case, which results in a NULL
pointer dereference.
Add NULL check after devm_kasprintf() to prevent this issue.
In the Linux kernel, the following vulnerability has been resolved:
hwmon: (asus-ec-sensors) check sensor index in read_string()
Prevent a potential invalid memory access when the requested sensor
is not found.
find_ec_sensor_index() may return a negative value (e.g. -ENOENT),
but its result was used without checking, which could lead to
undefined behavior when passed to get_sensor_info().
Add a proper check to return -EINVAL if sensor_index is negative.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
[groeck: Return error code returned from find_ec_sensor_index]
In the Linux kernel, the following vulnerability has been resolved:
dm: limit swapping tables for devices with zone write plugs
dm_revalidate_zones() only allowed new or previously unzoned devices to
call blk_revalidate_disk_zones(). If the device was already zoned,
disk->nr_zones would always equal md->nr_zones, so dm_revalidate_zones()
returned without doing any work. This would make the zoned settings for
the device not match the new table. If the device had zone write plug
resources, it could run into errors like bdev_zone_is_seq() reading
invalid memory because disk->conv_zones_bitmap was the wrong size.
If the device doesn't have any zone write plug resources, calling
blk_revalidate_disk_zones() will always correctly update device. If
blk_revalidate_disk_zones() fails, it can still overwrite or clear the
current disk->nr_zones value. In this case, DM must restore the previous
value of disk->nr_zones, so that the zoned settings will continue to
match the previous value that it fell back to.
If the device already has zone write plug resources,
blk_revalidate_disk_zones() will not correctly update them, if it is
called for arbitrary zoned device changes. Since there is not much need
for this ability, the easiest solution is to disallow any table reloads
that change the zoned settings, for devices that already have zone plug
resources. Specifically, if a device already has zone plug resources
allocated, it can only switch to another zoned table that also emulates
zone append. Also, it cannot change the device size or the zone size. A
device can switch to an error target.
In the Linux kernel, the following vulnerability has been resolved:
dmaengine: ti: Add NULL check in udma_probe()
devm_kasprintf() returns NULL when memory allocation fails. Currently,
udma_probe() does not check for this case, which results in a NULL
pointer dereference.
Add NULL check after devm_kasprintf() to prevent this issue.
In the Linux kernel, the following vulnerability has been resolved:
serial: Fix potential null-ptr-deref in mlb_usio_probe()
devm_ioremap() can return NULL on error. Currently, mlb_usio_probe()
does not check for this case, which could result in a NULL pointer
dereference.
Add NULL check after devm_ioremap() to prevent this issue.
In the Linux kernel, the following vulnerability has been resolved:
usb: acpi: Prevent null pointer dereference in usb_acpi_add_usb4_devlink()
As demonstrated by the fix for update_port_device_state,
commit 12783c0b9e2c ("usb: core: Prevent null pointer dereference in update_port_device_state"),
usb_hub_to_struct_hub() can return NULL in certain scenarios,
such as during hub driver unbind or teardown race conditions,
even if the underlying usb_device structure exists.
Plus, all other places that call usb_hub_to_struct_hub() in the same file
do check for NULL return values.
If usb_hub_to_struct_hub() returns NULL, the subsequent access to
hub->ports[udev->portnum - 1] will cause a null pointer dereference.
In the Linux kernel, the following vulnerability has been resolved:
coresight: holding cscfg_csdev_lock while removing cscfg from csdev
There'll be possible race scenario for coresight config:
CPU0 CPU1
(perf enable) load module
cscfg_load_config_sets()
activate config. // sysfs
(sys_active_cnt == 1)
...
cscfg_csdev_enable_active_config()
lock(csdev->cscfg_csdev_lock)
deactivate config // sysfs
(sys_activec_cnt == 0)
cscfg_unload_config_sets()
<iterating config_csdev_list> cscfg_remove_owned_csdev_configs()
// here load config activate by CPU1
unlock(csdev->cscfg_csdev_lock)
iterating config_csdev_list could be raced with config_csdev_list's
entry delete.
To resolve this race , hold csdev->cscfg_csdev_lock() while
cscfg_remove_owned_csdev_configs()
In the Linux kernel, the following vulnerability has been resolved:
drm/connector: only call HDMI audio helper plugged cb if non-null
On driver remove, sound/soc/codecs/hdmi-codec.c calls the plugged_cb
with NULL as the callback function and codec_dev, as seen in its
hdmi_remove function.
The HDMI audio helper then happily tries calling said null function
pointer, and produces an Oops as a result.
Fix this by only executing the callback if fn is non-null. This means
the .plugged_cb and .plugged_cb_dev members still get appropriately
cleared.
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: MGMT: reject malformed HCI_CMD_SYNC commands
In 'mgmt_hci_cmd_sync()', check whether the size of parameters passed
in 'struct mgmt_cp_hci_cmd_sync' matches the total size of the data
(i.e. 'sizeof(struct mgmt_cp_hci_cmd_sync)' plus trailing bytes).
Otherwise, large invalid 'params_len' will cause 'hci_cmd_sync_alloc()'
to do 'skb_put_data()' from an area beyond the one actually passed to
'mgmt_hci_cmd_sync()'.
In the Linux kernel, the following vulnerability has been resolved:
ice: fix Tx scheduler error handling in XDP callback
When the XDP program is loaded, the XDP callback adds new Tx queues.
This means that the callback must update the Tx scheduler with the new
queue number. In the event of a Tx scheduler failure, the XDP callback
should also fail and roll back any changes previously made for XDP
preparation.
The previous implementation had a bug that not all changes made by the
XDP callback were rolled back. This caused the crash with the following
call trace:
[ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5
[ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [#1] SMP NOPTI
[ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ #14 PREEMPT(voluntary)
[ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022
[ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice]
[...]
[ +0.002715] Call Trace:
[ +0.002452] <IRQ>
[ +0.002021] ? __die_body.cold+0x19/0x29
[ +0.003922] ? die_addr+0x3c/0x60
[ +0.003319] ? exc_general_protection+0x17c/0x400
[ +0.004707] ? asm_exc_general_protection+0x26/0x30
[ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice]
[ +0.004835] ice_napi_poll+0x665/0x680 [ice]
[ +0.004320] __napi_poll+0x28/0x190
[ +0.003500] net_rx_action+0x198/0x360
[ +0.003752] ? update_rq_clock+0x39/0x220
[ +0.004013] handle_softirqs+0xf1/0x340
[ +0.003840] ? sched_clock_cpu+0xf/0x1f0
[ +0.003925] __irq_exit_rcu+0xc2/0xe0
[ +0.003665] common_interrupt+0x85/0xa0
[ +0.003839] </IRQ>
[ +0.002098] <TASK>
[ +0.002106] asm_common_interrupt+0x26/0x40
[ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690
Fix this by performing the missing unmapping of XDP queues from
q_vectors and setting the XDP rings pointer back to NULL after all those
queues are released.
Also, add an immediate exit from the XDP callback in case of ring
preparation failure.
In the Linux kernel, the following vulnerability has been resolved:
net: stmmac: make sure that ptp_rate is not 0 before configuring timestamping
The stmmac platform drivers that do not open-code the clk_ptp_rate value
after having retrieved the default one from the device-tree can end up
with 0 in clk_ptp_rate (as clk_get_rate can return 0). It will
eventually propagate up to PTP initialization when bringing up the
interface, leading to a divide by 0:
Division by zero in kernel.
CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.30-00001-g48313bd5768a #22
Hardware name: STM32 (Device Tree Support)
Call trace:
unwind_backtrace from show_stack+0x18/0x1c
show_stack from dump_stack_lvl+0x6c/0x8c
dump_stack_lvl from Ldiv0_64+0x8/0x18
Ldiv0_64 from stmmac_init_tstamp_counter+0x190/0x1a4
stmmac_init_tstamp_counter from stmmac_hw_setup+0xc1c/0x111c
stmmac_hw_setup from __stmmac_open+0x18c/0x434
__stmmac_open from stmmac_open+0x3c/0xbc
stmmac_open from __dev_open+0xf4/0x1ac
__dev_open from __dev_change_flags+0x1cc/0x224
__dev_change_flags from dev_change_flags+0x24/0x60
dev_change_flags from ip_auto_config+0x2e8/0x11a0
ip_auto_config from do_one_initcall+0x84/0x33c
do_one_initcall from kernel_init_freeable+0x1b8/0x214
kernel_init_freeable from kernel_init+0x24/0x140
kernel_init from ret_from_fork+0x14/0x28
Exception stack(0xe0815fb0 to 0xe0815ff8)
Prevent this division by 0 by adding an explicit check and error log
about the actual issue. While at it, remove the same check from
stmmac_ptp_register, which then becomes duplicate
In the Linux kernel, the following vulnerability has been resolved:
net: stmmac: make sure that ptp_rate is not 0 before configuring EST
If the ptp_rate recorded earlier in the driver happens to be 0, this
bogus value will propagate up to EST configuration, where it will
trigger a division by 0.
Prevent this division by 0 by adding the corresponding check and error
code.
In the Linux kernel, the following vulnerability has been resolved:
net: fix udp gso skb_segment after pull from frag_list
Commit a1e40ac5b5e9 ("net: gso: fix udp gso fraglist segmentation after
pull from frag_list") detected invalid geometry in frag_list skbs and
redirects them from skb_segment_list to more robust skb_segment. But some
packets with modified geometry can also hit bugs in that code. We don't
know how many such cases exist. Addressing each one by one also requires
touching the complex skb_segment code, which risks introducing bugs for
other types of skbs. Instead, linearize all these packets that fail the
basic invariants on gso fraglist skbs. That is more robust.
If only part of the fraglist payload is pulled into head_skb, it will
always cause exception when splitting skbs by skb_segment. For detailed
call stack information, see below.
Valid SKB_GSO_FRAGLIST skbs
- consist of two or more segments
- the head_skb holds the protocol headers plus first gso_size
- one or more frag_list skbs hold exactly one segment
- all but the last must be gso_size
Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
modify fraglist skbs, breaking these invariants.
In extreme cases they pull one part of data into skb linear. For UDP,
this causes three payloads with lengths of (11,11,10) bytes were
pulled tail to become (12,10,10) bytes.
The skbs no longer meets the above SKB_GSO_FRAGLIST conditions because
payload was pulled into head_skb, it needs to be linearized before pass
to regular skb_segment.
skb_segment+0xcd0/0xd14
__udp_gso_segment+0x334/0x5f4
udp4_ufo_fragment+0x118/0x15c
inet_gso_segment+0x164/0x338
skb_mac_gso_segment+0xc4/0x13c
__skb_gso_segment+0xc4/0x124
validate_xmit_skb+0x9c/0x2c0
validate_xmit_skb_list+0x4c/0x80
sch_direct_xmit+0x70/0x404
__dev_queue_xmit+0x64c/0xe5c
neigh_resolve_output+0x178/0x1c4
ip_finish_output2+0x37c/0x47c
__ip_finish_output+0x194/0x240
ip_finish_output+0x20/0xf4
ip_output+0x100/0x1a0
NF_HOOK+0xc4/0x16c
ip_forward+0x314/0x32c
ip_rcv+0x90/0x118
__netif_receive_skb+0x74/0x124
process_backlog+0xe8/0x1a4
__napi_poll+0x5c/0x1f8
net_rx_action+0x154/0x314
handle_softirqs+0x154/0x4b8
[118.376811] [C201134] rxq0_pus: [name:bug&]kernel BUG at net/core/skbuff.c:4278!
[118.376829] [C201134] rxq0_pus: [name:traps&]Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
[118.470774] [C201134] rxq0_pus: [name:mrdump&]Kernel Offset: 0x178cc00000 from 0xffffffc008000000
[118.470810] [C201134] rxq0_pus: [name:mrdump&]PHYS_OFFSET: 0x40000000
[118.470827] [C201134] rxq0_pus: [name:mrdump&]pstate: 60400005 (nZCv daif +PAN -UAO)
[118.470848] [C201134] rxq0_pus: [name:mrdump&]pc : [0xffffffd79598aefc] skb_segment+0xcd0/0xd14
[118.470900] [C201134] rxq0_pus: [name:mrdump&]lr : [0xffffffd79598a5e8] skb_segment+0x3bc/0xd14
[118.470928] [C201134] rxq0_pus: [name:mrdump&]sp : ffffffc008013770
In the Linux kernel, the following vulnerability has been resolved:
net: wwan: t7xx: Fix napi rx poll issue
When driver handles the napi rx polling requests, the netdev might
have been released by the dellink logic triggered by the disconnect
operation on user plane. However, in the logic of processing skb in
polling, an invalid netdev is still being used, which causes a panic.
BUG: kernel NULL pointer dereference, address: 00000000000000f1
Oops: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:dev_gro_receive+0x3a/0x620
[...]
Call Trace:
<IRQ>
? __die_body+0x68/0xb0
? page_fault_oops+0x379/0x3e0
? exc_page_fault+0x4f/0xa0
? asm_exc_page_fault+0x22/0x30
? __pfx_t7xx_ccmni_recv_skb+0x10/0x10 [mtk_t7xx (HASH:1400 7)]
? dev_gro_receive+0x3a/0x620
napi_gro_receive+0xad/0x170
t7xx_ccmni_recv_skb+0x48/0x70 [mtk_t7xx (HASH:1400 7)]
t7xx_dpmaif_napi_rx_poll+0x590/0x800 [mtk_t7xx (HASH:1400 7)]
net_rx_action+0x103/0x470
irq_exit_rcu+0x13a/0x310
sysvec_apic_timer_interrupt+0x56/0x90
</IRQ>
In the Linux kernel, the following vulnerability has been resolved:
gve: add missing NULL check for gve_alloc_pending_packet() in TX DQO
gve_alloc_pending_packet() can return NULL, but gve_tx_add_skb_dqo()
did not check for this case before dereferencing the returned pointer.
Add a missing NULL check to prevent a potential NULL pointer
dereference when allocation fails.
This improves robustness in low-memory scenarios.
In the Linux kernel, the following vulnerability has been resolved:
wifi: iwlwifi: mld: avoid panic on init failure
In case of an error during init, in_hw_restart will be set, but it will
never get cleared.
Instead, we will retry to init again, and then we will act like we are in a
restart when we are actually not.
This causes (among others) to a NULL pointer dereference when canceling
rx_omi::finished_work, that was not even initialized, because we thought
that we are in hw_restart.
Set in_hw_restart to true only if the fw is running, then we know that
FW was loaded successfully and we are not going to the retry loop.
In the Linux kernel, the following vulnerability has been resolved:
netfilter: nf_set_pipapo_avx2: fix initial map fill
If the first field doesn't cover the entire start map, then we must zero
out the remainder, else we leak those bits into the next match round map.
The early fix was incomplete and did only fix up the generic C
implementation.
A followup patch adds a test case to nft_concat_range.sh.
In the Linux kernel, the following vulnerability has been resolved:
scsi: core: ufs: Fix a hang in the error handler
ufshcd_err_handling_prepare() calls ufshcd_rpm_get_sync(). The latter
function can only succeed if UFSHCD_EH_IN_PROGRESS is not set because
resuming involves submitting a SCSI command and ufshcd_queuecommand()
returns SCSI_MLQUEUE_HOST_BUSY if UFSHCD_EH_IN_PROGRESS is set. Fix this
hang by setting UFSHCD_EH_IN_PROGRESS after ufshcd_rpm_get_sync() has
been called instead of before.
Backtrace:
__switch_to+0x174/0x338
__schedule+0x600/0x9e4
schedule+0x7c/0xe8
schedule_timeout+0xa4/0x1c8
io_schedule_timeout+0x48/0x70
wait_for_common_io+0xa8/0x160 //waiting on START_STOP
wait_for_completion_io_timeout+0x10/0x20
blk_execute_rq+0xe4/0x1e4
scsi_execute_cmd+0x108/0x244
ufshcd_set_dev_pwr_mode+0xe8/0x250
__ufshcd_wl_resume+0x94/0x354
ufshcd_wl_runtime_resume+0x3c/0x174
scsi_runtime_resume+0x64/0xa4
rpm_resume+0x15c/0xa1c
__pm_runtime_resume+0x4c/0x90 // Runtime resume ongoing
ufshcd_err_handler+0x1a0/0xd08
process_one_work+0x174/0x808
worker_thread+0x15c/0x490
kthread+0xf4/0x1ec
ret_from_fork+0x10/0x20
[ bvanassche: rewrote patch description ]
In the Linux kernel, the following vulnerability has been resolved:
net_sched: sch_sfq: fix a potential crash on gso_skb handling
SFQ has an assumption of always being able to queue at least one packet.
However, after the blamed commit, sch->q.len can be inflated by packets
in sch->gso_skb, and an enqueue() on an empty SFQ qdisc can be followed
by an immediate drop.
Fix sfq_drop() to properly clear q->tail in this situation.
ip netns add lb
ip link add dev to-lb type veth peer name in-lb netns lb
ethtool -K to-lb tso off # force qdisc to requeue gso_skb
ip netns exec lb ethtool -K in-lb gro on # enable NAPI
ip link set dev to-lb up
ip -netns lb link set dev in-lb up
ip addr add dev to-lb 192.168.20.1/24
ip -netns lb addr add dev in-lb 192.168.20.2/24
tc qdisc replace dev to-lb root sfq limit 100
ip netns exec lb netserver
netperf -H 192.168.20.2 -l 100 &
netperf -H 192.168.20.2 -l 100 &
netperf -H 192.168.20.2 -l 100 &
netperf -H 192.168.20.2 -l 100 &
In the Linux kernel, the following vulnerability has been resolved:
e1000: Move cancel_work_sync to avoid deadlock
Previously, e1000_down called cancel_work_sync for the e1000 reset task
(via e1000_down_and_stop), which takes RTNL.
As reported by users and syzbot, a deadlock is possible in the following
scenario:
CPU 0:
- RTNL is held
- e1000_close
- e1000_down
- cancel_work_sync (cancel / wait for e1000_reset_task())
CPU 1:
- process_one_work
- e1000_reset_task
- take RTNL
To remedy this, avoid calling cancel_work_sync from e1000_down
(e1000_reset_task does nothing if the device is down anyway). Instead,
call cancel_work_sync for e1000_reset_task when the device is being
removed.
In the Linux kernel, the following vulnerability has been resolved:
ACPI: CPPC: Fix NULL pointer dereference when nosmp is used
With nosmp in cmdline, other CPUs are not brought up, leaving
their cpc_desc_ptr NULL. CPU0's iteration via for_each_possible_cpu()
dereferences these NULL pointers, causing panic.
Panic backtrace:
[ 0.401123] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b8
...
[ 0.403255] [<ffffffff809a5818>] cppc_allow_fast_switch+0x6a/0xd4
...
Kernel panic - not syncing: Attempted to kill init!
[ rjw: New subject ]
In the Linux kernel, the following vulnerability has been resolved:
net: Fix TOCTOU issue in sk_is_readable()
sk->sk_prot->sock_is_readable is a valid function pointer when sk resides
in a sockmap. After the last sk_psock_put() (which usually happens when
socket is removed from sockmap), sk->sk_prot gets restored and
sk->sk_prot->sock_is_readable becomes NULL.
This makes sk_is_readable() racy, if the value of sk->sk_prot is reloaded
after the initial check. Which in turn may lead to a null pointer
dereference.
Ensure the function pointer does not turn NULL after the check.
In the Linux kernel, the following vulnerability has been resolved:
ALSA: usb-audio: Kill timer properly at removal
The USB-audio MIDI code initializes the timer, but in a rare case, the
driver might be freed without the disconnect call. This leaves the
timer in an active state while the assigned object is released via
snd_usbmidi_free(), which ends up with a kernel warning when the debug
configuration is enabled, as spotted by fuzzer.
For avoiding the problem, put timer_shutdown_sync() at
snd_usbmidi_free(), so that the timer can be killed properly.
While we're at it, replace the existing timer_delete_sync() at the
disconnect callback with timer_shutdown_sync(), too.
In the Linux kernel, the following vulnerability has been resolved:
x86/iopl: Cure TIF_IO_BITMAP inconsistencies
io_bitmap_exit() is invoked from exit_thread() when a task exists or
when a fork fails. In the latter case the exit_thread() cleans up
resources which were allocated during fork().
io_bitmap_exit() invokes task_update_io_bitmap(), which in turn ends up
in tss_update_io_bitmap(). tss_update_io_bitmap() operates on the
current task. If current has TIF_IO_BITMAP set, but no bitmap installed,
tss_update_io_bitmap() crashes with a NULL pointer dereference.
There are two issues, which lead to that problem:
1) io_bitmap_exit() should not invoke task_update_io_bitmap() when
the task, which is cleaned up, is not the current task. That's a
clear indicator for a cleanup after a failed fork().
2) A task should not have TIF_IO_BITMAP set and neither a bitmap
installed nor IOPL emulation level 3 activated.
This happens when a kernel thread is created in the context of
a user space thread, which has TIF_IO_BITMAP set as the thread
flags are copied and the IO bitmap pointer is cleared.
Other than in the failed fork() case this has no impact because
kernel threads including IO workers never return to user space and
therefore never invoke tss_update_io_bitmap().
Cure this by adding the missing cleanups and checks:
1) Prevent io_bitmap_exit() to invoke task_update_io_bitmap() if
the to be cleaned up task is not the current task.
2) Clear TIF_IO_BITMAP in copy_thread() unconditionally. For user
space forks it is set later, when the IO bitmap is inherited in
io_bitmap_share().
For paranoia sake, add a warning into tss_update_io_bitmap() to catch
the case, when that code is invoked with inconsistent state.
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: Disable SCO support if READ_VOICE_SETTING is unsupported/broken
A SCO connection without the proper voice_setting can cause
the controller to lock up.
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: Don't treat wb connector as physical in create_validate_stream_for_sink
Don't try to operate on a drm_wb_connector as an amdgpu_dm_connector.
While dereferencing aconnector->base will "work" it's wrong and
might lead to unknown bad things. Just... don't.
In the Linux kernel, the following vulnerability has been resolved:
espintcp: remove encap socket caching to avoid reference leak
The current scheme for caching the encap socket can lead to reference
leaks when we try to delete the netns.
The reference chain is: xfrm_state -> enacp_sk -> netns
Since the encap socket is a userspace socket, it holds a reference on
the netns. If we delete the espintcp state (through flush or
individual delete) before removing the netns, the reference on the
socket is dropped and the netns is correctly deleted. Otherwise, the
netns may not be reachable anymore (if all processes within the ns
have terminated), so we cannot delete the xfrm state to drop its
reference on the socket.
This patch results in a small (~2% in my tests) performance
regression.
A GC-type mechanism could be added for the socket cache, to clear
references if the state hasn't been used "recently", but it's a lot
more complex than just not caching the socket.
In the Linux kernel, the following vulnerability has been resolved:
wifi: iwlwifi: don't warn when if there is a FW error
iwl_trans_reclaim is warning if it is called when the FW is not alive.
But if it is called when there is a pending restart, i.e. after a FW
error, there is no need to warn, instead - return silently.
In the Linux kernel, the following vulnerability has been resolved:
dma-buf: insert memory barrier before updating num_fences
smp_store_mb() inserts memory barrier after storing operation.
It is different with what the comment is originally aiming so Null
pointer dereference can be happened if memory update is reordered.
In the Linux kernel, the following vulnerability has been resolved:
net: cadence: macb: Fix a possible deadlock in macb_halt_tx.
There is a situation where after THALT is set high, TGO stays high as
well. Because jiffies are never updated, as we are in a context with
interrupts disabled, we never exit that loop and have a deadlock.
That deadlock was noticed on a sama5d4 device that stayed locked for days.
Use retries instead of jiffies so that the timeout really works and we do
not have a deadlock anymore.