In the Linux kernel, the following vulnerability has been resolved:
RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
Commit ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters
update") added three new counters and placed them after
BNXT_RE_OUT_OF_SEQ_ERR.
BNXT_RE_OUT_OF_SEQ_ERR acts as a boundary marker for allocating hardware
statistics with different num_counters values on chip_gen_p5_p7 devices.
As a result, BNXT_RE_NUM_STD_COUNTERS are used when allocating
hw_stats, which leads to an out-of-bounds write in
bnxt_re_copy_err_stats().
The counters BNXT_RE_REQ_CQE_ERROR, BNXT_RE_RESP_CQE_ERROR, and
BNXT_RE_RESP_REMOTE_ACCESS_ERRS are applicable to generic hardware, not
only p5/p7 devices.
Fix this by moving these counters before BNXT_RE_OUT_OF_SEQ_ERR so they
are included in the generic counter set.
In the Linux kernel, the following vulnerability has been resolved:
nfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg()
nfsd4_add_rdaccess_to_wrdeleg() unconditionally overwrites
fp->fi_fds[O_RDONLY] with a newly acquired nfsd_file. However, if
the client already has a SHARE_ACCESS_READ open from a previous OPEN
operation, this action overwrites the existing pointer without
releasing its reference, orphaning the previous reference.
Additionally, the function originally stored the same nfsd_file
pointer in both fp->fi_fds[O_RDONLY] and fp->fi_rdeleg_file with
only a single reference. When put_deleg_file() runs, it clears
fi_rdeleg_file and calls nfs4_file_put_access() to release the file.
However, nfs4_file_put_access() only releases fi_fds[O_RDONLY] when
the fi_access[O_RDONLY] counter drops to zero. If another READ open
exists on the file, the counter remains elevated and the nfsd_file
reference from the delegation is never released. This potentially
causes open conflicts on that file.
Then, on server shutdown, these leaks cause __nfsd_file_cache_purge()
to encounter files with an elevated reference count that cannot be
cleaned up, ultimately triggering a BUG() in kmem_cache_destroy()
because there are still nfsd_file objects allocated in that cache.
In the Linux kernel, the following vulnerability has been resolved:
iommu: disable SVA when CONFIG_X86 is set
Patch series "Fix stale IOTLB entries for kernel address space", v7.
This proposes a fix for a security vulnerability related to IOMMU Shared
Virtual Addressing (SVA). In an SVA context, an IOMMU can cache kernel
page table entries. When a kernel page table page is freed and
reallocated for another purpose, the IOMMU might still hold stale,
incorrect entries. This can be exploited to cause a use-after-free or
write-after-free condition, potentially leading to privilege escalation or
data corruption.
This solution introduces a deferred freeing mechanism for kernel page
table pages, which provides a safe window to notify the IOMMU to
invalidate its caches before the page is reused.
This patch (of 8):
In the IOMMU Shared Virtual Addressing (SVA) context, the IOMMU hardware
shares and walks the CPU's page tables. The x86 architecture maps the
kernel's virtual address space into the upper portion of every process's
page table. Consequently, in an SVA context, the IOMMU hardware can walk
and cache kernel page table entries.
The Linux kernel currently lacks a notification mechanism for kernel page
table changes, specifically when page table pages are freed and reused.
The IOMMU driver is only notified of changes to user virtual address
mappings. This can cause the IOMMU's internal caches to retain stale
entries for kernel VA.
Use-After-Free (UAF) and Write-After-Free (WAF) conditions arise when
kernel page table pages are freed and later reallocated. The IOMMU could
misinterpret the new data as valid page table entries. The IOMMU might
then walk into attacker-controlled memory, leading to arbitrary physical
memory DMA access or privilege escalation. This is also a
Write-After-Free issue, as the IOMMU will potentially continue to write
Accessed and Dirty bits to the freed memory while attempting to walk the
stale page tables.
Currently, SVA contexts are unprivileged and cannot access kernel
mappings. However, the IOMMU will still walk kernel-only page tables all
the way down to the leaf entries, where it realizes the mapping is for the
kernel and errors out. This means the IOMMU still caches these
intermediate page table entries, making the described vulnerability a real
concern.
Disable SVA on x86 architecture until the IOMMU can receive notification
to flush the paging cache before freeing the CPU kernel page table pages.
In the Linux kernel, the following vulnerability has been resolved:
mptcp: fallback earlier on simult connection
Syzkaller reports a simult-connect race leading to inconsistent fallback
status:
WARNING: CPU: 3 PID: 33 at net/mptcp/subflow.c:1515 subflow_data_ready+0x40b/0x7c0 net/mptcp/subflow.c:1515
Modules linked in:
CPU: 3 UID: 0 PID: 33 Comm: ksoftirqd/3 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:subflow_data_ready+0x40b/0x7c0 net/mptcp/subflow.c:1515
Code: 89 ee e8 78 61 3c f6 40 84 ed 75 21 e8 8e 66 3c f6 44 89 fe bf 07 00 00 00 e8 c1 61 3c f6 41 83 ff 07 74 09 e8 76 66 3c f6 90 <0f> 0b 90 e8 6d 66 3c f6 48 89 df e8 e5 ad ff ff 31 ff 89 c5 89 c6
RSP: 0018:ffffc900006cf338 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888031acd100 RCX: ffffffff8b7f2abf
RDX: ffff88801e6ea440 RSI: ffffffff8b7f2aca RDI: 0000000000000005
RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000007
R10: 0000000000000004 R11: 0000000000002c10 R12: ffff88802ba69900
R13: 1ffff920000d9e67 R14: ffff888046f81800 R15: 0000000000000004
FS: 0000000000000000(0000) GS:ffff8880d69bc000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000560fc0ca1670 CR3: 0000000032c3a000 CR4: 0000000000352ef0
Call Trace:
<TASK>
tcp_data_queue+0x13b0/0x4f90 net/ipv4/tcp_input.c:5197
tcp_rcv_state_process+0xfdf/0x4ec0 net/ipv4/tcp_input.c:6922
tcp_v6_do_rcv+0x492/0x1740 net/ipv6/tcp_ipv6.c:1672
tcp_v6_rcv+0x2976/0x41e0 net/ipv6/tcp_ipv6.c:1918
ip6_protocol_deliver_rcu+0x188/0x1520 net/ipv6/ip6_input.c:438
ip6_input_finish+0x1e4/0x4b0 net/ipv6/ip6_input.c:489
NF_HOOK include/linux/netfilter.h:318 [inline]
NF_HOOK include/linux/netfilter.h:312 [inline]
ip6_input+0x105/0x2f0 net/ipv6/ip6_input.c:500
dst_input include/net/dst.h:471 [inline]
ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline]
NF_HOOK include/linux/netfilter.h:318 [inline]
NF_HOOK include/linux/netfilter.h:312 [inline]
ipv6_rcv+0x264/0x650 net/ipv6/ip6_input.c:311
__netif_receive_skb_one_core+0x12d/0x1e0 net/core/dev.c:5979
__netif_receive_skb+0x1d/0x160 net/core/dev.c:6092
process_backlog+0x442/0x15e0 net/core/dev.c:6444
__napi_poll.constprop.0+0xba/0x550 net/core/dev.c:7494
napi_poll net/core/dev.c:7557 [inline]
net_rx_action+0xa9f/0xfe0 net/core/dev.c:7684
handle_softirqs+0x216/0x8e0 kernel/softirq.c:579
run_ksoftirqd kernel/softirq.c:968 [inline]
run_ksoftirqd+0x3a/0x60 kernel/softirq.c:960
smpboot_thread_fn+0x3f7/0xae0 kernel/smpboot.c:160
kthread+0x3c2/0x780 kernel/kthread.c:463
ret_from_fork+0x5d7/0x6f0 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
The TCP subflow can process the simult-connect syn-ack packet after
transitioning to TCP_FIN1 state, bypassing the MPTCP fallback check,
as the sk_state_change() callback is not invoked for * -> FIN_WAIT1
transitions.
That will move the msk socket to an inconsistent status and the next
incoming data will hit the reported splat.
Close the race moving the simult-fallback check at the earliest possible
stage - that is at syn-ack generation time.
About the fixes tags: [2] was supposed to also fix this issue introduced
by [3]. [1] is required as a dependence: it was not explicitly marked as
a fix, but it is one and it has already been backported before [3]. In
other words, this commit should be backported up to [3], including [2]
and [1] if that's not already there.
In the Linux kernel, the following vulnerability has been resolved:
iavf: fix off-by-one issues in iavf_config_rss_reg()
There are off-by-one bugs when configuring RSS hash key and lookup
table, causing out-of-bounds reads to memory [1] and out-of-bounds
writes to device registers.
Before commit 43a3d9ba34c9 ("i40evf: Allow PF driver to configure RSS"),
the loop upper bounds were:
i <= I40E_VFQF_{HKEY,HLUT}_MAX_INDEX
which is safe since the value is the last valid index.
That commit changed the bounds to:
i <= adapter->rss_{key,lut}_size / 4
where `rss_{key,lut}_size / 4` is the number of dwords, so the last
valid index is `(rss_{key,lut}_size / 4) - 1`. Therefore, using `<=`
accesses one element past the end.
Fix the issues by using `<` instead of `<=`, ensuring we do not exceed
the bounds.
[1] KASAN splat about rss_key_size off-by-one
BUG: KASAN: slab-out-of-bounds in iavf_config_rss+0x619/0x800
Read of size 4 at addr ffff888102c50134 by task kworker/u8:6/63
CPU: 0 UID: 0 PID: 63 Comm: kworker/u8:6 Not tainted 6.18.0-rc2-enjuk-tnguy-00378-g3005f5b77652-dirty #156 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Workqueue: iavf iavf_watchdog_task
Call Trace:
<TASK>
dump_stack_lvl+0x6f/0xb0
print_report+0x170/0x4f3
kasan_report+0xe1/0x1a0
iavf_config_rss+0x619/0x800
iavf_watchdog_task+0x2be7/0x3230
process_one_work+0x7fd/0x1420
worker_thread+0x4d1/0xd40
kthread+0x344/0x660
ret_from_fork+0x249/0x320
ret_from_fork_asm+0x1a/0x30
</TASK>
Allocated by task 63:
kasan_save_stack+0x30/0x50
kasan_save_track+0x14/0x30
__kasan_kmalloc+0x7f/0x90
__kmalloc_noprof+0x246/0x6f0
iavf_watchdog_task+0x28fc/0x3230
process_one_work+0x7fd/0x1420
worker_thread+0x4d1/0xd40
kthread+0x344/0x660
ret_from_fork+0x249/0x320
ret_from_fork_asm+0x1a/0x30
The buggy address belongs to the object at ffff888102c50100
which belongs to the cache kmalloc-64 of size 64
The buggy address is located 0 bytes to the right of
allocated 52-byte region [ffff888102c50100, ffff888102c50134)
The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x102c50
flags: 0x200000000000000(node=0|zone=2)
page_type: f5(slab)
raw: 0200000000000000 ffff8881000418c0 dead000000000122 0000000000000000
raw: 0000000000000000 0000000080200020 00000000f5000000 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888102c50000: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
ffff888102c50080: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
>ffff888102c50100: 00 00 00 00 00 00 04 fc fc fc fc fc fc fc fc fc
^
ffff888102c50180: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
ffff888102c50200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
In the Linux kernel, the following vulnerability has been resolved:
net: rose: fix invalid array index in rose_kill_by_device()
rose_kill_by_device() collects sockets into a local array[] and then
iterates over them to disconnect sockets bound to a device being brought
down.
The loop mistakenly indexes array[cnt] instead of array[i]. For cnt <
ARRAY_SIZE(array), this reads an uninitialized entry; for cnt ==
ARRAY_SIZE(array), it is an out-of-bounds read. Either case can lead to
an invalid socket pointer dereference and also leaks references taken
via sock_hold().
Fix the index to use i.
In the Linux kernel, the following vulnerability has been resolved:
ipv6: BUG() in pskb_expand_head() as part of calipso_skbuff_setattr()
There exists a kernel oops caused by a BUG_ON(nhead < 0) at
net/core/skbuff.c:2232 in pskb_expand_head().
This bug is triggered as part of the calipso_skbuff_setattr()
routine when skb_cow() is passed headroom > INT_MAX
(i.e. (int)(skb_headroom(skb) + len_delta) < 0).
The root cause of the bug is due to an implicit integer cast in
__skb_cow(). The check (headroom > skb_headroom(skb)) is meant to ensure
that delta = headroom - skb_headroom(skb) is never negative, otherwise
we will trigger a BUG_ON in pskb_expand_head(). However, if
headroom > INT_MAX and delta <= -NET_SKB_PAD, the check passes, delta
becomes negative, and pskb_expand_head() is passed a negative value for
nhead.
Fix the trigger condition in calipso_skbuff_setattr(). Avoid passing
"negative" headroom sizes to skb_cow() within calipso_skbuff_setattr()
by only using skb_cow() to grow headroom.
PoC:
Using `netlabelctl` tool:
netlabelctl map del default
netlabelctl calipso add pass doi:7
netlabelctl map add default address:0::1/128 protocol:calipso,7
Then run the following PoC:
int fd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_UDP);
// setup msghdr
int cmsg_size = 2;
int cmsg_len = 0x60;
struct msghdr msg;
struct sockaddr_in6 dest_addr;
struct cmsghdr * cmsg = (struct cmsghdr *) calloc(1,
sizeof(struct cmsghdr) + cmsg_len);
msg.msg_name = &dest_addr;
msg.msg_namelen = sizeof(dest_addr);
msg.msg_iov = NULL;
msg.msg_iovlen = 0;
msg.msg_control = cmsg;
msg.msg_controllen = cmsg_len;
msg.msg_flags = 0;
// setup sockaddr
dest_addr.sin6_family = AF_INET6;
dest_addr.sin6_port = htons(31337);
dest_addr.sin6_flowinfo = htonl(31337);
dest_addr.sin6_addr = in6addr_loopback;
dest_addr.sin6_scope_id = 31337;
// setup cmsghdr
cmsg->cmsg_len = cmsg_len;
cmsg->cmsg_level = IPPROTO_IPV6;
cmsg->cmsg_type = IPV6_HOPOPTS;
char * hop_hdr = (char *)cmsg + sizeof(struct cmsghdr);
hop_hdr[1] = 0x9; //set hop size - (0x9 + 1) * 8 = 80
sendmsg(fd, &msg, 0);
In the Linux kernel, the following vulnerability has been resolved:
RDMA/cm: Fix leaking the multicast GID table reference
If the CM ID is destroyed while the CM event for multicast creating is
still queued the cancel_work_sync() will prevent the work from running
which also prevents destroying the ah_attr. This leaks a refcount and
triggers a WARN:
GID entry ref leak for dev syz1 index 2 ref=573
WARNING: CPU: 1 PID: 655 at drivers/infiniband/core/cache.c:809 release_gid_table drivers/infiniband/core/cache.c:806 [inline]
WARNING: CPU: 1 PID: 655 at drivers/infiniband/core/cache.c:809 gid_table_release_one+0x284/0x3cc drivers/infiniband/core/cache.c:886
Destroy the ah_attr after canceling the work, it is safe to call this
twice.
In the Linux kernel, the following vulnerability has been resolved:
drm/ttm: Avoid NULL pointer deref for evicted BOs
It is possible for a BO to exist that is not currently associated with a
resource, e.g. because it has been evicted.
When devcoredump tries to read the contents of all BOs for dumping, we need
to expect this as well -- in this case, ENODATA is recorded instead of the
buffer contents.
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: btusb: revert use of devm_kzalloc in btusb
This reverts commit 98921dbd00c4e ("Bluetooth: Use devm_kzalloc in
btusb.c file").
In btusb_probe(), we use devm_kzalloc() to allocate the btusb data. This
ties the lifetime of all the btusb data to the binding of a driver to
one interface, INTF. In a driver that binds to other interfaces, ISOC
and DIAG, this is an accident waiting to happen.
The issue is revealed in btusb_disconnect(), where calling
usb_driver_release_interface(&btusb_driver, data->intf) will have devm
free the data that is also being used by the other interfaces of the
driver that may not be released yet.
To fix this, revert the use of devm and go back to freeing memory
explicitly.
In the Linux kernel, the following vulnerability has been resolved:
ASoC: stm32: sai: fix OF node leak on probe
The reference taken to the sync provider OF node when probing the
platform device is currently only dropped if the set_sync() callback
fails during DAI probe.
Make sure to drop the reference on platform probe failures (e.g. probe
deferral) and on driver unbind.
This also avoids a potential use-after-free in case the DAI is ever
reprobed without first rebinding the platform driver.
In the Linux kernel, the following vulnerability has been resolved:
ipv6: fix a BUG in rt6_get_pcpu_route() under PREEMPT_RT
On PREEMPT_RT kernels, after rt6_get_pcpu_route() returns NULL, the
current task can be preempted. Another task running on the same CPU
may then execute rt6_make_pcpu_route() and successfully install a
pcpu_rt entry. When the first task resumes execution, its cmpxchg()
in rt6_make_pcpu_route() will fail because rt6i_pcpu is no longer
NULL, triggering the BUG_ON(prev). It's easy to reproduce it by adding
mdelay() after rt6_get_pcpu_route().
Using preempt_disable/enable is not appropriate here because
ip6_rt_pcpu_alloc() may sleep.
Fix this by handling the cmpxchg() failure gracefully on PREEMPT_RT:
free our allocation and return the existing pcpu_rt installed by
another task. The BUG_ON is replaced by WARN_ON_ONCE for non-PREEMPT_RT
kernels where such races should not occur.
In the Linux kernel, the following vulnerability has been resolved:
net: nfc: fix deadlock between nfc_unregister_device and rfkill_fop_write
A deadlock can occur between nfc_unregister_device() and rfkill_fop_write()
due to lock ordering inversion between device_lock and rfkill_global_mutex.
The problematic lock order is:
Thread A (rfkill_fop_write):
rfkill_fop_write()
mutex_lock(&rfkill_global_mutex)
rfkill_set_block()
nfc_rfkill_set_block()
nfc_dev_down()
device_lock(&dev->dev) <- waits for device_lock
Thread B (nfc_unregister_device):
nfc_unregister_device()
device_lock(&dev->dev)
rfkill_unregister()
mutex_lock(&rfkill_global_mutex) <- waits for rfkill_global_mutex
This creates a classic ABBA deadlock scenario.
Fix this by moving rfkill_unregister() and rfkill_destroy() outside the
device_lock critical section. Store the rfkill pointer in a local variable
before releasing the lock, then call rfkill_unregister() after releasing
device_lock.
This change is safe because rfkill_fop_write() holds rfkill_global_mutex
while calling the rfkill callbacks, and rfkill_unregister() also acquires
rfkill_global_mutex before cleanup. Therefore, rfkill_unregister() will
wait for any ongoing callback to complete before proceeding, and
device_del() is only called after rfkill_unregister() returns, preventing
any use-after-free.
The similar lock ordering in nfc_register_device() (device_lock ->
rfkill_global_mutex via rfkill_register) is safe because during
registration the device is not yet in rfkill_list, so no concurrent
rfkill operations can occur on this device.
In the Linux kernel, the following vulnerability has been resolved:
powerpc/64s/slb: Fix SLB multihit issue during SLB preload
On systems using the hash MMU, there is a software SLB preload cache that
mirrors the entries loaded into the hardware SLB buffer. This preload
cache is subject to periodic eviction — typically after every 256 context
switches — to remove old entry.
To optimize performance, the kernel skips switch_mmu_context() in
switch_mm_irqs_off() when the prev and next mm_struct are the same.
However, on hash MMU systems, this can lead to inconsistencies between
the hardware SLB and the software preload cache.
If an SLB entry for a process is evicted from the software cache on one
CPU, and the same process later runs on another CPU without executing
switch_mmu_context(), the hardware SLB may retain stale entries. If the
kernel then attempts to reload that entry, it can trigger an SLB
multi-hit error.
The following timeline shows how stale SLB entries are created and can
cause a multi-hit error when a process moves between CPUs without a
MMU context switch.
CPU 0 CPU 1
----- -----
Process P
exec swapper/1
load_elf_binary
begin_new_exc
activate_mm
switch_mm_irqs_off
switch_mmu_context
switch_slb
/*
* This invalidates all
* the entries in the HW
* and setup the new HW
* SLB entries as per the
* preload cache.
*/
context_switch
sched_migrate_task migrates process P to cpu-1
Process swapper/0 context switch (to process P)
(uses mm_struct of Process P) switch_mm_irqs_off()
switch_slb
load_slb++
/*
* load_slb becomes 0 here
* and we evict an entry from
* the preload cache with
* preload_age(). We still
* keep HW SLB and preload
* cache in sync, that is
* because all HW SLB entries
* anyways gets evicted in
* switch_slb during SLBIA.
* We then only add those
* entries back in HW SLB,
* which are currently
* present in preload_cache
* (after eviction).
*/
load_elf_binary continues...
setup_new_exec()
slb_setup_new_exec()
sched_switch event
sched_migrate_task migrates
process P to cpu-0
context_switch from swapper/0 to Process P
switch_mm_irqs_off()
/*
* Since both prev and next mm struct are same we don't call
* switch_mmu_context(). This will cause the HW SLB and SW preload
* cache to go out of sync in preload_new_slb_context. Because there
* was an SLB entry which was evicted from both HW and preload cache
* on cpu-1. Now later in preload_new_slb_context(), when we will try
* to add the same preload entry again, we will add this to the SW
* preload cache and then will add it to the HW SLB. Since on cpu-0
* this entry was never invalidated, hence adding this entry to the HW
* SLB will cause a SLB multi-hit error.
*/
load_elf_binary cont
---truncated---
In the Linux kernel, the following vulnerability has been resolved:
tpm: Cap the number of PCR banks
tpm2_get_pcr_allocation() does not cap any upper limit for the number of
banks. Cap the limit to eight banks so that out of bounds values coming
from external I/O cause on only limited harm.
In the Linux kernel, the following vulnerability has been resolved:
drm/xe/oa: Limit num_syncs to prevent oversized allocations
The OA open parameters did not validate num_syncs, allowing
userspace to pass arbitrarily large values, potentially
leading to excessive allocations.
Add check to ensure that num_syncs does not exceed DRM_XE_MAX_SYNCS,
returning -EINVAL when the limit is violated.
v2: use XE_IOCTL_DBG() and drop duplicated check. (Ashutosh)
(cherry picked from commit e057b2d2b8d815df3858a87dffafa2af37e5945b)
In the Linux kernel, the following vulnerability has been resolved:
scsi: aic94xx: fix use-after-free in device removal path
The asd_pci_remove() function fails to synchronize with pending tasklets
before freeing the asd_ha structure, leading to a potential
use-after-free vulnerability.
When a device removal is triggered (via hot-unplug or module unload),
race condition can occur.
The fix adds tasklet_kill() before freeing the asd_ha structure,
ensuring all scheduled tasklets complete before cleanup proceeds.
In the Linux kernel, the following vulnerability has been resolved:
functionfs: fix the open/removal races
ffs_epfile_open() can race with removal, ending up with file->private_data
pointing to freed object.
There is a total count of opened files on functionfs (both ep0 and
dynamic ones) and when it hits zero, dynamic files get removed.
Unfortunately, that removal can happen while another thread is
in ffs_epfile_open(), but has not incremented the count yet.
In that case open will succeed, leaving us with UAF on any subsequent
read() or write().
The root cause is that ffs->opened is misused; atomic_dec_and_test() vs.
atomic_add_return() is not a good idea, when object remains visible all
along.
To untangle that
* serialize openers on ffs->mutex (both for ep0 and for dynamic files)
* have dynamic ones use atomic_inc_not_zero() and fail if we had
zero ->opened; in that case the file we are opening is doomed.
* have the inodes of dynamic files marked on removal (from the
callback of simple_recursive_removal()) - clear ->i_private there.
* have open of dynamic ones verify they hadn't been already removed,
along with checking that state is FFS_ACTIVE.
In the Linux kernel, the following vulnerability has been resolved:
Input: lkkbd - disable pending work before freeing device
lkkbd_interrupt() schedules lk->tq via schedule_work(), and the work
handler lkkbd_reinit() dereferences the lkkbd structure and its
serio/input_dev fields.
lkkbd_disconnect() and error paths in lkkbd_connect() free the lkkbd
structure without preventing the reinit work from being queued again
until serio_close() returns. This can allow the work handler to run
after the structure has been freed, leading to a potential use-after-free.
Use disable_work_sync() instead of cancel_work_sync() to ensure the
reinit work cannot be re-queued, and call it both in lkkbd_disconnect()
and in lkkbd_connect() error paths after serio_open().
In the Linux kernel, the following vulnerability has been resolved:
shmem: fix recovery on rename failures
maple_tree insertions can fail if we are seriously short on memory;
simple_offset_rename() does not recover well if it runs into that.
The same goes for simple_offset_rename_exchange().
Moreover, shmem_whiteout() expects that if it succeeds, the caller will
progress to d_move(), i.e. that shmem_rename2() won't fail past the
successful call of shmem_whiteout().
Not hard to fix, fortunately - mtree_store() can't fail if the index we
are trying to store into is already present in the tree as a singleton.
For simple_offset_rename_exchange() that's enough - we just need to be
careful about the order of operations.
For simple_offset_rename() solution is to preinsert the target into the
tree for new_dir; the rest can be done without any potentially failing
operations.
That preinsertion has to be done in shmem_rename2() rather than in
simple_offset_rename() itself - otherwise we'd need to deal with the
possibility of failure after successful shmem_whiteout().
In the Linux kernel, the following vulnerability has been resolved:
iommu/mediatek: fix use-after-free on probe deferral
The driver is dropping the references taken to the larb devices during
probe after successful lookup as well as on errors. This can
potentially lead to a use-after-free in case a larb device has not yet
been bound to its driver so that the iommu driver probe defers.
Fix this by keeping the references as expected while the iommu driver is
bound.
In the Linux kernel, the following vulnerability has been resolved:
ublk: clean up user copy references on ublk server exit
If a ublk server process releases a ublk char device file, any requests
dispatched to the ublk server but not yet completed will retain a ref
value of UBLK_REFCOUNT_INIT. Before commit e63d2228ef83 ("ublk: simplify
aborting ublk request"), __ublk_fail_req() would decrement the reference
count before completing the failed request. However, that commit
optimized __ublk_fail_req() to call __ublk_complete_rq() directly
without decrementing the request reference count.
The leaked reference count incorrectly allows user copy and zero copy
operations on the completed ublk request. It also triggers the
WARN_ON_ONCE(refcount_read(&io->ref)) warnings in ublk_queue_reinit()
and ublk_deinit_queue().
Commit c5c5eb24ed61 ("ublk: avoid ublk_io_release() called after ublk
char dev is closed") already fixed the issue for ublk devices using
UBLK_F_SUPPORT_ZERO_COPY or UBLK_F_AUTO_BUF_REG. However, the reference
count leak also affects UBLK_F_USER_COPY, the other reference-counted
data copy mode. Fix the condition in ublk_check_and_reset_active_ref()
to include all reference-counted data copy modes. This ensures that any
ublk requests still owned by the ublk server when it exits have their
reference counts reset to 0.
In the Linux kernel, the following vulnerability has been resolved:
f2fs: invalidate dentry cache on failed whiteout creation
F2FS can mount filesystems with corrupted directory depth values that
get runtime-clamped to MAX_DIR_HASH_DEPTH. When RENAME_WHITEOUT
operations are performed on such directories, f2fs_rename performs
directory modifications (updating target entry and deleting source
entry) before attempting to add the whiteout entry via f2fs_add_link.
If f2fs_add_link fails due to the corrupted directory structure, the
function returns an error to VFS, but the partial directory
modifications have already been committed to disk. VFS assumes the
entire rename operation failed and does not update the dentry cache,
leaving stale mappings.
In the error path, VFS does not call d_move() to update the dentry
cache. This results in new_dentry still pointing to the old inode
(new_inode) which has already had its i_nlink decremented to zero.
The stale cache causes subsequent operations to incorrectly reference
the freed inode.
This causes subsequent operations to use cached dentry information that
no longer matches the on-disk state. When a second rename targets the
same entry, VFS attempts to decrement i_nlink on the stale inode, which
may already have i_nlink=0, triggering a WARNING in drop_nlink().
Example sequence:
1. First rename (RENAME_WHITEOUT): file2 → file1
- f2fs updates file1 entry on disk (points to inode 8)
- f2fs deletes file2 entry on disk
- f2fs_add_link(whiteout) fails (corrupted directory)
- Returns error to VFS
- VFS does not call d_move() due to error
- VFS cache still has: file1 → inode 7 (stale!)
- inode 7 has i_nlink=0 (already decremented)
2. Second rename: file3 → file1
- VFS uses stale cache: file1 → inode 7
- Tries to drop_nlink on inode 7 (i_nlink already 0)
- WARNING in drop_nlink()
Fix this by explicitly invalidating old_dentry and new_dentry when
f2fs_add_link fails during whiteout creation. This forces VFS to
refresh from disk on subsequent operations, ensuring cache consistency
even when the rename partially succeeds.
Reproducer:
1. Mount F2FS image with corrupted i_current_depth
2. renameat2(file2, file1, RENAME_WHITEOUT)
3. renameat2(file3, file1, 0)
4. System triggers WARNING in drop_nlink()
In the Linux kernel, the following vulnerability has been resolved:
svcrdma: bound check rq_pages index in inline path
svc_rdma_copy_inline_range indexed rqstp->rq_pages[rc_curpage] without
verifying rc_curpage stays within the allocated page array. Add guards
before the first use and after advancing to a new page.
In the Linux kernel, the following vulnerability has been resolved:
ntfs: set dummy blocksize to read boot_block when mounting
When mounting, sb->s_blocksize is used to read the boot_block without
being defined or validated. Set a dummy blocksize before attempting to
read the boot_block.
The issue can be triggered with the following syz reproducer:
mkdirat(0xffffffffffffff9c, &(0x7f0000000080)='./file1\x00', 0x0)
r4 = openat$nullb(0xffffffffffffff9c, &(0x7f0000000040), 0x121403, 0x0)
ioctl$FS_IOC_SETFLAGS(r4, 0x40081271, &(0x7f0000000980)=0x4000)
mount(&(0x7f0000000140)=@nullb, &(0x7f0000000040)='./cgroup\x00',
&(0x7f0000000000)='ntfs3\x00', 0x2208004, 0x0)
syz_clone(0x88200200, 0x0, 0x0, 0x0, 0x0, 0x0)
Here, the ioctl sets the bdev block size to 16384. During mount,
get_tree_bdev_flags() calls sb_set_blocksize(sb, block_size(bdev)),
but since block_size(bdev) > PAGE_SIZE, sb_set_blocksize() leaves
sb->s_blocksize at zero.
Later, ntfs_init_from_boot() attempts to read the boot_block while
sb->s_blocksize is still zero, which triggers the bug.
[almaz.alexandrovich@paragon-software.com: changed comment style, added
return value handling]
In the Linux kernel, the following vulnerability has been resolved:
net/sched: ets: Always remove class from active list before deleting in ets_qdisc_change
zdi-disclosures@trendmicro.com says:
The vulnerability is a race condition between `ets_qdisc_dequeue` and
`ets_qdisc_change`. It leads to UAF on `struct Qdisc` object.
Attacker requires the capability to create new user and network namespace
in order to trigger the bug.
See my additional commentary at the end of the analysis.
Analysis:
static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt,
struct netlink_ext_ack *extack)
{
...
// (1) this lock is preventing .change handler (`ets_qdisc_change`)
//to race with .dequeue handler (`ets_qdisc_dequeue`)
sch_tree_lock(sch);
for (i = nbands; i < oldbands; i++) {
if (i >= q->nstrict && q->classes[i].qdisc->q.qlen)
list_del_init(&q->classes[i].alist);
qdisc_purge_queue(q->classes[i].qdisc);
}
WRITE_ONCE(q->nbands, nbands);
for (i = nstrict; i < q->nstrict; i++) {
if (q->classes[i].qdisc->q.qlen) {
// (2) the class is added to the q->active
list_add_tail(&q->classes[i].alist, &q->active);
q->classes[i].deficit = quanta[i];
}
}
WRITE_ONCE(q->nstrict, nstrict);
memcpy(q->prio2band, priomap, sizeof(priomap));
for (i = 0; i < q->nbands; i++)
WRITE_ONCE(q->classes[i].quantum, quanta[i]);
for (i = oldbands; i < q->nbands; i++) {
q->classes[i].qdisc = queues[i];
if (q->classes[i].qdisc != &noop_qdisc)
qdisc_hash_add(q->classes[i].qdisc, true);
}
// (3) the qdisc is unlocked, now dequeue can be called in parallel
// to the rest of .change handler
sch_tree_unlock(sch);
ets_offload_change(sch);
for (i = q->nbands; i < oldbands; i++) {
// (4) we're reducing the refcount for our class's qdisc and
// freeing it
qdisc_put(q->classes[i].qdisc);
// (5) If we call .dequeue between (4) and (5), we will have
// a strong UAF and we can control RIP
q->classes[i].qdisc = NULL;
WRITE_ONCE(q->classes[i].quantum, 0);
q->classes[i].deficit = 0;
gnet_stats_basic_sync_init(&q->classes[i].bstats);
memset(&q->classes[i].qstats, 0, sizeof(q->classes[i].qstats));
}
return 0;
}
Comment:
This happens because some of the classes have their qdiscs assigned to
NULL, but remain in the active list. This commit fixes this issue by always
removing the class from the active list before deleting and freeing its
associated qdisc
Reproducer Steps
(trimmed version of what was sent by zdi-disclosures@trendmicro.com)
```
DEV="${DEV:-lo}"
ROOT_HANDLE="${ROOT_HANDLE:-1:}"
BAND2_HANDLE="${BAND2_HANDLE:-20:}" # child under 1:2
PING_BYTES="${PING_BYTES:-48}"
PING_COUNT="${PING_COUNT:-200000}"
PING_DST="${PING_DST:-127.0.0.1}"
SLOW_TBF_RATE="${SLOW_TBF_RATE:-8bit}"
SLOW_TBF_BURST="${SLOW_TBF_BURST:-100b}"
SLOW_TBF_LAT="${SLOW_TBF_LAT:-1s}"
cleanup() {
tc qdisc del dev "$DEV" root 2>/dev/null
}
trap cleanup EXIT
ip link set "$DEV" up
tc qdisc del dev "$DEV" root 2>/dev/null || true
tc qdisc add dev "$DEV" root handle "$ROOT_HANDLE" ets bands 2 strict 2
tc qdisc add dev "$DEV" parent 1:2 handle "$BAND2_HANDLE" \
tbf rate "$SLOW_TBF_RATE" burst "$SLOW_TBF_BURST" latency "$SLOW_TBF_LAT"
tc filter add dev "$DEV" parent 1: protocol all prio 1 u32 match u32 0 0 flowid 1:2
tc -s qdisc ls dev $DEV
ping -I "$DEV" -f -c "$PING_COUNT" -s "$PING_BYTES" -W 0.001 "$PING_DST" \
>/dev/null 2>&1 &
tc qdisc change dev "$DEV" root handle "$ROOT_HANDLE" ets bands 2 strict 0
tc qdisc change dev "$DEV" root handle "$ROOT_HANDLE" ets bands 2 strict 2
tc -s qdisc ls dev $DEV
tc qdisc del dev "$DEV" parent
---truncated---
In the Linux kernel, the following vulnerability has been resolved:
net: hns3: using the num_tqps in the vf driver to apply for resources
Currently, hdev->htqp is allocated using hdev->num_tqps, and kinfo->tqp
is allocated using kinfo->num_tqps. However, kinfo->num_tqps is set to
min(new_tqps, hdev->num_tqps); Therefore, kinfo->num_tqps may be smaller
than hdev->num_tqps, which causes some hdev->htqp[i] to remain
uninitialized in hclgevf_knic_setup().
Thus, this patch allocates hdev->htqp and kinfo->tqp using hdev->num_tqps,
ensuring that the lengths of hdev->htqp and kinfo->tqp are consistent
and that all elements are properly initialized.
Tenda AX-3 v16.03.12.10_CN was discovered to contain a stack overflow in the wanMTU2 parameter of the fromAdvSetMacMtuWan function. This vulnerability allows attackers to cause a Denial of Service (DoS) via a crafted request.
Tenda AX-3 v16.03.12.10_CN was discovered to contain a stack overflow in the wanSpeed2 parameter of the fromAdvSetMacMtuWan function. This vulnerability allows attackers to cause a Denial of Service (DoS) via a crafted request.
Tenda AX-3 v16.03.12.10_CN was discovered to contain a stack overflow in the cloneType2 parameter of the fromAdvSetMacMtuWan function. This vulnerability allows attackers to cause a Denial of Service (DoS) via a crafted request.
Tenda AX-3 v16.03.12.10_CN was discovered to contain a stack overflow in the serviceName2 parameter of the fromAdvSetMacMtuWan function. This vulnerability allows attackers to cause a Denial of Service (DoS) via a crafted request.
Tenda AX-3 v16.03.12.10_CN was discovered to contain a stack overflow in the mac2 parameter of the fromAdvSetMacMtuWan function. This vulnerability allows attackers to cause a Denial of Service (DoS) via a crafted request.
Tenda AX-1806 v1.0.0.1 was discovered to contain a stack overflow in the security_5g parameter of the sub_4CA50 function. This vulnerability allows attackers to cause a Denial of Service (DoS) via a crafted request.
phpgurukul News Portal Project V4.1 has File Upload Vulnerability via upload.php, which enables the upload of files of any format to the server without identity authentication.
phpgurukul News Portal Project V4.1 has an Arbitrary File Deletion Vulnerability in remove_file.php. The parameter file can cause any file to be deleted.
In the Linux kernel, the following vulnerability has been resolved:
ublk: fix deadlock when reading partition table
When one process(such as udev) opens ublk block device (e.g., to read
the partition table via bdev_open()), a deadlock[1] can occur:
1. bdev_open() grabs disk->open_mutex
2. The process issues read I/O to ublk backend to read partition table
3. In __ublk_complete_rq(), blk_update_request() or blk_mq_end_request()
runs bio->bi_end_io() callbacks
4. If this triggers fput() on file descriptor of ublk block device, the
work may be deferred to current task's task work (see fput() implementation)
5. This eventually calls blkdev_release() from the same context
6. blkdev_release() tries to grab disk->open_mutex again
7. Deadlock: same task waiting for a mutex it already holds
The fix is to run blk_update_request() and blk_mq_end_request() with bottom
halves disabled. This forces blkdev_release() to run in kernel work-queue
context instead of current task work context, and allows ublk server to make
forward progress, and avoids the deadlock.
[axboe: rewrite comment in ublk]
In the Linux kernel, the following vulnerability has been resolved:
Input: alps - fix use-after-free bugs caused by dev3_register_work
The dev3_register_work delayed work item is initialized within
alps_reconnect() and scheduled upon receipt of the first bare
PS/2 packet from an external PS/2 device connected to the ALPS
touchpad. During device detachment, the original implementation
calls flush_workqueue() in psmouse_disconnect() to ensure
completion of dev3_register_work. However, the flush_workqueue()
in psmouse_disconnect() only blocks and waits for work items that
were already queued to the workqueue prior to its invocation. Any
work items submitted after flush_workqueue() is called are not
included in the set of tasks that the flush operation awaits.
This means that after flush_workqueue() has finished executing,
the dev3_register_work could still be scheduled. Although the
psmouse state is set to PSMOUSE_CMD_MODE in psmouse_disconnect(),
the scheduling of dev3_register_work remains unaffected.
The race condition can occur as follows:
CPU 0 (cleanup path) | CPU 1 (delayed work)
psmouse_disconnect() |
psmouse_set_state() |
flush_workqueue() | alps_report_bare_ps2_packet()
alps_disconnect() | psmouse_queue_work()
kfree(priv); // FREE | alps_register_bare_ps2_mouse()
| priv = container_of(work...); // USE
| priv->dev3 // USE
Add disable_delayed_work_sync() in alps_disconnect() to ensure
that dev3_register_work is properly canceled and prevented from
executing after the alps_data structure has been deallocated.
This bug is identified by static analysis.
In the Linux kernel, the following vulnerability has been resolved:
fuse: fix readahead reclaim deadlock
Commit e26ee4efbc79 ("fuse: allocate ff->release_args only if release is
needed") skips allocating ff->release_args if the server does not
implement open. However in doing so, fuse_prepare_release() now skips
grabbing the reference on the inode, which makes it possible for an
inode to be evicted from the dcache while there are inflight readahead
requests. This causes a deadlock if the server triggers reclaim while
servicing the readahead request and reclaim attempts to evict the inode
of the file being read ahead. Since the folio is locked during
readahead, when reclaim evicts the fuse inode and fuse_evict_inode()
attempts to remove all folios associated with the inode from the page
cache (truncate_inode_pages_range()), reclaim will block forever waiting
for the lock since readahead cannot relinquish the lock because it is
itself blocked in reclaim:
>>> stack_trace(1504735)
folio_wait_bit_common (mm/filemap.c:1308:4)
folio_lock (./include/linux/pagemap.h:1052:3)
truncate_inode_pages_range (mm/truncate.c:336:10)
fuse_evict_inode (fs/fuse/inode.c:161:2)
evict (fs/inode.c:704:3)
dentry_unlink_inode (fs/dcache.c:412:3)
__dentry_kill (fs/dcache.c:615:3)
shrink_kill (fs/dcache.c:1060:12)
shrink_dentry_list (fs/dcache.c:1087:3)
prune_dcache_sb (fs/dcache.c:1168:2)
super_cache_scan (fs/super.c:221:10)
do_shrink_slab (mm/shrinker.c:435:9)
shrink_slab (mm/shrinker.c:626:10)
shrink_node (mm/vmscan.c:5951:2)
shrink_zones (mm/vmscan.c:6195:3)
do_try_to_free_pages (mm/vmscan.c:6257:3)
do_swap_page (mm/memory.c:4136:11)
handle_pte_fault (mm/memory.c:5562:10)
handle_mm_fault (mm/memory.c:5870:9)
do_user_addr_fault (arch/x86/mm/fault.c:1338:10)
handle_page_fault (arch/x86/mm/fault.c:1481:3)
exc_page_fault (arch/x86/mm/fault.c:1539:2)
asm_exc_page_fault+0x22/0x27
Fix this deadlock by allocating ff->release_args and grabbing the
reference on the inode when preparing the file for release even if the
server does not implement open. The inode reference will be dropped when
the last reference on the fuse file is dropped (see fuse_file_put() ->
fuse_release_end()).
In the Linux kernel, the following vulnerability has been resolved:
ext4: xattr: fix null pointer deref in ext4_raw_inode()
If ext4_get_inode_loc() fails (e.g. if it returns -EFSCORRUPTED),
iloc.bh will remain set to NULL. Since ext4_xattr_inode_dec_ref_all()
lacks error checking, this will lead to a null pointer dereference
in ext4_raw_inode(), called right after ext4_get_inode_loc().
Found by Linux Verification Center (linuxtesting.org) with SVACE.
In the Linux kernel, the following vulnerability has been resolved:
media: dvb-usb: dtv5100: fix out-of-bounds in dtv5100_i2c_msg()
rlen value is a user-controlled value, but dtv5100_i2c_msg() does not
check the size of the rlen value. Therefore, if it is set to a value
larger than sizeof(st->data), an out-of-bounds vuln occurs for st->data.
Therefore, we need to add proper range checking to prevent this vuln.
In the Linux kernel, the following vulnerability has been resolved:
scsi: Revert "scsi: qla2xxx: Perform lockless command completion in abort path"
This reverts commit 0367076b0817d5c75dfb83001ce7ce5c64d803a9.
The commit being reverted added code to __qla2x00_abort_all_cmds() to
call sp->done() without holding a spinlock. But unlike the older code
below it, this new code failed to check sp->cmd_type and just assumed
TYPE_SRB, which results in a jump to an invalid pointer in target-mode
with TYPE_TGT_CMD:
qla2xxx [0000:65:00.0]-d034:8: qla24xx_do_nack_work create sess success
0000000009f7a79b
qla2xxx [0000:65:00.0]-5003:8: ISP System Error - mbx1=1ff5h mbx2=10h
mbx3=0h mbx4=0h mbx5=191h mbx6=0h mbx7=0h.
qla2xxx [0000:65:00.0]-d01e:8: -> fwdump no buffer
qla2xxx [0000:65:00.0]-f03a:8: qla_target(0): System error async event
0x8002 occurred
qla2xxx [0000:65:00.0]-00af:8: Performing ISP error recovery -
ha=0000000058183fda.
BUG: kernel NULL pointer dereference, address: 0000000000000000
PF: supervisor instruction fetch in kernel mode
PF: error_code(0x0010) - not-present page
PGD 0 P4D 0
Oops: 0010 [#1] SMP
CPU: 2 PID: 9446 Comm: qla2xxx_8_dpc Tainted: G O 6.1.133 #1
Hardware name: Supermicro Super Server/X11SPL-F, BIOS 4.2 12/15/2023
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90001f93dc8 EFLAGS: 00010206
RAX: 0000000000000282 RBX: 0000000000000355 RCX: ffff88810d16a000
RDX: ffff88810dbadaa8 RSI: 0000000000080000 RDI: ffff888169dc38c0
RBP: ffff888169dc38c0 R08: 0000000000000001 R09: 0000000000000045
R10: ffffffffa034bdf0 R11: 0000000000000000 R12: ffff88810800bb40
R13: 0000000000001aa8 R14: ffff888100136610 R15: ffff8881070f7400
FS: 0000000000000000(0000) GS:ffff88bf80080000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000010c8ff006 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __die+0x4d/0x8b
? page_fault_oops+0x91/0x180
? trace_buffer_unlock_commit_regs+0x38/0x1a0
? exc_page_fault+0x391/0x5e0
? asm_exc_page_fault+0x22/0x30
__qla2x00_abort_all_cmds+0xcb/0x3e0 [qla2xxx_scst]
qla2x00_abort_all_cmds+0x50/0x70 [qla2xxx_scst]
qla2x00_abort_isp_cleanup+0x3b7/0x4b0 [qla2xxx_scst]
qla2x00_abort_isp+0xfd/0x860 [qla2xxx_scst]
qla2x00_do_dpc+0x581/0xa40 [qla2xxx_scst]
kthread+0xa8/0xd0
</TASK>
Then commit 4475afa2646d ("scsi: qla2xxx: Complete command early within
lock") added the spinlock back, because not having the lock caused a
race and a crash. But qla2x00_abort_srb() in the switch below already
checks for qla2x00_chip_is_down() and handles it the same way, so the
code above the switch is now redundant and still buggy in target-mode.
Remove it.
In the Linux kernel, the following vulnerability has been resolved:
ksmbd: fix use-after-free in ksmbd_tree_connect_put under concurrency
Under high concurrency, A tree-connection object (tcon) is freed on
a disconnect path while another path still holds a reference and later
executes *_put()/write on it.
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: fw_tracer, Validate format string parameters
Add validation for format string parameters in the firmware tracer to
prevent potential security vulnerabilities and crashes from malformed
format strings received from firmware.
The firmware tracer receives format strings from the device firmware and
uses them to format trace messages. Without proper validation, bad
firmware could provide format strings with invalid format specifiers
(e.g., %s, %p, %n) that could lead to crashes, or other undefined
behavior.
Add mlx5_tracer_validate_params() to validate that all format specifiers
in trace strings are limited to safe integer/hex formats (%x, %d, %i,
%u, %llx, %lx, etc.). Reject strings containing other format types that
could be used to access arbitrary memory or cause crashes.
Invalid format strings are added to the trace output for visibility with
"BAD_FORMAT: " prefix.
In the Linux kernel, the following vulnerability has been resolved:
net/sched: ets: Remove drr class from the active list if it changes to strict
Whenever a user issues an ets qdisc change command, transforming a
drr class into a strict one, the ets code isn't checking whether that
class was in the active list and removing it. This means that, if a
user changes a strict class (which was in the active list) back to a drr
one, that class will be added twice to the active list [1].
Doing so with the following commands:
tc qdisc add dev lo root handle 1: ets bands 2 strict 1
tc qdisc add dev lo parent 1:2 handle 20: \
tbf rate 8bit burst 100b latency 1s
tc filter add dev lo parent 1: basic classid 1:2
ping -c1 -W0.01 -s 56 127.0.0.1
tc qdisc change dev lo root handle 1: ets bands 2 strict 2
tc qdisc change dev lo root handle 1: ets bands 2 strict 1
ping -c1 -W0.01 -s 56 127.0.0.1
Will trigger the following splat with list debug turned on:
[ 59.279014][ T365] ------------[ cut here ]------------
[ 59.279452][ T365] list_add double add: new=ffff88801d60e350, prev=ffff88801d60e350, next=ffff88801d60e2c0.
[ 59.280153][ T365] WARNING: CPU: 3 PID: 365 at lib/list_debug.c:35 __list_add_valid_or_report+0x17f/0x220
[ 59.280860][ T365] Modules linked in:
[ 59.281165][ T365] CPU: 3 UID: 0 PID: 365 Comm: tc Not tainted 6.18.0-rc7-00105-g7e9f13163c13-dirty #239 PREEMPT(voluntary)
[ 59.281977][ T365] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 59.282391][ T365] RIP: 0010:__list_add_valid_or_report+0x17f/0x220
[ 59.282842][ T365] Code: 89 c6 e8 d4 b7 0d ff 90 0f 0b 90 90 31 c0 e9 31 ff ff ff 90 48 c7 c7 e0 a0 22 9f 48 89 f2 48 89 c1 4c 89 c6 e8 b2 b7 0d ff 90 <0f> 0b 90 90 31 c0 e9 0f ff ff ff 48 89 f7 48 89 44 24 10 4c 89 44
...
[ 59.288812][ T365] Call Trace:
[ 59.289056][ T365] <TASK>
[ 59.289224][ T365] ? srso_alias_return_thunk+0x5/0xfbef5
[ 59.289546][ T365] ets_qdisc_change+0xd2b/0x1e80
[ 59.289891][ T365] ? __lock_acquire+0x7e7/0x1be0
[ 59.290223][ T365] ? __pfx_ets_qdisc_change+0x10/0x10
[ 59.290546][ T365] ? srso_alias_return_thunk+0x5/0xfbef5
[ 59.290898][ T365] ? __mutex_trylock_common+0xda/0x240
[ 59.291228][ T365] ? __pfx___mutex_trylock_common+0x10/0x10
[ 59.291655][ T365] ? srso_alias_return_thunk+0x5/0xfbef5
[ 59.291993][ T365] ? srso_alias_return_thunk+0x5/0xfbef5
[ 59.292313][ T365] ? trace_contention_end+0xc8/0x110
[ 59.292656][ T365] ? srso_alias_return_thunk+0x5/0xfbef5
[ 59.293022][ T365] ? srso_alias_return_thunk+0x5/0xfbef5
[ 59.293351][ T365] tc_modify_qdisc+0x63a/0x1cf0
Fix this by always checking and removing an ets class from the active list
when changing it to strict.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/net/sched/sch_ets.c?id=ce052b9402e461a9aded599f5b47e76bc727f7de#n663
In the Linux kernel, the following vulnerability has been resolved:
io_uring: fix filename leak in __io_openat_prep()
__io_openat_prep() allocates a struct filename using getname(). However,
for the condition of the file being installed in the fixed file table as
well as having O_CLOEXEC flag set, the function returns early. At that
point, the request doesn't have REQ_F_NEED_CLEANUP flag set. Due to this,
the memory for the newly allocated struct filename is not cleaned up,
causing a memory leak.
Fix this by setting the REQ_F_NEED_CLEANUP for the request just after the
successful getname() call, so that when the request is torn down, the
filename will be cleaned up, along with other resources needing cleanup.
In the Linux kernel, the following vulnerability has been resolved:
ipvs: fix ipv4 null-ptr-deref in route error path
The IPv4 code path in __ip_vs_get_out_rt() calls dst_link_failure()
without ensuring skb->dev is set, leading to a NULL pointer dereference
in fib_compute_spec_dst() when ipv4_link_failure() attempts to send
ICMP destination unreachable messages.
The issue emerged after commit ed0de45a1008 ("ipv4: recompile ip options
in ipv4_link_failure") started calling __ip_options_compile() from
ipv4_link_failure(). This code path eventually calls fib_compute_spec_dst()
which dereferences skb->dev. An attempt was made to fix the NULL skb->dev
dereference in commit 0113d9c9d1cc ("ipv4: fix null-deref in
ipv4_link_failure"), but it only addressed the immediate dev_net(skb->dev)
dereference by using a fallback device. The fix was incomplete because
fib_compute_spec_dst() later in the call chain still accesses skb->dev
directly, which remains NULL when IPVS calls dst_link_failure().
The crash occurs when:
1. IPVS processes a packet in NAT mode with a misconfigured destination
2. Route lookup fails in __ip_vs_get_out_rt() before establishing a route
3. The error path calls dst_link_failure(skb) with skb->dev == NULL
4. ipv4_link_failure() → ipv4_send_dest_unreach() →
__ip_options_compile() → fib_compute_spec_dst()
5. fib_compute_spec_dst() dereferences NULL skb->dev
Apply the same fix used for IPv6 in commit 326bf17ea5d4 ("ipvs: fix
ipv6 route unreach panic"): set skb->dev from skb_dst(skb)->dev before
calling dst_link_failure().
KASAN: null-ptr-deref in range [0x0000000000000328-0x000000000000032f]
CPU: 1 PID: 12732 Comm: syz.1.3469 Not tainted 6.6.114 #2
RIP: 0010:__in_dev_get_rcu include/linux/inetdevice.h:233
RIP: 0010:fib_compute_spec_dst+0x17a/0x9f0 net/ipv4/fib_frontend.c:285
Call Trace:
<TASK>
spec_dst_fill net/ipv4/ip_options.c:232
spec_dst_fill net/ipv4/ip_options.c:229
__ip_options_compile+0x13a1/0x17d0 net/ipv4/ip_options.c:330
ipv4_send_dest_unreach net/ipv4/route.c:1252
ipv4_link_failure+0x702/0xb80 net/ipv4/route.c:1265
dst_link_failure include/net/dst.h:437
__ip_vs_get_out_rt+0x15fd/0x19e0 net/netfilter/ipvs/ip_vs_xmit.c:412
ip_vs_nat_xmit+0x1d8/0xc80 net/netfilter/ipvs/ip_vs_xmit.c:764
In the Linux kernel, the following vulnerability has been resolved:
media: iris: Add sanity check for stop streaming
Add sanity check in iris_vb2_stop_streaming. If inst->state is
already IRIS_INST_ERROR, we should skip the stream_off operation
because it would still send packets to the firmware.
In iris_kill_session, inst->state is set to IRIS_INST_ERROR and
session_close is executed, which will kfree(inst_hfi_gen2->packet).
If stop_streaming is called afterward, it will cause a crash.
[bod: remove qcom from patch title]