In the Linux kernel, the following vulnerability has been resolved:
pinctrl: renesas: rzn1: Fix possible null-ptr-deref in sh_pfc_map_resources()
It will cause null-ptr-deref when using 'res', if platform_get_resource()
returns NULL, so move using 'res' after devm_ioremap_resource() that
will check it to avoid null-ptr-deref.
And use devm_platform_get_and_ioremap_resource() to simplify code.
In the Linux kernel, the following vulnerability has been resolved:
soc: bcm: Check for NULL return of devm_kzalloc()
As the potential failure of allocation, devm_kzalloc() may return NULL. Then
the 'pd->pmb' and the follow lines of code may bring null pointer dereference.
Therefore, it is better to check the return value of devm_kzalloc() to avoid
this confusion.
In the Linux kernel, the following vulnerability has been resolved:
ARM: hisi: Add missing of_node_put after of_find_compatible_node
of_find_compatible_node will increment the refcount of the returned
device_node. Calling of_node_put() to avoid the refcount leak
In the Linux kernel, the following vulnerability has been resolved:
nvdimm: Fix firmware activation deadlock scenarios
Lockdep reports the following deadlock scenarios for CXL root device
power-management, device_prepare(), operations, and device_shutdown()
operations for 'nd_region' devices:
Chain exists of:
&nvdimm_region_key --> &nvdimm_bus->reconfig_mutex --> system_transition_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(system_transition_mutex);
lock(&nvdimm_bus->reconfig_mutex);
lock(system_transition_mutex);
lock(&nvdimm_region_key);
Chain exists of:
&cxl_nvdimm_bridge_key --> acpi_scan_lock --> &cxl_root_key
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&cxl_root_key);
lock(acpi_scan_lock);
lock(&cxl_root_key);
lock(&cxl_nvdimm_bridge_key);
These stem from holding nvdimm_bus_lock() over hibernate_quiet_exec()
which walks the entire system device topology taking device_lock() along
the way. The nvdimm_bus_lock() is protecting against unregistration,
multiple simultaneous ops callers, and preventing activate_show() from
racing activate_store(). For the first 2, the lock is redundant.
Unregistration already flushes all ops users, and sysfs already prevents
multiple threads to be active in an ops handler at the same time. For
the last userspace should already be waiting for its last
activate_store() to complete, and does not need activate_show() to flush
the write side, so this lock usage can be deleted in these attributes.
In the Linux kernel, the following vulnerability has been resolved:
pinctrl: renesas: core: Fix possible null-ptr-deref in sh_pfc_map_resources()
It will cause null-ptr-deref when using 'res', if platform_get_resource()
returns NULL, so move using 'res' after devm_ioremap_resource() that
will check it to avoid null-ptr-deref.
And use devm_platform_get_and_ioremap_resource() to simplify code.
In the Linux kernel, the following vulnerability has been resolved:
list: fix a data-race around ep->rdllist
ep_poll() first calls ep_events_available() with no lock held and checks
if ep->rdllist is empty by list_empty_careful(), which reads
rdllist->prev. Thus all accesses to it need some protection to avoid
store/load-tearing.
Note INIT_LIST_HEAD_RCU() already has the annotation for both prev
and next.
Commit bf3b9f6372c4 ("epoll: Add busy poll support to epoll with socket
fds.") added the first lockless ep_events_available(), and commit
c5a282e9635e ("fs/epoll: reduce the scope of wq lock in epoll_wait()")
made some ep_events_available() calls lockless and added single call under
a lock, finally commit e59d3c64cba6 ("epoll: eliminate unnecessary lock
for zero timeout") made the last ep_events_available() lockless.
BUG: KCSAN: data-race in do_epoll_wait / do_epoll_wait
write to 0xffff88810480c7d8 of 8 bytes by task 1802 on cpu 0:
INIT_LIST_HEAD include/linux/list.h:38 [inline]
list_splice_init include/linux/list.h:492 [inline]
ep_start_scan fs/eventpoll.c:622 [inline]
ep_send_events fs/eventpoll.c:1656 [inline]
ep_poll fs/eventpoll.c:1806 [inline]
do_epoll_wait+0x4eb/0xf40 fs/eventpoll.c:2234
do_epoll_pwait fs/eventpoll.c:2268 [inline]
__do_sys_epoll_pwait fs/eventpoll.c:2281 [inline]
__se_sys_epoll_pwait+0x12b/0x240 fs/eventpoll.c:2275
__x64_sys_epoll_pwait+0x74/0x80 fs/eventpoll.c:2275
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
read to 0xffff88810480c7d8 of 8 bytes by task 1799 on cpu 1:
list_empty_careful include/linux/list.h:329 [inline]
ep_events_available fs/eventpoll.c:381 [inline]
ep_poll fs/eventpoll.c:1797 [inline]
do_epoll_wait+0x279/0xf40 fs/eventpoll.c:2234
do_epoll_pwait fs/eventpoll.c:2268 [inline]
__do_sys_epoll_pwait fs/eventpoll.c:2281 [inline]
__se_sys_epoll_pwait+0x12b/0x240 fs/eventpoll.c:2275
__x64_sys_epoll_pwait+0x74/0x80 fs/eventpoll.c:2275
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0xffff88810480c7d0 -> 0xffff888103c15098
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 1799 Comm: syz-fuzzer Tainted: G W 5.17.0-rc7-syzkaller-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
In the Linux kernel, the following vulnerability has been resolved:
drivers/base/node.c: fix compaction sysfs file leak
Compaction sysfs file is created via compaction_register_node in
register_node. But we forgot to remove it in unregister_node. Thus
compaction sysfs file is leaked. Using compaction_unregister_node to fix
this issue.
In the Linux kernel, the following vulnerability has been resolved:
tty: fix deadlock caused by calling printk() under tty_port->lock
pty_write() invokes kmalloc() which may invoke a normal printk() to print
failure message. This can cause a deadlock in the scenario reported by
syz-bot below:
CPU0 CPU1 CPU2
---- ---- ----
lock(console_owner);
lock(&port_lock_key);
lock(&port->lock);
lock(&port_lock_key);
lock(&port->lock);
lock(console_owner);
As commit dbdda842fe96 ("printk: Add console owner and waiter logic to
load balance console writes") said, such deadlock can be prevented by
using printk_deferred() in kmalloc() (which is invoked in the section
guarded by the port->lock). But there are too many printk() on the
kmalloc() path, and kmalloc() can be called from anywhere, so changing
printk() to printk_deferred() is too complicated and inelegant.
Therefore, this patch chooses to specify __GFP_NOWARN to kmalloc(), so
that printk() will not be called, and this deadlock problem can be
avoided.
Syzbot reported the following lockdep error:
======================================================
WARNING: possible circular locking dependency detected
5.4.143-00237-g08ccc19a-dirty #10 Not tainted
------------------------------------------------------
syz-executor.4/29420 is trying to acquire lock:
ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:1752 [inline]
ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: vprintk_emit+0x2ca/0x470 kernel/printk/printk.c:2023
but task is already holding lock:
ffff8880119c9158 (&port->lock){-.-.}-{2:2}, at: pty_write+0xf4/0x1f0 drivers/tty/pty.c:120
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&port->lock){-.-.}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159
tty_port_tty_get drivers/tty/tty_port.c:288 [inline] <-- lock(&port->lock);
tty_port_default_wakeup+0x1d/0xb0 drivers/tty/tty_port.c:47
serial8250_tx_chars+0x530/0xa80 drivers/tty/serial/8250/8250_port.c:1767
serial8250_handle_irq.part.0+0x31f/0x3d0 drivers/tty/serial/8250/8250_port.c:1854
serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1827 [inline] <-- lock(&port_lock_key);
serial8250_default_handle_irq+0xb2/0x220 drivers/tty/serial/8250/8250_port.c:1870
serial8250_interrupt+0xfd/0x200 drivers/tty/serial/8250/8250_core.c:126
__handle_irq_event_percpu+0x109/0xa50 kernel/irq/handle.c:156
[...]
-> #1 (&port_lock_key){-.-.}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159
serial8250_console_write+0x184/0xa40 drivers/tty/serial/8250/8250_port.c:3198
<-- lock(&port_lock_key);
call_console_drivers kernel/printk/printk.c:1819 [inline]
console_unlock+0x8cb/0xd00 kernel/printk/printk.c:2504
vprintk_emit+0x1b5/0x470 kernel/printk/printk.c:2024 <-- lock(console_owner);
vprintk_func+0x8d/0x250 kernel/printk/printk_safe.c:394
printk+0xba/0xed kernel/printk/printk.c:2084
register_console+0x8b3/0xc10 kernel/printk/printk.c:2829
univ8250_console_init+0x3a/0x46 drivers/tty/serial/8250/8250_core.c:681
console_init+0x49d/0x6d3 kernel/printk/printk.c:2915
start_kernel+0x5e9/0x879 init/main.c:713
secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241
-> #0 (console_owner){....}-{0:0}:
[...]
lock_acquire+0x127/0x340 kernel/locking/lockdep.c:4734
console_trylock_spinning kernel/printk/printk.c:1773
---truncated---
In the Linux kernel, the following vulnerability has been resolved:
powerpc/rtas: Keep MSR[RI] set when calling RTAS
RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32-bit big
endian mode (MSR[SF,LE] unset).
The change in MSR is done in enter_rtas() in a relatively complex way,
since the MSR value could be hardcoded.
Furthermore, a panic has been reported when hitting the watchdog interrupt
while running in RTAS, this leads to the following stack trace:
watchdog: CPU 24 Hard LOCKUP
watchdog: CPU 24 TB:997512652051031, last heartbeat TB:997504470175378 (15980ms ago)
...
Supported: No, Unreleased kernel
CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G E X 5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
NIP: 000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
REGS: c00000000fc33d60 TRAP: 0100 Tainted: G E X (5.14.21-150400.71.1.bz196362_2-default)
MSR: 8000000002981000 <SF,VEC,VSX,ME> CR: 48800002 XER: 20040020
CFAR: 000000000000011c IRQMASK: 1
GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
NIP [000000001fb41050] 0x1fb41050
LR [000000001fb4104c] 0x1fb4104c
Call Trace:
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
Oops: Unrecoverable System Reset, sig: 6 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
...
Supported: No, Unreleased kernel
CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G E X 5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
NIP: 000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
REGS: c00000000fc33d60 TRAP: 0100 Tainted: G E X (5.14.21-150400.71.1.bz196362_2-default)
MSR: 8000000002981000 <SF,VEC,VSX,ME> CR: 48800002 XER: 20040020
CFAR: 000000000000011c IRQMASK: 1
GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
NIP [000000001fb41050] 0x1fb41050
LR [000000001fb4104c] 0x1fb4104c
Call Trace:
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
---[ end trace 3ddec07f638c34a2 ]---
This happens because MSR[RI] is unset when entering RTAS but there is no
valid reason to not set it here.
RTAS is expected to be called with MSR[RI] as specified in PAPR+ section
"7.2.1 Machine State":
R1–7.2.1–9. If called with MSR[RI] equal to 1, then RTAS must protect
its own critical regions from recursion by setting the MSR[RI] bit to
0 when in the critical regions.
Fixing this by reviewing the way MSR is compute before calling RTAS. Now a
hardcoded value meaning real
---truncated---
In the Linux kernel, the following vulnerability has been resolved:
powerpc/fsl_rio: Fix refcount leak in fsl_rio_setup
of_parse_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
Input: sparcspkr - fix refcount leak in bbc_beep_probe
of_find_node_by_path() calls of_find_node_opts_by_path(),
which returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
powerpc/xive: Fix refcount leak in xive_spapr_init
of_find_compatible_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
powerpc/papr_scm: Fix leaking nvdimm_events_map elements
Right now 'char *' elements allocated for individual 'stat_id' in
'papr_scm_priv.nvdimm_events_map[]' during papr_scm_pmu_check_events(), get
leaked in papr_scm_remove() and papr_scm_pmu_register(),
papr_scm_pmu_check_events() error paths.
Also individual 'stat_id' arent NULL terminated 'char *' instead they are fixed
8-byte sized identifiers. However papr_scm_pmu_register() assumes it to be a
NULL terminated 'char *' and at other places it assumes it to be a
'papr_scm_perf_stat.stat_id' sized string which is 8-byes in size.
Fix this by allocating the memory for papr_scm_priv.nvdimm_events_map to also
include space for 'stat_id' entries. This is possible since number of available
events/stat_ids are known upfront. This saves some memory and one extra level of
indirection from 'nvdimm_events_map' to 'stat_id'. Also rest of the code
can continue to call 'kfree(papr_scm_priv.nvdimm_events_map)' without needing to
iterate over the array and free up individual elements.
In the Linux kernel, the following vulnerability has been resolved:
mfd: davinci_voicecodec: Fix possible null-ptr-deref davinci_vc_probe()
It will cause null-ptr-deref when using 'res', if platform_get_resource()
returns NULL, so move using 'res' after devm_ioremap_resource() that
will check it to avoid null-ptr-deref.
And use devm_platform_get_and_ioremap_resource() to simplify code.
In the Linux kernel, the following vulnerability has been resolved:
PCI: Avoid pci_dev_lock() AB/BA deadlock with sriov_numvfs_store()
The sysfs sriov_numvfs_store() path acquires the device lock before the
config space access lock:
sriov_numvfs_store
device_lock # A (1) acquire device lock
sriov_configure
vfio_pci_sriov_configure # (for example)
vfio_pci_core_sriov_configure
pci_disable_sriov
sriov_disable
pci_cfg_access_lock
pci_wait_cfg # B (4) wait for dev->block_cfg_access == 0
Previously, pci_dev_lock() acquired the config space access lock before the
device lock:
pci_dev_lock
pci_cfg_access_lock
dev->block_cfg_access = 1 # B (2) set dev->block_cfg_access = 1
device_lock # A (3) wait for device lock
Any path that uses pci_dev_lock(), e.g., pci_reset_function(), may
deadlock with sriov_numvfs_store() if the operations occur in the sequence
(1) (2) (3) (4).
Avoid the deadlock by reversing the order in pci_dev_lock() so it acquires
the device lock before the config space access lock, the same as the
sriov_numvfs_store() path.
[bhelgaas: combined and adapted commit log from Jay Zhou's independent
subsequent posting:
https://lore.kernel.org/r/20220404062539.1710-1-jianjay.zhou@huawei.com]
In the Linux kernel, the following vulnerability has been resolved:
RDMA/hfi1: Prevent use of lock before it is initialized
If there is a failure during probe of hfi1 before the sdma_map_lock is
initialized, the call to hfi1_free_devdata() will attempt to use a lock
that has not been initialized. If the locking correctness validator is on
then an INFO message and stack trace resembling the following may be seen:
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
Call Trace:
register_lock_class+0x11b/0x880
__lock_acquire+0xf3/0x7930
lock_acquire+0xff/0x2d0
_raw_spin_lock_irq+0x46/0x60
sdma_clean+0x42a/0x660 [hfi1]
hfi1_free_devdata+0x3a7/0x420 [hfi1]
init_one+0x867/0x11a0 [hfi1]
pci_device_probe+0x40e/0x8d0
The use of sdma_map_lock in sdma_clean() is for freeing the sdma_map
memory, and sdma_map is not allocated/initialized until after
sdma_map_lock has been initialized. This code only needs to be run if
sdma_map is not NULL, and so checking for that condition will avoid trying
to use the lock before it is initialized.
In the Linux kernel, the following vulnerability has been resolved:
powerpc/xics: fix refcount leak in icp_opal_init()
The of_find_compatible_node() function returns a node pointer with
refcount incremented, use of_node_put() on it when done.
In the Linux kernel, the following vulnerability has been resolved:
powerpc/iommu: Add missing of_node_put in iommu_init_early_dart
The device_node pointer is returned by of_find_compatible_node
with refcount incremented. We should use of_node_put() to avoid
the refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
RDMA/hfi1: Prevent panic when SDMA is disabled
If the hfi1 module is loaded with HFI1_CAP_SDMA off, a call to
hfi1_write_iter() will dereference a NULL pointer and panic. A typical
stack frame is:
sdma_select_user_engine [hfi1]
hfi1_user_sdma_process_request [hfi1]
hfi1_write_iter [hfi1]
do_iter_readv_writev
do_iter_write
vfs_writev
do_writev
do_syscall_64
The fix is to test for SDMA in hfi1_write_iter() and fail the I/O with
EINVAL.
In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix to do sanity check on inline_dots inode
As Wenqing reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=215765
It will cause a kernel panic with steps:
- mkdir mnt
- mount tmp40.img mnt
- ls mnt
folio_mark_dirty+0x33/0x50
f2fs_add_regular_entry+0x541/0xad0 [f2fs]
f2fs_add_dentry+0x6c/0xb0 [f2fs]
f2fs_do_add_link+0x182/0x230 [f2fs]
__recover_dot_dentries+0x2d6/0x470 [f2fs]
f2fs_lookup+0x5af/0x6a0 [f2fs]
__lookup_slow+0xac/0x200
lookup_slow+0x45/0x70
walk_component+0x16c/0x250
path_lookupat+0x8b/0x1f0
filename_lookup+0xef/0x250
user_path_at_empty+0x46/0x70
vfs_statx+0x98/0x190
__do_sys_newlstat+0x41/0x90
__x64_sys_newlstat+0x1a/0x30
do_syscall_64+0x37/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
The root cause is for special file: e.g. character, block, fifo or
socket file, f2fs doesn't assign address space operations pointer array
for mapping->a_ops field, so, in a fuzzed image, if inline_dots flag was
tagged in special file, during lookup(), when f2fs runs into
__recover_dot_dentries(), it will cause NULL pointer access once
f2fs_add_regular_entry() calls a_ops->set_dirty_page().
In the Linux kernel, the following vulnerability has been resolved:
iommu/mediatek: Remove clk_disable in mtk_iommu_remove
After the commit b34ea31fe013 ("iommu/mediatek: Always enable the clk on
resume"), the iommu clock is controlled by the runtime callback.
thus remove the clk control in the mtk_iommu_remove.
Otherwise, it will warning like:
echo 14018000.iommu > /sys/bus/platform/drivers/mtk-iommu/unbind
[ 51.413044] ------------[ cut here ]------------
[ 51.413648] vpp0_smi_iommu already disabled
[ 51.414233] WARNING: CPU: 2 PID: 157 at */v5.15-rc1/kernel/mediatek/
drivers/clk/clk.c:952 clk_core_disable+0xb0/0xb8
[ 51.417174] Hardware name: MT8195V/C(ENG) (DT)
[ 51.418635] pc : clk_core_disable+0xb0/0xb8
[ 51.419177] lr : clk_core_disable+0xb0/0xb8
...
[ 51.429375] Call trace:
[ 51.429694] clk_core_disable+0xb0/0xb8
[ 51.430193] clk_core_disable_lock+0x24/0x40
[ 51.430745] clk_disable+0x20/0x30
[ 51.431189] mtk_iommu_remove+0x58/0x118
[ 51.431705] platform_remove+0x28/0x60
[ 51.432197] device_release_driver_internal+0x110/0x1f0
[ 51.432873] device_driver_detach+0x18/0x28
[ 51.433418] unbind_store+0xd4/0x108
[ 51.433886] drv_attr_store+0x24/0x38
[ 51.434363] sysfs_kf_write+0x40/0x58
[ 51.434843] kernfs_fop_write_iter+0x164/0x1e0
In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix dereference of stale list iterator after loop body
The list iterator variable will be a bogus pointer if no break was hit.
Dereferencing it (cur->page in this case) could load an out-of-bounds/undefined
value making it unsafe to use that in the comparision to determine if the
specific element was found.
Since 'cur->page' *can* be out-ouf-bounds it cannot be guaranteed that
by chance (or intention of an attacker) it matches the value of 'page'
even though the correct element was not found.
This is fixed by using a separate list iterator variable for the loop
and only setting the original variable if a suitable element was found.
Then determing if the element was found is simply checking if the
variable is set.
In the Linux kernel, the following vulnerability has been resolved:
rtla: Avoid record NULL pointer dereference
Fix the following null/deref_null.cocci errors:
./tools/tracing/rtla/src/osnoise_hist.c:870:31-36: ERROR: record is NULL but dereferenced.
./tools/tracing/rtla/src/osnoise_top.c:650:31-36: ERROR: record is NULL but dereferenced.
./tools/tracing/rtla/src/timerlat_hist.c:905:31-36: ERROR: record is NULL but dereferenced.
./tools/tracing/rtla/src/timerlat_top.c:700:31-36: ERROR: record is NULL but dereferenced.
"record" is NULL before calling osnoise_init_trace_tool.
Add a tag "out_free" to avoid dereferring a NULL pointer.
In the Linux kernel, the following vulnerability has been resolved:
dmaengine: idxd: Fix the error handling path in idxd_cdev_register()
If a call to alloc_chrdev_region() fails, the already allocated resources
are leaking.
Add the needed error handling path to fix the leak.
In the Linux kernel, the following vulnerability has been resolved:
net: annotate races around sk->sk_bound_dev_if
UDP sendmsg() is lockless, and reads sk->sk_bound_dev_if while
this field can be changed by another thread.
Adds minimal annotations to avoid KCSAN splats for UDP.
Following patches will add more annotations to potential lockless readers.
BUG: KCSAN: data-race in __ip6_datagram_connect / udpv6_sendmsg
write to 0xffff888136d47a94 of 4 bytes by task 7681 on cpu 0:
__ip6_datagram_connect+0x6e2/0x930 net/ipv6/datagram.c:221
ip6_datagram_connect+0x2a/0x40 net/ipv6/datagram.c:272
inet_dgram_connect+0x107/0x190 net/ipv4/af_inet.c:576
__sys_connect_file net/socket.c:1900 [inline]
__sys_connect+0x197/0x1b0 net/socket.c:1917
__do_sys_connect net/socket.c:1927 [inline]
__se_sys_connect net/socket.c:1924 [inline]
__x64_sys_connect+0x3d/0x50 net/socket.c:1924
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x50 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
read to 0xffff888136d47a94 of 4 bytes by task 7670 on cpu 1:
udpv6_sendmsg+0xc60/0x16e0 net/ipv6/udp.c:1436
inet6_sendmsg+0x5f/0x80 net/ipv6/af_inet6.c:652
sock_sendmsg_nosec net/socket.c:705 [inline]
sock_sendmsg net/socket.c:725 [inline]
____sys_sendmsg+0x39a/0x510 net/socket.c:2413
___sys_sendmsg net/socket.c:2467 [inline]
__sys_sendmmsg+0x267/0x4c0 net/socket.c:2553
__do_sys_sendmmsg net/socket.c:2582 [inline]
__se_sys_sendmmsg net/socket.c:2579 [inline]
__x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x50 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x00000000 -> 0xffffff9b
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 7670 Comm: syz-executor.3 Tainted: G W 5.18.0-rc1-syzkaller-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
I chose to not add Fixes: tag because race has minor consequences
and stable teams busy enough.
In the Linux kernel, the following vulnerability has been resolved:
ext4: fix race condition between ext4_write and ext4_convert_inline_data
Hulk Robot reported a BUG_ON:
==================================================================
EXT4-fs error (device loop3): ext4_mb_generate_buddy:805: group 0,
block bitmap and bg descriptor inconsistent: 25 vs 31513 free clusters
kernel BUG at fs/ext4/ext4_jbd2.c:53!
invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 0 PID: 25371 Comm: syz-executor.3 Not tainted 5.10.0+ #1
RIP: 0010:ext4_put_nojournal fs/ext4/ext4_jbd2.c:53 [inline]
RIP: 0010:__ext4_journal_stop+0x10e/0x110 fs/ext4/ext4_jbd2.c:116
[...]
Call Trace:
ext4_write_inline_data_end+0x59a/0x730 fs/ext4/inline.c:795
generic_perform_write+0x279/0x3c0 mm/filemap.c:3344
ext4_buffered_write_iter+0x2e3/0x3d0 fs/ext4/file.c:270
ext4_file_write_iter+0x30a/0x11c0 fs/ext4/file.c:520
do_iter_readv_writev+0x339/0x3c0 fs/read_write.c:732
do_iter_write+0x107/0x430 fs/read_write.c:861
vfs_writev fs/read_write.c:934 [inline]
do_pwritev+0x1e5/0x380 fs/read_write.c:1031
[...]
==================================================================
Above issue may happen as follows:
cpu1 cpu2
__________________________|__________________________
do_pwritev
vfs_writev
do_iter_write
ext4_file_write_iter
ext4_buffered_write_iter
generic_perform_write
ext4_da_write_begin
vfs_fallocate
ext4_fallocate
ext4_convert_inline_data
ext4_convert_inline_data_nolock
ext4_destroy_inline_data_nolock
clear EXT4_STATE_MAY_INLINE_DATA
ext4_map_blocks
ext4_ext_map_blocks
ext4_mb_new_blocks
ext4_mb_regular_allocator
ext4_mb_good_group_nolock
ext4_mb_init_group
ext4_mb_init_cache
ext4_mb_generate_buddy --> error
ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)
ext4_restore_inline_data
set EXT4_STATE_MAY_INLINE_DATA
ext4_block_write_begin
ext4_da_write_end
ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)
ext4_write_inline_data_end
handle=NULL
ext4_journal_stop(handle)
__ext4_journal_stop
ext4_put_nojournal(handle)
ref_cnt = (unsigned long)handle
BUG_ON(ref_cnt == 0) ---> BUG_ON
The lock held by ext4_convert_inline_data is xattr_sem, but the lock
held by generic_perform_write is i_rwsem. Therefore, the two locks can
be concurrent.
To solve above issue, we add inode_lock() for ext4_convert_inline_data().
At the same time, move ext4_convert_inline_data() in front of
ext4_punch_hole(), remove similar handling from ext4_punch_hole().
In the Linux kernel, the following vulnerability has been resolved:
ext4: fix memory leak in parse_apply_sb_mount_options()
If processing the on-disk mount options fails after any memory was
allocated in the ext4_fs_context, e.g. s_qf_names, then this memory is
leaked. Fix this by calling ext4_fc_free() instead of kfree() directly.
Reproducer:
mkfs.ext4 -F /dev/vdc
tune2fs /dev/vdc -E mount_opts=usrjquota=file
echo clear > /sys/kernel/debug/kmemleak
mount /dev/vdc /vdc
echo scan > /sys/kernel/debug/kmemleak
sleep 5
echo scan > /sys/kernel/debug/kmemleak
cat /sys/kernel/debug/kmemleak
In the Linux kernel, the following vulnerability has been resolved:
block: Fix potential deadlock in blk_ia_range_sysfs_show()
When being read, a sysfs attribute is already protected against removal
with the kobject node active reference counter. As a result, in
blk_ia_range_sysfs_show(), there is no need to take the queue sysfs
lock when reading the value of a range attribute. Using the queue sysfs
lock in this function creates a potential deadlock situation with the
disk removal, something that a lockdep signals with a splat when the
device is removed:
[ 760.703551] Possible unsafe locking scenario:
[ 760.703551]
[ 760.703554] CPU0 CPU1
[ 760.703556] ---- ----
[ 760.703558] lock(&q->sysfs_lock);
[ 760.703565] lock(kn->active#385);
[ 760.703573] lock(&q->sysfs_lock);
[ 760.703579] lock(kn->active#385);
[ 760.703587]
[ 760.703587] *** DEADLOCK ***
Solve this by removing the mutex_lock()/mutex_unlock() calls from
blk_ia_range_sysfs_show().
In the Linux kernel, the following vulnerability has been resolved:
staging: r8188eu: prevent ->Ssid overflow in rtw_wx_set_scan()
This code has a check to prevent read overflow but it needs another
check to prevent writing beyond the end of the ->Ssid[] array.
In the Linux kernel, the following vulnerability has been resolved:
RDMA/hfi1: Fix potential integer multiplication overflow errors
When multiplying of different types, an overflow is possible even when
storing the result in a larger type. This is because the conversion is
done after the multiplication. So arithmetic overflow and thus in
incorrect value is possible.
Correct an instance of this in the inter packet delay calculation. Fix by
ensuring one of the operands is u64 which will promote the other to u64 as
well ensuring no overflow.
In the Linux kernel, the following vulnerability has been resolved:
lib/string_helpers: fix not adding strarray to device's resource list
Add allocated strarray to device's resource list. This is a must to
automatically release strarray when the device disappears.
Without this fix we have a memory leak in the few drivers which use
devm_kasprintf_strarray().
In the Linux kernel, the following vulnerability has been resolved:
md: Don't set mddev private to NULL in raid0 pers->free
In normal stop process, it does like this:
do_md_stop
|
__md_stop (pers->free(); mddev->private=NULL)
|
md_free (free mddev)
__md_stop sets mddev->private to NULL after pers->free. The raid device
will be stopped and mddev memory is free. But in reshape, it doesn't
free the mddev and mddev will still be used in new raid.
In reshape, it first sets mddev->private to new_pers and then runs
old_pers->free(). Now raid0 sets mddev->private to NULL in raid0_free.
The new raid can't work anymore. It will panic when dereference
mddev->private because of NULL pointer dereference.
It can panic like this:
[63010.814972] kernel BUG at drivers/md/raid10.c:928!
[63010.819778] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[63010.825011] CPU: 3 PID: 44437 Comm: md0_resync Kdump: loaded Not tainted 5.14.0-86.el9.x86_64 #1
[63010.833789] Hardware name: Dell Inc. PowerEdge R6415/07YXFK, BIOS 1.15.0 09/11/2020
[63010.841440] RIP: 0010:raise_barrier+0x161/0x170 [raid10]
[63010.865508] RSP: 0018:ffffc312408bbc10 EFLAGS: 00010246
[63010.870734] RAX: 0000000000000000 RBX: ffffa00bf7d39800 RCX: 0000000000000000
[63010.877866] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffa00bf7d39800
[63010.884999] RBP: 0000000000000000 R08: fffffa4945e74400 R09: 0000000000000000
[63010.892132] R10: ffffa00eed02f798 R11: 0000000000000000 R12: ffffa00bbc435200
[63010.899266] R13: ffffa00bf7d39800 R14: 0000000000000400 R15: 0000000000000003
[63010.906399] FS: 0000000000000000(0000) GS:ffffa00eed000000(0000) knlGS:0000000000000000
[63010.914485] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[63010.920229] CR2: 00007f5cfbe99828 CR3: 0000000105efe000 CR4: 00000000003506e0
[63010.927363] Call Trace:
[63010.929822] ? bio_reset+0xe/0x40
[63010.933144] ? raid10_alloc_init_r10buf+0x60/0xa0 [raid10]
[63010.938629] raid10_sync_request+0x756/0x1610 [raid10]
[63010.943770] md_do_sync.cold+0x3e4/0x94c
[63010.947698] md_thread+0xab/0x160
[63010.951024] ? md_write_inc+0x50/0x50
[63010.954688] kthread+0x149/0x170
[63010.957923] ? set_kthread_struct+0x40/0x40
[63010.962107] ret_from_fork+0x22/0x30
Removing the code that sets mddev->private to NULL in raid0 can fix
problem.
In the Linux kernel, the following vulnerability has been resolved:
tty: goldfish: Use tty_port_destroy() to destroy port
In goldfish_tty_probe(), the port initialized through tty_port_init()
should be destroyed in error paths.In goldfish_tty_remove(), qtty->port
also should be destroyed or else might leak resources.
Fix the above by calling tty_port_destroy().
In the Linux kernel, the following vulnerability has been resolved:
usb: dwc3: gadget: Replace list_for_each_entry_safe() if using giveback
The list_for_each_entry_safe() macro saves the current item (n) and
the item after (n+1), so that n can be safely removed without
corrupting the list. However, when traversing the list and removing
items using gadget giveback, the DWC3 lock is briefly released,
allowing other routines to execute. There is a situation where, while
items are being removed from the cancelled_list using
dwc3_gadget_ep_cleanup_cancelled_requests(), the pullup disable
routine is running in parallel (due to UDC unbind). As the cleanup
routine removes n, and the pullup disable removes n+1, once the
cleanup retakes the DWC3 lock, it references a request who was already
removed/handled. With list debug enabled, this leads to a panic.
Ensure all instances of the macro are replaced where gadget giveback
is used.
Example call stack:
Thread#1:
__dwc3_gadget_ep_set_halt() - CLEAR HALT
-> dwc3_gadget_ep_cleanup_cancelled_requests()
->list_for_each_entry_safe()
->dwc3_gadget_giveback(n)
->dwc3_gadget_del_and_unmap_request()- n deleted[cancelled_list]
->spin_unlock
->Thread#2 executes
...
->dwc3_gadget_giveback(n+1)
->Already removed!
Thread#2:
dwc3_gadget_pullup()
->waiting for dwc3 spin_lock
...
->Thread#1 released lock
->dwc3_stop_active_transfers()
->dwc3_remove_requests()
->fetches n+1 item from cancelled_list (n removed by Thread#1)
->dwc3_gadget_giveback()
->dwc3_gadget_del_and_unmap_request()- n+1 deleted[cancelled_list]
->spin_unlock
In the Linux kernel, the following vulnerability has been resolved:
blk-iolatency: Fix inflight count imbalances and IO hangs on offline
iolatency needs to track the number of inflight IOs per cgroup. As this
tracking can be expensive, it is disabled when no cgroup has iolatency
configured for the device. To ensure that the inflight counters stay
balanced, iolatency_set_limit() freezes the request_queue while manipulating
the enabled counter, which ensures that no IO is in flight and thus all
counters are zero.
Unfortunately, iolatency_set_limit() isn't the only place where the enabled
counter is manipulated. iolatency_pd_offline() can also dec the counter and
trigger disabling. As this disabling happens without freezing the q, this
can easily happen while some IOs are in flight and thus leak the counts.
This can be easily demonstrated by turning on iolatency on an one empty
cgroup while IOs are in flight in other cgroups and then removing the
cgroup. Note that iolatency shouldn't have been enabled elsewhere in the
system to ensure that removing the cgroup disables iolatency for the whole
device.
The following keeps flipping on and off iolatency on sda:
echo +io > /sys/fs/cgroup/cgroup.subtree_control
while true; do
mkdir -p /sys/fs/cgroup/test
echo '8:0 target=100000' > /sys/fs/cgroup/test/io.latency
sleep 1
rmdir /sys/fs/cgroup/test
sleep 1
done
and there's concurrent fio generating direct rand reads:
fio --name test --filename=/dev/sda --direct=1 --rw=randread \
--runtime=600 --time_based --iodepth=256 --numjobs=4 --bs=4k
while monitoring with the following drgn script:
while True:
for css in css_for_each_descendant_pre(prog['blkcg_root'].css.address_of_()):
for pos in hlist_for_each(container_of(css, 'struct blkcg', 'css').blkg_list):
blkg = container_of(pos, 'struct blkcg_gq', 'blkcg_node')
pd = blkg.pd[prog['blkcg_policy_iolatency'].plid]
if pd.value_() == 0:
continue
iolat = container_of(pd, 'struct iolatency_grp', 'pd')
inflight = iolat.rq_wait.inflight.counter.value_()
if inflight:
print(f'inflight={inflight} {disk_name(blkg.q.disk).decode("utf-8")} '
f'{cgroup_path(css.cgroup).decode("utf-8")}')
time.sleep(1)
The monitoring output looks like the following:
inflight=1 sda /user.slice
inflight=1 sda /user.slice
...
inflight=14 sda /user.slice
inflight=13 sda /user.slice
inflight=17 sda /user.slice
inflight=15 sda /user.slice
inflight=18 sda /user.slice
inflight=17 sda /user.slice
inflight=20 sda /user.slice
inflight=19 sda /user.slice <- fio stopped, inflight stuck at 19
inflight=19 sda /user.slice
inflight=19 sda /user.slice
If a cgroup with stuck inflight ends up getting throttled, the throttled IOs
will never get issued as there's no completion event to wake it up leading
to an indefinite hang.
This patch fixes the bug by unifying enable handling into a work item which
is automatically kicked off from iolatency_set_min_lat_nsec() which is
called from both iolatency_set_limit() and iolatency_pd_offline() paths.
Punting to a work item is necessary as iolatency_pd_offline() is called
under spinlocks while freezing a request_queue requires a sleepable context.
This also simplifies the code reducing LOC sans the comments and avoids the
unnecessary freezes which were happening whenever a cgroup's latency target
is newly set or cleared.
In the Linux kernel, the following vulnerability has been resolved:
serial: 8250_aspeed_vuart: Fix potential NULL dereference in aspeed_vuart_probe
platform_get_resource() may fail and return NULL, so we should
better check it's return value to avoid a NULL pointer dereference.
In the Linux kernel, the following vulnerability has been resolved:
usb: usbip: fix a refcount leak in stub_probe()
usb_get_dev() is called in stub_device_alloc(). When stub_probe() fails
after that, usb_put_dev() needs to be called to release the reference.
Fix this by moving usb_put_dev() to sdev_free error path handling.
Find this by code review.
In the Linux kernel, the following vulnerability has been resolved:
watchdog: rzg2l_wdt: Fix 32bit overflow issue
The value of timer_cycle_us can be 0 due to 32bit overflow.
For eg:- If we assign the counter value "0xfff" for computing
maxval.
This patch fixes this issue by appending ULL to 1024, so that
it is promoted to 64bit.
This patch also fixes the warning message, 'watchdog: Invalid min and
max timeout values, resetting to 0!'.
In the Linux kernel, the following vulnerability has been resolved:
net: ethernet: ti: am65-cpsw-nuss: Fix some refcount leaks
of_get_child_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
am65_cpsw_init_cpts() and am65_cpsw_nuss_probe() don't release
the refcount in error case.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
soc: rockchip: Fix refcount leak in rockchip_grf_init
of_find_matching_node_and_match returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix to avoid f2fs_bug_on() in dec_valid_node_count()
As Yanming reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=215897
I have encountered a bug in F2FS file system in kernel v5.17.
The kernel should enable CONFIG_KASAN=y and CONFIG_KASAN_INLINE=y. You can
reproduce the bug by running the following commands:
The kernel message is shown below:
kernel BUG at fs/f2fs/f2fs.h:2511!
Call Trace:
f2fs_remove_inode_page+0x2a2/0x830
f2fs_evict_inode+0x9b7/0x1510
evict+0x282/0x4e0
do_unlinkat+0x33a/0x540
__x64_sys_unlinkat+0x8e/0xd0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
The root cause is: .total_valid_block_count or .total_valid_node_count
could fuzzed to zero, then once dec_valid_node_count() was called, it
will cause BUG_ON(), this patch fixes to print warning info and set
SBI_NEED_FSCK into CP instead of panic.
In the Linux kernel, the following vulnerability has been resolved:
sfc: fix considering that all channels have TX queues
Normally, all channels have RX and TX queues, but this is not true if
modparam efx_separate_tx_channels=1 is used. In that cases, some
channels only have RX queues and others only TX queues (or more
preciselly, they have them allocated, but not initialized).
Fix efx_channel_has_tx_queues to return the correct value for this case
too.
Messages shown at probe time before the fix:
sfc 0000:03:00.0 ens6f0np0: MC command 0x82 inlen 544 failed rc=-22 (raw=0) arg=0
------------[ cut here ]------------
netdevice: ens6f0np0: failed to initialise TXQ -1
WARNING: CPU: 1 PID: 626 at drivers/net/ethernet/sfc/ef10.c:2393 efx_ef10_tx_init+0x201/0x300 [sfc]
[...] stripped
RIP: 0010:efx_ef10_tx_init+0x201/0x300 [sfc]
[...] stripped
Call Trace:
efx_init_tx_queue+0xaa/0xf0 [sfc]
efx_start_channels+0x49/0x120 [sfc]
efx_start_all+0x1f8/0x430 [sfc]
efx_net_open+0x5a/0xe0 [sfc]
__dev_open+0xd0/0x190
__dev_change_flags+0x1b3/0x220
dev_change_flags+0x21/0x60
[...] stripped
Messages shown at remove time before the fix:
sfc 0000:03:00.0 ens6f0np0: failed to flush 10 queues
sfc 0000:03:00.0 ens6f0np0: failed to flush queues
In the Linux kernel, the following vulnerability has been resolved:
scsi: sd: Fix potential NULL pointer dereference
If sd_probe() sees an early error before sdkp->device is initialized,
sd_zbc_release_disk() is called. This causes a NULL pointer dereference
when sd_is_zoned() is called inside that function. Avoid this by removing
the call to sd_zbc_release_disk() in sd_probe() error path.
This change is safe and does not result in zone information memory leakage
because the zone information for a zoned disk is allocated only when
sd_revalidate_disk() is called, at which point sdkp->disk_dev is fully set,
resulting in sd_disk_release() being called when needed to cleanup a disk
zone information using sd_zbc_release_disk().
In the Linux kernel, the following vulnerability has been resolved:
rtc: mt6397: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
In the Linux kernel, the following vulnerability has been resolved:
tipc: check attribute length for bearer name
syzbot reported uninit-value:
=====================================================
BUG: KMSAN: uninit-value in string_nocheck lib/vsprintf.c:644 [inline]
BUG: KMSAN: uninit-value in string+0x4f9/0x6f0 lib/vsprintf.c:725
string_nocheck lib/vsprintf.c:644 [inline]
string+0x4f9/0x6f0 lib/vsprintf.c:725
vsnprintf+0x2222/0x3650 lib/vsprintf.c:2806
vprintk_store+0x537/0x2150 kernel/printk/printk.c:2158
vprintk_emit+0x28b/0xab0 kernel/printk/printk.c:2256
vprintk_default+0x86/0xa0 kernel/printk/printk.c:2283
vprintk+0x15f/0x180 kernel/printk/printk_safe.c:50
_printk+0x18d/0x1cf kernel/printk/printk.c:2293
tipc_enable_bearer net/tipc/bearer.c:371 [inline]
__tipc_nl_bearer_enable+0x2022/0x22a0 net/tipc/bearer.c:1033
tipc_nl_bearer_enable+0x6c/0xb0 net/tipc/bearer.c:1042
genl_family_rcv_msg_doit net/netlink/genetlink.c:731 [inline]
- Do sanity check the attribute length for TIPC_NLA_BEARER_NAME.
- Do not use 'illegal name' in printing message.