Author: Lizhi Hou <[email protected]> Date: Tue Feb 17 11:28:15 2026 -0800 accel/amdxdna: Prevent ubuf size overflow [ Upstream commit 03808abb1d868aed7478a11a82e5bb4b3f1ca6d6 ] The ubuf size calculation may overflow, resulting in an undersized allocation and possible memory corruption. Use check_add_overflow() helpers to validate the size calculation before allocation. Fixes: bd72d4acda10 ("accel/amdxdna: Support user space allocated buffer") Reviewed-by: Mario Limonciello (AMD) <[email protected]> Signed-off-by: Lizhi Hou <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Lizhi Hou <[email protected]> Date: Thu Feb 5 22:02:37 2026 -0800 accel/amdxdna: Remove buffer size check when creating command BO [ Upstream commit 08fe1b5166fdc81b010d7bf39cd6440620e7931e ] Large command buffers may be used, and they do not always need to be mapped or accessed by the driver. Performing a size check at command BO creation time unnecessarily rejects valid use cases. Remove the buffer size check from command BO creation, and defer vmap and size validation to the paths where the driver actually needs to map and access the command buffer. Fixes: ac49797c1815 ("accel/amdxdna: Add GEM buffer object management") Reviewed-by: Mario Limonciello (AMD) <[email protected]> Signed-off-by: Lizhi Hou <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Lizhi Hou <[email protected]> Date: Thu Feb 19 13:19:46 2026 -0800 accel/amdxdna: Validate command buffer payload count [ Upstream commit 901ec3470994006bc8dd02399e16b675566c3416 ] The count field in the command header is used to determine the valid payload size. Verify that the valid payload does not exceed the remaining buffer space. Fixes: aac243092b70 ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <[email protected]> Signed-off-by: Lizhi Hou <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Quentin Schulz <[email protected]> Date: Mon Dec 15 17:36:14 2025 +0100 accel/rocket: fix unwinding in error path in rocket_core_init [ Upstream commit f509a081f6a289f7c66856333b3becce7a33c97e ] When rocket_job_init() is called, iommu_group_get() has already been called, therefore we should call iommu_group_put() and make the iommu_group pointer NULL. This aligns with what's done in rocket_core_fini(). If pm_runtime_resume_and_get() somehow fails, not only should rocket_job_fini() be called but we should also unwind everything done before that, that is, disable PM, put the iommu_group, NULLify it and then call rocket_job_fini(). This is exactly what's done in rocket_core_fini() so let's call that function instead of duplicating the code. Fixes: 0810d5ad88a1 ("accel/rocket: Add job submission IOCTL") Cc: [email protected] Signed-off-by: Quentin Schulz <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]> Signed-off-by: Tomeu Vizoso <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Quentin Schulz <[email protected]> Date: Mon Dec 15 17:36:15 2025 +0100 accel/rocket: fix unwinding in error path in rocket_probe [ Upstream commit 34f4495a7f72895776b81969639f527c99eb12b9 ] When rocket_core_init() fails (as could be the case with EPROBE_DEFER), we need to properly unwind by decrementing the counter we just incremented and if this is the first core we failed to probe, remove the rocket DRM device with rocket_device_fini() as well. This matches the logic in rocket_remove(). Failing to properly unwind results in out-of-bounds accesses. Fixes: 0810d5ad88a1 ("accel/rocket: Add job submission IOCTL") Cc: [email protected] Signed-off-by: Quentin Schulz <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]> Signed-off-by: Tomeu Vizoso <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Fabio M. De Francesco <[email protected]> Date: Wed Jan 14 11:14:23 2026 +0100 ACPI: APEI: GHES: Add helper for CPER CXL protocol errors checks [ Upstream commit 70205869686212eb8e4cddf02bf87fd5fd597bc2 ] Move the CPER CXL protocol errors validity check out of cxl_cper_post_prot_err() to new cxl_cper_sec_prot_err_valid() and limit the serial number check only to CXL agents that are CXL devices (UEFI v2.10, Appendix N.2.13). Export the new symbol for reuse by ELOG. Reviewed-by: Dave Jiang <[email protected]> Reviewed-by: Hanjun Guo <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Signed-off-by: Fabio M. De Francesco <[email protected]> [ rjw: Subject tweak ] Link: https://patch.msgid.link/[email protected] Signed-off-by: Rafael J. Wysocki <[email protected]> Stable-dep-of: b584bfbd7ec4 ("ACPI: APEI: GHES: Disable KASAN instrumentation when compile testing with clang < 18") Signed-off-by: Sasha Levin <[email protected]>
Author: Nathan Chancellor <[email protected]> Date: Wed Jan 14 16:27:11 2026 -0700 ACPI: APEI: GHES: Disable KASAN instrumentation when compile testing with clang < 18 [ Upstream commit b584bfbd7ec417f257f651cc00a90c66e31dfbf1 ] After a recent innocuous change to drivers/acpi/apei/ghes.c, building ARCH=arm64 allmodconfig with clang-17 or older (which has both CONFIG_KASAN=y and CONFIG_WERROR=y) fails with: drivers/acpi/apei/ghes.c:902:13: error: stack frame size (2768) exceeds limit (2048) in 'ghes_do_proc' [-Werror,-Wframe-larger-than] 902 | static void ghes_do_proc(struct ghes *ghes, | ^ A KASAN pass that removes unneeded stack instrumentation, enabled by default in clang-18 [1], drastically improves stack usage in this case. To avoid the warning in the common allmodconfig case when it can break the build, disable KASAN for ghes.o when compile testing with clang-17 and older. Disabling KASAN outright may hide legitimate runtime issues, so live with the warning in that case; the user can either increase the frame warning limit or disable -Werror, which they should probably do when debugging with KASAN anyways. Closes: https://github.com/ClangBuiltLinux/linux/issues/2148 Link: https://github.com/llvm/llvm-project/commit/51fbab134560ece663517bf1e8c2a30300d08f1a [1] Signed-off-by: Nathan Chancellor <[email protected]> Cc: All applicable <[email protected]> Link: https://patch.msgid.link/20260114-ghes-avoid-wflt-clang-older-than-18-v1-1-9c8248bfe4f4@kernel.org Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Rong Zhang <[email protected]> Date: Tue Mar 3 01:32:59 2026 +0800 ALSA: doc: usb-audio: Add doc for QUIRK_FLAG_SKIP_IFACE_SETUP commit 93992667d0ab695ac30ceec91a516fd4bf725d75 upstream. QUIRK_FLAG_SKIP_IFACE_SETUP was introduced into usb-audio before without appropriate documentation, so add it. Fixes: 38c322068a26 ("ALSA: usb-audio: Add QUIRK_FLAG_SKIP_IFACE_SETUP") Cc: [email protected] Signed-off-by: Rong Zhang <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Panagiotis Foliadis <[email protected]> Date: Wed Feb 25 14:53:43 2026 +0000 ALSA: hda/intel: increase default bdl_pos_adj for Nvidia controllers commit e9fb2028f1eb563e653cff3b0d1c87c5e0203d45 upstream. The default bdl_pos_adj of 32 for Nvidia HDA controllers is insufficient on GA102 (and likely other recent Nvidia GPUs) after S3 suspend/resume. The controller's DMA timing degrades after resume, causing premature IRQ detection in azx_position_ok() which results in silent HDMI/DP audio output despite userspace reporting a valid playback state and correct ELD data. Increase bdl_pos_adj to 64 for AZX_DRIVER_NVIDIA, matching the value already used by Intel Apollo Lake for the same class of timing issue. Cc: [email protected] Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221069 Suggested-by: Charalampos Mitrodimas <[email protected]> Signed-off-by: Panagiotis Foliadis <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Panagiotis Foliadis <[email protected]> Date: Sat Feb 21 19:40:58 2026 +0000 ALSA: hda/realtek: Add quirk for Acer Aspire V3-572G commit cbddd303416456db5ceeedaf9e262096f079e861 upstream. The Acer Aspire V3-572G has a combo jack (ALC283) but the BIOS sets pin 0x19 to 0x411111f0 (not connected), so the headset mic is not detected. Add a quirk to override pin 0x19 as a headset mic and enable headset mode. Cc: [email protected] Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221075 Suggested-by: Charalampos Mitrodimas <[email protected]> Signed-off-by: Panagiotis Foliadis <[email protected]> Reviewed-by: Charalampos Mitrodimas <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Zhang Heng <[email protected]> Date: Fri Feb 27 20:13:27 2026 +0800 ALSA: hda/realtek: Add quirk for HP Pavilion 15-eh1xxx to enable mute LED commit 068641bc9dc3d680d1ec4f6ee9199d4812041dff upstream. The HP Pavilion 15-eh1xxx series uses the HP mainboard 88D1 with ALC245 and needs the ALC245_FIXUP_HP_MUTE_LED_V1_COEFBIT quirk to make the mute led working. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215978 Cc: <[email protected]> Signed-off-by: Zhang Heng <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Juhyung Park <[email protected]> Date: Sun Feb 22 21:26:09 2026 +0900 ALSA: hda/realtek: add quirk for Samsung Galaxy Book Flex (NT950QCT-A38A) commit 9fb16a5c5ff93058851099a2b80a899b0c53fe3f upstream. Similar to other Samsung laptops, NT950QCT also requires the ALC298_FIXUP_SAMSUNG_AMP quirk applied. Cc: <[email protected]> Signed-off-by: Juhyung Park <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Juhyung Park <[email protected]> Date: Sun Feb 22 21:26:08 2026 +0900 ALSA: hda/realtek: fix model name typo for Samsung Galaxy Book Flex (NT950QCG-X716) commit 43a44fb7f2fa163926b23149805e989ba2395db1 upstream. There's no product named "Samsung Galaxy Flex Book". Use the correct "Samsung Galaxy Book Flex" name. Link: https://www.samsung.com/sec/support/model/NT950QCG-X716 Link: https://www.samsung.com/us/computing/galaxy-books/galaxy-book-flex/galaxy-book-flex-15-6-qled-512gb-storage-s-pen-included-np950qcg-k01us Cc: <[email protected]> Signed-off-by: Juhyung Park <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Richard Fitzgerald <[email protected]> Date: Thu Feb 26 11:17:28 2026 +0000 ALSA: hda: cs35l56: Fix signedness error in cs35l56_hda_posture_put() [ Upstream commit 003ce8c9b2ca28fbb4860651e76fb1c9a91f2ea1 ] In cs35l56_hda_posture_put() assign ucontrol->value.integer.value[0] to a long instead of an unsigned long. ucontrol->value.integer.value[0] is a long. This fixes the sparse warning: sound/hda/codecs/side-codecs/cs35l56_hda.c:256:20: warning: unsigned value that used to be signed checked against zero? sound/hda/codecs/side-codecs/cs35l56_hda.c:252:29: signed value source Signed-off-by: Richard Fitzgerald <[email protected]> Fixes: 73cfbfa9caea8 ("ALSA: hda/cs35l56: Add driver for Cirrus Logic CS35L56 amplifier") Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Geoffrey D. Bennett <[email protected]> Date: Fri Feb 20 21:58:48 2026 +1030 ALSA: scarlett2: Fix DSP filter control array handling [ Upstream commit 1d241483368f2fd87fbaba64d6aec6bad3a1e12e ] scarlett2_add_dsp_ctls() was incorrectly storing the precomp and PEQ filter coefficient control pointers into the precomp_flt_switch_ctls and peq_flt_switch_ctls arrays instead of the intended targets precomp_flt_ctls and peq_flt_ctls. Pass NULL instead, as the filter coefficient control pointers are not used, and remove the unused precomp_flt_ctls and peq_flt_ctls arrays from struct scarlett2_data. Additionally, scarlett2_update_filter_values() was reading dsp_input_count * peq_flt_count values for SCARLETT2_CONFIG_PEQ_FLT_SWITCH, but the peq_flt_switch array is indexed only by dsp_input_count (one switch per DSP input, not per filter). Fix the read count. Fixes: b64678eb4e70 ("ALSA: scarlett2: Add DSP controls") Signed-off-by: Geoffrey D. Bennett <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Geoffrey D. Bennett <[email protected]> Date: Sat Feb 21 02:36:35 2026 +1030 ALSA: usb-audio: Add QUIRK_FLAG_SKIP_IFACE_SETUP [ Upstream commit 38c322068a26a01d7ff64da92179e68cdde9860b ] Add a quirk flag to skip the usb_set_interface(), snd_usb_init_pitch(), and snd_usb_init_sample_rate() calls in __snd_usb_parse_audio_interface(). These are redundant with snd_usb_endpoint_prepare() at stream-open time. Enable the quirk for Focusrite devices, as init_sample_rate(rate_max) sets 192kHz during probing, which disables the internal mixer and Air and Safe modes. Fixes: 16f1f838442d ("Revert "ALSA: usb-audio: Drop superfluous interface setup at parsing"") Signed-off-by: Geoffrey D. Bennett <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Takashi Iwai <[email protected]> Date: Wed Feb 25 09:52:28 2026 +0100 ALSA: usb-audio: Cap the packet size pre-calculations [ Upstream commit 7fe8dec3f628e9779f1631576f8e693370050348 ] We calculate the possible packet sizes beforehand for adaptive and synchronous endpoints, but we didn't take care of the max frame size for those pre-calculated values. When a device or a bus limits the packet size, a high sample rate or a high number of channels may lead to the packet sizes that are larger than the given limit, which results in an error from the USB core at submitting URBs. As a simple workaround, just add the sanity checks of pre-calculated packet sizes to have the upper boundary of ep->maxframesize. Fixes: f0bd62b64016 ("ALSA: usb-audio: Improve frames size computation") Link: https://bugzilla.kernel.org/show_bug.cgi?id=221076 Signed-off-by: Takashi Iwai <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Geoffrey D. Bennett <[email protected]> Date: Sat Feb 21 02:34:48 2026 +1030 ALSA: usb-audio: Remove VALIDATE_RATES quirk for Focusrite devices [ Upstream commit a8cc55bf81a45772cad44c83ea7bb0e98431094a ] Remove QUIRK_FLAG_VALIDATE_RATES for Focusrite. With the previous commit, focusrite_valid_sample_rate() produces correct rate tables without USB probing. QUIRK_FLAG_VALIDATE_RATES sends SET_CUR requests for each rate (~25ms each) and leaves the device at 192kHz. This is a problem because that rate: 1) disables the internal mixer, so outputs are silent until an application opens the PCM and sets a lower rate, and 2) the Air and Safe modes get disabled. Fixes: 5963e5262180 ("ALSA: usb-audio: Enable rate validation for Scarlett devices") Signed-off-by: Geoffrey D. Bennett <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Jun Seo <[email protected]> Date: Thu Feb 26 10:08:20 2026 +0900 ALSA: usb-audio: Use correct version for UAC3 header validation commit 54f9d645a5453d0bfece0c465d34aaf072ea99fa upstream. The entry of the validators table for UAC3 AC header descriptor is defined with the wrong protocol version UAC_VERSION_2, while it should have been UAC_VERSION_3. This results in the validator never matching for actual UAC3 devices (protocol == UAC_VERSION_3), causing their header descriptors to bypass validation entirely. A malicious USB device presenting a truncated UAC3 header could exploit this to cause out-of-bounds reads when the driver later accesses unvalidated descriptor fields. The bug was introduced in the same commit as the recently fixed UAC3 feature unit sub-type typo, and appears to be from the same copy-paste error when the UAC3 section was created from the UAC2 section. Fixes: 57f8770620e9 ("ALSA: usb-audio: More validations of descriptor units") Cc: <[email protected]> Signed-off-by: Jun Seo <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Takashi Iwai <[email protected]> Date: Wed Feb 25 09:52:31 2026 +0100 ALSA: usb-audio: Use inclusive terms [ Upstream commit 4e9113c533acee2ba1f72fd68ee6ecd36b64484e ] Replace the remaining with inclusive terms; it's only this function name we overlooked at the previous conversion. Fixes: 53837b4ac2bd ("ALSA: usb-audio: Replace slave/master terms") Signed-off-by: Takashi Iwai <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Takashi Iwai <[email protected]> Date: Thu Feb 26 16:43:49 2026 +0100 ALSA: usb: qcom: Correct parameter comment for uaudio_transfer_buffer_setup() [ Upstream commit 1d6452a0ce78cd3f4e48943b5ba21d273a658298 ] At fixing the memory leak of xfer buffer, we forgot to update the corresponding comment, too. This resulted in a kernel-doc warning with W=1. Let's correct it. Fixes: 5c7ef5001292 ("ALSA: qc_audio_offload: avoid leaking xfer_buf allocation") Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Raju Rangoju <[email protected]> Date: Thu Feb 26 22:37:53 2026 +0530 amd-xgbe: fix MAC_TCR_SS register width for 2.5G and 10M speeds [ Upstream commit 9439a661c2e80485406ce2c90b107ca17858382d ] Extend the MAC_TCR_SS (Speed Select) register field width from 2 bits to 3 bits to properly support all speed settings. The MAC_TCR register's SS field encoding requires 3 bits to represent all supported speeds: - 0x00: 10Gbps (XGMII) - 0x02: 2.5Gbps (GMII) / 100Mbps - 0x03: 1Gbps / 10Mbps - 0x06: 2.5Gbps (XGMII) - P100a only With only 2 bits, values 0x04-0x07 cannot be represented, which breaks 2.5G XGMII mode on newer platforms and causes incorrect speed select values to be programmed. Fixes: 07445f3c7ca1 ("amd-xgbe: Add support for 10 Mbps speed") Co-developed-by: Guruvendra Punugupati <[email protected]> Signed-off-by: Guruvendra Punugupati <[email protected]> Signed-off-by: Raju Rangoju <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Raju Rangoju <[email protected]> Date: Mon Mar 2 09:51:24 2026 +0530 amd-xgbe: fix sleep while atomic on suspend/resume [ Upstream commit e2f27363aa6d983504c6836dd0975535e2e9dba0 ] The xgbe_powerdown() and xgbe_powerup() functions use spinlocks (spin_lock_irqsave) while calling functions that may sleep: - napi_disable() can sleep waiting for NAPI polling to complete - flush_workqueue() can sleep waiting for pending work items This causes a "BUG: scheduling while atomic" error during suspend/resume cycles on systems using the AMD XGBE Ethernet controller. The spinlock protection in these functions is unnecessary as these functions are called from suspend/resume paths which are already serialized by the PM core Fix this by removing the spinlock. Since only code that takes this lock is xgbe_powerdown() and xgbe_powerup(), remove it completely. Fixes: c5aa9e3b8156 ("amd-xgbe: Initial AMD 10GbE platform driver") Signed-off-by: Raju Rangoju <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Shawn Lin <[email protected]> Date: Mon Jan 5 16:15:28 2026 +0800 arm64: dts: rockchip: Fix rk356x PCIe range mappings [ Upstream commit f63ea193a404481f080ca2958f73e9f364682db9 ] The pcie bus address should be mapped 1:1 to the cpu side MMIO address, so that there is no same address allocated from normal system memory. Otherwise it's broken if the same address assigned to the EP for DMA purpose.Fix it to sync with the vendor BSP. Fixes: 568a67e742df ("arm64: dts: rockchip: Fix rk356x PCIe register and range mappings") Fixes: 66b51ea7d70f ("arm64: dts: rockchip: Add rk3568 PCIe2x1 controller") Cc: [email protected] Cc: Andrew Powers-Holmes <[email protected]> Signed-off-by: Shawn Lin <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Heiko Stuebner <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Shawn Lin <[email protected]> Date: Mon Jan 5 16:15:29 2026 +0800 arm64: dts: rockchip: Fix rk3588 PCIe range mappings [ Upstream commit 46c56b737161060dfa468f25ae699749047902a2 ] The pcie bus address should be mapped 1:1 to the cpu side MMIO address, so that there is no same address allocated from normal system memory. Otherwise it's broken if the same address assigned to the EP for DMA purpose.Fix it to sync with the vendor BSP. Fixes: 0acf4fa7f187 ("arm64: dts: rockchip: add PCIe3 support for rk3588") Fixes: 8d81b77f4c49 ("arm64: dts: rockchip: add rk3588 PCIe2 support") Cc: [email protected] Cc: Sebastian Reichel <[email protected]> Signed-off-by: Shawn Lin <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Heiko Stuebner <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Catalin Marinas <[email protected]> Date: Mon Feb 23 17:45:30 2026 +0000 arm64: gcs: Do not set PTE_SHARED on GCS mappings if FEAT_LPA2 is enabled commit 8a85b3131225a8c8143ba2ae29c0eef8c1f9117f upstream. When FEAT_LPA2 is enabled, bits 8-9 of the PTE replace the shareability attribute with bits 50-51 of the output address. The _PAGE_GCS{,_RO} definitions include the PTE_SHARED bits as 0b11 (this matches the other _PAGE_* definitions) but using this macro directly leads to the following panic when enabling GCS on a system/model with LPA2: Unable to handle kernel paging request at virtual address fffff1ffc32d8008 Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 swapper pgtable: 4k pages, 52-bit VAs, pgdp=0000000060f4d000 [fffff1ffc32d8008] pgd=100000006184b003, p4d=0000000000000000 Internal error: Oops: 0000000096000004 [#1] SMP CPU: 0 UID: 0 PID: 513 Comm: gcs_write_fault Tainted: G M 7.0.0-rc1 #1 PREEMPT Tainted: [M]=MACHINE_CHECK Hardware name: QEMU QEMU Virtual Machine, BIOS 2025.02-8+deb13u1 11/08/2025 pstate: 03402005 (nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) pc : zap_huge_pmd+0x168/0x468 lr : zap_huge_pmd+0x2c/0x468 sp : ffff800080beb660 x29: ffff800080beb660 x28: fff00000c2058180 x27: ffff800080beb898 x26: fff00000c2058180 x25: ffff800080beb820 x24: 00c800010b600f41 x23: ffffc1ffc30af1a8 x22: fff00000c2058180 x21: 0000ffff8dc00000 x20: fff00000c2bc6370 x19: ffff800080beb898 x18: ffff800080bebb60 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000007 x14: 000000000000000a x13: 0000aaaacbbbffff x12: 0000000000000000 x11: 0000ffff8ddfffff x10: 00000000000001fe x9 : 0000ffff8ddfffff x8 : 0000ffff8de00000 x7 : 0000ffff8da00000 x6 : fff00000c2bc6370 x5 : 0000ffff8da00000 x4 : 000000010b600000 x3 : ffffc1ffc0000000 x2 : fff00000c2058180 x1 : fffff1ffc32d8000 x0 : 000000c00010b600 Call trace: zap_huge_pmd+0x168/0x468 (P) unmap_page_range+0xd70/0x1560 unmap_single_vma+0x48/0x80 unmap_vmas+0x90/0x180 unmap_region+0x88/0xe4 vms_complete_munmap_vmas+0xf8/0x1e0 do_vmi_align_munmap+0x158/0x180 do_vmi_munmap+0xac/0x160 __vm_munmap+0xb0/0x138 vm_munmap+0x14/0x20 gcs_free+0x70/0x80 mm_release+0x1c/0xc8 exit_mm_release+0x28/0x38 do_exit+0x190/0x8ec do_group_exit+0x34/0x90 get_signal+0x794/0x858 arch_do_signal_or_restart+0x11c/0x3e0 exit_to_user_mode_loop+0x10c/0x17c el0_da+0x8c/0x9c el0t_64_sync_handler+0xd0/0xf0 el0t_64_sync+0x198/0x19c Code: aa1603e2 d34cfc00 cb813001 8b011861 (f9400420) Similarly to how the kernel handles protection_map[], use a gcs_page_prot variable to store the protection bits and clear PTE_SHARED if LPA2 is enabled. Also remove the unused PAGE_GCS{,_RO} macros. Signed-off-by: Catalin Marinas <[email protected]> Fixes: 6497b66ba694 ("arm64/mm: Map pages for guarded control stack") Reported-by: Emanuele Rocca <[email protected]> Cc: [email protected] Cc: Mark Brown <[email protected]> Cc: Will Deacon <[email protected]> Reviewed-by: David Hildenbrand (Arm) <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Will Deacon <[email protected]> Date: Mon Feb 23 22:10:11 2026 +0000 arm64: io: Extract user memory type in ioremap_prot() [ Upstream commit 8f098037139b294050053123ab2bc0f819d08932 ] The only caller of ioremap_prot() outside of the generic ioremap() implementation is generic_access_phys(), which passes a 'pgprot_t' value determined from the user mapping of the target 'pfn' being accessed by the kernel. On arm64, the 'pgprot_t' contains all of the non-address bits from the pte, including the permission controls, and so we end up returning a new user mapping from ioremap_prot() which faults when accessed from the kernel on systems with PAN: | Unable to handle kernel read from unreadable memory at virtual address ffff80008ea89000 | ... | Call trace: | __memcpy_fromio+0x80/0xf8 | generic_access_phys+0x20c/0x2b8 | __access_remote_vm+0x46c/0x5b8 | access_remote_vm+0x18/0x30 | environ_read+0x238/0x3e8 | vfs_read+0xe4/0x2b0 | ksys_read+0xcc/0x178 | __arm64_sys_read+0x4c/0x68 Extract only the memory type from the user 'pgprot_t' in ioremap_prot() and assert that we're being passed a user mapping, to protect us against any changes in future that may require additional handling. To avoid falsely flagging users of ioremap(), provide our own ioremap() macro which simply wraps __ioremap_prot(). Cc: Zeng Heng <[email protected]> Cc: Jinjiang Tu <[email protected]> Cc: Catalin Marinas <[email protected]> Fixes: 893dea9ccd08 ("arm64: Add HAVE_IOREMAP_PROT support") Reported-by: Jinjiang Tu <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Will Deacon <[email protected]> Date: Mon Feb 23 22:10:10 2026 +0000 arm64: io: Rename ioremap_prot() to __ioremap_prot() [ Upstream commit f6bf47ab32e0863df50f5501d207dcdddb7fc507 ] Rename our ioremap_prot() implementation to __ioremap_prot() and convert all arch-internal callers over to the new function. ioremap_prot() remains as a #define to __ioremap_prot() for generic_access_phys() and will be subsequently extended to handle user permissions in 'prot'. Cc: Zeng Heng <[email protected]> Cc: Jinjiang Tu <[email protected]> Cc: Catalin Marinas <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]> Stable-dep-of: 8f098037139b ("arm64: io: Extract user memory type in ioremap_prot()") Signed-off-by: Sasha Levin <[email protected]>
Author: Thomas Weißschuh <[email protected]> Date: Fri Feb 13 08:39:29 2026 +0100 ARM: clean up the memset64() C wrapper commit b52343d1cb47bb27ca32a3f4952cc2fd3cd165bf upstream. The current logic to split the 64-bit argument into its 32-bit halves is byte-order specific and a bit clunky. Use a union instead which is easier to read and works in all cases. GCC still generates the same machine code. While at it, rename the arguments of the __memset64() prototype to actually reflect their semantics. Signed-off-by: Thomas Weißschuh <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Reported-by: Ben Hutchings <[email protected]> # for -stable Link: https://lore.kernel.org/all/[email protected]/ Suggested-by: https://lore.kernel.org/all/[email protected]/ # for -stable Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Bence Csókás <[email protected]> Date: Thu Aug 14 09:47:44 2025 +0200 ARM: dts: imx53-usbarmory: Replace license text comment with SPDX identifier [ Upstream commit faa6baa36497958dd8fd5561daa37249779446d7 ] Replace verbatim license text with a `SPDX-License-Identifier`. The comment header mis-attributes this license to be "X11", but the license text does not include the last line "Except as contained in this notice, the name of the X Consortium shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Software without prior written authorization from the X Consortium.". Therefore, this license is actually equivalent to the SPDX "MIT" license (confirmed by text diffing). Cc: Andrej Rosano <[email protected]> Signed-off-by: Bence Csókás <[email protected]> Acked-by: Andrej Rosano <[email protected]> Signed-off-by: Shawn Guo <[email protected]> Stable-dep-of: 43d67ec26b32 ("PCI: dwc: ep: Fix resizable BAR support for multi-PF configurations") Signed-off-by: Sasha Levin <[email protected]>
Author: Alexander Stein <[email protected]> Date: Tue Dec 16 09:49:30 2025 +0100 ASoC: fsl_xcvr: provide regmap names commit 08fd332eeb88515af4f1892d91f6ef4ea7558b71 upstream. This driver uses multiple regmaps, which will causes name conflicts in debugfs like: debugfs: '30cc0000.xcvr' already exists in 'regmap' Fix this by adding a name for the non-core regmap configurations. Signed-off-by: Alexander Stein <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Fabio Estevam <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Alexander Stein <[email protected]> Date: Tue Nov 25 11:13:33 2025 +0100 ASoC: fsl_xcvr: use dev_err_probe() replacing dev_err() + return commit 8ae28d04593a5fdddb16d3edcdabb8d1e4330d0b upstream. Use dev_err_probe() to simplify the code. This also silences -517 errors. Signed-off-by: Alexander Stein <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Fabio Estevam <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Takashi Iwai <[email protected]> Date: Thu Feb 26 16:47:52 2026 +0100 ASoC: SDCA: Fix comments for sdca_irq_request() [ Upstream commit 71c1978ab6d2c6d48c31311855f1a85377c152ae ] The kernel-doc comments for sdca_irq_request() contained some typos that lead to build warnings with W=1. Let's correct them. Fixes: b126394d9ec6 ("ASoC: SDCA: Generic interrupt support") Acked-by: Mark Brown <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Guenter Roeck <[email protected]> Date: Thu Mar 5 18:48:05 2026 -0800 ata: libata-eh: Fix detection of deferred qc timeouts [ Upstream commit ee0e6e69a772d601e152e5368a1da25d656122a8 ] If the ata_qc_for_each_raw() loop finishes without finding a matching SCSI command for any QC, the variable qc will hold a pointer to the last element examined, which has the tag i == ATA_MAX_QUEUE - 1. This qc can match the port deferred QC (ap->deferred_qc). If that happens, the condition qc == ap->deferred_qc evaluates to true despite the loop not breaking with a match on the SCSI command for this QC. In that case, the error handler mistakenly intercepts a command that has not been issued yet and that has not timed out, and thus erroneously returning a timeout error. Fix the problem by checking for i < ATA_MAX_QUEUE in addition to qc == ap->deferred_qc. The problem was found by an experimental code review agent based on gemini-3.1-pro while reviewing backports into v6.18.y. Assisted-by: Gemini:gemini-3.1-pro Fixes: eddb98ad9364 ("ata: libata-eh: correctly handle deferred qc timeouts") Signed-off-by: Guenter Roeck <[email protected]> [cassel: modified commit log as suggested by Damien] Reviewed-by: Damien Le Moal <[email protected]> Signed-off-by: Niklas Cassel <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jiayuan Chen <[email protected]> Date: Wed Feb 25 20:32:40 2026 +0800 atm: lec: fix null-ptr-deref in lec_arp_clear_vccs [ Upstream commit 101bacb303e89dc2e0640ae6a5e0fb97c4eb45bb ] syzkaller reported a null-ptr-deref in lec_arp_clear_vccs(). This issue can be easily reproduced using the syzkaller reproducer. In the ATM LANE (LAN Emulation) module, the same atm_vcc can be shared by multiple lec_arp_table entries (e.g., via entry->vcc or entry->recv_vcc). When the underlying VCC is closed, lec_vcc_close() iterates over all ARP entries and calls lec_arp_clear_vccs() for each matched entry. For example, when lec_vcc_close() iterates through the hlists in priv->lec_arp_empty_ones or other ARP tables: 1. In the first iteration, for the first matched ARP entry sharing the VCC, lec_arp_clear_vccs() frees the associated vpriv (which is vcc->user_back) and sets vcc->user_back to NULL. 2. In the second iteration, for the next matched ARP entry sharing the same VCC, lec_arp_clear_vccs() is called again. It obtains a NULL vpriv from vcc->user_back (via LEC_VCC_PRIV(vcc)) and then attempts to dereference it via `vcc->pop = vpriv->old_pop`, leading to a null-ptr-deref crash. Fix this by adding a null check for vpriv before dereferencing it. If vpriv is already NULL, it means the VCC has been cleared by a previous call, so we can safely skip the cleanup and just clear the entry's vcc/recv_vcc pointers. The entire cleanup block (including vcc_release_async()) is placed inside the vpriv guard because a NULL vpriv indicates the VCC has already been fully released by a prior iteration — repeating the teardown would redundantly set flags and trigger callbacks on an already-closing socket. The Fixes tag points to the initial commit because the entry->vcc path has been vulnerable since the original code. The entry->recv_vcc path was later added by commit 8d9f73c0ad2f ("atm: fix a memory leak of vcc->user_back") with the same pattern, and both paths are fixed here. Reported-by: [email protected] Closes: https://lore.kernel.org/all/[email protected]/T/ Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Suggested-by: Dan Carpenter <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jiayuan Chen <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Ming Lei <[email protected]> Date: Thu Mar 5 11:15:50 2026 +0800 block: use trylock to avoid lockdep circular dependency in sysfs [ Upstream commit ce8ee8583ed83122405eabaa8fb351be4d9dc65c ] Use trylock instead of blocking lock acquisition for update_nr_hwq_lock in queue_requests_store() and elv_iosched_store() to avoid circular lock dependency with kernfs active reference during concurrent disk deletion: update_nr_hwq_lock -> kn->active (via del_gendisk -> kobject_del) kn->active -> update_nr_hwq_lock (via sysfs write path) Return -EBUSY when the lock is not immediately available. Reported-and-tested-by: Yi Zhang <[email protected]> Closes: https://lore.kernel.org/linux-block/CAHj4cs-em-4acsHabMdT=jJhXkCzjnprD-aQH1OgrZo4nTnmMw@mail.gmail.com/ Fixes: 626ff4f8ebcb ("blk-mq: convert to serialize updating nr_requests with update_nr_hwq_lock") Signed-off-by: Ming Lei <[email protected]> Tested-by: Yi Zhang <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Mariusz Skamra <[email protected]> Date: Thu Feb 12 14:46:46 2026 +0100 Bluetooth: Fix CIS host feature condition commit 7cff9a40c6b0f72ccefdaf0ffe03cfac30348f51 upstream. This fixes the condition for sending the LE Set Host Feature command. The command is sent to indicate host support for Connected Isochronous Streams in this case. It has been observed that the system could not initialize BIS-only capable controllers because the controllers do not support the command. As per Core v6.2 | Vol 4, Part E, Table 3.1 the command shall be supported if CIS Central or CIS Peripheral is supported; otherwise, the command is optional. Fixes: 709788b154ca ("Bluetooth: hci_core: Fix using {cis,bis}_capable for current settings") Cc: [email protected] Signed-off-by: Mariusz Skamra <[email protected]> Reviewed-by: Paul Menzel <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]> [ iso_capable() => cis_capable() ] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Heitor Alves de Siqueira <[email protected]> Date: Wed Feb 11 15:03:35 2026 -0300 Bluetooth: purge error queues in socket destructors commit 21e4271e65094172aadd5beb8caea95dd0fbf6d7 upstream. When TX timestamping is enabled via SO_TIMESTAMPING, SKBs may be queued into sk_error_queue and will stay there until consumed. If userspace never gets to read the timestamps, or if the controller is removed unexpectedly, these SKBs will leak. Fix by adding skb_queue_purge() calls for sk_error_queue in affected bluetooth destructors. RFCOMM does not currently use sk_error_queue. Fixes: 134f4b39df7b ("Bluetooth: add support for skb TX SND/COMPLETION timestamping") Reported-by: [email protected] Closes: https://syzbot.org/bug?extid=7ff4013eabad1407b70a Cc: [email protected] Signed-off-by: Heitor Alves de Siqueira <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Fuad Tabba <[email protected]> Date: Thu Feb 26 07:55:25 2026 +0000 bpf, arm64: Force 8-byte alignment for JIT buffer to prevent atomic tearing [ Upstream commit ef06fd16d48704eac868441d98d4ef083d8f3d07 ] struct bpf_plt contains a u64 target field. Currently, the BPF JIT allocator requests an alignment of 4 bytes (sizeof(u32)) for the JIT buffer. Because the base address of the JIT buffer can be 4-byte aligned (e.g., ending in 0x4 or 0xc), the relative padding logic in build_plt() fails to ensure that target lands on an 8-byte boundary. This leads to two issues: 1. UBSAN reports misaligned-access warnings when dereferencing the structure. 2. More critically, target is updated concurrently via WRITE_ONCE() in bpf_arch_text_poke() while the JIT'd code executes ldr. On arm64, 64-bit loads/stores are only guaranteed to be single-copy atomic if they are 64-bit aligned. A misaligned target risks a torn read, causing the JIT to jump to a corrupted address. Fix this by increasing the allocation alignment requirement to 8 bytes (sizeof(u64)) in bpf_jit_binary_pack_alloc(). This anchors the base of the JIT buffer to an 8-byte boundary, allowing the relative padding math in build_plt() to correctly align the target field. Fixes: b2ad54e1533e ("bpf, arm64: Implement bpf_arch_text_poke() for arm64") Signed-off-by: Fuad Tabba <[email protected]> Acked-by: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jiayuan Chen <[email protected]> Date: Thu Feb 26 16:03:01 2026 +0800 bpf/bonding: reject vlan+srcmac xmit_hash_policy change when XDP is loaded [ Upstream commit 479d589b40b836442bbdadc3fdb37f001bb67f26 ] bond_option_mode_set() already rejects mode changes that would make a loaded XDP program incompatible via bond_xdp_check(). However, bond_option_xmit_hash_policy_set() has no such guard. For 802.3ad and balance-xor modes, bond_xdp_check() returns false when xmit_hash_policy is vlan+srcmac, because the 802.1q payload is usually absent due to hardware offload. This means a user can: 1. Attach a native XDP program to a bond in 802.3ad/balance-xor mode with a compatible xmit_hash_policy (e.g. layer2+3). 2. Change xmit_hash_policy to vlan+srcmac while XDP remains loaded. This leaves bond->xdp_prog set but bond_xdp_check() now returning false for the same device. When the bond is later destroyed, dev_xdp_uninstall() calls bond_xdp_set(dev, NULL, NULL) to remove the program, which hits the bond_xdp_check() guard and returns -EOPNOTSUPP, triggering: WARN_ON(dev_xdp_install(dev, mode, bpf_op, NULL, 0, NULL)) Fix this by rejecting xmit_hash_policy changes to vlan+srcmac when an XDP program is loaded on a bond in 802.3ad or balance-xor mode. commit 39a0876d595b ("net, bonding: Disallow vlan+srcmac with XDP") introduced bond_xdp_check() which returns false for 802.3ad/balance-xor modes when xmit_hash_policy is vlan+srcmac. The check was wired into bond_xdp_set() to reject XDP attachment with an incompatible policy, but the symmetric path -- preventing xmit_hash_policy from being changed to an incompatible value after XDP is already loaded -- was left unguarded in bond_option_xmit_hash_policy_set(). Note: commit 094ee6017ea0 ("bonding: check xdp prog when set bond mode") later added a similar guard to bond_option_mode_set(), but bond_option_xmit_hash_policy_set() remained unprotected. Reported-by: [email protected] Closes: https://lore.kernel.org/all/[email protected]/T/ Fixes: 39a0876d595b ("net, bonding: Disallow vlan+srcmac with XDP") Signed-off-by: Jiayuan Chen <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Tianci Cao <[email protected]> Date: Wed Feb 4 19:15:02 2026 +0800 bpf: Add bitwise tracking for BPF_END [ Upstream commit 9d21199842247ab05c675fb9b6c6ca393a5c0024 ] This patch implements bitwise tracking (tnum analysis) for BPF_END (byte swap) operation. Currently, the BPF verifier does not track value for BPF_END operation, treating the result as completely unknown. This limits the verifier's ability to prove safety of programs that perform endianness conversions, which are common in networking code. For example, the following code pattern for port number validation: int test(struct pt_regs *ctx) { __u64 x = bpf_get_prandom_u32(); x &= 0x3f00; // Range: [0, 0x3f00], var_off: (0x0; 0x3f00) x = bswap16(x); // Should swap to range [0, 0x3f], var_off: (0x0; 0x3f) if (x > 0x3f) goto trap; return 0; trap: return *(u64 *)NULL; // Should be unreachable } Currently generates verifier output: 1: (54) w0 &= 16128 ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=16128,var_off=(0x0; 0x3f00)) 2: (d7) r0 = bswap16 r0 ; R0=scalar() 3: (25) if r0 > 0x3f goto pc+2 ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=63,var_off=(0x0; 0x3f)) Without this patch, even though the verifier knows `x` has certain bits set, after bswap16, it loses all tracking information and treats port as having a completely unknown value [0, 65535]. According to the BPF instruction set[1], there are 3 kinds of BPF_END: 1. `bswap(16|32|64)`: opcode=0xd7 (BPF_END | BPF_ALU64 | BPF_TO_LE) - do unconditional swap 2. `le(16|32|64)`: opcode=0xd4 (BPF_END | BPF_ALU | BPF_TO_LE) - on big-endian: do swap - on little-endian: truncation (16/32-bit) or no-op (64-bit) 3. `be(16|32|64)`: opcode=0xdc (BPF_END | BPF_ALU | BPF_TO_BE) - on little-endian: do swap - on big-endian: truncation (16/32-bit) or no-op (64-bit) Since BPF_END operations are inherently bit-wise permutations, tnum (bitwise tracking) offers the most efficient and precise mechanism for value analysis. By implementing `tnum_bswap16`, `tnum_bswap32`, and `tnum_bswap64`, we can derive exact `var_off` values concisely, directly reflecting the bit-level changes. Here is the overview of changes: 1. In `tnum_bswap(16|32|64)` (kernel/bpf/tnum.c): Call `swab(16|32|64)` function on the value and mask of `var_off`, and do truncation for 16/32-bit cases. 2. In `adjust_scalar_min_max_vals` (kernel/bpf/verifier.c): Call helper function `scalar_byte_swap`. - Only do byte swap when * alu64 (unconditional swap) OR * switching between big-endian and little-endian machines. - If need do byte swap: * Firstly call `tnum_bswap(16|32|64)` to update `var_off`. * Then reset the bound since byte swap scrambles the range. - For 16/32-bit cases, truncate dst register to match the swapped size. This enables better verification of networking code that frequently uses byte swaps for protocol processing, reducing false positive rejections. [1] https://www.kernel.org/doc/Documentation/bpf/standardization/instruction-set.rst Co-developed-by: Shenghao Yuan <[email protected]> Signed-off-by: Shenghao Yuan <[email protected]> Co-developed-by: Yazhou Tang <[email protected]> Signed-off-by: Yazhou Tang <[email protected]> Signed-off-by: Tianci Cao <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> Stable-dep-of: efc11a667878 ("bpf: Improve bounds when tnum has a single possible value") Signed-off-by: Sasha Levin <[email protected]>
Author: Eduard Zingerman <[email protected]> Date: Fri Mar 6 16:02:47 2026 -0800 bpf: collect only live registers in linked regs [ Upstream commit 2658a1720a1944fbaeda937000ad2b3c3dfaf1bb ] Fix an inconsistency between func_states_equal() and collect_linked_regs(): - regsafe() uses check_ids() to verify that cached and current states have identical register id mapping. - func_states_equal() calls regsafe() only for registers computed as live by compute_live_registers(). - clean_live_states() is supposed to remove dead registers from cached states, but it can skip states belonging to an iterator-based loop. - collect_linked_regs() collects all registers sharing the same id, ignoring the marks computed by compute_live_registers(). Linked registers are stored in the state's jump history. - backtrack_insn() marks all linked registers for an instruction as precise whenever one of the linked registers is precise. The above might lead to a scenario: - There is an instruction I with register rY known to be dead at I. - Instruction I is reached via two paths: first A, then B. - On path A: - There is an id link between registers rX and rY. - Checkpoint C is created at I. - Linked register set {rX, rY} is saved to the jump history. - rX is marked as precise at I, causing both rX and rY to be marked precise at C. - On path B: - There is no id link between registers rX and rY, otherwise register states are sub-states of those in C. - Because rY is dead at I, check_ids() returns true. - Current state is considered equal to checkpoint C, propagate_precision() propagates spurious precision mark for register rY along the path B. - Depending on a program, this might hit verifier_bug() in the backtrack_insn(), e.g. if rY ∈ [r1..r5] and backtrack_insn() spots a function call. The reproducer program is in the next patch. This was hit by sched_ext scx_lavd scheduler code. Changes in tests: - verifier_scalar_ids.c selftests need modification to preserve some registers as live for __msg() checks. - exceptions_assert.c adjusted to match changes in the verifier log, R0 is dead after conditional instruction and thus does not get range. - precise.c adjusted to match changes in the verifier log, register r9 is dead after comparison and it's range is not important for test. Reported-by: Emil Tsalapatis <[email protected]> Fixes: 0fb3cf6110a5 ("bpf: use register liveness information for func_states_equal") Signed-off-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/r/20260306-linked-regs-and-propagate-precision-v1-1-18e859be570d@gmail.com Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Lang Xu <[email protected]> Date: Tue Mar 3 17:52:17 2026 +0800 bpf: Fix a UAF issue in bpf_trampoline_link_cgroup_shim [ Upstream commit 56145d237385ca0e7ca9ff7b226aaf2eb8ef368b ] The root cause of this bug is that when 'bpf_link_put' reduces the refcount of 'shim_link->link.link' to zero, the resource is considered released but may still be referenced via 'tr->progs_hlist' in 'cgroup_shim_find'. The actual cleanup of 'tr->progs_hlist' in 'bpf_shim_tramp_link_release' is deferred. During this window, another process can cause a use-after-free via 'bpf_trampoline_link_cgroup_shim'. Based on Martin KaFai Lau's suggestions, I have created a simple patch. To fix this: Add an atomic non-zero check in 'bpf_trampoline_link_cgroup_shim'. Only increment the refcount if it is not already zero. Testing: I verified the fix by adding a delay in 'bpf_shim_tramp_link_release' to make the bug easier to trigger: static void bpf_shim_tramp_link_release(struct bpf_link *link) { /* ... */ if (!shim_link->trampoline) return; + msleep(100); WARN_ON_ONCE(bpf_trampoline_unlink_prog(&shim_link->link, shim_link->trampoline, NULL)); bpf_trampoline_put(shim_link->trampoline); } Before the patch, running a PoC easily reproduced the crash(almost 100%) with a call trace similar to KaiyanM's report. After the patch, the bug no longer occurs even after millions of iterations. Fixes: 69fd337a975c ("bpf: per-cgroup lsm flavor") Reported-by: Kaiyan Mei <[email protected]> Closes: https://lore.kernel.org/bpf/[email protected]/ Signed-off-by: Lang Xu <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Jiayuan Chen <[email protected]> Date: Wed Feb 25 20:14:55 2026 +0800 bpf: Fix race in cpumap on PREEMPT_RT [ Upstream commit 869c63d5975d55e97f6b168e885452b3da20ea47 ] On PREEMPT_RT kernels, the per-CPU xdp_bulk_queue (bq) can be accessed concurrently by multiple preemptible tasks on the same CPU. The original code assumes bq_enqueue() and __cpu_map_flush() run atomically with respect to each other on the same CPU, relying on local_bh_disable() to prevent preemption. However, on PREEMPT_RT, local_bh_disable() only calls migrate_disable() (when PREEMPT_RT_NEEDS_BH_LOCK is not set) and does not disable preemption, which allows CFS scheduling to preempt a task during bq_flush_to_queue(), enabling another task on the same CPU to enter bq_enqueue() and operate on the same per-CPU bq concurrently. This leads to several races: 1. Double __list_del_clearprev(): after bq->count is reset in bq_flush_to_queue(), a preempting task can call bq_enqueue() -> bq_flush_to_queue() on the same bq when bq->count reaches CPU_MAP_BULK_SIZE. Both tasks then call __list_del_clearprev() on the same bq->flush_node, the second call dereferences the prev pointer that was already set to NULL by the first. 2. bq->count and bq->q[] races: concurrent bq_enqueue() can corrupt the packet queue while bq_flush_to_queue() is processing it. The race between task A (__cpu_map_flush -> bq_flush_to_queue) and task B (bq_enqueue -> bq_flush_to_queue) on the same CPU: Task A (xdp_do_flush) Task B (cpu_map_enqueue) ---------------------- ------------------------ bq_flush_to_queue(bq) spin_lock(&q->producer_lock) /* flush bq->q[] to ptr_ring */ bq->count = 0 spin_unlock(&q->producer_lock) bq_enqueue(rcpu, xdpf) <-- CFS preempts Task A --> bq->q[bq->count++] = xdpf /* ... more enqueues until full ... */ bq_flush_to_queue(bq) spin_lock(&q->producer_lock) /* flush to ptr_ring */ spin_unlock(&q->producer_lock) __list_del_clearprev(flush_node) /* sets flush_node.prev = NULL */ <-- Task A resumes --> __list_del_clearprev(flush_node) flush_node.prev->next = ... /* prev is NULL -> kernel oops */ Fix this by adding a local_lock_t to xdp_bulk_queue and acquiring it in bq_enqueue() and __cpu_map_flush(). These paths already run under local_bh_disable(), so use local_lock_nested_bh() which on non-RT is a pure annotation with no overhead, and on PREEMPT_RT provides a per-CPU sleeping lock that serializes access to the bq. To reproduce, insert an mdelay(100) between bq->count = 0 and __list_del_clearprev() in bq_flush_to_queue(), then run reproducer provided by syzkaller. Fixes: 3253cb49cbad ("softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT") Reported-by: [email protected] Closes: https://lore.kernel.org/all/[email protected]/T/ Reviewed-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Jiayuan Chen <[email protected]> Signed-off-by: Jiayuan Chen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jiayuan Chen <[email protected]> Date: Wed Feb 25 20:14:56 2026 +0800 bpf: Fix race in devmap on PREEMPT_RT [ Upstream commit 1872e75375c40add4a35990de3be77b5741c252c ] On PREEMPT_RT kernels, the per-CPU xdp_dev_bulk_queue (bq) can be accessed concurrently by multiple preemptible tasks on the same CPU. The original code assumes bq_enqueue() and __dev_flush() run atomically with respect to each other on the same CPU, relying on local_bh_disable() to prevent preemption. However, on PREEMPT_RT, local_bh_disable() only calls migrate_disable() (when PREEMPT_RT_NEEDS_BH_LOCK is not set) and does not disable preemption, which allows CFS scheduling to preempt a task during bq_xmit_all(), enabling another task on the same CPU to enter bq_enqueue() and operate on the same per-CPU bq concurrently. This leads to several races: 1. Double-free / use-after-free on bq->q[]: bq_xmit_all() snapshots cnt = bq->count, then iterates bq->q[0..cnt-1] to transmit frames. If preempted after the snapshot, a second task can call bq_enqueue() -> bq_xmit_all() on the same bq, transmitting (and freeing) the same frames. When the first task resumes, it operates on stale pointers in bq->q[], causing use-after-free. 2. bq->count and bq->q[] corruption: concurrent bq_enqueue() modifying bq->count and bq->q[] while bq_xmit_all() is reading them. 3. dev_rx/xdp_prog teardown race: __dev_flush() clears bq->dev_rx and bq->xdp_prog after bq_xmit_all(). If preempted between bq_xmit_all() return and bq->dev_rx = NULL, a preempting bq_enqueue() sees dev_rx still set (non-NULL), skips adding bq to the flush_list, and enqueues a frame. When __dev_flush() resumes, it clears dev_rx and removes bq from the flush_list, orphaning the newly enqueued frame. 4. __list_del_clearprev() on flush_node: similar to the cpumap race, both tasks can call __list_del_clearprev() on the same flush_node, the second dereferences the prev pointer already set to NULL. The race between task A (__dev_flush -> bq_xmit_all) and task B (bq_enqueue -> bq_xmit_all) on the same CPU: Task A (xdp_do_flush) Task B (ndo_xdp_xmit redirect) ---------------------- -------------------------------- __dev_flush(flush_list) bq_xmit_all(bq) cnt = bq->count /* e.g. 16 */ /* start iterating bq->q[] */ <-- CFS preempts Task A --> bq_enqueue(dev, xdpf) bq->count == DEV_MAP_BULK_SIZE bq_xmit_all(bq, 0) cnt = bq->count /* same 16! */ ndo_xdp_xmit(bq->q[]) /* frames freed by driver */ bq->count = 0 <-- Task A resumes --> ndo_xdp_xmit(bq->q[]) /* use-after-free: frames already freed! */ Fix this by adding a local_lock_t to xdp_dev_bulk_queue and acquiring it in bq_enqueue() and __dev_flush(). These paths already run under local_bh_disable(), so use local_lock_nested_bh() which on non-RT is a pure annotation with no overhead, and on PREEMPT_RT provides a per-CPU sleeping lock that serializes access to the bq. Fixes: 3253cb49cbad ("softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT") Reported-by: Sebastian Andrzej Siewior <[email protected]> Reviewed-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Jiayuan Chen <[email protected]> Signed-off-by: Jiayuan Chen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Kohei Enju <[email protected]> Date: Wed Feb 25 05:34:44 2026 +0000 bpf: Fix stack-out-of-bounds write in devmap [ Upstream commit b7bf516c3ecd9a2aae2dc2635178ab87b734fef1 ] get_upper_ifindexes() iterates over all upper devices and writes their indices into an array without checking bounds. Also the callers assume that the max number of upper devices is MAX_NEST_DEV and allocate excluded_devices[1+MAX_NEST_DEV] on the stack, but that assumption is not correct and the number of upper devices could be larger than MAX_NEST_DEV (e.g., many macvlans), causing a stack-out-of-bounds write. Add a max parameter to get_upper_ifindexes() to avoid the issue. When there are too many upper devices, return -EOVERFLOW and abort the redirect. To reproduce, create more than MAX_NEST_DEV(8) macvlans on a device with an XDP program attached using BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS. Then send a packet to the device to trigger the XDP redirect path. Reported-by: [email protected] Closes: https://lore.kernel.org/all/[email protected]/T/ Fixes: aeea1b86f936 ("bpf, devmap: Exclude XDP broadcast to master device") Reviewed-by: Toke Høiland-Jørgensen <[email protected]> Signed-off-by: Kohei Enju <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Paul Chaignon <[email protected]> Date: Fri Feb 27 22:35:02 2026 +0100 bpf: Improve bounds when tnum has a single possible value [ Upstream commit efc11a667878a1d655ff034a93a539debbfedb12 ] We're hitting an invariant violation in Cilium that sometimes leads to BPF programs being rejected and Cilium failing to start [1]. The following extract from verifier logs shows what's happening: from 201 to 236: R1=0 R6=ctx() R7=1 R9=scalar(smin=umin=smin32=umin32=3584,smax=umax=smax32=umax32=3840,var_off=(0xe00; 0x100)) R10=fp0 236: R1=0 R6=ctx() R7=1 R9=scalar(smin=umin=smin32=umin32=3584,smax=umax=smax32=umax32=3840,var_off=(0xe00; 0x100)) R10=fp0 ; if (magic == MARK_MAGIC_HOST || magic == MARK_MAGIC_OVERLAY || magic == MARK_MAGIC_ENCRYPT) @ bpf_host.c:1337 236: (16) if w9 == 0xe00 goto pc+45 ; R9=scalar(smin=umin=smin32=umin32=3585,smax=umax=smax32=umax32=3840,var_off=(0xe00; 0x100)) 237: (16) if w9 == 0xf00 goto pc+1 verifier bug: REG INVARIANTS VIOLATION (false_reg1): range bounds violation u64=[0xe01, 0xe00] s64=[0xe01, 0xe00] u32=[0xe01, 0xe00] s32=[0xe01, 0xe00] var_off=(0xe00, 0x0) We reach instruction 236 with two possible values for R9, 0xe00 and 0xf00. This is perfectly reflected in the tnum, but of course the ranges are less accurate and cover [0xe00; 0xf00]. Taking the fallthrough path at instruction 236 allows the verifier to reduce the range to [0xe01; 0xf00]. The tnum is however not updated. With these ranges, at instruction 237, the verifier is not able to deduce that R9 is always equal to 0xf00. Hence the fallthrough pass is explored first, the verifier refines the bounds using the assumption that R9 != 0xf00, and ends up with an invariant violation. This pattern of impossible branch + bounds refinement is common to all invariant violations seen so far. The long-term solution is likely to rely on the refinement + invariant violation check to detect dead branches, as started by Eduard. To fix the current issue, we need something with less refactoring that we can backport. This patch uses the tnum_step helper introduced in the previous patch to detect the above situation. In particular, three cases are now detected in the bounds refinement: 1. The u64 range and the tnum only overlap in umin. u64: ---[xxxxxx]----- tnum: --xx----------x- 2. The u64 range and the tnum only overlap in the maximum value represented by the tnum, called tmax. u64: ---[xxxxxx]----- tnum: xx-----x-------- 3. The u64 range and the tnum only overlap in between umin (excluded) and umax. u64: ---[xxxxxx]----- tnum: xx----x-------x- To detect these three cases, we call tnum_step(tnum, umin), which returns the smallest member of the tnum greater than umin, called tnum_next here. We're in case (1) if umin is part of the tnum and tnum_next is greater than umax. We're in case (2) if umin is not part of the tnum and tnum_next is equal to tmax. Finally, we're in case (3) if umin is not part of the tnum, tnum_next is inferior or equal to umax, and calling tnum_step a second time gives us a value past umax. This change implements these three cases. With it, the above bytecode looks as follows: 0: (85) call bpf_get_prandom_u32#7 ; R0=scalar() 1: (47) r0 |= 3584 ; R0=scalar(smin=0x8000000000000e00,umin=umin32=3584,smin32=0x80000e00,var_off=(0xe00; 0xfffffffffffff1ff)) 2: (57) r0 &= 3840 ; R0=scalar(smin=umin=smin32=umin32=3584,smax=umax=smax32=umax32=3840,var_off=(0xe00; 0x100)) 3: (15) if r0 == 0xe00 goto pc+2 ; R0=3840 4: (15) if r0 == 0xf00 goto pc+1 4: R0=3840 6: (95) exit In addition to the new selftests, this change was also verified with Agni [3]. For the record, the raw SMT is available at [4]. The property it verifies is that: If a concrete value x is contained in all input abstract values, after __update_reg_bounds, it will continue to be contained in all output abstract values. Link: https://github.com/cilium/cilium/issues/44216 [1] Link: https://pchaigno.github.io/test-verifier-complexity.html [2] Link: https://github.com/bpfverif/agni [3] Link: https://pastebin.com/raw/naCfaqNx [4] Fixes: 0df1a55afa83 ("bpf: Warn on internal verifier errors") Acked-by: Eduard Zingerman <[email protected]> Tested-by: Marco Schirrmeister <[email protected]> Co-developed-by: Harishankar Vishwanathan <[email protected]> Signed-off-by: Harishankar Vishwanathan <[email protected]> Signed-off-by: Paul Chaignon <[email protected]> Link: https://lore.kernel.org/r/ef254c4f68be19bd393d450188946821c588565d.1772225741.git.paul.chaignon@gmail.com Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Harishankar Vishwanathan <[email protected]> Date: Fri Feb 27 22:32:21 2026 +0100 bpf: Introduce tnum_step to step through tnum's members [ Upstream commit 76e954155b45294c502e3d3a9e15757c858ca55e ] This commit introduces tnum_step(), a function that, when given t, and a number z returns the smallest member of t larger than z. The number z must be greater or equal to the smallest member of t and less than the largest member of t. The first step is to compute j, a number that keeps all of t's known bits, and matches all unknown bits to z's bits. Since j is a member of the t, it is already a candidate for result. However, we want our result to be (minimally) greater than z. There are only two possible cases: (1) Case j <= z. In this case, we want to increase the value of j and make it > z. (2) Case j > z. In this case, we want to decrease the value of j while keeping it > z. (Case 1) j <= z t = xx11x0x0 z = 10111101 (189) j = 10111000 (184) ^ k (Case 1.1) Let's first consider the case where j < z. We will address j == z later. Since z > j, there had to be a bit position that was 1 in z and a 0 in j, beyond which all positions of higher significance are equal in j and z. Further, this position could not have been unknown in a, because the unknown positions of a match z. This position had to be a 1 in z and known 0 in t. Let k be position of the most significant 1-to-0 flip. In our example, k = 3 (starting the count at 1 at the least significant bit). Setting (to 1) the unknown bits of t in positions of significance smaller than k will not produce a result > z. Hence, we must set/unset the unknown bits at positions of significance higher than k. Specifically, we look for the next larger combination of 1s and 0s to place in those positions, relative to the combination that exists in z. We can achieve this by concatenating bits at unknown positions of t into an integer, adding 1, and writing the bits of that result back into the corresponding bit positions previously extracted from z. >From our example, considering only positions of significance greater than k: t = xx..x z = 10..1 + 1 ----- 11..0 This is the exact combination 1s and 0s we need at the unknown bits of t in positions of significance greater than k. Further, our result must only increase the value minimally above z. Hence, unknown bits in positions of significance smaller than k should remain 0. We finally have, result = 11110000 (240) (Case 1.2) Now consider the case when j = z, for example t = 1x1x0xxx z = 10110100 (180) j = 10110100 (180) Matching the unknown bits of the t to the bits of z yielded exactly z. To produce a number greater than z, we must set/unset the unknown bits in t, and *all* the unknown bits of t candidates for being set/unset. We can do this similar to Case 1.1, by adding 1 to the bits extracted from the masked bit positions of z. Essentially, this case is equivalent to Case 1.1, with k = 0. t = 1x1x0xxx z = .0.1.100 + 1 --------- .0.1.101 This is the exact combination of bits needed in the unknown positions of t. After recalling the known positions of t, we get result = 10110101 (181) (Case 2) j > z t = x00010x1 z = 10000010 (130) j = 10001011 (139) ^ k Since j > z, there had to be a bit position which was 0 in z, and a 1 in j, beyond which all positions of higher significance are equal in j and z. This position had to be a 0 in z and known 1 in t. Let k be the position of the most significant 0-to-1 flip. In our example, k = 4. Because of the 0-to-1 flip at position k, a member of t can become greater than z if the bits in positions greater than k are themselves >= to z. To make that member *minimally* greater than z, the bits in positions greater than k must be exactly = z. Hence, we simply match all of t's unknown bits in positions more significant than k to z's bits. In positions less significant than k, we set all t's unknown bits to 0 to retain minimality. In our example, in positions of greater significance than k (=4), t=x000. These positions are matched with z (1000) to produce 1000. In positions of lower significance than k, t=10x1. All unknown bits are set to 0 to produce 1001. The final result is: result = 10001001 (137) This concludes the computation for a result > z that is a member of t. The procedure for tnum_step() in this commit implements the idea described above. As a proof of correctness, we verified the algorithm against a logical specification of tnum_step. The specification asserts the following about the inputs t, z and output res that: 1. res is a member of t, and 2. res is strictly greater than z, and 3. there does not exist another value res2 such that 3a. res2 is also a member of t, and 3b. res2 is greater than z 3c. res2 is smaller than res We checked the implementation against this logical specification using an SMT solver. The verification formula in SMTLIB format is available at [1]. The verification returned an "unsat": indicating that no input assignment exists for which the implementation and the specification produce different outputs. In addition, we also automatically generated the logical encoding of the C implementation using Agni [2] and verified it against the same specification. This verification also returned an "unsat", confirming that the implementation is equivalent to the specification. The formula for this check is also available at [3]. Link: https://pastebin.com/raw/2eRWbiit [1] Link: https://github.com/bpfverif/agni [2] Link: https://pastebin.com/raw/EztVbBJ2 [3] Co-developed-by: Srinivas Narayana <[email protected]> Signed-off-by: Srinivas Narayana <[email protected]> Co-developed-by: Santosh Nagarakatte <[email protected]> Signed-off-by: Santosh Nagarakatte <[email protected]> Signed-off-by: Harishankar Vishwanathan <[email protected]> Link: https://lore.kernel.org/r/93fdf71910411c0f19e282ba6d03b4c65f9c5d73.1772225741.git.paul.chaignon@gmail.com Signed-off-by: Alexei Starovoitov <[email protected]> Stable-dep-of: efc11a667878 ("bpf: Improve bounds when tnum has a single possible value") Signed-off-by: Sasha Levin <[email protected]>
Author: Miquel Sabaté Solà <[email protected]> Date: Fri Oct 24 12:21:41 2025 +0200 btrfs: define the AUTO_KFREE/AUTO_KVFREE helper macros [ Upstream commit d00cbce0a7d5de5fc31bf60abd59b44d36806b6e ] These are two simple macros which ensure that a pointer is initialized to NULL and with the proper cleanup attribute for it. Signed-off-by: Miquel Sabaté Solà <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]> Stable-dep-of: 52ee9965d09b ("btrfs: zoned: fixup last alloc pointer after extent removal for RAID0/10") Signed-off-by: Sasha Levin <[email protected]>
Author: Mark Harmstone <[email protected]> Date: Tue Feb 17 17:46:41 2026 +0000 btrfs: fix compat mask in error messages in btrfs_check_features() [ Upstream commit 587bb33b10bda645a1028c1737ad3992b3d7cf61 ] Commit d7f67ac9a928 ("btrfs: relax block-group-tree feature dependency checks") introduced a regression when it comes to handling unsupported incompat or compat_ro flags. Beforehand we only printed the flags that we didn't recognize, afterwards we printed them all, which is less useful. Fix the error handling so it behaves like it used to. Fixes: d7f67ac9a928 ("btrfs: relax block-group-tree feature dependency checks") Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Mark Harmstone <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Mark Harmstone <[email protected]> Date: Tue Feb 17 18:25:42 2026 +0000 btrfs: fix error message order of parameters in btrfs_delete_delayed_dir_index() [ Upstream commit 3cf0f35779d364cf2003c617bb7f3f3e41023372 ] Fix the error message in btrfs_delete_delayed_dir_index() if __btrfs_add_delayed_item() fails: the message says root, inode, index, error, but we're actually passing index, root, inode, error. Fixes: adc1ef55dc04 ("btrfs: add details to error messages at btrfs_delete_delayed_dir_index()") Signed-off-by: Mark Harmstone <[email protected]> Reviewed-by: Filipe Manana <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Mark Harmstone <[email protected]> Date: Tue Feb 17 10:21:44 2026 +0000 btrfs: fix incorrect key offset in error message in check_dev_extent_item() [ Upstream commit 511dc8912ae3e929c1a182f5e6b2326516fd42a0 ] Fix the error message in check_dev_extent_item(), when an overlapping stripe is encountered. For dev extents, objectid is the disk number and offset the physical address, so prev_key->objectid should actually be prev_key->offset. (I can't take any credit for this one - this was discovered by Chris and his friend Claude.) Reported-by: Chris Mason <[email protected]> Fixes: 008e2512dc56 ("btrfs: tree-checker: add dev extent item checks") Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Mark Harmstone <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Mark Harmstone <[email protected]> Date: Tue Feb 17 14:39:46 2026 +0000 btrfs: fix objectid value in error message in check_extent_data_ref() [ Upstream commit a10172780526c2002e062102ad4f2aabac495889 ] Fix a copy-paste error in check_extent_data_ref(): we're printing root as in the message above, we should be printing objectid. Fixes: f333a3c7e832 ("btrfs: tree-checker: validate dref root and objectid") Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Mark Harmstone <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Mark Harmstone <[email protected]> Date: Tue Feb 17 17:46:13 2026 +0000 btrfs: fix warning in scrub_verify_one_metadata() [ Upstream commit 44e2fda66427a0442d8d2c0e6443256fb458ab6b ] Commit b471965fdb2d ("btrfs: fix replace/scrub failure with metadata_uuid") fixed the comparison in scrub_verify_one_metadata() to use metadata_uuid rather than fsid, but left the warning as it was. Fix it so it matches what we're doing. Fixes: b471965fdb2d ("btrfs: fix replace/scrub failure with metadata_uuid") Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Mark Harmstone <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Miquel Sabaté Solà <[email protected]> Date: Mon Feb 16 22:12:15 2026 +0100 btrfs: free pages on error in btrfs_uring_read_extent() [ Upstream commit 3f501412f2079ca14bf68a18d80a2b7a823f1f64 ] In this function the 'pages' object is never freed in the hopes that it is picked up by btrfs_uring_read_finished() whenever that executes in the future. But that's just the happy path. Along the way previous allocations might have gone wrong, or we might not get -EIOCBQUEUED from btrfs_encoded_read_regular_fill_pages(). In all these cases, we go to a cleanup section that frees all memory allocated by this function without assuming any deferred execution, and this also needs to happen for the 'pages' allocation. Fixes: 34310c442e17 ("btrfs: add io_uring command for encoded reads (ENCODED_READ ioctl)") Signed-off-by: Miquel Sabaté Solà <[email protected]> Reviewed-by: Filipe Manana <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Mark Harmstone <[email protected]> Date: Tue Feb 17 17:32:39 2026 +0000 btrfs: print correct subvol num if active swapfile prevents deletion [ Upstream commit 1c7e9111f4e6d6d42bc47759c9af1ef91f03ac2c ] Fix the error message in btrfs_delete_subvolume() if we can't delete a subvolume because it has an active swapfile: we were printing the number of the parent rather than the target. Fixes: 60021bd754c6 ("btrfs: prevent subvol with swapfile from being deleted") Reviewed-by: Qu Wenruo <[email protected]> Reviewed-by: Filipe Manana <[email protected]> Signed-off-by: Mark Harmstone <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Naohiro Aota <[email protected]> Date: Fri Jan 23 21:41:36 2026 +0900 btrfs: zoned: fixup last alloc pointer after extent removal for RAID0/10 [ Upstream commit 52ee9965d09b2c56a027613db30d1fb20d623861 ] When a block group is composed of a sequential write zone and a conventional zone, we recover the (pseudo) write pointer of the conventional zone using the end of the last allocated position. However, if the last extent in a block group is removed, the last extent position will be smaller than the other real write pointer position. Then, that will cause an error due to mismatch of the write pointers. We can fixup this case by moving the alloc_offset to the corresponding write pointer position. Fixes: 568220fa9657 ("btrfs: zoned: support RAID0/1/10 on top of raid stripe tree") CC: [email protected] # 6.12+ Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Naohiro Aota <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Oliver Hartkopp <[email protected]> Date: Wed Feb 18 11:58:06 2026 +0100 can: bcm: fix locking for bcm_op runtime updates [ Upstream commit c35636e91e392e1540949bbc67932167cb48bc3a ] Commit c2aba69d0c36 ("can: bcm: add locking for bcm_op runtime updates") added a locking for some variables that can be modified at runtime when updating the sending bcm_op with a new TX_SETUP command in bcm_tx_setup(). Usually the RX_SETUP only handles and filters incoming traffic with one exception: When the RX_RTR_FRAME flag is set a predefined CAN frame is sent when a specific RTR frame is received. Therefore the rx bcm_op uses bcm_can_tx() which uses the bcm_tx_lock that was only initialized in bcm_tx_setup(). Add the missing spin_lock_init() when allocating the bcm_op in bcm_rx_setup() to handle the RTR case properly. Fixes: c2aba69d0c36 ("can: bcm: add locking for bcm_op runtime updates") Reported-by: [email protected] Closes: https://lore.kernel.org/linux-can/[email protected]/ Signed-off-by: Oliver Hartkopp <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Greg Kroah-Hartman <[email protected]> Date: Mon Feb 23 17:51:17 2026 +0100 can: ems_usb: ems_usb_read_bulk_callback(): check the proper length of a message commit 38a01c9700b0dcafe97dfa9dc7531bf4a245deff upstream. When looking at the data in a USB urb, the actual_length is the size of the buffer passed to the driver, not the transfer_buffer_length which is set by the driver as the max size of the buffer. When parsing the messages in ems_usb_read_bulk_callback() properly check the size both at the beginning of parsing the message to make sure it is big enough for the expected structure, and at the end of the message to make sure we don't overflow past the end of the buffer for the next message. Cc: Vincent Mailhol <[email protected]> Cc: Marc Kleine-Budde <[email protected]> Cc: [email protected] Assisted-by: gkh_clanker_2000 Signed-off-by: Greg Kroah-Hartman <[email protected]> Link: https://patch.msgid.link/2026022316-answering-strainer-a5db@gregkh Fixes: 702171adeed3 ("ems_usb: Added support for EMS CPC-USB/ARM7 CAN/USB interface") Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Alban Bedel <[email protected]> Date: Mon Feb 9 15:47:05 2026 +0100 can: mcp251x: fix deadlock in error path of mcp251x_open [ Upstream commit ab3f894de216f4a62adc3b57e9191888cbf26885 ] The mcp251x_open() function call free_irq() in its error path with the mpc_lock mutex held. But if an interrupt already occurred the interrupt handler will be waiting for the mpc_lock and free_irq() will deadlock waiting for the handler to finish. This issue is similar to the one fixed in commit 7dd9c26bd6cf ("can: mcp251x: fix deadlock if an interrupt occurs during mcp251x_open") but for the error path. To solve this issue move the call to free_irq() after the lock is released. Setting `priv->force_quit = 1` beforehand ensure that the IRQ handler will exit right away once it acquired the lock. Signed-off-by: Alban Bedel <[email protected]> Link: https://patch.msgid.link/[email protected] Fixes: bf66f3736a94 ("can: mcp251x: Move to threaded interrupts instead of workqueues.") Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Greg Kroah-Hartman <[email protected]> Date: Mon Feb 23 17:30:20 2026 +0100 can: ucan: Fix infinite loop from zero-length messages commit 1e446fd0582ad8be9f6dafb115fc2e7245f9bea7 upstream. If a broken ucan device gets a message with the message length field set to 0, then the driver will loop for forever in ucan_read_bulk_callback(), hanging the system. If the length is 0, just skip the message and go on to the next one. This has been fixed in the kvaser_usb driver in the past in commit 0c73772cd2b8 ("can: kvaser_usb: leaf: Fix potential infinite loop in command parsers"), so there must be some broken devices out there like this somewhere. Cc: Marc Kleine-Budde <[email protected]> Cc: Vincent Mailhol <[email protected]> Cc: [email protected] Assisted-by: gkh_clanker_2000 Signed-off-by: Greg Kroah-Hartman <[email protected]> Link: https://patch.msgid.link/2026022319-huff-absurd-6a18@gregkh Fixes: 9f2d3eae88d2 ("can: ucan: add driver for Theobroma Systems UCAN devices") Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Greg Kroah-Hartman <[email protected]> Date: Mon Feb 23 17:39:20 2026 +0100 can: usb: etas_es58x: correctly anchor the urb in the read bulk callback commit 5eaad4f768266f1f17e01232ffe2ef009f8129b7 upstream. When submitting an urb, that is using the anchor pattern, it needs to be anchored before submitting it otherwise it could be leaked if usb_kill_anchored_urbs() is called. This logic is correctly done elsewhere in the driver, except in the read bulk callback so do that here also. Cc: Vincent Mailhol <[email protected]> Cc: Marc Kleine-Budde <[email protected]> Cc: [email protected] Assisted-by: gkh_clanker_2000 Signed-off-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: Vincent Mailhol <[email protected]> Tested-by: Vincent Mailhol <[email protected]> Link: https://patch.msgid.link/2026022320-poser-stiffly-9d84@gregkh Fixes: 8537257874e9 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces") Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Greg Kroah-Hartman <[email protected]> Date: Mon Feb 23 13:10:32 2026 +0100 can: usb: f81604: correctly anchor the urb in the read bulk callback commit 952caa5da10bed22be09612433964f6877ba0dde upstream. When submitting an urb, that is using the anchor pattern, it needs to be anchored before submitting it otherwise it could be leaked if usb_kill_anchored_urbs() is called. This logic is correctly done elsewhere in the driver, except in the read bulk callback so do that here also. Cc: Ji-Ze Hong (Peter Hong) <[email protected]> Cc: Marc Kleine-Budde <[email protected]> Cc: Vincent Mailhol <[email protected]> Cc: [email protected] Assisted-by: gkh_clanker_2000 Signed-off-by: Greg Kroah-Hartman <[email protected]> Link: https://patch.msgid.link/2026022334-starlight-scaling-2cea@gregkh Fixes: 88da17436973 ("can: usb: f81604: add Fintek F81604 support") Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Greg Kroah-Hartman <[email protected]> Date: Mon Feb 23 13:10:31 2026 +0100 can: usb: f81604: handle bulk write errors properly commit 51f94780720fa90c424f67e3e9784cb8ef8190e5 upstream. If a write urb fails then more needs to be done other than just logging the message, otherwise the transmission could be stalled. Properly increment the error counters and wake up the queues so that data will continue to flow. Cc: Ji-Ze Hong (Peter Hong) <[email protected]> Cc: Marc Kleine-Budde <[email protected]> Cc: Vincent Mailhol <[email protected]> Cc: [email protected] Assisted-by: gkh_clanker_2000 Signed-off-by: Greg Kroah-Hartman <[email protected]> Link: https://patch.msgid.link/2026022334-slackness-dynamic-9195@gregkh Fixes: 88da17436973 ("can: usb: f81604: add Fintek F81604 support") Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Greg Kroah-Hartman <[email protected]> Date: Mon Feb 23 13:10:30 2026 +0100 can: usb: f81604: handle short interrupt urb messages properly commit 7299b1b39a255f6092ce4ec0b65f66e9d6a357af upstream. If an interrupt urb is received that is not the correct length, properly detect it and don't attempt to treat the data as valid. Cc: Ji-Ze Hong (Peter Hong) <[email protected]> Cc: Marc Kleine-Budde <[email protected]> Cc: Vincent Mailhol <[email protected]> Cc: [email protected] Assisted-by: gkh_clanker_2000 Signed-off-by: Greg Kroah-Hartman <[email protected]> Link: https://patch.msgid.link/2026022331-opal-evaluator-a928@gregkh Fixes: 88da17436973 ("can: usb: f81604: add Fintek F81604 support") Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Waiman Long <[email protected]> Date: Sat Feb 21 13:54:12 2026 -0500 cgroup/cpuset: Fix incorrect use of cpuset_update_tasks_cpumask() in update_cpumasks_hier() [ Upstream commit 68230aac8b9aad243626fbaf3ca170012c17fec5 ] Commit e2ffe502ba45 ("cgroup/cpuset: Add cpuset.cpus.exclusive for v2") incorrectly changed the 2nd parameter of cpuset_update_tasks_cpumask() from tmp->new_cpus to cp->effective_cpus. This second parameter is just a temporary cpumask for internal use. The cpuset_update_tasks_cpumask() function was originally called update_tasks_cpumask() before commit 381b53c3b549 ("cgroup/cpuset: rename functions shared between v1 and v2"). This mistake can incorrectly change the effective_cpus of the cpuset when it is the top_cpuset or in arm64 architecture where task_cpu_possible_mask() may differ from cpu_possible_mask. So far top_cpuset hasn't been passed to update_cpumasks_hier() yet, but arm64 arch can still be impacted. Fix it by reverting the incorrect change. Fixes: e2ffe502ba45 ("cgroup/cpuset: Add cpuset.cpus.exclusive for v2") Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Johan Hovold <[email protected]> Date: Fri Nov 21 17:40:03 2025 +0100 clk: tegra: tegra124-emc: fix device leak on set_rate() [ Upstream commit da61439c63d34ae6503d080a847f144d587e3a48 ] Make sure to drop the reference taken when looking up the EMC device and its driver data on first set_rate(). Note that holding a reference to a device does not prevent its driver data from going away so there is no point in keeping the reference. Fixes: 2db04f16b589 ("clk: tegra: Add EMC clock driver") Fixes: 6d6ef58c2470 ("clk: tegra: tegra124-emc: Fix missing put_device() call in emc_ensure_emc_driver") Cc: [email protected] # 4.2: 6d6ef58c2470 Cc: Mikko Perttunen <[email protected]> Cc: Miaoqian Lin <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: Stephen Boyd <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Srinivas Pandruvada <[email protected]> Date: Tue Feb 24 16:17:52 2026 -0800 cpufreq: intel_pstate: Fix crash during turbo disable commit 6b050482ec40569429d963ac52afa878691b04c9 upstream. When the system is booted with kernel command line argument "nosmt" or "maxcpus" to limit the number of CPUs, disabling turbo via: echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo results in a crash: PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI ... RIP: 0010:store_no_turbo+0x100/0x1f0 ... This occurs because for_each_possible_cpu() returns CPUs even if they are not online. For those CPUs, all_cpu_data[] will be NULL. Since commit 973207ae3d7c ("cpufreq: intel_pstate: Rearrange max frequency updates handling code"), all_cpu_data[] is dereferenced even for CPUs which are not online, causing the NULL pointer dereference. To fix that, pass CPU number to intel_pstate_update_max_freq() and use all_cpu_data[] for those CPUs for which there is a valid cpufreq policy. Fixes: 973207ae3d7c ("cpufreq: intel_pstate: Rearrange max frequency updates handling code") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221068 Signed-off-by: Srinivas Pandruvada <[email protected]> Cc: 6.16+ <[email protected]> # 6.16+ Link: https://patch.msgid.link/[email protected] Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Dave Jiang <[email protected]> Date: Thu Feb 12 14:50:38 2026 -0700 cxl: Fix race of nvdimm_bus object when creating nvdimm objects [ Upstream commit 96a1fd0d84b17360840f344826897fa71049870e ] Found issue during running of cxl-translate.sh unit test. Adding a 3s sleep right before the test seems to make the issue reproduce fairly consistently. The cxl_translate module has dependency on cxl_acpi and causes orphaned nvdimm objects to reprobe after cxl_acpi is removed. The nvdimm_bus object is registered by the cxl_nvb object when cxl_acpi_probe() is called. With the nvdimm_bus object missing, __nd_device_register() will trigger NULL pointer dereference when accessing the dev->parent that points to &nvdimm_bus->dev. [ 192.884510] BUG: kernel NULL pointer dereference, address: 000000000000006c [ 192.895383] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20250812-19.fc42 08/12/2025 [ 192.897721] Workqueue: cxl_port cxl_bus_rescan_queue [cxl_core] [ 192.899459] RIP: 0010:kobject_get+0xc/0x90 [ 192.924871] Call Trace: [ 192.925959] <TASK> [ 192.926976] ? pm_runtime_init+0xb9/0xe0 [ 192.929712] __nd_device_register.part.0+0x4d/0xc0 [libnvdimm] [ 192.933314] __nvdimm_create+0x206/0x290 [libnvdimm] [ 192.936662] cxl_nvdimm_probe+0x119/0x1d0 [cxl_pmem] [ 192.940245] cxl_bus_probe+0x1a/0x60 [cxl_core] [ 192.943349] really_probe+0xde/0x380 This patch also relies on the previous change where devm_cxl_add_nvdimm_bridge() is called from drivers/cxl/pmem.c instead of drivers/cxl/core.c to ensure the dependency of cxl_acpi on cxl_pmem. 1. Set probe_type of cxl_nvb to PROBE_FORCE_SYNCHRONOUS to ensure the driver is probed synchronously when add_device() is called. 2. Add a check in __devm_cxl_add_nvdimm_bridge() to ensure that the cxl_nvb driver is attached during cxl_acpi_probe(). 3. Take the cxl_root uport_dev lock and the cxl_nvb->dev lock in devm_cxl_add_nvdimm() before checking nvdimm_bus is valid. 4. Set cxl_nvdimm flag to CXL_NVD_F_INVALIDATED so cxl_nvdimm_probe() will exit with -EBUSY. The removal of cxl_nvdimm devices should prevent any orphaned devices from probing once the nvdimm_bus is gone. [ dj: Fixed 0-day reported kdoc issue. ] [ dj: Fix cxl_nvb reference leak on error. Gregory (kreview-0811365) ] Suggested-by: Dan Williams <[email protected]> Fixes: 8fdcb1704f61 ("cxl/pmem: Add initial infrastructure for pmem support") Tested-by: Alison Schofield <[email protected]> Reviewed-by: Alison Schofield <[email protected]?> Link: https://patch.msgid.link/[email protected] Signed-off-by: Dave Jiang <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Dave Jiang <[email protected]> Date: Wed Feb 11 17:31:23 2026 -0700 cxl: Move devm_cxl_add_nvdimm_bridge() to cxl_pmem.ko [ Upstream commit e7e222ad73d93fe54d6e6e3a15253a0ecf081a1b ] Moving the symbol devm_cxl_add_nvdimm_bridge() to drivers/cxl/cxl_pmem.c, so that cxl_pmem can export a symbol that gives cxl_acpi a depedency on cxl_pmem kernel module. This is a prepatory patch to resolve the issue of a race for nvdimm_bus object that is created during cxl_acpi_probe(). No functional changes besides moving code. Suggested-by: Dan Williams <[email protected]> Acked-by: Ira Weiny <[email protected]> Tested-by: Alison Schofield <[email protected]> Reviewed-by: Alison Schofield <[email protected]?> Link: https://patch.msgid.link/[email protected] Signed-off-by: Dave Jiang <[email protected]> Stable-dep-of: 96a1fd0d84b1 ("cxl: Fix race of nvdimm_bus object when creating nvdimm objects") Signed-off-by: Sasha Levin <[email protected]>
Author: Thomas Gleixner <[email protected]> Date: Sat Feb 7 14:27:05 2026 +0100 debugobject: Make it work with deferred page initialization - again [ Upstream commit fd3634312a04f336dcbfb481060219f0cd320738 ] debugobjects uses __GFP_HIGH for allocations as it might be invoked within locked regions. That worked perfectly fine until v6.18. It still works correctly when deferred page initialization is disabled and works by chance when no page allocation is required before deferred page initialization has completed. Since v6.18 allocations w/o a reclaim flag cause new_slab() to end up in alloc_frozen_pages_nolock_noprof(), which returns early when deferred page initialization has not yet completed. As the deferred page initialization takes quite a while the debugobject pool is depleted and debugobjects are disabled. This can be worked around when PREEMPT_COUNT is enabled as that allows debugobjects to add __GFP_KSWAPD_RECLAIM to the GFP flags when the context is preemtible. When PREEMPT_COUNT is disabled the context is unknown and the reclaim bit can't be set because the caller might hold locks which might deadlock in the allocator. In preemptible context the reclaim bit is harmless and not a performance issue as that's usually invoked from slow path initialization context. That makes debugobjects depend on PREEMPT_COUNT || !DEFERRED_STRUCT_PAGE_INIT. Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().") Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Sebastian Andrzej Siewior <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Link: https://patch.msgid.link/87pl6gznti.ffs@tglx Signed-off-by: Sasha Levin <[email protected]>
Author: Guenter Roeck <[email protected]> Date: Thu Feb 26 21:58:12 2026 -0800 dpaa2-switch: Fix interrupt storm after receiving bad if_id in IRQ handler [ Upstream commit 74badb9c20b1a9c02a95c735c6d3cd6121679c93 ] Commit 31a7a0bbeb00 ("dpaa2-switch: add bounds check for if_id in IRQ handler") introduces a range check for if_id to avoid an out-of-bounds access. If an out-of-bounds if_id is detected, the interrupt status is not cleared. This may result in an interrupt storm. Clear the interrupt status after detecting an out-of-bounds if_id to avoid the problem. Found by an experimental AI code review agent at Google. Fixes: 31a7a0bbeb00 ("dpaa2-switch: add bounds check for if_id in IRQ handler") Cc: Junrui Luo <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Reviewed-by: Ioana Ciornei <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Lars Ellenberg <[email protected]> Date: Thu Feb 19 15:20:12 2026 +0100 drbd: fix "LOGIC BUG" in drbd_al_begin_io_nonblock() commit ab140365fb62c0bdab22b2f516aff563b2559e3b upstream. Even though we check that we "should" be able to do lc_get_cumulative() while holding the device->al_lock spinlock, it may still fail, if some other code path decided to do lc_try_lock() with bad timing. If that happened, we logged "LOGIC BUG for enr=...", but still did not return an error. The rest of the code now assumed that this request has references for the relevant activity log extents. The implcations are that during an active resync, mutual exclusivity of resync versus application IO is not guaranteed. And a potential crash at this point may not realizs that these extents could have been target of in-flight IO and would need to be resynced just in case. Also, once the request completes, it will give up activity log references it does not even hold, which will trigger a BUG_ON(refcnt == 0) in lc_put(). Fix: Do not crash the kernel for a condition that is harmless during normal operation: also catch "e->refcnt == 0", not only "e == NULL" when being noisy about "al_complete_io() called on inactive extent %u\n". And do not try to be smart and "guess" whether something will work, then be surprised when it does not. Deal with the fact that it may or may not work. If it does not, remember a possible "partially in activity log" state (only possible for requests that cross extent boundaries), and return an error code from drbd_al_begin_io_nonblock(). A latter call for the same request will then resume from where we left off. Cc: [email protected] Signed-off-by: Lars Ellenberg <[email protected]> Signed-off-by: Christoph Böhmwalder <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Christoph Böhmwalder <[email protected]> Date: Fri Feb 20 12:39:37 2026 +0100 drbd: fix null-pointer dereference on local read error commit 0d195d3b205ca90db30d70d09d7bb6909aac178f upstream. In drbd_request_endio(), READ_COMPLETED_WITH_ERROR is passed to __req_mod() with a NULL peer_device: __req_mod(req, what, NULL, &m); The READ_COMPLETED_WITH_ERROR handler then unconditionally passes this NULL peer_device to drbd_set_out_of_sync(), which dereferences it, causing a null-pointer dereference. Fix this by obtaining the peer_device via first_peer_device(device), matching how drbd_req_destroy() handles the same situation. Cc: [email protected] Reported-by: Tuo Li <[email protected]> Link: https://lore.kernel.org/linux-block/[email protected] Signed-off-by: Christoph Böhmwalder <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Natalie Vock <[email protected]> Date: Mon Feb 23 12:45:37 2026 +0100 drm/amd/display: Use GFP_ATOMIC in dc_create_stream_for_sink commit 28dfe4317541e57fe52f9a290394cd29c348228b upstream. This can be called while preemption is disabled, for example by dcn32_internal_validate_bw which is called with the FPU active. Fixes "BUG: scheduling while atomic" messages I encounter on my Navi31 machine. Signed-off-by: Natalie Vock <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit b42dae2ebc5c84a68de63ec4ffdfec49362d53f1) Cc: [email protected] [ Context ] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Mario Limonciello <[email protected]> Date: Thu Feb 5 10:42:54 2026 -0600 drm/amd: Fix hang on amdgpu unload by using pci_dev_is_disconnected() [ Upstream commit f7afda7fcd169a9168695247d07ad94cf7b9798f ] The commit 6a23e7b4332c ("drm/amd: Clean up kfd node on surprise disconnect") introduced early KFD cleanup when drm_dev_is_unplugged() returns true. However, this causes hangs during normal module unload (rmmod amdgpu). The issue occurs because drm_dev_unplug() is called in amdgpu_pci_remove() for all removal scenarios, not just surprise disconnects. This was done intentionally in commit 39934d3ed572 ("Revert "drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled"") to fix IGT PCI software unplug test failures. As a result, drm_dev_is_unplugged() returns true even during normal module unload, triggering the early KFD cleanup inappropriately. The correct check should distinguish between: - Actual surprise disconnect (eGPU unplugged): pci_dev_is_disconnected() returns true - Normal module unload (rmmod): pci_dev_is_disconnected() returns false Replace drm_dev_is_unplugged() with pci_dev_is_disconnected() to ensure the early cleanup only happens during true hardware disconnect events. Cc: [email protected] Reported-by: Cal Peake <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Fixes: 6a23e7b4332c ("drm/amd: Clean up kfd node on surprise disconnect") Acked-by: Alex Deucher <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Tvrtko Ursulin <[email protected]> Date: Mon Feb 23 12:41:31 2026 +0000 drm/amdgpu/userq: Do not allow userspace to trivially triger kernel warnings [ Upstream commit 7b7d7693a55d606d700beb9549c9f7f0e5d9c24f ] Userspace can either deliberately pass in the too small num_fences, or the required number can legitimately grow between the two calls to the userq wait ioctl. In both cases we do not want the emit the kernel warning backtrace since nothing is wrong with the kernel and userspace will simply get an errno reported back. So lets simply drop the WARN_ONs. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Fixes: a292fdecd728 ("drm/amdgpu: Implement userqueue signal/wait IOCTL") Cc: Arunpravin Paneer Selvam <[email protected]> Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit 2c333ea579de6cc20ea7bc50e9595ef72863e65c) Signed-off-by: Sasha Levin <[email protected]>
Author: Lijo Lazar <[email protected]> Date: Tue Feb 24 10:18:51 2026 +0530 drm/amdgpu: Fix error handling in slot reset [ Upstream commit b57c4ec98c17789136a4db948aec6daadceb5024 ] If the device has not recovered after slot reset is called, it goes to out label for error handling. There it could make decision based on uninitialized hive pointer and could result in accessing an uninitialized list. Initialize the list and hive properly so that it handles the error situation and also releases the reset domain lock which is acquired during error_detected callback. Fixes: 732c6cefc1ec ("drm/amdgpu: Replace tmp_adev with hive in amdgpu_pci_slot_reset") Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Ce Sun <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit bb71362182e59caa227e4192da5a612b09349696) Signed-off-by: Sasha Levin <[email protected]>
Author: Bart Van Assche <[email protected]> Date: Mon Feb 23 13:50:23 2026 -0800 drm/amdgpu: Fix locking bugs in error paths [ Upstream commit 480ad5f6ead4a47b969aab6618573cd6822bb6a4 ] Do not unlock psp->ras_context.mutex if it has not been locked. This has been detected by the Clang thread-safety analyzer. Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: YiPeng Chai <[email protected]> Cc: Hawking Zhang <[email protected]> Cc: [email protected] Fixes: b3fb79cda568 ("drm/amdgpu: add mutex to protect ras shared memory") Acked-by: Christian König <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit 6fa01b4335978051d2cd80841728fd63cc597970) Signed-off-by: Sasha Levin <[email protected]>
Author: Bart Van Assche <[email protected]> Date: Mon Feb 23 14:00:07 2026 -0800 drm/amdgpu: Unlock a mutex before destroying it [ Upstream commit 5e0bcc7b88bcd081aaae6f481b10d9ab294fcb69 ] Mutexes must be unlocked before these are destroyed. This has been detected by the Clang thread-safety analyzer. Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: Yang Wang <[email protected]> Cc: Hawking Zhang <[email protected]> Cc: [email protected] Fixes: f5e4cc8461c4 ("drm/amdgpu: implement RAS ACA driver framework") Reviewed-by: Yang Wang <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit 270258ba320beb99648dceffb67e86ac76786e55) Signed-off-by: Sasha Levin <[email protected]>
Author: Jonathan Cavitt <[email protected]> Date: Tue Feb 24 22:12:28 2026 +0000 drm/client: Do not destroy NULL modes [ Upstream commit c601fd5414315fc515f746b499110e46272e7243 ] 'modes' in drm_client_modeset_probe may fail to kcalloc. If this occurs, we jump to 'out', calling modes_destroy on it, which dereferences it. This may result in a NULL pointer dereference in the error case. Prevent that. Fixes: 3039cc0c0653 ("drm/client: Make copies of modes") Signed-off-by: Jonathan Cavitt <[email protected]> Cc: Ville Syrjälä <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Simon Ser <[email protected]> Date: Sun Feb 8 22:47:26 2026 +0000 drm/fourcc: fix plane order for 10/12/16-bit YCbCr formats [ Upstream commit e9e0b48cd15b46dcb2bbc165f6b0fee698b855d6 ] The short comments had the correct order, but the long comments had the planes reversed. Fixes: 2271e0a20ef7 ("drm: drm_fourcc: add 10/12/16bit software decoder YCbCr formats") Signed-off-by: Simon Ser <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Robert Mader <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Imre Deak <[email protected]> Date: Mon Dec 15 21:23:56 2025 +0200 drm/i915/dp: Fail state computation for invalid DSC source input BPP values [ Upstream commit 338465490cf7bd4a700ecd33e4855fee4622fa5f ] There is no reason to accept an invalid minimum/maximum DSC source input BPP value (i.e a minimum DSC input BPP value above the maximum pipe BPP or a maximum DSC input BPP value below the minimum pipe BPP value), fail the state computation in these cases. Reviewed-by: Vinod Govindapillai <[email protected]> Signed-off-by: Imre Deak <[email protected]> Link: https://patch.msgid.link/[email protected] Stable-dep-of: fe26ae6ac8b8 ("drm/i915/dp: Fix pipe BPP clamping due to HDR") Signed-off-by: Sasha Levin <[email protected]>
Author: Imre Deak <[email protected]> Date: Mon Feb 9 15:38:16 2026 +0200 drm/i915/dp: Fix pipe BPP clamping due to HDR [ Upstream commit fe26ae6ac8b88fcdac5036b557c129a17fe520d2 ] The pipe BPP value shouldn't be set outside of the source's / sink's valid pipe BPP range, ensure this when increasing the minimum pipe BPP value to 30 due to HDR. While at it debug print if the HDR mode was requested for a connector by setting the corresponding HDR connector property. This indicates if the requested HDR mode could not be enabled, since the selected pipe BPP is below 30, due to a sink capability or link BW limit. v2: - Also handle the case where the sink could support the target 30 BPP only in DSC mode due to a BW limit, but the sink doesn't support DSC or 30 BPP as a DSC input BPP. (Chaitanya) - Debug print the connector's HDR mode in the link config dump, to indicate if a BPP >= 30 required by HDR couldn't be reached. (Ankit) - Add Closes: trailer. (Ankit) - Don't print the 30 BPP-outside of valid BPP range debug message if the min BPP is already > 30 (and so a target BPP >= 30 required for HDR is ensured). Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7052 Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15503 Fixes: ba49a4643cf53 ("drm/i915/dp: Set min_bpp limit to 30 in HDR mode") Cc: Chaitanya Kumar Borah <[email protected]> Cc: <[email protected]> # v6.18+ Reviewed-by: Ankit Nautiyal <[email protected]> # v1 Reviewed-by: Chaitanya Kumar Borah <[email protected]> Signed-off-by: Imre Deak <[email protected]> Link: https://patch.msgid.link/[email protected] (cherry picked from commit 08b7ef16b6a03e8c966e286ee1ac608a6ffb3d4a) Signed-off-by: Joonas Lahtinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Chen Ni <[email protected]> Date: Wed Feb 4 17:06:29 2026 +0800 drm/imx: parallel-display: check return value of devm_drm_bridge_add() in imx_pd_probe() [ Upstream commit c5f8658f97ec392eeaf355d4e9775ae1f23ca1d3 ] Return the value of devm_drm_bridge_add() in order to propagate the error properly, if it fails due to resource allocation failure or bridge registration failure. This ensures that the probe function fails safely rather than proceeding with a potentially incomplete bridge setup. Fixes: bf7e97910b9f ("drm/imx: parallel-display: add the bridge before attaching it") Signed-off-by: Chen Ni <[email protected]> Reviewed-by: Luca Ceresoli <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Luca Ceresoli <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Felix Gu <[email protected]> Date: Fri Jan 30 00:21:19 2026 +0800 drm/logicvc: Fix device node reference leak in logicvc_drm_config_parse() [ Upstream commit fef0e649f8b42bdffe4a916dd46e1b1e9ad2f207 ] The logicvc_drm_config_parse() function calls of_get_child_by_name() to find the "layers" node but fails to release the reference, leading to a device node reference leak. Fix this by using the __free(device_node) cleanup attribute to automatic release the reference when the variable goes out of scope. Fixes: efeeaefe9be5 ("drm: Add support for the LogiCVC display controller") Signed-off-by: Felix Gu <[email protected]> Reviewed-by: Luca Ceresoli <[email protected]> Reviewed-by: Kory Maincent <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Luca Ceresoli <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Yujie Liu <[email protected]> Date: Fri Feb 27 16:24:52 2026 +0800 drm/sched: Fix kernel-doc warning for drm_sched_job_done() [ Upstream commit 61ded1083b264ff67ca8c2de822c66b6febaf9a8 ] There is a kernel-doc warning for the scheduler: Warning: drivers/gpu/drm/scheduler/sched_main.c:367 function parameter 'result' not described in 'drm_sched_job_done' Fix the warning by describing the undocumented error code. Fixes: 539f9ee4b52a ("drm/scheduler: properly forward fence errors") Signed-off-by: Yujie Liu <[email protected]> [phasta: Flesh out commit message] Signed-off-by: Philipp Stanner <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Francesco Lavra <[email protected]> Date: Tue Feb 10 19:09:32 2026 +0100 drm/solomon: Fix page start when updating rectangle in page addressing mode [ Upstream commit 36d9579fed6c9429aa172f77bd28c58696ce8e2b ] In page addressing mode, the pixel values of a dirty rectangle must be sent to the display controller one page at a time. The range of pages corresponding to a given rectangle is being incorrectly calculated as if the Y value of the top left coordinate of the rectangle was 0. This can result in rectangle updates being displayed on wrong parts of the screen. Fix the above issue by consolidating the start page calculation in a single place at the beginning of the update_rect function, and using the calculated value for all addressing modes. Fixes: b0daaa5cfaa5 ("drm/ssd130x: Support page addressing mode") Signed-off-by: Francesco Lavra <[email protected]> Reviewed-by: Javier Martinez Canillas <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Javier Martinez Canillas <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Julian Orth <[email protected]> Date: Sun Mar 1 13:34:42 2026 +0100 drm/syncobj: Fix handle <-> fd ioctls with dirty stack [ Upstream commit 2e3649e237237258a08d75afef96648dd2b379f7 ] Consider the following application: #include <fcntl.h> #include <string.h> #include <drm/drm.h> #include <sys/ioctl.h> int main(void) { int fd = open("/dev/dri/renderD128", O_RDWR); struct drm_syncobj_create arg1; ioctl(fd, DRM_IOCTL_SYNCOBJ_CREATE, &arg1); struct drm_syncobj_handle arg2; memset(&arg2, 1, sizeof(arg2)); // simulate dirty stack arg2.handle = arg1.handle; arg2.flags = 0; arg2.fd = 0; arg2.pad = 0; // arg2.point = 0; // userspace is required to set point to 0 ioctl(fd, DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD, &arg2); } The last ioctl returns EINVAL because args->point is not 0. However, userspace developed against older kernel versions is not aware of the new point field and might therefore not initialize it. The correct check would be if (args->flags & DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_TIMELINE) return -EINVAL; However, there might already be userspace that relies on this not returning an error as long as point == 0. Therefore use the more lenient check. Fixes: c2d3a7300695 ("drm/syncobj: Extend EXPORT_SYNC_FILE for timeline syncobjs") Signed-off-by: Julian Orth <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Christian König <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Johan Hovold <[email protected]> Date: Fri Nov 21 17:42:01 2025 +0100 drm/tegra: dsi: fix device leak on probe [ Upstream commit bfef062695570842cf96358f2f46f4c6642c6689 ] Make sure to drop the reference taken when looking up the companion (ganged) device and its driver data during probe(). Note that holding a reference to a device does not prevent its driver data from going away so there is no point in keeping the reference. Fixes: e94236cde4d5 ("drm/tegra: dsi: Add ganged mode support") Fixes: 221e3638feb8 ("drm/tegra: Fix reference leak in tegra_dsi_ganged_probe") Cc: [email protected] # 3.19: 221e3638feb8 Cc: Thierry Reding <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: Thierry Reding <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Ethan Tidmore <[email protected]> Date: Sun Feb 15 22:04:38 2026 -0600 drm/tiny: sharp-memory: fix pointer error dereference [ Upstream commit 46120745bb4e7e1f09959624716b4c5d6e2c2e9e ] The function devm_drm_dev_alloc() returns a pointer error upon failure not NULL. Change null check to pointer error check. Detected by Smatch: drivers/gpu/drm/tiny/sharp-memory.c:549 sharp_memory_probe() error: 'smd' dereferencing possible ERR_PTR() Fixes: b8f9f21716fec ("drm/tiny: Add driver for Sharp Memory LCD") Signed-off-by: Ethan Tidmore <[email protected]> Reviewed-by: Thomas Zimmermann <[email protected]> Signed-off-by: Thomas Zimmermann <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Brad Spengler <[email protected]> Date: Wed Jan 7 12:12:36 2026 -0500 drm/vmwgfx: Fix invalid kref_put callback in vmw_bo_dirty_release [ Upstream commit 211ecfaaef186ee5230a77d054cdec7fbfc6724a ] The kref_put() call uses (void *)kvfree as the release callback, which is incorrect. kref_put() expects a function with signature void (*release)(struct kref *), but kvfree has signature void (*)(const void *). Calling through an incompatible function pointer is undefined behavior. The code only worked by accident because ref_count is the first member of vmw_bo_dirty, making the kref pointer equal to the struct pointer. Fix this by adding a proper release callback that uses container_of() to retrieve the containing structure before freeing. Fixes: c1962742ffff ("drm/vmwgfx: Use kref in vmw_bo_dirty") Signed-off-by: Brad Spengler <[email protected]> Signed-off-by: Zack Rusin <[email protected]> Cc: Ian Forbes <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Ian Forbes <[email protected]> Date: Tue Jan 13 11:53:57 2026 -0600 drm/vmwgfx: Return the correct value in vmw_translate_ptr functions [ Upstream commit 5023ca80f9589295cb60735016e39fc5cc714243 ] Before the referenced fixes these functions used a lookup function that returned a pointer. This was changed to another lookup function that returned an error code with the pointer becoming an out parameter. The error path when the lookup failed was not changed to reflect this change and the code continued to return the PTR_ERR of the now uninitialized pointer. This could cause the vmw_translate_ptr functions to return success when they actually failed causing further uninitialized and OOB accesses. Reported-by: Kuzey Arda Bulut <[email protected]> Fixes: a309c7194e8a ("drm/vmwgfx: Remove rcu locks from user resources") Signed-off-by: Ian Forbes <[email protected]> Reviewed-by: Zack Rusin <[email protected]> Signed-off-by: Zack Rusin <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Shuicheng Lin <[email protected]> Date: Wed Feb 25 01:34:49 2026 +0000 drm/xe/configfs: Free ctx_restore_mid_bb in release [ Upstream commit e377182f0266f46f02d01838e6bde67b9dac0d66 ] ctx_restore_mid_bb memory is allocated in wa_bb_store(), but xe_config_device_release() only frees ctx_restore_post_bb. Free ctx_restore_mid_bb[0].cs as well to avoid leaking the allocation when the configfs device is removed. Fixes: b30d5de3d40c ("drm/xe/configfs: Add mid context restore bb") Signed-off-by: Shuicheng Lin <[email protected]> Reviewed-by: Nitin Gote <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Matt Roper <[email protected]> (cherry picked from commit a235e7d0098337c3f2d1e8f3610c719a589e115f) Signed-off-by: Rodrigo Vivi <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Zhanjun Dong <[email protected]> Date: Fri Feb 20 17:53:08 2026 -0500 drm/xe/gsc: Fix GSC proxy cleanup on early initialization failure [ Upstream commit b3368ecca9538b88ddf982ea99064860fd5add97 ] xe_gsc_proxy_remove undoes what is done in both xe_gsc_proxy_init and xe_gsc_proxy_start; however, if we fail between those 2 calls, it is possible that the HW forcewake access hasn't been initialized yet and so we hit errors when the cleanup code tries to write GSC register. To avoid that, split the cleanup in 2 functions so that the HW cleanup is only called if the HW setup was completed successfully. Since the HW cleanup (interrupt disabling) is now removed from xe_gsc_proxy_remove, the cleanup on error paths in xe_gsc_proxy_start must be updated to disable interrupts before returning. Fixes: ff6cd29b690b ("drm/xe: Cleanup unwind of gt initialization") Signed-off-by: Zhanjun Dong <[email protected]> Reviewed-by: Daniele Ceraolo Spurio <[email protected]> Signed-off-by: Daniele Ceraolo Spurio <[email protected]> Link: https://patch.msgid.link/[email protected] (cherry picked from commit 2b37c401b265c07b46408b5cb36a4b757c9b5060) Signed-off-by: Rodrigo Vivi <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Shuicheng Lin <[email protected]> Date: Wed Feb 4 17:28:11 2026 +0000 drm/xe/reg_sr: Fix leak on xa_store failure [ Upstream commit 3091723785def05ebfe6a50866f87a044ae314ba ] Free the newly allocated entry when xa_store() fails to avoid a memory leak on the error path. v2: use goto fail_free. (Bala) Fixes: e5283bd4dfec ("drm/xe/reg_sr: Remove register pool") Cc: Balasubramani Vivekanandan <[email protected]> Cc: Matt Roper <[email protected]> Signed-off-by: Shuicheng Lin <[email protected]> Reviewed-by: Matt Roper <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Matt Roper <[email protected]> (cherry picked from commit 6bc6fec71ac45f52db609af4e62bdb96b9f5fadb) Signed-off-by: Rodrigo Vivi <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Matt Roper <[email protected]> Date: Fri Feb 6 14:30:59 2026 -0800 drm/xe/wa: Steer RMW of MCR registers while building default LRC [ Upstream commit 43d37df67f7770d8d261fdcb64ecc8c314e91303 ] When generating the default LRC, if a register is not masked, we apply any save-restore programming necessary via a read-modify-write sequence that will ensure we only update the relevant bits/fields without clobbering the rest of the register. However some of the registers that need to be updated might be MCR registers which require steering to a non-terminated instance to ensure we can read back a valid, non-zero value. The steering of reads originating from a command streamer is controlled by register CS_MMIO_GROUP_INSTANCE_SELECT. Emit additional MI_LRI commands to update the steering before any RMW of an MCR register to ensure the reads are performed properly. Note that needing to perform a RMW of an MCR register while building the default LRC is pretty rare. Most of the MCR registers that are part of an engine's LRCs are also masked registers, so no MCR is necessary. Fixes: f2f90989ccff ("drm/xe: Avoid reading RMW registers in emit_wa_job") Cc: Michal Wajdeczko <[email protected]> Reviewed-by: Balasubramani Vivekanandan <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Matt Roper <[email protected]> (cherry picked from commit 6c2e331c915ba9e774aa847921262805feb00863) Signed-off-by: Rodrigo Vivi <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Matthew Brost <[email protected]> Date: Wed Jan 14 16:45:46 2026 -0800 drm/xe: Do not preempt fence signaling CS instructions [ Upstream commit cdc8a1e11f4d5b480ec750e28010c357185b95a6 ] If a batch buffer is complete, it makes little sense to preempt the fence signaling instructions in the ring, as the largest portion of the work (the batch buffer) is already done and fence signaling consists of only a few instructions. If these instructions are preempted, the GuC would need to perform a context switch just to signal the fence, which is costly and delays fence signaling. Avoid this scenario by disabling preemption immediately after the BB start instruction and re-enabling it after executing the fence signaling instructions. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Daniele Ceraolo Spurio <[email protected]> Cc: Carlos Santa <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Daniele Ceraolo Spurio <[email protected]> Link: https://patch.msgid.link/[email protected] (cherry picked from commit 2bcbf2dcde0c839a73af664a3c77d4e77d58a3eb) Signed-off-by: Rodrigo Vivi <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Vitaly Lifshits <[email protected]> Date: Tue Jan 6 16:14:20 2026 +0200 e1000e: clear DPG_EN after reset to avoid autonomous power-gating [ Upstream commit 0942fc6d324eb9c6b16187b2aa994c0823557f06 ] Panther Lake systems introduced an autonomous power gating feature for the integrated Gigabit Ethernet in shutdown state (S5) state. As part of it, the reset value of DPG_EN bit was changed to 1. Clear this bit after performing hardware reset to avoid errors such as Tx/Rx hangs, or packet loss/corruption. Fixes: 0c9183ce61bc ("e1000e: Add support for the next LOM generation") Signed-off-by: Vitaly Lifshits <[email protected]> Reviewed-by: Aleksandr Loktionov <[email protected]> Tested-by: Avigail Dahan <[email protected]> Reviewed-by: Paul Menzel <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jann Horn <[email protected]> Date: Mon Feb 23 20:59:33 2026 +0100 eventpoll: Fix integer overflow in ep_loop_check_proc() commit fdcfce93073d990ed4b71752e31ad1c1d6e9d58b upstream. If a recursive call to ep_loop_check_proc() hits the `result = INT_MAX`, an integer overflow will occur in the calling ep_loop_check_proc() at `result = max(result, ep_loop_check_proc(ep_tovisit, depth + 1) + 1)`, breaking the recursion depth check. Fix it by using a different placeholder value that can't lead to an overflow. Reported-by: Guenter Roeck <[email protected]> Fixes: f2e467a48287 ("eventpoll: Fix semi-unbounded recursion") Cc: [email protected] Signed-off-by: Jann Horn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Yang Erkun <[email protected]> Date: Wed Nov 12 16:45:38 2025 +0800 ext4: correct the comments place for EXT4_EXT_MAY_ZEROOUT [ Upstream commit cc742fd1d184bb2a11bacf50587d2c85290622e4 ] Move the comments just before we set EXT4_EXT_MAY_ZEROOUT in ext4_split_convert_extents. Signed-off-by: Yang Erkun <[email protected]> Message-ID: <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]> Stable-dep-of: feaf2a80e78f ("ext4: don't set EXT4_GET_BLOCKS_CONVERT when splitting before submitting I/O") Signed-off-by: Sasha Levin <[email protected]>
Author: Zhang Yi <[email protected]> Date: Sat Nov 29 18:32:35 2025 +0800 ext4: don't set EXT4_GET_BLOCKS_CONVERT when splitting before submitting I/O [ Upstream commit feaf2a80e78f89ee8a3464126077ba8683b62791 ] When allocating blocks during within-EOF DIO and writeback with dioread_nolock enabled, EXT4_GET_BLOCKS_PRE_IO was set to split an existing large unwritten extent. However, EXT4_GET_BLOCKS_CONVERT was set when calling ext4_split_convert_extents(), which may potentially result in stale data issues. Assume we have an unwritten extent, and then DIO writes the second half. [UUUUUUUUUUUUUUUU] on-disk extent U: unwritten extent [UUUUUUUUUUUUUUUU] extent status tree |<- ->| ----> dio write this range First, ext4_iomap_alloc() call ext4_map_blocks() with EXT4_GET_BLOCKS_PRE_IO, EXT4_GET_BLOCKS_UNWRIT_EXT and EXT4_GET_BLOCKS_CREATE flags set. ext4_map_blocks() find this extent and call ext4_split_convert_extents() with EXT4_GET_BLOCKS_CONVERT and the above flags set. Then, ext4_split_convert_extents() calls ext4_split_extent() with EXT4_EXT_MAY_ZEROOUT, EXT4_EXT_MARK_UNWRIT2 and EXT4_EXT_DATA_VALID2 flags set, and it calls ext4_split_extent_at() to split the second half with EXT4_EXT_DATA_VALID2, EXT4_EXT_MARK_UNWRIT1, EXT4_EXT_MAY_ZEROOUT and EXT4_EXT_MARK_UNWRIT2 flags set. However, ext4_split_extent_at() failed to insert extent since a temporary lack -ENOSPC. It zeroes out the first half but convert the entire on-disk extent to written since the EXT4_EXT_DATA_VALID2 flag set, but left the second half as unwritten in the extent status tree. [0000000000SSSSSS] data S: stale data, 0: zeroed [WWWWWWWWWWWWWWWW] on-disk extent W: written extent [WWWWWWWWWWUUUUUU] extent status tree Finally, if the DIO failed to write data to the disk, the stale data in the second half will be exposed once the cached extent entry is gone. Fix this issue by not passing EXT4_GET_BLOCKS_CONVERT when splitting an unwritten extent before submitting I/O, and make ext4_split_convert_extents() to zero out the entire extent range to zero for this case, and also mark the extent in the extent status tree for consistency. Fixes: b8a8684502a0 ("ext4: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate") Signed-off-by: Zhang Yi <[email protected]> Reviewed-by: Ojaswin Mujoo <[email protected]> Reviewed-by: Baokun Li <[email protected]> Cc: [email protected] Message-ID: <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Ankit Garg <[email protected]> Date: Fri Feb 20 13:53:24 2026 -0800 gve: fix incorrect buffer cleanup in gve_tx_clean_pending_packets for QPL commit fb868db5f4bccd7a78219313ab2917429f715cea upstream. In DQ-QPL mode, gve_tx_clean_pending_packets() incorrectly uses the RDA buffer cleanup path. It iterates num_bufs times and attempts to unmap entries in the dma array. This leads to two issues: 1. The dma array shares storage with tx_qpl_buf_ids (union). Interpreting buffer IDs as DMA addresses results in attempting to unmap incorrect memory locations. 2. num_bufs in QPL mode (counting 2K chunks) can significantly exceed the size of the dma array, causing out-of-bounds access warnings (trace below is how we noticed this issue). UBSAN: array-index-out-of-bounds in drivers/net/ethernet/drivers/net/ethernet/google/gve/gve_tx_dqo.c:178:5 index 18 is out of range for type 'dma_addr_t[18]' (aka 'unsigned long long[18]') Workqueue: gve gve_service_task [gve] Call Trace: <TASK> dump_stack_lvl+0x33/0xa0 __ubsan_handle_out_of_bounds+0xdc/0x110 gve_tx_stop_ring_dqo+0x182/0x200 [gve] gve_close+0x1be/0x450 [gve] gve_reset+0x99/0x120 [gve] gve_service_task+0x61/0x100 [gve] process_scheduled_works+0x1e9/0x380 Fix this by properly checking for QPL mode and delegating to gve_free_tx_qpl_bufs() to reclaim the buffers. Cc: [email protected] Fixes: a6fb8d5a8b69 ("gve: Tx path for DQO-QPL") Signed-off-by: Ankit Garg <[email protected]> Reviewed-by: Jordan Rhee <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: Joshua Washington <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Greg Kroah-Hartman <[email protected]> Date: Thu Feb 19 15:33:54 2026 +0100 HID: Add HID_CLAIMED_INPUT guards in raw_event callbacks missing them commit ecfa6f34492c493a9a1dc2900f3edeb01c79946b upstream. In commit 2ff5baa9b527 ("HID: appleir: Fix potential NULL dereference at raw event handle"), we handle the fact that raw event callbacks can happen even for a HID device that has not been "claimed" causing a crash if a broken device were attempted to be connected to the system. Fix up the remaining in-tree HID drivers that forgot to add this same check to resolve the same issue. Cc: Jiri Kosina <[email protected]> Cc: Benjamin Tissoires <[email protected]> Cc: Bastien Nocera <[email protected]> Cc: [email protected] Cc: stable <[email protected]> Assisted-by: gkh_clanker_2000 Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Benjamin Tissoires <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Werner Sembach <[email protected]> Date: Thu Jan 8 17:09:54 2026 +0100 HID: multitouch: Keep latency normal on deactivate for reactivation gesture commit ec3070f01fa30f2c5547d645dbb76174304bf0e4 upstream. Uniwill devices have a built in gesture in the touchpad to de- and reactivate it by double taping the upper left corner. This gesture stops working when latency is set to high, so this patch keeps the latency on normal. Cc: [email protected] Signed-off-by: Werner Sembach <[email protected]> [[email protected]: change bit from 24 to 25] [[email protected]: update shortlog] Signed-off-by: Jiri Kosina <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Ian Ray <[email protected]> Date: Tue Feb 17 13:51:51 2026 +0200 HID: multitouch: new class MT_CLS_EGALAX_P80H84 [ Upstream commit a2e70a89fa58133521b2deae4427d35776bda935 ] Fixes: f9e82295eec1 ("HID: multitouch: add eGalaxTouch P80H84 support") Signed-off-by: Ian Ray <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Tomasz Pakuła <[email protected]> Date: Wed Feb 4 22:44:55 2026 +0100 HID: pidff: Fix condition effect bit clearing commit 97d5c8f5c09a604c4873c8348f58de3cea69a7df upstream. As reported by MPDarkGuy on discord, NULL pointer dereferences were happening because not all the conditional effects bits were cleared. Properly clear all conditional effect bits from ffbit Fixes: 7f3d7bc0df4b ("HID: pidff: Better quirk assigment when searching for fields") Cc: [email protected] # 6.18.x Signed-off-by: Tomasz Pakuła <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Akhilesh Patil <[email protected]> Date: Sun Nov 2 15:13:20 2025 +0530 hwmon: (aht10) Add support for dht20 [ Upstream commit 3eaf1b631506e8de2cb37c278d5bc042521e82c1 ] Add support for dht20 temperature and humidity sensor from Aosong. Modify aht10 driver to handle different init command for dht20 sensor by adding init_cmd entry in the driver data. dht20 sensor is compatible with aht10 hwmon driver with this change. Tested on TI am62x SK board with dht20 sensor connected at i2c-2 port. Signed-off-by: Akhilesh Patil <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Guenter Roeck <[email protected]> Stable-dep-of: b7497b5a99f5 ("hwmon: (aht10) Fix initialization commands for AHT20") Signed-off-by: Sasha Levin <[email protected]>
Author: Hao Yu <[email protected]> Date: Mon Feb 23 01:03:31 2026 +0800 hwmon: (aht10) Fix initialization commands for AHT20 [ Upstream commit b7497b5a99f54ab8dcda5b14a308385b2fb03d8d ] According to the AHT20 datasheet (updated to V1.0 after the 2023.09 version), the initialization command for AHT20 is 0b10111110 (0xBE). The previous sequence (0xE1) used in earlier versions is no longer compatible with newer AHT20 sensors. Update the initialization command to ensure the sensor is properly initialized. While at it, use binary notation for DHT20_CMD_INIT to match the notation used in the datasheet. Fixes: d2abcb5cc885 ("hwmon: (aht10) Add support for compatible aht20") Signed-off-by: Hao Yu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Bart Van Assche <[email protected]> Date: Mon Feb 23 14:00:14 2026 -0800 hwmon: (it87) Check the it87_lock() return value [ Upstream commit 07ed4f05bbfd2bc014974dcc4297fd3aa1cb88c0 ] Return early in it87_resume() if it87_lock() fails instead of ignoring the return value of that function. This patch suppresses a Clang thread-safety warning. Cc: Frank Crawford <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: Jean Delvare <[email protected]> Cc: [email protected] Fixes: 376e1a937b30 ("hwmon: (it87) Add calls to smbus_enable/smbus_disable as required") Signed-off-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] [groeck: Declare 'ret' at the beginning of it87_resume()] Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Gui-Dong Han <[email protected]> Date: Tue Feb 3 20:14:43 2026 +0800 hwmon: (max16065) Use READ/WRITE_ONCE to avoid compiler optimization induced race [ Upstream commit 007be4327e443d79c9dd9e56dc16c36f6395d208 ] Simply copying shared data to a local variable cannot prevent data races. The compiler is allowed to optimize away the local copy and re-read the shared memory, causing a Time-of-Check Time-of-Use (TOCTOU) issue if the data changes between the check and the usage. To enforce the use of the local variable, use READ_ONCE() when reading the shared data and WRITE_ONCE() when updating it. Apply these macros to the three identified locations (curr_sense, adc, and fault) where local variables are used for error validation, ensuring the value remains consistent. Reported-by: Ben Hutchings <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Fixes: f5bae2642e3d ("hwmon: Driver for MAX16065 System Manager and compatibles") Fixes: b8d5acdcf525 ("hwmon: (max16065) Use local variable to avoid TOCTOU") Cc: [email protected] Signed-off-by: Gui-Dong Han <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Olivier Sobrie <[email protected]> Date: Wed Mar 4 22:20:39 2026 +0100 hwmon: (max6639) fix inverted polarity [ Upstream commit 170a4b21f49b3dcff3115b4c90758f0a0d77375a ] According to MAX6639 documentation: D1: PWM Output Polarity. PWM output is low at 100% duty cycle when this bit is set to zero. PWM output is high at 100% duty cycle when this bit is set to 1. Up to commit 0f33272b60ed ("hwmon: (max6639) : Update hwmon init using info structure"), the polarity was set to high (0x2) when no platform data was set. After the patch, the polarity register wasn't set anymore if no platform data was specified. Nowadays, since commit 7506ebcd662b ("hwmon: (max6639) : Configure based on DT property"), it is always set to low which doesn't match with the comment above and change the behavior compared to versions prior 0f33272b60ed. Fixes: 0f33272b60ed ("hwmon: (max6639) : Update hwmon init using info structure") Signed-off-by: Olivier Sobrie <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Charles Haithcock <[email protected]> Date: Fri Feb 27 18:41:15 2026 -0700 i2c: i801: Revert "i2c: i801: replace acpi_lock with I2C bus lock" [ Upstream commit cfc69c2e6c699c96949f7b0455195b0bfb7dc715 ] This reverts commit f707d6b9e7c18f669adfdb443906d46cfbaaa0c1. Under rare circumstances, multiple udev threads can collect i801 device info on boot and walk i801_acpi_io_handler somewhat concurrently. The first will note the area is reserved by acpi to prevent further touches. This ultimately causes the area to be deregistered. The second will enter i801_acpi_io_handler after the area is unregistered but before a check can be made that the area is unregistered. i2c_lock_bus relies on the now unregistered area containing lock_ops to lock the bus. The end result is a kernel panic on boot with the following backtrace; [ 14.971872] ioatdma 0000:09:00.2: enabling device (0100 -> 0102) [ 14.971873] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 14.971880] #PF: supervisor read access in kernel mode [ 14.971884] #PF: error_code(0x0000) - not-present page [ 14.971887] PGD 0 P4D 0 [ 14.971894] Oops: 0000 [#1] PREEMPT SMP PTI [ 14.971900] CPU: 5 PID: 956 Comm: systemd-udevd Not tainted 5.14.0-611.5.1.el9_7.x86_64 #1 [ 14.971905] Hardware name: XXXXXXXXXXXXXXXXXXXXXXX BIOS 1.20.10.SV91 01/30/2023 [ 14.971908] RIP: 0010:i801_acpi_io_handler+0x2d/0xb0 [i2c_i801] [ 14.971929] Code: 00 00 49 8b 40 20 41 57 41 56 4d 8b b8 30 04 00 00 49 89 ce 41 55 41 89 d5 41 54 49 89 f4 be 02 00 00 00 55 4c 89 c5 53 89 fb <48> 8b 00 4c 89 c7 e8 18 61 54 e9 80 bd 80 04 00 00 00 75 09 4c 3b [ 14.971933] RSP: 0018:ffffbaa841483838 EFLAGS: 00010282 [ 14.971938] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9685e01ba568 [ 14.971941] RDX: 0000000000000008 RSI: 0000000000000002 RDI: 0000000000000000 [ 14.971944] RBP: ffff9685ca22f028 R08: ffff9685ca22f028 R09: ffff9685ca22f028 [ 14.971948] R10: 000000000000000b R11: 0000000000000580 R12: 0000000000000580 [ 14.971951] R13: 0000000000000008 R14: ffff9685e01ba568 R15: ffff9685c222f000 [ 14.971954] FS: 00007f8287c0ab40(0000) GS:ffff96a47f940000(0000) knlGS:0000000000000000 [ 14.971959] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 14.971963] CR2: 0000000000000000 CR3: 0000000168090001 CR4: 00000000003706f0 [ 14.971966] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 14.971968] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 14.971972] Call Trace: [ 14.971977] <TASK> [ 14.971981] ? show_trace_log_lvl+0x1c4/0x2df [ 14.971994] ? show_trace_log_lvl+0x1c4/0x2df [ 14.972003] ? acpi_ev_address_space_dispatch+0x16e/0x3c0 [ 14.972014] ? __die_body.cold+0x8/0xd [ 14.972021] ? page_fault_oops+0x132/0x170 [ 14.972028] ? exc_page_fault+0x61/0x150 [ 14.972036] ? asm_exc_page_fault+0x22/0x30 [ 14.972045] ? i801_acpi_io_handler+0x2d/0xb0 [i2c_i801] [ 14.972061] acpi_ev_address_space_dispatch+0x16e/0x3c0 [ 14.972069] ? __pfx_i801_acpi_io_handler+0x10/0x10 [i2c_i801] [ 14.972085] acpi_ex_access_region+0x5b/0xd0 [ 14.972093] acpi_ex_field_datum_io+0x73/0x2e0 [ 14.972100] acpi_ex_read_data_from_field+0x8e/0x230 [ 14.972106] acpi_ex_resolve_node_to_value+0x23d/0x310 [ 14.972114] acpi_ds_evaluate_name_path+0xad/0x110 [ 14.972121] acpi_ds_exec_end_op+0x321/0x510 [ 14.972127] acpi_ps_parse_loop+0xf7/0x680 [ 14.972136] acpi_ps_parse_aml+0x17a/0x3d0 [ 14.972143] acpi_ps_execute_method+0x137/0x270 [ 14.972150] acpi_ns_evaluate+0x1f4/0x2e0 [ 14.972158] acpi_evaluate_object+0x134/0x2f0 [ 14.972164] acpi_evaluate_integer+0x50/0xe0 [ 14.972173] ? vsnprintf+0x24b/0x570 [ 14.972181] acpi_ac_get_state.part.0+0x23/0x70 [ 14.972189] get_ac_property+0x4e/0x60 [ 14.972195] power_supply_show_property+0x90/0x1f0 [ 14.972205] add_prop_uevent+0x29/0x90 [ 14.972213] power_supply_uevent+0x109/0x1d0 [ 14.972222] dev_uevent+0x10e/0x2f0 [ 14.972228] uevent_show+0x8e/0x100 [ 14.972236] dev_attr_show+0x19/0x40 [ 14.972246] sysfs_kf_seq_show+0x9b/0x100 [ 14.972253] seq_read_iter+0x120/0x4b0 [ 14.972262] ? selinux_file_permission+0x106/0x150 [ 14.972273] vfs_read+0x24f/0x3a0 [ 14.972284] ksys_read+0x5f/0xe0 [ 14.972291] do_syscall_64+0x5f/0xe0 ... The kernel panic is mitigated by setting limiting the count of udev children to 1. Revert to using the acpi_lock to continue protecting marking the area as owned by firmware without relying on a lock in a potentially unmapped region of memory. Fixes: f707d6b9e7c1 ("i2c: i801: replace acpi_lock with I2C bus lock") Signed-off-by: Charles Haithcock <[email protected]> [wsa: added Fixes-tag and updated comment stating the importance of the lock] Signed-off-by: Wolfram Sang <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Thomas Gleixner <[email protected]> Date: Sat Feb 7 11:50:23 2026 +0100 i40e: Fix preempt count leak in napi poll tracepoint [ Upstream commit 4b3d54a85bd37ebf2d9836f0d0de775c0ff21af9 ] Using get_cpu() in the tracepoint assignment causes an obvious preempt count leak because nothing invokes put_cpu() to undo it: softirq: huh, entered softirq 3 NET_RX with preempt_count 00000100, exited with 00000101? This clearly has seen a lot of testing in the last 3+ years... Use smp_processor_id() instead. Fixes: 6d4d584a7ea8 ("i40e: Add i40e_napi_poll tracepoint") Signed-off-by: Thomas Gleixner <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: Przemek Kitszel <[email protected]> Cc: [email protected] Cc: [email protected] Reviewed-by: Joe Damato <[email protected]> Reviewed-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Larysa Zaremba <[email protected]> Date: Thu Mar 5 12:12:46 2026 +0100 i40e: fix registering XDP RxQ info [ Upstream commit 8f497dc8a61429cc004720aa8e713743355d80cf ] Current way of handling XDP RxQ info in i40e has a problem, where frag_size is not updated when xsk_buff_pool is detached or when MTU is changed, this leads to growing tail always failing for multi-buffer packets. Couple XDP RxQ info registering with buffer allocations and unregistering with cleaning the ring. Fixes: a045d2f2d03d ("i40e: set xdp_rxq_info::frag_size") Reviewed-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Larysa Zaremba <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Larysa Zaremba <[email protected]> Date: Thu Mar 5 12:12:47 2026 +0100 i40e: use xdp.frame_sz as XDP RxQ info frag_size [ Upstream commit c69d22c6c46a1d792ba8af3d8d6356fdc0e6f538 ] The only user of frag_size field in XDP RxQ info is bpf_xdp_frags_increase_tail(). It clearly expects whole buffer size instead of DMA write size. Different assumptions in i40e driver configuration lead to negative tailroom. Set frag_size to the same value as frame_sz in shared pages mode, use new helper to set frag_size when AF_XDP ZC is active. Fixes: a045d2f2d03d ("i40e: set xdp_rxq_info::frag_size") Reviewed-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Larysa Zaremba <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Kohei Enju <[email protected]> Date: Tue Feb 10 15:57:14 2026 +0000 iavf: fix netdev->max_mtu to respect actual hardware limit [ Upstream commit b84852170153671bb0fa6737a6e48370addd8e1a ] iavf sets LIBIE_MAX_MTU as netdev->max_mtu, ignoring vf_res->max_mtu from PF [1]. This allows setting an MTU beyond the actual hardware limit, causing TX queue timeouts [2]. Set correct netdev->max_mtu using vf_res->max_mtu from the PF. Note that currently PF drivers such as ice/i40e set the frame size in vf_res->max_mtu, not MTU. Convert vf_res->max_mtu to MTU before setting netdev->max_mtu. [1] # ip -j -d link show $DEV | jq '.[0].max_mtu' 16356 [2] iavf 0000:00:05.0 enp0s5: NETDEV WATCHDOG: CPU: 1: transmit queue 0 timed out 5692 ms iavf 0000:00:05.0 enp0s5: NIC Link is Up Speed is 10 Gbps Full Duplex iavf 0000:00:05.0 enp0s5: NETDEV WATCHDOG: CPU: 6: transmit queue 3 timed out 5312 ms iavf 0000:00:05.0 enp0s5: NIC Link is Up Speed is 10 Gbps Full Duplex ... Fixes: 5fa4caff59f2 ("iavf: switch to Page Pool") Signed-off-by: Kohei Enju <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jason Gunthorpe <[email protected]> Date: Mon Feb 16 11:02:48 2026 -0400 IB/mthca: Add missed mthca_unmap_user_db() for mthca_create_srq() commit 117942ca43e2e3c3d121faae530989931b7f67e1 upstream. Fix a user triggerable leak on the system call failure path. Cc: [email protected] Fixes: ec34a922d243 ("[PATCH] IB/mthca: Add SRQ implementation") Signed-off-by: Jason Gunthorpe <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Larysa Zaremba <[email protected]> Date: Wed Dec 3 14:29:48 2025 +0100 ice: fix adding AQ LLDP filter for VF [ Upstream commit eef33aa44935d001747ca97703c08dd6f9031162 ] The referenced commit came from a misunderstanding of the FW LLDP filter AQ (Admin Queue) command due to the error in the internal documentation. Contrary to the assumptions in the original commit, VFs can be added and deleted from this filter without any problems. Introduced dev_info message proved to be useful, so reverting the whole commit does not make sense. Without this fix, trusted VFs do not receive LLDP traffic, if there is an AQ LLDP filter on PF. When trusted VF attempts to add an LLDP multicast MAC address, the following message can be seen in dmesg on host: ice 0000:33:00.0: Failed to add Rx LLDP rule on VSI 20 error: -95 Revert checking VSI type when adding LLDP filter through AQ. Fixes: 4d5a1c4e6d49 ("ice: do not add LLDP-specific filter if not necessary") Reviewed-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Larysa Zaremba <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Aaron Ma <[email protected]> Date: Thu Jan 29 12:00:26 2026 +0800 ice: recap the VSI and QoS info after rebuild [ Upstream commit 6aa07e23dd3ccd35a0100c06fcb6b6c3b01e7965 ] Fix IRDMA hardware initialization timeout (-110) after resume by separating VSI-dependent configuration from RDMA resource allocation, ensuring VSI is rebuilt before IRDMA accesses it. After resume from suspend, IRDMA hardware initialization fails: ice: IRDMA hardware initialization FAILED init_state=4 status=-110 Separate RDMA initialization into two phases: 1. ice_init_rdma() - Allocate resources only (no VSI/QoS access, no plug) 2. ice_rdma_finalize_setup() - Assign VSI/QoS info and plug device This allows: - ice_init_rdma() to stay in ice_resume() (mirrors ice_deinit_rdma() in ice_suspend()) - VSI assignment deferred until after ice_vsi_rebuild() completes - QoS info updated after ice_dcb_rebuild() completes - Device plugged only when control queues, VSI, and DCB are all ready Fixes: bc69ad74867db ("ice: avoid IRQ collision to fix init failure on ACPI S3 resume") Reviewed-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Aaron Ma <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Brian Vazquez <[email protected]> Date: Mon Jan 26 21:55:59 2026 +0000 idpf: change IRQ naming to match netdev and ethtool queue numbering [ Upstream commit 1500a8662d2d41d6bb03e034de45ddfe6d7d362d ] The code uses the vidx for the IRQ name but that doesn't match ethtool reporting nor netdev naming, this makes it hard to tune the device and associate queues with IRQs. Sequentially requesting irqs starting from '0' makes the output consistent. This commit changes the interrupt numbering but preserves the name format, maintaining ABI compatibility. Existing tools relying on the old numbering are already non-functional, as they lack a useful correlation to the interrupts. Before: ethtool -L eth1 tx 1 combined 3 grep . /proc/irq/*/*idpf*/../smp_affinity_list /proc/irq/67/idpf-Mailbox-0/../smp_affinity_list:0-55,112-167 /proc/irq/68/idpf-eth1-TxRx-1/../smp_affinity_list:0 /proc/irq/70/idpf-eth1-TxRx-3/../smp_affinity_list:1 /proc/irq/71/idpf-eth1-TxRx-4/../smp_affinity_list:2 /proc/irq/72/idpf-eth1-Tx-5/../smp_affinity_list:3 ethtool -S eth1 | grep -v ': 0' NIC statistics: tx_q-0_pkts: 1002 tx_q-1_pkts: 2679 tx_q-2_pkts: 1113 tx_q-3_pkts: 1192 <----- tx_q-3 vs idpf-eth1-Tx-5 rx_q-0_pkts: 1143 rx_q-1_pkts: 3172 rx_q-2_pkts: 1074 After: ethtool -L eth1 tx 1 combined 3 grep . /proc/irq/*/*idpf*/../smp_affinity_list /proc/irq/67/idpf-Mailbox-0/../smp_affinity_list:0-55,112-167 /proc/irq/68/idpf-eth1-TxRx-0/../smp_affinity_list:0 /proc/irq/70/idpf-eth1-TxRx-1/../smp_affinity_list:1 /proc/irq/71/idpf-eth1-TxRx-2/../smp_affinity_list:2 /proc/irq/72/idpf-eth1-Tx-3/../smp_affinity_list:3 ethtool -S eth1 | grep -v ': 0' NIC statistics: tx_q-0_pkts: 118 tx_q-1_pkts: 134 tx_q-2_pkts: 228 tx_q-3_pkts: 138 <--- tx_q-3 matches idpf-eth1-Tx-3 rx_q-0_pkts: 111 rx_q-1_pkts: 366 rx_q-2_pkts: 120 Fixes: d4d558718266 ("idpf: initialize interrupts and enable vport") Signed-off-by: Brian Vazquez <[email protected]> Reviewed-by: Brett Creeley <[email protected]> Reviewed-by: Aleksandr Loktionov <[email protected]> Reviewed-by: Paul Menzel <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Tested-by: Samuel Salin <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Sreedevi Joshi <[email protected]> Date: Tue Jan 13 12:01:13 2026 -0600 idpf: Fix flow rule delete failure due to invalid validation [ Upstream commit 2c31557336a8e4d209ed8d4513cef2c0f15e7ef4 ] When deleting a flow rule using "ethtool -N <dev> delete <location>", idpf_sideband_action_ena() incorrectly validates fsp->ring_cookie even though ethtool doesn't populate this field for delete operations. The uninitialized ring_cookie may randomly match RX_CLS_FLOW_DISC or RX_CLS_FLOW_WAKE, causing validation to fail and preventing legitimate rule deletions. Remove the unnecessary sideband action enable check and ring_cookie validation during delete operations since action validation is not required when removing existing rules. Fixes: ada3e24b84a0 ("idpf: add flow steering support") Signed-off-by: Sreedevi Joshi <[email protected]> Reviewed-by: Aleksandr Loktionov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Paul Menzel <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Li Li <[email protected]> Date: Mon Jan 5 06:47:28 2026 +0000 idpf: increment completion queue next_to_clean in sw marker wait routine [ Upstream commit 712896ac4bce38a965a1c175f6e7804ed0381334 ] Currently, in idpf_wait_for_sw_marker_completion(), when an IDPF_TXD_COMPLT_SW_MARKER packet is found, the routine breaks out of the for loop and does not increment the next_to_clean counter. This causes the subsequent NAPI polls to run into the same IDPF_TXD_COMPLT_SW_MARKER packet again and print out the following: [ 23.261341] idpf 0000:05:00.0 eth1: Unknown TX completion type: 5 Instead, we should increment next_to_clean regardless when an IDPF_TXD_COMPLT_SW_MARKER packet is found. Tested: with the patch applied, we do not see the errors above from NAPI polls anymore. Fixes: 9d39447051a0 ("idpf: remove SW marker handling from NAPI") Signed-off-by: Li Li <[email protected]> Reviewed-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Vivek Behera <[email protected]> Date: Thu Jan 22 15:16:52 2026 +0100 igb: Fix trigger of incorrect irq in igb_xsk_wakeup [ Upstream commit d4c13ab36273a8c318ba06799793cc1f5d9c6fa1 ] The current implementation in the igb_xsk_wakeup expects the Rx and Tx queues to share the same irq. This would lead to triggering of incorrect irq in split irq configuration. This patch addresses this issue which could impact environments with 2 active cpu cores or when the number of queues is reduced to 2 or less cat /proc/interrupts | grep eno2 167: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 0-edge eno2 168: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 1-edge eno2-rx-0 169: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 2-edge eno2-rx-1 170: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 3-edge eno2-tx-0 171: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 4-edge eno2-tx-1 Furthermore it uses the flags input argument to trigger either rx, tx or both rx and tx irqs as specified in the ndo_xsk_wakeup api documentation Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers") Signed-off-by: Vivek Behera <[email protected]> Reviewed-by: Aleksandr Loktionov <[email protected]> Suggested-by: Maciej Fijalkowski <[email protected]> Acked-by: Maciej Fijalkowski <[email protected]> Tested-by: Saritha Sanigani <[email protected]> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Eric Dumazet <[email protected]> Date: Fri Feb 27 17:26:03 2026 +0000 indirect_call_wrapper: do not reevaluate function pointer [ Upstream commit 710f5c76580306cdb9ec51fac8fcf6a8faff7821 ] We have an increasing number of READ_ONCE(xxx->function) combined with INDIRECT_CALL_[1234]() helpers. Unfortunately this forces INDIRECT_CALL_[1234]() to read xxx->function many times, which is not what we wanted. Fix these macros so that xxx->function value is not reloaded. $ scripts/bloat-o-meter -t vmlinux.0 vmlinux add/remove: 0/0 grow/shrink: 1/65 up/down: 122/-1084 (-962) Function old new delta ip_push_pending_frames 59 181 +122 ip6_finish_output 687 681 -6 __udp_enqueue_schedule_skb 1078 1072 -6 ioam6_output 2319 2312 -7 xfrm4_rcv_encap_finish2 64 56 -8 xfrm4_output 297 289 -8 vrf_ip_local_out 278 270 -8 vrf_ip6_local_out 278 270 -8 seg6_input_finish 64 56 -8 rpl_output 700 692 -8 ipmr_forward_finish 124 116 -8 ip_forward_finish 143 135 -8 ip6mr_forward2_finish 100 92 -8 ip6_forward_finish 73 65 -8 input_action_end_bpf 1091 1083 -8 dst_input 52 44 -8 __xfrm6_output 801 793 -8 __xfrm4_output 83 75 -8 bpf_input 500 491 -9 __tcp_check_space 530 521 -9 input_action_end_dt6 291 280 -11 vti6_tnl_xmit 1634 1622 -12 bpf_xmit 1203 1191 -12 rpl_input 497 483 -14 rawv6_send_hdrinc 1355 1341 -14 ndisc_send_skb 1030 1016 -14 ipv6_srh_rcv 1377 1363 -14 ip_send_unicast_reply 1253 1239 -14 ip_rcv_finish 226 212 -14 ip6_rcv_finish 300 286 -14 input_action_end_x_core 205 191 -14 input_action_end_x 355 341 -14 input_action_end_t 205 191 -14 input_action_end_dx6_finish 127 113 -14 input_action_end_dx4_finish 373 359 -14 input_action_end_dt4 426 412 -14 input_action_end_core 186 172 -14 input_action_end_b6_encap 292 278 -14 input_action_end_b6 198 184 -14 igmp6_send 1332 1318 -14 ip_sublist_rcv 864 848 -16 ip6_sublist_rcv 1091 1075 -16 ipv6_rpl_srh_rcv 1937 1920 -17 xfrm_policy_queue_process 1246 1228 -18 seg6_output_core 903 885 -18 mld_sendpack 856 836 -20 NF_HOOK 756 736 -20 vti_tunnel_xmit 1447 1426 -21 input_action_end_dx6 664 642 -22 input_action_end 1502 1480 -22 sock_sendmsg_nosec 134 111 -23 ip6mr_forward2 388 364 -24 sock_recvmsg_nosec 134 109 -25 seg6_input_core 836 810 -26 ip_send_skb 172 146 -26 ip_local_out 140 114 -26 ip6_local_out 140 114 -26 __sock_sendmsg 162 136 -26 __ip_queue_xmit 1196 1170 -26 __ip_finish_output 405 379 -26 ipmr_queue_fwd_xmit 373 346 -27 sock_recvmsg 173 145 -28 ip6_xmit 1635 1607 -28 xfrm_output_resume 1418 1389 -29 ip_build_and_send_pkt 625 591 -34 dst_output 504 432 -72 Total: Before=25217686, After=25216724, chg -0.00% Fixes: 283c16a2dfd3 ("indirect call wrappers: helpers to speed-up indirect calls of builtin") Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Eric Dumazet <[email protected]> Date: Wed Feb 25 20:35:45 2026 +0000 inet: annotate data-races around isk->inet_num [ Upstream commit 29252397bcc1e0a1f85e5c3bee59c325f5c26341 ] UDP/TCP lookups are using RCU, thus isk->inet_num accesses should use READ_ONCE() and WRITE_ONCE() where needed. Fixes: 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls") Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Minseong Kim <[email protected]> Date: Wed Jan 21 10:02:02 2026 -0800 Input: synaptics_i2c - guard polling restart in resume [ Upstream commit 870c2e7cd881d7a10abb91f2b38135622d9f9f65 ] synaptics_i2c_resume() restarts delayed work unconditionally, even when the input device is not opened. Guard the polling restart by taking the input device mutex and checking input_device_enabled() before re-queuing the delayed work. Fixes: eef3e4cab72ea ("Input: add driver for Synaptics I2C touchpad") Signed-off-by: Minseong Kim <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Marco Crivellari <[email protected]> Date: Thu Nov 6 15:19:54 2025 +0100 Input: synaptics_i2c - replace use of system_wq with system_dfl_wq [ Upstream commit b3ee88e27798f0e8dd3a81867804d693da74d57d ] Currently if a user enqueues a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. This patch continues the effort to refactor worqueue APIs, which has begun with the change introducing new workqueues and a new alloc_workqueue flag: commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq") commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag") This specific workload do not benefit from a per-cpu workqueue, so use the default unbound workqueue (system_dfl_wq) instead. Suggested-by: Tejun Heo <[email protected]> Signed-off-by: Marco Crivellari <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Stable-dep-of: 870c2e7cd881 ("Input: synaptics_i2c - guard polling restart in resume") Signed-off-by: Sasha Levin <[email protected]>
Author: Jinhui Guo <[email protected]> Date: Thu Jan 22 09:48:50 2026 +0800 iommu/vt-d: Skip dev-iotlb flush for inaccessible PCIe device without scalable mode [ Upstream commit 42662d19839f34735b718129ea200e3734b07e50 ] PCIe endpoints with ATS enabled and passed through to userspace (e.g., QEMU, DPDK) can hard-lock the host when their link drops, either by surprise removal or by a link fault. Commit 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected") adds pci_dev_is_disconnected() to devtlb_invalidation_with_pasid() so ATS invalidation is skipped only when the device is being safely removed, but it applies only when Intel IOMMU scalable mode is enabled. With scalable mode disabled or unsupported, a system hard-lock occurs when a PCIe endpoint's link drops because the Intel IOMMU waits indefinitely for an ATS invalidation that cannot complete. Call Trace: qi_submit_sync qi_flush_dev_iotlb __context_flush_dev_iotlb.part.0 domain_context_clear_one_cb pci_for_each_dma_alias device_block_translation blocking_domain_attach_dev iommu_deinit_device __iommu_group_remove_device iommu_release_device iommu_bus_notifier blocking_notifier_call_chain bus_notify device_del pci_remove_bus_device pci_stop_and_remove_bus_device pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist Commit 81e921fd3216 ("iommu/vt-d: Fix NULL domain on device release") adds intel_pasid_teardown_sm_context() to intel_iommu_release_device(), which calls qi_flush_dev_iotlb() and can also hard-lock the system when a PCIe endpoint's link drops. Call Trace: qi_submit_sync qi_flush_dev_iotlb __context_flush_dev_iotlb.part.0 intel_context_flush_no_pasid device_pasid_table_teardown pci_pasid_table_teardown pci_for_each_dma_alias intel_pasid_teardown_sm_context intel_iommu_release_device iommu_deinit_device __iommu_group_remove_device iommu_release_device iommu_bus_notifier blocking_notifier_call_chain bus_notify device_del pci_remove_bus_device pci_stop_and_remove_bus_device pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist Sometimes the endpoint loses connection without a link-down event (e.g., due to a link fault); killing the process (virsh destroy) then hard-locks the host. Call Trace: qi_submit_sync qi_flush_dev_iotlb __context_flush_dev_iotlb.part.0 domain_context_clear_one_cb pci_for_each_dma_alias device_block_translation blocking_domain_attach_dev __iommu_attach_device __iommu_device_set_domain __iommu_group_set_domain_internal iommu_detach_group vfio_iommu_type1_detach_group vfio_group_detach_container vfio_group_fops_release __fput pci_dev_is_disconnected() only covers safe-removal paths; pci_device_is_present() tests accessibility by reading vendor/device IDs and internally calls pci_dev_is_disconnected(). On a ConnectX-5 (8 GT/s, x2) this costs ~70 µs. Since __context_flush_dev_iotlb() is only called on {attach,release}_dev paths (not hot), add pci_device_is_present() there to skip inaccessible devices and avoid the hard-lock. Fixes: 37764b952e1b ("iommu/vt-d: Global devTLB flush when present context entry changed") Fixes: 81e921fd3216 ("iommu/vt-d: Fix NULL domain on device release") Cc: [email protected] Signed-off-by: Jinhui Guo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Corey Minyard <[email protected]> Date: Tue Jan 27 07:22:35 2026 -0600 ipmi: Fix use-after-free and list corruption on sender error commit 594c11d0e1d445f580898a2b8c850f2e3f099368 upstream. The analysis from Breno: When the SMI sender returns an error, smi_work() delivers an error response but then jumps back to restart without cleaning up properly: 1. intf->curr_msg is not cleared, so no new message is pulled 2. newmsg still points to the message, causing sender() to be called again with the same message 3. If sender() fails again, deliver_err_response() is called with the same recv_msg that was already queued for delivery This causes list_add corruption ("list_add double add") because the recv_msg is added to the user_msgs list twice. Subsequently, the corrupted list leads to use-after-free when the memory is freed and reused, and eventually a NULL pointer dereference when accessing recv_msg->done. The buggy sequence: sender() fails -> deliver_err_response(recv_msg) // recv_msg queued for delivery -> goto restart // curr_msg not cleared! sender() fails again (same message!) -> deliver_err_response(recv_msg) // tries to queue same recv_msg -> LIST CORRUPTION Fix this by freeing the message and setting it to NULL on a send error. Also, always free the newmsg on a send error, otherwise it will leak. Reported-by: Breno Leitao <[email protected]> Closes: https://lore.kernel.org/lkml/[email protected]/ Fixes: 9cf93a8fa9513 ("ipmi: Allow an SMI sender to return an error") Cc: [email protected] # 4.18 Reviewed-by: Breno Leitao <[email protected]> Signed-off-by: Corey Minyard <[email protected]> Signed-off-by: Breno Leitao <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Jakub Kicinski <[email protected]> Date: Sun Mar 1 11:45:48 2026 -0800 ipv6: fix NULL pointer deref in ip6_rt_get_dev_rcu() [ Upstream commit 2ffb4f5c2ccb2fa1c049dd11899aee7967deef5a ] l3mdev_master_dev_rcu() can return NULL when the slave device is being un-slaved from a VRF. All other callers deal with this, but we lost the fallback to loopback in ip6_rt_pcpu_alloc() -> ip6_rt_get_dev_rcu() with commit 4832c30d5458 ("net: ipv6: put host and anycast routes on device with address"). KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f] RIP: 0010:ip6_rt_pcpu_alloc (net/ipv6/route.c:1418) Call Trace: ip6_pol_route (net/ipv6/route.c:2318) fib6_rule_lookup (net/ipv6/fib6_rules.c:115) ip6_route_output_flags (net/ipv6/route.c:2607) vrf_process_v6_outbound (drivers/net/vrf.c:437) I was tempted to rework the un-slaving code to clear the flag first and insert synchronize_rcu() before we remove the upper. But looks like the explicit fallback to loopback_dev is an established pattern. And I guess avoiding the synchronize_rcu() is nice, too. Fixes: 4832c30d5458 ("net: ipv6: put host and anycast routes on device with address") Reviewed-by: David Ahern <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Nam Cao <[email protected]> Date: Thu Feb 12 12:41:25 2026 +0100 irqchip/sifive-plic: Fix frozen interrupt due to affinity setting [ Upstream commit 1072020685f4b81f6efad3b412cdae0bd62bb043 ] PLIC ignores interrupt completion message for disabled interrupt, explained by the specification: The PLIC signals it has completed executing an interrupt handler by writing the interrupt ID it received from the claim to the claim/complete register. The PLIC does not check whether the completion ID is the same as the last claim ID for that target. If the completion ID does not match an interrupt source that is currently enabled for the target, the completion is silently ignored. This caused problems in the past, because an interrupt can be disabled while still being handled and plic_irq_eoi() had no effect. That was fixed by checking if the interrupt is disabled, and if so enable it, before sending the completion message. That check is done with irqd_irq_disabled(). However, that is not sufficient because the enable bit for the handling hart can be zero despite irqd_irq_disabled(d) being false. This can happen when affinity setting is changed while a hart is still handling the interrupt. This problem is easily reproducible by dumping a large file to uart (which generates lots of interrupts) and at the same time keep changing the uart interrupt's affinity setting. The uart port becomes frozen almost instantaneously. Fix this by checking PLIC's enable bit instead of irqd_irq_disabled(). Fixes: cc9f04f9a84f ("irqchip/sifive-plic: Implement irq_set_affinity() for SMP host") Signed-off-by: Nam Cao <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Nathan Chancellor <[email protected]> Date: Wed Feb 25 15:02:51 2026 -0700 kbuild: Split .modinfo out from ELF_DETAILS commit 8678591b47469fe16357234efef9b260317b8be4 upstream. Commit 3e86e4d74c04 ("kbuild: keep .modinfo section in vmlinux.unstripped") added .modinfo to ELF_DETAILS while removing it from COMMON_DISCARDS, as it was needed in vmlinux.unstripped and ELF_DETAILS was present in all architecture specific vmlinux linker scripts. While this shuffle is fine for vmlinux, ELF_DETAILS and COMMON_DISCARDS may be used by other linker scripts, such as the s390 and x86 compressed boot images, which may not expect to have a .modinfo section. In certain circumstances, this could result in a bootloader failing to load the compressed kernel [1]. Commit ddc6cbef3ef1 ("s390/boot/vmlinux.lds.S: Ensure bzImage ends with SecureBoot trailer") recently addressed this for the s390 bzImage but the same bug remains for arm, parisc, and x86. The presence of .modinfo in the x86 bzImage was the root cause of the issue worked around with commit d50f21091358 ("kbuild: align modinfo section for Secureboot Authenticode EDK2 compat"). misc.c in arch/x86/boot/compressed includes lib/decompress_unzstd.c, which in turn includes lib/xxhash.c and its MODULE_LICENSE / MODULE_DESCRIPTION macros due to the STATIC definition. Split .modinfo out from ELF_DETAILS into its own macro and handle it in all vmlinux linker scripts. Discard .modinfo in the places where it was previously being discarded from being in COMMON_DISCARDS, as it has never been necessary in those uses. Cc: [email protected] Fixes: 3e86e4d74c04 ("kbuild: keep .modinfo section in vmlinux.unstripped") Reported-by: Ed W <[email protected]> Closes: https://lore.kernel.org/[email protected]/ [1] Tested-by: Ed W <[email protected]> # x86_64 Link: https://patch.msgid.link/20260225-separate-modinfo-from-elf-details-v1-1-387ced6baf4b@kernel.org Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Wake Liu <[email protected]> Date: Wed Dec 24 16:41:20 2025 +0800 kselftest/harness: Use helper to avoid zero-size memset warning [ Upstream commit 19b8a76cd99bde6d299e60490f3e62b8d3df3997 ] When building kselftests with a toolchain that enables source fortification (e.g., Android's build environment, which uses -D_FORTIFY_SOURCE=3), a build failure occurs in tests that use an empty FIXTURE(). The root cause is that an empty fixture struct results in `sizeof(self_private)` evaluating to 0. The compiler's fortification checks then detect the `memset()` call with a compile-time constant size of 0, issuing a `-Wuser-defined-warnings` which is promoted to an error by `-Werror`. An initial attempt to guard the call with `if (sizeof(self_private) > 0)` was insufficient. The compiler's static analysis is aggressive enough to flag the `memset(..., 0)` pattern before evaluating the conditional, thus still triggering the error. To resolve this robustly, this change introduces a `static inline` helper function, `__kselftest_memset_safe()`. This function wraps the size check and the `memset()` call. By replacing the direct `memset()` in the `__TEST_F_IMPL` macro with a call to this helper, we create an abstraction boundary. This prevents the compiler's static analyzer from "seeing" the problematic pattern at the macro expansion site, resolving the build failure. Build Context: Compiler: Android (14488419, +pgo, +bolt, +lto, +mlgo, based on r584948) clang version 22.0.0 (https://android.googlesource.com/toolchain/llvm-project 2d65e4108033380e6fe8e08b1f1826cd2bfb0c99) Relevant Options: -O2 -Wall -Werror -D_FORTIFY_SOURCE=3 -target i686-linux-android10000 Test: m kselftest_futex_futex_requeue_pi Removed Gerrit Change-Id Shuah Khan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Wake Liu <[email protected]> Signed-off-by: Shuah Khan <[email protected]> Stable-dep-of: 6be268151426 ("selftests/harness: order TEST_F and XFAIL_ADD constructors") Signed-off-by: Sasha Levin <[email protected]>
Author: Fedor Pchelkin <[email protected]> Date: Tue Feb 24 20:49:44 2026 -0500 ksmbd: call ksmbd_vfs_kern_path_end_removing() on some error paths [ Upstream commit a09dc10d1353f0e92c21eae2a79af1c2b1ddcde8 ] There are two places where ksmbd_vfs_kern_path_end_removing() needs to be called in order to balance what the corresponding successful call to ksmbd_vfs_kern_path_start_removing() has done, i.e. drop inode locks and put the taken references. Otherwise there might be potential deadlocks and unbalanced locks which are caught like: BUG: workqueue leaked lock or atomic: kworker/5:21/0x00000000/7596 last function: handle_ksmbd_work 2 locks held by kworker/5:21/7596: #0: ffff8881051ae448 (sb_writers#3){.+.+}-{0:0}, at: ksmbd_vfs_kern_path_locked+0x142/0x660 #1: ffff888130e966c0 (&type->i_mutex_dir_key#3/1){+.+.}-{4:4}, at: ksmbd_vfs_kern_path_locked+0x17d/0x660 CPU: 5 PID: 7596 Comm: kworker/5:21 Not tainted 6.1.162-00456-gc29b353f383b #138 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014 Workqueue: ksmbd-io handle_ksmbd_work Call Trace: <TASK> dump_stack_lvl+0x44/0x5b process_one_work.cold+0x57/0x5c worker_thread+0x82/0x600 kthread+0x153/0x190 ret_from_fork+0x22/0x30 </TASK> Found by Linux Verification Center (linuxtesting.org). Fixes: d5fc1400a34b ("smb/server: avoid deadlock when linking with ReplaceIfExists") Cc: [email protected] Signed-off-by: Fedor Pchelkin <[email protected]> Acked-by: Namjae Jeon <[email protected]> Signed-off-by: Steve French <[email protected]> [ ksmbd_vfs_kern_path_end_removing() -> ksmbd_vfs_kern_path_unlock() ] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Shuvam Pandey <[email protected]> Date: Thu Feb 26 21:14:10 2026 +0545 kunit: tool: copy caller args in run_kernel to prevent mutation [ Upstream commit 40804c4974b8df2adab72f6475d343eaff72b7f6 ] run_kernel() appended KUnit flags directly to the caller-provided args list. When exec_tests() calls run_kernel() repeatedly (e.g. with --run_isolated), each call mutated the same list, causing later runs to inherit stale filter_glob values and duplicate kunit.enable flags. Fix this by copying args at the start of run_kernel(). Add a regression test that calls run_kernel() twice with the same list and verifies the original remains unchanged. Fixes: ff9e09a3762f ("kunit: tool: support running each suite/test separately") Signed-off-by: Shuvam Pandey <[email protected]> Reviewed-by: David Gow <[email protected]> Signed-off-by: Shuah Khan <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Fuad Tabba <[email protected]> Date: Fri Feb 13 14:38:14 2026 +0000 KVM: arm64: Fix ID register initialization for non-protected pKVM guests [ Upstream commit 7e7c2cf0024d89443a7af52e09e47b1fe634ab17 ] In protected mode, the hypervisor maintains a separate instance of the `kvm` structure for each VM. For non-protected VMs, this structure is initialized from the host's `kvm` state. Currently, `pkvm_init_features_from_host()` copies the `KVM_ARCH_FLAG_ID_REGS_INITIALIZED` flag from the host without the underlying `id_regs` data being initialized. This results in the hypervisor seeing the flag as set while the ID registers remain zeroed. Consequently, `kvm_has_feat()` checks at EL2 fail (return 0) for non-protected VMs. This breaks logic that relies on feature detection, such as `ctxt_has_tcrx()` for TCR2_EL1 support. As a result, certain system registers (e.g., TCR2_EL1, PIR_EL1, POR_EL1) are not saved/restored during the world switch, which could lead to state corruption. Fix this by explicitly copying the ID registers from the host `kvm` to the hypervisor `kvm` for non-protected VMs during initialization, since we trust the host with its non-protected guests' features. Also ensure `KVM_ARCH_FLAG_ID_REGS_INITIALIZED` is cleared initially in `pkvm_init_features_from_host` so that `vm_copy_id_regs` can properly initialize them and set the flag once done. Fixes: 41d6028e28bd ("KVM: arm64: Convert the SVE guest vcpu flag to a vm flag") Signed-off-by: Fuad Tabba <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Fuad Tabba <[email protected]> Date: Fri Feb 13 14:38:12 2026 +0000 KVM: arm64: Hide S1POE from guests when not supported by the host [ Upstream commit f66857bafd4f151c5cc6856e47be2e12c1721e43 ] When CONFIG_ARM64_POE is disabled, KVM does not save/restore POR_EL1. However, ID_AA64MMFR3_EL1 sanitisation currently exposes the feature to guests whenever the hardware supports it, ignoring the host kernel configuration. If a guest detects this feature and attempts to use it, the host will fail to context-switch POR_EL1, potentially leading to state corruption. Fix this by masking ID_AA64MMFR3_EL1.S1POE in the sanitised system registers, preventing KVM from advertising the feature when the host does not support it (i.e. system_supports_poe() is false). Fixes: 70ed7238297f ("KVM: arm64: Sanitise ID_AA64MMFR3_EL1") Signed-off-by: Fuad Tabba <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Khushit Shah <[email protected]> Date: Fri Jan 23 12:56:25 2026 +0000 KVM: x86: Add x2APIC "features" to control EOI broadcast suppression [ Upstream commit 6517dfbcc918f970a928d9dc17586904bac06893 ] Add two flags for KVM_CAP_X2APIC_API to allow userspace to control support for Suppress EOI Broadcasts when using a split IRQCHIP (I/O APIC emulated by userspace), which KVM completely mishandles. When x2APIC support was first added, KVM incorrectly advertised and "enabled" Suppress EOI Broadcast, without fully supporting the I/O APIC side of the equation, i.e. without adding directed EOI to KVM's in-kernel I/O APIC. That flaw was carried over to split IRQCHIP support, i.e. KVM advertised support for Suppress EOI Broadcasts irrespective of whether or not the userspace I/O APIC implementation supported directed EOIs. Even worse, KVM didn't actually suppress EOI broadcasts, i.e. userspace VMMs without support for directed EOI came to rely on the "spurious" broadcasts. KVM "fixed" the in-kernel I/O APIC implementation by completely disabling support for Suppress EOI Broadcasts in commit 0bcc3fb95b97 ("KVM: lapic: stop advertising DIRECTED_EOI when in-kernel IOAPIC is in use"), but didn't do anything to remedy userspace I/O APIC implementations. KVM's bogus handling of Suppress EOI Broadcast is problematic when the guest relies on interrupts being masked in the I/O APIC until well after the initial local APIC EOI. E.g. Windows with Credential Guard enabled handles interrupts in the following order: 1. Interrupt for L2 arrives. 2. L1 APIC EOIs the interrupt. 3. L1 resumes L2 and injects the interrupt. 4. L2 EOIs after servicing. 5. L1 performs the I/O APIC EOI. Because KVM EOIs the I/O APIC at step #2, the guest can get an interrupt storm, e.g. if the IRQ line is still asserted and userspace reacts to the EOI by re-injecting the IRQ, because the guest doesn't de-assert the line until step #4, and doesn't expect the interrupt to be re-enabled until step #5. Unfortunately, simply "fixing" the bug isn't an option, as KVM has no way of knowing if the userspace I/O APIC supports directed EOIs, i.e. suppressing EOI broadcasts would result in interrupts being stuck masked in the userspace I/O APIC due to step #5 being ignored by userspace. And fully disabling support for Suppress EOI Broadcast is also undesirable, as picking up the fix would require a guest reboot, *and* more importantly would change the virtual CPU model exposed to the guest without any buy-in from userspace. Add KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST and KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST flags to allow userspace to explicitly enable or disable support for Suppress EOI Broadcasts. This gives userspace control over the virtual CPU model exposed to the guest, as KVM should never have enabled support for Suppress EOI Broadcast without userspace opt-in. Not setting either flag will result in legacy quirky behavior for backward compatibility. Disallow fully enabling SUPPRESS_EOI_BROADCAST when using an in-kernel I/O APIC, as KVM's history/support is just as tragic. E.g. it's not clear that commit c806a6ad35bf ("KVM: x86: call irq notifiers with directed EOI") was entirely correct, i.e. it may have simply papered over the lack of Directed EOI emulation in the I/O APIC. Note, Suppress EOI Broadcasts is defined only in Intel's SDM, not in AMD's APM. But the bit is writable on some AMD CPUs, e.g. Turin, and KVM's ABI is to support Directed EOI (KVM's name) irrespective of guest CPU vendor. Fixes: 7543a635aa09 ("KVM: x86: Add KVM exit for IOAPIC EOIs") Closes: https://lore.kernel.org/kvm/[email protected] Cc: [email protected] Suggested-by: David Woodhouse <[email protected]> Signed-off-by: Khushit Shah <[email protected]> Link: https://patch.msgid.link/[email protected] [sean: clean up minor formatting goofs and fix a comment typo] Co-developed-by: Sean Christopherson <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Sean Christopherson <[email protected]> Date: Thu Jan 8 19:06:57 2026 -0800 KVM: x86: Ignore -EBUSY when checking nested events from vcpu_block() [ Upstream commit ead63640d4e72e6f6d464f4e31f7fecb79af8869 ] Ignore -EBUSY when checking nested events after exiting a blocking state while L2 is active, as exiting to userspace will generate a spurious userspace exit, usually with KVM_EXIT_UNKNOWN, and likely lead to the VM's demise. Continuing with the wakeup isn't perfect either, as *something* has gone sideways if a vCPU is awakened in L2 with an injected event (or worse, a nested run pending), but continuing on gives the VM a decent chance of surviving without any major side effects. As explained in the Fixes commits, it _should_ be impossible for a vCPU to be put into a blocking state with an already-injected event (exception, IRQ, or NMI). Unfortunately, userspace can stuff MP_STATE and/or injected events, and thus put the vCPU into what should be an impossible state. Don't bother trying to preserve the WARN, e.g. with an anti-syzkaller Kconfig, as WARNs can (hopefully) be added in paths where _KVM_ would be violating x86 architecture, e.g. by WARNing if KVM attempts to inject an exception or interrupt while the vCPU isn't running. Cc: Alessandro Ratti <[email protected]> Cc: [email protected] Fixes: 26844fee6ade ("KVM: x86: never write to memory from kvm_vcpu_check_block()") Fixes: 45405155d876 ("KVM: x86: WARN if a vCPU gets a valid wakeup that KVM can't yet inject") Link: https://syzkaller.appspot.com/text?tag=ReproC&x=10d4261a580000 Reported-by: [email protected] Closes: https://lore.kernel.org/all/[email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Michal Swiatkowski <[email protected]> Date: Wed Feb 11 10:11:40 2026 +0100 libie: don't unroll if fwlog isn't supported [ Upstream commit 636cc3bd12f499c74eaf5dc9a7d5b832f1bb24ed ] The libie_fwlog_deinit() function can be called during driver unload even when firmware logging was never properly initialized. This led to call trace: [ 148.576156] Oops: Oops: 0000 [#1] SMP NOPTI [ 148.576167] CPU: 80 UID: 0 PID: 12843 Comm: rmmod Kdump: loaded Not tainted 6.17.0-rc7next-queue-3oct-01915-g06d79d51cf51 #1 PREEMPT(full) [ 148.576177] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 07/18/2020 [ 148.576182] RIP: 0010:__dev_printk+0x16/0x70 [ 148.576196] Code: 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 55 41 54 49 89 d4 55 48 89 fd 53 48 85 f6 74 3c <4c> 8b 6e 50 48 89 f3 4d 85 ed 75 03 4c 8b 2e 48 89 df e8 f3 27 98 [ 148.576204] RSP: 0018:ffffd2fd7ea17a48 EFLAGS: 00010202 [ 148.576211] RAX: ffffd2fd7ea17aa0 RBX: ffff8eb288ae2000 RCX: 0000000000000000 [ 148.576217] RDX: ffffd2fd7ea17a70 RSI: 00000000000000c8 RDI: ffffffffb68d3d88 [ 148.576222] RBP: ffffffffb68d3d88 R08: 0000000000000000 R09: 0000000000000000 [ 148.576227] R10: 00000000000000c8 R11: ffff8eb2b1a49400 R12: ffffd2fd7ea17a70 [ 148.576231] R13: ffff8eb3141fb000 R14: ffffffffc1215b48 R15: ffffffffc1215bd8 [ 148.576236] FS: 00007f5666ba6740(0000) GS:ffff8eb2472b9000(0000) knlGS:0000000000000000 [ 148.576242] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 148.576247] CR2: 0000000000000118 CR3: 000000011ad17000 CR4: 0000000000350ef0 [ 148.576252] Call Trace: [ 148.576258] <TASK> [ 148.576269] _dev_warn+0x7c/0x96 [ 148.576290] libie_fwlog_deinit+0x112/0x117 [libie_fwlog] [ 148.576303] ixgbe_remove+0x63/0x290 [ixgbe] [ 148.576342] pci_device_remove+0x42/0xb0 [ 148.576354] device_release_driver_internal+0x19c/0x200 [ 148.576365] driver_detach+0x48/0x90 [ 148.576372] bus_remove_driver+0x6d/0xf0 [ 148.576383] pci_unregister_driver+0x2e/0xb0 [ 148.576393] ixgbe_exit_module+0x1c/0xd50 [ixgbe] [ 148.576430] __do_sys_delete_module.isra.0+0x1bc/0x2e0 [ 148.576446] do_syscall_64+0x7f/0x980 It can be reproduced by trying to unload ixgbe driver in recovery mode. Fix that by checking if fwlog is supported before doing unroll. Fixes: 641585bc978e ("ixgbe: fwlog support for e610") Reviewed-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Michal Swiatkowski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Rinitha S <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Sasha Levin <[email protected]> Date: Thu Mar 12 07:14:22 2026 -0400 Linux 6.18.17 Tested-by: Brett A C Sheffield <[email protected]> Tested-by: Jon Hunter <[email protected]> Tested-by: Dileep malepu <[email protected]> Tested-by: Jeffrin Jose T <[email protected]> Tested-by: Florian Fainelli <[email protected]> Tested-by: Ron Economos <[email protected]> Tested-by: Peter Schneider <[email protected]> Tested-by: Mark Brown <[email protected]> Tested-by: Shuah Khan <[email protected]> Tested-by: Barry K. Nathan <[email protected]> Tested-by: Miguel Ojeda <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Tiezhu Yang <[email protected]> Date: Tue Feb 10 19:31:13 2026 +0800 LoongArch: Handle percpu handler address for ORC unwinder [ Upstream commit 055c7e75190e0be43037bd663a3f6aced194416e ] After commit 4cd641a79e69 ("LoongArch: Remove unnecessary checks for ORC unwinder"), the system can not boot normally under some configs (such as enable KASAN), there are many error messages "cannot find unwind pc". The kernel boots normally with the defconfig, so no problem found out at the first time. Here is one way to reproduce: cd linux make mrproper defconfig -j"$(nproc)" scripts/config -e KASAN make olddefconfig all -j"$(nproc)" sudo make modules_install sudo make install sudo reboot The address that can not unwind is not a valid kernel address which is between "pcpu_handlers[cpu]" and "pcpu_handlers[cpu] + vec_sz" due to the code of eentry was copied to the new area of pcpu_handlers[cpu] in setup_tlb_handler(), handle this special case to get the valid address to unwind normally. Cc: [email protected] Signed-off-by: Tiezhu Yang <[email protected]> Signed-off-by: Huacai Chen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Tiezhu Yang <[email protected]> Date: Tue Feb 10 19:31:14 2026 +0800 LoongArch: Remove some extern variables in source files [ Upstream commit 0e6f596d6ac635e80bb265d587b2287ef8fa1cd6 ] There are declarations of the variable "eentry", "pcpu_handlers[]" and "exception_handlers[]" in asm/setup.h, the source files already include this header file directly or indirectly, so no need to declare them in the source files, just remove the code. Cc: [email protected] Signed-off-by: Tiezhu Yang <[email protected]> Signed-off-by: Huacai Chen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Tiezhu Yang <[email protected]> Date: Wed Dec 31 15:19:19 2025 +0800 LoongArch: Remove unnecessary checks for ORC unwinder [ Upstream commit 4cd641a79e69270a062777f64a0dd330abb9044a ] According to the following function definitions, __kernel_text_address() already checks __module_text_address(), so it should remove the check of __module_text_address() in bt_address() at least. int __kernel_text_address(unsigned long addr) { if (kernel_text_address(addr)) return 1; ... return 0; } int kernel_text_address(unsigned long addr) { bool no_rcu; int ret = 1; ... if (is_module_text_address(addr)) goto out; ... return ret; } bool is_module_text_address(unsigned long addr) { guard(rcu)(); return __module_text_address(addr) != NULL; } Furthermore, there are two checks of __kernel_text_address(), one is in bt_address() and the other is after calling bt_address(), it looks like redundant. Handle the exception address first and then use __kernel_text_address() to validate the calculated address for exception or the normal address in bt_address(), then it can remove the check of __kernel_text_address() after calling bt_address(). Just remove unnecessary checks, no functional changes intended. Signed-off-by: Tiezhu Yang <[email protected]> Signed-off-by: Huacai Chen <[email protected]> Stable-dep-of: 055c7e75190e ("LoongArch: Handle percpu handler address for ORC unwinder") Signed-off-by: Sasha Levin <[email protected]>
Author: Jens Axboe <[email protected]> Date: Tue Feb 24 11:51:16 2026 -0700 media: dvb-core: fix wrong reinitialization of ringbuffer on reopen commit bfbc0b5b32a8f28ce284add619bf226716a59bc0 upstream. dvb_dvr_open() calls dvb_ringbuffer_init() when a new reader opens the DVR device. dvb_ringbuffer_init() calls init_waitqueue_head(), which reinitializes the waitqueue list head to empty. Since dmxdev->dvr_buffer.queue is a shared waitqueue (all opens of the same DVR device share it), this orphans any existing waitqueue entries from io_uring poll or epoll, leaving them with stale prev/next pointers while the list head is reset to {self, self}. The waitqueue and spinlock in dvr_buffer are already properly initialized once in dvb_dmxdev_init(). The open path only needs to reset the buffer data pointer, size, and read/write positions. Replace the dvb_ringbuffer_init() call in dvb_dvr_open() with direct assignment of data/size and a call to dvb_ringbuffer_reset(), which properly resets pread, pwrite, and error with correct memory ordering without touching the waitqueue or spinlock. Cc: [email protected] Fixes: 34731df288a5f ("V4L/DVB (3501): Dmxdev: use dvb_ringbuffer") Reported-by: [email protected] Tested-by: [email protected] Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Dikshita Agarwal <[email protected]> Date: Thu Dec 18 12:24:09 2025 +0530 media: iris: Add missing platform data entries for SM8750 [ Upstream commit bbef55f414100853d5bcea56a41f8b171bac8fcb ] Two platform-data fields for SM8750 were missed: - get_vpu_buffer_size = iris_vpu33_buf_size Without this, the driver fails to allocate the required internal buffers, leading to basic decode/encode failures during session bring-up. - max_core_mbps = ((7680 * 4320) / 256) * 60 Without this capability exposed, capability checks are incomplete and v4l2-compliance for encoder fails. Fixes: a5925a2ce077 ("media: iris: add VPU33 specific encoding buffer calculation") Fixes: a6882431a138 ("media: iris: Add support for ENUM_FRAMESIZES/FRAMEINTERVALS for encoder") Cc: [email protected] Signed-off-by: Dikshita Agarwal <[email protected]> Reviewed-by: Vikash Garodia <[email protected]> Reviewed-by: Konrad Dybcio <[email protected]> Signed-off-by: Bryan O'Donoghue <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Dikshita Agarwal <[email protected]> Date: Sun Nov 2 09:10:19 2025 +0530 media: iris: remove v4l2_m2m_ioctl_{de,en}coder_cmd API usage during STOP handling [ Upstream commit 8fc707d13df517222db12b465af4aa9df05c99e1 ] Currently v4l2_m2m_ioctl_{de,enc}coder_cmd is being invoked during STOP command handling. However, this is not required as the iris driver has its own drain and stop handling mechanism in place. Using the m2m command API in this context leads to incorrect behavior, where the LAST flag is prematurely attached to a capture buffer, when there are no buffers in m2m source queue. But, in this scenario even though the source buffers are returned to client, hardware might still need to process the pending capture buffers. Attaching LAST flag prematurely can result in the capture buffer being removed from the destination queue before the hardware has finished processing it, causing issues when the buffer is eventually returned by the hardware. To prevent this, remove the m2m API usage in stop handling. Fixes: d09100763bed ("media: iris: add support for drain sequence") Fixes: 75db90ae067d ("media: iris: Add support for drain sequence in encoder video device") Signed-off-by: Dikshita Agarwal <[email protected]> Reviewed-by: Vikash Garodia <[email protected]> Cc: [email protected] Signed-off-by: Bryan O'Donoghue <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Zilin Guan <[email protected]> Date: Fri Nov 14 09:12:57 2025 +0000 media: tegra-video: Fix memory leak in __tegra_channel_try_format() [ Upstream commit 43e5302d22334f1183dec3e0d5d8007eefe2817c ] The state object allocated by __v4l2_subdev_state_alloc() must be freed with __v4l2_subdev_state_free() when it is no longer needed. In __tegra_channel_try_format(), two error paths return directly after v4l2_subdev_call() fails, without freeing the allocated 'sd_state' object. This violates the requirement and causes a memory leak. Fix this by introducing a cleanup label and using goto statements in the error paths to ensure that __v4l2_subdev_state_free() is always called before the function returns. Fixes: 56f64b82356b7 ("media: tegra-video: Use zero crop settings if subdev has no get_selection") Fixes: 1ebaeb09830f3 ("media: tegra-video: Add support for external sensor capture") Cc: [email protected] Signed-off-by: Zilin Guan <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Johan Hovold <[email protected]> Date: Fri Nov 21 17:46:23 2025 +0100 memory: mtk-smi: fix device leak on larb probe [ Upstream commit 9dae65913b32d05dbc8ff4b8a6bf04a0e49a8eb6 ] Make sure to drop the reference taken when looking up the SMI device during larb probe on late probe failure (e.g. probe deferral) and on driver unbind. Fixes: cc8bbe1a8312 ("memory: mediatek: Add SMI driver") Fixes: 038ae37c510f ("memory: mtk-smi: add missing put_device() call in mtk_smi_device_link_common") Cc: [email protected] # 4.6: 038ae37c510f Cc: [email protected] # 4.6 Cc: Yong Wu <[email protected]> Cc: Miaoqian Lin <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Johan Hovold <[email protected]> Date: Fri Nov 21 17:46:22 2025 +0100 memory: mtk-smi: fix device leaks on common probe [ Upstream commit 6cfa038bddd710f544076ea2ef7792fc82fbedd6 ] Make sure to drop the reference taken when looking up the SMI device during common probe on late probe failure (e.g. probe deferral) and on driver unbind. Fixes: 47404757702e ("memory: mtk-smi: Add device link for smi-sub-common") Fixes: 038ae37c510f ("memory: mtk-smi: add missing put_device() call in mtk_smi_device_link_common") Cc: [email protected] # 5.16: 038ae37c510f Cc: [email protected] # 5.16 Cc: Yong Wu <[email protected]> Cc: Miaoqian Lin <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Harry Yoo <[email protected]> Date: Tue Feb 10 17:19:00 2026 +0900 mm/slab: use prandom if !allow_spin [ Upstream commit a1e244a9f177894969c6cd5ebbc6d72c19fc4a7a ] When CONFIG_SLAB_FREELIST_RANDOM is enabled and get_random_u32() is called in an NMI context, lockdep complains because it acquires a local_lock: ================================ WARNING: inconsistent lock state 6.19.0-rc5-slab-for-next+ #325 Tainted: G N -------------------------------- inconsistent {INITIAL USE} -> {IN-NMI} usage. kunit_try_catch/8312 [HC2[2]:SC0[0]:HE0:SE1] takes: ffff88a02ec49cc0 (batched_entropy_u32.lock){-.-.}-{3:3}, at: get_random_u32+0x7f/0x2e0 {INITIAL USE} state was registered at: lock_acquire+0xd9/0x2f0 get_random_u32+0x93/0x2e0 __get_random_u32_below+0x17/0x70 cache_random_seq_create+0x121/0x1c0 init_cache_random_seq+0x5d/0x110 do_kmem_cache_create+0x1e0/0xa30 __kmem_cache_create_args+0x4ec/0x830 create_kmalloc_caches+0xe6/0x130 kmem_cache_init+0x1b1/0x660 mm_core_init+0x1d8/0x4b0 start_kernel+0x620/0xcd0 x86_64_start_reservations+0x18/0x30 x86_64_start_kernel+0xf3/0x140 common_startup_64+0x13e/0x148 irq event stamp: 76 hardirqs last enabled at (75): [<ffffffff8298b77a>] exc_nmi+0x11a/0x240 hardirqs last disabled at (76): [<ffffffff8298b991>] sysvec_irq_work+0x11/0x110 softirqs last enabled at (0): [<ffffffff813b2dda>] copy_process+0xc7a/0x2350 softirqs last disabled at (0): [<0000000000000000>] 0x0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(batched_entropy_u32.lock); <Interrupt> lock(batched_entropy_u32.lock); *** DEADLOCK *** Fix this by using pseudo-random number generator if !allow_spin. This means kmalloc_nolock() users won't get truly random numbers, but there is not much we can do about it. Note that an NMI handler might interrupt prandom_u32_state() and change the random state, but that's safe. Link: https://lore.kernel.org/all/[email protected] Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().") Cc: [email protected] Signed-off-by: Harry Yoo <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Vlastimil Babka <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Deepanshu Kartikey <[email protected]> Date: Sat Feb 14 05:45:35 2026 +0530 mm: thp: deny THP for files on anonymous inodes commit dd085fe9a8ebfc5d10314c60452db38d2b75e609 upstream. file_thp_enabled() incorrectly allows THP for files on anonymous inodes (e.g. guest_memfd and secretmem). These files are created via alloc_file_pseudo(), which does not call get_write_access() and leaves inode->i_writecount at 0. Combined with S_ISREG(inode->i_mode) being true, they appear as read-only regular files when CONFIG_READ_ONLY_THP_FOR_FS is enabled, making them eligible for THP collapse. Anonymous inodes can never pass the inode_is_open_for_write() check since their i_writecount is never incremented through the normal VFS open path. The right thing to do is to exclude them from THP eligibility altogether, since CONFIG_READ_ONLY_THP_FOR_FS was designed for real filesystem files (e.g. shared libraries), not for pseudo-filesystem inodes. For guest_memfd, this allows khugepaged and MADV_COLLAPSE to create large folios in the page cache via the collapse path, but the guest_memfd fault handler does not support large folios. This triggers WARN_ON_ONCE(folio_test_large(folio)) in kvm_gmem_fault_user_mapping(). For secretmem, collapse_file() tries to copy page contents through the direct map, but secretmem pages are removed from the direct map. This can result in a kernel crash: BUG: unable to handle page fault for address: ffff88810284d000 RIP: 0010:memcpy_orig+0x16/0x130 Call Trace: collapse_file hpage_collapse_scan_file madvise_collapse Secretmem is not affected by the crash on upstream as the memory failure recovery handles the failed copy gracefully, but it still triggers confusing false memory failure reports: Memory failure: 0x106d96f: recovery action for clean unevictable LRU page: Recovered Check IS_ANON_FILE(inode) in file_thp_enabled() to deny THP for all anonymous inode files. Link: https://syzkaller.appspot.com/bug?extid=33a04338019ac7e43a44 Link: https://lore.kernel.org/linux-mm/CAEvNRgHegcz3ro35ixkDw39ES8=U6rs6S7iP0gkR9enr7HoGtA@mail.gmail.com Link: https://lkml.kernel.org/r/[email protected] Fixes: 7fbb5e188248 ("mm: remove VM_EXEC requirement for THP eligibility") Signed-off-by: Deepanshu Kartikey <[email protected]> Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=33a04338019ac7e43a44 Tested-by: [email protected] Tested-by: Lance Yang <[email protected]> Acked-by: David Hildenbrand (Arm) <[email protected]> Reviewed-by: Barry Song <[email protected]> Reviewed-by: Ackerley Tng <[email protected]> Tested-by: Ackerley Tng <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Cc: Baolin Wang <[email protected]> Cc: Dev Jain <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Nico Pache <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Petr Pavlu <[email protected]> Date: Wed Jan 7 13:22:57 2026 +0100 module: Remove duplicate freeing of lockdep classes [ Upstream commit a7b4bc094fbaa7dc7b7b91ae33549bbd7eefaac1 ] In the error path of load_module(), under the free_module label, the code calls lockdep_free_key_range() to release lock classes associated with the MOD_DATA, MOD_RODATA and MOD_RO_AFTER_INIT module regions, and subsequently invokes module_deallocate(). Since commit ac3b43283923 ("module: replace module_layout with module_memory"), the module_deallocate() function calls free_mod_mem(), which releases the lock classes as well and considers all module regions. Attempting to free these classes twice is unnecessary. Remove the redundant code in load_module(). Fixes: ac3b43283923 ("module: replace module_layout with module_memory") Signed-off-by: Petr Pavlu <[email protected]> Reviewed-by: Daniel Gomez <[email protected]> Reviewed-by: Aaron Tomlin <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Sami Tolvanen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Matthieu Baerts (NGI0) <[email protected]> Date: Tue Mar 3 11:56:03 2026 +0100 mptcp: pm: avoid sending RM_ADDR over same subflow commit fb8d0bccb221080630efcd9660c9f9349e53cc9e upstream. RM_ADDR are sent over an active subflow, the first one in the subflows list. There is then a high chance the initial subflow is picked. With the in-kernel PM, when an endpoint is removed, a RM_ADDR is sent, then linked subflows are closed. This is done for each active MPTCP connection. MPTCP endpoints are likely removed because the attached network is no longer available or usable. In this case, it is better to avoid sending this RM_ADDR over the subflow that is going to be removed, but prefer sending it over another active and non stale subflow, if any. This modification avoids situations where the other end is not notified when a subflow is no longer usable: typically when the endpoint linked to the initial subflow is removed, especially on the server side. Fixes: 8dd5efb1f91b ("mptcp: send ack for rm_addr") Cc: [email protected] Reported-by: Frank Lorenz <[email protected]> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/612 Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Link: https://patch.msgid.link/20260303-net-mptcp-misc-fixes-7-0-rc2-v1-2-4b5462b6f016@kernel.org Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Matthieu Baerts (NGI0) <[email protected]> Date: Tue Mar 3 11:56:05 2026 +0100 mptcp: pm: in-kernel: always mark signal+subflow endp as used commit 579a752464a64cb5f9139102f0e6b90a1f595ceb upstream. Syzkaller managed to find a combination of actions that was generating this warning: msk->pm.local_addr_used == 0 WARNING: net/mptcp/pm_kernel.c:1071 at __mark_subflow_endp_available net/mptcp/pm_kernel.c:1071 [inline], CPU#1: syz.2.17/961 WARNING: net/mptcp/pm_kernel.c:1071 at mptcp_nl_remove_subflow_and_signal_addr net/mptcp/pm_kernel.c:1103 [inline], CPU#1: syz.2.17/961 WARNING: net/mptcp/pm_kernel.c:1071 at mptcp_pm_nl_del_addr_doit+0x81d/0x8f0 net/mptcp/pm_kernel.c:1210, CPU#1: syz.2.17/961 Modules linked in: CPU: 1 UID: 0 PID: 961 Comm: syz.2.17 Not tainted 6.19.0-08368-gfafda3b4b06b #22 PREEMPT(full) Hardware name: QEMU Ubuntu 25.10 PC v2 (i440FX + PIIX, + 10.1 machine, 1996), BIOS 1.17.0-debian-1.17.0-1build1 04/01/2014 RIP: 0010:__mark_subflow_endp_available net/mptcp/pm_kernel.c:1071 [inline] RIP: 0010:mptcp_nl_remove_subflow_and_signal_addr net/mptcp/pm_kernel.c:1103 [inline] RIP: 0010:mptcp_pm_nl_del_addr_doit+0x81d/0x8f0 net/mptcp/pm_kernel.c:1210 Code: 89 c5 e8 46 30 6f fe e9 21 fd ff ff 49 83 ed 80 e8 38 30 6f fe 4c 89 ef be 03 00 00 00 e8 db 49 df fe eb ac e8 24 30 6f fe 90 <0f> 0b 90 e9 1d ff ff ff e8 16 30 6f fe eb 05 e8 0f 30 6f fe e8 9a RSP: 0018:ffffc90001663880 EFLAGS: 00010293 RAX: ffffffff82de1a6c RBX: 0000000000000000 RCX: ffff88800722b500 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff8880158b22d0 R08: 0000000000010425 R09: ffffffffffffffff R10: ffffffff82de18ba R11: 0000000000000000 R12: ffff88800641a640 R13: ffff8880158b1880 R14: ffff88801ec3c900 R15: ffff88800641a650 FS: 00005555722c3500(0000) GS:ffff8880f909d000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f66346e0f60 CR3: 000000001607c000 CR4: 0000000000350ef0 Call Trace: <TASK> genl_family_rcv_msg_doit+0x117/0x180 net/netlink/genetlink.c:1115 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] genl_rcv_msg+0x3a8/0x3f0 net/netlink/genetlink.c:1210 netlink_rcv_skb+0x16d/0x240 net/netlink/af_netlink.c:2550 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline] netlink_unicast+0x3e9/0x4c0 net/netlink/af_netlink.c:1344 netlink_sendmsg+0x4aa/0x5b0 net/netlink/af_netlink.c:1894 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg+0xc9/0xf0 net/socket.c:742 ____sys_sendmsg+0x272/0x3b0 net/socket.c:2592 ___sys_sendmsg+0x2de/0x320 net/socket.c:2646 __sys_sendmsg net/socket.c:2678 [inline] __do_sys_sendmsg net/socket.c:2683 [inline] __se_sys_sendmsg net/socket.c:2681 [inline] __x64_sys_sendmsg+0x110/0x1a0 net/socket.c:2681 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x143/0x440 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f66346f826d Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffc83d8bdc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f6634985fa0 RCX: 00007f66346f826d RDX: 00000000040000b0 RSI: 0000200000000740 RDI: 0000000000000007 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f6634985fa8 R13: 00007f6634985fac R14: 0000000000000000 R15: 0000000000001770 </TASK> The actions that caused that seem to be: - Set the MPTCP subflows limit to 0 - Create an MPTCP endpoint with both the 'signal' and 'subflow' flags - Create a new MPTCP connection from a different address: an ADD_ADDR linked to the MPTCP endpoint will be sent ('signal' flag), but no subflows is initiated ('subflow' flag) - Remove the MPTCP endpoint In this case, msk->pm.local_addr_used has been kept to 0 -- because no subflows have been created -- but the corresponding bit in msk->pm.id_avail_bitmap has been cleared when the ADD_ADDR has been sent. This later causes a splat when removing the MPTCP endpoint because msk->pm.local_addr_used has been kept to 0. Now, if an endpoint has both the signal and subflow flags, but it is not possible to create subflows because of the limits or the c-flag case, then the local endpoint counter is still incremented: the endpoint is used at the end. This avoids issues later when removing the endpoint and calling __mark_subflow_endp_available(), which expects msk->pm.local_addr_used to have been previously incremented if the endpoint was marked as used according to msk->pm.id_avail_bitmap. Note that signal_and_subflow variable is reset to false when the limits and the c-flag case allows subflows creation. Also, local_addr_used is only incremented for non ID0 subflows. Fixes: 85df533a787b ("mptcp: pm: do not ignore 'subflow' if 'signal' flag is also set") Cc: [email protected] Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/613 Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Link: https://patch.msgid.link/20260303-net-mptcp-misc-fixes-7-0-rc2-v1-4-4b5462b6f016@kernel.org Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Christian Brauner <[email protected]> Date: Thu Jan 29 14:52:22 2026 +0100 namespace: fix proc mount iteration commit 4a403d7aa9074f527f064ef0806aaab38d14b07c upstream. The m->index isn't updated when m->show() overflows and retains its value before the current mount causing a restart to start at the same value. If that happens in short order to due a quickly expanding mount table this would cause the same mount to be shown again and again. Ensure that *pos always equals the mount id of the mount that was returned by start/next. On restart after overflow mnt_find_id_at(*pos) finds the exact mount. This should avoid duplicates, avoid skips and should handle concurrent modification just fine. Cc: <[email protected]> Fixed: 2eea9ce4310d8 ("mounts: keep list of mounts in an rbtree") Link: https://patch.msgid.link/20260129-geleckt-treuhand-4bb940acacd9@brauner Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Allison Henderson <[email protected]> Date: Fri Feb 27 13:23:36 2026 -0700 net/rds: Fix circular locking dependency in rds_tcp_tune [ Upstream commit 6a877ececd6daa002a9a0002cd0fbca6592a9244 ] syzbot reported a circular locking dependency in rds_tcp_tune() where sk_net_refcnt_upgrade() is called while holding the socket lock: ====================================================== WARNING: possible circular locking dependency detected ====================================================== kworker/u10:8/15040 is trying to acquire lock: ffffffff8e9aaf80 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc_cache_noprof+0x4b/0x6f0 but task is already holding lock: ffff88805a3c1ce0 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: rds_tcp_tune+0xd7/0x930 The issue occurs because sk_net_refcnt_upgrade() performs memory allocation (via get_net_track() -> ref_tracker_alloc()) while the socket lock is held, creating a circular dependency with fs_reclaim. Fix this by moving sk_net_refcnt_upgrade() outside the socket lock critical section. This is safe because the fields modified by the sk_net_refcnt_upgrade() call (sk_net_refcnt, ns_tracker) are not accessed by any concurrent code path at this point. v2: - Corrected fixes tag - check patch line wrap nits - ai commentary nits Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=2e2cf5331207053b8106 Fixes: 3a58f13a881e ("net: rds: acquire refcount on TCP sockets") Signed-off-by: Allison Henderson <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jamal Hadi Salim <[email protected]> Date: Wed Mar 4 09:06:02 2026 -0500 net/sched: act_ife: Fix metalist update behavior [ Upstream commit e2cedd400c3ec0302ffca2490e8751772906ac23 ] Whenever an ife action replace changes the metalist, instead of replacing the old data on the metalist, the current ife code is appending the new metadata. Aside from being innapropriate behavior, this may lead to an unbounded addition of metadata to the metalist which might cause an out of bounds error when running the encode op: [ 138.423369][ C1] ================================================================== [ 138.424317][ C1] BUG: KASAN: slab-out-of-bounds in ife_tlv_meta_encode (net/ife/ife.c:168) [ 138.424906][ C1] Write of size 4 at addr ffff8880077f4ffe by task ife_out_out_bou/255 [ 138.425778][ C1] CPU: 1 UID: 0 PID: 255 Comm: ife_out_out_bou Not tainted 7.0.0-rc1-00169-gfbdfa8da05b6 #624 PREEMPT(full) [ 138.425795][ C1] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 138.425800][ C1] Call Trace: [ 138.425804][ C1] <IRQ> [ 138.425808][ C1] dump_stack_lvl (lib/dump_stack.c:122) [ 138.425828][ C1] print_report (mm/kasan/report.c:379 mm/kasan/report.c:482) [ 138.425839][ C1] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:221) [ 138.425844][ C1] ? __virt_addr_valid (./arch/x86/include/asm/preempt.h:95 (discriminator 1) ./include/linux/rcupdate.h:975 (discriminator 1) ./include/linux/mmzone.h:2207 (discriminator 1) arch/x86/mm/physaddr.c:54 (discriminator 1)) [ 138.425853][ C1] ? ife_tlv_meta_encode (net/ife/ife.c:168) [ 138.425859][ C1] kasan_report (mm/kasan/report.c:221 mm/kasan/report.c:597) [ 138.425868][ C1] ? ife_tlv_meta_encode (net/ife/ife.c:168) [ 138.425878][ C1] kasan_check_range (mm/kasan/generic.c:186 (discriminator 1) mm/kasan/generic.c:200 (discriminator 1)) [ 138.425884][ C1] __asan_memset (mm/kasan/shadow.c:84 (discriminator 2)) [ 138.425889][ C1] ife_tlv_meta_encode (net/ife/ife.c:168) [ 138.425893][ C1] ? ife_tlv_meta_encode (net/ife/ife.c:171) [ 138.425898][ C1] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:221) [ 138.425903][ C1] ife_encode_meta_u16 (net/sched/act_ife.c:57) [ 138.425910][ C1] ? __pfx_do_raw_spin_lock (kernel/locking/spinlock_debug.c:114) [ 138.425916][ C1] ? __asan_memcpy (mm/kasan/shadow.c:105 (discriminator 3)) [ 138.425921][ C1] ? __pfx_ife_encode_meta_u16 (net/sched/act_ife.c:45) [ 138.425927][ C1] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:221) [ 138.425931][ C1] tcf_ife_act (net/sched/act_ife.c:847 net/sched/act_ife.c:879) To solve this issue, fix the replace behavior by adding the metalist to the ife rcu data structure. Fixes: aa9fd9a325d51 ("sched: act: ife: update parameters via rcu handling") Reported-by: Ruitong Liu <[email protected]> Tested-by: Ruitong Liu <[email protected]> Co-developed-by: Victor Nogueira <[email protected]> Signed-off-by: Victor Nogueira <[email protected]> Signed-off-by: Jamal Hadi Salim <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Davide Caratti <[email protected]> Date: Tue Feb 24 21:28:32 2026 +0100 net/sched: ets: fix divide by zero in the offload path commit e35626f610f3d2b7953ccddf6a77453da22b3a9e upstream. Offloading ETS requires computing each class' WRR weight: this is done by averaging over the sums of quanta as 'q_sum' and 'q_psum'. Using unsigned int, the same integer size as the individual DRR quanta, can overflow and even cause division by zero, like it happened in the following splat: Oops: divide error: 0000 [#1] SMP PTI CPU: 13 UID: 0 PID: 487 Comm: tc Tainted: G E 6.19.0-virtme #45 PREEMPT(full) Tainted: [E]=UNSIGNED_MODULE Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 RIP: 0010:ets_offload_change+0x11f/0x290 [sch_ets] Code: e4 45 31 ff eb 03 41 89 c7 41 89 cb 89 ce 83 f9 0f 0f 87 b7 00 00 00 45 8b 08 31 c0 45 01 cc 45 85 c9 74 09 41 6b c4 64 31 d2 <41> f7 f2 89 c2 44 29 fa 45 89 df 41 83 fb 0f 0f 87 c7 00 00 00 44 RSP: 0018:ffffd0a180d77588 EFLAGS: 00010246 RAX: 00000000ffffff38 RBX: ffff8d3d482ca000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffd0a180d77660 RBP: ffffd0a180d77690 R08: ffff8d3d482ca2d8 R09: 00000000fffffffe R10: 0000000000000000 R11: 0000000000000000 R12: 00000000fffffffe R13: ffff8d3d472f2000 R14: 0000000000000003 R15: 0000000000000000 FS: 00007f440b6c2740(0000) GS:ffff8d3dc9803000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000003cdd2000 CR3: 0000000007b58002 CR4: 0000000000172ef0 Call Trace: <TASK> ets_qdisc_change+0x870/0xf40 [sch_ets] qdisc_create+0x12b/0x540 tc_modify_qdisc+0x6d7/0xbd0 rtnetlink_rcv_msg+0x168/0x6b0 netlink_rcv_skb+0x5c/0x110 netlink_unicast+0x1d6/0x2b0 netlink_sendmsg+0x22e/0x470 ____sys_sendmsg+0x38a/0x3c0 ___sys_sendmsg+0x99/0xe0 __sys_sendmsg+0x8a/0xf0 do_syscall_64+0x111/0xf80 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f440b81c77e Code: 4d 89 d8 e8 d4 bc 00 00 4c 8b 5d f8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 11 c9 c3 0f 1f 80 00 00 00 00 48 8b 45 10 0f 05 <c9> c3 83 e2 39 83 fa 08 75 e7 e8 13 ff ff ff 0f 1f 00 f3 0f 1e fa RSP: 002b:00007fff951e4c10 EFLAGS: 00000202 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000481820 RCX: 00007f440b81c77e RDX: 0000000000000000 RSI: 00007fff951e4cd0 RDI: 0000000000000003 RBP: 00007fff951e4c20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff951f4fa8 R13: 00000000699ddede R14: 00007f440bb01000 R15: 0000000000486980 </TASK> Modules linked in: sch_ets(E) netdevsim(E) ---[ end trace 0000000000000000 ]--- RIP: 0010:ets_offload_change+0x11f/0x290 [sch_ets] Code: e4 45 31 ff eb 03 41 89 c7 41 89 cb 89 ce 83 f9 0f 0f 87 b7 00 00 00 45 8b 08 31 c0 45 01 cc 45 85 c9 74 09 41 6b c4 64 31 d2 <41> f7 f2 89 c2 44 29 fa 45 89 df 41 83 fb 0f 0f 87 c7 00 00 00 44 RSP: 0018:ffffd0a180d77588 EFLAGS: 00010246 RAX: 00000000ffffff38 RBX: ffff8d3d482ca000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffd0a180d77660 RBP: ffffd0a180d77690 R08: ffff8d3d482ca2d8 R09: 00000000fffffffe R10: 0000000000000000 R11: 0000000000000000 R12: 00000000fffffffe R13: ffff8d3d472f2000 R14: 0000000000000003 R15: 0000000000000000 FS: 00007f440b6c2740(0000) GS:ffff8d3dc9803000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000003cdd2000 CR3: 0000000007b58002 CR4: 0000000000172ef0 Kernel panic - not syncing: Fatal exception Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception ]--- Fix this using 64-bit integers for 'q_sum' and 'q_psum'. Cc: [email protected] Fixes: d35eb52bd2ac ("net: sch_ets: Make the ETS qdisc offloadable") Signed-off-by: Davide Caratti <[email protected]> Reviewed-by: Jamal Hadi Salim <[email protected]> Reviewed-by: Petr Machata <[email protected]> Link: https://patch.msgid.link/28504887df314588c7255e9911769c36f751edee.1771964872.git.dcaratti@redhat.com Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Eric Dumazet <[email protected]> Date: Wed Feb 25 13:15:47 2026 +0000 net: annotate data-races around sk->sk_{data_ready,write_space} [ Upstream commit 2ef2b20cf4e04ac8a6ba68493f8780776ff84300 ] skmsg (and probably other layers) are changing these pointers while other cpus might read them concurrently. Add corresponding READ_ONCE()/WRITE_ONCE() annotations for UDP, TCP and AF_UNIX. Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Reported-by: [email protected] Closes: https://lore.kernel.org/netdev/[email protected]/ Signed-off-by: Eric Dumazet <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: John Fastabend <[email protected]> Cc: Jakub Sitnicki <[email protected]> Cc: Willem de Bruijn <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Ethan Nelson-Moore <[email protected]> Date: Thu Feb 12 20:55:09 2026 -0800 net: arcnet: com20020-pci: fix support for 2.5Mbit cards [ Upstream commit c7d9be66b71af490446127c6ffcb66d6bb71b8b9 ] Commit 8c14f9c70327 ("ARCNET: add com20020 PCI IDs with metadata") converted the com20020-pci driver to use a card info structure instead of a single flag mask in driver_data. However, it failed to take into account that in the original code, driver_data of 0 indicates a card with no special flags, not a card that should not have any card info structure. This introduced a null pointer dereference when cards with no flags were probed. Commit bd6f1fd5d33d ("net: arcnet: com20020: Fix null-ptr-deref in com20020pci_probe()") then papered over this issue by rejecting cards with no driver_data instead of resolving the problem at its source. Fix the original issue by introducing a new card info structure for 2.5Mbit cards that does not set any flags and using it if no driver_data is present. Fixes: 8c14f9c70327 ("ARCNET: add com20020 PCI IDs with metadata") Fixes: bd6f1fd5d33d ("net: arcnet: com20020: Fix null-ptr-deref in com20020pci_probe()") Cc: [email protected] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Ethan Nelson-Moore <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Fernando Fernandez Mancera <[email protected]> Date: Wed Mar 4 13:03:56 2026 +0100 net: bridge: fix nd_tbl NULL dereference when IPv6 is disabled [ Upstream commit e5e890630533bdc15b26a34bb8e7ef539bdf1322 ] When booting with the 'ipv6.disable=1' parameter, the nd_tbl is never initialized because inet6_init() exits before ndisc_init() is called which initializes it. Then, if neigh_suppress is enabled and an ICMPv6 Neighbor Discovery packet reaches the bridge, br_do_suppress_nd() will dereference ipv6_stub->nd_tbl which is NULL, passing it to neigh_lookup(). This causes a kernel NULL pointer dereference. BUG: kernel NULL pointer dereference, address: 0000000000000268 Oops: 0000 [#1] PREEMPT SMP NOPTI [...] RIP: 0010:neigh_lookup+0x16/0xe0 [...] Call Trace: <IRQ> ? neigh_lookup+0x16/0xe0 br_do_suppress_nd+0x160/0x290 [bridge] br_handle_frame_finish+0x500/0x620 [bridge] br_handle_frame+0x353/0x440 [bridge] __netif_receive_skb_core.constprop.0+0x298/0x1110 __netif_receive_skb_one_core+0x3d/0xa0 process_backlog+0xa0/0x140 __napi_poll+0x2c/0x170 net_rx_action+0x2c4/0x3a0 handle_softirqs+0xd0/0x270 do_softirq+0x3f/0x60 Fix this by replacing IS_ENABLED(IPV6) call with ipv6_mod_enabled() in the callers. This is in essence disabling NS/NA suppression when IPv6 is disabled. Fixes: ed842faeb2bd ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports") Reported-by: Guruprasad C P <[email protected]> Closes: https://lore.kernel.org/netdev/CAHXs0ORzd62QOG-Fttqa2Cx_A_VFp=utE2H2VTX5nqfgs7LDxQ@mail.gmail.com/ Signed-off-by: Fernando Fernandez Mancera <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Bobby Eshleman <[email protected]> Date: Mon Mar 2 16:32:56 2026 -0800 net: devmem: use READ_ONCE/WRITE_ONCE on binding->dev [ Upstream commit 40bf00ec2ee271df5ba67593991760adf8b5d0ed ] binding->dev is protected on the write-side in mp_dmabuf_devmem_uninstall() against concurrent writes, but due to the concurrent bare reads in net_devmem_get_binding() and validate_xmit_unreadable_skb() it should be wrapped in a READ_ONCE/WRITE_ONCE pair to make sure no compiler optimizations play with the underlying register in unforeseen ways. Doesn't present a critical bug because the known compiler optimizations don't result in bad behavior. There is no tearing on u64, and load omissions/invented loads would only break if additional binding->dev references were inlined together (they aren't right now). This just more strictly follows the linux memory model (i.e., "Lock-Protected Writes With Lockless Reads" in tools/memory-model/Documentation/access-marking.txt). Fixes: bd61848900bf ("net: devmem: Implement TX path") Signed-off-by: Bobby Eshleman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Mieczyslaw Nalewaj <[email protected]> Date: Sun Mar 1 18:13:14 2026 -0300 net: dsa: realtek: rtl8365mb: fix rtl8365mb_phy_ocp_write return value [ Upstream commit 7cbe98f7bef965241a5908d50d557008cf998aee ] Function rtl8365mb_phy_ocp_write() always returns 0, even when an error occurs during register access. This patch fixes the return value to propagate the actual error code from regmap operations. Link: https://lore.kernel.org/netdev/[email protected]/ Fixes: 2796728460b8 ("net: dsa: realtek: rtl8365mb: serialize indirect PHY register access") Signed-off-by: Mieczyslaw Nalewaj <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: Luiz Angelo Daros de Luca <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Larysa Zaremba <[email protected]> Date: Thu Mar 5 12:12:49 2026 +0100 net: enetc: use truesize as XDP RxQ info frag_size [ Upstream commit f8e18abf183dbd636a8725532c7f5aa58957de84 ] The only user of frag_size field in XDP RxQ info is bpf_xdp_frags_increase_tail(). It clearly expects truesize instead of DMA write size. Different assumptions in enetc driver configuration lead to negative tailroom. Set frag_size to the same value as frame_sz. Fixes: 2768b2e2f7d2 ("net: enetc: register XDP RX queues with frag_size") Reviewed-by: Aleksandr Loktionov <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Larysa Zaremba <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Lorenzo Bianconi <[email protected]> Date: Tue Mar 3 18:56:39 2026 +0100 net: ethernet: mtk_eth_soc: Reset prog ptr to old_prog in case of error in mtk_xdp_setup() [ Upstream commit 0abc73c8a40fd64ac1739c90bb4f42c418d27a5e ] Reset eBPF program pointer to old_prog and do not decrease its ref-count if mtk_open routine in mtk_xdp_setup() fails. Fixes: 7c26c20da5d42 ("net: ethernet: mtk_eth_soc: add basic XDP support") Suggested-by: Paolo Valerio <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Chintan Vankar <[email protected]> Date: Tue Feb 24 23:43:59 2026 +0530 net: ethernet: ti: am65-cpsw-nuss/cpsw-ale: Fix multicast entry handling in ALE table [ Upstream commit be11a537224d72b906db6b98510619770298c8a4 ] In the current implementation, flushing multicast entries in MAC mode incorrectly deletes entries for all ports instead of only the target port, disrupting multicast traffic on other ports. The cause is adding multicast entries by setting only host port bit, and not setting the MAC port bits. Fix this by setting the MAC port's bit in the port mask while adding the multicast entry. Also fix the flush logic to preserve the host port bit during removal of MAC port and free ALE entries when mask contains only host port. Fixes: 5c50a856d550 ("drivers: net: ethernet: cpsw: add multicast address to ALE table") Signed-off-by: Chintan Vankar <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Yung Chih Su <[email protected]> Date: Mon Mar 2 14:02:47 2026 +0800 net: ipv4: fix ARM64 alignment fault in multipath hash seed [ Upstream commit 4ee7fa6cf78ff26d783d39e2949d14c4c1cd5e7f ] `struct sysctl_fib_multipath_hash_seed` contains two u32 fields (user_seed and mp_seed), making it an 8-byte structure with a 4-byte alignment requirement. In `fib_multipath_hash_from_keys()`, the code evaluates the entire struct atomically via `READ_ONCE()`: mp_seed = READ_ONCE(net->ipv4.sysctl_fib_multipath_hash_seed).mp_seed; While this silently works on GCC by falling back to unaligned regular loads which the ARM64 kernel tolerates, it causes a fatal kernel panic when compiled with Clang and LTO enabled. Commit e35123d83ee3 ("arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y") strengthens `READ_ONCE()` to use Load-Acquire instructions (`ldar` / `ldapr`) to prevent compiler reordering bugs under Clang LTO. Since the macro evaluates the full 8-byte struct, Clang emits a 64-bit `ldar` instruction. ARM64 architecture strictly requires `ldar` to be naturally aligned, thus executing it on a 4-byte aligned address triggers a strict Alignment Fault (FSC = 0x21). Fix the read side by moving the `READ_ONCE()` directly to the `u32` member, which emits a safe 32-bit `ldar Wn`. Furthermore, Eric Dumazet pointed out that `WRITE_ONCE()` on the entire struct in `proc_fib_multipath_hash_set_seed()` is also flawed. Analysis shows that Clang splits this 8-byte write into two separate 32-bit `str` instructions. While this avoids an alignment fault, it destroys atomicity and exposes a tear-write vulnerability. Fix this by explicitly splitting the write into two 32-bit `WRITE_ONCE()` operations. Finally, add the missing `READ_ONCE()` when reading `user_seed` in `proc_fib_multipath_hash_seed()` to ensure proper pairing and concurrency safety. Fixes: 4ee2a8cace3f ("net: ipv4: Add a sysctl to set multipath hash seed") Signed-off-by: Yung Chih Su <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jiayuan Chen <[email protected]> Date: Wed Mar 4 19:38:13 2026 +0800 net: ipv6: fix panic when IPv4 route references loopback IPv6 nexthop [ Upstream commit 21ec92774d1536f71bdc90b0e3d052eff99cf093 ] When a standalone IPv6 nexthop object is created with a loopback device (e.g., "ip -6 nexthop add id 100 dev lo"), fib6_nh_init() misclassifies it as a reject route. This is because nexthop objects have no destination prefix (fc_dst=::), causing fib6_is_reject() to match any loopback nexthop. The reject path skips fib_nh_common_init(), leaving nhc_pcpu_rth_output unallocated. If an IPv4 route later references this nexthop, __mkroute_output() dereferences NULL nhc_pcpu_rth_output and panics. Simplify the check in fib6_nh_init() to only match explicit reject routes (RTF_REJECT) instead of using fib6_is_reject(). The loopback promotion heuristic in fib6_is_reject() is handled separately by ip6_route_info_create_nh(). After this change, the three cases behave as follows: 1. Explicit reject route ("ip -6 route add unreachable 2001:db8::/64"): RTF_REJECT is set, enters reject path, skips fib_nh_common_init(). No behavior change. 2. Implicit loopback reject route ("ip -6 route add 2001:db8::/32 dev lo"): RTF_REJECT is not set, takes normal path, fib_nh_common_init() is called. ip6_route_info_create_nh() still promotes it to reject afterward. nhc_pcpu_rth_output is allocated but unused, which is harmless. 3. Standalone nexthop object ("ip -6 nexthop add id 100 dev lo"): RTF_REJECT is not set, takes normal path, fib_nh_common_init() is called. nhc_pcpu_rth_output is properly allocated, fixing the crash when IPv4 routes reference this nexthop. Suggested-by: Ido Schimmel <[email protected]> Fixes: 493ced1ac47c ("ipv4: Allow routes to use nexthop objects") Reported-by: [email protected] Closes: https://lore.kernel.org/all/[email protected]/T/ Signed-off-by: Jiayuan Chen <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Ian Ray <[email protected]> Date: Mon Mar 2 18:32:37 2026 +0200 net: nfc: nci: Fix zero-length proprietary notifications [ Upstream commit f7d92f11bd33a6eb49c7c812255ef4ab13681f0f ] NCI NFC controllers may have proprietary OIDs with zero-length payload. One example is: drivers/nfc/nxp-nci/core.c, NXP_NCI_RF_TXLDO_ERROR_NTF. Allow a zero length payload in proprietary notifications *only*. Before: -- >8 -- kernel: nci: nci_recv_frame: len 3 -- >8 -- After: -- >8 -- kernel: nci: nci_recv_frame: len 3 kernel: nci: nci_ntf_packet: NCI RX: MT=ntf, PBF=0, GID=0x1, OID=0x23, plen=0 kernel: nci: nci_ntf_packet: unknown ntf opcode 0x123 kernel: nfc nfc0: NFC: RF transmitter couldn't start. Bad power and/or configuration? -- >8 -- After fixing the hardware: -- >8 -- kernel: nci: nci_recv_frame: len 27 kernel: nci: nci_ntf_packet: NCI RX: MT=ntf, PBF=0, GID=0x1, OID=0x5, plen=24 kernel: nci: nci_rf_intf_activated_ntf_packet: rf_discovery_id 1 -- >8 -- Fixes: d24b03535e5e ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet") Signed-off-by: Ian Ray <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Andrew Lunn <[email protected]> Date: Sun Feb 22 16:26:01 2026 +0100 net: phy: register phy led_triggers during probe to avoid AB-BA deadlock commit c8dbdc6e380e7e96a51706db3e4b7870d8a9402d upstream. There is an AB-BA deadlock when both LEDS_TRIGGER_NETDEV and LED_TRIGGER_PHY are enabled: [ 1362.049207] [<8054e4b8>] led_trigger_register+0x5c/0x1fc <-- Trying to get lock "triggers_list_lock" via down_write(&triggers_list_lock); [ 1362.054536] [<80662830>] phy_led_triggers_register+0xd0/0x234 [ 1362.060329] [<8065e200>] phy_attach_direct+0x33c/0x40c [ 1362.065489] [<80651fc4>] phylink_fwnode_phy_connect+0x15c/0x23c [ 1362.071480] [<8066ee18>] mtk_open+0x7c/0xba0 [ 1362.075849] [<806d714c>] __dev_open+0x280/0x2b0 [ 1362.080384] [<806d7668>] __dev_change_flags+0x244/0x24c [ 1362.085598] [<806d7698>] dev_change_flags+0x28/0x78 [ 1362.090528] [<807150e4>] dev_ioctl+0x4c0/0x654 <-- Hold lock "rtnl_mutex" by calling rtnl_lock(); [ 1362.094985] [<80694360>] sock_ioctl+0x2f4/0x4e0 [ 1362.099567] [<802e9c4c>] sys_ioctl+0x32c/0xd8c [ 1362.104022] [<80014504>] syscall_common+0x34/0x58 Here LED_TRIGGER_PHY is registering LED triggers during phy_attach while holding RTNL and then taking triggers_list_lock. [ 1362.191101] [<806c2640>] register_netdevice_notifier+0x60/0x168 <-- Trying to get lock "rtnl_mutex" via rtnl_lock(); [ 1362.197073] [<805504ac>] netdev_trig_activate+0x194/0x1e4 [ 1362.202490] [<8054e28c>] led_trigger_set+0x1d4/0x360 <-- Hold lock "triggers_list_lock" by down_read(&triggers_list_lock); [ 1362.207511] [<8054eb38>] led_trigger_write+0xd8/0x14c [ 1362.212566] [<80381d98>] sysfs_kf_bin_write+0x80/0xbc [ 1362.217688] [<8037fcd8>] kernfs_fop_write_iter+0x17c/0x28c [ 1362.223174] [<802cbd70>] vfs_write+0x21c/0x3c4 [ 1362.227712] [<802cc0c4>] ksys_write+0x78/0x12c [ 1362.232164] [<80014504>] syscall_common+0x34/0x58 Here LEDS_TRIGGER_NETDEV is being enabled on an LED. It first takes triggers_list_lock and then RTNL. A classical AB-BA deadlock. phy_led_triggers_registers() does not require the RTNL, it does not make any calls into the network stack which require protection. There is also no requirement the PHY has been attached to a MAC, the triggers only make use of phydev state. This allows the call to phy_led_triggers_registers() to be placed elsewhere. PHY probe() and release() don't hold RTNL, so solving the AB-BA deadlock. Reported-by: Shiji Yang <[email protected]> Closes: https://lore.kernel.org/all/OS7PR01MB13602B128BA1AD3FA38B6D1FFBC69A@OS7PR01MB13602.jpnprd01.prod.outlook.com/ Fixes: 06f502f57d0d ("leds: trigger: Introduce a NETDEV trigger") Cc: [email protected] Signed-off-by: Andrew Lunn <[email protected]> Tested-by: Shiji Yang <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Sebastian Andrzej Siewior <[email protected]> Date: Mon Mar 2 17:26:31 2026 +0100 net: Provide a PREEMPT_RT specific check for netdev_queue::_xmit_lock [ Upstream commit b824c3e16c1904bf80df489e293d1e3cbf98896d ] After acquiring netdev_queue::_xmit_lock the number of the CPU owning the lock is recorded in netdev_queue::xmit_lock_owner. This works as long as the BH context is not preemptible. On PREEMPT_RT the softirq context is preemptible and without the softirq-lock it is possible to have multiple user in __dev_queue_xmit() submitting a skb on the same CPU. This is fine in general but this means also that the current CPU is recorded as netdev_queue::xmit_lock_owner. This in turn leads to the recursion alert and the skb is dropped. Instead checking the for CPU number, that owns the lock, PREEMPT_RT can check if the lockowner matches the current task. Add netif_tx_owned() which returns true if the current context owns the lock by comparing the provided CPU number with the recorded number. This resembles the current check by negating the condition (the current check returns true if the lock is not owned). On PREEMPT_RT use rt_mutex_owner() to return the lock owner and compare the current task against it. Use the new helper in __dev_queue_xmit() and netif_local_xmit_active() which provides a similar check. Update comments regarding pairing READ_ONCE(). Reported-by: Bert Karwatzki <[email protected]> Closes: https://lore.kernel.org/all/[email protected] Fixes: 3253cb49cbad4 ("softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT") Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Reported-by: Bert Karwatzki <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Manivannan Sadhasivam <[email protected]> Date: Thu Dec 18 22:21:44 2025 +0530 net: qrtr: Drop the MHI auto_queue feature for IPCR DL channels [ Upstream commit 51731792a25cb312ca94cdccfa139eb46de1b2ef ] MHI stack offers the 'auto_queue' feature, which allows the MHI stack to auto queue the buffers for the RX path (DL channel). Though this feature simplifies the client driver design, it introduces race between the client drivers and the MHI stack. For instance, with auto_queue, the 'dl_callback' for the DL channel may get called before the client driver is fully probed. This means, by the time the dl_callback gets called, the client driver's structures might not be initialized, leading to NULL ptr dereference. Currently, the drivers have to workaround this issue by initializing the internal structures before calling mhi_prepare_for_transfer_autoqueue(). But even so, there is a chance that the client driver's internal code path may call the MHI queue APIs before mhi_prepare_for_transfer_autoqueue() is called, leading to similar NULL ptr dereference. This issue has been reported on the Qcom X1E80100 CRD machines affecting boot. So to properly fix all these races, drop the MHI 'auto_queue' feature altogether and let the client driver (QRTR) manage the RX buffers manually. In the QRTR driver, queue the RX buffers based on the ring length during probe and recycle the buffers in 'dl_callback' once they are consumed. This also warrants removing the setting of 'auto_queue' flag from controller drivers. Currently, this 'auto_queue' feature is only enabled for IPCR DL channel. So only the QRTR client driver requires the modification. Fixes: 227fee5fc99e ("bus: mhi: core: Add an API for auto queueing buffers for DL channel") Fixes: 68a838b84eff ("net: qrtr: start MHI channel after endpoit creation") Reported-by: Johan Hovold <[email protected]> Closes: https://lore.kernel.org/linux-arm-msm/[email protected] Suggested-by: Chris Lew <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Reviewed-by: Jeff Hugo <[email protected]> Reviewed-by: Loic Poulain <[email protected]> Acked-by: Jeff Johnson <[email protected]> # drivers/net/wireless/ath/... Acked-by: Jeff Hugo <[email protected]> Acked-by: Paolo Abeni <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Koichiro Den <[email protected]> Date: Sat Feb 28 23:53:07 2026 +0900 net: sched: avoid qdisc_reset_all_tx_gt() vs dequeue race for lockless qdiscs [ Upstream commit 7f083faf59d14c04e01ec05a7507f036c965acf8 ] When shrinking the number of real tx queues, netif_set_real_num_tx_queues() calls qdisc_reset_all_tx_gt() to flush qdiscs for queues which will no longer be used. qdisc_reset_all_tx_gt() currently serializes qdisc_reset() with qdisc_lock(). However, for lockless qdiscs, the dequeue path is serialized by qdisc_run_begin/end() using qdisc->seqlock instead, so qdisc_reset() can run concurrently with __qdisc_run() and free skbs while they are still being dequeued, leading to UAF. This can easily be reproduced on e.g. virtio-net by imposing heavy traffic while frequently changing the number of queue pairs: iperf3 -ub0 -c $peer -t 0 & while :; do ethtool -L eth0 combined 1 ethtool -L eth0 combined 2 done With KASAN enabled, this leads to reports like: BUG: KASAN: slab-use-after-free in __qdisc_run+0x133f/0x1760 ... Call Trace: <TASK> ... __qdisc_run+0x133f/0x1760 __dev_queue_xmit+0x248f/0x3550 ip_finish_output2+0xa42/0x2110 ip_output+0x1a7/0x410 ip_send_skb+0x2e6/0x480 udp_send_skb+0xb0a/0x1590 udp_sendmsg+0x13c9/0x1fc0 ... </TASK> Allocated by task 1270 on cpu 5 at 44.558414s: ... alloc_skb_with_frags+0x84/0x7c0 sock_alloc_send_pskb+0x69a/0x830 __ip_append_data+0x1b86/0x48c0 ip_make_skb+0x1e8/0x2b0 udp_sendmsg+0x13a6/0x1fc0 ... Freed by task 1306 on cpu 3 at 44.558445s: ... kmem_cache_free+0x117/0x5e0 pfifo_fast_reset+0x14d/0x580 qdisc_reset+0x9e/0x5f0 netif_set_real_num_tx_queues+0x303/0x840 virtnet_set_channels+0x1bf/0x260 [virtio_net] ethnl_set_channels+0x684/0xae0 ethnl_default_set_doit+0x31a/0x890 ... Serialize qdisc_reset_all_tx_gt() against the lockless dequeue path by taking qdisc->seqlock for TCQ_F_NOLOCK qdiscs, matching the serialization model already used by dev_reset_queue(). Additionally clear QDISC_STATE_NON_EMPTY after reset so the qdisc state reflects an empty queue, avoiding needless re-scheduling. Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking") Signed-off-by: Koichiro Den <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Ovidiu Panait <[email protected]> Date: Tue Mar 3 14:58:28 2026 +0000 net: stmmac: Defer VLAN HW configuration when interface is down [ Upstream commit 2cd70e3968f505996d5fefdf7ca684f0f4575734 ] VLAN register accesses on the MAC side require the PHY RX clock to be active. When the network interface is down, the PHY is suspended and the RX clock is unavailable, causing VLAN operations to fail with timeouts. The VLAN core automatically removes VID 0 after the interface goes down and re-adds it when it comes back up, so these timeouts happen during normal interface down/up: # ip link set end1 down renesas-gbeth 15c40000.ethernet end1: Timeout accessing MAC_VLAN_Tag_Filter renesas-gbeth 15c40000.ethernet end1: failed to kill vid 0081/0 Adding VLANs while the interface is down also fails: # ip link add link end1 name end1.10 type vlan id 10 renesas-gbeth 15c40000.ethernet end1: Timeout accessing MAC_VLAN_Tag_Filter RTNETLINK answers: Device or resource busy To fix this, check if the interface is up before accessing VLAN registers. The software state is always kept up to date regardless of interface state. When the interface is brought up, stmmac_vlan_restore() is called to write the VLAN state to hardware. Fixes: ed64639bc1e0 ("net: stmmac: Add support for VLAN Rx filtering") Signed-off-by: Ovidiu Panait <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Ovidiu Panait <[email protected]> Date: Tue Mar 3 14:58:25 2026 +0000 net: stmmac: Fix error handling in VLAN add and delete paths [ Upstream commit 35dfedce442c4060cfe5b98368bc9643fb995716 ] stmmac_vlan_rx_add_vid() updates active_vlans and the VLAN hash register before writing the HW filter entry. If the filter write fails, it leaves a stale VID in active_vlans and the hash register. stmmac_vlan_rx_kill_vid() has the reverse problem: it clears active_vlans before removing the HW filter. On failure, the VID is gone from active_vlans but still present in the HW filter table. To fix this, reorder the operations to update the hash table first, then attempt the HW filter operation. If the HW filter fails, roll back both the active_vlans bitmap and the hash table by calling stmmac_vlan_update() again. Fixes: ed64639bc1e0 ("net: stmmac: Add support for VLAN Rx filtering") Signed-off-by: Ovidiu Panait <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Ovidiu Panait <[email protected]> Date: Tue Mar 3 14:58:27 2026 +0000 net: stmmac: Fix VLAN HW state restore [ Upstream commit bd7ad51253a76fb35886d01cfe9a37f0e4ed6709 ] When the network interface is opened or resumed, a DMA reset is performed, which resets all hardware state, including VLAN state. Currently, only the resume path is restoring the VLAN state via stmmac_restore_hw_vlan_rx_fltr(), but that is incomplete: the VLAN hash table and the VLAN_TAG control bits are not restored. Therefore, add stmmac_vlan_restore(), which restores the full VLAN state by updating both the HW filter entries and the hash table, and call it from both the open and resume paths. The VLAN restore is moved outside of phylink_rx_clk_stop_block/unblock in the resume path because receive clock stop is already disabled when stmmac supports VLAN. Also, remove the hash readback code in vlan_restore_hw_rx_fltr() that attempts to restore VTHM by reading VLAN_HASH_TABLE, as it always reads zero after DMA reset, making it dead code. Fixes: 3cd1cfcba26e ("net: stmmac: Implement VLAN Hash Filtering in XGMAC") Fixes: ed64639bc1e0 ("net: stmmac: Add support for VLAN Rx filtering") Signed-off-by: Ovidiu Panait <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Stable-dep-of: 2cd70e3968f5 ("net: stmmac: Defer VLAN HW configuration when interface is down") Signed-off-by: Sasha Levin <[email protected]>
Author: Ovidiu Panait <[email protected]> Date: Tue Mar 3 14:58:26 2026 +0000 net: stmmac: Improve double VLAN handling [ Upstream commit e38200e361cbe331806dc454c76c11c7cd95e1b9 ] The double VLAN bits (EDVLP, ESVL, DOVLTC) are handled inconsistently between the two vlan_update_hash() implementations: - dwxgmac2_update_vlan_hash() explicitly clears the double VLAN bits when is_double is false, meaning that adding a 802.1Q VLAN will disable double VLAN mode: $ ip link add link eth0 name eth0.200 type vlan id 200 protocol 802.1ad $ ip link add link eth0 name eth0.100 type vlan id 100 # Double VLAN bits no longer set - vlan_update_hash() sets these bits and only clears them when the last VLAN has been removed, so double VLAN mode remains enabled even after all 802.1AD VLANs are removed. Address both issues by tracking the number of active 802.1AD VLANs in priv->num_double_vlans. Pass this count to stmmac_vlan_update() so both implementations correctly set the double VLAN bits when any 802.1AD VLAN is active, and clear them only when none remain. Also update vlan_update_hash() to explicitly clear the double VLAN bits when is_double is false, matching the dwxgmac2 behavior. Signed-off-by: Ovidiu Panait <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Stable-dep-of: 2cd70e3968f5 ("net: stmmac: Defer VLAN HW configuration when interface is down") Signed-off-by: Sasha Levin <[email protected]>
Author: Russell King (Oracle) <[email protected]> Date: Tue Feb 3 16:50:41 2026 +0000 net: stmmac: remove support for lpi_intr_o commit 14eb64db8ff07b58a35b98375f446d9e20765674 upstream. The dwmac databook for v3.74a states that lpi_intr_o is a sideband signal which should be used to ungate the application clock, and this signal is synchronous to the receive clock. The receive clock can run at 2.5, 25 or 125MHz depending on the media speed, and can stop under the control of the link partner. This means that the time it takes to clear is dependent on the negotiated media speed, and thus can be 8, 40, or 400ns after reading the LPI control and status register. It has been observed with some aggressive link partners, this clock can stop while lpi_intr_o is still asserted, meaning that the signal remains asserted for an indefinite period that the local system has no direct control over. The LPI interrupts will still be signalled through the main interrupt path in any case, and this path is not dependent on the receive clock. This, since we do not gate the application clock, and the chances of adding clock gating in the future are slim due to the clocks being ill-defined, lpi_intr_o serves no useful purpose. Remove the code which requests the interrupt, and all associated code. Reported-by: Ovidiu Panait <[email protected]> Tested-by: Ovidiu Panait <[email protected]> # Renesas RZ/V2H board Signed-off-by: Russell King (Oracle) <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Ovidiu Panait <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: MD Danish Anwar <[email protected]> Date: Thu Feb 26 15:53:56 2026 +0530 net: ti: icssg-prueth: Fix ping failure after offload mode setup when link speed is not 1G [ Upstream commit 147792c395db870756a0dc87ce656c75ae7ab7e8 ] When both eth interfaces with links up are added to a bridge or hsr interface, ping fails if the link speed is not 1Gbps (e.g., 100Mbps). The issue is seen because when switching to offload (bridge/hsr) mode, prueth_emac_restart() restarts the firmware and clears DRAM with memset_io(), setting all memory to 0. This includes PORT_LINK_SPEED_OFFSET which firmware reads for link speed. The value 0 corresponds to FW_LINK_SPEED_1G (0x00), so for 1Gbps links the default value is correct and ping works. For 100Mbps links, the firmware needs FW_LINK_SPEED_100M (0x01) but gets 0 instead, causing ping to fail. The function emac_adjust_link() is called to reconfigure, but it detects no state change (emac->link is still 1, speed/duplex match PHY) so new_state remains false and icssg_config_set_speed() is never called to correct the firmware speed value. The fix resets emac->link to 0 before calling emac_adjust_link() in prueth_emac_common_start(). This forces new_state=true, ensuring icssg_config_set_speed() is called to write the correct speed value to firmware memory. Fixes: 06feac15406f ("net: ti: icssg-prueth: Fix emac link speed handling") Signed-off-by: MD Danish Anwar <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Greg Kroah-Hartman <[email protected]> Date: Mon Feb 23 13:59:26 2026 +0100 net: usb: kalmia: validate USB endpoints commit c58b6c29a4c9b8125e8ad3bca0637e00b71e2693 upstream. The kalmia driver should validate that the device it is probing has the proper number and types of USB endpoints it is expecting before it binds to it. If a malicious device were to not have the same urbs the driver will crash later on when it blindly accesses these endpoints. Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: Simon Horman <[email protected]> Fixes: d40261236e8e ("net/usb: Add Samsung Kalmia driver for Samsung GT-B3730") Link: https://patch.msgid.link/2026022326-shack-headstone-ef6f@gregkh Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Greg Kroah-Hartman <[email protected]> Date: Mon Feb 23 14:00:06 2026 +0100 net: usb: kaweth: validate USB endpoints commit 4b063c002ca759d1b299988ee23f564c9609c875 upstream. The kaweth driver should validate that the device it is probing has the proper number and types of USB endpoints it is expecting before it binds to it. If a malicious device were to not have the same urbs the driver will crash later on when it blindly accesses these endpoints. Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: Simon Horman <[email protected]> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Link: https://patch.msgid.link/2026022305-substance-virtual-c728@gregkh Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Greg Kroah-Hartman <[email protected]> Date: Mon Feb 23 13:58:48 2026 +0100 net: usb: pegasus: validate USB endpoints commit 11de1d3ae5565ed22ef1f89d73d8f2d00322c699 upstream. The pegasus driver should validate that the device it is probing has the proper number and types of USB endpoints it is expecting before it binds to it. If a malicious device were to not have the same urbs the driver will crash later on when it blindly accesses these endpoints. Cc: Petko Manolov <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Link: https://patch.msgid.link/2026022347-legibly-attest-cc5c@gregkh Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Fernando Fernandez Mancera <[email protected]> Date: Wed Mar 4 13:03:57 2026 +0100 net: vxlan: fix nd_tbl NULL dereference when IPv6 is disabled [ Upstream commit 168ff39e4758897d2eee4756977d036d52884c7e ] When booting with the 'ipv6.disable=1' parameter, the nd_tbl is never initialized because inet6_init() exits before ndisc_init() is called which initializes it. If an IPv6 packet is injected into the interface, route_shortcircuit() is called and a NULL pointer dereference happens on neigh_lookup(). BUG: kernel NULL pointer dereference, address: 0000000000000380 Oops: Oops: 0000 [#1] SMP NOPTI [...] RIP: 0010:neigh_lookup+0x20/0x270 [...] Call Trace: <TASK> vxlan_xmit+0x638/0x1ef0 [vxlan] dev_hard_start_xmit+0x9e/0x2e0 __dev_queue_xmit+0xbee/0x14e0 packet_sendmsg+0x116f/0x1930 __sys_sendto+0x1f5/0x200 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x12f/0x1590 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fix this by adding an early check on route_shortcircuit() when protocol is ETH_P_IPV6. Note that ipv6_mod_enabled() cannot be used here because VXLAN can be built-in even when IPv6 is built as a module. Fixes: e15a00aafa4b ("vxlan: add ipv6 route short circuit support") Signed-off-by: Fernando Fernandez Mancera <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Eric Dumazet <[email protected]> Date: Wed Mar 4 01:56:40 2026 +0000 net_sched: sch_fq: clear q->band_pkt_count[] in fq_reset() [ Upstream commit a4c2b8be2e5329e7fac6e8f64ddcb8958155cfcb ] When/if a NIC resets, queues are deactivated by dev_deactivate_many(), then reactivated when the reset operation completes. fq_reset() removes all the skbs from various queues. If we do not clear q->band_pkt_count[], these counters keep growing and can eventually reach sch->limit, preventing new packets to be queued. Many thanks to Praveen for discovering the root cause. Fixes: 29f834aa326e ("net_sched: sch_fq: add 3 bands and WRR scheduling") Diagnosed-by: Praveen Kaligineedi <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Neal Cardwell <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Pablo Neira Ayuso <[email protected]> Date: Mon Mar 2 23:28:15 2026 +0100 netfilter: nf_tables: clone set on flush only [ Upstream commit fb7fb4016300ac622c964069e286dc83166a5d52 ] Syzbot with fault injection triggered a failing memory allocation with GFP_KERNEL which results in a WARN splat: iter.err WARNING: net/netfilter/nf_tables_api.c:845 at nft_map_deactivate+0x34e/0x3c0 net/netfilter/nf_tables_api.c:845, CPU#0: syz.0.17/5992 Modules linked in: CPU: 0 UID: 0 PID: 5992 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026 RIP: 0010:nft_map_deactivate+0x34e/0x3c0 net/netfilter/nf_tables_api.c:845 Code: 8b 05 86 5a 4e 09 48 3b 84 24 a0 00 00 00 75 62 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc cc e8 63 6d fa f7 90 <0f> 0b 90 43 +80 7c 35 00 00 0f 85 23 fe ff ff e9 26 fe ff ff 89 d9 RSP: 0018:ffffc900045af780 EFLAGS: 00010293 RAX: ffffffff89ca45bd RBX: 00000000fffffff4 RCX: ffff888028111e40 RDX: 0000000000000000 RSI: 00000000fffffff4 RDI: 0000000000000000 RBP: ffffc900045af870 R08: 0000000000400dc0 R09: 00000000ffffffff R10: dffffc0000000000 R11: fffffbfff1d141db R12: ffffc900045af7e0 R13: 1ffff920008b5f24 R14: dffffc0000000000 R15: ffffc900045af920 FS: 000055557a6a5500(0000) GS:ffff888125496000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fb5ea271fc0 CR3: 000000003269e000 CR4: 00000000003526f0 Call Trace: <TASK> __nft_release_table+0xceb/0x11f0 net/netfilter/nf_tables_api.c:12115 nft_rcv_nl_event+0xc25/0xdb0 net/netfilter/nf_tables_api.c:12187 notifier_call_chain+0x19d/0x3a0 kernel/notifier.c:85 blocking_notifier_call_chain+0x6a/0x90 kernel/notifier.c:380 netlink_release+0x123b/0x1ad0 net/netlink/af_netlink.c:761 __sock_release net/socket.c:662 [inline] sock_close+0xc3/0x240 net/socket.c:1455 Restrict set clone to the flush set command in the preparation phase. Add NFT_ITER_UPDATE_CLONE and use it for this purpose, update the rbtree and pipapo backends to only clone the set when this iteration type is used. As for the existing NFT_ITER_UPDATE type, update the pipapo backend to use the existing set clone if available, otherwise use the existing set representation. After this update, there is no need to clone a set that is being deleted, this includes bound anonymous set. An alternative approach to NFT_ITER_UPDATE_CLONE is to add a .clone interface and call it from the flush set path. Reported-by: [email protected] Fixes: 3f1d886cc7c3 ("netfilter: nft_set_pipapo: move cloning of match info to insert/removal path") Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Pablo Neira Ayuso <[email protected]> Date: Mon Mar 2 23:12:37 2026 +0100 netfilter: nf_tables: unconditionally bump set->nelems before insertion [ Upstream commit def602e498a4f951da95c95b1b8ce8ae68aa733a ] In case that the set is full, a new element gets published then removed without waiting for the RCU grace period, while RCU reader can be walking over it already. To address this issue, add the element transaction even if set is full, but toggle the set_full flag to report -ENFILE so the abort path safely unwinds the set to its previous state. As for element updates, decrement set->nelems to restore it. A simpler fix is to call synchronize_rcu() in the error path. However, with a large batch adding elements to already maxed-out set, this could cause noticeable slowdown of such batches. Fixes: 35d0ac9070ef ("netfilter: nf_tables: fix set->nelems counting with no NLM_F_EXCL") Reported-by: Inseo An <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Florian Westphal <[email protected]> Date: Tue Mar 3 16:31:32 2026 +0100 netfilter: nft_set_pipapo: split gc into unlink and reclaim phase [ Upstream commit 9df95785d3d8302f7c066050117b04cd3c2048c2 ] Yiming Qian reports Use-after-free in the pipapo set type: Under a large number of expired elements, commit-time GC can run for a very long time in a non-preemptible context, triggering soft lockup warnings and RCU stall reports (local denial of service). We must split GC in an unlink and a reclaim phase. We cannot queue elements for freeing until pointers have been swapped. Expired elements are still exposed to both the packet path and userspace dumpers via the live copy of the data structure. call_rcu() does not protect us: dump operations or element lookups starting after call_rcu has fired can still observe the free'd element, unless the commit phase has made enough progress to swap the clone and live pointers before any new reader has picked up the old version. This a similar approach as done recently for the rbtree backend in commit 35f83a75529a ("netfilter: nft_set_rbtree: don't gc elements on insert"). Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Reported-by: Yiming Qian <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: David Howells <[email protected]> Date: Thu Feb 26 13:32:33 2026 +0000 netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence [ Upstream commit a0b4c7a49137ed21279f354eb59f49ddae8dffc2 ] Fix netfslib such that when it's making an unbuffered or DIO write, to make sure that it sends each subrequest strictly sequentially, waiting till the previous one is 'committed' before sending the next so that we don't have pieces landing out of order and potentially leaving a hole if an error occurs (ENOSPC for example). This is done by copying in just those bits of issuing, collecting and retrying subrequests that are necessary to do one subrequest at a time. Retrying, in particular, is simpler because if the current subrequest needs retrying, the source iterator can just be copied again and the subrequest prepped and issued again without needing to be concerned about whether it needs merging with the previous or next in the sequence. Note that the issuing loop waits for a subrequest to complete right after issuing it, but this wait could be moved elsewhere allowing preparatory steps to be performed whilst the subrequest is in progress. In particular, once content encryption is available in netfslib, that could be done whilst waiting, as could cleanup of buffers that have been completed. Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support") Signed-off-by: David Howells <[email protected]> Link: https://patch.msgid.link/[email protected] Tested-by: Steve French <[email protected]> Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]> cc: [email protected] cc: [email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jakub Kicinski <[email protected]> Date: Tue Mar 3 08:23:44 2026 -0800 nfc: nci: clear NCI_DATA_EXCHANGE before calling completion callback [ Upstream commit 0efdc02f4f6d52f8ca5d5889560f325a836ce0a8 ] Move clear_bit(NCI_DATA_EXCHANGE) before invoking the data exchange callback in nci_data_exchange_complete(). The callback (e.g. rawsock_data_exchange_complete) may immediately schedule another data exchange via schedule_work(tx_work). On a multi-CPU system, tx_work can run and reach nci_transceive() before the current nci_data_exchange_complete() clears the flag, causing test_and_set_bit(NCI_DATA_EXCHANGE) to return -EBUSY and the new transfer to fail. This causes intermittent flakes in nci/nci_dev in NIPA: # # RUN NCI.NCI1_0.t4t_tag_read ... # # t4t_tag_read: Test terminated by timeout # # FAIL NCI.NCI1_0.t4t_tag_read # not ok 3 NCI.NCI1_0.t4t_tag_read Fixes: 38f04c6b1b68 ("NFC: protect nci_data_exchange transactions") Reviewed-by: Joe Damato <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jakub Kicinski <[email protected]> Date: Tue Mar 3 08:23:43 2026 -0800 nfc: nci: complete pending data exchange on device close [ Upstream commit 66083581945bd5b8e99fe49b5aeb83d03f62d053 ] In nci_close_device(), complete any pending data exchange before closing. The data exchange callback (e.g. rawsock_data_exchange_complete) holds a socket reference. NIPA occasionally hits this leak: unreferenced object 0xff1100000f435000 (size 2048): comm "nci_dev", pid 3954, jiffies 4295441245 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 27 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00 '..@............ backtrace (crc ec2b3c5): __kmalloc_noprof+0x4db/0x730 sk_prot_alloc.isra.0+0xe4/0x1d0 sk_alloc+0x36/0x760 rawsock_create+0xd1/0x540 nfc_sock_create+0x11f/0x280 __sock_create+0x22d/0x630 __sys_socket+0x115/0x1d0 __x64_sys_socket+0x72/0xd0 do_syscall_64+0x117/0xfc0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: 38f04c6b1b68 ("NFC: protect nci_data_exchange transactions") Reviewed-by: Joe Damato <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jakub Kicinski <[email protected]> Date: Tue Mar 3 08:23:41 2026 -0800 nfc: nci: free skb on nci_transceive early error paths [ Upstream commit 7bd4b0c4779f978a6528c9b7937d2ca18e936e2c ] nci_transceive() takes ownership of the skb passed by the caller, but the -EPROTO, -EINVAL, and -EBUSY error paths return without freeing it. Due to issues clearing NCI_DATA_EXCHANGE fixed by subsequent changes the nci/nci_dev selftest hits the error path occasionally in NIPA, and kmemleak detects leaks: unreferenced object 0xff11000015ce6a40 (size 640): comm "nci_dev", pid 3954, jiffies 4295441246 hex dump (first 32 bytes): 6b 6b 6b 6b 00 a4 00 0c 02 e1 03 6b 6b 6b 6b 6b kkkk.......kkkkk 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk backtrace (crc 7c40cc2a): kmem_cache_alloc_node_noprof+0x492/0x630 __alloc_skb+0x11e/0x5f0 alloc_skb_with_frags+0xc6/0x8f0 sock_alloc_send_pskb+0x326/0x3f0 nfc_alloc_send_skb+0x94/0x1d0 rawsock_sendmsg+0x162/0x4c0 do_syscall_64+0x117/0xfc0 Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Reviewed-by: Joe Damato <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Greg Kroah-Hartman <[email protected]> Date: Mon Feb 23 12:28:30 2026 +0100 nfc: pn533: properly drop the usb interface reference on disconnect commit 12133a483dfa832241fbbf09321109a0ea8a520e upstream. When the device is disconnected from the driver, there is a "dangling" reference count on the usb interface that was grabbed in the probe callback. Fix this up by properly dropping the reference after we are done with it. Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: Simon Horman <[email protected]> Fixes: c46ee38620a2 ("NFC: pn533: add NXP pn533 nfc device driver") Link: https://patch.msgid.link/2026022329-flashing-ought-7573@gregkh Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Jakub Kicinski <[email protected]> Date: Tue Mar 3 08:23:45 2026 -0800 nfc: rawsock: cancel tx_work before socket teardown [ Upstream commit d793458c45df2aed498d7f74145eab7ee22d25aa ] In rawsock_release(), cancel any pending tx_work and purge the write queue before orphaning the socket. rawsock_tx_work runs on the system workqueue and calls nfc_data_exchange which dereferences the NCI device. Without synchronization, tx_work can race with socket and device teardown when a process is killed (e.g. by SIGKILL), leading to use-after-free or leaked references. Set SEND_SHUTDOWN first so that if tx_work is already running it will see the flag and skip transmitting, then use cancel_work_sync to wait for any in-progress execution to finish, and finally purge any remaining queued skbs. Fixes: 23b7869c0fd0 ("NFC: add the NFC socket raw protocol") Reviewed-by: Joe Damato <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Kuniyuki Iwashima <[email protected]> Date: Sat Jan 24 04:18:40 2026 +0000 nfsd: Fix cred ref leak in nfsd_nl_threads_set_doit(). commit 1cb968a2013ffa8112d52ebe605009ea1c6a582c upstream. syzbot reported memory leak of struct cred. [0] nfsd_nl_threads_set_doit() passes get_current_cred() to nfsd_svc(), but put_cred() is not called after that. The cred is finally passed down to _svc_xprt_create(), which calls get_cred() with the cred for struct svc_xprt. The ownership of the refcount by get_current_cred() is not transferred to anywhere and is just leaked. nfsd_svc() is also called from write_threads(), but it does not bump file->f_cred there. nfsd_nl_threads_set_doit() is called from sendmsg() and current->cred does not go away. Let's use current_cred() in nfsd_nl_threads_set_doit(). [0]: BUG: memory leak unreferenced object 0xffff888108b89480 (size 184): comm "syz-executor", pid 5994, jiffies 4294943386 hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 369454a7): kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline] slab_post_alloc_hook mm/slub.c:4958 [inline] slab_alloc_node mm/slub.c:5263 [inline] kmem_cache_alloc_noprof+0x412/0x580 mm/slub.c:5270 prepare_creds+0x22/0x600 kernel/cred.c:185 copy_creds+0x44/0x290 kernel/cred.c:286 copy_process+0x7a7/0x2870 kernel/fork.c:2086 kernel_clone+0xac/0x6e0 kernel/fork.c:2651 __do_sys_clone+0x7f/0xb0 kernel/fork.c:2792 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xa4/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: 924f4fb003ba ("NFSD: convert write_threads to netlink command") Cc: [email protected] Reported-by: [email protected] Closes: https://lore.kernel.org/all/[email protected]/ Tested-by: [email protected] Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Keith Busch <[email protected]> Date: Wed Feb 25 11:38:05 2026 -0800 nvme-multipath: fix leak on try_module_get failure [ Upstream commit 0f5197ea9a73a4c406c75e6d8af3a13f7f96ae89 ] We need to fall back to the synchronous removal if we can't get a reference on the module needed for the deferred removal. Fixes: 62188639ec16 ("nvme-multipath: introduce delayed removal of the multipath head node") Reviewed-by: Nilay Shroff <[email protected]> Reviewed-by: John Garry <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Keith Busch <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Ming Lei <[email protected]> Date: Sat Jan 31 22:48:08 2026 +0800 nvme: fix admin queue leak on controller reset [ Upstream commit b84bb7bd913d8ca2f976ee6faf4a174f91c02b8d ] When nvme_alloc_admin_tag_set() is called during a controller reset, a previous admin queue may still exist. Release it properly before allocating a new one to avoid orphaning the old queue. This fixes a regression introduced by commit 03b3bcd319b3 ("nvme: fix admin request_queue lifetime"). Cc: Keith Busch <[email protected]> Fixes: 03b3bcd319b3 ("nvme: fix admin request_queue lifetime"). Reported-and-tested-by: Yi Zhang <[email protected]> Closes: https://lore.kernel.org/linux-block/CAHj4cs9wv3SdPo+N01Fw2SHBYDs9tj2M_e1-GdQOkRy=DsBB1w@mail.gmail.com/ Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Keith Busch <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Sungwoo Kim <[email protected]> Date: Fri Feb 27 19:19:28 2026 -0500 nvme: fix memory allocation in nvme_pr_read_keys() [ Upstream commit c3320153769f05fd7fe9d840cb555dd3080ae424 ] nvme_pr_read_keys() takes num_keys from userspace and uses it to calculate the allocation size for rse via struct_size(). The upper limit is PR_KEYS_MAX (64K). A malicious or buggy userspace can pass a large num_keys value that results in a 4MB allocation attempt at most, causing a warning in the page allocator when the order exceeds MAX_PAGE_ORDER. To fix this, use kvzalloc() instead of kzalloc(). This bug has the same reasoning and fix with the patch below: https://lore.kernel.org/linux-block/[email protected]/ Warning log: WARNING: mm/page_alloc.c:5216 at __alloc_frozen_pages_noprof+0x5aa/0x2300 mm/page_alloc.c:5216, CPU#1: syz-executor117/272 Modules linked in: CPU: 1 UID: 0 PID: 272 Comm: syz-executor117 Not tainted 6.19.0 #1 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 RIP: 0010:__alloc_frozen_pages_noprof+0x5aa/0x2300 mm/page_alloc.c:5216 Code: ff 83 bd a8 fe ff ff 0a 0f 86 69 fb ff ff 0f b6 1d f9 f9 c4 04 80 fb 01 0f 87 3b 76 30 ff 83 e3 01 75 09 c6 05 e4 f9 c4 04 01 <0f> 0b 48 c7 85 70 fe ff ff 00 00 00 00 e9 8f fd ff ff 31 c0 e9 0d RSP: 0018:ffffc90000fcf450 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffff920001f9ea0 RDX: 0000000000000000 RSI: 000000000000000b RDI: 0000000000040dc0 RBP: ffffc90000fcf648 R08: ffff88800b6c3380 R09: 0000000000000001 R10: ffffc90000fcf840 R11: ffff88807ffad280 R12: 0000000000000000 R13: 0000000000040dc0 R14: 0000000000000001 R15: ffffc90000fcf620 FS: 0000555565db33c0(0000) GS:ffff8880be26c000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000002000000c CR3: 0000000003b72000 CR4: 00000000000006f0 Call Trace: <TASK> alloc_pages_mpol+0x236/0x4d0 mm/mempolicy.c:2486 alloc_frozen_pages_noprof+0x149/0x180 mm/mempolicy.c:2557 ___kmalloc_large_node+0x10c/0x140 mm/slub.c:5598 __kmalloc_large_node_noprof+0x25/0xc0 mm/slub.c:5629 __do_kmalloc_node mm/slub.c:5645 [inline] __kmalloc_noprof+0x483/0x6f0 mm/slub.c:5669 kmalloc_noprof include/linux/slab.h:961 [inline] kzalloc_noprof include/linux/slab.h:1094 [inline] nvme_pr_read_keys+0x8f/0x4c0 drivers/nvme/host/pr.c:245 blkdev_pr_read_keys block/ioctl.c:456 [inline] blkdev_common_ioctl+0x1b71/0x29b0 block/ioctl.c:730 blkdev_ioctl+0x299/0x700 block/ioctl.c:786 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl fs/ioctl.c:583 [inline] __x64_sys_ioctl+0x1bf/0x220 fs/ioctl.c:583 x64_sys_call+0x1280/0x21b0 mnt/fuzznvme_1/fuzznvme/linux-build/v6.19/./arch/x86/include/generated/asm/syscalls_64.h:17 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x71/0x330 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fb893d3108d Code: 28 c3 e8 46 1e 00 00 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffff61f2f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffff61f3138 RCX: 00007fb893d3108d RDX: 0000000020000040 RSI: 00000000c01070ce RDI: 0000000000000003 RBP: 0000000000000001 R08: 0000000000000000 R09: 00007ffff61f3138 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 R13: 00007ffff61f3128 R14: 00007fb893dae530 R15: 0000000000000001 </TASK> Fixes: 5fd96a4e15de (nvme: Add pr_ops read_keys support) Acked-by: Chao Shi <[email protected]> Acked-by: Weidong Zhu <[email protected]> Acked-by: Dave Tian <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Sungwoo Kim <[email protected]> Signed-off-by: Keith Busch <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Stefan Hajnoczi <[email protected]> Date: Mon Dec 1 16:43:27 2025 -0500 nvme: reject invalid pr_read_keys() num_keys values [ Upstream commit 38ec8469f39e0e96e7dd9b76f05e0f8eb78be681 ] The pr_read_keys() interface has a u32 num_keys parameter. The NVMe Reservation Report command has a u32 maximum length. Reject num_keys values that are too large to fit. This will become important when pr_read_keys() is exposed to untrusted userspace via an <linux/pr.h> ioctl. Signed-off-by: Stefan Hajnoczi <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Stable-dep-of: c3320153769f ("nvme: fix memory allocation in nvme_pr_read_keys()") Signed-off-by: Sasha Levin <[email protected]>
Author: Justin Tee <[email protected]> Date: Thu Dec 4 12:26:13 2025 -0800 nvmet-fcloop: Check remoteport port_state before calling done callback [ Upstream commit dd677d0598387ea623820ab2bd0e029c377445a3 ] In nvme_fc_handle_ls_rqst_work, the lsrsp->done callback is only set when remoteport->port_state is FC_OBJSTATE_ONLINE. Otherwise, the nvme_fc_xmt_ls_rsp's LLDD call to lport->ops->xmt_ls_rsp is expected to fail and the nvme-fc transport layer itself will directly call nvme_fc_xmt_ls_rsp_free instead of relying on LLDD's done callback to free the lsrsp resources. Update the fcloop_t2h_xmt_ls_rsp routine to check remoteport->port_state. If online, then lsrsp->done callback will free the lsrsp. Else, return -ENODEV to signal the nvme-fc transport to handle freeing lsrsp. Cc: Ewan D. Milne <[email protected]> Tested-by: Aristeu Rozanski <[email protected]> Acked-by: Aristeu Rozanski <[email protected]> Reviewed-by: Daniel Wagner <[email protected]> Closes: https://lore.kernel.org/linux-nvme/21255200-a271-4fa0-b099-97755c8acd4c@work/ Fixes: 10c165af35d2 ("nvmet-fcloop: call done callback even when remote port is gone") Signed-off-by: Justin Tee <[email protected]> Signed-off-by: Keith Busch <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Vimlesh Kumar <[email protected]> Date: Fri Feb 27 09:13:58 2026 +0000 octeon_ep: avoid compiler and IQ/OQ reordering [ Upstream commit 43b3160cb639079a15daeb5f080120afbfbfc918 ] Utilize READ_ONCE and WRITE_ONCE APIs for IO queue Tx/Rx variable access to prevent compiler optimization and reordering. Additionally, ensure IO queue OUT/IN_CNT registers are flushed by performing a read-back after writing. The compiler could reorder reads/writes to pkts_pending, last_pkt_count, etc., causing stale values to be used when calculating packets to process or register updates to send to hardware. The Octeon hardware requires a read-back after writing to OUT_CNT/IN_CNT registers to ensure the write has been flushed through any posted write buffers before the interrupt resend bit is set. Without this, we have observed cases where the hardware didn't properly update its internal state. wmb/rmb only provides ordering guarantees but doesn't prevent the compiler from performing optimizations like caching in registers, load tearing etc. Fixes: 37d79d0596062 ("octeon_ep: add Tx/Rx processing and interrupt support") Signed-off-by: Sathesh Edara <[email protected]> Signed-off-by: Shinas Rasheed <[email protected]> Signed-off-by: Vimlesh Kumar <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Vimlesh Kumar <[email protected]> Date: Fri Feb 27 09:13:57 2026 +0000 octeon_ep: Relocate counter updates before NAPI [ Upstream commit 18c04a808c436d629d5812ce883e3822a5f5a47f ] Relocate IQ/OQ IN/OUT_CNTS updates to occur before NAPI completion, and replace napi_complete with napi_complete_done. Moving the IQ/OQ counter updates before napi_complete_done ensures 1. Counter registers are updated before re-enabling interrupts. 2. Prevents a race where new packets arrive but counters aren't properly synchronized. napi_complete_done (vs napi_complete) allows for better interrupt coalescing. Fixes: 37d79d0596062 ("octeon_ep: add Tx/Rx processing and interrupt support") Signed-off-by: Sathesh Edara <[email protected]> Signed-off-by: Shinas Rasheed <[email protected]> Signed-off-by: Vimlesh Kumar <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Vimlesh Kumar <[email protected]> Date: Fri Feb 27 09:14:00 2026 +0000 octeon_ep_vf: avoid compiler and IQ/OQ reordering [ Upstream commit 6c73126ecd1080351b468fe43353b2f705487f44 ] Utilize READ_ONCE and WRITE_ONCE APIs for IO queue Tx/Rx variable access to prevent compiler optimization and reordering. Additionally, ensure IO queue OUT/IN_CNT registers are flushed by performing a read-back after writing. The compiler could reorder reads/writes to pkts_pending, last_pkt_count, etc., causing stale values to be used when calculating packets to process or register updates to send to hardware. The Octeon hardware requires a read-back after writing to OUT_CNT/IN_CNT registers to ensure the write has been flushed through any posted write buffers before the interrupt resend bit is set. Without this, we have observed cases where the hardware didn't properly update its internal state. wmb/rmb only provides ordering guarantees but doesn't prevent the compiler from performing optimizations like caching in registers, load tearing etc. Fixes: 1cd3b407977c3 ("octeon_ep_vf: add Tx/Rx processing and interrupt support") Signed-off-by: Sathesh Edara <[email protected]> Signed-off-by: Shinas Rasheed <[email protected]> Signed-off-by: Vimlesh Kumar <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Vimlesh Kumar <[email protected]> Date: Fri Feb 27 09:13:59 2026 +0000 octeon_ep_vf: Relocate counter updates before NAPI [ Upstream commit 2ae7d20fb24f598f60faa8f6ecc856dac782261a ] Relocate IQ/OQ IN/OUT_CNTS updates to occur before NAPI completion. Moving the IQ/OQ counter updates before napi_complete_done ensures 1. Counter registers are updated before re-enabling interrupts. 2. Prevents a race where new packets arrive but counters aren't properly synchronized. Fixes: 1cd3b407977c3 ("octeon_ep_vf: add Tx/Rx processing and interrupt support") Signed-off-by: Sathesh Edara <[email protected]> Signed-off-by: Shinas Rasheed <[email protected]> Signed-off-by: Vimlesh Kumar <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Qiang Yu <[email protected]> Date: Sun Nov 9 22:59:40 2025 -0800 PCI: Add preceding capability position support in PCI_FIND_NEXT_*_CAP macros [ Upstream commit a2582e05e39adf9ab82a02561cd6f70738540ae0 ] Add support for finding the preceding capability position in PCI capability list by extending the capability finding macros with an additional parameter. This functionality is essential for modifying PCI capability list, as it provides the necessary information to update the "next" pointer of the predecessor capability when removing entries. Modify two macros to accept a new 'prev_ptr' parameter: - PCI_FIND_NEXT_CAP - Now accepts 'prev_ptr' parameter for standard capabilities - PCI_FIND_NEXT_EXT_CAP - Now accepts 'prev_ptr' parameter for extended capabilities When a capability is found, these macros: - Store the position of the preceding capability in *prev_ptr (if prev_ptr != NULL) - Maintain all existing functionality when prev_ptr is NULL Update current callers to accommodate this API change by passing NULL to 'prev_ptr' argument if they do not care about the preceding capability position. No functional changes to driver behavior result from this commit as it maintains the existing capability finding functionality while adding the infrastructure for future capability removal operations. Signed-off-by: Qiang Yu <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Link: https://patch.msgid.link/[email protected] Stable-dep-of: 43d67ec26b32 ("PCI: dwc: ep: Fix resizable BAR support for multi-PF configurations") Signed-off-by: Sasha Levin <[email protected]>
Author: Bjorn Helgaas <[email protected]> Date: Fri Feb 27 06:10:08 2026 -0600 PCI: Correct PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 value [ Upstream commit 39195990e4c093c9eecf88f29811c6de29265214 ] fb82437fdd8c ("PCI: Change capability register offsets to hex") incorrectly converted the PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 value from decimal 52 to hex 0x32: -#define PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 52 /* v2 endpoints with link end here */ +#define PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 0x32 /* end of v2 EPs w/ link */ This broke PCI capabilities in a VMM because subsequent ones weren't DWORD-aligned. Change PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 to the correct value of 0x34. fb82437fdd8c was from Baruch Siach <[email protected]>, but this was not Baruch's fault; it's a mistake I made when applying the patch. Fixes: fb82437fdd8c ("PCI: Change capability register offsets to hex") Reported-by: David Woodhouse <[email protected]> Closes: https://lore.kernel.org/all/[email protected] Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Krzysztof Wilczyński <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Shawn Lin <[email protected]> Date: Fri Dec 12 09:33:25 2025 +0800 PCI: dw-rockchip: Change get_ltssm() to provide L1 Substates info [ Upstream commit f994bb8f1c94726e0124356ccd31c3c23a8a69f4 ] Rename rockchip_pcie_get_ltssm() to rockchip_pcie_get_ltssm_reg() and add rockchip_pcie_get_ltssm() to get_ltssm() callback in order to show the proper L1 Substates. The PCIE_CLIENT_LTSSM_STATUS[5:0] register returns the same LTSSM layout as enum dw_pcie_ltssm. So the driver just need to convey L1 PM Substates by returning the proper value defined in pcie-designware.h. cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status L1_2 (0x142) Signed-off-by: Shawn Lin <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Link: https://patch.msgid.link/[email protected] Stable-dep-of: 180c3cfe3678 ("Revert "PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ"") Signed-off-by: Sasha Levin <[email protected]>
Author: Shawn Lin <[email protected]> Date: Tue Nov 18 15:42:17 2025 -0600 PCI: dw-rockchip: Configure L1SS support [ Upstream commit b5e719f26107f4a7f82946dc5be92dceb9b443cb ] L1 PM Substates for RC mode require support in the dw-rockchip driver including proper handling of the CLKREQ# sideband signal. It is mostly handled by hardware, but software still needs to set the clkreq fields in the PCIE_CLIENT_POWER_CON register to match the hardware implementation. For more details, see section '18.6.6.4 L1 Substate' in the RK3568 TRM 1.1 Part 2, or section '11.6.6.4 L1 Substate' in the RK3588 TRM 1.0 Part2. [bhelgaas: set pci->l1ss_support so DWC core preserves L1SS Capability bits; drop corresponding code here, include updates from https://lore.kernel.org/r/aRRG8wv13HxOCqgA@ryzen] Signed-off-by: Shawn Lin <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Link: https://patch.msgid.link/[email protected] Link: https://patch.msgid.link/[email protected] Stable-dep-of: 180c3cfe3678 ("Revert "PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ"") Signed-off-by: Sasha Levin <[email protected]>
Author: Shawn Lin <[email protected]> Date: Fri Dec 12 09:33:24 2025 +0800 PCI: dwc: Add L1 Substates context to ltssm_status of debugfs [ Upstream commit 679ec639f29cbdaf36bd79bf3e98240fffa335ee ] DWC core couldn't distinguish LTSSM state among L1.0, L1.1 and L1.2. But the vendor glue driver may implement additional logic to convey this information. So add two pseudo definitions for vendor glue drivers to translate their internal L1 Substates for debugfs to show. Signed-off-by: Shawn Lin <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Link: https://patch.msgid.link/[email protected] Stable-dep-of: 180c3cfe3678 ("Revert "PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ"") Signed-off-by: Sasha Levin <[email protected]>
Author: Qiang Yu <[email protected]> Date: Sun Nov 9 22:59:41 2025 -0800 PCI: dwc: Add new APIs to remove standard and extended Capability [ Upstream commit 0183562f1e824c0ca6c918309a0978e9a269af3e ] On some platforms, certain PCIe Capabilities may be present in hardware but are not fully implemented as defined in PCIe spec. These incomplete capabilities should be hidden from the PCI framework to prevent unexpected behavior. Introduce two APIs to remove a specific PCIe Capability and Extended Capability by updating the previous capability's next offset field to skip over the unwanted capability. These APIs allow RC drivers to easily hide unsupported or partially implemented capabilities from software. Co-developed-by: Wenbin Yao <[email protected]> Signed-off-by: Wenbin Yao <[email protected]> Signed-off-by: Qiang Yu <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Link: https://patch.msgid.link/[email protected] Stable-dep-of: 43d67ec26b32 ("PCI: dwc: ep: Fix resizable BAR support for multi-PF configurations") Signed-off-by: Sasha Levin <[email protected]>
Author: Bjorn Helgaas <[email protected]> Date: Tue Nov 18 15:42:15 2025 -0600 PCI: dwc: Advertise L1 PM Substates only if driver requests it [ Upstream commit a00bba406b5a682764ecb507e580ca8159196aa3 ] L1 PM Substates require the CLKREQ# signal and may also require device-specific support. If CLKREQ# is not supported or driver support is lacking, enabling L1.1 or L1.2 may cause errors when accessing devices, e.g., nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10 If the kernel is built with CONFIG_PCIEASPM_POWER_SUPERSAVE=y or users enable L1.x via sysfs, users may trip over these errors even if L1 Substates haven't been enabled by firmware or the driver. To prevent such errors, disable advertising the L1 PM Substates unless the driver sets "dw_pcie.l1ss_support" to indicate that it knows CLKREQ# is present and any device-specific configuration has been done. Set "dw_pcie.l1ss_support" in tegra194 (if DT includes the "supports-clkreq' property) and qcom (for cfg_2_7_0, cfg_1_9_0, cfg_1_34_0, and cfg_sc8280xp controllers) so they can continue to use L1 Substates. Based on Niklas's patch: https://patch.msgid.link/[email protected] [bhelgaas: drop hiding for endpoints] Signed-off-by: Bjorn Helgaas <[email protected]> Link: https://patch.msgid.link/[email protected] Stable-dep-of: 180c3cfe3678 ("Revert "PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ"") Signed-off-by: Sasha Levin <[email protected]>
Author: Aksh Garg <[email protected]> Date: Fri Jan 30 17:25:14 2026 +0530 PCI: dwc: ep: Fix resizable BAR support for multi-PF configurations [ Upstream commit 43d67ec26b329f8aea34ba9dff23d69b84a8e564 ] The resizable BAR support added by the commit 3a3d4cabe681 ("PCI: dwc: ep: Allow EPF drivers to configure the size of Resizable BARs") incorrectly configures the resizable BARs only for the first Physical Function (PF0) in EP mode. The resizable BAR configuration functions use generic dw_pcie_*_dbi() operations instead of physical function specific dw_pcie_ep_*_dbi() operations. This causes resizable BAR configuration to always target PF0 regardless of the requested function number. Additionally, dw_pcie_ep_init_non_sticky_registers() only initializes resizable BAR registers for PF0, leaving other PFs unconfigured during the execution of this function. Fix this by using physical function specific configuration space access operations throughout the resizable BAR code path and initializing registers for all the physical functions that support resizable BARs. Fixes: 3a3d4cabe681 ("PCI: dwc: ep: Allow EPF drivers to configure the size of Resizable BARs") Signed-off-by: Aksh Garg <[email protected]> [mani: added stable tag] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Niklas Cassel <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Niklas Cassel <[email protected]> Date: Wed Feb 11 18:55:41 2026 +0100 PCI: dwc: ep: Flush MSI-X write before unmapping its ATU entry [ Upstream commit c22533c66ccae10511ad6a7afc34bb26c47577e3 ] Endpoint drivers use dw_pcie_ep_raise_msix_irq() to raise an MSI-X interrupt to the host using a writel(), which generates a PCI posted write transaction. There's no completion for posted writes, so the writel() may return before the PCI write completes. dw_pcie_ep_raise_msix_irq() also unmaps the outbound ATU entry used for the PCI write, so the write races with the unmap. If the PCI write loses the race with the ATU unmap, the write may corrupt host memory or cause IOMMU errors, e.g., these when running fio with a larger queue depth against nvmet-pci-epf: arm-smmu-v3 fc900000.iommu: 0x0000010000000010 arm-smmu-v3 fc900000.iommu: 0x0000020000000000 arm-smmu-v3 fc900000.iommu: 0x000000090000f040 arm-smmu-v3 fc900000.iommu: 0x0000000000000000 arm-smmu-v3 fc900000.iommu: event: F_TRANSLATION client: 0000:01:00.0 sid: 0x100 ssid: 0x0 iova: 0x90000f040 ipa: 0x0 arm-smmu-v3 fc900000.iommu: unpriv data write s1 "Input address caused fault" stag: 0x0 Flush the write by performing a readl() of the same address to ensure that the write has reached the destination before the ATU entry is unmapped. The same problem was solved for dw_pcie_ep_raise_msi_irq() in commit 8719c64e76bf ("PCI: dwc: ep: Cache MSI outbound iATU mapping"), but there it was solved by dedicating an outbound iATU only for MSI. We can't do the same for MSI-X because each vector can have a different msg_addr and the msg_addr may be changed while the vector is masked. Fixes: beb4641a787d ("PCI: dwc: Add MSI-X callbacks handler") Signed-off-by: Niklas Cassel <[email protected]> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Frank Li <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Qiang Yu <[email protected]> Date: Wed Dec 24 02:10:46 2025 -0800 PCI: dwc: Remove duplicate dw_pcie_ep_hide_ext_capability() function [ Upstream commit 86291f774fe8524178446cb2c792939640b4970c ] Remove dw_pcie_ep_hide_ext_capability() and replace its usage with dw_pcie_remove_ext_capability(). Both functions serve the same purpose of hiding PCIe extended capabilities, but dw_pcie_remove_ext_capability() provides a cleaner API that doesn't require the caller to specify the previous capability ID. Suggested-by: Niklas Cassel <[email protected]> Signed-off-by: Qiang Yu <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Tested-by: Niklas Cassel <[email protected]> Link: https://patch.msgid.link/20251224-remove_dw_pcie_ep_hide_ext_capability-v1-1-4302c9cdc316@oss.qualcomm.com Stable-dep-of: 43d67ec26b32 ("PCI: dwc: ep: Fix resizable BAR support for multi-PF configurations") Signed-off-by: Sasha Levin <[email protected]>
Author: Siddharth Vadapalli <[email protected]> Date: Mon Nov 17 17:02:06 2025 +0530 PCI: j721e: Add config guards for Cadence Host and Endpoint library APIs [ Upstream commit 4b361b1e92be255ff923453fe8db74086cc7cf66 ] Commit under Fixes enabled loadable module support for the driver under the assumption that it shall be the sole user of the Cadence Host and Endpoint library APIs. This assumption guarantees that we won't end up in a case where the driver is built-in and the library support is built as a loadable module. With the introduction of [1], this assumption is no longer valid. The SG2042 driver could be built as a loadable module, implying that the Cadence Host library is also selected as a loadable module. However, the pci-j721e.c driver could be built-in as indicated by CONFIG_PCI_J721E=y due to which the Cadence Endpoint library is built-in. Despite the library drivers being built as specified by their respective consumers, since the 'pci-j721e.c' driver has references to the Cadence Host library APIs as well, we run into a build error as reported at [0]. Fix this by adding config guards as a temporary workaround. The proper fix is to split the 'pci-j721e.c' driver into independent Host and Endpoint drivers as aligned at [2]. [0]: https://lore.kernel.org/r/[email protected]/ [1]: commit 1c72774df028 ("PCI: sg2042: Add Sophgo SG2042 PCIe driver") [2]: https://lore.kernel.org/r/[email protected]/ Fixes: a2790bf81f0f ("PCI: j721e: Add support to build as a loadable module") Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Suggested-by: Arnd Bergmann <[email protected]> Signed-off-by: Siddharth Vadapalli <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Anand Moon <[email protected]> Date: Tue Oct 28 21:12:23 2025 +0530 PCI: j721e: Use devm_clk_get_optional_enabled() to get and enable the clock [ Upstream commit 6fad11c61d0dbf87601ab9e2e37cba7a9a427f7b ] Use devm_clk_get_optional_enabled() helper instead of calling devm_clk_get_optional() and then clk_prepare_enable(). Assign the result of devm_clk_get_optional_enabled() directly to pcie->refclk to avoid using a local 'clk' variable. Signed-off-by: Anand Moon <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Siddharth Vadapalli <[email protected]> Link: https://patch.msgid.link/[email protected] Stable-dep-of: 4b361b1e92be ("PCI: j721e: Add config guards for Cadence Host and Endpoint library APIs") Signed-off-by: Sasha Levin <[email protected]>
Author: Namhyung Kim <[email protected]> Date: Mon Jun 2 21:51:05 2025 -0700 perf/core: Fix invalid wait context in ctx_sched_in() [ Upstream commit 486ff5ad49bc50315bcaf6d45f04a33ef0a45ced ] Lockdep found a bug in the event scheduling when a pinned event was failed and wakes up the threads in the ring buffer like below. It seems it should not grab a wait-queue lock under perf-context lock. Let's do it with irq_work. [ 39.913691] ============================= [ 39.914157] [ BUG: Invalid wait context ] [ 39.914623] 6.15.0-next-20250530-next-2025053 #1 Not tainted [ 39.915271] ----------------------------- [ 39.915731] repro/837 is trying to lock: [ 39.916191] ffff88801acfabd8 (&event->waitq){....}-{3:3}, at: __wake_up+0x26/0x60 [ 39.917182] other info that might help us debug this: [ 39.917761] context-{5:5} [ 39.918079] 4 locks held by repro/837: [ 39.918530] #0: ffffffff8725cd00 (rcu_read_lock){....}-{1:3}, at: __perf_event_task_sched_in+0xd1/0xbc0 [ 39.919612] #1: ffff88806ca3c6f8 (&cpuctx_lock){....}-{2:2}, at: __perf_event_task_sched_in+0x1a7/0xbc0 [ 39.920748] #2: ffff88800d91fc18 (&ctx->lock){....}-{2:2}, at: __perf_event_task_sched_in+0x1f9/0xbc0 [ 39.921819] #3: ffffffff8725cd00 (rcu_read_lock){....}-{1:3}, at: perf_event_wakeup+0x6c/0x470 Fixes: f4b07fd62d4d ("perf/core: Use POLLHUP for a pinned event in error") Closes: https://lore.kernel.org/lkml/aD2w50VDvGIH95Pf@ly-workstation Reported-by: "Lai, Yi" <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: "Lai, Yi" <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Haocheng Yu <[email protected]> Date: Tue Feb 3 00:20:56 2026 +0800 perf/core: Fix refcount bug and potential UAF in perf_mmap commit 77de62ad3de3967818c3dbe656b7336ebee461d2 upstream. Syzkaller reported a refcount_t: addition on 0; use-after-free warning in perf_mmap. The issue is caused by a race condition between a failing mmap() setup and a concurrent mmap() on a dependent event (e.g., using output redirection). In perf_mmap(), the ring_buffer (rb) is allocated and assigned to event->rb with the mmap_mutex held. The mutex is then released to perform map_range(). If map_range() fails, perf_mmap_close() is called to clean up. However, since the mutex was dropped, another thread attaching to this event (via inherited events or output redirection) can acquire the mutex, observe the valid event->rb pointer, and attempt to increment its reference count. If the cleanup path has already dropped the reference count to zero, this results in a use-after-free or refcount saturation warning. Fix this by extending the scope of mmap_mutex to cover the map_range() call. This ensures that the ring buffer initialization and mapping (or cleanup on failure) happens atomically effectively, preventing other threads from accessing a half-initialized or dying ring buffer. Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Reported-by: kernel test robot <[email protected]> Signed-off-by: Haocheng Yu <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Zide Chen <[email protected]> Date: Mon Feb 9 16:52:25 2026 -0800 perf/x86/intel/uncore: Add per-scheduler IMC CAS count events commit 6a8a48644c4b804123e59dbfc5d6cd29a0194046 upstream. IMC on SPR and EMR does not support sub-channels. In contrast, CPUs that use gnr_uncores[] (e.g. Granite Rapids and Sierra Forest) implement two command schedulers (SCH0/SCH1) per memory channel, providing logically independent command and data paths. Do not reuse the spr_uncore_imc[] configuration for these CPUs. Instead, introduce a dedicated gnr_uncore_imc[] with per-scheduler events, so userspace can monitor SCH0 and SCH1 independently. On these CPUs, replace cas_count_{read,write} with cas_count_{read,write}_sch{0,1}. This may break existing userspace that relies on cas_count_{read,write}, prompting it to switch to the per-scheduler events, as the legacy event reports only partial traffic (SCH0). Fixes: 632c4bf6d007 ("perf/x86/intel/uncore: Support Granite Rapids") Fixes: cb4a6ccf3583 ("perf/x86/intel/uncore: Support Sierra Forest and Grand Ridge") Reported-by: Reinette Chatre <[email protected]> Signed-off-by: Zide Chen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Dapeng Mi <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Peter Zijlstra <[email protected]> Date: Tue Feb 24 13:29:09 2026 +0100 perf: Fix __perf_event_overflow() vs perf_remove_from_context() race [ Upstream commit c9bc1753b3cc41d0e01fbca7f035258b5f4db0ae ] Make sure that __perf_event_overflow() runs with IRQs disabled for all possible callchains. Specifically the software events can end up running it with only preemption disabled. This opens up a race vs perf_event_exit_event() and friends that will go and free various things the overflow path expects to be present, like the BPF program. Fixes: 592903cdcbf6 ("perf_counter: add an event_list") Reported-by: Simond Hu <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Simond Hu <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Felix Gu <[email protected]> Date: Mon Feb 23 17:39:07 2026 +0800 pinctrl: cirrus: cs42l43: Fix double-put in cs42l43_pin_probe() [ Upstream commit fd5bed798f45eb3a178ad527b43ab92705faaf8a ] devm_add_action_or_reset() already invokes the action on failure, so the explicit put causes a double-put. Fixes: 9b07cdf86a0b ("pinctrl: cirrus: Fix fwnode leak in cs42l43_pin_probe()") Signed-off-by: Felix Gu <[email protected]> Reviewed-by: Charles Keepax <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Florian Eckert <[email protected]> Date: Thu Feb 5 13:55:46 2026 +0100 pinctrl: equilibrium: fix warning trace on load [ Upstream commit 3e00b1b332e54ba50cca6691f628b9c06574024f ] The callback functions 'eqbr_irq_mask()' and 'eqbr_irq_ack()' are also called in the callback function 'eqbr_irq_mask_ack()'. This is done to avoid source code duplication. The problem, is that in the function 'eqbr_irq_mask()' also calles the gpiolib function 'gpiochip_disable_irq()' This generates the following warning trace in the log for every gpio on load. [ 6.088111] ------------[ cut here ]------------ [ 6.092440] WARNING: CPU: 3 PID: 1 at drivers/gpio/gpiolib.c:3810 gpiochip_disable_irq+0x39/0x50 [ 6.097847] Modules linked in: [ 6.097847] CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.12.59+ #0 [ 6.097847] Tainted: [W]=WARN [ 6.097847] RIP: 0010:gpiochip_disable_irq+0x39/0x50 [ 6.097847] Code: 39 c6 48 19 c0 21 c6 48 c1 e6 05 48 03 b2 38 03 00 00 48 81 fe 00 f0 ff ff 77 11 48 8b 46 08 f6 c4 02 74 06 f0 80 66 09 fb c3 <0f> 0b 90 0f 1f 40 00 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 [ 6.097847] RSP: 0000:ffffc9000000b830 EFLAGS: 00010046 [ 6.097847] RAX: 0000000000000045 RBX: ffff888001be02a0 RCX: 0000000000000008 [ 6.097847] RDX: ffff888001be9000 RSI: ffff888001b2dd00 RDI: ffff888001be02a0 [ 6.097847] RBP: ffffc9000000b860 R08: 0000000000000000 R09: 0000000000000000 [ 6.097847] R10: 0000000000000001 R11: ffff888001b2a154 R12: ffff888001be0514 [ 6.097847] R13: ffff888001be02a0 R14: 0000000000000008 R15: 0000000000000000 [ 6.097847] FS: 0000000000000000(0000) GS:ffff888041d80000(0000) knlGS:0000000000000000 [ 6.097847] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6.097847] CR2: 0000000000000000 CR3: 0000000003030000 CR4: 00000000001026b0 [ 6.097847] Call Trace: [ 6.097847] <TASK> [ 6.097847] ? eqbr_irq_mask+0x63/0x70 [ 6.097847] ? no_action+0x10/0x10 [ 6.097847] eqbr_irq_mask_ack+0x11/0x60 In an other driver (drivers/pinctrl/starfive/pinctrl-starfive-jh7100.c) the interrupt is not disabled here. To fix this, do not call the 'eqbr_irq_mask()' and 'eqbr_irq_ack()' function. Implement instead this directly without disabling the interrupts. Fixes: 52066a53bd11 ("pinctrl: equilibrium: Convert to immutable irq_chip") Signed-off-by: Florian Eckert <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Florian Eckert <[email protected]> Date: Thu Feb 5 13:55:45 2026 +0100 pinctrl: equilibrium: rename irq_chip function callbacks [ Upstream commit 1f96b84835eafb3e6f366dc3a66c0e69504cec9d ] Renaming of the irq_chip callback functions to improve clarity. Signed-off-by: Florian Eckert <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Stable-dep-of: 3e00b1b332e5 ("pinctrl: equilibrium: fix warning trace on load") Signed-off-by: Sasha Levin <[email protected]>
Author: Conor Dooley <[email protected]> Date: Tue Feb 3 16:17:07 2026 +0000 pinctrl: generic: move function to amlogic-am4 driver [ Upstream commit 9c5a40f2922a5a6d6b42e7b3d4c8e253918c07a1 ] pinconf_generic_dt_node_to_map_pinmux() is not actually a generic function, and really belongs in the amlogic-am4 driver. There are three reasons why. First, and least, of the reasons is that this function behaves differently to the other dt_node_to_map functions in a way that is not obvious from a first glance. This difference stems for the devicetree properties that the function is intended for use with, and how they are typically used. The other generic dt_node_to_map functions support platforms where the pins, groups and functions are described statically in the driver and require a function that will produce a mapping from dt nodes to these pre-established descriptions. No other code in the driver is require to be executed at runtime. pinconf_generic_dt_node_to_map_pinmux() on the other hand is intended for use with the pinmux property, where groups and functions are determined entirely from the devicetree. As a result, there are no statically defined groups and functions in the driver for this function to perform a mapping to. Other drivers that use the pinmux property (e.g. the k1) their dt_node_to_map function creates the groups and functions as the devicetree is parsed. Instead of that, pinconf_generic_dt_node_to_map_pinmux() requires that the devicetree is parsed twice, once by it and once at probe, so that the driver dynamically creates the groups and functions before the dt_node_to_map callback is executed. I don't believe this double parsing requirement is how developers would expect this to work and is not necessary given there are drivers that do not have this behaviour. Secondly and thirdly, the function bakes in some assumptions that only really match the amlogic platform about how the devicetree is constructed. These, to me, are problematic for something that claims to be generic. The other dt_node_to_map implementations accept a being called for either a node containing pin configuration properties or a node containing child nodes that each contain the configuration properties. IOW, they support the following two devicetree configurations: | cfg { | label: group { | pinmux = <asjhdasjhlajskd>; | config-item1; | }; | }; | label: cfg { | group1 { | pinmux = <dsjhlfka>; | config-item2; | }; | group2 { | pinmux = <lsdjhaf>; | config-item1; | }; | }; pinconf_generic_dt_node_to_map_pinmux() only supports the latter. The other assumption about devicetree configuration that the function makes is that the labeled node's parent is a "function node". The amlogic driver uses these "function nodes" to create the functions at probe time, and pinconf_generic_dt_node_to_map_pinmux() finds the parent of the node it is operating on's name as part of the mapping. IOW, it requires that the devicetree look like: | pinctrl@bla { | | func-foo { | label: group-default { | pinmuxes = <lskdf>; | }; | }; | }; and couldn't be used if the nodes containing the pinmux and configuration properties are children of the pinctrl node itself: | pinctrl@bla { | | label: group-default { | pinmuxes = <lskdf>; | }; | }; These final two reasons are mainly why I believe this is not suitable as a generic function, and should be moved into the driver that is the sole user and originator of the "generic" function. Signed-off-by: Conor Dooley <[email protected]> Acked-by: Andy Shevchenko <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Stable-dep-of: a2539b92e4b7 ("pinctrl: meson: amlogic-a4: Fix device node reference leak in aml_dt_node_to_map_pinmux()") Signed-off-by: Sasha Levin <[email protected]>
Author: Felix Gu <[email protected]> Date: Thu Feb 19 00:51:22 2026 +0800 pinctrl: meson: amlogic-a4: Fix device node reference leak in aml_dt_node_to_map_pinmux() [ Upstream commit a2539b92e4b791c1ba482930b5e51b1591975461 ] The of_get_parent() function returns a device_node with an incremented reference count. Use the __free(device_node) cleanup attribute to ensure of_node_put() is automatically called when pnode goes out of scope, fixing a reference leak. Fixes: 6e9be3abb78c ("pinctrl: Add driver support for Amlogic SoCs") Signed-off-by: Felix Gu <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Maulik Shah <[email protected]> Date: Mon Feb 9 09:33:44 2026 +0530 pinctrl: qcom: qcs615: Add missing dual edge GPIO IRQ errata flag [ Upstream commit 09a30b7a035f9f4ac918c8a9af89d70e43462152 ] Wakeup capable GPIOs uses PDC as parent IRQ chip and PDC on qcs615 do not support dual edge IRQs. Add missing wakeirq_dual_edge_errata configuration to enable workaround for dual edge GPIO IRQs. Fixes: b698f36a9d40 ("pinctrl: qcom: add the tlmm driver for QCS615 platform") Signed-off-by: Maulik Shah <[email protected]> Reviewed-by: Dmitry Baryshkov <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Kurt Borja <[email protected]> Date: Thu Jan 29 12:19:24 2026 -0500 platform/x86: alienware-wmi-wmax: Add G-Mode support to m18 laptops commit bd5914caeb4b2de233992c31babccda88041b035 upstream. Alienware m18 laptops support G-Mode. Therefore, match them with G-Series quirks. Cc: [email protected] Tested-by: Olexa Bilaniuk <[email protected]> Signed-off-by: Kurt Borja <[email protected]> Link: https://patch.msgid.link/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Thorsten Blum <[email protected]> Date: Tue Mar 3 12:30:51 2026 +0100 platform/x86: dell-wmi-sysman: Don't hex dump plaintext password data commit d1a196e0a6dcddd03748468a0e9e3100790fc85c upstream. set_new_password() hex dumps the entire buffer, which contains plaintext password data, including current and new passwords. Remove the hex dump to avoid leaking credentials. Fixes: e8a60aa7404b ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems") Cc: [email protected] Signed-off-by: Thorsten Blum <[email protected]> Link: https://patch.msgid.link/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Kurt Borja <[email protected]> Date: Sat Feb 7 12:16:34 2026 -0500 platform/x86: dell-wmi: Add audio/mic mute key codes commit 26a7601471f62b95d56a81c3a8ccb551b5a6630f upstream. Add audio/mic mute key codes found in Alienware m18 r1 AMD. Cc: [email protected] Tested-by: Olexa Bilaniuk <[email protected]> Suggested-by: Olexa Bilaniuk <[email protected]> Signed-off-by: Kurt Borja <[email protected]> Acked-by: Pali Rohár <[email protected]> Link: https://patch.msgid.link/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Mario Limonciello <[email protected]> Date: Mon Mar 9 07:24:45 2026 -0400 platform/x86: hp-bioscfg: Support allocations of larger data [ Upstream commit 916727cfdb72cd01fef3fa6746e648f8cb70e713 ] Some systems have much larger amounts of enumeration attributes than have been previously encountered. This can lead to page allocation failures when using kcalloc(). Switch over to using kvcalloc() to allow larger allocations. Fixes: 6b2770bfd6f92 ("platform/x86: hp-bioscfg: enum-attributes") Cc: [email protected] Reported-by: Paul Kerry <[email protected]> Tested-by: Paul Kerry <[email protected]> Closes: https://bugs.debian.org/1127612 Signed-off-by: Mario Limonciello <[email protected]> Link: https://patch.msgid.link/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> [ kcalloc() => kvcalloc() ] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Jonathan Teh <[email protected]> Date: Mon Feb 16 01:01:29 2026 +0000 platform/x86: thinkpad_acpi: Fix errors reading battery thresholds [ Upstream commit 53e977b1d50c46f2c4ec3865cd13a822f58ad3cd ] Check whether the battery supports the relevant charge threshold before reading the value to silence these errors: thinkpad_acpi: acpi_evalf(BCTG, dd, ...) failed: AE_NOT_FOUND ACPI: \_SB_.PCI0.LPC_.EC__.HKEY: BCTG: evaluate failed thinkpad_acpi: acpi_evalf(BCSG, dd, ...) failed: AE_NOT_FOUND ACPI: \_SB_.PCI0.LPC_.EC__.HKEY: BCSG: evaluate failed when reading the charge thresholds via sysfs on platforms that do not support them such as the ThinkPad T400. Fixes: 2801b9683f74 ("thinkpad_acpi: Add support for battery thresholds") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=202619 Signed-off-by: Jonathan Teh <[email protected]> Reviewed-by: Mark Pearson <[email protected]> Link: https://patch.msgid.link/MI0P293MB01967B206E1CA6F337EBFB12926CA@MI0P293MB0196.ITAP293.PROD.OUTLOOK.COM Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Xuewen Yan <[email protected]> Date: Wed Feb 4 13:25:09 2026 +0100 PM: sleep: core: Avoid bit field races related to work_in_progress [ Upstream commit 0491f3f9f664e7e0131eb4d2a8b19c49562e5c64 ] In all of the system suspend transition phases, the async processing of a device may be carried out in parallel with power.work_in_progress updates for the device's parent or suppliers and if it touches bit fields from the same group (for example, power.must_resume or power.wakeup_path), bit field corruption is possible. To avoid that, turn work_in_progress in struct dev_pm_info into a proper bool field and relocate it to save space. Fixes: aa7a9275ab81 ("PM: sleep: Suspend async parents after suspending children") Fixes: 443046d1ad66 ("PM: sleep: Make suspend of devices more asynchronous") Signed-off-by: Xuewen Yan <[email protected]> Closes: https://lore.kernel.org/linux-pm/[email protected]/ Cc: All applicable <[email protected]> [ rjw: Added subject and changelog ] Link: https://patch.msgid.link/CAB8ipk_VX2VPm706Jwa1=8NSA7_btWL2ieXmBgHr2JcULEP76g@mail.gmail.com Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jason Gunthorpe <[email protected]> Date: Mon Feb 16 11:02:50 2026 -0400 RDMA/ionic: Fix kernel stack leak in ionic_create_cq() commit faa72102b178c7ae6c6afea23879e7c84fc59b4e upstream. struct ionic_cq_resp resp { __u32 cqid[2]; // offset 0 - PARTIALLY SET (see below) __u8 udma_mask; // offset 8 - SET (resp.udma_mask = vcq->udma_mask) __u8 rsvd[7]; // offset 9 - NEVER SET <- LEAK }; rsvd[7]: 7 bytes of stack memory leaked unconditionally. cqid[2]: The loop at line 1256 iterates over udma_idx but skips indices where !(vcq->udma_mask & BIT(udma_idx)). The array has 2 entries but udma_count could be 1, meaning cqid[1] might never be written via ionic_create_cq_common(). If udma_mask only has bit 0 set, cqid[1] (4 bytes) is also leaked. So potentially 11 bytes leaked. Cc: [email protected] Fixes: e8521822c733 ("RDMA/ionic: Register device ops for control path") Signed-off-by: Jason Gunthorpe <[email protected]> Link: https://patch.msgid.link/[email protected] Acked-by: Abhijit Gangurde <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Jason Gunthorpe <[email protected]> Date: Mon Feb 16 11:02:49 2026 -0400 RDMA/irdma: Fix kernel stack leak in irdma_create_user_ah() commit 74586c6da9ea222a61c98394f2fc0a604748438c upstream. struct irdma_create_ah_resp { // 8 bytes, no padding __u32 ah_id; // offset 0 - SET (uresp.ah_id = ah->sc_ah.ah_info.ah_idx) __u8 rsvd[4]; // offset 4 - NEVER SET <- LEAK }; rsvd[4]: 4 bytes of stack memory leaked unconditionally. Only ah_id is assigned before ib_respond_udata(). The reserved members of the structure were not zeroed. Cc: [email protected] Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Jason Gunthorpe <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Felix Gu <[email protected]> Date: Tue Feb 24 19:19:03 2026 +0800 regulator: bq257xx: Fix device node reference leak in bq257xx_reg_dt_parse_gpio() [ Upstream commit 4baaddaa44af01cd4ce239493060738fd0881835 ] In bq257xx_reg_dt_parse_gpio(), if fails to get subchild, it returns without calling of_node_put(child), causing the device node reference leak. Fixes: 981dd162b635 ("regulator: bq257xx: Add bq257xx boost regulator driver") Signed-off-by: Felix Gu <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Greg Kroah-Hartman <[email protected]> Date: Mon Mar 9 14:30:49 2026 +0100 Revert "netfilter: nft_set_rbtree: validate open interval overlap" This reverts commit 12b1681793e9b7552495290785a3570c539f409d which is commit 648946966a08e4cb1a71619e3d1b12bd7642de7b upstream. It is causing netfilter issues, so revert it for now. Link: https://lore.kernel.org/r/aaeEd8UqYQ33Af7_@chamomile Cc: Pablo Neira Ayuso <[email protected]> Cc: Florian Westphal <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Niklas Cassel <[email protected]> Date: Mon Dec 22 07:42:09 2025 +0100 Revert "PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ" [ Upstream commit 180c3cfe36786d261a55da52a161f9e279b19a6f ] This reverts commit 0e0b45ab5d770a748487ba0ae8f77d1fb0f0de3e. While this fake hotplugging was a nice idea, it has shown that this feature does not handle PCIe switches correctly: pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43 pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44 pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45 pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46 pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46 pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41]) pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46 During the initial scan, PCI core doesn't see the switch and since the Root Port is not hot plug capable, the secondary bus number gets assigned as the subordinate bus number. This means, the PCI core assumes that only one bus will appear behind the Root Port since the Root Port is not hot plug capable. This works perfectly fine for PCIe endpoints connected to the Root Port, since they don't extend the bus. However, if a PCIe switch is connected, then there is a problem when the downstream busses starts showing up and the PCI core doesn't extend the subordinate bus number and bridge resources after initial scan during boot. The long term plan is to migrate this driver to the upcoming pwrctrl APIs that are supposed to handle this problem elegantly. Suggested-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Niklas Cassel <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Tested-by: Shawn Lin <[email protected]> Acked-by: Shawn Lin <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Mathieu Desnoyers <[email protected]> Date: Fri Feb 20 15:06:40 2026 -0500 rseq: Clarify rseq registration rseq_size bound check comment [ Upstream commit 26d43a90be81fc90e26688a51d3ec83188602731 ] The rseq registration validates that the rseq_size argument is greater or equal to 32 (the original rseq size), but the comment associated with this check does not clearly state this. Clarify the comment to that effect. Fixes: ee3e3ac05c26 ("rseq: Introduce extensible rseq ABI") Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Alexandre Courbot <[email protected]> Date: Tue Feb 24 19:37:56 2026 +0900 rust: kunit: fix warning when !CONFIG_PRINTK [ Upstream commit 7dd34dfc8dfa92a7244242098110388367996ac3 ] If `CONFIG_PRINTK` is not set, then the following warnings are issued during build: warning: unused variable: `args` --> ../rust/kernel/kunit.rs:16:12 | 16 | pub fn err(args: fmt::Arguments<'_>) { | ^^^^ help: if this is intentional, prefix it with an underscore: `_args` | = note: `#[warn(unused_variables)]` (part of `#[warn(unused)]`) on by default warning: unused variable: `args` --> ../rust/kernel/kunit.rs:32:13 | 32 | pub fn info(args: fmt::Arguments<'_>) { | ^^^^ help: if this is intentional, prefix it with an underscore: `_args` Fix this by adding a no-op assignment using `args` when `CONFIG_PRINTK` is not set. Fixes: a66d733da801 ("rust: support running Rust documentation tests as KUnit ones") Signed-off-by: Alexandre Courbot <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Reviewed-by: David Gow <[email protected]> Signed-off-by: Shuah Khan <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Heiko Carstens <[email protected]> Date: Wed Feb 18 15:20:04 2026 +0100 s390/idle: Fix cpu idle exit cpu time accounting [ Upstream commit 0d785e2c324c90662baa4fe07a0d02233ff92824 ] With the conversion to generic entry [1] cpu idle exit cpu time accounting was converted from assembly to C. This introduced an reversed order of cpu time accounting. On cpu idle exit the current accounting happens with the following call chain: -> do_io_irq()/do_ext_irq() -> irq_enter_rcu() -> account_hardirq_enter() -> vtime_account_irq() -> vtime_account_kernel() vtime_account_kernel() accounts the passed cpu time since last_update_timer as system time, and updates last_update_timer to the current cpu timer value. However the subsequent call of -> account_idle_time_irq() will incorrectly subtract passed cpu time from timer_idle_enter to the updated last_update_timer value from system_timer. Then last_update_timer is updated to a sys_enter_timer, which means that last_update_timer goes back in time. Subsequently account_hardirq_exit() will account too much cpu time as hardirq time. The sum of all accounted cpu times is still correct, however some cpu time which was previously accounted as system time is now accounted as hardirq time, plus there is the oddity that last_update_timer goes back in time. Restore previous behavior by extracting cpu time accounting code from account_idle_time_irq() into a new update_timer_idle() function and call it before irq_enter_rcu(). Fixes: 56e62a737028 ("s390: convert to generic entry") [1] Reviewed-by: Sven Schnelle <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Heiko Carstens <[email protected]> Date: Wed Feb 18 15:20:05 2026 +0100 s390/vtime: Fix virtual timer forwarding [ Upstream commit dbc0fb35679ed5d0adecf7d02137ac2c77244b3b ] Since delayed accounting of system time [1] the virtual timer is forwarded by do_account_vtime() but also vtime_account_kernel(), vtime_account_softirq(), and vtime_account_hardirq(). This leads to double accounting of system, guest, softirq, and hardirq time. Remove accounting from the vtime_account*() family to restore old behavior. There is only one user of the vtimer interface, which might explain why nobody noticed this so far. Fixes: b7394a5f4ce9 ("sched/cputime, s390: Implement delayed accounting of system time") [1] Reviewed-by: Sven Schnelle <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Wang Tao <[email protected]> Date: Tue Jan 20 12:31:13 2026 +0000 sched/eevdf: Update se->vprot in reweight_entity() [ Upstream commit ff38424030f98976150e42ca35f4b00e6ab8fa23 ] In the EEVDF framework with Run-to-Parity protection, `se->vprot` is an independent variable defining the virtual protection timestamp. When `reweight_entity()` is called (e.g., via nice/renice), it performs the following actions to preserve Lag consistency: 1. Scales `se->vlag` based on the new weight. 2. Calls `place_entity()`, which recalculates `se->vruntime` based on the new weight and scaled lag. However, the current implementation fails to update `se->vprot`, leading to mismatches between the task's actual runtime and its expected duration. Fixes: 63304558ba5d ("sched/eevdf: Curb wakeup-preemption") Suggested-by: Zhang Qiao <[email protected]> Signed-off-by: Wang Tao <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Vincent Guittot <[email protected]> Tested-by: K Prateek Nayak <[email protected]> Tested-by: Shubhang Kaushik <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Peter Zijlstra <[email protected]> Date: Tue Apr 22 12:16:28 2025 +0200 sched/fair: Fix lag clamp [ Upstream commit 6e3c0a4e1ad1e0455b7880fad02b3ee179f56c09 ] Vincent reported that he was seeing undue lag clamping in a mixed slice workload. Implement the max_slice tracking as per the todo comment. Fixes: 147f3efaa241 ("sched/fair: Implement an EEVDF-like scheduling policy") Reported-off-by: Vincent Guittot <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Vincent Guittot <[email protected]> Tested-by: K Prateek Nayak <[email protected]> Tested-by: Shubhang Kaushik <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Peter Zijlstra <[email protected]> Date: Mon Feb 9 15:28:16 2026 +0100 sched/fair: Fix zero_vruntime tracking [ Upstream commit b3d99f43c72b56cf7a104a364e7fb34b0702828b ] It turns out that zero_vruntime tracking is broken when there is but a single task running. Current update paths are through __{en,de}queue_entity(), and when there is but a single task, pick_next_task() will always return that one task, and put_prev_set_next_task() will end up in neither function. This can cause entity_key() to grow indefinitely large and cause overflows, leading to much pain and suffering. Furtermore, doing update_zero_vruntime() from __{de,en}queue_entity(), which are called from {set_next,put_prev}_entity() has problems because: - set_next_entity() calls __dequeue_entity() before it does cfs_rq->curr = se. This means the avg_vruntime() will see the removal but not current, missing the entity for accounting. - put_prev_entity() calls __enqueue_entity() before it does cfs_rq->curr = NULL. This means the avg_vruntime() will see the addition *and* current, leading to double accounting. Both cases are incorrect/inconsistent. Noting that avg_vruntime is already called on each {en,de}queue, remove the explicit avg_vruntime() calls (which removes an extra 64bit division for each {en,de}queue) and have avg_vruntime() update zero_vruntime itself. Additionally, have the tick call avg_vruntime() -- discarding the result, but for the side-effect of updating zero_vruntime. While there, optimize avg_vruntime() by noting that the average of one value is rather trivial to compute. Test case: # taskset -c -p 1 $$ # taskset -c 2 bash -c 'while :; do :; done&' # cat /sys/kernel/debug/sched/debug | awk '/^cpu#/ {P=0} /^cpu#2,/ {P=1} {if (P) print $0}' | grep -e zero_vruntime -e "^>" PRE: .zero_vruntime : 31316.407903 >R bash 487 50787.345112 E 50789.145972 2.800000 50780.298364 16 120 0.000000 0.000000 0.000000 / .zero_vruntime : 382548.253179 >R bash 487 427275.204288 E 427276.003584 2.800000 427268.157540 23 120 0.000000 0.000000 0.000000 / POST: .zero_vruntime : 17259.709467 >R bash 526 17259.709467 E 17262.509467 2.800000 16915.031624 9 120 0.000000 0.000000 0.000000 / .zero_vruntime : 18702.723356 >R bash 526 18702.723356 E 18705.523356 2.800000 18358.045513 9 120 0.000000 0.000000 0.000000 / Fixes: 79f3f9bedd14 ("sched/eevdf: Fix min_vruntime vs avg_vruntime") Reported-by: K Prateek Nayak <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: K Prateek Nayak <[email protected]> Tested-by: Shubhang Kaushik <[email protected]> Link: https://patch.msgid.link/20260219080624.438854780%40infradead.org Signed-off-by: Sasha Levin <[email protected]>
Author: Ingo Molnar <[email protected]> Date: Tue Dec 2 16:10:32 2025 +0100 sched/fair: Introduce and use the vruntime_cmp() and vruntime_op() wrappers for wrapped-signed aritmetics [ Upstream commit 5758e48eefaf111d7764d8f1c8b666140fe5fa27 ] We have to be careful with vruntime comparisons and subtraction, due to the possibility of wrapping, so we have macros like: #define vruntime_gt(field, lse, rse) ({ (s64)((lse)->field - (rse)->field) > 0; }) Which is used like this: if (vruntime_gt(min_vruntime, se, rse)) se->min_vruntime = rse->min_vruntime; Replace this with an easier to read pattern that uses the regular arithmetics operators: if (vruntime_cmp(se->min_vruntime, ">", rse->min_vruntime)) se->min_vruntime = rse->min_vruntime; Also replace vruntime subtractions with vruntime_op(): - delta = (s64)(sea->vruntime - seb->vruntime) + - (s64)(cfs_rqb->zero_vruntime_fi - cfs_rqa->zero_vruntime_fi); + delta = vruntime_op(sea->vruntime, "-", seb->vruntime) + + vruntime_op(cfs_rqb->zero_vruntime_fi, "-", cfs_rqa->zero_vruntime_fi); In the vruntime_cmp() and vruntime_op() macros use Use __builtin_strcmp(), because of __HAVE_ARCH_STRCMP might turn off the compiler optimizations we rely on here to catch usage bugs. No change in functionality. Signed-off-by: Ingo Molnar <[email protected]> Stable-dep-of: b3d99f43c72b ("sched/fair: Fix zero_vruntime tracking") Signed-off-by: Sasha Levin <[email protected]>
Author: Peter Zijlstra <[email protected]> Date: Fri Jan 23 16:49:09 2026 +0100 sched/fair: Only set slice protection at pick time [ Upstream commit bcd74b2ffdd0a2233adbf26b65c62fc69a809c8e ] We should not (re)set slice protection in the sched_change pattern which calls put_prev_task() / set_next_task(). Fixes: 63304558ba5d ("sched/eevdf: Curb wakeup-preemption") Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Vincent Guittot <[email protected]> Tested-by: K Prateek Nayak <[email protected]> Tested-by: Shubhang Kaushik <[email protected]> Link: https://patch.msgid.link/20260219080624.561421378%40infradead.org Signed-off-by: Sasha Levin <[email protected]>
Author: Ingo Molnar <[email protected]> Date: Wed Nov 26 12:09:16 2025 +0100 sched/fair: Rename cfs_rq::avg_load to cfs_rq::sum_weight [ Upstream commit 4ff674fa986c27ec8a0542479258c92d361a2566 ] The ::avg_load field is a long-standing misnomer: it says it's an 'average load', but in reality it's the momentary sum of the load of all currently runnable tasks. We'd have to also perform a division by nr_running (or use time-decay) to arrive at any sort of average value. This is clear from comments about the math of fair scheduling: * \Sum w_i := cfs_rq->avg_load The sum of all weights is ... the sum of all weights, not the average of all weights. To make it doubly confusing, there's also an ::avg_load in the load-balancing struct sg_lb_stats, which *is* a true average. The second part of the field's name is a minor misnomer as well: it says 'load', and it is indeed a load_weight structure as it shares code with the load-balancer - but it's only in an SMP load-balancing context where load = weight, in the fair scheduling context the primary purpose is the weighting of different nice levels. So rename the field to ::sum_weight instead, which makes the terminology of the EEVDF math match up with our implementation of it: * \Sum w_i := cfs_rq->sum_weight Signed-off-by: Ingo Molnar <[email protected]> Link: https://patch.msgid.link/[email protected] Stable-dep-of: b3d99f43c72b ("sched/fair: Fix zero_vruntime tracking") Signed-off-by: Sasha Levin <[email protected]>
Author: Ingo Molnar <[email protected]> Date: Tue Dec 2 16:09:23 2025 +0100 sched/fair: Rename cfs_rq::avg_vruntime to ::sum_w_vruntime, and helper functions [ Upstream commit dcbc9d3f0e594223275a18f7016001889ad35eff ] The ::avg_vruntime field is a misnomer: it says it's an 'average vruntime', but in reality it's the momentary sum of the weighted vruntimes of all queued tasks, which is at least a division away from being an average. This is clear from comments about the math of fair scheduling: * \Sum (v_i - v0) * w_i := cfs_rq->avg_vruntime This confusion is increased by the cfs_avg_vruntime() function, which does perform the division and returns a true average. The sum of all weighted vruntimes should be named thusly, so rename the field to ::sum_w_vruntime. (As arguably ::sum_weighted_vruntime would be a bit of a mouthful.) Understanding the scheduler is hard enough already, without extra layers of obfuscated naming. ;-) Also rename related helper functions: sum_vruntime_add() => sum_w_vruntime_add() sum_vruntime_sub() => sum_w_vruntime_sub() sum_vruntime_update() => sum_w_vruntime_update() With the notable exception of cfs_avg_vruntime(), which was named accurately. Signed-off-by: Ingo Molnar <[email protected]> Link: https://patch.msgid.link/[email protected] Stable-dep-of: b3d99f43c72b ("sched/fair: Fix zero_vruntime tracking") Signed-off-by: Sasha Levin <[email protected]>
Author: David Carlier <[email protected]> Date: Thu Feb 26 12:45:17 2026 +0000 sched_ext: Fix SCX_EFLAG_INITIALIZED being a no-op flag [ Upstream commit 749989b2d90ddc7dd253ad3b11a77cf882721acf ] SCX_EFLAG_INITIALIZED is the sole member of enum scx_exit_flags with no explicit value, so the compiler assigns it 0. This makes the bitwise OR in scx_ops_init() a no-op: sch->exit_info->flags |= SCX_EFLAG_INITIALIZED; /* |= 0 */ As a result, BPF schedulers cannot distinguish whether ops.init() completed successfully by inspecting exit_info->flags. Assign the value 1LLU << 0 so the flag is actually set. Fixes: f3aec2adce8d ("sched_ext: Add SCX_EFLAG_INITIALIZED to indicate successful ops.init()") Signed-off-by: David Carlier <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Junxiao Bi <[email protected]> Date: Mon Feb 23 15:27:28 2026 -0800 scsi: core: Fix refcount leak for tagset_refcnt commit 1ac22c8eae81366101597d48360718dff9b9d980 upstream. This leak will cause a hang when tearing down the SCSI host. For example, iscsid hangs with the following call trace: [130120.652718] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured PID: 2528 TASK: ffff9d0408974e00 CPU: 3 COMMAND: "iscsid" #0 [ffffb5b9c134b9e0] __schedule at ffffffff860657d4 #1 [ffffb5b9c134ba28] schedule at ffffffff86065c6f #2 [ffffb5b9c134ba40] schedule_timeout at ffffffff86069fb0 #3 [ffffb5b9c134bab0] __wait_for_common at ffffffff8606674f #4 [ffffb5b9c134bb10] scsi_remove_host at ffffffff85bfe84b #5 [ffffb5b9c134bb30] iscsi_sw_tcp_session_destroy at ffffffffc03031c4 [iscsi_tcp] #6 [ffffb5b9c134bb48] iscsi_if_recv_msg at ffffffffc0292692 [scsi_transport_iscsi] #7 [ffffb5b9c134bb98] iscsi_if_rx at ffffffffc02929c2 [scsi_transport_iscsi] #8 [ffffb5b9c134bbf0] netlink_unicast at ffffffff85e551d6 #9 [ffffb5b9c134bc38] netlink_sendmsg at ffffffff85e554ef Fixes: 8fe4ce5836e9 ("scsi: core: Fix a use-after-free") Cc: [email protected] Signed-off-by: Junxiao Bi <[email protected]> Reviewed-by: Mike Christie <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Mathias Krause <[email protected]> Date: Thu Feb 12 11:23:27 2026 -0800 scsi: lpfc: Properly set WC for DPP mapping [ Upstream commit bffda93a51b40afd67c11bf558dc5aae83ca0943 ] Using set_memory_wc() to enable write-combining for the DPP portion of the MMIO mapping is wrong as set_memory_*() is meant to operate on RAM only, not MMIO mappings. In fact, as used currently triggers a BUG_ON() with enabled CONFIG_DEBUG_VIRTUAL. Simply map the DPP region separately and in addition to the already existing mappings, avoiding any possible negative side effects for these. Fixes: 1351e69fc6db ("scsi: lpfc: Add push-to-adapter support to sli4") Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: Justin Tee <[email protected]> Reviewed-by: Mathias Krause <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Salomon Dushimirimana <[email protected]> Date: Fri Feb 13 19:28:06 2026 +0000 scsi: pm8001: Fix use-after-free in pm8001_queue_command() [ Upstream commit 38353c26db28efd984f51d426eac2396d299cca7 ] Commit e29c47fe8946 ("scsi: pm8001: Simplify pm8001_task_exec()") refactors pm8001_queue_command(), however it introduces a potential cause of a double free scenario when it changes the function to return -ENODEV in case of phy down/device gone state. In this path, pm8001_queue_command() updates task status and calls task_done to indicate to upper layer that the task has been handled. However, this also frees the underlying SAS task. A -ENODEV is then returned to the caller. When libsas sas_ata_qc_issue() receives this error value, it assumes the task wasn't handled/queued by LLDD and proceeds to clean up and free the task again, resulting in a double free. Since pm8001_queue_command() handles the SAS task in this case, it should return 0 to the caller indicating that the task has been handled. Fixes: e29c47fe8946 ("scsi: pm8001: Simplify pm8001_task_exec()") Signed-off-by: Salomon Dushimirimana <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Prithvi Tambewagh <[email protected]> Date: Mon Feb 16 11:50:02 2026 +0530 scsi: target: Fix recursive locking in __configfs_open_file() commit 14d4ac19d1895397532eec407433c5d74d9da53b upstream. In flush_write_buffer, &p->frag_sem is acquired and then the loaded store function is called, which, here, is target_core_item_dbroot_store(). This function called filp_open(), following which these functions were called (in reverse order), according to the call trace: down_read __configfs_open_file do_dentry_open vfs_open do_open path_openat do_filp_open file_open_name filp_open target_core_item_dbroot_store flush_write_buffer configfs_write_iter target_core_item_dbroot_store() tries to validate the new file path by trying to open the file path provided to it; however, in this case, the bug report shows: db_root: not a directory: /sys/kernel/config/target/dbroot indicating that the same configfs file was tried to be opened, on which it is currently working on. Thus, it is trying to acquire frag_sem semaphore of the same file of which it already holds the semaphore obtained in flush_write_buffer(), leading to acquiring the semaphore in a nested manner and a possibility of recursive locking. Fix this by modifying target_core_item_dbroot_store() to use kern_path() instead of filp_open() to avoid opening the file using filesystem-specific function __configfs_open_file(), and further modifying it to make this fix compatible. Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=f6e8174215573a84b797 Tested-by: [email protected] Cc: [email protected] Signed-off-by: Prithvi Tambewagh <[email protected]> Reviewed-by: Dmitry Bogdanov <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Peter Wang <[email protected]> Date: Mon Feb 23 18:37:57 2026 +0800 scsi: ufs: core: Move link recovery for hibern8 exit failure to wl_resume [ Upstream commit 62c015373e1cdb1cdca824bd2dbce2dac0819467 ] Move the link recovery trigger from ufshcd_uic_pwr_ctrl() to __ufshcd_wl_resume(). Ensure link recovery is only attempted when hibern8 exit fails during resume, not during hibern8 enter in suspend. Improve error handling and prevent unnecessary link recovery attempts. Fixes: 35dabf4503b9 ("scsi: ufs: core: Use link recovery when h8 exit fails during runtime resume") Signed-off-by: Peter Wang <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Yifan Wu <[email protected]> Date: Thu Mar 5 09:36:37 2026 +0800 selftest/arm64: Fix sve2p1_sigill() to hwcap test [ Upstream commit d87c828daa7ead9763416f75cc416496969cf1dc ] The FEAT_SVE2p1 is indicated by ID_AA64ZFR0_EL1.SVEver. However, the BFADD requires the FEAT_SVE_B16B16, which is indicated by ID_AA64ZFR0_EL1.B16B16. This could cause the test to incorrectly fail on a CPU that supports FEAT_SVE2.1 but not FEAT_SVE_B16B16. LD1Q Gather load quadwords which is decoded from SVE encodings and implied by FEAT_SVE2p1. Fixes: c5195b027d29 ("kselftest/arm64: Add SVE 2.1 to hwcap test") Signed-off-by: Yifan Wu <[email protected]> Reviewed-by: Mark Brown <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Paul Chaignon <[email protected]> Date: Fri Feb 27 22:42:45 2026 +0100 selftests/bpf: Avoid simplification of crafted bounds test [ Upstream commit 024cea2d647ed8ab942f19544b892d324dba42b4 ] The reg_bounds_crafted tests validate the verifier's range analysis logic. They focus on the actual ranges and thus ignore the tnum. As a consequence, they carry the assumption that the tested cases can be reproduced in userspace without using the tnum information. Unfortunately, the previous change the refinement logic breaks that assumption for one test case: (u64)2147483648 (u32)<op> [4294967294; 0x100000000] The tested bytecode is shown below. Without our previous improvement, on the false branch of the condition, R7 is only known to have u64 range [0xfffffffe; 0x100000000]. With our improvement, and using the tnum information, we can deduce that R7 equals 0x100000000. 19: (bc) w0 = w6 ; R6=0x80000000 20: (bc) w0 = w7 ; R7=scalar(smin=umin=0xfffffffe,smax=umax=0x100000000,smin32=-2,smax32=0,var_off=(0x0; 0x1ffffffff)) 21: (be) if w6 <= w7 goto pc+3 ; R6=0x80000000 R7=0x100000000 R7's tnum is (0; 0x1ffffffff). On the false branch, regs_refine_cond_op refines R7's u32 range to [0; 0x7fffffff]. Then, __reg32_deduce_bounds refines the s32 range to 0 using u32 and finally also sets u32=0. From this, __reg_bound_offset improves the tnum to (0; 0x100000000). Finally, our previous patch uses this new tnum to deduce that it only intersect with u64=[0xfffffffe; 0x100000000] in a single value: 0x100000000. Because the verifier uses the tnum to reach this constant value, the selftest is unable to reproduce it by only simulating ranges. The solution implemented in this patch is to change the test case such that there is more than one overlap value between u64 and the tnum. The max. u64 value is thus changed from 0x100000000 to 0x300000000. Acked-by: Eduard Zingerman <[email protected]> Signed-off-by: Paul Chaignon <[email protected]> Link: https://lore.kernel.org/r/50641c6a7ef39520595dcafa605692427c1006ec.1772225741.git.paul.chaignon@gmail.com Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: T.J. Mercier <[email protected]> Date: Tue Feb 24 16:33:48 2026 -0800 selftests/bpf: Fix OOB read in dmabuf_collector [ Upstream commit 6881af27f9ea0f5ca8f606f573ef5cc25ca31fe4 ] Dmabuf name allocations can be less than DMA_BUF_NAME_LEN characters, but bpf_probe_read_kernel always tries to read exactly that many bytes. If a name is less than DMA_BUF_NAME_LEN characters, bpf_probe_read_kernel will read past the end. bpf_probe_read_kernel_str stops at the first NUL terminator so use it instead, like iter_dmabuf_for_each already does. Fixes: ae5d2c59ecd7 ("selftests/bpf: Add test for dmabuf_iter") Reported-by: Jerome Lee <[email protected]> Signed-off-by: T.J. Mercier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Sun Jian <[email protected]> Date: Wed Feb 25 19:14:50 2026 +0800 selftests/harness: order TEST_F and XFAIL_ADD constructors [ Upstream commit 6be2681514261324c8ee8a1c6f76cefdf700220f ] TEST_F() allocates and registers its struct __test_metadata via mmap() inside its constructor, and only then assigns the _##fixture_##test##_object pointer. XFAIL_ADD() runs in a constructor too and reads _##fixture_##test##_object to initialize xfail->test. If XFAIL_ADD runs first, xfail->test can be NULL and the expected failure will be reported as FAIL. Use constructor priorities to ensure TEST_F registration runs before XFAIL_ADD, without adding extra state or runtime lookups. Fixes: 2709473c9386 ("selftests: kselftest_harness: support using xfail") Signed-off-by: Sun Jian <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Matthieu Baerts (NGI0) <[email protected]> Date: Tue Mar 3 11:56:06 2026 +0100 selftests: mptcp: join: check removing signal+subflow endp commit 1777f349ff41b62dfe27454b69c27b0bc99ffca5 upstream. This validates the previous commit: endpoints with both the signal and subflow flags should always be marked as used even if it was not possible to create new subflows due to the MPTCP PM limits. For this test, an extra endpoint is created with both the signal and the subflow flags, and limits are set not to create extra subflows. In this case, an ADD_ADDR is sent, but no subflows are created. Still, the local endpoint is marked as used, and no warning is fired when removing the endpoint, after having sent a RM_ADDR. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: 85df533a787b ("mptcp: pm: do not ignore 'subflow' if 'signal' flag is also set") Cc: [email protected] Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Link: https://patch.msgid.link/20260303-net-mptcp-misc-fixes-7-0-rc2-v1-5-4b5462b6f016@kernel.org Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Matthieu Baerts (NGI0) <[email protected]> Date: Tue Mar 3 11:56:04 2026 +0100 selftests: mptcp: join: check RM_ADDR not sent over same subflow commit 560edd99b5f58b2d4bbe3c8e51e1eed68d887b0e upstream. This validates the previous commit: RM_ADDR were sent over the first found active subflow which could be the same as the one being removed. It is more likely to loose this notification. For this check, RM_ADDR are explicitly dropped when trying to send them over the initial subflow, when removing the endpoint attached to it. If it is dropped, the test will complain because some RM_ADDR have not been received. Note that only the RM_ADDR are dropped, to allow the linked subflow to be quickly and cleanly closed. To only drop those RM_ADDR, a cBPF byte code is used. If the IPTables commands fail, that's OK, the tests will continue to pass, but not validate this part. This can be ignored: another subtest fully depends on such command, and will be marked as skipped. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: 8dd5efb1f91b ("mptcp: send ack for rm_addr") Cc: [email protected] Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Link: https://patch.msgid.link/20260303-net-mptcp-misc-fixes-7-0-rc2-v1-3-4b5462b6f016@kernel.org Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Paolo Abeni <[email protected]> Date: Tue Mar 3 11:56:02 2026 +0100 selftests: mptcp: more stable simult_flows tests commit 8c09412e584d9bcc0e71d758ec1008d1c8d1a326 upstream. By default, the netem qdisc can keep up to 1000 packets under its belly to deal with the configured rate and delay. The simult flows test-case simulates very low speed links, to avoid problems due to slow CPUs and the TCP stack tend to transmit at a slightly higher rate than the (virtual) link constraints. All the above causes a relatively large amount of packets being enqueued in the netem qdiscs - the longer the transfer, the longer the queue - producing increasingly high TCP RTT samples and consequently increasingly larger receive buffer size due to DRS. When the receive buffer size becomes considerably larger than the needed size, the tests results can flake, i.e. because minimal inaccuracy in the pacing rate can lead to a single subflow usage towards the end of the connection for a considerable amount of data. Address the issue explicitly setting netem limits suitable for the configured link speeds and unflake all the affected tests. Fixes: 1a418cb8e888 ("mptcp: simult flow self-tests") Cc: [email protected] Signed-off-by: Paolo Abeni <[email protected]> Reviewed-by: Matthieu Baerts (NGI0) <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Link: https://patch.msgid.link/20260303-net-mptcp-misc-fixes-7-0-rc2-v1-1-4b5462b6f016@kernel.org Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Vlastimil Babka <[email protected]> Date: Wed Nov 5 10:05:32 2025 +0100 slub: remove CONFIG_SLUB_TINY specific code paths [ Upstream commit 31e0886fd57d426d18a239dd55e176032c9c1cb0 ] CONFIG_SLUB_TINY minimizes the SLUB's memory overhead in multiple ways, mainly by avoiding percpu caching of slabs and objects. It also reduces code size by replacing some code paths with simplified ones through ifdefs, but the benefits of that are smaller and would complicate the upcoming changes. Thus remove these code paths and associated ifdefs and simplify the code base. Link: https://patch.msgid.link/[email protected] Reviewed-by: Harry Yoo <[email protected]> Signed-off-by: Vlastimil Babka <[email protected]> Stable-dep-of: a1e244a9f177 ("mm/slab: use prandom if !allow_spin") Signed-off-by: Sasha Levin <[email protected]>
Author: ZhangGuoDong <[email protected]> Date: Tue Mar 3 15:13:11 2026 +0000 smb/client: fix buffer size for smb311_posix_qinfo in smb2_compound_op() [ Upstream commit 12c43a062acb0ac137fc2a4a106d4d084b8c5416 ] Use `sizeof(struct smb311_posix_qinfo)` instead of sizeof its pointer, so the allocated buffer matches the actual struct size. Fixes: 6a5f6592a0b6 ("SMB311: Add support for query info using posix extensions (level 100)") Reported-by: ChenXiaoSong <[email protected]> Signed-off-by: ZhangGuoDong <[email protected]> Reviewed-by: ChenXiaoSong <[email protected]> Signed-off-by: Steve French <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: ZhangGuoDong <[email protected]> Date: Tue Mar 3 15:13:12 2026 +0000 smb/client: fix buffer size for smb311_posix_qinfo in SMB311_posix_query_info() [ Upstream commit 9621b996e4db1dbc2b3dc5d5910b7d6179397320 ] SMB311_posix_query_info() is currently unused, but it may still be used in some stable versions, so these changes are submitted as a separate patch. Use `sizeof(struct smb311_posix_qinfo)` instead of sizeof its pointer, so the allocated buffer matches the actual struct size. Fixes: b1bc1874b885 ("smb311: Add support for SMB311 query info (non-compounded)") Reported-by: ChenXiaoSong <[email protected]> Signed-off-by: ZhangGuoDong <[email protected]> Reviewed-by: ChenXiaoSong <[email protected]> Signed-off-by: Steve French <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Thorsten Blum <[email protected]> Date: Thu Feb 26 22:28:45 2026 +0100 smb: client: Don't log plaintext credentials in cifs_set_cifscreds commit 2f37dc436d4e61ff7ae0b0353cf91b8c10396e4d upstream. When debug logging is enabled, cifs_set_cifscreds() logs the key payload and exposes the plaintext username and password. Remove the debug log to avoid exposing credentials. Fixes: 8a8798a5ff90 ("cifs: fetch credentials out of keyring for non-krb5 auth multiuser mounts") Cc: [email protected] Acked-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Thorsten Blum <[email protected]> Signed-off-by: Steve French <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Paulo Alcantara <[email protected]> Date: Wed Feb 25 21:34:55 2026 -0300 smb: client: fix broken multichannel with krb5+signing commit d9d1e319b39ea685ede59319002d567c159d23c3 upstream. When mounting a share with 'multichannel,max_channels=n,sec=krb5i', the client was duplicating signing key for all secondary channels, thus making the server fail all commands sent from secondary channels due to bad signatures. Every channel has its own signing key, so when establishing a new channel with krb5 auth, make sure to use the new session key as the derived key to generate channel's signing key in SMB2_auth_kerberos(). Repro: $ mount.cifs //srv/share /mnt -o multichannel,max_channels=4,sec=krb5i $ sleep 5 $ umount /mnt $ dmesg ... CIFS: VFS: sign fail cmd 0x5 message id 0x2 CIFS: VFS: \\srv SMB signature verification returned error = -13 CIFS: VFS: sign fail cmd 0x5 message id 0x2 CIFS: VFS: \\srv SMB signature verification returned error = -13 CIFS: VFS: sign fail cmd 0x4 message id 0x2 CIFS: VFS: \\srv SMB signature verification returned error = -13 Reported-by: Xiaoli Feng <[email protected]> Reviewed-by: Enzo Matsumiya <[email protected]> Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Cc: David Howells <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Steve French <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Henrique Carvalho <[email protected]> Date: Sat Feb 21 01:59:44 2026 -0300 smb: client: fix cifs_pick_channel when channels are equally loaded commit 663c28469d3274d6456f206a6671c91493d85ff1 upstream. cifs_pick_channel uses (start % chan_count) when channels are equally loaded, but that can return a channel that failed the eligibility checks. Drop the fallback and return the scan-selected channel instead. If none is eligible, keep the existing behavior of using the primary channel. Signed-off-by: Henrique Carvalho <[email protected]> Acked-by: Paulo Alcantara (Red Hat) <[email protected]> Acked-by: Meetakshi Setiya <[email protected]> Reviewed-by: Shyam Prasad N <[email protected]> Cc: [email protected] Signed-off-by: Steve French <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Paulo Alcantara <[email protected]> Date: Thu Mar 5 21:57:06 2026 -0300 smb: client: fix oops due to uninitialised var in smb2_unlink() commit 048efe129a297256d3c2088cf8d79515ff5ec864 upstream. If SMB2_open_init() or SMB2_close_init() fails (e.g. reconnect), the iovs set @rqst will be left uninitialised, hence calling SMB2_open_free(), SMB2_close_free() or smb2_set_related() on them will oops. Fix this by initialising @close_iov and @open_iov before setting them in @rqst. Reported-by: Thiago Becker <[email protected]> Fixes: 1cf9f2a6a544 ("smb: client: handle unlink(2) of files open by different clients") Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Cc: David Howells <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Steve French <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Alain Volmat <[email protected]> Date: Tue Feb 24 16:09:22 2026 +0100 spi: stm32: fix missing pointer assignment in case of dma chaining [ Upstream commit e96493229a6399e902062213c6381162464cdd50 ] Commit c4f2c05ab029 ("spi: stm32: fix pointer-to-pointer variables usage") introduced a regression since dma descriptors generated as part of the stm32_spi_prepare_rx_dma_mdma_chaining function are not well propagated to the caller function, leading to mdma-dma chaining being no more functional. Fixes: c4f2c05ab029 ("spi: stm32: fix pointer-to-pointer variables usage") Signed-off-by: Alain Volmat <[email protected]> Acked-by: Antonio Quartulli <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Phillip Lougher <[email protected]> Date: Tue Feb 17 05:09:55 2026 +0000 Squashfs: check metadata block offset is within range commit fdb24a820a5832ec4532273282cbd4f22c291a0d upstream. Syzkaller reports a "general protection fault in squashfs_copy_data" This is ultimately caused by a corrupted index look-up table, which produces a negative metadata block offset. This is subsequently passed to squashfs_copy_data (via squashfs_read_metadata) where the negative offset causes an out of bounds access. The fix is to check that the offset is within range in squashfs_read_metadata. This will trap this and other cases. Link: https://lkml.kernel.org/r/[email protected] Fixes: f400e12656ab ("Squashfs: cache operations") Reported-by: [email protected] Closes: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Phillip Lougher <[email protected]> Cc: Christian Brauner <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Jakub Kicinski <[email protected]> Date: Thu Feb 26 16:33:59 2026 -0800 tcp: give up on stronger sk_rcvbuf checks (for now) [ Upstream commit 026dfef287c07f37d4d4eef7a0b5a4bfdb29b32d ] We hit another corner case which leads to TcpExtTCPRcvQDrop Connections which send RPCs in the 20-80kB range over loopback experience spurious drops. The exact conditions for most of the drops I investigated are that: - socket exchanged >1MB of data so its not completely fresh - rcvbuf is around 128kB (default, hasn't grown) - there is ~60kB of data in rcvq - skb > 64kB arrives The sum of skb->len (!) of both of the skbs (the one already in rcvq and the arriving one) is larger than rwnd. My suspicion is that this happens because __tcp_select_window() rounds the rwnd up to (1 << wscale) if less than half of the rwnd has been consumed. Eric suggests that given the number of Fixes we already have pointing to 1d2fbaad7cd8 it's probably time to give up on it, until a bigger revamp of rmem management. Also while we could risk tweaking the rwnd math, there are other drops on workloads I investigated, after the commit in question, not explained by this phenomenon. Suggested-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 1d2fbaad7cd8 ("tcp: stronger sk_rcvbuf checks") Reviewed-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Eric Dumazet <[email protected]> Date: Mon Mar 2 20:55:27 2026 +0000 tcp: secure_seq: add back ports to TS offset [ Upstream commit 165573e41f2f66ef98940cf65f838b2cb575d9d1 ] This reverts 28ee1b746f49 ("secure_seq: downgrade to per-host timestamp offsets") tcp_tw_recycle went away in 2017. Zhouyan Deng reported off-path TCP source port leakage via SYN cookie side-channel that can be fixed in multiple ways. One of them is to bring back TCP ports in TS offset randomization. As a bonus, we perform a single siphash() computation to provide both an ISN and a TS offset. Fixes: 28ee1b746f49 ("secure_seq: downgrade to per-host timestamp offsets") Reported-by: Zhouyan Deng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Acked-by: Florian Westphal <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Miroslav Lichvar <[email protected]> Date: Wed Feb 25 09:51:35 2026 +0100 timekeeping: Fix timex status validation for auxiliary clocks [ Upstream commit e48a869957a70cc39b4090cd27c36a86f8db9b92 ] The timekeeping_validate_timex() function validates the timex status of an auxiliary system clock even when the status is not to be changed, which causes unexpected errors for applications that make read-only clock_adjtime() calls, or set some other timex fields, but without clearing the status field. Do the AUX-specific status validation only when the modes field contains ADJ_STATUS, i.e. the application is actually trying to change the status. This makes the AUX-specific clock_adjtime() behavior consistent with CLOCK_REALTIME. Fixes: 4eca49d0b621 ("timekeeping: Prepare do_adtimex() for auxiliary clocks") Signed-off-by: Miroslav Lichvar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Guenter Roeck <[email protected]> Date: Thu Mar 5 11:33:39 2026 -0800 tracing: Add NULL pointer check to trigger_data_free() [ Upstream commit 457965c13f0837a289c9164b842d0860133f6274 ] If trigger_data_alloc() fails and returns NULL, event_hist_trigger_parse() jumps to the out_free error path. While kfree() safely handles a NULL pointer, trigger_data_free() does not. This causes a NULL pointer dereference in trigger_data_free() when evaluating data->cmd_ops->set_filter. Fix the problem by adding a NULL pointer check to trigger_data_free(). The problem was found by an experimental code review agent based on gemini-3.1-pro while reviewing backports into v6.18.y. Cc: Miaoqian Lin <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Steven Rostedt (Google) <[email protected]> Link: https://patch.msgid.link/[email protected] Fixes: 0550069cc25f ("tracing: Properly process error handling in event_hist_trigger_parse()") Assisted-by: Gemini:gemini-3.1-pro Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Qing Wang <[email protected]> Date: Fri Feb 27 10:58:42 2026 +0800 tracing: Fix WARN_ON in tracing_buffers_mmap_close commit e39bb9e02b68942f8e9359d2a3efe7d37ae6be0e upstream. When a process forks, the child process copies the parent's VMAs but the user_mapped reference count is not incremented. As a result, when both the parent and child processes exit, tracing_buffers_mmap_close() is called twice. On the second call, user_mapped is already 0, causing the function to return -ENODEV and triggering a WARN_ON. Normally, this isn't an issue as the memory is mapped with VM_DONTCOPY set. But this is only a hint, and the application can call madvise(MADVISE_DOFORK) which resets the VM_DONTCOPY flag. When the application does that, it can trigger this issue on fork. Fix it by incrementing the user_mapped reference count without re-mapping the pages in the VMA's open callback. Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Vincent Donnefort <[email protected]> Cc: Lorenzo Stoakes <[email protected]> Link: https://patch.msgid.link/[email protected] Fixes: cf9f0f7c4c5bb ("tracing: Allow user-space mapping of the ring-buffer") Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=3b5dd2030fe08afdf65d Tested-by: [email protected] Signed-off-by: Qing Wang <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Kuniyuki Iwashima <[email protected]> Date: Fri Feb 27 03:55:35 2026 +0000 udp: Unhash auto-bound connected sk from 4-tuple hash table when disconnected. [ Upstream commit 6996a2d2d0a64808c19c98002aeb5d9d1b2df6a4 ] Let's say we bind() an UDP socket to the wildcard address with a non-zero port, connect() it to an address, and disconnect it from the address. bind() sets SOCK_BINDPORT_LOCK on sk->sk_userlocks (but not SOCK_BINDADDR_LOCK), and connect() calls udp_lib_hash4() to put the socket into the 4-tuple hash table. Then, __udp_disconnect() calls sk->sk_prot->rehash(sk). It computes a new hash based on the wildcard address and moves the socket to a new slot in the 4-tuple hash table, leaving a garbage in the chain that no packet hits. Let's remove such a socket from 4-tuple hash table when disconnected. Note that udp_sk(sk)->udp_portaddr_hash needs to be udpated after udp_hash4_dec(hslot2) in udp_unhash4(). Fixes: 78c91ae2c6de ("ipv4/udp: Add 4-tuple hash for connected socket") Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Peter Zijlstra <[email protected]> Date: Tue Sep 23 13:27:34 2025 +0200 unwind: Implement compat fp unwind [ Upstream commit c79dd946e370af3537edb854f210cba3a94b4516 ] It is important to be able to unwind compat tasks too. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://patch.msgid.link/[email protected] Stable-dep-of: d55c571e4333 ("x86/uprobes: Fix XOL allocation failure for 32-bit tasks") Signed-off-by: Sasha Levin <[email protected]>
Author: Peter Zijlstra <[email protected]> Date: Tue Sep 23 13:04:09 2025 +0200 unwind: Simplify unwind_user_next_fp() alignment check [ Upstream commit 5578534e4b92350995a20068f2e6ea3186c62d7f ] 2^log_2(n) == n Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Steven Rostedt (Google) <[email protected]> Link: https://patch.msgid.link/[email protected] Stable-dep-of: d55c571e4333 ("x86/uprobes: Fix XOL allocation failure for 32-bit tasks") Signed-off-by: Sasha Levin <[email protected]>
Author: Josh Poimboeuf <[email protected]> Date: Wed Aug 27 15:36:45 2025 -0400 unwind_user/x86: Enable frame pointer unwinding on x86 [ Upstream commit 49cf34c0815f93fb2ea3ab5cfbac1124bd9b45d0 ] Use ARCH_INIT_USER_FP_FRAME to describe how frame pointers are unwound on x86, and enable CONFIG_HAVE_UNWIND_USER_FP accordingly so the unwind_user interfaces can be used. Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://patch.msgid.link/[email protected] Stable-dep-of: d55c571e4333 ("x86/uprobes: Fix XOL allocation failure for 32-bit tasks") Signed-off-by: Sasha Levin <[email protected]>
Author: Peter Zijlstra <[email protected]> Date: Fri Oct 24 12:31:10 2025 +0200 unwind_user/x86: Teach FP unwind about start of function [ Upstream commit ae25884ad749e7f6e0c3565513bdc8aa2554a425 ] When userspace is interrupted at the start of a function, before we get a chance to complete the frame, unwind will miss one caller. X86 has a uprobe specific fixup for this, add bits to the generic unwinder to support this. Suggested-by: Jens Remus <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://patch.msgid.link/[email protected] Stable-dep-of: d55c571e4333 ("x86/uprobes: Fix XOL allocation failure for 32-bit tasks") Signed-off-by: Sasha Levin <[email protected]>
Author: Kuen-Han Tsai <[email protected]> Date: Tue Dec 30 18:13:16 2025 +0800 usb: gadget: f_ncm: align net_device lifecycle with bind/unbind [ Upstream commit 56a512a9b4107079f68701e7d55da8507eb963d9 ] Currently, the net_device is allocated in ncm_alloc_inst() and freed in ncm_free_inst(). This ties the network interface's lifetime to the configuration instance rather than the USB connection (bind/unbind). This decoupling causes issues when the USB gadget is disconnected where the underlying gadget device is removed. The net_device can outlive its parent, leading to dangling sysfs links and NULL pointer dereferences when accessing the freed gadget device. Problem 1: NULL pointer dereference on disconnect Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Call trace: __pi_strlen+0x14/0x150 rtnl_fill_ifinfo+0x6b4/0x708 rtmsg_ifinfo_build_skb+0xd8/0x13c rtmsg_ifinfo+0x50/0xa0 __dev_notify_flags+0x4c/0x1f0 dev_change_flags+0x54/0x70 do_setlink+0x390/0xebc rtnl_newlink+0x7d0/0xac8 rtnetlink_rcv_msg+0x27c/0x410 netlink_rcv_skb+0x134/0x150 rtnetlink_rcv+0x18/0x28 netlink_unicast+0x254/0x3f0 netlink_sendmsg+0x2e0/0x3d4 Problem 2: Dangling sysfs symlinks console:/ # ls -l /sys/class/net/ncm0 lrwxrwxrwx ... /sys/class/net/ncm0 -> /sys/devices/platform/.../gadget.0/net/ncm0 console:/ # ls -l /sys/devices/platform/.../gadget.0/net/ncm0 ls: .../gadget.0/net/ncm0: No such file or directory Move the net_device allocation to ncm_bind() and deallocation to ncm_unbind(). This ensures the network interface exists only when the gadget function is actually bound to a configuration. To support pre-bind configuration (e.g., setting interface name or MAC address via configfs), cache user-provided options in f_ncm_opts using the gether_opts structure. Apply these cached settings to the net_device upon creation in ncm_bind(). Preserve the use-after-free fix from commit 6334b8e4553c ("usb: gadget: f_ncm: Fix UAF ncm object at re-bind after usb ep transport error"). Check opts->net in ncm_set_alt() and ncm_disable() to ensure gether_disconnect() runs only if a connection was established. Fixes: 40d133d7f542 ("usb: gadget: f_ncm: convert to new function interface with backward compatibility") Cc: [email protected] Signed-off-by: Kuen-Han Tsai <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Kuen-Han Tsai <[email protected]> Date: Tue Dec 30 18:13:15 2025 +0800 usb: gadget: u_ether: Add auto-cleanup helper for freeing net_device [ Upstream commit 0c0981126b99288ed354d3d414c8a5fd42ac9e25 ] The net_device in the u_ether framework currently requires explicit calls to unregister and free the device. Introduce gether_unregister_free_netdev() and the corresponding auto-cleanup macro. This ensures that if a net_device is registered, it is properly unregistered and the associated work queue is flushed before the memory is freed. This is a preparatory patch to simplify error handling paths in gadget drivers by removing the need for explicit goto labels for net_device cleanup. Signed-off-by: Kuen-Han Tsai <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Stable-dep-of: 56a512a9b410 ("usb: gadget: f_ncm: align net_device lifecycle with bind/unbind") Signed-off-by: Sasha Levin <[email protected]>
Author: Kuen-Han Tsai <[email protected]> Date: Tue Dec 30 18:13:14 2025 +0800 usb: gadget: u_ether: add gether_opts for config caching [ Upstream commit e065c6a7e46c2ee9c677fdbf50035323d2de1215 ] Currently, the net_device is allocated when the function instance is created (e.g., in ncm_alloc_inst()). While this allows userspace to configure the device early, it decouples the net_device lifecycle from the actual USB connection state (bind/unbind). The goal is to defer net_device creation to the bind callback to properly align the lifecycle with its parent gadget device. However, deferring net_device allocation would prevent userspace from configuring parameters (like interface name or MAC address) before the net_device exists. Introduce a new structure, struct gether_opts, associated with the usb_function_instance, to cache settings independently of the net_device. These settings include the interface name pattern, MAC addresses (device and host), queue multiplier, and address assignment type. New helper functions are added: - gether_setup_opts_default(): Initializes struct gether_opts with defaults, including random MAC addresses. - gether_apply_opts(): Applies the cached options from a struct gether_opts to a valid net_device. To expose these options to userspace, new configfs macros (USB_ETHER_OPTS_ITEM and USB_ETHER_OPTS_ATTR_*) are defined in u_ether_configfs.h. These attributes are part of the function instance's configfs group. This refactoring is a preparatory step. It allows the subsequent patch to safely move the net_device allocation from the instance creation phase to the bind phase without losing the ability to pre-configure the interface via configfs. Signed-off-by: Kuen-Han Tsai <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Stable-dep-of: 56a512a9b410 ("usb: gadget: f_ncm: align net_device lifecycle with bind/unbind") Signed-off-by: Sasha Levin <[email protected]>
Author: Daniil Dulov <[email protected]> Date: Wed Feb 11 11:20:24 2026 +0300 wifi: cfg80211: cancel rfkill_block work in wiphy_unregister() commit 767d23ade706d5fa51c36168e92a9c5533c351a1 upstream. There is a use-after-free error in cfg80211_shutdown_all_interfaces found by syzkaller: BUG: KASAN: use-after-free in cfg80211_shutdown_all_interfaces+0x213/0x220 Read of size 8 at addr ffff888112a78d98 by task kworker/0:5/5326 CPU: 0 UID: 0 PID: 5326 Comm: kworker/0:5 Not tainted 6.19.0-rc2 #2 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: events cfg80211_rfkill_block_work Call Trace: <TASK> dump_stack_lvl+0x116/0x1f0 print_report+0xcd/0x630 kasan_report+0xe0/0x110 cfg80211_shutdown_all_interfaces+0x213/0x220 cfg80211_rfkill_block_work+0x1e/0x30 process_one_work+0x9cf/0x1b70 worker_thread+0x6c8/0xf10 kthread+0x3c5/0x780 ret_from_fork+0x56d/0x700 ret_from_fork_asm+0x1a/0x30 </TASK> The problem arises due to the rfkill_block work is not cancelled when wiphy is being unregistered. In order to fix the issue cancel the corresponding work in wiphy_unregister(). Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: 1f87f7d3a3b4 ("cfg80211: add rfkill support") Cc: [email protected] Signed-off-by: Daniil Dulov <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Bart Van Assche <[email protected]> Date: Mon Feb 23 14:00:24 2026 -0800 wifi: cw1200: Fix locking in error paths [ Upstream commit d98c24617a831e92e7224a07dcaed2dd0b02af96 ] cw1200_wow_suspend() must only return with priv->conf_mutex locked if it returns zero. This mutex must be unlocked if an error is returned. Add mutex_unlock() calls to the error paths from which that call is missing. This has been detected by the Clang thread-safety analyzer. Fixes: a910e4a94f69 ("cw1200: add driver for the ST-E CW1100 & CW1200 WLAN chipsets") Signed-off-by: Bart Van Assche <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Daniel Hodges <[email protected]> Date: Fri Feb 6 14:53:56 2026 -0500 wifi: libertas: fix use-after-free in lbs_free_adapter() commit 03cc8f90d0537fcd4985c3319b4fafbf2e3fb1f0 upstream. The lbs_free_adapter() function uses timer_delete() (non-synchronous) for both command_timer and tx_lockup_timer before the structure is freed. This is incorrect because timer_delete() does not wait for any running timer callback to complete. If a timer callback is executing when lbs_free_adapter() is called, the callback will access freed memory since lbs_cfg_free() frees the containing structure immediately after lbs_free_adapter() returns. Both timer callbacks (lbs_cmd_timeout_handler and lbs_tx_lockup_handler) access priv->driver_lock, priv->cur_cmd, priv->dev, and other fields, which would all be use-after-free violations. Use timer_delete_sync() instead to ensure any running timer callback has completed before returning. This bug was introduced in commit 8f641d93c38a ("libertas: detect TX lockups and reset hardware") where del_timer() was used instead of del_timer_sync() in the cleanup path. The command_timer has had the same issue since the driver was first written. Fixes: 8f641d93c38a ("libertas: detect TX lockups and reset hardware") Fixes: 954ee164f4f4 ("[PATCH] libertas: reorganize and simplify init sequence") Cc: [email protected] Signed-off-by: Daniel Hodges <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Ariel Silver <[email protected]> Date: Fri Feb 20 10:11:29 2026 +0000 wifi: mac80211: bounds-check link_id in ieee80211_ml_reconfiguration commit 162d331d833dc73a3e905a24c44dd33732af1fc5 upstream. link_id is taken from the ML Reconfiguration element (control & 0x000f), so it can be 0..15. link_removal_timeout[] has IEEE80211_MLD_MAX_NUM_LINKS (15) elements, so index 15 is out-of-bounds. Skip subelements with link_id >= IEEE80211_MLD_MAX_NUM_LINKS to avoid a stack out-of-bounds write. Fixes: 8eb8dd2ffbbb ("wifi: mac80211: Support link removal using Reconfiguration ML element") Reported-by: Ariel Silver <[email protected]> Signed-off-by: Ariel Silver <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Vahagn Vardanian <[email protected]> Date: Mon Feb 23 00:00:00 2026 +0000 wifi: mac80211: fix NULL pointer dereference in mesh_rx_csa_frame() commit 017c1792525064a723971f0216e6ef86a8c7af11 upstream. In mesh_rx_csa_frame(), elems->mesh_chansw_params_ie is dereferenced at lines 1638 and 1642 without a prior NULL check: ifmsh->chsw_ttl = elems->mesh_chansw_params_ie->mesh_ttl; ... pre_value = le16_to_cpu(elems->mesh_chansw_params_ie->mesh_pre_value); The mesh_matches_local() check above only validates the Mesh ID, Mesh Configuration, and Supported Rates IEs. It does not verify the presence of the Mesh Channel Switch Parameters IE (element ID 118). When a received CSA action frame omits that IE, ieee802_11_parse_elems() leaves elems->mesh_chansw_params_ie as NULL, and the unconditional dereference causes a kernel NULL pointer dereference. A remote mesh peer with an established peer link (PLINK_ESTAB) can trigger this by sending a crafted SPECTRUM_MGMT/CHL_SWITCH action frame that includes a matching Mesh ID and Mesh Configuration IE but omits the Mesh Channel Switch Parameters IE. No authentication beyond the default open mesh peering is required. Crash confirmed on kernel 6.17.0-5-generic via mac80211_hwsim: BUG: kernel NULL pointer dereference, address: 0000000000000000 Oops: Oops: 0000 [#1] SMP NOPTI RIP: 0010:ieee80211_mesh_rx_queued_mgmt+0x143/0x2a0 [mac80211] CR2: 0000000000000000 Fix by adding a NULL check for mesh_chansw_params_ie after mesh_matches_local() returns, consistent with how other optional IEs are guarded throughout the mesh code. The bug has been present since v3.13 (released 2014-01-19). Fixes: 8f2535b92d68 ("mac80211: process the CSA frame for mesh accordingly") Cc: [email protected] Signed-off-by: Vahagn Vardanian <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Lorenzo Bianconi <[email protected]> Date: Thu Feb 26 20:11:16 2026 +0100 wifi: mt76: Fix possible oob access in mt76_connac2_mac_write_txwi_80211() [ Upstream commit 4e10a730d1b511ff49723371ed6d694dd1b2c785 ] Check frame length before accessing the mgmt fields in mt76_connac2_mac_write_txwi_80211 in order to avoid a possible oob access. Fixes: 577dbc6c656d ("mt76: mt7915: enable offloading of sequence number assignment") Signed-off-by: Lorenzo Bianconi <[email protected]> Link: https://patch.msgid.link/[email protected] [fix check to also cover mgmt->u.action.u.addba_req.capab, correct Fixes tag] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Lorenzo Bianconi <[email protected]> Date: Thu Feb 26 20:11:15 2026 +0100 wifi: mt76: mt7925: Fix possible oob access in mt7925_mac_write_txwi_80211() [ Upstream commit c41a9abd6ae31d130e8f332e7c8800c4c866234b ] Check frame length before accessing the mgmt fields in mt7925_mac_write_txwi_80211 in order to avoid a possible oob access. Fixes: c948b5da6bbec ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Lorenzo Bianconi <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Lorenzo Bianconi <[email protected]> Date: Thu Feb 26 20:11:14 2026 +0100 wifi: mt76: mt7996: Fix possible oob access in mt7996_mac_write_txwi_80211() [ Upstream commit 60862846308627e9e15546bb647a00de44deb27b ] Check frame length before accessing the mgmt fields in mt7996_mac_write_txwi_80211 in order to avoid a possible oob access. Fixes: 98686cd21624c ("wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices") Signed-off-by: Lorenzo Bianconi <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Johannes Berg <[email protected]> Date: Tue Feb 17 13:05:26 2026 +0100 wifi: radiotap: reject radiotap with unknown bits commit c854758abe0b8d86f9c43dc060ff56a0ee5b31e0 upstream. The radiotap parser is currently only used with the radiotap namespace (not with vendor namespaces), but if the undefined field 18 is used, the alignment/size is unknown as well. In this case, iterator->_next_ns_data isn't initialized (it's only set for skipping vendor namespaces), and syzbot points out that we later compare against this uninitialized value. Fix this by moving the rejection of unknown radiotap fields down to after the in-namespace lookup, so it will really use iterator->_next_ns_data only for vendor namespaces, even in case undefined fields are present. Cc: [email protected] Fixes: 33e5a2f776e3 ("wireless: update radiotap parser") Reported-by: [email protected] Closes: https://lore.kernel.org/r/[email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Sebastian Krzyszkowiak <[email protected]> Date: Sat Feb 21 17:28:04 2026 +0100 wifi: rsi: Don't default to -EOPNOTSUPP in rsi_mac80211_config [ Upstream commit d973b1039ccde6b241b438d53297edce4de45b5c ] This triggers a WARN_ON in ieee80211_hw_conf_init and isn't the expected behavior from the driver - other drivers default to 0 too. Fixes: 0a44dfc07074 ("wifi: mac80211: simplify non-chanctx drivers") Signed-off-by: Sebastian Krzyszkowiak <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Bart Van Assche <[email protected]> Date: Mon Feb 23 14:00:25 2026 -0800 wifi: wlcore: Fix a locking bug [ Upstream commit 72c6df8f284b3a49812ce2ac136727ace70acc7c ] Make sure that wl->mutex is locked before it is unlocked. This has been detected by the Clang thread-safety analyzer. Fixes: 45aa7f071b06 ("wlcore: Use generic runtime pm calls for wowlan elp configuration") Signed-off-by: Bart Van Assche <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Yazen Ghannam <[email protected]> Date: Tue Nov 11 14:53:57 2025 +0000 x86/acpi/boot: Correct acpi_is_processor_usable() check again [ Upstream commit adbf61cc47cb72b102682e690ad323e1eda652c2 ] ACPI v6.3 defined a new "Online Capable" MADT LAPIC flag. This bit is used in conjunction with the "Enabled" MADT LAPIC flag to determine if a CPU can be enabled/hotplugged by the OS after boot. Before the new bit was defined, the "Enabled" bit was explicitly described like this (ACPI v6.0 wording provided): "If zero, this processor is unusable, and the operating system support will not attempt to use it" This means that CPU hotplug (based on MADT) is not possible. Many BIOS implementations follow this guidance. They may include LAPIC entries in MADT for unavailable CPUs, but since these entries are marked with "Enabled=0" it is expected that the OS will completely ignore these entries. However, QEMU will do the same (include entries with "Enabled=0") for the purpose of allowing CPU hotplug within the guest. Comment from QEMU function pc_madt_cpu_entry(): /* ACPI spec says that LAPIC entry for non present * CPU may be omitted from MADT or it must be marked * as disabled. However omitting non present CPU from * MADT breaks hotplug on linux. So possible CPUs * should be put in MADT but kept disabled. */ Recent Linux topology changes broke the QEMU use case. A following fix for the QEMU use case broke bare metal topology enumeration. Rework the Linux MADT LAPIC flags check to allow the QEMU use case only for guests and to maintain the ACPI spec behavior for bare metal. Remove an unnecessary check added to fix a bare metal case introduced by the QEMU "fix". [ bp: Change logic as Michal suggested. ] [ mingo: Removed misapplied -stable tag. ] Fixes: fed8d8773b8e ("x86/acpi/boot: Correct acpi_is_processor_usable() check") Fixes: f0551af02130 ("x86/topology: Ignore non-present APIC IDs in a present package") Closes: https://lore.kernel.org/r/[email protected] Reported-by: Michal Pecio <[email protected]> Signed-off-by: Yazen Ghannam <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Tested-by: Michal Pecio <[email protected]> Tested-by: Ricardo Neri <[email protected]> Link: https://lore.kernel.org/[email protected] Cc: [email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Tom Lendacky <[email protected]> Date: Wed Feb 4 09:01:00 2026 -0600 x86/boot/sev: Move SEV decompressor variables into the .data section commit 4ca191cec17a997d0e3b2cd312f3a884288acc27 upstream. As part of the work to remove the dependency on calling into the decompressor code (startup_64()) for a UEFI boot, a call to rmpadjust() was removed from sev_enable() in favor of checking the value of the snp_vmpl variable. When booting through a non-UEFI path and calling startup_64(), the call to sev_enable() is performed before the BSS section is zeroed. With the removal of the rmpadjust() call and the corresponding check of the return code, the snp_vmpl variable is checked. Since the kernel is running at VMPL0, the snp_vmpl variable will not have been set and should be the default value of 0. However, since the call occurs before the BSS is zeroed, the snp_vmpl variable may not actually be zero, which will cause the guest boot to fail. Since the decompressor relocates itself, the BSS would need to be cleared both before and after the relocation, but this would, in effect, cause all of the changes to BSS variables before relocation to be lost after relocation. Instead, move the snp_vmpl variable into the .data section so that it is initialized and the value made safe during relocation. As a pre-caution against future changes, move other SEV-related decompressor variables into the .data section, too. Fixes: 68a501d7fd82 ("x86/boot: Drop redundant RMPADJUST in SEV SVSM presence check") Signed-off-by: Tom Lendacky <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Ard Biesheuvel <[email protected]> Reviewed-by: Changyuan Lyu <[email protected]> Tested-by: Kevin Hui <[email protected]> Tested-by: Changyuan Lyu <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/5648b7de5b0a5d0dfef3785f9582b718678c6448.1770217260.git.thomas.lendacky@amd.com Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Jan Stancek <[email protected]> Date: Wed Feb 25 20:30:23 2026 +0100 x86/boot: Handle relative CONFIG_EFI_SBAT_FILE file paths commit 3d1973a0c76a78a4728cff13648a188ed486cf44 upstream. CONFIG_EFI_SBAT_FILE can be a relative path. When compiling using a different output directory (O=) the build currently fails because it can't find the filename set in CONFIG_EFI_SBAT_FILE: arch/x86/boot/compressed/sbat.S: Assembler messages: arch/x86/boot/compressed/sbat.S:6: Error: file not found: kernel.sbat Add $(srctree) as include dir for sbat.o. [ bp: Massage commit message. ] Fixes: 61b57d35396a ("x86/efi: Implement support for embedding SBAT data for x86") Signed-off-by: Jan Stancek <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Vitaly Kuznetsov <[email protected]> Cc: <[email protected]> Link: https://patch.msgid.link/f4eda155b0cef91d4d316b4e92f5771cb0aa7187.1772047658.git.jstancek@redhat.com Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Peter Zijlstra <[email protected]> Date: Wed Feb 11 13:59:43 2026 +0100 x86/cfi: Fix CFI rewrite for odd alignments [ Upstream commit 24c8147abb39618d74fcc36e325765e8fe7bdd7a ] Rustam reported his clang builds did not boot properly; turns out his .config has: CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B=y set. Fix up the FineIBT code to deal with this unusual alignment. Fixes: 931ab63664f0 ("x86/ibt: Implement FineIBT") Reported-by: Rustam Kovhaev <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Rustam Kovhaev <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Mike Rapoport (Microsoft) <[email protected]> Date: Wed Feb 25 08:55:55 2026 +0200 x86/efi: defer freeing of boot services memory commit a4b0bf6a40f3c107c67a24fbc614510ef5719980 upstream. efi_free_boot_services() frees memory occupied by EFI_BOOT_SERVICES_CODE and EFI_BOOT_SERVICES_DATA using memblock_free_late(). There are two issue with that: memblock_free_late() should be used for memory allocated with memblock_alloc() while the memory reserved with memblock_reserve() should be freed with free_reserved_area(). More acutely, with CONFIG_DEFERRED_STRUCT_PAGE_INIT=y efi_free_boot_services() is called before deferred initialization of the memory map is complete. Benjamin Herrenschmidt reports that this causes a leak of ~140MB of RAM on EC2 t3a.nano instances which only have 512MB or RAM. If the freed memory resides in the areas that memory map for them is still uninitialized, they won't be actually freed because memblock_free_late() calls memblock_free_pages() and the latter skips uninitialized pages. Using free_reserved_area() at this point is also problematic because __free_page() accesses the buddy of the freed page and that again might end up in uninitialized part of the memory map. Delaying the entire efi_free_boot_services() could be problematic because in addition to freeing boot services memory it updates efi.memmap without any synchronization and that's undesirable late in boot when there is concurrency. More robust approach is to only defer freeing of the EFI boot services memory. Split efi_free_boot_services() in two. First efi_unmap_boot_services() collects ranges that should be freed into an array then efi_free_boot_services() later frees them after deferred init is complete. Link: https://lore.kernel.org/all/ec2aaef14783869b3be6e3c253b2dcbf67dbc12a.camel@kernel.crashing.org Fixes: 916f676f8dc0 ("x86, efi: Retain boot service code until after switching to virtual mode") Cc: <[email protected]> Signed-off-by: Mike Rapoport (Microsoft) <[email protected]> Reviewed-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Andrew Cooper <[email protected]> Date: Tue Jan 6 13:15:04 2026 +0000 x86/fred: Correct speculative safety in fred_extint() [ Upstream commit aa280a08e7d8fae58557acc345b36b3dc329d595 ] array_index_nospec() is no use if the result gets spilled to the stack, as it makes the believed safe-under-speculation value subject to memory predictions. For all practical purposes, this means array_index_nospec() must be used in the expression that accesses the array. As the code currently stands, it's the wrong side of irqentry_enter(), and 'index' is put into %ebp across the function call. Remove the index variable and reposition array_index_nospec(), so it's calculated immediately before the array access. Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code") Signed-off-by: Andrew Cooper <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Kim Phillips <[email protected]> Date: Tue Feb 3 16:24:03 2026 -0600 x86/sev: Allow IBPB-on-Entry feature for SNP guests commit 9073428bb204d921ae15326bb7d4558d9d269aab upstream. The SEV-SNP IBPB-on-Entry feature does not require a guest-side implementation. It was added in Zen5 h/w, after the first SNP Zen implementation, and thus was not accounted for when the initial set of SNP features were added to the kernel. In its abundant precaution, commit 8c29f0165405 ("x86/sev: Add SEV-SNP guest feature negotiation support") included SEV_STATUS' IBPB-on-Entry bit as a reserved bit, thereby masking guests from using the feature. Allow guests to make use of IBPB-on-Entry when supported by the hypervisor, as the bit is now architecturally defined and safe to expose. Fixes: 8c29f0165405 ("x86/sev: Add SEV-SNP guest feature negotiation support") Signed-off-by: Kim Phillips <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Nikunj A Dadhania <[email protected]> Reviewed-by: Tom Lendacky <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Oleg Nesterov <[email protected]> Date: Sun Jan 11 16:00:37 2026 +0100 x86/uprobes: Fix XOL allocation failure for 32-bit tasks [ Upstream commit d55c571e4333fac71826e8db3b9753fadfbead6a ] This script #!/usr/bin/bash echo 0 > /proc/sys/kernel/randomize_va_space echo 'void main(void) {}' > TEST.c # -fcf-protection to ensure that the 1st endbr32 insn can't be emulated gcc -m32 -fcf-protection=branch TEST.c -o test bpftrace -e 'uprobe:./test:main {}' -c ./test "hangs", the probed ./test task enters an endless loop. The problem is that with randomize_va_space == 0 get_unmapped_area(TASK_SIZE - PAGE_SIZE) called by xol_add_vma() can not just return the "addr == TASK_SIZE - PAGE_SIZE" hint, this addr is used by the stack vma. arch_get_unmapped_area_topdown() doesn't take TIF_ADDR32 into account and in_32bit_syscall() is false, this leads to info.high_limit > TASK_SIZE. vm_unmapped_area() happily returns the high address > TASK_SIZE and then get_unmapped_area() returns -ENOMEM after the "if (addr > TASK_SIZE - len)" check. handle_swbp() doesn't report this failure (probably it should) and silently restarts the probed insn. Endless loop. I think that the right fix should change the x86 get_unmapped_area() paths to rely on TIF_ADDR32 rather than in_32bit_syscall(). Note also that if CONFIG_X86_X32_ABI=y, in_x32_syscall() falsely returns true in this case because ->orig_ax = -1. But we need a simple fix for -stable, so this patch just sets TS_COMPAT if the probed task is 32-bit to make in_ia32_syscall() true. Fixes: 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()") Reported-by: Paulo Andrade <[email protected]> Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Cc: [email protected] Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Larysa Zaremba <[email protected]> Date: Thu Mar 5 12:12:50 2026 +0100 xdp: produce a warning when calculated tailroom is negative [ Upstream commit 8821e857759be9db3cde337ad328b71fe5c8a55f ] Many ethernet drivers report xdp Rx queue frag size as being the same as DMA write size. However, the only user of this field, namely bpf_xdp_frags_increase_tail(), clearly expects a truesize. Such difference leads to unspecific memory corruption issues under certain circumstances, e.g. in ixgbevf maximum DMA write size is 3 KB, so when running xskxceiver's XDP_ADJUST_TAIL_GROW_MULTI_BUFF, 6K packet fully uses all DMA-writable space in 2 buffers. This would be fine, if only rxq->frag_size was properly set to 4K, but value of 3K results in a negative tailroom, because there is a non-zero page offset. We are supposed to return -EINVAL and be done with it in such case, but due to tailroom being stored as an unsigned int, it is reported to be somewhere near UINT_MAX, resulting in a tail being grown, even if the requested offset is too much (it is around 2K in the abovementioned test). This later leads to all kinds of unspecific calltraces. [ 7340.337579] xskxceiver[1440]: segfault at 1da718 ip 00007f4161aeac9d sp 00007f41615a6a00 error 6 [ 7340.338040] xskxceiver[1441]: segfault at 7f410000000b ip 00000000004042b5 sp 00007f415bffecf0 error 4 [ 7340.338179] in libc.so.6[61c9d,7f4161aaf000+160000] [ 7340.339230] in xskxceiver[42b5,400000+69000] [ 7340.340300] likely on CPU 6 (core 0, socket 6) [ 7340.340302] Code: ff ff 01 e9 f4 fe ff ff 0f 1f 44 00 00 4c 39 f0 74 73 31 c0 ba 01 00 00 00 f0 0f b1 17 0f 85 ba 00 00 00 49 8b 87 88 00 00 00 <4c> 89 70 08 eb cc 0f 1f 44 00 00 48 8d bd f0 fe ff ff 89 85 ec fe [ 7340.340888] likely on CPU 3 (core 0, socket 3) [ 7340.345088] Code: 00 00 00 ba 00 00 00 00 be 00 00 00 00 89 c7 e8 31 ca ff ff 89 45 ec 8b 45 ec 85 c0 78 07 b8 00 00 00 00 eb 46 e8 0b c8 ff ff <8b> 00 83 f8 69 74 24 e8 ff c7 ff ff 8b 00 83 f8 0b 74 18 e8 f3 c7 [ 7340.404334] Oops: general protection fault, probably for non-canonical address 0x6d255010bdffc: 0000 [#1] SMP NOPTI [ 7340.405972] CPU: 7 UID: 0 PID: 1439 Comm: xskxceiver Not tainted 6.19.0-rc1+ #21 PREEMPT(lazy) [ 7340.408006] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-5.fc42 04/01/2014 [ 7340.409716] RIP: 0010:lookup_swap_cgroup_id+0x44/0x80 [ 7340.410455] Code: 83 f8 1c 73 39 48 ba ff ff ff ff ff ff ff 03 48 8b 04 c5 20 55 fa bd 48 21 d1 48 89 ca 83 e1 01 48 d1 ea c1 e1 04 48 8d 04 90 <8b> 00 48 83 c4 10 d3 e8 c3 cc cc cc cc 31 c0 e9 98 b7 dd 00 48 89 [ 7340.412787] RSP: 0018:ffffcc5c04f7f6d0 EFLAGS: 00010202 [ 7340.413494] RAX: 0006d255010bdffc RBX: ffff891f477895a8 RCX: 0000000000000010 [ 7340.414431] RDX: 0001c17e3fffffff RSI: 00fa070000000000 RDI: 000382fc7fffffff [ 7340.415354] RBP: 00fa070000000000 R08: ffffcc5c04f7f8f8 R09: ffffcc5c04f7f7d0 [ 7340.416283] R10: ffff891f4c1a7000 R11: ffffcc5c04f7f9c8 R12: ffffcc5c04f7f7d0 [ 7340.417218] R13: 03ffffffffffffff R14: 00fa06fffffffe00 R15: ffff891f47789500 [ 7340.418229] FS: 0000000000000000(0000) GS:ffff891ffdfaa000(0000) knlGS:0000000000000000 [ 7340.419489] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7340.420286] CR2: 00007f415bfffd58 CR3: 0000000103f03002 CR4: 0000000000772ef0 [ 7340.421237] PKRU: 55555554 [ 7340.421623] Call Trace: [ 7340.421987] <TASK> [ 7340.422309] ? softleaf_from_pte+0x77/0xa0 [ 7340.422855] swap_pte_batch+0xa7/0x290 [ 7340.423363] zap_nonpresent_ptes.constprop.0.isra.0+0xd1/0x270 [ 7340.424102] zap_pte_range+0x281/0x580 [ 7340.424607] zap_pmd_range.isra.0+0xc9/0x240 [ 7340.425177] unmap_page_range+0x24d/0x420 [ 7340.425714] unmap_vmas+0xa1/0x180 [ 7340.426185] exit_mmap+0xe1/0x3b0 [ 7340.426644] __mmput+0x41/0x150 [ 7340.427098] exit_mm+0xb1/0x110 [ 7340.427539] do_exit+0x1b2/0x460 [ 7340.427992] do_group_exit+0x2d/0xc0 [ 7340.428477] get_signal+0x79d/0x7e0 [ 7340.428957] arch_do_signal_or_restart+0x34/0x100 [ 7340.429571] exit_to_user_mode_loop+0x8e/0x4c0 [ 7340.430159] do_syscall_64+0x188/0x6b0 [ 7340.430672] ? __do_sys_clone3+0xd9/0x120 [ 7340.431212] ? switch_fpu_return+0x4e/0xd0 [ 7340.431761] ? arch_exit_to_user_mode_prepare.isra.0+0xa1/0xc0 [ 7340.432498] ? do_syscall_64+0xbb/0x6b0 [ 7340.433015] ? __handle_mm_fault+0x445/0x690 [ 7340.433582] ? count_memcg_events+0xd6/0x210 [ 7340.434151] ? handle_mm_fault+0x212/0x340 [ 7340.434697] ? do_user_addr_fault+0x2b4/0x7b0 [ 7340.435271] ? clear_bhb_loop+0x30/0x80 [ 7340.435788] ? clear_bhb_loop+0x30/0x80 [ 7340.436299] ? clear_bhb_loop+0x30/0x80 [ 7340.436812] ? clear_bhb_loop+0x30/0x80 [ 7340.437323] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 7340.437973] RIP: 0033:0x7f4161b14169 [ 7340.438468] Code: Unable to access opcode bytes at 0x7f4161b1413f. [ 7340.439242] RSP: 002b:00007ffc6ebfa770 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca [ 7340.440173] RAX: fffffffffffffe00 RBX: 00000000000005a1 RCX: 00007f4161b14169 [ 7340.441061] RDX: 00000000000005a1 RSI: 0000000000000109 RDI: 00007f415bfff990 [ 7340.441943] RBP: 00007ffc6ebfa7a0 R08: 0000000000000000 R09: 00000000ffffffff [ 7340.442824] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 7340.443707] R13: 0000000000000000 R14: 00007f415bfff990 R15: 00007f415bfff6c0 [ 7340.444586] </TASK> [ 7340.444922] Modules linked in: rfkill intel_rapl_msr intel_rapl_common intel_uncore_frequency_common skx_edac_common nfit libnvdimm kvm_intel vfat fat kvm snd_pcm irqbypass rapl iTCO_wdt snd_timer intel_pmc_bxt iTCO_vendor_support snd ixgbevf virtio_net soundcore i2c_i801 pcspkr libeth_xdp net_failover i2c_smbus lpc_ich failover libeth virtio_balloon joydev 9p fuse loop zram lz4hc_compress lz4_compress 9pnet_virtio 9pnet netfs ghash_clmulni_intel serio_raw qemu_fw_cfg [ 7340.449650] ---[ end trace 0000000000000000 ]--- The issue can be fixed in all in-tree drivers, but we cannot just trust OOT drivers to not do this. Therefore, make tailroom a signed int and produce a warning when it is negative to prevent such mistakes in the future. Fixes: bf25146a5595 ("bpf: add frags support to the bpf_xdp_adjust_tail() API") Reviewed-by: Aleksandr Loktionov <[email protected]> Reviewed-by: Toke Høiland-Jørgensen <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Signed-off-by: Larysa Zaremba <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Larysa Zaremba <[email protected]> Date: Thu Mar 5 12:12:42 2026 +0100 xdp: use modulo operation to calculate XDP frag tailroom [ Upstream commit 88b6b7f7b216108a09887b074395fa7b751880b1 ] The current formula for calculating XDP tailroom in mbuf packets works only if each frag has its own page (if rxq->frag_size is PAGE_SIZE), this defeats the purpose of the parameter overall and without any indication leads to negative calculated tailroom on at least half of frags, if shared pages are used. There are not many drivers that set rxq->frag_size. Among them: * i40e and enetc always split page uniformly between frags, use shared pages * ice uses page_pool frags via libeth, those are power-of-2 and uniformly distributed across page * idpf has variable frag_size with XDP on, so current API is not applicable * mlx5, mtk and mvneta use PAGE_SIZE or 0 as frag_size for page_pool As for AF_XDP ZC, only ice, i40e and idpf declare frag_size for it. Modulo operation yields good results for aligned chunks, they are all power-of-2, between 2K and PAGE_SIZE. Formula without modulo fails when chunk_size is 2K. Buffers in unaligned mode are not distributed uniformly, so modulo operation would not work. To accommodate unaligned buffers, we could define frag_size as data + tailroom, and hence do not subtract offset when calculating tailroom, but this would necessitate more changes in the drivers. Define rxq->frag_size as an even portion of a page that fully belongs to a single frag. When calculating tailroom, locate the data start within such portion by performing a modulo operation on page offset. Fixes: bf25146a5595 ("bpf: add frags support to the bpf_xdp_adjust_tail() API") Acked-by: Jakub Kicinski <[email protected]> Reviewed-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Larysa Zaremba <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: David Thomson <[email protected]> Date: Tue Feb 24 09:37:11 2026 +0000 xen/acpi-processor: fix _CST detection using undersized evaluation buffer [ Upstream commit 8b57227d59a86fc06d4f09de08f98133680f2cae ] read_acpi_id() attempts to evaluate _CST using a stack buffer of sizeof(union acpi_object) (48 bytes), but _CST returns a nested Package of sub-Packages (one per C-state, each containing a register descriptor, type, latency, and power) requiring hundreds of bytes. The evaluation always fails with AE_BUFFER_OVERFLOW. On modern systems using FFH/MWAIT entry (where pblk is zero), this causes the function to return before setting the acpi_id_cst_present bit. In check_acpi_ids(), flags.power is then zero for all Phase 2 CPUs (physical CPUs beyond dom0's vCPU count), so push_cxx_to_hypervisor() is never called for them. On a system with dom0_max_vcpus=2 and 8 physical CPUs, only PCPUs 0-1 receive C-state data. PCPUs 2-7 are stuck in C0/C1 idle, unable to enter C2/C3. This costs measurable wall power (4W observed on an Intel Core Ultra 7 265K with Xen 4.20). The function never uses the _CST return value -- it only needs to know whether _CST exists. Replace the broken acpi_evaluate_object() call with acpi_has_method(), which correctly detects _CST presence using acpi_get_handle() without any buffer allocation. This brings C-state detection to parity with the P-state path, which already works correctly for Phase 2 CPUs. Fixes: 59a568029181 ("xen/acpi-processor: C and P-state driver that uploads said data to hypervisor.") Signed-off-by: David Thomson <[email protected]> Reviewed-by: Jan Beulich <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Message-ID: <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Ethan Tidmore <[email protected]> Date: Thu Feb 19 21:38:25 2026 -0600 xfs: Fix error pointer dereference commit cddfa648f1ab99e30e91455be19cd5ade26338c2 upstream. The function try_lookup_noperm() can return an error pointer and is not checked for one. Add checks for error pointer in xrep_adoption_check_dcache() and xrep_adoption_zap_dcache(). Detected by Smatch: fs/xfs/scrub/orphanage.c:449 xrep_adoption_check_dcache() error: 'd_child' dereferencing possible ERR_PTR() fs/xfs/scrub/orphanage.c:485 xrep_adoption_zap_dcache() error: 'd_child' dereferencing possible ERR_PTR() Fixes: 73597e3e42b4 ("xfs: ensure dentry consistency when the orphanage adopts a file") Cc: [email protected] # v6.16 Signed-off-by: Ethan Tidmore <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Nirjhar Roy (IBM) <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Darrick J. Wong <[email protected]> Date: Wed Feb 18 15:25:36 2026 -0800 xfs: fix xfs_group release bug in xfs_dax_notify_dev_failure commit eb8550fb75a875657dc29e3925a40244ec6b6bd6 upstream. Chris Mason reports that his AI tools noticed that we were using xfs_perag_put and xfs_group_put to release the group reference returned by xfs_group_next_range. However, the iterator function returns an object with an active refcount, which means that we must use the correct function to release the active refcount, which is _rele. Cc: <[email protected]> # v6.0 Fixes: 6f643c57d57c56 ("xfs: implement ->notify_failure() for XFS") Signed-off-by: "Darrick J. Wong" <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Carlos Maiolino <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Nikhil P. Rao <[email protected]> Date: Wed Feb 25 00:00:26 2026 +0000 xsk: Fix fragment node deletion to prevent buffer leak [ Upstream commit 60abb0ac11dccd6b98fd9182bc5f85b621688861 ] After commit b692bf9a7543 ("xsk: Get rid of xdp_buff_xsk::xskb_list_node"), the list_node field is reused for both the xskb pool list and the buffer free list, this causes a buffer leak as described below. xp_free() checks if a buffer is already on the free list using list_empty(&xskb->list_node). When list_del() is used to remove a node from the xskb pool list, it doesn't reinitialize the node pointers. This means list_empty() will return false even after the node has been removed, causing xp_free() to incorrectly skip adding the buffer to the free list. Fix this by using list_del_init() instead of list_del() in all fragment handling paths, this ensures the list node is reinitialized after removal, allowing the list_empty() to work correctly. Fixes: b692bf9a7543 ("xsk: Get rid of xdp_buff_xsk::xskb_list_node") Acked-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Nikhil P. Rao <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Nikhil P. Rao <[email protected]> Date: Wed Feb 25 00:00:27 2026 +0000 xsk: Fix zero-copy AF_XDP fragment drop [ Upstream commit f7387d6579d65efd490a864254101cb665f2e7a7 ] AF_XDP should ensure that only a complete packet is sent to application. In the zero-copy case, if the Rx queue gets full as fragments are being enqueued, the remaining fragments are dropped. For the multi-buffer case, add a check to ensure that the Rx queue has enough space for all fragments of a packet before starting to enqueue them. Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX") Signed-off-by: Nikhil P. Rao <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Larysa Zaremba <[email protected]> Date: Thu Mar 5 12:12:43 2026 +0100 xsk: introduce helper to determine rxq->frag_size [ Upstream commit 16394d80539937d348dd3b9ea32415c54e67a81b ] rxq->frag_size is basically a step between consecutive strictly aligned frames. In ZC mode, chunk size fits exactly, but if chunks are unaligned, there is no safe way to determine accessible space to grow tailroom. Report frag_size to be zero, if chunks are unaligned, chunk_size otherwise. Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX") Reviewed-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Larysa Zaremba <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Christoph Hellwig <[email protected]> Date: Tue Feb 24 06:21:44 2026 -0800 zloop: advertise a volatile write cache [ Upstream commit 6acf7860dcc79ed045cc9e6a79c8a8bb6959dba7 ] Zloop is file system backed and thus needs to sync the underlying file system to persist data. Set BLK_FEAT_WRITE_CACHE so that the block layer actually send flush commands, and fix the flush implementation as sync_filesystem requires s_umount to be held and the code currently misses that. Fixes: eb0570c7df23 ("block: new zoned loop block device driver") Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Christoph Hellwig <[email protected]> Date: Tue Feb 24 06:21:45 2026 -0800 zloop: check for spurious options passed to remove [ Upstream commit 3c4617117a2b7682cf037be5e5533e379707f050 ] Zloop uses a command option parser for all control commands, but most options are only valid for adding a new device. Check for incorrectly specified options in the remove handler. Fixes: eb0570c7df23 ("block: new zoned loop block device driver") Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>