Changelog in Linux kernel 6.11.3

 
accel/ivpu: Add missing MODULE_FIRMWARE metadata [+ + +]
Author: Alexander F. Lent <lx@xanderlent.com>
Date:   Tue Jul 9 07:54:14 2024 -0400

    accel/ivpu: Add missing MODULE_FIRMWARE metadata
    
    [ Upstream commit 58b5618ba80a5e5a8d531a70eae12070e5bd713f ]
    
    Modules that load firmware from various paths at runtime must declare
    those paths at compile time, via the MODULE_FIRMWARE macro, so that the
    firmware paths are included in the module's metadata.
    
    The accel/ivpu driver loads firmware but lacks this metadata,
    preventing dracut from correctly locating firmware files. Fix it.
    
    Fixes: 9ab43e95f922 ("accel/ivpu: Switch to generation based FW names")
    Fixes: 02d5b0aacd05 ("accel/ivpu: Implement firmware parsing and booting")
    Signed-off-by: Alexander F. Lent <lx@xanderlent.com>
    Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
    Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240709-fix-ivpu-firmware-metadata-v3-1-55f70bba055b@xanderlent.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ACPI: battery: Fix possible crash when unregistering a battery hook [+ + +]
Author: Armin Wolf <W_Armin@gmx.de>
Date:   Tue Oct 1 23:28:34 2024 +0200

    ACPI: battery: Fix possible crash when unregistering a battery hook
    
    [ Upstream commit 76959aff14a0012ad6b984ec7686d163deccdc16 ]
    
    When a battery hook returns an error when adding a new battery, then
    the battery hook is automatically unregistered.
    However the battery hook provider cannot know that, so it will later
    call battery_hook_unregister() on the already unregistered battery
    hook, resulting in a crash.
    
    Fix this by using the list head to mark battery hooks that have
    already been unregistered, so that battery_hook_unregister() can
    ignore them.
    
    Fixes: fa93854f7a7e ("battery: Add the battery hooking API")
    Signed-off-by: Armin Wolf <W_Armin@gmx.de>
    Link: https://patch.msgid.link/20241001212835.341788-3-W_Armin@gmx.de
    Cc: All applicable <stable@vger.kernel.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
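
The idea in the commit above, using the list linkage itself as the "already unregistered" marker, can be sketched in user-space C. This is a minimal sketch and not the kernel code: the list helpers only imitate the kernel's list API, and `struct hook`, `hook_unregister()` and `remove_calls` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

/* A node that points at itself is not linked into any list. */
static bool list_is_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail_node(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Unlink a node and re-initialize it so it reads as "not on any list". */
static void list_del_init_node(struct list_head *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    list_init(node);
}

/* Hypothetical hook type; only the list linkage matters here. */
struct hook {
    struct list_head list;
    int remove_calls;   /* counts how often teardown actually ran */
};

static void hook_unregister(struct hook *hook)
{
    assert(hook != NULL);
    /* Already unregistered (for example after a failed add): nothing to do. */
    if (list_is_empty(&hook->list))
        return;
    hook->remove_calls++;
    list_del_init_node(&hook->list);
}
```

With this shape, a second call to `hook_unregister()` on the same hook is a no-op rather than a second unlink of a node that is no longer on the list.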

ACPI: battery: Simplify battery hook locking [+ + +]
Author: Armin Wolf <W_Armin@gmx.de>
Date:   Tue Oct 1 23:28:33 2024 +0200

    ACPI: battery: Simplify battery hook locking
    
    [ Upstream commit 86309cbed26139e1caae7629dcca1027d9a28e75 ]
    
    Move the conditional locking from __battery_hook_unregister()
    into battery_hook_unregister() and rename the low-level function
    to simplify the locking during battery hook removal.
    
    Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Reviewed-by: Pali Rohár <pali@kernel.org>
    Signed-off-by: Armin Wolf <W_Armin@gmx.de>
    Link: https://patch.msgid.link/20241001212835.341788-2-W_Armin@gmx.de
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Stable-dep-of: 76959aff14a0 ("ACPI: battery: Fix possible crash when unregistering a battery hook")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ACPI: CPPC: Add support for setting EPP register in FFH [+ + +]
Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Mon Sep 9 22:15:24 2024 -0500

    ACPI: CPPC: Add support for setting EPP register in FFH
    
    [ Upstream commit aaf21ac93909e08a12931173336bdb52ac8499f1 ]
    
    Some Asus AMD systems are reported to not be able to change EPP values
    because the BIOS doesn't advertise support for the CPPC MSR and the PCC
    region is not configured.
    
    However the ACPI 6.2 specification allows CPC registers to be declared
    in FFH:
    ```
    Starting with ACPI Specification 6.2, all _CPC registers can be in
    PCC, System Memory, System IO, or Functional Fixed Hardware address
    spaces. OSPM support for this more flexible register space scheme
    is indicated by the “Flexible Address Space for CPPC Registers” _OSC
    bit.
    ```
    
    If this _OSC bit has been set, allow using FFH to configure EPP.
    
    Reported-by: al0uette@outlook.com
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218686
    Suggested-by: al0uette@outlook.com
    Tested-by: vderp@icloud.com
    Tested-by: al0uette@outlook.com
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Link: https://patch.msgid.link/20240910031524.106387-1-superm1@kernel.org
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ACPI: EC: Do not release locks during operation region accesses [+ + +]
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Thu Jul 4 18:26:54 2024 +0200

    ACPI: EC: Do not release locks during operation region accesses
    
    [ Upstream commit dc171114926ec390ab90f46534545420ec03e458 ]
    
    It is not particularly useful to release locks (the EC mutex and the
    ACPI global lock, if present) and re-acquire them immediately thereafter
    during EC address space accesses in acpi_ec_space_handler().
    
    First, releasing them for a while before grabbing them again does not
    really help anyone because there may not be enough time for another
    thread to acquire them.
    
    Second, if another thread successfully acquires them and carries out
    a new EC write or read in the middle of an operation region access in
    progress, it may confuse the EC firmware, especially after the burst
    mode has been enabled.
    
    Finally, manipulating the locks after writing or reading every single
    byte of data is overhead that it is better to avoid.
    
    Accordingly, modify the code to carry out EC address space accesses
    entirely without releasing the locks.
    
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://patch.msgid.link/12473338.O9o76ZdvQC@rjwysocki.net
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ACPI: PAD: fix crash in exit_round_robin() [+ + +]
Author: Seiji Nishikawa <snishika@redhat.com>
Date:   Sun Aug 25 23:13:52 2024 +0900

    ACPI: PAD: fix crash in exit_round_robin()
    
    [ Upstream commit 0a2ed70a549e61c5181bad5db418d223b68ae932 ]
    
    The kernel occasionally crashes in cpumask_clear_cpu(), which is called
    within exit_round_robin(), because when executing clear_bit(nr, addr) with
    nr set to 0xffffffff, the address calculation may cause misalignment within
    the memory, leading to access to an invalid memory address.
    
    ----------
    BUG: unable to handle kernel paging request at ffffffffe0740618
            ...
    CPU: 3 PID: 2919323 Comm: acpi_pad/14 Kdump: loaded Tainted: G           OE  X --------- -  - 4.18.0-425.19.2.el8_7.x86_64 #1
            ...
    RIP: 0010:power_saving_thread+0x313/0x411 [acpi_pad]
    Code: 89 cd 48 89 d3 eb d1 48 c7 c7 55 70 72 c0 e8 64 86 b0 e4 c6 05 0d a1 02 00 01 e9 bc fd ff ff 45 89 e4 42 8b 04 a5 20 82 72 c0 <f0> 48 0f b3 05 f4 9c 01 00 42 c7 04 a5 20 82 72 c0 ff ff ff ff 31
    RSP: 0018:ff72a5d51fa77ec8 EFLAGS: 00010202
    RAX: 00000000ffffffff RBX: ff462981e5d8cb80 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
    RBP: ff46297556959d80 R08: 0000000000000382 R09: ff46297c8d0f38d8
    R10: 0000000000000000 R11: 0000000000000001 R12: 000000000000000e
    R13: 0000000000000000 R14: ffffffffffffffff R15: 000000000000000e
    FS:  0000000000000000(0000) GS:ff46297a800c0000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffffffe0740618 CR3: 0000007e20410004 CR4: 0000000000771ee0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    PKRU: 55555554
    Call Trace:
     ? acpi_pad_add+0x120/0x120 [acpi_pad]
     kthread+0x10b/0x130
     ? set_kthread_struct+0x50/0x50
     ret_from_fork+0x1f/0x40
            ...
    CR2: ffffffffe0740618
    
    crash> dis -lr ffffffffc0726923
            ...
    /usr/src/debug/kernel-4.18.0-425.19.2.el8_7/linux-4.18.0-425.19.2.el8_7.x86_64/./include/linux/cpumask.h: 114
    0xffffffffc0726918 <power_saving_thread+776>:   mov    %r12d,%r12d
    /usr/src/debug/kernel-4.18.0-425.19.2.el8_7/linux-4.18.0-425.19.2.el8_7.x86_64/./include/linux/cpumask.h: 325
    0xffffffffc072691b <power_saving_thread+779>:   mov    -0x3f8d7de0(,%r12,4),%eax
    /usr/src/debug/kernel-4.18.0-425.19.2.el8_7/linux-4.18.0-425.19.2.el8_7.x86_64/./arch/x86/include/asm/bitops.h: 80
    0xffffffffc0726923 <power_saving_thread+787>:   lock btr %rax,0x19cf4(%rip)        # 0xffffffffc0740620 <pad_busy_cpus_bits>
    
    crash> px tsk_in_cpu[14]
    $66 = 0xffffffff
    
    crash> px 0xffffffffc072692c+0x19cf4
    $99 = 0xffffffffc0740620
    
    crash> sym 0xffffffffc0740620
    ffffffffc0740620 (b) pad_busy_cpus_bits [acpi_pad]
    
    crash> px pad_busy_cpus_bits[0]
    $42 = 0xfffc0
    ----------
    
    To fix this, ensure that tsk_in_cpu[tsk_index] != -1 before calling
    cpumask_clear_cpu() in exit_round_robin(), just as it is done in
    round_robin_cpu().
    
    Signed-off-by: Seiji Nishikawa <snishika@redhat.com>
    Link: https://patch.msgid.link/20240825141352.25280-1-snishika@redhat.com
    [ rjw: Subject edit, avoid updates to the same value ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
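
The numbers in the crash report above are consistent with the explanation. Assuming the `lock btr` with a 64-bit register bit offset addresses the quadword at `base + (nr / 64) * 8`, a bit number of 0xffffffff applied to pad_busy_cpus_bits at 0xffffffffc0740620 lands on 0xffffffffe0740618, which is the faulting address in CR2. The sketch below reproduces that arithmetic and the guard described in the fix. The array size, the single-word bitmap and the function names are assumptions for illustration, not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_LONG 64

/* Byte offset of the 64-bit word that a bit operation on bit nr touches. */
static uint64_t bit_word_offset(uint64_t nr)
{
    return (nr / BITS_PER_LONG) * sizeof(uint64_t);
}

/* Hypothetical stand-ins for the driver's bookkeeping. */
static uint64_t busy_cpus;   /* one bit per CPU, CPUs 0..63 only */
static unsigned int tsk_in_cpu[4] = {
    2, (unsigned int)-1, 5, (unsigned int)-1
};

/* Clear the CPU's busy bit only if the slot really holds a CPU number. */
static void exit_round_robin_sketch(unsigned int tsk_index)
{
    assert(tsk_index < 4);
    if (tsk_in_cpu[tsk_index] != (unsigned int)-1) {
        busy_cpus &= ~(UINT64_C(1) << tsk_in_cpu[tsk_index]);
        tsk_in_cpu[tsk_index] = (unsigned int)-1;
    }
}
```

Without the `!= -1` check, the shift (or, in the kernel, the bit operation) would be driven by 0xffffffff instead of a valid CPU number.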

ACPI: resource: Add Asus ExpertBook B2502CVA to irq1_level_low_skip_override[] [+ + +]
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Fri Sep 27 16:16:06 2024 +0200

    ACPI: resource: Add Asus ExpertBook B2502CVA to irq1_level_low_skip_override[]
    
    commit 056301e7c7c886f96d799edd36f3406cc30e1822 upstream.
    
    Like other Asus ExpertBook models the B2502CVA has its keyboard IRQ (1)
    described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh
    which breaks the keyboard.
    
    Add the B2502CVA to the irq1_level_low_skip_override[] quirk table to fix
    this.
    
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217760
    Cc: All applicable <stable@vger.kernel.org>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://patch.msgid.link/20240927141606.66826-4-hdegoede@redhat.com
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI: resource: Add Asus Vivobook X1704VAP to irq1_level_low_skip_override[] [+ + +]
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Fri Sep 27 16:16:05 2024 +0200

    ACPI: resource: Add Asus Vivobook X1704VAP to irq1_level_low_skip_override[]
    
    commit 2f80ce0b78c340e332f04a5801dee5e4ac8cfaeb upstream.
    
    Like other Asus Vivobook models the X1704VAP has its keyboard IRQ (1)
    described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh
    which breaks the keyboard.
    
    Add the X1704VAP to the irq1_level_low_skip_override[] quirk table to fix
    this.
    
    Reported-by: Lamome Julien <julien.lamome@wanadoo.fr>
    Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1078696
    Closes: https://lore.kernel.org/all/1226760b-4699-4529-bf57-6423938157a3@wanadoo.fr/
    Cc: All applicable <stable@vger.kernel.org>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://patch.msgid.link/20240927141606.66826-3-hdegoede@redhat.com
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI: resource: Loosen the Asus E1404GAB DMI match to also cover the E1404GA [+ + +]
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Fri Sep 27 16:16:04 2024 +0200

    ACPI: resource: Loosen the Asus E1404GAB DMI match to also cover the E1404GA
    
    commit 63539defee17bf0cbd8e24078cf103efee9c6633 upstream.
    
    Like other Asus Vivobooks, the Asus Vivobook Go E1404GA has a DSDT
    describing IRQ 1 as ActiveLow, while the kernel overrides to Edge_High.
    
        $ sudo dmesg | grep DMI:.*BIOS
        [    0.000000] DMI: ASUSTeK COMPUTER INC. Vivobook Go E1404GA_E1404GA/E1404GA, BIOS E1404GA.302 08/23/2023
        $ sudo cp /sys/firmware/acpi/tables/DSDT dsdt.dat
        $ iasl -d dsdt.dat
        $ grep -A 30 PS2K dsdt.dsl | grep IRQ -A 1
                    IRQ (Level, ActiveLow, Exclusive, )
                        {1}
    
    There already is an entry in the irq1_level_low_skip_override[] DMI match
    table for the "E1404GAB", change this to match on "E1404GA" to cover
    the E1404GA model as well (DMI_MATCH() does a substring match).
    
    Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219224
    Cc: All applicable <stable@vger.kernel.org>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://patch.msgid.link/20240927141606.66826-2-hdegoede@redhat.com
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI: resource: Remove duplicate Asus E1504GAB IRQ override [+ + +]
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Fri Sep 27 16:16:03 2024 +0200

    ACPI: resource: Remove duplicate Asus E1504GAB IRQ override
    
    commit 65bdebf38e5fac7c56a9e05d3479a707e6dc783c upstream.
    
    Commit d2aaf1996504 ("ACPI: resource: Add DMI quirks for ASUS Vivobook
    E1504GA and E1504GAB") does exactly what the subject says, adding DMI
    matches for both the E1504GA and E1504GAB.
    
    But DMI_MATCH() does a substring match, so checking for E1504GA will also
    match E1504GAB.
    
    Drop the unnecessary E1504GAB entry since that is covered already by
    the E1504GA entry.
    
    Fixes: d2aaf1996504 ("ACPI: resource: Add DMI quirks for ASUS Vivobook E1504GA and E1504GAB")
    Cc: All applicable <stable@vger.kernel.org>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://patch.msgid.link/20240927141606.66826-1-hdegoede@redhat.com
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
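
The substring behaviour that the commit above relies on can be illustrated with plain `strstr()`. This is only a sketch of the matching rule as the commit message describes it. `dmi_substr_match()` is a hypothetical helper, not the kernel's DMI code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Substring test standing in for the matching rule the commit describes. */
static bool dmi_substr_match(const char *dmi_value, const char *pattern)
{
    assert(dmi_value != NULL && pattern != NULL);
    return strstr(dmi_value, pattern) != NULL;
}
```

Under this rule a table entry for "E1504GA" already matches a board that reports "E1504GAB", while an entry for "E1504GAB" does not match a board that reports only "E1504GA". That is why the shorter string is the one worth keeping.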

ACPI: resource: Skip IRQ override on Asus Vivobook Go E1404GAB [+ + +]
Author: Tamim Khan <tamim@fusetak.com>
Date:   Mon Sep 2 21:43:05 2024 -0400

    ACPI: resource: Skip IRQ override on Asus Vivobook Go E1404GAB
    
    [ Upstream commit 49e9cc315604972cc14868cb67831e3e8c3f1470 ]
    
    Like other Asus Vivobooks, the Asus Vivobook Go E1404GAB has a DSDT
    that describes IRQ 1 as ActiveLow, while the kernel overrides to Edge_High.
    
    This override prevents the internal keyboard from working.
    
    Fix the problem by adding this laptop to the table that prevents the kernel
    from overriding the IRQ.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219212
    Signed-off-by: Tamim Khan <tamim@fusetak.com>
    Link: https://patch.msgid.link/20240903014317.38858-1-tamim@fusetak.com
    [ rjw: Changelog edits ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ACPI: video: Add backlight=native quirk for Dell OptiPlex 5480 AIO [+ + +]
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Wed Sep 18 17:38:49 2024 +0200

    ACPI: video: Add backlight=native quirk for Dell OptiPlex 5480 AIO
    
    commit ac78288fe062b64e45a479eaae74aaaafcc8ecdd upstream.
    
    Dell All In One (AIO) models released after 2017 may use a backlight
    controller board connected to a UART.
    
    In DSDT this UART port will be defined as:
    
       Name (_HID, "DELL0501")
       Name (_CID, EisaId ("PNP0501"))
    
    The Dell OptiPlex 5480 AIO has an ACPI device for one of its UARTs with
    the above _HID + _CID. Loading the dell-uart-backlight driver fails with
    the following errors:
    
    [   18.261353] dell_uart_backlight serial0-0: Timed out waiting for response.
    [   18.261356] dell_uart_backlight serial0-0: error -ETIMEDOUT: getting firmware version
    [   18.261359] dell_uart_backlight serial0-0: probe with driver dell_uart_backlight failed with error -110
    
    Indicating that there is no backlight controller board attached to
    the UART, while the GPU's native backlight control method does work.
    
    Add a quirk to use the GPU's native backlight control method on this model.
    
    Fixes: cd8e468efb4f ("ACPI: video: Add Dell UART backlight controller detection")
    Cc: All applicable <stable@vger.kernel.org>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://patch.msgid.link/20240918153849.37221-1-hdegoede@redhat.com
    [ rjw: Changelog edit ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI: video: Add force_vendor quirk for Panasonic Toughbook CF-18 [+ + +]
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Sat Sep 7 14:44:19 2024 +0200

    ACPI: video: Add force_vendor quirk for Panasonic Toughbook CF-18
    
    [ Upstream commit eb7b0f12e13ba99e64e3a690c2166895ed63b437 ]
    
    The Panasonic Toughbook CF-18 advertises both native and vendor backlight
    control interfaces. But only the vendor one actually works.
    
    acpi_video_get_backlight_type() will pick the non-working native backlight
    by default, so add a quirk to select the working vendor backlight instead.
    
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://patch.msgid.link/20240907124419.21195-1-hdegoede@redhat.com
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ACPICA: check null return of ACPI_ALLOCATE_ZEROED() in acpi_db_convert_to_package() [+ + +]
Author: Pei Xiao <xiaopei01@kylinos.cn>
Date:   Thu Jul 18 14:05:48 2024 +0800

    ACPICA: check null return of ACPI_ALLOCATE_ZEROED() in acpi_db_convert_to_package()
    
    [ Upstream commit a5242874488eba2b9062985bf13743c029821330 ]
    
    ACPICA commit 4d4547cf13cca820ff7e0f859ba83e1a610b9fd0
    
    ACPI_ALLOCATE_ZEROED() may fail, elements might be NULL and will cause
    NULL pointer dereference later.
    
    Link: https://github.com/acpica/acpica/commit/4d4547cf
    Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
    Link: https://patch.msgid.link/tencent_4A21A2865B8B0A0D12CAEBEB84708EDDB505@qq.com
    [ rjw: Subject and changelog edits ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ACPICA: Fix memory leak if acpi_ps_get_next_field() fails [+ + +]
Author: Armin Wolf <W_Armin@gmx.de>
Date:   Sun Apr 14 21:50:33 2024 +0200

    ACPICA: Fix memory leak if acpi_ps_get_next_field() fails
    
    [ Upstream commit e6169a8ffee8a012badd8c703716e761ce851b15 ]
    
    ACPICA commit 1280045754264841b119a5ede96cd005bc09b5a7
    
    If acpi_ps_get_next_field() fails, the previously created field list
    needs to be properly disposed before returning the status code.
    
    Link: https://github.com/acpica/acpica/commit/12800457
    Signed-off-by: Armin Wolf <W_Armin@gmx.de>
    [ rjw: Rename local variable to avoid compiler confusion ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
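
The pattern behind the fix above, disposing of a partially built list before returning an error, can be sketched as follows. This is not the ACPICA parser: `struct field`, `parse_fields()` and the convention that a negative input simulates a failing element are assumptions made for the example, and `live_fields` exists only so the cleanup can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct field {
    struct field *next;
    int value;
};

static int live_fields;   /* allocation counter, used to show nothing leaks */

static struct field *field_new(int value)
{
    struct field *f = malloc(sizeof(*f));
    if (f == NULL)
        return NULL;
    f->next = NULL;
    f->value = value;
    live_fields++;
    return f;
}

static void field_list_free(struct field *head)
{
    while (head != NULL) {
        struct field *next = head->next;
        free(head);
        live_fields--;
        head = next;
    }
}

/*
 * Build a list from values[]. A negative value simulates an element that
 * fails to parse. On failure the elements created so far are freed before
 * the error is returned, so the caller never sees a half-built list.
 */
static int parse_fields(const int *values, size_t count, struct field **out)
{
    struct field *head = NULL, **tail = &head;

    assert(out != NULL);
    *out = NULL;
    for (size_t i = 0; i < count; i++) {
        struct field *f = values[i] < 0 ? NULL : field_new(values[i]);
        if (f == NULL) {
            field_list_free(head);   /* dispose of the partial list */
            return -1;
        }
        *tail = f;
        tail = &f->next;
    }
    *out = head;
    return 0;
}
```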

ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails [+ + +]
Author: Armin Wolf <W_Armin@gmx.de>
Date:   Wed Apr 3 20:50:11 2024 +0200

    ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails
    
    [ Upstream commit 5accb265f7a1b23e52b0ec42313d1e12895552f4 ]
    
    ACPICA commit 2802af722bbde7bf1a7ac68df68e179e2555d361
    
    If acpi_ps_get_next_namepath() fails, the previously allocated
    union acpi_parse_object needs to be freed before returning the
    status code.
    
    The issue was first reported on the Linux ACPI mailing list:
    
    Link: https://lore.kernel.org/linux-acpi/56f94776-484f-48c0-8855-dba8e6a7793b@yandex.ru/T/
    Link: https://github.com/acpica/acpica/commit/2802af72
    Signed-off-by: Armin Wolf <W_Armin@gmx.de>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ACPICA: iasl: handle empty connection_node [+ + +]
Author: Aleksandrs Vinarskis <alex.vinarskis@gmail.com>
Date:   Sun Aug 11 23:33:44 2024 +0200

    ACPICA: iasl: handle empty connection_node
    
    [ Upstream commit a0a2459b79414584af6c46dd8c6f866d8f1aa421 ]
    
    ACPICA commit 6c551e2c9487067d4b085333e7fe97e965a11625
    
    Link: https://github.com/acpica/acpica/commit/6c551e2c
    Signed-off-by: Aleksandrs Vinarskis <alex.vinarskis@gmail.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
afs: Fix missing wire-up of afs_retry_request() [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Sat Sep 14 21:40:02 2024 +0100

    afs: Fix missing wire-up of afs_retry_request()
    
    [ Upstream commit 2cf36327ee1e47733aba96092d7bd082a4056ff5 ]
    
    afs_retry_request() is supposed to be pointed to by the afs_req_ops netfs
    operations table, but the pointer got lost somewhere.  The function is used
    during writeback to rotate through the authentication keys that were in
    force when the file was modified locally.
    
    Fix this by adding the pointer to the function.
    
    Fixes: 1ecb146f7cd8 ("netfs, afs: Use writeback retry to deal with alternate keys")
    Reported-by: Dr. David Alan Gilbert <linux@treblig.org>
    Signed-off-by: David Howells <dhowells@redhat.com>
    Link: https://lore.kernel.org/r/1690847.1726346402@warthog.procyon.org.uk
    cc: Marc Dionne <marc.dionne@auristor.com>
    cc: Jeff Layton <jlayton@kernel.org>
    cc: linux-afs@lists.infradead.org
    cc: netfs@lists.linux.dev
    cc: linux-fsdevel@vger.kernel.org
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

afs: Fix the setting of the server responding flag [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Mon Sep 23 16:07:50 2024 +0100

    afs: Fix the setting of the server responding flag
    
    [ Upstream commit ff98751bae40faed1ba9c6a7287e84430f7dec64 ]
    
    In afs_wait_for_operation(), we transcribe the call responded flag to
    the server record that we used after doing the fileserver iteration loop -
    but it's possible to exit the loop having had a response from the server
    that we've discarded (e.g. it returned an abort or we started receiving
    data, but the call didn't complete).
    
    This means that op->server might be NULL, but we don't check that before
    attempting to set the server flag.
    
    Fixes: 98f9fda2057b ("afs: Fold the afs_addr_cursor struct in")
    Signed-off-by: David Howells <dhowells@redhat.com>
    Link: https://lore.kernel.org/r/20240923150756.902363-7-dhowells@redhat.com
    cc: Marc Dionne <marc.dionne@auristor.com>
    cc: linux-afs@lists.infradead.org
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
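
The missing check described above amounts to guarding the flag update with a NULL test on the server pointer. The sketch below uses hypothetical, heavily reduced types (`struct server`, `struct operation`, `SERVER_FL_RESPONDING`) and is not the afs code. It only shows the shape of the guard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SERVER_FL_RESPONDING 0x1UL   /* hypothetical flag bit */

struct server {
    unsigned long flags;
};

struct operation {
    struct server *server;    /* may be NULL once the iteration loop ends */
    bool call_responded;
};

/* Copy the "responded" state to the server record, if there is one. */
static void record_response(struct operation *op)
{
    assert(op != NULL);
    if (op->call_responded && op->server != NULL)
        op->server->flags |= SERVER_FL_RESPONDING;
}
```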

 
ALSA: asihpi: Fix potential OOB array access [+ + +]
Author: Takashi Iwai <tiwai@suse.de>
Date:   Thu Aug 8 11:14:42 2024 +0200

    ALSA: asihpi: Fix potential OOB array access
    
    [ Upstream commit 7b986c7430a6bb68d523dac7bfc74cbd5b44ef96 ]
    
    ASIHPI driver stores some values in the static array upon a response
    from the driver, and its index depends on the firmware.  We shouldn't
    trust it blindly.
    
    This patch adds a sanity check of the array index to fit in the array
    size.
    
    Link: https://patch.msgid.link/20240808091454.30846-1-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
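
A bounds check of the kind the commit above describes might look like the following sketch. The table name, its size and `store_response()` are hypothetical. The point is only that an index supplied by firmware is validated against the array size before it is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SLOTS 8   /* hypothetical table size */

static int slot_values[NUM_SLOTS];

/* Store a value at an index reported by firmware; reject bad indices. */
static bool store_response(size_t index_from_fw, int value)
{
    if (index_from_fw >= NUM_SLOTS)
        return false;   /* untrusted index would fall outside the table */
    slot_values[index_from_fw] = value;
    return true;
}
```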

ALSA: control: Fix leftover snd_power_unref() [+ + +]
Author: Takashi Iwai <tiwai@suse.de>
Date:   Thu Aug 1 08:42:01 2024 +0200

    ALSA: control: Fix leftover snd_power_unref()
    
    commit fef1ac950c600ba50ef4d65ca03c8dae9be7f9ea upstream.
    
    One snd_power_unref() was forgotten and left at __snd_ctl_elem_info()
    in the previous change for reorganizing the locking order.
    
    Fixes: fcc62b19104a ("ALSA: control: Take power_ref lock primarily")
    Link: https://github.com/thesofproject/linux/pull/5127
    Link: https://patch.msgid.link/20240801064203.30284-1-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: control: Fix power_ref lock order for compat code, too [+ + +]
Author: Takashi Iwai <tiwai@suse.de>
Date:   Thu Aug 8 18:31:27 2024 +0200

    ALSA: control: Fix power_ref lock order for compat code, too
    
    [ Upstream commit a1066453b5e49a28523f3ecbbfe4e06c6a29561c ]
    
    In the previous change for swapping the power_ref and controls_rwsem
    lock order, the code path for the compat layer was forgotten.
    This patch covers the remaining code.
    
    Fixes: fcc62b19104a ("ALSA: control: Take power_ref lock primarily")
    Link: https://patch.msgid.link/20240808163128.20383-1-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: control: Take power_ref lock primarily [+ + +]
Author: Takashi Iwai <tiwai@suse.de>
Date:   Mon Jul 29 18:06:58 2024 +0200

    ALSA: control: Take power_ref lock primarily
    
    [ Upstream commit fcc62b19104a67b9a2941513771e09389b75bd95 ]
    
    The code paths for kcontrol accesses often nest both the card's
    controls_rwsem and power_ref locks, taken in that order.
    However, what could take much longer is the latter, power_ref; it
    waits for the power state of the device, and it pretty much depends on
    the user's action.
    
    This patch swaps the locking order of those locks to a more natural
    way, namely, power_ref -> controls_rwsem, in order to shorten the time
    of possible nested locks.  For consistency, power_ref is taken always
    in the top-level caller side (that is, *_user() functions and the
    ioctl handler itself).
    
    Link: https://patch.msgid.link/20240729160659.4516-1-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: core: add isascii() check to card ID generator [+ + +]
Author: Jaroslav Kysela <perex@perex.cz>
Date:   Wed Oct 2 21:46:49 2024 +0200

    ALSA: core: add isascii() check to card ID generator
    
    commit d278a9de5e1837edbe57b2f1f95a104ff6c84846 upstream.
    
    The card identifier should contain only safe ASCII characters. isalnum()
    also returns true for non-ASCII characters.
    
    Link: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4135
    Link: https://lore.kernel.org/linux-sound/yk3WTvKkwheOon_LzZlJ43PPInz6byYfBzpKkbasww1yzuiMRqn7n6Y8vZcXB-xwFCu_vb8hoNjv7DTNwH5TWjpEuiVsyn9HPCEXqwF4120=@protonmail.com/
    Cc: stable@vger.kernel.org
    Reported-by: Barnabás Pőcze <pobrn@protonmail.com>
    Signed-off-by: Jaroslav Kysela <perex@perex.cz>
    Link: https://patch.msgid.link/20241002194649.1944696-1-perex@perex.cz
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
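
The effect of combining an ASCII check with an alphanumeric check, as the commit above describes, can be sketched with a small filter. This is an illustration under stated assumptions, not the ALSA card ID generator: `is_safe_id_char()` and `copy_safe_id()` are hypothetical helpers, and the ASCII test is written out as an explicit range check instead of calling `isascii()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True only for 7-bit ASCII letters and digits. */
static bool is_safe_id_char(unsigned char c)
{
    if (c >= 0x80)
        return false;   /* reject every non-ASCII byte up front */
    return (c >= '0' && c <= '9') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
}

/* Copy src into dst, keeping only safe characters; returns the new length. */
static size_t copy_safe_id(char *dst, size_t dst_size, const char *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL && dst_size > 0);
    for (; *src != '\0' && n + 1 < dst_size; src++) {
        if (is_safe_id_char((unsigned char)*src))
            dst[n++] = *src;
    }
    dst[n] = '\0';
    return n;
}
```

Bytes of a multi-byte UTF-8 sequence all have the top bit set, so they are dropped by the first test regardless of how a character-class table would classify them.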

ALSA: gus: Fix some error handling paths related to get_bpos() usage [+ + +]
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Thu Oct 3 21:34:01 2024 +0200

    ALSA: gus: Fix some error handling paths related to get_bpos() usage
    
    [ Upstream commit 9df39a872c462ea07a3767ebd0093c42b2ff78a2 ]
    
    If get_bpos() fails, it is likely that the corresponding error code should
    be returned.
    
    Fixes: a6970bb1dd99 ("ALSA: gus: Convert to the new PCM ops")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Link: https://patch.msgid.link/d9ca841edad697154afa97c73a5d7a14919330d9.1727984008.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin [+ + +]
Author: Takashi Iwai <tiwai@suse.de>
Date:   Fri Oct 4 10:25:58 2024 +0200

    ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin
    
    [ Upstream commit b3ebb007060f89d5a45c9b99f06a55e36a1945b5 ]
    
    We received a regression report for System76 Pangolin (pang14) due to
    the recent fix for Tuxedo Sirius devices to support the top speaker.
    The reason was the conflicting PCI SSID, as often seen.
    
    As a workaround, now the codec SSID is checked and the quirk is
    applied conditionally only to Sirius devices.
    
    Fixes: 4178d78cd7a8 ("ALSA: hda/conexant: Add pincfg quirk to enable top speakers on Sirius devices")
    Reported-by: Christian Heusel <christian@heusel.eu>
    Reported-by: Jerry <jerryluo225@gmail.com>
    Closes: https://lore.kernel.org/c930b6a6-64e5-498f-b65a-1cd5e0a1d733@heusel.eu
    Link: https://patch.msgid.link/20241004082602.29016-1-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs [+ + +]
Author: Takashi Iwai <tiwai@suse.de>
Date:   Tue Oct 1 14:14:36 2024 +0200

    ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs
    
    [ Upstream commit 1c801e7f77445bc56e5e1fec6191fd4503534787 ]
    
    Some time ago, we introduced the obey_preferred_dacs flag for choosing
    the DAC/pin pairs specified by the driver instead of parsing the
    paths.  This works as expected, per se, but there have been a few
    cases where we forgot to set this flag while preferred_dacs table is
    already set up.  It ended up with incorrect wiring and made us
    wonder why it doesn't work.
    
    Basically, when the preferred_dacs table is provided, it means that
    the driver really wants to wire up to follow that.  That is, the
    presence of the preferred_dacs table itself is already a "do-it"
    flag.
    
    In this patch, we simply replace the evaluation of obey_preferred_dacs
    flag with the presence of preferred_dacs table for fixing the
    misbehavior.  Another patch to drop the obsolete flag will
    follow.
    
    Fixes: 242d990c158d ("ALSA: hda/generic: Add option to enforce preferred_dacs pairs")
    Link: https://bugzilla.suse.com/show_bug.cgi?id=1219803
    Link: https://patch.msgid.link/20241001121439.26060-1-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200 [+ + +]
Author: Abhishek Tamboli <abhishektamboli9@gmail.com>
Date:   Mon Sep 30 20:23:00 2024 +0530

    ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200
    
    commit d75dba49744478c32f6ce1c16b5f391c2d5cef5f upstream.
    
    Add the quirk for HP Pavilion Gaming laptop 15z-ec200 for
    enabling the mute LED. The fix applies the ALC285_FIXUP_HP_MUTE_LED
    quirk for this model.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219303
    Signed-off-by: Abhishek Tamboli <abhishektamboli9@gmail.com>
    Cc: <stable@vger.kernel.org>
    Link: https://patch.msgid.link/20240930145300.4604-1-abhishektamboli9@gmail.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek: Add quirk for Huawei MateBook 13 KLV-WX9 [+ + +]
Author: Ai Chao <aichao@kylinos.cn>
Date:   Thu Sep 26 14:02:52 2024 +0800

    ALSA: hda/realtek: Add quirk for Huawei MateBook 13 KLV-WX9
    
    commit dee476950cbd83125655a3f49e00d63b79f6114e upstream.
    
    The headset mic requires a fixup to be properly detected/used.
    
    Signed-off-by: Ai Chao <aichao@kylinos.cn>
    Cc: <stable@vger.kernel.org>
    Link: https://patch.msgid.link/20240926060252.25630-1-aichao@kylinos.cn
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek: fix mute/micmute LED for HP mt645 G8 [+ + +]
Author: Nikolai Afanasenkov <nikolai.afanasenkov@hp.com>
Date:   Mon Sep 16 13:50:42 2024 -0600

    ALSA: hda/realtek: fix mute/micmute LED for HP mt645 G8
    
    commit cb2deca056d579fe008c8d0a4ceb04d2b368fe42 upstream.
    
    The HP Elite mt645 G8 Mobile Thin Client uses an ALC236 codec
    and needs the ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF quirk
    to enable the mute and micmute LED functionality.
    
    This patch adds the system ID of the HP Elite mt645 G8
    to the `alc269_fixup_tbl` in `patch_realtek.c`
    to enable the required quirk.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Nikolai Afanasenkov <nikolai.afanasenkov@hp.com>
    Link: https://patch.msgid.link/20240916195042.4050-1-nikolai.afanasenkov@hp.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek: Fix the push button function for the ALC257 [+ + +]
Author: Oder Chiou <oder_chiou@realtek.com>
Date:   Mon Sep 30 18:50:39 2024 +0800

    ALSA: hda/realtek: Fix the push button function for the ALC257
    
    [ Upstream commit 05df9732a0894846c46d0062d4af535c5002799d ]
    
    The headset push button does not work properly on the ALC257.
    This patch reverts the previous commit to correct that side effect.
    
    Fixes: ef9718b3d54e ("ALSA: hda/realtek: Fix noise from speakers on Lenovo IdeaPad 3 15IAU7")
    Signed-off-by: Oder Chiou <oder_chiou@realtek.com>
    Link: https://patch.msgid.link/20240930105039.3473266-1-oder_chiou@realtek.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: hda/realtek: Refactor and simplify Samsung Galaxy Book init [+ + +]
Author: Joshua Grisham <josh@joshuagrisham.com>
Date:   Mon Sep 9 21:30:00 2024 +0200

    ALSA: hda/realtek: Refactor and simplify Samsung Galaxy Book init
    
    [ Upstream commit 7e4d4b32ab9532bd1babcd5d0763d727ebb04be0 ]
    
    I have done a lot of analysis for these types of devices and collaborated
    quite a bit with Nick Weihs (author of the first patch submitted for this
    including adding samsung_helper.c). More information can be found in the
    issue on Github [1] including additional rationale and testing.
    
    The existing implementation includes a large number of equalizer coef
    values that are not necessary to actually init and enable the speaker
    amps, and that create a somewhat worse sound profile. Users have
    reported "muffled" or "muddy" sound; more information about this including
    my analysis of the differences can be found in the linked Github issue.
    
    This patch refactors the "v2" version of ALC298_FIXUP_SAMSUNG_AMP to a much
    simpler implementation which removes the new samsung_helper.c, reuses more
    of the existing patch_realtek.c, and sends significantly fewer unnecessary
    coef values (including removing all of these EQ-specific coef values).
    
    A pcm_playback_hook is used to dynamically enable and disable the speaker
    amps only when there will be audio playback; this is to match the behavior
    of how the driver for these devices is working in Windows, and is
    suspected but not yet tested or confirmed to help with power consumption.
    
    Support for models with 2 speaker amps vs 4 speaker amps is controlled by
    a specific quirk name for both types. A new int num_speaker_amps has been
    added to alc_spec so that the hooks can know how many speaker amps to
    enable or disable. This design was chosen to limit the number of places
    that subsystem ids will need to be maintained: like this, they can be
    maintained only once in the quirk table and there will not be another
    separate list of subsystem ids to maintain elsewhere in the code.
    
    Also updated the quirk name from ALC298_FIXUP_SAMSUNG_AMP2 to
    ALC298_FIXUP_SAMSUNG_AMP_V2_.. as this is not a quirk for "Amp #2" on
    ALC298 but is instead a different version of how to handle it.
    
    More devices have been added (see Github issue for testing confirmation),
    as well as a small cleanup to existing names.
    
    [1]: https://github.com/thesofproject/linux/issues/4055#issuecomment-2323411911
    
    Signed-off-by: Joshua Grisham <josh@joshuagrisham.com>
    Link: https://patch.msgid.link/20240909193000.838815-1-josh@joshuagrisham.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: hda/tas2781: Add new quirk for Lenovo Y990 Laptop [+ + +]
Author: Baojun Xu <baojun.xu@ti.com>
Date:   Thu Sep 19 15:57:43 2024 +0800

    ALSA: hda/tas2781: Add new quirk for Lenovo Y990 Laptop
    
    commit 49f5ee951f11f4d6a124f00f71b2590507811a55 upstream.
    
    Add new vendor_id and subsystem_id in quirk for Lenovo Y990 Laptop.
    
    Signed-off-by: Baojun Xu <baojun.xu@ti.com>
    Cc: <stable@vger.kernel.org>
    Link: https://patch.msgid.link/20240919075743.259-1-baojun.xu@ti.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hdsp: Break infinite MIDI input flush loop [+ + +]
Author: Takashi Iwai <tiwai@suse.de>
Date:   Thu Aug 8 11:15:12 2024 +0200

    ALSA: hdsp: Break infinite MIDI input flush loop
    
    [ Upstream commit c01f3815453e2d5f699ccd8c8c1f93a5b8669e59 ]
    
    The current MIDI input flush on HDSP and HDSPM drivers relies on the
    hardware reporting the right value.  If the hardware doesn't give the
    proper value but returns -1, the flush may get stuck in an infinite loop.
    
    Add a counter and break if the loop is unexpectedly too long.
    
    Link: https://patch.msgid.link/20240808091513.31380-1-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
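
    As an illustration of the bounded flush described above, here is a
    minimal userspace sketch. The fake_midi_port type, its helpers, and the
    FLUSH_MAX_ITERATIONS limit are hypothetical stand-ins, not the driver's
    actual names or values; only the idea (count iterations and break) comes
    from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cap on flush iterations; the driver's real limit is not stated above. */
#define FLUSH_MAX_ITERATIONS 256

/* Hypothetical stand-in for the hardware input FIFO. */
struct fake_midi_port {
    int pending;    /* bytes the device reports as still queued */
    int stuck;      /* non-zero: device always reports -1 */
};

/* Returns the number of pending bytes, or -1 when the fake hardware misbehaves. */
static int fake_input_available(const struct fake_midi_port *port)
{
    return port->stuck ? -1 : port->pending;
}

/* Consumes one byte; has no effect on a misbehaving port. */
static void fake_read_byte(struct fake_midi_port *port)
{
    if (!port->stuck && port->pending > 0)
        port->pending--;
}

/*
 * Drain pending input, but give up after FLUSH_MAX_ITERATIONS reads.
 * Returns the number of bytes discarded.
 */
int flush_midi_input(struct fake_midi_port *port)
{
    int count = 0;

    assert(port != NULL);
    while (fake_input_available(port) != 0) {
        fake_read_byte(port);
        if (++count >= FLUSH_MAX_ITERATIONS)
            break;  /* the device is not converging; stop instead of spinning */
    }
    return count;
}
```

    With a well-behaved port the loop ends once nothing is pending; with a
    port that always reports -1 it stops at the cap instead of spinning.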

ALSA: line6: add hw monitor volume control to POD HD500X [+ + +]
Author: Hans P. Moller <hmoller@uc.cl>
Date:   Thu Oct 3 20:28:28 2024 -0300

    ALSA: line6: add hw monitor volume control to POD HD500X
    
    commit 703235a244e533652346844cfa42623afb36eed1 upstream.
    
    Add hw monitor volume control for POD HD500X. This is done by adding
    LINE6_CAP_HWMON_CTL to the capabilities.
    
    Signed-off-by: Hans P. Moller <hmoller@uc.cl>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Link: https://patch.msgid.link/20241003232828.5819-1-hmoller@uc.cl
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: mixer_oss: Remove some incorrect kfree_const() usages [+ + +]
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Thu Sep 26 20:17:36 2024 +0200

    ALSA: mixer_oss: Remove some incorrect kfree_const() usages
    
    [ Upstream commit 368e4663c557de4a33f321b44e7eeec0a21b2e4e ]
    
    "assigned" and "assigned->name" are allocated in snd_mixer_oss_proc_write()
    using kmalloc() and kstrdup(), so there is no point in using kfree_const()
    to free these resources.
    
    Switch to the more standard kfree() to free these resources.
    
    This could avoid a memory leak.
    
    Fixes: 454f5ec1d2b7 ("ALSA: mixer: oss: Constify snd_mixer_oss_assign_table definition")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Link: https://patch.msgid.link/63ac20f64234b7c9ea87a7fa9baf41e8255852f7.1727374631.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: usb-audio: Add delay quirk for VIVO USB-C HEADSET [+ + +]
Author: Lianqin Hu <hulianqin@vivo.com>
Date:   Wed Sep 25 03:16:29 2024 +0000

    ALSA: usb-audio: Add delay quirk for VIVO USB-C HEADSET
    
    commit 73385f3e0d8088b715ae8f3f66d533c482a376ab upstream.
    
    Audio control requests that set the sampling frequency sometimes fail on
    this card. Adding delay between control messages eliminates that problem.
    
    Signed-off-by: Lianqin Hu <hulianqin@vivo.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Link: https://patch.msgid.link/TYUPR06MB62177E629E9DEF2401333BF7D2692@TYUPR06MB6217.apcprd06.prod.outlook.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: usb-audio: Add input value sanity checks for standard types [+ + +]
Author: Takashi Iwai <tiwai@suse.de>
Date:   Tue Aug 6 14:46:50 2024 +0200

    ALSA: usb-audio: Add input value sanity checks for standard types
    
    [ Upstream commit 901e85677ec0bb9a69fb9eab1feafe0c4eb7d07e ]
    
    For an invalid input value that is out of the given range, the
    USB-audio driver currently corrects the value silently and accepts it
    without error.  This is not wrong behavior per se, but the recent
    kselftest expects an error in such a case, hence a different
    behavior is expected now.
    
    This patch adds a sanity check at each control put for the standard
    mixer types and returns an error if an invalid value is given.
    
    Note that this covers only the standard mixer types.  The mixer quirks
    that have their own control callbacks would need separate coverage.
    
    Link: https://patch.msgid.link/20240806124651.28203-1-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
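
    The difference between the old and new behavior can be sketched in
    plain C as follows. The ctl_range type and both function names are
    hypothetical, and the use of -EINVAL as the error code is an
    assumption; the commit message only says that an error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical description of the range a mixer control accepts. */
struct ctl_range {
    long min;
    long max;
};

/* Old behavior: silently pull an out-of-range value back into range. */
long put_value_clamped(const struct ctl_range *range, long val)
{
    assert(range != NULL);
    if (val < range->min)
        return range->min;
    if (val > range->max)
        return range->max;
    return val;
}

/*
 * New behavior: reject an out-of-range value instead of correcting it.
 * Returns 0 and stores the value on success, -EINVAL (assumed) otherwise.
 */
int put_value_checked(const struct ctl_range *range, long val, long *stored)
{
    assert(range != NULL && stored != NULL);
    if (val < range->min || val > range->max)
        return -EINVAL;
    *stored = val;
    return 0;
}
```

    A test that writes an invalid value and expects a failure passes only
    against the second variant, which matches what the message says the
    kselftest wants.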

ALSA: usb-audio: Add logitech Audio profile quirk [+ + +]
Author: Joshua Pius <joshuapius@chromium.org>
Date:   Thu Sep 12 15:26:28 2024 +0000

    ALSA: usb-audio: Add logitech Audio profile quirk
    
    [ Upstream commit a51c925c11d7b855167e64b63eb4378e5adfc11d ]
    
    Specify shortnames for the following Logitech Devices: Rally bar, Rally
    bar mini, Tap, MeetUp and Huddle.
    
    Signed-off-by: Joshua Pius <joshuapius@chromium.org>
    Link: https://patch.msgid.link/20240912152635.1859737-1-joshuapius@google.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: usb-audio: Add mixer quirk for RME Digiface USB [+ + +]
Author: Asahi Lina <lina@asahilina.net>
Date:   Tue Sep 3 19:52:30 2024 +0900

    ALSA: usb-audio: Add mixer quirk for RME Digiface USB
    
    [ Upstream commit 611a96f6acf2e74fe28cb90908a9c183862348ce ]
    
    Implement sync, output format, and input status mixer controls, to allow
    the interface to be used as a straight ADAT/SPDIF (+ Headphones) I/O
    interface.
    
    This does not implement the matrix mixer, output gain controls, or input
    level meter feedback. The full mixer interface is only really usable
    using a dedicated userspace control app (there are too many mixer nodes
    for alsamixer to be usable), so for now we leave it up to userspace to
    directly control these features using raw USB control messages. This is
    similar to how it's done with some FireWire interfaces (ffado-mixer).
    
    Signed-off-by: Asahi Lina <lina@asahilina.net>
    Link: https://patch.msgid.link/20240903-rme-digiface-v2-2-71b06c912e97@asahilina.net
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: usb-audio: Add native DSD support for Luxman D-08u [+ + +]
Author: Jan Lalinsky <lalinsky@c4.cz>
Date:   Thu Oct 3 05:08:11 2024 +0200

    ALSA: usb-audio: Add native DSD support for Luxman D-08u
    
    commit 6b0bde5d8d4078ca5feec72fd2d828f0e5cf115d upstream.
    
    Add native DSD support for Luxman D-08u DAC, by adding the PID/VID 1852:5062.
    This makes DSD playback work and also improves sound quality when playing
    PCM files: the crackling sounds are gone.
    
    Signed-off-by: Jan Lalinsky <lalinsky@c4.cz>
    Cc: <stable@vger.kernel.org>
    Link: https://patch.msgid.link/20241003030811.2655735-1-lalinsky@c4.cz
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: usb-audio: Add quirk for RME Digiface USB [+ + +]
Author: Cyan Nyan <cyan.vtb@gmail.com>
Date:   Tue Sep 3 19:52:29 2024 +0900

    ALSA: usb-audio: Add quirk for RME Digiface USB
    
    [ Upstream commit c032044e9672408c534d64a6df2b1ba14449e948 ]
    
    Add trivial support for audio streaming on the RME Digiface USB. Binds
    only to the first interface to allow userspace to directly drive the
    complex I/O and matrix mixer controls.
    
    Signed-off-by: Cyan Nyan <cyan.vtb@gmail.com>
    [Lina: Added 2x/4x sample rate support & boot/format quirks]
    Co-developed-by: Asahi Lina <lina@asahilina.net>
    Signed-off-by: Asahi Lina <lina@asahilina.net>
    Link: https://patch.msgid.link/20240903-rme-digiface-v2-1-71b06c912e97@asahilina.net
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: usb-audio: Define macros for quirk table entries [+ + +]
Author: Takashi Iwai <tiwai@suse.de>
Date:   Wed Aug 14 15:48:41 2024 +0200

    ALSA: usb-audio: Define macros for quirk table entries
    
    [ Upstream commit 0c3ad39b791c2ecf718afcaca30e5ceafa939d5c ]
    
    Many entries in the USB-audio quirk tables have relatively complex
    expressions.  For improving the readability, introduce a few macros.
    Those are applied in the following patch.
    
    Link: https://patch.msgid.link/20240814134844.2726-2-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: usb-audio: Replace complex quirk lines with macros [+ + +]
Author: Takashi Iwai <tiwai@suse.de>
Date:   Wed Aug 14 15:48:42 2024 +0200

    ALSA: usb-audio: Replace complex quirk lines with macros
    
    [ Upstream commit d79e13f8e8abb5cd3a2a0f9fc9bc3fc750c5b06f ]
    
    Apply the newly introduced macros to reduce the complex expressions
    and casts in the quirk table definitions.  It results in a significant
    code reduction, too.
    
    There should be no functional changes.
    
    Link: https://patch.msgid.link/20240814134844.2726-3-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
aoe: fix the potential use-after-free problem in more places [+ + +]
Author: Chun-Yi Lee <joeyli.kernel@gmail.com>
Date:   Wed Oct 2 11:54:58 2024 +0800

    aoe: fix the potential use-after-free problem in more places
    
    commit 6d6e54fc71ad1ab0a87047fd9c211e75d86084a3 upstream.
    
    To fix CVE-2023-6270, f98364e92662 ("aoe: fix the potential
    use-after-free problem in aoecmd_cfg_pkts") makes tx() call dev_put()
    instead of doing so in aoecmd_cfg_pkts(). This prevents tx() from
    running into a use-after-free.
    
    Nicolai Stange then found more places in aoe with a potential
    use-after-free problem involving tx(), e.g. revalidate(),
    aoecmd_ata_rw(), resend(), probe() and aoecmd_cfg_rsp(). Those
    functions also use aoenet_xmit() to push packets to the tx queue, so
    they should also use dev_hold() to increase the refcnt of skb->dev.
    
    In addition, moving dev_put() to tx() causes the refcnt of skb->dev
    to drop to a negative value, because the corresponding dev_hold() is
    not called in revalidate(), aoecmd_ata_rw(), resend(), probe(), and
    aoecmd_cfg_rsp(). This patch fixes this issue.
    
    Cc: stable@vger.kernel.org
    Link: https://nvd.nist.gov/vuln/detail/CVE-2023-6270
    Fixes: f98364e92662 ("aoe: fix the potential use-after-free problem in aoecmd_cfg_pkts")
    Reported-by: Nicolai Stange <nstange@suse.com>
    Signed-off-by: Chun-Yi Lee <jlee@suse.com>
    Link: https://lore.kernel.org/stable/20240624064418.27043-1-jlee%40suse.com
    Link: https://lore.kernel.org/r/20241002035458.24401-1-jlee@suse.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
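
    The reference-count imbalance described above can be modelled with a
    toy counter. Everything in this sketch (fake_netdev, fake_skb, and the
    fake_* helpers) is a hypothetical userspace stand-in for the kernel's
    net_device, sk_buff, dev_hold() and dev_put(); it only shows why every
    path that queues a packet for tx() must take its own reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct net_device and struct sk_buff. */
struct fake_netdev {
    int refcnt;
};

struct fake_skb {
    struct fake_netdev *dev;
};

static void fake_dev_hold(struct fake_netdev *dev)
{
    dev->refcnt++;
}

static void fake_dev_put(struct fake_netdev *dev)
{
    dev->refcnt--;
}

/* Consumer: after the earlier fix, tx() drops one reference per packet. */
void fake_tx(struct fake_skb *skb)
{
    assert(skb != NULL && skb->dev != NULL);
    /* ... the packet would be transmitted here ... */
    fake_dev_put(skb->dev);
}

/* Producer without the fix: queues a packet but takes no reference. */
void queue_without_hold(struct fake_skb *skb)
{
    fake_tx(skb);
}

/* Producer with the fix: takes a reference that tx() later drops. */
void queue_with_hold(struct fake_skb *skb)
{
    fake_dev_hold(skb->dev);
    fake_tx(skb);
}
```

    Starting from a count of 1, the balanced producer leaves the count at
    1, while the unbalanced one drives it to 0 and, on the next packet,
    below zero.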

 
arm64: cputype: Add Neoverse-N3 definitions [+ + +]
Author: Mark Rutland <mark.rutland@arm.com>
Date:   Mon Oct 7 13:04:19 2024 +0100

    arm64: cputype: Add Neoverse-N3 definitions
    
    [ Upstream commit 924725707d80bc2588cefafef76ff3f164d299bc ]
    
    Add cputype definitions for Neoverse-N3. These will be used for errata
    detection in subsequent patches.
    
    These values can be found in Table A-261 ("MIDR_EL1 bit descriptions")
    in issue 02 of the Neoverse-N3 TRM, which can be found at:
    
      https://developer.arm.com/documentation/107997/0000/?lang=en
    
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Cc: James Morse <james.morse@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Link: https://lore.kernel.org/r/20240930111705.3352047-2-mark.rutland@arm.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    [ Mark: trivial backport ]
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

arm64: errata: Expand speculative SSBS workaround once more [+ + +]
Author: Mark Rutland <mark.rutland@arm.com>
Date:   Mon Oct 7 13:04:20 2024 +0100

    arm64: errata: Expand speculative SSBS workaround once more
    
    [ Upstream commit 081eb7932c2b244f63317a982c5e3990e2c7fbdd ]
    
    A number of Arm Ltd CPUs suffer from errata whereby an MSR to the SSBS
    special-purpose register does not affect subsequent speculative
    instructions, permitting speculative store bypassing for a window of
    time.
    
    We worked around this for a number of CPUs in commits:
    
    * 7187bb7d0b5c7dfa ("arm64: errata: Add workaround for Arm errata 3194386 and 3312417")
    * 75b3c43eab594bfb ("arm64: errata: Expand speculative SSBS workaround")
    * 145502cac7ea70b5 ("arm64: errata: Expand speculative SSBS workaround (again)")
    
    Since then, a (hopefully final) batch of updates has been published,
    with two more affected CPUs. For the affected CPUs the existing
    mitigation is sufficient, as described in their respective Software
    Developer Errata Notice (SDEN) documents:
    
    * Cortex-A715 (MP148) SDEN v15.0, erratum 3456084
      https://developer.arm.com/documentation/SDEN-2148827/1500/
    
    * Neoverse-N3 (MP195) SDEN v5.0, erratum 3456111
      https://developer.arm.com/documentation/SDEN-3050973/0500/
    
    Enable the existing mitigation by adding the relevant MIDRs to
    erratum_spec_ssbs_list, and update silicon-errata.rst and the
    Kconfig text accordingly.
    
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Cc: James Morse <james.morse@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Link: https://lore.kernel.org/r/20240930111705.3352047-3-mark.rutland@arm.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    [ Mark: trivial backport ]
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

arm64: fix selection of HAVE_DYNAMIC_FTRACE_WITH_ARGS [+ + +]
Author: Mark Rutland <mark.rutland@arm.com>
Date:   Mon Sep 30 13:04:48 2024 +0100

    arm64: fix selection of HAVE_DYNAMIC_FTRACE_WITH_ARGS
    
    commit b3d6121eaeb22aee8a02f46706745b1968cc0292 upstream.
    
    The Kconfig logic to select HAVE_DYNAMIC_FTRACE_WITH_ARGS is incorrect,
    and HAVE_DYNAMIC_FTRACE_WITH_ARGS may be selected when it is not
    supported by the combination of clang and GNU LD, resulting in link-time
    errors:
    
      aarch64-linux-gnu-ld: .init.data has both ordered [`__patchable_function_entries' in init/main.o] and unordered [`.meminit.data' in mm/sparse.o] sections
      aarch64-linux-gnu-ld: final link failed: bad value
    
    ... which can be seen when building with CC=clang using a binutils
    version older than 2.36.
    
    We originally fixed that in commit:
    
      45bd8951806eb5e8 ("arm64: Improve HAVE_DYNAMIC_FTRACE_WITH_REGS selection for clang")
    
    ... by splitting the "select HAVE_DYNAMIC_FTRACE_WITH_ARGS" statement
    into separate CLANG_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS and
    GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS options which individually select
    HAVE_DYNAMIC_FTRACE_WITH_ARGS.
    
    Subsequently we accidentally re-introduced the common "select
    HAVE_DYNAMIC_FTRACE_WITH_ARGS" statement in commit:
    
      26299b3f6ba26bfc ("ftrace: arm64: move from REGS to ARGS")
    
    ... then we removed it again in commit:
    
      68a63a412d18bd2e ("arm64: Fix build with CC=clang, CONFIG_FTRACE=y and CONFIG_STACK_TRACER=y")
    
    ... then we accidentally re-introduced it again in commit:
    
      2aa6ac03516d078c ("arm64: ftrace: Add direct call support")
    
    Fix this for the third time by keeping the unified select statement and
    making this depend on either GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS or
    CLANG_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS. This is more consistent with
    usual style and less likely to go wrong in future.
    
    Fixes: 2aa6ac03516d ("arm64: ftrace: Add direct call support")
    Cc: <stable@vger.kernel.org> # 6.4.x
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Link: https://lore.kernel.org/r/20240930120448.3352564-1-mark.rutland@arm.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: Subscribe Microsoft Azure Cobalt 100 to erratum 3194386 [+ + +]
Author: Easwar Hariharan <eahariha@linux.microsoft.com>
Date:   Thu Oct 3 22:52:35 2024 +0000

    arm64: Subscribe Microsoft Azure Cobalt 100 to erratum 3194386
    
    commit 3eddb108abe3de6723cc4b77e8558ce1b3047987 upstream.
    
    Add the Microsoft Azure Cobalt 100 CPU to the list of CPUs suffering
    from erratum 3194386 added in commit 75b3c43eab59 ("arm64: errata:
    Expand speculative SSBS workaround")
    
    CC: Mark Rutland <mark.rutland@arm.com>
    CC: James Morse <james.morse@arm.com>
    CC: Will Deacon <will@kernel.org>
    CC: stable@vger.kernel.org # 6.6+
    Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
    Link: https://lore.kernel.org/r/20241003225239.321774-1-eahariha@linux.microsoft.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec() [+ + +]
Author: Fares Mehanna <faresx@amazon.de>
Date:   Mon Sep 2 16:33:08 2024 +0000

    arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec()
    
    [ Upstream commit 7eced90b202d63cdc1b9b11b1353adb1389830f9 ]
    
    The reasons for PTEs in the kernel direct map to be marked invalid are not
    limited to kfence / debug pagealloc machinery. In particular,
    memfd_secret() also steals pages with set_direct_map_invalid_noflush().
    
    When building the transitional page tables for kexec from the current
    kernel's page tables, those pages need to become regular writable pages,
    otherwise, if the relocation places kexec segments over such pages, a fault
    will occur during kexec, leading to host going dark during kexec.
    
    This patch addresses the kexec issue by marking any PTE as valid if it is
    not none. While this fixes the kexec crash, it does not address the
    security concern that if processes owning secret memory are not terminated
    before kexec, the secret content will be mapped in the new kernel without
    being scrubbed.
    
    Suggested-by: Jan H. Schönherr <jschoenh@amazon.de>
    Signed-off-by: Fares Mehanna <faresx@amazon.de>
    Link: https://lore.kernel.org/r/20240902163309.97113-1-faresx@amazon.de
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ASoC: atmel: mchp-pdmc: Skip ALSA restoration if substream runtime is uninitialized [+ + +]
Author: Andrei Simion <andrei.simion@microchip.com>
Date:   Tue Sep 24 11:12:38 2024 +0300

    ASoC: atmel: mchp-pdmc: Skip ALSA restoration if substream runtime is uninitialized
    
    [ Upstream commit 09cfc6a532d249a51d3af5022d37ebbe9c3d31f6 ]
    
    Update the driver to prevent alsa-restore.service from failing when
    reading data from /var/lib/alsa/asound.state at boot. Ensure that the
    restoration of ALSA mixer configurations is skipped if substream->runtime
    is NULL.
    
    Fixes: 50291652af52 ("ASoC: atmel: mchp-pdmc: add PDMC driver")
    Signed-off-by: Andrei Simion <andrei.simion@microchip.com>
    Link: https://patch.msgid.link/20240924081237.50046-1-andrei.simion@microchip.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: codecs: wsa883x: Handle reading version failure [+ + +]
Author: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Date:   Wed Jul 10 15:52:31 2024 +0200

    ASoC: codecs: wsa883x: Handle reading version failure
    
    [ Upstream commit 2fbf16992e5aa14acf0441320033a01a32309ded ]
    
    If reading version and variant from registers fails (which is unlikely
    but possible, because it is a read over bus), the driver will proceed
    and perform device configuration based on uninitialized stack variables.
    Handle it a bit better: bail out without doing any init and fail the
    update status SoundWire callback.
    
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Link: https://patch.msgid.link/20240710-asoc-wsa88xx-version-v1-2-f1c54966ccde@linaro.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
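
    A minimal sketch of the pattern described above: check the result of
    the register read before the value is used. The fake_regmap type and
    both functions are hypothetical userspace stand-ins, not the driver's
    code, and -EIO is only an example error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a register map reached over a bus. */
struct fake_regmap {
    int fail;               /* non-zero: the bus read fails */
    unsigned int value;     /* value returned by a successful read */
};

/* Returns 0 and fills *val on success, a negative errno on failure. */
static int fake_reg_read(const struct fake_regmap *map, unsigned int *val)
{
    if (map->fail)
        return -EIO;
    *val = map->value;
    return 0;
}

/*
 * Configure the device from its version register.  On a failed read the
 * function bails out before 'version' (still uninitialized) is touched.
 */
int init_from_version(const struct fake_regmap *map, unsigned int *configured)
{
    unsigned int version;   /* only valid after a successful read */
    int ret;

    assert(map != NULL && configured != NULL);
    ret = fake_reg_read(map, &version);
    if (ret)
        return ret;         /* no configuration from stack garbage */

    *configured = version;
    return 0;
}
```

    The caller sees the error and the previously configured state is left
    untouched.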

ASoC: imx-card: Set card.owner to avoid a warning calltrace if SND=m [+ + +]
Author: Hui Wang <hui.wang@canonical.com>
Date:   Wed Oct 2 10:56:59 2024 +0800

    ASoC: imx-card: Set card.owner to avoid a warning calltrace if SND=m
    
    [ Upstream commit 47d7d3fd72afc7dcd548806291793ee6f3848215 ]
    
    In most Linux distribution kernels, SND is set to m. In that
    case, booting the kernel on the i.MX8MP EVK board produces a
    warning calltrace like the one below:
     Call trace:
     snd_card_init+0x484/0x4cc [snd]
     snd_card_new+0x70/0xa8 [snd]
     snd_soc_bind_card+0x310/0xbd0 [snd_soc_core]
     snd_soc_register_card+0xf0/0x108 [snd_soc_core]
     devm_snd_soc_register_card+0x4c/0xa4 [snd_soc_core]
    
    This is because card.owner is not set, which makes
    snd_card_init() raise the warning.
    
    Fixes: aa736700f42f ("ASoC: imx-card: Add imx-card machine driver")
    Signed-off-by: Hui Wang <hui.wang@canonical.com>
    Link: https://patch.msgid.link/20241002025659.723544-1-hui.wang@canonical.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: Intel: boards: always check the result of acpi_dev_get_first_match_dev() [+ + +]
Author: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
Date:   Tue Aug 27 20:32:01 2024 +0800

    ASoC: Intel: boards: always check the result of acpi_dev_get_first_match_dev()
    
    [ Upstream commit 14e91ddd5c02d8c3e5a682ebfa0546352b459911 ]
    
    The code seems mostly copy-pasted, with some machine drivers
    forgetting to test if the 'adev' result is NULL.
    
    Add this check when missing, and use -ENOENT consistently as an error
    code.
    
    Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
    Closes: https://lore.kernel.org/alsa-devel/918944d2-3d00-465e-a9d1-5d57fc966113@stanley.mountain/T/#u
    Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
    Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
    Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
    Link: https://patch.msgid.link/20240827123215.258859-4-yung-chuan.liao@linux.intel.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
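
    The check added by this patch follows a common lookup pattern, sketched
    here in userspace C. The fake_acpi_dev type, the device table, the
    "HYPO000x" IDs and both functions are hypothetical; only the NULL test
    and the -ENOENT return value come from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct acpi_device. */
struct fake_acpi_dev {
    const char *hid;
};

/* Hypothetical table of devices present on the system. */
static struct fake_acpi_dev known_devices[] = {
    { "HYPO0001" },
    { "HYPO0002" },
};

/* Returns the first device matching 'hid', or NULL when none exists. */
static struct fake_acpi_dev *fake_get_first_match_dev(const char *hid)
{
    size_t i;

    for (i = 0; i < sizeof(known_devices) / sizeof(known_devices[0]); i++) {
        if (strcmp(known_devices[i].hid, hid) == 0)
            return &known_devices[i];
    }
    return NULL;
}

/* Probe helper: tests the lookup result before dereferencing it. */
int probe_codec(const char *hid, const char **name_out)
{
    struct fake_acpi_dev *adev;

    assert(hid != NULL && name_out != NULL);
    adev = fake_get_first_match_dev(hid);
    if (!adev)
        return -ENOENT;     /* consistent error code for "no such device" */

    *name_out = adev->hid;
    return 0;
}
```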

ASoC: Intel: soc-acpi-intel-rpl-match: add missing empty item [+ + +]
Author: Bard Liao <yung-chuan.liao@linux.intel.com>
Date:   Tue Oct 1 14:17:37 2024 +0800

    ASoC: Intel: soc-acpi-intel-rpl-match: add missing empty item
    
    [ Upstream commit 5afc29ba44fdd1bcbad4e07246c395d946301580 ]
    
    There is no links_num in struct snd_soc_acpi_mach {}, and we test
    !link->num_adr as a condition to end the loop in hda_sdw_machine_select().
    So an empty item in struct snd_soc_acpi_link_adr array is required.
    
    Fixes: 65ab45b90656 ("ASoC: Intel: soc-acpi: Add match entries for some cs42l43 laptops")
    Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
    Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
    Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
    Link: https://patch.msgid.link/20241001061738.34854-2-yung-chuan.liao@linux.intel.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
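
    To show why the empty item matters, here is a small sketch of a
    sentinel-terminated array walk. The link_adr struct below is a
    simplified, hypothetical version of struct snd_soc_acpi_link_adr, and
    count_links() is not a kernel function; the loop condition mirrors the
    !link->num_adr test mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical version of struct snd_soc_acpi_link_adr. */
struct link_adr {
    unsigned int mask;
    unsigned int num_adr;   /* 0 marks the terminating (empty) item */
};

/*
 * Count the links in an array that has no separate length field.
 * The walk stops only at an item whose num_adr is 0, so an array
 * without that empty item would be read past its end.
 */
size_t count_links(const struct link_adr *links)
{
    const struct link_adr *link;
    size_t n = 0;

    assert(links != NULL);
    for (link = links; link->num_adr; link++)
        n++;
    return n;
}
```

    An array literal ending in an empty initializer such as {0} provides
    the terminator the loop depends on.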

ASoC: topology: Fix incorrect addressing assignments [+ + +]
Author: Tang Bin <tangbin@cmss.chinamobile.com>
Date:   Sat Sep 14 16:16:08 2024 +0800

    ASoC: topology: Fix incorrect addressing assignments
    
    [ Upstream commit 85109780543b5100aba1d0842b6a7c3142be74d2 ]
    
    The variable 'kc' is handled in the function
    soc_tplg_control_dbytes_create(), and 'kc->private_value'
    is assigned to 'sbe', so in the function soc_tplg_dbytes_create(),
    the right 'sbe' should be 'kc.private_value'. The same logical error
    exists in the function soc_tplg_dmixer_create(), thus fix them.
    
    Fixes: 0867278200f7 ("ASoC: topology: Unify code for creating standalone and widget bytes control")
    Fixes: 4654ca7cc8d6 ("ASoC: topology: Unify code for creating standalone and widget mixer control")
    Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
    Reviewed-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com>
    Link: https://patch.msgid.link/20240914081608.3514-1-tangbin@cmss.chinamobile.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ata: pata_serverworks: Do not use the term blacklist [+ + +]
Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Fri Jul 26 10:58:36 2024 +0900

    ata: pata_serverworks: Do not use the term blacklist
    
    [ Upstream commit 858048568c9e3887d8b19e101ee72f129d65cb15 ]
    
    Let's not use the term blacklist in the function
    serverworks_osb4_filter() documentation comment and rather simply refer
    to what that function looks at: the list of devices with broken UDMA5.
    
    While at it, also constify the values of the csb_bad_ata100 array.
    
    Of note is that all of this should probably be handled using the libata
    quirk mechanism but it is unclear if these UDMA5 quirks are specific
    to this controller only.
    
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Reviewed-by: Niklas Cassel <cassel@kernel.org>
    Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ata: sata_sil: Rename sil_blacklist to sil_quirks [+ + +]
Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Fri Jul 26 11:14:11 2024 +0900

    ata: sata_sil: Rename sil_blacklist to sil_quirks
    
    [ Upstream commit 93b0f9e11ce511353c65b7f924cf5f95bd9c3aba ]
    
    Rename the array sil_blacklist to sil_quirks as this name is more
    neutral and is also consistent with how this driver defines quirks with
    the SIL_QUIRK_XXX flags.
    
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Reviewed-by: Niklas Cassel <cassel@kernel.org>
    Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
blk_iocost: fix more out of bound shifts [+ + +]
Author: Konstantin Ovsepian <ovs@ovs.to>
Date:   Thu Aug 22 08:41:36 2024 -0700

    blk_iocost: fix more out of bound shifts
    
    [ Upstream commit 9bce8005ec0dcb23a58300e8522fe4a31da606fa ]
    
    Running UBSAN recently caught a few out-of-bounds shifts in the
    ioc_forgive_debts() function:
    
    UBSAN: shift-out-of-bounds in block/blk-iocost.c:2142:38
    shift exponent 80 is too large for 64-bit type 'u64' (aka 'unsigned long
    long')
    ...
    UBSAN: shift-out-of-bounds in block/blk-iocost.c:2144:30
    shift exponent 80 is too large for 64-bit type 'u64' (aka 'unsigned long
    long')
    ...
    Call Trace:
    <IRQ>
    dump_stack_lvl+0xca/0x130
    __ubsan_handle_shift_out_of_bounds+0x22c/0x280
    ? __lock_acquire+0x6441/0x7c10
    ioc_timer_fn+0x6cec/0x7750
    ? blk_iocost_init+0x720/0x720
    ? call_timer_fn+0x5d/0x470
    call_timer_fn+0xfa/0x470
    ? blk_iocost_init+0x720/0x720
    __run_timer_base+0x519/0x700
    ...
    
    The actual impact of this issue was not identified, but I propose to fix
    the undefined behaviour.
    The proposed fix to prevent those out-of-bounds shifts consists of
    precalculating the exponent before using it in the shift operations, by
    taking the minimum of the actual exponent and the maximum possible number
    of bits.
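    
    The clamping idea described above can be sketched in user-space C as
    follows. This is only an illustration, not the kernel change itself: the
    helper name shr_clamped() and the limit of 63 are assumptions chosen for
    a 64-bit value.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's code: shift right by a clamped
 * exponent so the shift count never reaches the width of the type.
 */
static uint64_t shr_clamped(uint64_t value, uint64_t exponent)
{
	const uint64_t max_shift = 63; /* largest defined shift for 64 bits */
	uint64_t shift = exponent < max_shift ? exponent : max_shift;

	return value >> shift;
}
```

    With an exponent of 80, as in the UBSAN report, the shift count is
    reduced to 63 before the shift is performed.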
    
    Reported-by: Breno Leitao <leitao@debian.org>
    Signed-off-by: Konstantin Ovsepian <ovs@ovs.to>
    Acked-by: Tejun Heo <tj@kernel.org>
    Link: https://lore.kernel.org/r/20240822154137.2627818-1-ovs@ovs.to
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
block: fix integer overflow in BLKSECDISCARD [+ + +]
Author: Alexey Dobriyan <adobriyan@gmail.com>
Date:   Tue Sep 3 22:48:19 2024 +0300

    block: fix integer overflow in BLKSECDISCARD
    
    [ Upstream commit 697ba0b6ec4ae04afb67d3911799b5e2043b4455 ]
    
    I independently rediscovered
    
            commit 22d24a544b0d49bbcbd61c8c0eaf77d3c9297155
            block: fix overflow in blk_ioctl_discard()
    
    but for secure erase.
    
    Same problem:
    
            uint64_t r[2] = {512, 18446744073709551104ULL};
            ioctl(fd, BLKSECDISCARD, r);
    
    will enter near infinite loop inside blkdev_issue_secure_erase():
    
            a.out: attempt to access beyond end of device
            loop0: rw=5, sector=3399043073, nr_sectors = 1024 limit=2048
            bio_check_eod: 3286214 callbacks suppressed
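    
    The wrap-around in the reproducer can be illustrated with a small
    user-space sketch. The helper discard_range_ok() and its parameters are
    hypothetical, and __builtin_add_overflow() is a GCC/Clang builtin used
    here for brevity; the actual kernel fix may be structured differently.
    With start = 512 and len = 18446744073709551104, the sum is exactly 2^64,
    which wraps to 0 in a 64-bit variable, so a plain "start + len <= size"
    comparison would pass.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check, not the kernel's code: accept [start, start + len)
 * only if the sum does not wrap and the end stays within the device.
 */
static bool discard_range_ok(uint64_t start, uint64_t len, uint64_t dev_bytes)
{
	uint64_t end;

	/* GCC/Clang builtin: returns true when start + len overflows. */
	if (__builtin_add_overflow(start, len, &end))
		return false;

	return end <= dev_bytes;
}
```

    For the 2048-sector loop device in the log above (1048576 bytes), the
    reproducer's range is rejected, while an ordinary in-bounds range is
    accepted.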
    
    Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
    Link: https://lore.kernel.org/r/9e64057f-650a-46d1-b9f7-34af391536ef@p183
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
Bluetooth: btmrvl: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]
Author: Jinjie Ruan <ruanjinjie@huawei.com>
Date:   Thu Sep 12 11:12:04 2024 +0800

    Bluetooth: btmrvl: Use IRQF_NO_AUTOEN flag in request_irq()
    
    [ Upstream commit 7b1ab460592ca818e7b52f27cd3ec86af79220d1 ]
    
    Calling disable_irq() after request_irq() still leaves a time gap in
    which interrupts can arrive. Calling request_irq() with the
    IRQF_NO_AUTOEN flag keeps the IRQ from being auto-enabled when it is
    requested.
    
    Fixes: bb7f4f0bcee6 ("btmrvl: add platform specific wakeup interrupt support")
    Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: btrtl: Set msft ext address filter quirk for RTL8852B [+ + +]
Author: Hilda Wu <hildawu@realtek.com>
Date:   Thu Aug 29 16:40:05 2024 +0800

    Bluetooth: btrtl: Set msft ext address filter quirk for RTL8852B
    
    [ Upstream commit 9a0570948c5def5c59e588dc0e009ed850a1f5a1 ]
    
    To track multiple devices concurrently with a condition, this patch
    enables the HCI_QUIRK_USE_MSFT_EXT_ADDRESS_FILTER quirk on the RTL8852B
    controller.
    
    The quirk setting is based on commit 9e14606d8f38 ("Bluetooth: msft:
    Extended monitor tracking by address filter")
    
    With this setting, when a pattern monitor detects a device, this
    feature issues an address monitor for tracking that device, letting the
    original pattern monitor keep monitoring new devices.
    
    Signed-off-by: Hilda Wu <hildawu@realtek.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: btusb: Add Realtek RTL8852C support ID 0x0489:0xe122 [+ + +]
Author: Hilda Wu <hildawu@realtek.com>
Date:   Fri Aug 16 16:58:22 2024 +0800

    Bluetooth: btusb: Add Realtek RTL8852C support ID 0x0489:0xe122
    
    [ Upstream commit bdf9557f70e7512bb2f754abf90d9e9958745316 ]
    
    Add the support ID (0x0489, 0xe122) to the usb_device_id table for
    Realtek RTL8852C.
    
    The device info from /sys/kernel/debug/usb/devices is as below.
    
    T:  Bus=03 Lev=01 Prnt=01 Port=02 Cnt=01 Dev#=  2 Spd=12   MxCh= 0
    D:  Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
    P:  Vendor=0489 ProdID=e122 Rev= 0.00
    S:  Manufacturer=Realtek
    S:  Product=Bluetooth Radio
    S:  SerialNumber=00e04c000001
    C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
    E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
    E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
    I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
    I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
    I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
    I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
    I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
    I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
    
    Signed-off-by: Hilda Wu <hildawu@realtek.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: hci_event: Align BR/EDR JUST_WORKS paring with LE [+ + +]
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date:   Thu Sep 12 12:17:00 2024 -0400

    Bluetooth: hci_event: Align BR/EDR JUST_WORKS paring with LE
    
    commit b25e11f978b63cb7857890edb3a698599cddb10e upstream.
    
    This aligns the BR/EDR JUST_WORKS method with LE, which since 92516cd97fd4
    ("Bluetooth: Always request for user confirmation for Just Works")
    always requests user confirmation with confirm_hint set, since the
    likes of bluetoothd have a dedicated policy around the JUST_WORKS method
    (e.g. main.conf:JustWorksRepairing).
    
    CVE: CVE-2024-8805
    Cc: stable@vger.kernel.org
    Fixes: ba15a58b179e ("Bluetooth: Fix SSP acceptor just-works confirmation without MITM")
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Tested-by: Kiran K <kiran.k@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bluetooth: L2CAP: Fix uaf in l2cap_connect [+ + +]
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date:   Mon Sep 23 12:47:39 2024 -0400

    Bluetooth: L2CAP: Fix uaf in l2cap_connect
    
    [ Upstream commit 333b4fd11e89b29c84c269123f871883a30be586 ]
    
    [Syzbot reported]
    BUG: KASAN: slab-use-after-free in l2cap_connect.constprop.0+0x10d8/0x1270 net/bluetooth/l2cap_core.c:3949
    Read of size 8 at addr ffff8880241e9800 by task kworker/u9:0/54
    
    CPU: 0 UID: 0 PID: 54 Comm: kworker/u9:0 Not tainted 6.11.0-rc6-syzkaller-00268-g788220eee30d #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
    Workqueue: hci2 hci_rx_work
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:93 [inline]
     dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:119
     print_address_description mm/kasan/report.c:377 [inline]
     print_report+0xc3/0x620 mm/kasan/report.c:488
     kasan_report+0xd9/0x110 mm/kasan/report.c:601
     l2cap_connect.constprop.0+0x10d8/0x1270 net/bluetooth/l2cap_core.c:3949
     l2cap_connect_req net/bluetooth/l2cap_core.c:4080 [inline]
     l2cap_bredr_sig_cmd net/bluetooth/l2cap_core.c:4772 [inline]
     l2cap_sig_channel net/bluetooth/l2cap_core.c:5543 [inline]
     l2cap_recv_frame+0xf0b/0x8eb0 net/bluetooth/l2cap_core.c:6825
     l2cap_recv_acldata+0x9b4/0xb70 net/bluetooth/l2cap_core.c:7514
     hci_acldata_packet net/bluetooth/hci_core.c:3791 [inline]
     hci_rx_work+0xaab/0x1610 net/bluetooth/hci_core.c:4028
     process_one_work+0x9c5/0x1b40 kernel/workqueue.c:3231
     process_scheduled_works kernel/workqueue.c:3312 [inline]
     worker_thread+0x6c8/0xed0 kernel/workqueue.c:3389
     kthread+0x2c1/0x3a0 kernel/kthread.c:389
     ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
     ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    ...
    
    Freed by task 5245:
     kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
     kasan_save_track+0x14/0x30 mm/kasan/common.c:68
     kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:579
     poison_slab_object+0xf7/0x160 mm/kasan/common.c:240
     __kasan_slab_free+0x32/0x50 mm/kasan/common.c:256
     kasan_slab_free include/linux/kasan.h:184 [inline]
     slab_free_hook mm/slub.c:2256 [inline]
     slab_free mm/slub.c:4477 [inline]
     kfree+0x12a/0x3b0 mm/slub.c:4598
     l2cap_conn_free net/bluetooth/l2cap_core.c:1810 [inline]
     kref_put include/linux/kref.h:65 [inline]
     l2cap_conn_put net/bluetooth/l2cap_core.c:1822 [inline]
     l2cap_conn_del+0x59d/0x730 net/bluetooth/l2cap_core.c:1802
     l2cap_connect_cfm+0x9e6/0xf80 net/bluetooth/l2cap_core.c:7241
     hci_connect_cfm include/net/bluetooth/hci_core.h:1960 [inline]
     hci_conn_failed+0x1c3/0x370 net/bluetooth/hci_conn.c:1265
     hci_abort_conn_sync+0x75a/0xb50 net/bluetooth/hci_sync.c:5583
     abort_conn_sync+0x197/0x360 net/bluetooth/hci_conn.c:2917
     hci_cmd_sync_work+0x1a4/0x410 net/bluetooth/hci_sync.c:328
     process_one_work+0x9c5/0x1b40 kernel/workqueue.c:3231
     process_scheduled_works kernel/workqueue.c:3312 [inline]
     worker_thread+0x6c8/0xed0 kernel/workqueue.c:3389
     kthread+0x2c1/0x3a0 kernel/kthread.c:389
     ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
     ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    
    Reported-by: syzbot+c12e2f941af1feb5632c@syzkaller.appspotmail.com
    Tested-by: syzbot+c12e2f941af1feb5632c@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=c12e2f941af1feb5632c
    Fixes: 7b064edae38d ("Bluetooth: Fix authentication if acl data comes before remote feature evt")
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: MGMT: Fix possible crash on mgmt_index_removed [+ + +]
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date:   Thu Sep 12 12:34:42 2024 -0400

    Bluetooth: MGMT: Fix possible crash on mgmt_index_removed
    
    [ Upstream commit f53e1c9c726d83092167f2226f32bd3b73f26c21 ]
    
    If mgmt_index_removed is called while there are commands queued on
    cmd_sync it could lead to crashes like the trace below:
    
    0x0000053D: __list_del_entry_valid_or_report+0x98/0xdc
    0x0000053D: mgmt_pending_remove+0x18/0x58 [bluetooth]
    0x0000053E: mgmt_remove_adv_monitor_complete+0x80/0x108 [bluetooth]
    0x0000053E: hci_cmd_sync_work+0xbc/0x164 [bluetooth]
    
    So while handling mgmt_index_removed this attempts to dequeue
    commands passed as user_data to cmd_sync.
    
    Fixes: 7cf5c2978f23 ("Bluetooth: hci_sync: Refactor remove Adv Monitor")
    Reported-by: jiaymao <quic_jiaymao@quicinc.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bnxt_en: Extend maximum length of version string by 1 byte [+ + +]
Author: Simon Horman <horms@kernel.org>
Date:   Tue Aug 13 15:32:55 2024 +0100

    bnxt_en: Extend maximum length of version string by 1 byte
    
    [ Upstream commit ffff7ee843c351ce71d6e0d52f0f20bea35e18c9 ]
    
    This corrects an out-by-one error in the maximum length of the package
    version string. The size argument of snprintf includes space for the
    trailing '\0' byte, so there is no need to allow extra space for it by
    reducing the value of the size argument by 1.
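    
    The snprintf() behaviour this relies on can be shown with a short
    user-space sketch. The helper format_version() and its format string are
    hypothetical and unrelated to the driver's actual code; the point is only
    that the size argument already accounts for the terminating '\0'.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a version string into dst, passing the full
 * buffer size to snprintf(), which already reserves one byte for '\0'.
 * Returns the length of the string actually stored.
 */
static size_t format_version(char *dst, size_t dst_size, int major, int minor)
{
	snprintf(dst, dst_size, "%d.%d", major, minor);
	return strlen(dst);
}
```

    With a 4-byte buffer, passing the full size stores "1.2" intact, whereas
    passing the size minus one truncates the string to "1." and leaves the
    last byte of the buffer unused.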
    
    Found by inspection.
    Compile tested only.
    
    Signed-off-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Michael Chan <michael.chan@broadcom.com>
    Link: https://patch.msgid.link/20240813-bnxt-str-v2-1-872050a157e7@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bpf: Fix a sdiv overflow issue [+ + +]
Author: Yonghong Song <yonghong.song@linux.dev>
Date:   Fri Sep 13 08:03:26 2024 -0700

    bpf: Fix a sdiv overflow issue
    
    [ Upstream commit 7dd34d7b7dcf9309fc6224caf4dd5b35bedddcb7 ]
    
    Zac Ecob reported a problem where a bpf program may cause kernel crash due
    to the following error:
      Oops: divide error: 0000 [#1] PREEMPT SMP KASAN PTI
    
    The failure is due to the signed divide below:
      LLONG_MIN/-1 where LLONG_MIN equals -9,223,372,036,854,775,808.
    LLONG_MIN/-1 is supposed to give the positive number 9,223,372,036,854,775,808,
    but that is impossible since, for a 64-bit signed integer, the maximum
    positive number is 9,223,372,036,854,775,807. On x86_64, LLONG_MIN/-1 will
    cause a kernel exception. On arm64, the result for LLONG_MIN/-1 is
    LLONG_MIN.
    
    Further investigation found that all of the following sdiv/smod cases may
    trigger an exception when a bpf program is running on an x86_64 platform:
      - LLONG_MIN/-1 for 64bit operation
      - INT_MIN/-1 for 32bit operation
      - LLONG_MIN%-1 for 64bit operation
      - INT_MIN%-1 for 32bit operation
    where -1 can be an immediate or in a register.
    
    On arm64, there are no exceptions:
      - LLONG_MIN/-1 = LLONG_MIN
      - INT_MIN/-1 = INT_MIN
      - LLONG_MIN%-1 = 0
      - INT_MIN%-1 = 0
    where -1 can be an immediate or in a register.
    
    Insn patching is needed to handle the above cases, and the patched code
    produces results aligned with the arm64 results above. Below is pseudocode
    that handles sdiv/smod exceptions for both divisor -1 and divisor 0, where
    the divisor is stored in a register.
    
    sdiv:
          tmp = rX
          tmp += 1 /* [-1, 0] -> [0, 1] */
          if tmp >(unsigned) 1 goto L2
          if tmp == 0 goto L1
          rY = 0
      L1:
          rY = -rY;
          goto L3
      L2:
          rY /= rX
      L3:
    
    smod:
          tmp = rX
          tmp += 1 /* [-1, 0] -> [0, 1] */
          if tmp >(unsigned) 1 goto L1
          if tmp == 1 (is64 ? goto L2 : goto L3)
          rY = 0;
          goto L2
      L1:
          rY %= rX
      L2:
          goto L4  // only when !is64
      L3:
          wY = wY  // only when !is64
      L4:
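    
    The 64-bit results produced by the patched instructions can be written
    out as plain C, as a rough sketch. The function names are hypothetical,
    the 32-bit variants are omitted, and the sketch mirrors only the outcomes
    of the pseudocode above (divisor 0 yields 0 for sdiv and leaves the
    dividend unchanged for smod; divisor -1 yields the negated dividend for
    sdiv and 0 for smod), not the instruction sequence that is emitted.

```c
#include <stdint.h>

/* Hypothetical name: signed 64-bit division that never traps. */
static int64_t safe_sdiv64(int64_t dividend, int64_t divisor)
{
	if (divisor == 0)
		return 0;
	if (divisor == -1)
		/*
		 * Negate in unsigned arithmetic so INT64_MIN wraps instead of
		 * overflowing; the conversion back to int64_t wraps on GCC
		 * and Clang.
		 */
		return (int64_t)(0 - (uint64_t)dividend);
	return dividend / divisor;
}

/* Hypothetical name: signed 64-bit modulo that never traps. */
static int64_t safe_smod64(int64_t dividend, int64_t divisor)
{
	if (divisor == 0)
		return dividend; /* dividend is left unchanged */
	if (divisor == -1)
		return 0;
	return dividend % divisor;
}
```

    Both special cases are handled before the native division runs, so the
    LLONG_MIN/-1 and divide-by-zero inputs never reach the hardware divide.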
    
      [1] https://lore.kernel.org/bpf/tPJLTEh7S_DxFEqAI2Ji5MBSoZVg7_G-Py2iaZpAaWtM961fFTWtsnlzwvTbzBzaUzwQAoNATXKUlt0LZOFgnDcIyKCswAnAGdUF3LBrhGQ=@protonmail.com/
    
    Reported-by: Zac Ecob <zacecob@protonmail.com>
    Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20240913150326.1187788-1-yonghong.song@linux.dev
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf: Make the pointer returned by iter next method valid [+ + +]
Author: Juntong Deng <juntong.deng@outlook.com>
Date:   Thu Aug 29 21:11:17 2024 +0100

    bpf: Make the pointer returned by iter next method valid
    
    [ Upstream commit 4cc8c50c9abcb2646a7a4fcef3cea5dcb30c06cf ]
    
    Currently we cannot pass the pointer returned by iter next method as
    argument to KF_TRUSTED_ARGS or KF_RCU kfuncs, because the pointer
    returned by iter next method is not "valid".
    
    This patch sets the pointer returned by iter next method to be valid.
    
    This is based on the fact that if the iterator is implemented correctly,
    then the pointer returned from the iter next method should be valid.
    
    This does not make a NULL pointer valid. If the iter next method has the
    KF_RET_NULL flag, then the verifier will require the eBPF program to
    check for a NULL pointer.
    
    A KF_RCU_PROTECTED iterator is a special case: the pointer returned by
    its iter next method should only be valid within an RCU critical section,
    so it should be marked with MEM_RCU, not PTR_TRUSTED.
    
    Another special case is bpf_iter_num_next, which returns a pointer with
    base type PTR_TO_MEM. PTR_TO_MEM should not be combined with type flag
    PTR_TRUSTED (PTR_TO_MEM already means the pointer is valid).
    
    The pointer returned by iter next method of other types of iterators
    is with PTR_TRUSTED.
    
    In addition, this patch adds get_iter_from_state to help us get the
    current iterator from the current state.
    
    Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
    Link: https://lore.kernel.org/r/AM6PR03MB584869F8B448EA1C87B7CDA399962@AM6PR03MB5848.eurprd03.prod.outlook.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bpftool: Fix undefined behavior caused by shifting into the sign bit [+ + +]
Author: Kuan-Wei Chiu <visitorckw@gmail.com>
Date:   Sun Sep 8 22:00:09 2024 +0800

    bpftool: Fix undefined behavior caused by shifting into the sign bit
    
    [ Upstream commit 4cdc0e4ce5e893bc92255f5f734d983012f2bc2e ]
    
    Replace shifts of '1' with '1U' in bitwise operations within
    __show_dev_tc_bpf() to prevent undefined behavior caused by shifting
    into the sign bit of a signed integer. By using '1U', the operations
    are explicitly performed on unsigned integers, avoiding potential
    integer overflow or sign-related issues.
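    
    A minimal sketch of the difference, assuming a 32-bit int: the helper
    bit_mask() is hypothetical and is not taken from bpftool. Shifting the
    signed constant 1 left by 31 would move a bit into the sign position,
    which is undefined, while the unsigned constant 1U makes the same shift
    well defined.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build a single-bit mask from an unsigned constant,
 * so that bit 31 can be set without shifting into a sign bit.
 */
static uint32_t bit_mask(unsigned int bit)
{
	return 1U << bit; /* defined for bit in 0..31 with a 32-bit unsigned int */
}
```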
    
    Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Quentin Monnet <qmo@kernel.org>
    Link: https://lore.kernel.org/bpf/20240908140009.3149781-1-visitorckw@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpftool: Fix undefined behavior in qsort(NULL, 0, ...) [+ + +]
Author: Kuan-Wei Chiu <visitorckw@gmail.com>
Date:   Tue Sep 10 23:02:07 2024 +0800

    bpftool: Fix undefined behavior in qsort(NULL, 0, ...)
    
    [ Upstream commit f04e2ad394e2755d0bb2d858ecb5598718bf00d5 ]
    
    When netfilter has no entry to display, qsort is called with
    qsort(NULL, 0, ...). This results in undefined behavior, as UBSan
    reports:
    
    net.c:827:2: runtime error: null pointer passed as argument 1, which is declared to never be null
    
    Although the C standard does not explicitly state whether calling qsort
    with a NULL pointer when the size is 0 constitutes undefined behavior,
    Section 7.1.4 of the C standard (Use of library functions) mentions:
    
    "Each of the following statements applies unless explicitly stated
    otherwise in the detailed descriptions that follow: If an argument to a
    function has an invalid value (such as a value outside the domain of
    the function, or a pointer outside the address space of the program, or
    a null pointer, or a pointer to non-modifiable storage when the
    corresponding parameter is not const-qualified) or a type (after
    promotion) not expected by a function with variable number of
    arguments, the behavior is undefined."
    
    To avoid this, add an early return when nf_link_info is NULL to prevent
    calling qsort with a NULL pointer.
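    
    The early-return pattern can be sketched as below. The wrapper
    sort_ints() and the comparator cmp_int() are hypothetical and stand in
    for bpftool's own types; only the guard before qsort() reflects the
    approach described above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Three-way comparison of two ints without risking subtraction overflow. */
static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a;
	int y = *(const int *)b;

	return (x > y) - (x < y);
}

/*
 * Hypothetical wrapper: skip qsort() entirely when there is nothing to
 * sort, so a NULL base pointer is never passed to it.
 */
static void sort_ints(int *items, size_t count)
{
	if (!items || count == 0)
		return;

	qsort(items, count, sizeof(*items), cmp_int);
}
```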
    
    Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Reviewed-by: Quentin Monnet <qmo@kernel.org>
    Link: https://lore.kernel.org/bpf/20240910150207.3179306-1-visitorckw@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bridge: mcast: Fail MDB get request on empty entry [+ + +]
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Sun Sep 29 15:36:40 2024 +0300

    bridge: mcast: Fail MDB get request on empty entry
    
    [ Upstream commit 555f45d24ba7cd5527716553031641cdebbe76c7 ]
    
    When user space deletes a port from an MDB entry, the port is removed
    synchronously. If this was the last port in the entry and the entry is
    not joined by the host itself, then the entry is scheduled for deletion
    via a timer.
    
    The above means that it is possible for the MDB get netlink request to
    retrieve an empty entry which is scheduled for deletion. This is
    problematic as after deleting the last port in an entry, user space
    cannot rely on a non-zero return code from the MDB get request as an
    indication that the port was successfully removed.
    
    Fix by returning an error when the entry's port list is empty and the
    entry is not joined by the host.
    
    Fixes: 68b380a395a7 ("bridge: mcast: Add MDB get support")
    Reported-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
    Closes: https://lore.kernel.org/netdev/c92569919307749f879b9482b0f3e125b7d9d2e3.1726480066.git.jamie.bainbridge@gmail.com/
    Tested-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Link: https://patch.msgid.link/20240929123640.558525-1-idosch@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
btrfs: don't readahead the relocation inode on RST [+ + +]
Author: Johannes Thumshirn <jthumshirn@wdc.com>
Date:   Wed Jul 31 22:43:06 2024 +0200

    btrfs: don't readahead the relocation inode on RST
    
    [ Upstream commit 04915240e2c3a018e4c7f23418478d27226c8957 ]
    
    On relocation we're doing readahead on the relocation inode, but if the
    filesystem is backed by a RAID stripe tree we can get ENOENT (e.g. due to
    preallocated extents not being mapped in the RST) from the lookup.
    
    But readahead doesn't handle the error and submits invalid reads to the
    device, causing an assertion in the scatter-gather list code:
    
      BTRFS info (device nvme1n1): balance: start -d -m -s
      BTRFS info (device nvme1n1): relocating block group 6480920576 flags data|raid0
      BTRFS error (device nvme1n1): cannot find raid-stripe for logical [6481928192, 6481969152] devid 2, profile raid0
      ------------[ cut here ]------------
      kernel BUG at include/linux/scatterlist.h:115!
      Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
      CPU: 0 PID: 1012 Comm: btrfs Not tainted 6.10.0-rc7+ #567
      RIP: 0010:__blk_rq_map_sg+0x339/0x4a0
      RSP: 0018:ffffc90001a43820 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffea00045d4802
      RDX: 0000000117520000 RSI: 0000000000000000 RDI: ffff8881027d1000
      RBP: 0000000000003000 R08: ffffea00045d4902 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000001000 R12: ffff8881003d10b8
      R13: ffffc90001a438f0 R14: 0000000000000000 R15: 0000000000003000
      FS:  00007fcc048a6900(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000002cd11000 CR3: 00000001109ea001 CR4: 0000000000370eb0
      Call Trace:
       <TASK>
       ? __die_body.cold+0x14/0x25
       ? die+0x2e/0x50
       ? do_trap+0xca/0x110
       ? do_error_trap+0x65/0x80
       ? __blk_rq_map_sg+0x339/0x4a0
       ? exc_invalid_op+0x50/0x70
       ? __blk_rq_map_sg+0x339/0x4a0
       ? asm_exc_invalid_op+0x1a/0x20
       ? __blk_rq_map_sg+0x339/0x4a0
       nvme_prep_rq.part.0+0x9d/0x770
       nvme_queue_rq+0x7d/0x1e0
       __blk_mq_issue_directly+0x2a/0x90
       ? blk_mq_get_budget_and_tag+0x61/0x90
       blk_mq_try_issue_list_directly+0x56/0xf0
       blk_mq_flush_plug_list.part.0+0x52b/0x5d0
       __blk_flush_plug+0xc6/0x110
       blk_finish_plug+0x28/0x40
       read_pages+0x160/0x1c0
       page_cache_ra_unbounded+0x109/0x180
       relocate_file_extent_cluster+0x611/0x6a0
       ? btrfs_search_slot+0xba4/0xd20
       ? balance_dirty_pages_ratelimited_flags+0x26/0xb00
       relocate_data_extent.constprop.0+0x134/0x160
       relocate_block_group+0x3f2/0x500
       btrfs_relocate_block_group+0x250/0x430
       btrfs_relocate_chunk+0x3f/0x130
       btrfs_balance+0x71b/0xef0
       ? kmalloc_trace_noprof+0x13b/0x280
       btrfs_ioctl+0x2c2e/0x3030
       ? kvfree_call_rcu+0x1e6/0x340
       ? list_lru_add_obj+0x66/0x80
       ? mntput_no_expire+0x3a/0x220
       __x64_sys_ioctl+0x96/0xc0
       do_syscall_64+0x54/0x110
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
      RIP: 0033:0x7fcc04514f9b
      Code: Unable to access opcode bytes at 0x7fcc04514f71.
      RSP: 002b:00007ffeba923370 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fcc04514f9b
      RDX: 00007ffeba923460 RSI: 00000000c4009420 RDI: 0000000000000003
      RBP: 0000000000000000 R08: 0000000000000013 R09: 0000000000000001
      R10: 00007fcc043fbba8 R11: 0000000000000246 R12: 00007ffeba924fc5
      R13: 00007ffeba923460 R14: 0000000000000002 R15: 00000000004d4bb0
       </TASK>
      Modules linked in:
      ---[ end trace 0000000000000000 ]---
      RIP: 0010:__blk_rq_map_sg+0x339/0x4a0
      RSP: 0018:ffffc90001a43820 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffea00045d4802
      RDX: 0000000117520000 RSI: 0000000000000000 RDI: ffff8881027d1000
      RBP: 0000000000003000 R08: ffffea00045d4902 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000001000 R12: ffff8881003d10b8
      R13: ffffc90001a438f0 R14: 0000000000000000 R15: 0000000000003000
      FS:  00007fcc048a6900(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fcc04514f71 CR3: 00000001109ea001 CR4: 0000000000370eb0
      Kernel panic - not syncing: Fatal exception
      Kernel Offset: disabled
      ---[ end Kernel panic - not syncing: Fatal exception ]---
    
    So in case of a relocation on a RAID stripe-tree based file system, skip
    the readahead.
    
    Reviewed-by: Josef Bacik <josef@toxicpanda.com>
    Reviewed-by: Qu Wenruo <wqu@suse.com>
    Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

btrfs: drop the backref cache during relocation if we commit [+ + +]
Author: Josef Bacik <josef@toxicpanda.com>
Date:   Tue Sep 24 16:50:22 2024 -0400

    btrfs: drop the backref cache during relocation if we commit
    
    commit db7e68b522c01eb666cfe1f31637775f18997811 upstream.
    
    Since the inception of relocation we have maintained the backref cache
    across transaction commits, updating the backref cache with the new
    bytenr whenever we COWed blocks that were in the cache, and then
    updating their bytenr once we detected a transaction id change.
    
    This works as long as we're only ever modifying blocks, not changing the
    structure of the tree.
    
    However relocation does in fact change the structure of the tree.  For
    example, if we are relocating a data extent, we will look up all the
    leaves that point to this data extent.  We will then call
    do_relocation() on each of these leaves, which will COW down to the leaf
    and then update the file extent location.
    
    But, a key feature of do_relocation() is the pending list.  This is all
    the pending nodes that we modified when we updated the file extent item.
    We will then process all of these blocks via finish_pending_nodes, which
    calls do_relocation() on all of the nodes that led up to that leaf.
    
    The purpose of this is to make sure we don't break sharing unless we
    absolutely have to.  Consider the case that we have 3 snapshots that all
    point to this leaf through the same nodes, the initial COW would have
    created a whole new path.  If we did this for all 3 snapshots we would
    end up with 3x the number of nodes we had originally.  To avoid this we
    will cycle through each of the snapshots that point to each of these
    nodes and update their pointers to point at the new nodes.
    
    Once we update the pointer to the new node we will drop the node we
    removed the link for and all of its children via btrfs_drop_subtree().
    This is essentially just btrfs_drop_snapshot(), but for an arbitrary
    point in the snapshot.
    
    The problem with this is that we will never reflect this in the backref
    cache.  If we do this btrfs_drop_snapshot() for a node that is in the
    backref tree, we will leave the node in the backref tree.  This becomes
    a problem when we change the transid, as now the backref cache has
    entire subtrees that no longer exist, but exist as if they still are
    pointed to by the same roots.
    
    In the best case scenario you end up with "adding refs to an existing
    tree ref" errors from insert_inline_extent_backref(), where we attempt
    to link in nodes on roots that are no longer valid.
    
    In the worst case you will double free some random block and re-use it
    when there are still references to the block.
    
    This is extremely subtle, and the consequences are quite bad.  There
    isn't a way to make sure our backref cache is consistent between
    transid's.
    
    In order to fix this we need to simply evict the entire backref cache
    anytime we cross transid's.  This reduces performance in that we have to
    rebuild this backref cache every time we change transid's, but fixes the
    bug.
    
    This has existed since relocation was added, and is a pretty critical
    bug.  There's a lot more cleanup that can be done now that this
    functionality is going away, but this patch is as small as possible in
    order to fix the problem and make it easy for us to backport it to all
    the kernels it needs to be backported to.
    
    Followup series will dismantle more of this code and simplify relocation
    drastically to remove this functionality.
    
    We have a reproducer that reproduced the corruption within a few minutes
    of running.  With this patch it survives several iterations/hours of
    running the reproducer.
    
    Fixes: 3fd0a5585eb9 ("Btrfs: Metadata ENOSPC handling for balance")
    CC: stable@vger.kernel.org
    Reviewed-by: Boris Burkov <boris@bur.io>
    Signed-off-by: Josef Bacik <josef@toxicpanda.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: fix a NULL pointer dereference when failed to start a new trasacntion [+ + +]
Author: Qu Wenruo <wqu@suse.com>
Date:   Sat Sep 28 08:05:58 2024 +0930

    btrfs: fix a NULL pointer dereference when failed to start a new trasacntion
    
    commit c3b47f49e83197e8dffd023ec568403bcdbb774b upstream.
    
    [BUG]
    Syzbot reported a NULL pointer dereference with the following crash:
    
      FAULT_INJECTION: forcing a failure.
       start_transaction+0x830/0x1670 fs/btrfs/transaction.c:676
       prepare_to_relocate+0x31f/0x4c0 fs/btrfs/relocation.c:3642
       relocate_block_group+0x169/0xd20 fs/btrfs/relocation.c:3678
      ...
      BTRFS info (device loop0): balance: ended with status: -12
      Oops: general protection fault, probably for non-canonical address 0xdffffc00000000cc: 0000 [#1] PREEMPT SMP KASAN NOPTI
      KASAN: null-ptr-deref in range [0x0000000000000660-0x0000000000000667]
      RIP: 0010:btrfs_update_reloc_root+0x362/0xa80 fs/btrfs/relocation.c:926
      Call Trace:
       <TASK>
       commit_fs_roots+0x2ee/0x720 fs/btrfs/transaction.c:1496
       btrfs_commit_transaction+0xfaf/0x3740 fs/btrfs/transaction.c:2430
       del_balance_item fs/btrfs/volumes.c:3678 [inline]
       reset_balance_state+0x25e/0x3c0 fs/btrfs/volumes.c:3742
       btrfs_balance+0xead/0x10c0 fs/btrfs/volumes.c:4574
       btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:907 [inline]
       __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    [CAUSE]
    The allocation failure happens at the start_transaction() inside
    prepare_to_relocate(), and during the error handling we call
    unset_reloc_control(), which sets fs_info->reloc_ctl to NULL.
    
    Then we continue the error path cleanup in btrfs_balance() by calling
    reset_balance_state() which will call del_balance_item() to fully delete
    the balance item in the root tree.
    
    However during the small window between set_reloc_control() and
    unset_reloc_control(), a subvolume tree update can create a
    reloc_root for that subvolume.
    
    Then we go into the final btrfs_commit_transaction() of
    del_balance_item(), and into btrfs_update_reloc_root() inside
    commit_fs_roots().
    
    That function checks if fs_info->reloc_ctl is in the merge_reloc_tree
    stage, but since fs_info->reloc_ctl is NULL, it results in a NULL pointer
    dereference.
    
    [FIX]
    Just add an extra check on fs_info->reloc_ctl inside
    btrfs_update_reloc_root(), before checking
    fs_info->reloc_ctl->merge_reloc_tree.
    
    That DEAD_RELOC_TREE handling is to prevent further modification to the
    reloc tree during the merge stage, but since there is no reloc_ctl at all,
    we do not need to bother with that.
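    
    A minimal user-space sketch of the guard described above. The types and
    the helper in_merge_stage() are hypothetical stand-ins and not the kernel
    definitions; only the idea of testing reloc_ctl before reading
    merge_reloc_tree is taken from the text.
    
    ```c
    #include <stdbool.h>
    #include <stddef.h>
    
    /* Hypothetical stand-ins for the kernel structures. */
    struct reloc_control_model {
        bool merge_reloc_tree;
    };
    
    struct fs_info_model {
        struct reloc_control_model *reloc_ctl;
    };
    
    /*
     * True only when a reloc control exists and is in the merge stage.
     * The pointer is tested first, so a NULL reloc_ctl is never dereferenced.
     */
    static bool in_merge_stage(const struct fs_info_model *fs_info)
    {
        const struct reloc_control_model *rc = fs_info->reloc_ctl;
    
        return rc != NULL && rc->merge_reloc_tree;
    }
    ```
    
    With reloc_ctl set to NULL the helper returns false instead of faulting.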
    
    Reported-by: syzbot+283673dbc38527ef9f3d@syzkaller.appspotmail.com
    Link: https://lore.kernel.org/linux-btrfs/66f6bfa7.050a0220.38ace9.0019.GAE@google.com/
    CC: stable@vger.kernel.org # 4.19+
    Reviewed-by: Josef Bacik <josef@toxicpanda.com>
    Signed-off-by: Qu Wenruo <wqu@suse.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: send: fix buffer overflow detection when copying path to cache entry [+ + +]
Author: Filipe Manana <fdmanana@suse.com>
Date:   Thu Sep 19 22:20:34 2024 +0100

    btrfs: send: fix buffer overflow detection when copying path to cache entry
    
    commit 96c6ca71572a3556ed0c37237305657ff47174b7 upstream.
    
    Starting with commit c0247d289e73 ("btrfs: send: annotate struct
    name_cache_entry with __counted_by()") we annotated the variable length
    array "name" from the name_cache_entry structure with __counted_by() to
    improve overflow detection. However that alone was not correct, because
    the length of that array does not match the "name_len" field - it matches
    that plus 1 to include the NUL string terminator, so that makes a
    fortified kernel think there's an overflow and report a splat like this:
    
      strcpy: detected buffer overflow: 20 byte write of buffer size 19
      WARNING: CPU: 3 PID: 3310 at __fortify_report+0x45/0x50
      CPU: 3 UID: 0 PID: 3310 Comm: btrfs Not tainted 6.11.0-prnet #1
      Hardware name: CompuLab Ltd.  sbc-ihsw/Intense-PC2 (IPC2), BIOS IPC2_3.330.7 X64 03/15/2018
      RIP: 0010:__fortify_report+0x45/0x50
      Code: 48 8b 34 (...)
      RSP: 0018:ffff97ebc0d6f650 EFLAGS: 00010246
      RAX: 7749924ef60fa600 RBX: ffff8bf5446a521a RCX: 0000000000000027
      RDX: 00000000ffffdfff RSI: ffff97ebc0d6f548 RDI: ffff8bf84e7a1cc8
      RBP: ffff8bf548574080 R08: ffffffffa8c40e10 R09: 0000000000005ffd
      R10: 0000000000000004 R11: ffffffffa8c70e10 R12: ffff8bf551eef400
      R13: 0000000000000000 R14: 0000000000000013 R15: 00000000000003a8
      FS:  00007fae144de8c0(0000) GS:ffff8bf84e780000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fae14691690 CR3: 00000001027a2003 CR4: 00000000001706f0
      Call Trace:
       <TASK>
       ? __warn+0x12a/0x1d0
       ? __fortify_report+0x45/0x50
       ? report_bug+0x154/0x1c0
       ? handle_bug+0x42/0x70
       ? exc_invalid_op+0x1a/0x50
       ? asm_exc_invalid_op+0x1a/0x20
       ? __fortify_report+0x45/0x50
       __fortify_panic+0x9/0x10
      __get_cur_name_and_parent+0x3bc/0x3c0
       get_cur_path+0x207/0x3b0
       send_extent_data+0x709/0x10d0
       ? find_parent_nodes+0x22df/0x25d0
       ? mas_nomem+0x13/0x90
       ? mtree_insert_range+0xa5/0x110
       ? btrfs_lru_cache_store+0x5f/0x1e0
       ? iterate_extent_inodes+0x52d/0x5a0
       process_extent+0xa96/0x11a0
       ? __pfx_lookup_backref_cache+0x10/0x10
       ? __pfx_store_backref_cache+0x10/0x10
       ? __pfx_iterate_backrefs+0x10/0x10
       ? __pfx_check_extent_item+0x10/0x10
       changed_cb+0x6fa/0x930
       ? tree_advance+0x362/0x390
       ? memcmp_extent_buffer+0xd7/0x160
       send_subvol+0xf0a/0x1520
       btrfs_ioctl_send+0x106b/0x11d0
       ? __pfx___clone_root_cmp_sort+0x10/0x10
       _btrfs_ioctl_send+0x1ac/0x240
       btrfs_ioctl+0x75b/0x850
       __se_sys_ioctl+0xca/0x150
       do_syscall_64+0x85/0x160
       ? __count_memcg_events+0x69/0x100
       ? handle_mm_fault+0x1327/0x15c0
       ? __se_sys_rt_sigprocmask+0xf1/0x180
       ? syscall_exit_to_user_mode+0x75/0xa0
       ? do_syscall_64+0x91/0x160
       ? do_user_addr_fault+0x21d/0x630
      entry_SYSCALL_64_after_hwframe+0x76/0x7e
      RIP: 0033:0x7fae145eeb4f
      Code: 00 48 89 (...)
      RSP: 002b:00007ffdf1cb09b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fae145eeb4f
      RDX: 00007ffdf1cb0ad0 RSI: 0000000040489426 RDI: 0000000000000004
      RBP: 00000000000078fe R08: 00007fae144006c0 R09: 00007ffdf1cb0927
      R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffdf1cb1ce8
      R13: 0000000000000003 R14: 000055c499fab2e0 R15: 0000000000000004
       </TASK>
    
    Fix this by not storing the NUL string terminator since we don't actually
    need it for name cache entries, this way "name_len" corresponds to the
    actual size of the "name" array. This requires marking the "name" array
    field with __nonstring and using memcpy() instead of strcpy() as
    recommended by the guidelines at:
    
       https://github.com/KSPP/linux/issues/90
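    
    A minimal user-space sketch of the approach, assuming a simplified entry
    layout. The struct name_entry_model and the helper name_entry_new() are
    hypothetical, and the kernel annotations (__counted_by(), __nonstring)
    are left out because they depend on compiler support.
    
    ```c
    #include <stdlib.h>
    #include <string.h>
    
    /* Hypothetical model of a cache entry; not the kernel's struct. */
    struct name_entry_model {
        int name_len;
        char name[];    /* exactly name_len bytes, no NUL terminator */
    };
    
    static struct name_entry_model *name_entry_new(const char *src, int len)
    {
        struct name_entry_model *e = malloc(sizeof(*e) + (size_t)len);
    
        if (!e)
            return NULL;
        e->name_len = len;
        /* memcpy() writes len bytes; strcpy() would write len + 1. */
        memcpy(e->name, src, (size_t)len);
        return e;
    }
    ```
    
    Readers of such an entry must use name_len rather than string functions
    that expect a terminator.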
    
    Reported-by: David Arendt <admin@prnet.org>
    Link: https://lore.kernel.org/linux-btrfs/cee4591a-3088-49ba-99b8-d86b4242b8bd@prnet.org/
    Fixes: c0247d289e73 ("btrfs: send: annotate struct name_cache_entry with __counted_by()")
    CC: stable@vger.kernel.org # 6.11
    Tested-by: David Arendt <admin@prnet.org>
    Reviewed-by: Josef Bacik <josef@toxicpanda.com>
    Reviewed-by: Qu Wenruo <wqu@suse.com>
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: send: fix invalid clone operation for file that got its size decreased [+ + +]
Author: Filipe Manana <fdmanana@suse.com>
Date:   Fri Sep 27 10:50:12 2024 +0100

    btrfs: send: fix invalid clone operation for file that got its size decreased
    
    commit fa630df665aa9ddce3a96ce7b54e10a38e4d2a2b upstream.
    
    During an incremental send we may end up sending an invalid clone
    operation, for the last extent of a file which ends at an unaligned offset
    that matches the final i_size of the file in the send snapshot, in case
    the file had its initial size (the size in the parent snapshot) decreased
    in the send snapshot. In this case the destination will fail to apply the
    clone operation because its end offset is not sector size aligned and it
    ends before the current size of the file.
    
    Sending the truncate operation always happens when we finish processing an
    inode, after we process all its extents (and xattrs, names, etc). So fix
    this by ensuring the file has a valid size before we send a clone
    operation for an unaligned extent that ends at the final i_size of the
    file. The size we truncate to matches the start offset of the clone range
    but it could be any value between that start offset and the final size of
    the file since the clone operation will expand the i_size if the current
    size is smaller than the end offset. The start offset of the range was
    chosen because it's always sector size aligned and avoids a truncation
    into the middle of a page, which results in dirtying the page due to
    filling part of it with zeroes and then making the clone operation at the
    receiver trigger IO.
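    
    The rule the receiver applies, as described above, can be modelled in a
    few lines. This is only a sketch: clone_end_accepted() is a hypothetical
    helper, and the 4096-byte sector size used in the note below is an
    assumption.
    
    ```c
    #include <stdbool.h>
    #include <stdint.h>
    
    /*
     * Model of the rule stated in the text: a clone whose end offset is not
     * sector size aligned is rejected when it ends before the current size
     * of the destination file.
     */
    static bool clone_end_accepted(uint64_t end, uint64_t dst_size,
                                   uint64_t sectorsize)
    {
        return (end % sectorsize) == 0 || end >= dst_size;
    }
    ```
    
    For a clone ending at 256K + 5 bytes, a destination that is still 1M in
    size is rejected, while a destination first truncated to 128K (the start
    offset of the clone range) is accepted.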
    
    The following test reproduces the issue:
    
      $ cat test.sh
      #!/bin/bash
    
      DEV=/dev/sdi
      MNT=/mnt/sdi
    
      mkfs.btrfs -f $DEV
      mount $DEV $MNT
    
      # Create a file with a size of 256K + 5 bytes, having two extents, one
      # with a size of 128K and another one with a size of 128K + 5 bytes.
      last_ext_size=$((128 * 1024 + 5))
      xfs_io -f -d -c "pwrite -S 0xab -b 128K 0 128K" \
             -c "pwrite -S 0xcd -b $last_ext_size 128K $last_ext_size" \
             $MNT/foo
    
      # Another file which we will later clone foo into, but initially with
      # a larger size than foo.
      xfs_io -f -c "pwrite -S 0xef 0 1M" $MNT/bar
    
      btrfs subvolume snapshot -r $MNT/ $MNT/snap1
    
      # Now resize bar and clone foo into it.
      xfs_io -c "truncate 0" \
             -c "reflink $MNT/foo" $MNT/bar
    
      btrfs subvolume snapshot -r $MNT/ $MNT/snap2
    
      rm -f /tmp/send-full /tmp/send-inc
      btrfs send -f /tmp/send-full $MNT/snap1
      btrfs send -p $MNT/snap1 -f /tmp/send-inc $MNT/snap2
    
      umount $MNT
      mkfs.btrfs -f $DEV
      mount $DEV $MNT
    
      btrfs receive -f /tmp/send-full $MNT
      btrfs receive -f /tmp/send-inc $MNT
    
      umount $MNT
    
    Running it before this patch:
    
      $ ./test.sh
      (...)
      At subvol snap1
      At snapshot snap2
      ERROR: failed to clone extents to bar: Invalid argument
    
    A test case for fstests will be sent soon.
    
    Reported-by: Ben Millwood <thebenmachine@gmail.com>
    Link: https://lore.kernel.org/linux-btrfs/CAJhrHS2z+WViO2h=ojYvBPDLsATwLbg+7JaNCyYomv0fUxEpQQ@mail.gmail.com/
    Fixes: 46a6e10a1ab1 ("btrfs: send: allow cloning non-aligned extent if it ends at i_size")
    CC: stable@vger.kernel.org # 6.11
    Reviewed-by: Qu Wenruo <wqu@suse.com>
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: wait for fixup workers before stopping cleaner kthread during umount [+ + +]
Author: Filipe Manana <fdmanana@suse.com>
Date:   Tue Oct 1 11:06:52 2024 +0100

    btrfs: wait for fixup workers before stopping cleaner kthread during umount
    
    commit 41fd1e94066a815a7ab0a7025359e9b40e4b3576 upstream.
    
    During unmount, at close_ctree(), we have the following steps in this order:
    
    1) Park the cleaner kthread - this doesn't destroy the kthread, it basically
       halts its execution (wake ups against it work but do nothing);
    
    2) We stop the cleaner kthread - this results in freeing the respective
       struct task_struct;
    
    3) We call btrfs_stop_all_workers() which waits for any jobs running in all
       the work queues and then frees the work queues.
    
    Syzbot reported a case where a fixup worker resulted in a crash when doing
    a delayed iput on its inode while attempting to wake up the cleaner at
    btrfs_add_delayed_iput(), because the task_struct of the cleaner kthread
    was already freed. This can happen during unmount because we don't wait
    for any fixup workers still running before we call kthread_stop() against
    the cleaner kthread, which stops it and frees all its resources.
    
    Fix this by waiting for any fixup workers at close_ctree() before we call
    kthread_stop() against the cleaner and run pending delayed iputs.
    
    The stack traces reported by syzbot were the following:
    
      BUG: KASAN: slab-use-after-free in __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065
      Read of size 8 at addr ffff8880272a8a18 by task kworker/u8:3/52
    
      CPU: 1 UID: 0 PID: 52 Comm: kworker/u8:3 Not tainted 6.12.0-rc1-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
      Workqueue: btrfs-fixup btrfs_work_helper
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:94 [inline]
       dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
       print_address_description mm/kasan/report.c:377 [inline]
       print_report+0x169/0x550 mm/kasan/report.c:488
       kasan_report+0x143/0x180 mm/kasan/report.c:601
       __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
       class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline]
       try_to_wake_up+0xb0/0x1480 kernel/sched/core.c:4154
       btrfs_writepage_fixup_worker+0xc16/0xdf0 fs/btrfs/inode.c:2842
       btrfs_work_helper+0x390/0xc50 fs/btrfs/async-thread.c:314
       process_one_work kernel/workqueue.c:3229 [inline]
       process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310
       worker_thread+0x870/0xd30 kernel/workqueue.c:3391
       kthread+0x2f0/0x390 kernel/kthread.c:389
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
       </TASK>
    
      Allocated by task 2:
       kasan_save_stack mm/kasan/common.c:47 [inline]
       kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
       unpoison_slab_object mm/kasan/common.c:319 [inline]
       __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:345
       kasan_slab_alloc include/linux/kasan.h:247 [inline]
       slab_post_alloc_hook mm/slub.c:4086 [inline]
       slab_alloc_node mm/slub.c:4135 [inline]
       kmem_cache_alloc_node_noprof+0x16b/0x320 mm/slub.c:4187
       alloc_task_struct_node kernel/fork.c:180 [inline]
       dup_task_struct+0x57/0x8c0 kernel/fork.c:1107
       copy_process+0x5d1/0x3d50 kernel/fork.c:2206
       kernel_clone+0x223/0x880 kernel/fork.c:2787
       kernel_thread+0x1bc/0x240 kernel/fork.c:2849
       create_kthread kernel/kthread.c:412 [inline]
       kthreadd+0x60d/0x810 kernel/kthread.c:765
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    
      Freed by task 61:
       kasan_save_stack mm/kasan/common.c:47 [inline]
       kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
       kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
       poison_slab_object mm/kasan/common.c:247 [inline]
       __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264
       kasan_slab_free include/linux/kasan.h:230 [inline]
       slab_free_hook mm/slub.c:2343 [inline]
       slab_free mm/slub.c:4580 [inline]
       kmem_cache_free+0x1a2/0x420 mm/slub.c:4682
       put_task_struct include/linux/sched/task.h:144 [inline]
       delayed_put_task_struct+0x125/0x300 kernel/exit.c:228
       rcu_do_batch kernel/rcu/tree.c:2567 [inline]
       rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2823
       handle_softirqs+0x2c5/0x980 kernel/softirq.c:554
       __do_softirq kernel/softirq.c:588 [inline]
       invoke_softirq kernel/softirq.c:428 [inline]
       __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
       irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
       instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1037 [inline]
       sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1037
       asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
    
      Last potentially related work creation:
       kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
       __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
       __call_rcu_common kernel/rcu/tree.c:3086 [inline]
       call_rcu+0x167/0xa70 kernel/rcu/tree.c:3190
       context_switch kernel/sched/core.c:5318 [inline]
       __schedule+0x184b/0x4ae0 kernel/sched/core.c:6675
       schedule_idle+0x56/0x90 kernel/sched/core.c:6793
       do_idle+0x56a/0x5d0 kernel/sched/idle.c:354
       cpu_startup_entry+0x42/0x60 kernel/sched/idle.c:424
       start_secondary+0x102/0x110 arch/x86/kernel/smpboot.c:314
       common_startup_64+0x13e/0x147
    
      The buggy address belongs to the object at ffff8880272a8000
       which belongs to the cache task_struct of size 7424
      The buggy address is located 2584 bytes inside of
       freed 7424-byte region [ffff8880272a8000, ffff8880272a9d00)
    
      The buggy address belongs to the physical page:
      page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x272a8
      head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
      flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff)
      page_type: f5(slab)
      raw: 00fff00000000040 ffff88801bafa500 dead000000000122 0000000000000000
      raw: 0000000000000000 0000000080040004 00000001f5000000 0000000000000000
      head: 00fff00000000040 ffff88801bafa500 dead000000000122 0000000000000000
      head: 0000000000000000 0000000080040004 00000001f5000000 0000000000000000
      head: 00fff00000000003 ffffea00009caa01 ffffffffffffffff 0000000000000000
      head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 2, tgid 2 (kthreadd), ts 71247381401, free_ts 71214998153
       set_page_owner include/linux/page_owner.h:32 [inline]
       post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1537
       prep_new_page mm/page_alloc.c:1545 [inline]
       get_page_from_freelist+0x3039/0x3180 mm/page_alloc.c:3457
       __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4733
       alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265
       alloc_slab_page+0x6a/0x120 mm/slub.c:2413
       allocate_slab+0x5a/0x2f0 mm/slub.c:2579
       new_slab mm/slub.c:2632 [inline]
       ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3819
       __slab_alloc+0x58/0xa0 mm/slub.c:3909
       __slab_alloc_node mm/slub.c:3962 [inline]
       slab_alloc_node mm/slub.c:4123 [inline]
       kmem_cache_alloc_node_noprof+0x1fe/0x320 mm/slub.c:4187
       alloc_task_struct_node kernel/fork.c:180 [inline]
       dup_task_struct+0x57/0x8c0 kernel/fork.c:1107
       copy_process+0x5d1/0x3d50 kernel/fork.c:2206
       kernel_clone+0x223/0x880 kernel/fork.c:2787
       kernel_thread+0x1bc/0x240 kernel/fork.c:2849
       create_kthread kernel/kthread.c:412 [inline]
       kthreadd+0x60d/0x810 kernel/kthread.c:765
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      page last free pid 5230 tgid 5230 stack trace:
       reset_page_owner include/linux/page_owner.h:25 [inline]
       free_pages_prepare mm/page_alloc.c:1108 [inline]
       free_unref_page+0xcd0/0xf00 mm/page_alloc.c:2638
       discard_slab mm/slub.c:2678 [inline]
       __put_partials+0xeb/0x130 mm/slub.c:3146
       put_cpu_partial+0x17c/0x250 mm/slub.c:3221
       __slab_free+0x2ea/0x3d0 mm/slub.c:4450
       qlink_free mm/kasan/quarantine.c:163 [inline]
       qlist_free_all+0x9a/0x140 mm/kasan/quarantine.c:179
       kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
       __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:329
       kasan_slab_alloc include/linux/kasan.h:247 [inline]
       slab_post_alloc_hook mm/slub.c:4086 [inline]
       slab_alloc_node mm/slub.c:4135 [inline]
       kmem_cache_alloc_noprof+0x135/0x2a0 mm/slub.c:4142
       getname_flags+0xb7/0x540 fs/namei.c:139
       do_sys_openat2+0xd2/0x1d0 fs/open.c:1409
       do_sys_open fs/open.c:1430 [inline]
       __do_sys_openat fs/open.c:1446 [inline]
       __se_sys_openat fs/open.c:1441 [inline]
       __x64_sys_openat+0x247/0x2a0 fs/open.c:1441
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
      Memory state around the buggy address:
       ffff8880272a8900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880272a8980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff8880272a8a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                  ^
       ffff8880272a8a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880272a8b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ==================================================================
    
    Reported-by: syzbot+8aaf2df2ef0164ffe1fb@syzkaller.appspotmail.com
    Link: https://lore.kernel.org/linux-btrfs/66fb36b1.050a0220.aab67.003b.GAE@google.com/
    CC: stable@vger.kernel.org # 4.19+
    Reviewed-by: Qu Wenruo <wqu@suse.com>
    Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
cachefiles: fix dentry leak in cachefiles_open_file() [+ + +]
Author: Baokun Li <libaokun1@huawei.com>
Date:   Thu Aug 29 16:34:09 2024 +0800

    cachefiles: fix dentry leak in cachefiles_open_file()
    
    commit da6ef2dffe6056aad3435e6cf7c6471c2a62187c upstream.
    
    A dentry leak may be caused when a lookup cookie and a cull are concurrent:
    
                P1             |             P2
    -----------------------------------------------------------
    cachefiles_lookup_cookie
      cachefiles_look_up_object
        lookup_one_positive_unlocked
         // get dentry
                                cachefiles_cull
                                  inode->i_flags |= S_KERNEL_FILE;
        cachefiles_open_file
          cachefiles_mark_inode_in_use
            __cachefiles_mark_inode_in_use
              can_use = false
              if (!(inode->i_flags & S_KERNEL_FILE))
                can_use = true
              return false
            return false
            // Returns an error but doesn't put dentry
    
    After that the following WARNING will be triggered when the backend folder
    is unmounted:
    
    ==================================================================
    BUG: Dentry 000000008ad87947{i=7a,n=Dx_1_1.img}  still in use (1) [unmount of ext4 sda]
    WARNING: CPU: 4 PID: 359261 at fs/dcache.c:1767 umount_check+0x5d/0x70
    CPU: 4 PID: 359261 Comm: umount Not tainted 6.6.0-dirty #25
    RIP: 0010:umount_check+0x5d/0x70
    Call Trace:
     <TASK>
     d_walk+0xda/0x2b0
     do_one_tree+0x20/0x40
     shrink_dcache_for_umount+0x2c/0x90
     generic_shutdown_super+0x20/0x160
     kill_block_super+0x1a/0x40
     ext4_kill_sb+0x22/0x40
     deactivate_locked_super+0x35/0x80
     cleanup_mnt+0x104/0x160
    ==================================================================
    
    Whether cachefiles_open_file() returns true or false, the reference count
    obtained by lookup_positive_unlocked() in cachefiles_look_up_object()
    should be released.
    
    Therefore release that reference count in cachefiles_look_up_object() to
    fix the above issue and simplify the code.
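    
    A small reference-counting sketch of the idea, assuming a toy model in
    which the lookup takes one reference. All names ending in _model are
    hypothetical and do not correspond to kernel functions.
    
    ```c
    #include <stdbool.h>
    
    struct dentry_model {
        int refcount;
    };
    
    /* The lookup hands back the dentry with one extra reference. */
    static struct dentry_model *lookup_model(struct dentry_model *d)
    {
        d->refcount++;
        return d;
    }
    
    static void dput_model(struct dentry_model *d)
    {
        d->refcount--;
    }
    
    /* Fails when the inode was already marked in use (for example by a cull). */
    static bool open_file_model(struct dentry_model *d, bool already_in_use)
    {
        (void)d;
        return !already_in_use;
    }
    
    static bool look_up_object_model(struct dentry_model *d, bool already_in_use)
    {
        struct dentry_model *dentry = lookup_model(d);
        bool ok = open_file_model(dentry, already_in_use);
    
        /* Drop the lookup reference on both the success and failure paths. */
        dput_model(dentry);
        return ok;
    }
    ```
    
    Releasing the reference in one place means the failure path cannot skip it.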
    
    Fixes: 1f08c925e7a3 ("cachefiles: Implement backing file wrangling")
    Cc: stable@kernel.org
    Signed-off-by: Baokun Li <libaokun1@huawei.com>
    Link: https://lore.kernel.org/r/20240829083409.3788142-1-libaokun@huaweicloud.com
    Acked-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
can: netlink: avoid call to do_set_data_bittiming callback with stale can_priv::ctrlmode [+ + +]
Author: Stefan Mätje <stefan.maetje@esd.eu>
Date:   Thu Aug 8 18:42:24 2024 +0200

    can: netlink: avoid call to do_set_data_bittiming callback with stale can_priv::ctrlmode
    
    [ Upstream commit 2423cc20087ae9a7b7af575aa62304ef67cad7b6 ]
    
    This patch moves the evaluation of data[IFLA_CAN_CTRLMODE] in function
    can_changelink in front of the evaluation of data[IFLA_CAN_BITTIMING].
    
    This avoids a call to do_set_data_bittiming providing a stale
    can_priv::ctrlmode with a CAN_CTRLMODE_FD flag not matching the
    requested state when switching between a CAN Classic and CAN-FD bitrate.
    
    In the same manner the evaluation of data[IFLA_CAN_CTRLMODE] in function
    can_validate is also moved in front of the evaluation of
    data[IFLA_CAN_BITTIMING].
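    
    The effect of the reordering can be illustrated with a toy model. The
    names below (can_priv_model, MODEL_CTRLMODE_FD, changelink_model()) are
    hypothetical and only mirror the ordering described above: the requested
    control mode is stored before any bittiming callback runs.
    
    ```c
    #include <stdbool.h>
    
    #define MODEL_CTRLMODE_FD 0x20u /* hypothetical flag value for this sketch */
    
    struct can_priv_model {
        unsigned int ctrlmode;
        bool fd_seen_by_callback;
    };
    
    /* Stand-in for a driver bittiming callback that inspects ctrlmode. */
    static void bittiming_callback_model(struct can_priv_model *priv)
    {
        priv->fd_seen_by_callback = (priv->ctrlmode & MODEL_CTRLMODE_FD) != 0;
    }
    
    /* Apply the requested ctrlmode first, then run the bittiming callback. */
    static void changelink_model(struct can_priv_model *priv,
                                 unsigned int requested_ctrlmode)
    {
        priv->ctrlmode = requested_ctrlmode;
        bittiming_callback_model(priv);
    }
    ```
    
    With the opposite order the callback would observe the flag state left
    over from the previous configuration.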
    
    This is a preparation for patches where the nominal and data bittiming
    may have interdependencies on the driver side depending on the
    CAN_CTRLMODE_FD flag state.
    
    Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu>
    Link: https://patch.msgid.link/20240808164224.213522-1-stefan.maetje@esd.eu
    Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ceph: fix a memory leak on cap_auths in MDS client [+ + +]
Author: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Date:   Mon Aug 19 10:52:17 2024 +0100

    ceph: fix a memory leak on cap_auths in MDS client
    
    [ Upstream commit d97079e97eab20e08afc507f2bed4501e2824717 ]
    
    The cap_auths that are allocated during an MDS session opening are never
    released, causing a memory leak detected by kmemleak.  Fix this by freeing
    the memory allocated when shutting down the MDS client.
    
    Fixes: 1d17de9534cb ("ceph: save cap_auths in MDS client when session is opened")
    Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
    Reviewed-by: Xiubo Li <xiubli@redhat.com>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ceph: fix cap ref leak via netfs init_request [+ + +]
Author: Patrick Donnelly <pdonnell@redhat.com>
Date:   Wed Oct 2 21:05:12 2024 -0400

    ceph: fix cap ref leak via netfs init_request
    
    commit ccda9910d8490f4fb067131598e4b2e986faa5a0 upstream.
    
    Log recovered from a user's cluster:
    
        <7>[ 5413.970692] ceph:  get_cap_refs 00000000958c114b ret 1 got Fr
        <7>[ 5413.970695] ceph:  start_read 00000000958c114b, no cache cap
        ...
        <7>[ 5473.934609] ceph:   my wanted = Fr, used = Fr, dirty -
        <7>[ 5473.934616] ceph:  revocation: pAsLsXsFr -> pAsLsXs (revoking Fr)
        <7>[ 5473.934632] ceph:  __ceph_caps_issued 00000000958c114b cap 00000000f7784259 issued pAsLsXs
        <7>[ 5473.934638] ceph:  check_caps 10000000e68.fffffffffffffffe file_want - used Fr dirty - flushing - issued pAsLsXs revoking Fr retain pAsLsXsFsr  AUTHONLY NOINVAL FLUSH_FORCE
    
    The MDS subsequently complains that the kernel client is late releasing
    caps.
    
    Approximately, a series of changes to this code by commits 49870056005c
    ("ceph: convert ceph_readpages to ceph_readahead"), 2de160417315
    ("netfs: Change ->init_request() to return an error code") and
    a5c9dc445139 ("ceph: Make ceph_init_request() check caps on readahead")
    resulted in a subtle resource cleanup being missed. The main culprit is
    the change in error handling in 2de160417315 which meant that a failure
    in init_request() would no longer cause cleanup to be called. That
    would prevent the ceph_put_cap_refs() call which would clean up the
    leaked cap ref.
    
    Cc: stable@vger.kernel.org
    Fixes: a5c9dc445139 ("ceph: Make ceph_init_request() check caps on readahead")
    Link: https://tracker.ceph.com/issues/67008
    Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
    Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ceph: remove the incorrect Fw reference check when dirtying pages [+ + +]
Author: Xiubo Li <xiubli@redhat.com>
Date:   Thu Sep 5 06:22:18 2024 +0800

    ceph: remove the incorrect Fw reference check when dirtying pages
    
    [ Upstream commit c08dfb1b49492c09cf13838c71897493ea3b424e ]
    
    When doing direct-io reads it will also try to mark pages dirty,
    but the read path won't hold the Fw caps and there is no case in
    which it will get the Fw reference.
    
    Fixes: 5dda377cf0a6 ("ceph: set i_head_snapc when getting CEPH_CAP_FILE_WR reference")
    Signed-off-by: Xiubo Li <xiubli@redhat.com>
    Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
cifs: Do not convert delimiter when parsing NFS-style symlinks [+ + +]
Author: Pali Rohár <pali@kernel.org>
Date:   Sat Sep 28 23:59:46 2024 +0200

    cifs: Do not convert delimiter when parsing NFS-style symlinks
    
    [ Upstream commit d3a49f60917323228f8fdeee313260ef14f94df7 ]
    
    NFS-style symlinks always have the target location stored in NFS/UNIX form
    where backslash means the real UNIX backslash and not the SMB path
    separator.
    
    So do not mangle slash and backslash content of an NFS-style symlink during
    the readlink() syscall as it is already in the correct Linux form.
    
    This fixes interoperability of NFS-style symlinks with backslashes created
    by a Linux NFS3 client through a Windows NFS server and retrieved by a Linux
    SMB client through a Windows SMB server, where both Windows servers export
    the same directory.
    
    Fixes: d5ecebc4900d ("smb3: Allow query of symlinks stored as reparse points")
    Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Pali Rohár <pali@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cifs: Fix buffer overflow when parsing NFS reparse points [+ + +]
Author: Pali Rohár <pali@kernel.org>
Date:   Sun Sep 29 12:22:40 2024 +0200

    cifs: Fix buffer overflow when parsing NFS reparse points
    
    [ Upstream commit e2a8910af01653c1c268984855629d71fb81f404 ]
    
    ReparseDataLength is the sum of the InodeType size and the DataBuffer size.
    So to get the DataBuffer size, the InodeType size must be subtracted from
    ReparseDataLength.
    
    Function cifs_strndup_from_utf16() is currently accessing buf->DataBuffer
    at a position after the end of the buffer because the InodeType size is not
    subtracted from the length. Fix this problem by correctly subtracting it
    when computing the variable len.
    
    Member InodeType is present only when the reparse buffer is large enough.
    Check ReparseDataLength before accessing InodeType to prevent another
    invalid memory access.
    
    Major and minor rdev values are also present only when the reparse buffer
    is large enough. Check the reparse buffer size before calling
    reparse_mkdev().
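    
    The length arithmetic can be sketched as follows. This is an illustration
    only: the helper names are hypothetical, and both the 64-bit width of
    InodeType and the layout of the rdev values as two 32-bit fields are
    assumptions of this sketch.
    
    ```c
    #include <stddef.h>
    #include <stdint.h>
    
    #define MODEL_INODE_TYPE_SIZE sizeof(uint64_t) /* assumed 64-bit InodeType */
    
    /*
     * Returns the DataBuffer size in bytes, or -1 when the buffer is too
     * short to contain InodeType at all.
     */
    static long nfs_data_buffer_len(uint16_t reparse_data_length)
    {
        if (reparse_data_length < MODEL_INODE_TYPE_SIZE)
            return -1;
        return (long)(reparse_data_length - MODEL_INODE_TYPE_SIZE);
    }
    
    /* Assumes major and minor are two 32-bit values at the start of DataBuffer. */
    static int nfs_has_rdev(uint16_t reparse_data_length)
    {
        return nfs_data_buffer_len(reparse_data_length) >=
               (long)(2 * sizeof(uint32_t));
    }
    ```
    
    Any consumer of DataBuffer would use the reduced length rather than
    ReparseDataLength itself.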
    
    Fixes: d5ecebc4900d ("smb3: Allow query of symlinks stored as reparse points")
    Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Pali Rohár <pali@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cifs: Remove intermediate object of failed create reparse call [+ + +]
Author: Pali Rohár <pali@kernel.org>
Date:   Mon Sep 30 22:25:10 2024 +0200

    cifs: Remove intermediate object of failed create reparse call
    
    [ Upstream commit c9432ad5e32f066875b1bf95939c363bc46d6a45 ]
    
    If CREATE was successful but SMB2_OP_SET_REPARSE failed then remove the
    intermediate object created by CREATE. Otherwise an empty object stays on
    the server when the reparse call fails.
    
    This ensures that if the creation of special files is unsupported by the
    server, no empty file stays on the server as a result of the unsupported
    operation.
    
    Fixes: 102466f303ff ("smb: client: allow creating special files via reparse points")
    Signed-off-by: Pali Rohár <pali@kernel.org>
    Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
clk: qcom: clk-alpha-pll: Fix CAL_L_VAL override for LUCID EVO PLL [+ + +]
Author: Ajit Pandey <quic_ajipan@quicinc.com>
Date:   Tue Jun 11 19:07:45 2024 +0530

    clk: qcom: clk-alpha-pll: Fix CAL_L_VAL override for LUCID EVO PLL
    
    commit fff617979f97c773aaa9432c31cf62444b3bdbd4 upstream.
    
    In LUCID EVO PLL CAL_L_VAL and L_VAL bitfields are part of single
    PLL_L_VAL register. Update for L_VAL bitfield values in PLL_L_VAL
    register using regmap_write() API in __alpha_pll_trion_set_rate
    callback will override LUCID EVO PLL initial configuration related
    to PLL_CAL_L_VAL bit fields in PLL_L_VAL register.
    
    Observed random PLL lock failures during PLL enable due to such
    override in PLL calibration value. Use regmap_update_bits() with
    L_VAL bitfield mask instead of regmap_write() API to update only
    PLL_L_VAL bitfields in __alpha_pll_trion_set_rate callback.
    
    Fixes: 260e36606a03 ("clk: qcom: clk-alpha-pll: add Lucid EVO PLL configuration interfaces")
    Cc: stable@vger.kernel.org
    Signed-off-by: Ajit Pandey <quic_ajipan@quicinc.com>
    Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Acked-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
    Link: https://lore.kernel.org/r/20240611133752.2192401-2-quic_ajipan@quicinc.com
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
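    
    The difference between a full register write and a masked update can
    be modelled in plain C. This is a sketch only: reg_write and
    reg_update_bits are hypothetical helpers that mimic the semantics of
    regmap_write() and regmap_update_bits() on a single 32-bit value, and
    the field positions (L_VAL in bits 15:0, CAL_L_VAL in bits 31:16) are
    assumed for illustration.

```c
#include <stdint.h>

/* Assumed field layout, for illustration only. */
#define L_VAL_MASK      0x0000ffffu
#define CAL_L_VAL_MASK  0xffff0000u

/* Full write: every bit of the register is replaced. */
static uint32_t reg_write(uint32_t old, uint32_t val)
{
	(void)old;
	return val;
}

/* Masked update: only the bits selected by mask change. */
static uint32_t reg_update_bits(uint32_t old, uint32_t mask, uint32_t val)
{
	return (old & ~mask) | (val & mask);
}
```

    With a register holding 0x00440020, writing a new L_VAL of 0x30 through
    reg_write() wipes the upper field, while reg_update_bits() with
    L_VAL_MASK keeps it.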

clk: qcom: clk-rpmh: Fix overflow in BCM vote [+ + +]
Author: Mike Tipton <quic_mdtipton@quicinc.com>
Date:   Fri Aug 9 10:51:29 2024 +0530

    clk: qcom: clk-rpmh: Fix overflow in BCM vote
    
    commit a4e5af27e6f6a8b0d14bc0d7eb04f4a6c7291586 upstream.
    
    Valid frequencies may result in BCM votes that exceed the max HW value.
    Set vote ceiling to BCM_TCS_CMD_VOTE_MASK to ensure the votes aren't
    truncated, which can result in lower frequencies than desired.
    
    Fixes: 04053f4d23a4 ("clk: qcom: clk-rpmh: Add IPA clock support")
    Cc: stable@vger.kernel.org
    Signed-off-by: Mike Tipton <quic_mdtipton@quicinc.com>
    Reviewed-by: Taniya Das <quic_tdas@quicinc.com>
    Signed-off-by: Imran Shaik <quic_imrashai@quicinc.com>
    Link: https://lore.kernel.org/r/20240809-clk-rpmh-bcm-vote-fix-v2-1-240c584b7ef9@quicinc.com
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
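    
    A small sketch of why a ceiling is needed, assuming a 14-bit vote
    field. VOTE_MASK stands in for BCM_TCS_CMD_VOTE_MASK and its value here
    is an assumption; vote_truncated and vote_clamped are hypothetical
    helpers, not driver functions.

```c
#include <stdint.h>

/* Assumed width of the hardware vote field. */
#define VOTE_MASK 0x3fffu

/* What happens when an oversized vote is simply masked into the field. */
static uint32_t vote_truncated(uint64_t cmd_state)
{
	return (uint32_t)(cmd_state & VOTE_MASK);
}

/* Clamp to the largest value the field can hold before encoding it. */
static uint32_t vote_clamped(uint64_t cmd_state)
{
	if (cmd_state > VOTE_MASK)
		cmd_state = VOTE_MASK;
	return (uint32_t)cmd_state;
}
```

    A vote of 0x4001 truncates to 0x0001, a far lower request than
    intended, whereas clamping yields the maximum 0x3fff.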

clk: qcom: dispcc-sm8250: use CLK_SET_RATE_PARENT for branch clocks [+ + +]
Author: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Date:   Sun Aug 4 08:40:05 2024 +0300

    clk: qcom: dispcc-sm8250: use CLK_SET_RATE_PARENT for branch clocks
    
    commit 0e93c6320ecde0583de09f3fe801ce8822886fec upstream.
    
    Add CLK_SET_RATE_PARENT for several branch clocks. Such clocks don't
    have a way to change the rate, so set the parent rate instead.
    
    Fixes: 80a18f4a8567 ("clk: qcom: Add display clock controller driver for SM8150 and SM8250")
    Cc: stable@vger.kernel.org
    Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Link: https://lore.kernel.org/r/20240804-sm8350-fixes-v1-1-1149dd8399fe@linaro.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
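    
    A toy model of rate propagation, to show the idea behind the flag.
    Everything here is hypothetical (struct clk, SET_RATE_PARENT, set_rate,
    get_rate are simplified stand-ins); the real common clock framework is
    considerably more involved and may behave differently in detail.

```c
#include <stddef.h>

#define SET_RATE_PARENT 0x1u /* stand-in for CLK_SET_RATE_PARENT */

struct clk {
	struct clk *parent;
	unsigned int flags;
	int can_set_rate;   /* 0 for a pure branch (gate) clock */
	unsigned long rate; /* only meaningful when can_set_rate != 0 */
};

/* A branch clock has no divider of its own: it reports its parent's rate. */
static unsigned long get_rate(const struct clk *c)
{
	if (!c->can_set_rate && c->parent)
		return get_rate(c->parent);
	return c->rate;
}

/* Forward the request to the parent only when the flag allows it. */
static int set_rate(struct clk *c, unsigned long rate)
{
	if (c->can_set_rate) {
		c->rate = rate;
		return 0;
	}
	if ((c->flags & SET_RATE_PARENT) && c->parent)
		return set_rate(c->parent, rate);
	return -1; /* in this toy model the request cannot be satisfied */
}
```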

clk: qcom: gcc-sc8180x: Add GPLL9 support [+ + +]
Author: Satya Priya Kakitapalli <quic_skakitap@quicinc.com>
Date:   Mon Aug 12 10:43:03 2024 +0530

    clk: qcom: gcc-sc8180x: Add GPLL9 support
    
    commit 818a2f8d5e4ad2c1e39a4290158fe8e39a744c70 upstream.
    
    Add the missing GPLL9 pll and fix the gcc_parents_7 data to use
    the correct pll hw.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: stable@vger.kernel.org
    Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Signed-off-by: Satya Priya Kakitapalli <quic_skakitap@quicinc.com>
    Link: https://lore.kernel.org/r/20240812-gcc-sc8180x-fixes-v2-3-8b3eaa5fb856@quicinc.com
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: qcom: gcc-sc8180x: Fix the sdcc2 and sdcc4 clocks freq table [+ + +]
Author: Satya Priya Kakitapalli <quic_skakitap@quicinc.com>
Date:   Mon Aug 12 10:43:04 2024 +0530

    clk: qcom: gcc-sc8180x: Fix the sdcc2 and sdcc4 clocks freq table
    
    commit b8acaf2de8081371761ab4cf1e7a8ee4e7acc139 upstream.
    
    Update the frequency tables of gcc_sdcc2_apps_clk and gcc_sdcc4_apps_clk
    as per the latest frequency plan.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: stable@vger.kernel.org
    Signed-off-by: Satya Priya Kakitapalli <quic_skakitap@quicinc.com>
    Link: https://lore.kernel.org/r/20240812-gcc-sc8180x-fixes-v2-4-8b3eaa5fb856@quicinc.com
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: qcom: gcc-sc8180x: Register QUPv3 RCGs for DFS on sc8180x [+ + +]
Author: Satya Priya Kakitapalli <quic_skakitap@quicinc.com>
Date:   Mon Aug 12 10:43:01 2024 +0530

    clk: qcom: gcc-sc8180x: Register QUPv3 RCGs for DFS on sc8180x
    
    commit 1fc8c02e1d80463ce1b361d82b83fc43bb92d964 upstream.
    
    QUPv3 clocks support DFS on the sc8180x platform, but the code
    changes for it are currently missing from the driver. As a result, not
    all DFS-supported frequencies are populated and an incorrect
    frequency is returned when clients request them. Hence add the DFS
    registration for QUPv3 RCGs.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: stable@vger.kernel.org
    Signed-off-by: Satya Priya Kakitapalli <quic_skakitap@quicinc.com>
    Link: https://lore.kernel.org/r/20240812-gcc-sc8180x-fixes-v2-1-8b3eaa5fb856@quicinc.com
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: qcom: gcc-sm8150: De-register gcc_cpuss_ahb_clk_src [+ + +]
Author: Satya Priya Kakitapalli <quic_skakitap@quicinc.com>
Date:   Mon Aug 12 10:43:05 2024 +0530

    clk: qcom: gcc-sm8150: De-register gcc_cpuss_ahb_clk_src
    
    commit bab0c7a0bc586e736b7cd2aac8e6391709a70ef2 upstream.
    
    The branch clocks of gcc_cpuss_ahb_clk_src are marked critical
    and hence these clocks vote on XO, blocking the suspend.
    De-register these clocks and their source as there is no rate
    setting happening on them.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: stable@vger.kernel.org
    Signed-off-by: Satya Priya Kakitapalli <quic_skakitap@quicinc.com>
    Link: https://lore.kernel.org/r/20240812-gcc-sc8180x-fixes-v2-5-8b3eaa5fb856@quicinc.com
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: qcom: gcc-sm8250: Do not turn off PCIe GDSCs during gdsc_disable() [+ + +]
Author: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Date:   Fri Jul 19 19:12:38 2024 +0530

    clk: qcom: gcc-sm8250: Do not turn off PCIe GDSCs during gdsc_disable()
    
    commit ade508b545c969c72cd68479f275a5dd640fd8b9 upstream.
    
    With PWRSTS_OFF_ON, PCIe GDSCs are turned off during gdsc_disable(). This
    can happen during scenarios such as system suspend and breaks the resume
    of PCIe controllers from suspend.
    
    So use PWRSTS_RET_ON to indicate the GDSC driver to not turn off the GDSCs
    during gdsc_disable() and allow the hardware to transition the GDSCs to
    retention when the parent domain enters low power state during system
    suspend.
    
    Cc: stable@vger.kernel.org # 5.7
    Fixes: 3e5770921a88 ("clk: qcom: gcc: Add global clock controller driver for SM8250")
    Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Link: https://lore.kernel.org/r/20240719134238.312191-1-manivannan.sadhasivam@linaro.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: qcom: gcc-sm8450: Do not turn off PCIe GDSCs during gdsc_disable() [+ + +]
Author: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Date:   Mon Jul 22 16:27:33 2024 +0530

    clk: qcom: gcc-sm8450: Do not turn off PCIe GDSCs during gdsc_disable()
    
    commit 889e1332310656961855c0dcedbb4dbe78e39d22 upstream.
    
    With PWRSTS_OFF_ON, PCIe GDSCs are turned off during gdsc_disable(). This
    can happen during scenarios such as system suspend and breaks the resume
    of PCIe controllers from suspend.
    
    So use PWRSTS_RET_ON to indicate the GDSC driver to not turn off the GDSCs
    during gdsc_disable() and allow the hardware to transition the GDSCs to
    retention when the parent domain enters low power state during system
    suspend.
    
    Cc: stable@vger.kernel.org # 5.17
    Fixes: db0c944ee92b ("clk: qcom: Add clock driver for SM8450")
    Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Link: https://lore.kernel.org/r/20240722105733.13040-1-manivannan.sadhasivam@linaro.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: rockchip: fix error for unknown clocks [+ + +]
Author: Sebastian Reichel <sebastian.reichel@collabora.com>
Date:   Mon Mar 25 20:33:36 2024 +0100

    clk: rockchip: fix error for unknown clocks
    
    commit 12fd64babaca4dc09d072f63eda76ba44119816a upstream.
    
    There is a clk == NULL check after the switch to check for
    unsupported clk types. Since clk is re-assigned in a loop,
    this check is useless right now for anything but the first
    round. Let's fix this up by assigning clk = NULL in the
    loop before the switch statement.
    
    Fixes: a245fecbb806 ("clk: rockchip: add basic infrastructure for clock branches")
    Cc: stable@vger.kernel.org
    Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
    [added fixes + stable-cc]
    Link: https://lore.kernel.org/r/20240325193609.237182-6-sebastian.reichel@collabora.com
    Signed-off-by: Heiko Stuebner <heiko@sntech.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
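    
    The stale-pointer pattern described in this commit can be reproduced
    in a few lines of userspace C. The names (branch_type,
    count_unsupported, dummy_clk) are hypothetical; the sketch only mirrors
    the shape of the loop, not the rockchip driver itself.

```c
#include <stddef.h>

enum branch_type { BRANCH_MUX, BRANCH_GATE, BRANCH_UNKNOWN };

static int dummy_clk; /* stands in for a successfully registered clock */

/* Count the branches that the post-switch NULL check flags as unsupported. */
static int count_unsupported(const enum branch_type *list, size_t n,
			     int reset_each_round)
{
	int *clk = NULL;
	int unsupported = 0;

	for (size_t i = 0; i < n; i++) {
		if (reset_each_round)
			clk = NULL; /* the fix: forget the previous round's result */

		switch (list[i]) {
		case BRANCH_MUX:
		case BRANCH_GATE:
			clk = &dummy_clk;
			break;
		default:
			break; /* unknown type: clk is left untouched */
		}

		if (clk == NULL)
			unsupported++;
	}
	return unsupported;
}
```

    Without the reset, an unknown entry that follows a valid one goes
    unnoticed because clk still points at the earlier clock.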

clk: samsung: exynos7885: Update CLKS_NR_FSYS after bindings fix [+ + +]
Author: David Virag <virag.david003@gmail.com>
Date:   Tue Aug 6 14:11:47 2024 +0200

    clk: samsung: exynos7885: Update CLKS_NR_FSYS after bindings fix
    
    commit 217a5f23c290c349ceaa37a6f2c014ad4c2d5759 upstream.
    
    Update CLKS_NR_FSYS to the proper value after a fix in DT bindings.
    This should always be the last clock in a CMU + 1.
    
    Fixes: cd268e309c29 ("dt-bindings: clock: Add bindings for Exynos7885 CMU_FSYS")
    Cc: stable@vger.kernel.org
    Signed-off-by: David Virag <virag.david003@gmail.com>
    Link: https://lore.kernel.org/r/20240806121157.479212-5-virag.david003@gmail.com
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
close_range(): fix the logics in descriptor table trimming [+ + +]
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Fri Aug 16 15:17:00 2024 -0400

    close_range(): fix the logics in descriptor table trimming
    
    commit 678379e1d4f7443b170939525d3312cfc37bf86b upstream.
    
    Cloning a descriptor table picks the size that would cover all currently
    opened files.  That's fine for clone() and unshare(), but for close_range()
    there's an additional twist - we clone before we close, and it would be
    a shame to have
            close_range(3, ~0U, CLOSE_RANGE_UNSHARE)
    leave us with a huge descriptor table when we are not going to keep
    anything past stderr, just because some large file descriptor used to
    be open before our call has taken it out.
    
    Unfortunately, it had been dealt with in an inherently racy way -
    sane_fdtable_size() gets a "don't copy anything past that" argument
    (passed via unshare_fd() and dup_fd()), close_range() decides how much
    should be trimmed and passes that to unshare_fd().
    
    The problem is, a range that used to extend to the end of descriptor
    table back when close_range() had looked at it might very well have stuff
    grown after it by the time dup_fd() has allocated a new files_struct
    and started to figure out the capacity of fdtable to be attached to that.
    
    That leads to interesting pathological cases; at the very least it's a
    QoI issue, since unshare(CLONE_FILES) is atomic in a sense that it takes
    a snapshot of descriptor table one might have observed at some point.
    Since CLOSE_RANGE_UNSHARE close_range() is supposed to be a combination
    of unshare(CLONE_FILES) with plain close_range(), ending up with a
    weird state that would never occur with unshare(2) is confusing, to put
    it mildly.
    
    It's not hard to get rid of - all it takes is passing both ends of the
    range down to sane_fdtable_size().  There we are under ->files_lock,
    so the race is trivially avoided.
    
    So we do the following:
            * switch close_files() from calling unshare_fd() to calling
              dup_fd().
            * undo the calling convention change done to unshare_fd() in
              60997c3d45d9 "close_range: add CLOSE_RANGE_UNSHARE"
            * introduce struct fd_range, pass a pointer to that to dup_fd()
              and sane_fdtable_size() instead of "trim everything past that
              point" they are currently getting.  NULL means "we are not
              going to be punching any holes"; NR_OPEN_MAX is gone.
            * make sane_fdtable_size() use find_last_bit() instead of
              open-coding it; it's easier to follow that way.
            * while we are at it, have dup_fd() report errors by returning
              ERR_PTR(), no need to use a separate int *errorp argument.
    
    Fixes: 60997c3d45d9 "close_range: add CLOSE_RANGE_UNSHARE"
    Cc: stable@vger.kernel.org
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
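    
    The sizing decision with both ends of the range available can be
    sketched in userspace C. This is a simplified model, not the kernel
    function: it uses a plain bool array instead of the fdtable bitmaps,
    takes no lock, and returns the raw slot count. last_open and
    slots_needed are hypothetical names; struct fd_range follows the
    description in this commit.

```c
#include <stdbool.h>
#include <stddef.h>

struct fd_range {
	unsigned int from, to;
};

/* Index of the highest open slot below limit, or limit if none is open. */
static unsigned int last_open(const bool *open_fds, unsigned int limit)
{
	for (unsigned int i = limit; i > 0; i--)
		if (open_fds[i - 1])
			return i - 1;
	return limit;
}

/*
 * Slots the copied table needs: highest surviving descriptor + 1, or 0 if
 * nothing survives. punch_hole == NULL means no descriptors will be closed.
 */
static unsigned int slots_needed(const bool *open_fds, unsigned int max_fds,
				 const struct fd_range *punch_hole)
{
	unsigned int last = last_open(open_fds, max_fds);

	if (last == max_fds)
		return 0;
	/* Only trim when the hole covers the current highest descriptor. */
	if (punch_hole && punch_hole->from <= last && punch_hole->to >= last) {
		last = last_open(open_fds, punch_hole->from);
		if (last == punch_hole->from)
			return 0;
	}
	return last + 1;
}
```

    With descriptors 0, 1, 2 and 50 open, a hole of [3, ~0U] shrinks the
    requirement from 51 slots to 3, while a hole of [10, 20] leaves it at
    51 because descriptor 50 survives.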

 
cpufreq: amd-pstate: add check for cpufreq_cpu_get's return value [+ + +]
Author: Anastasia Belova <abelova@astralinux.ru>
Date:   Mon Aug 26 16:38:41 2024 +0300

    cpufreq: amd-pstate: add check for cpufreq_cpu_get's return value
    
    [ Upstream commit 5493f9714e4cdaf0ee7cec15899a231400cb1a9f ]
    
    cpufreq_cpu_get may return NULL. To avoid NULL-dereference check it
    and return in case of error.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
    Reviewed-by: Perry Yuan <perry.yuan@amd.com>
    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cpufreq: Avoid a bad reference count on CPU node [+ + +]
Author: Miquel Sabaté Solà <mikisabate@gmail.com>
Date:   Tue Sep 17 15:42:46 2024 +0200

    cpufreq: Avoid a bad reference count on CPU node
    
    commit c0f02536fffbbec71aced36d52a765f8c4493dc2 upstream.
    
    In the parse_perf_domain function, if the call to
    of_parse_phandle_with_args returns an error, then the reference to the
    CPU device node that was acquired at the start of the function would not
    be properly decremented.
    
    Address this by declaring the variable with the __free(device_node)
    cleanup attribute.
    
    Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
    Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
    Link: https://patch.msgid.link/20240917134246.584026-1-mikisabate@gmail.com
    Cc: All applicable <stable@vger.kernel.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
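    
    The scope-based cleanup idea can be tried in userspace with the
    compiler's cleanup attribute (GCC and Clang). The sketch below is an
    analogy only: struct node, node_put, auto_put_node and parse are
    hypothetical names, not the kernel's device_node helpers.

```c
#include <stddef.h>

struct node {
	int refcount;
};

static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/* Cleanup handlers receive a pointer to the annotated variable. */
static void node_put_cleanup(struct node **np)
{
	node_put(*np);
}

#define auto_put_node __attribute__((cleanup(node_put_cleanup)))

static int parse(struct node *cpu, int fail)
{
	/* The reference is dropped automatically when np goes out of scope. */
	struct node *np auto_put_node = cpu;

	np->refcount++; /* take the reference */
	if (fail)
		return -1; /* early return: the reference is still dropped */
	return 0;
}
```

    Both the error path and the success path leave the reference count
    where it started, without an explicit put on each return.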

cpufreq: intel_pstate: Make hwp_notify_lock a raw spinlock [+ + +]
Author: Uwe Kleine-König <ukleinek@debian.org>
Date:   Thu Sep 19 10:11:21 2024 +0200

    cpufreq: intel_pstate: Make hwp_notify_lock a raw spinlock
    
    commit 8b4865cd904650cbed7f2407e653934c621b8127 upstream.
    
    notify_hwp_interrupt() is called via sysvec_thermal() ->
    smp_thermal_vector() -> intel_thermal_interrupt() in hard irq context.
    For this reason it must not use a simple spin_lock that sleeps with
    PREEMPT_RT enabled. So convert it to a raw spinlock.
    
    Reported-by: xiao sheng wen <atzlinux@sina.com>
    Link: https://bugs.debian.org/1076483
    Signed-off-by: Uwe Kleine-König <ukleinek@debian.org>
    Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Tested-by: xiao sheng wen <atzlinux@sina.com>
    Link: https://patch.msgid.link/20240919081121.10784-2-ukleinek@debian.org
    Cc: All applicable <stable@vger.kernel.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cpufreq: loongson3: Use raw_smp_processor_id() in do_service_request() [+ + +]
Author: Huacai Chen <chenhuacai@kernel.org>
Date:   Wed Aug 28 14:24:59 2024 +0800

    cpufreq: loongson3: Use raw_smp_processor_id() in do_service_request()
    
    [ Upstream commit 2b7ec33e534f7a10033a5cf07794acf48b182bbe ]
    
    Use raw_smp_processor_id() instead of plain smp_processor_id() in
    do_service_request(), otherwise we may get some errors with the driver
    enabled:
    
     BUG: using smp_processor_id() in preemptible [00000000] code: (udev-worker)/208
     caller is loongson3_cpufreq_probe+0x5c/0x250 [loongson3_cpufreq]
    
    Reported-by: Xi Ruoyao <xry111@xry111.site>
    Tested-by: Binbin Zhou <zhoubinbin@loongson.cn>
    Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
crypto: hisilicon - fix missed error branch [+ + +]
Author: Yang Shen <shenyang39@huawei.com>
Date:   Sat Aug 31 17:50:07 2024 +0800

    crypto: hisilicon - fix missed error branch
    
    [ Upstream commit f386dc64e1a5d3dcb84579119ec350ab026fea88 ]
    
    If an error occurs in the process after the SGL is mapped
    successfully, the SGL needs to be unmapped.
    
    Otherwise, memory problems may occur.
    
    Signed-off-by: Yang Shen <shenyang39@huawei.com>
    Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
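    
    The usual shape of such a fix is a goto-based unwind, sketched here in
    userspace C. map_sgl, unmap_sgl and do_request are hypothetical
    stand-ins that only count live mappings; they are not the hisilicon
    driver functions.

```c
static int mapped; /* number of live mappings, for demonstration */

static int map_sgl(void)
{
	mapped++;
	return 0;
}

static void unmap_sgl(void)
{
	mapped--;
}

static int do_request(int later_step_fails)
{
	int ret = map_sgl();

	if (ret)
		return ret;

	if (later_step_fails) {
		ret = -1;
		goto err_unmap; /* undo the mapping before reporting the error */
	}
	return 0; /* success: the mapping stays live until completion */

err_unmap:
	unmap_sgl();
	return ret;
}
```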

crypto: octeontx - Fix authenc setkey [+ + +]
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Sat Aug 17 12:13:23 2024 +0800

    crypto: octeontx - Fix authenc setkey
    
    [ Upstream commit 311eea7e37c4c0b44b557d0c100860a03b4eab65 ]
    
    Use the generic crypto_authenc_extractkeys helper instead of custom
    parsing code that is slightly broken.  Also fix a number of memory
    leaks by moving memory allocation from setkey to init_tfm (setkey
    can be called multiple times over the life of a tfm).
    
    Finally accept all hash key lengths by running the digest over
    extra-long keys.
    
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: octeontx* - Select CRYPTO_AUTHENC [+ + +]
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Thu Sep 5 10:21:49 2024 +0800

    crypto: octeontx* - Select CRYPTO_AUTHENC
    
    commit c398cb8eb0a263a1b7a18892d9f244751689675c upstream.
    
    Select CRYPTO_AUTHENC as the function crypto_authenc_extractkeys
    may not be available without it.
    
    Fixes: 311eea7e37c4 ("crypto: octeontx - Fix authenc setkey")
    Fixes: 7ccb750dcac8 ("crypto: octeontx2 - Fix authenc setkey")
    Reported-by: kernel test robot <lkp@intel.com>
    Closes: https://lore.kernel.org/oe-kbuild-all/202409042013.gT2ZI4wR-lkp@intel.com/
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

crypto: octeontx2 - Fix authenc setkey [+ + +]
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Sat Aug 17 12:36:19 2024 +0800

    crypto: octeontx2 - Fix authenc setkey
    
    [ Upstream commit 7ccb750dcac8abbfc7743aab0db6a72c1c3703c7 ]
    
    Use the generic crypto_authenc_extractkeys helper instead of custom
    parsing code that is slightly broken.  Also fix a number of memory
    leaks by moving memory allocation from setkey to init_tfm (setkey
    can be called multiple times over the life of a tfm).
    
    Finally accept all hash key lengths by running the digest over
    extra-long keys.
    
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: simd - Do not call crypto_alloc_tfm during registration [+ + +]
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Sat Aug 17 14:58:35 2024 +0800

    crypto: simd - Do not call crypto_alloc_tfm during registration
    
    [ Upstream commit 3c44d31cb34ce4eb8311a2e73634d57702948230 ]
    
    Algorithm registration is usually carried out during module init,
    where as little work as possible should be carried out.  The SIMD
    code violated this rule by allocating a tfm, this then triggers a
    full test of the algorithm which may dead-lock in certain cases.
    
    SIMD is only allocating the tfm to get at the alg object, which is
    in fact already available as it is what we are registering.  Use
    that directly and remove the crypto_alloc_tfm call.
    
    Also remove some obsolete and unused SIMD API.
    
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: x86/sha256 - Add parentheses around macros' single arguments [+ + +]
Author: Fangrui Song <i@maskray.me>
Date:   Tue Aug 13 21:48:02 2024 -0700

    crypto: x86/sha256 - Add parentheses around macros' single arguments
    
    [ Upstream commit 3363c460ef726ba693704dbcd73b7e7214ccc788 ]
    
    The macros FOUR_ROUNDS_AND_SCHED and DO_4ROUNDS rely on an
    unexpected/undocumented behavior of the GNU assembler, which might
    change in the future
    (https://sourceware.org/bugzilla/show_bug.cgi?id=32073).
    
        M (1) (2) // 1 arg !? Future: 2 args
        M 1 + 2   // 1 arg !? Future: 3 args
    
        M 1 2     // 2 args
    
    Add parentheses around the single arguments to support future GNU
    assembler and LLVM integrated assembler (when the IsOperator hack from
    the following link is dropped).
    
    Link: https://github.com/llvm/llvm-project/commit/055006475e22014b28a070db1bff41ca15f322f0
    Signed-off-by: Fangrui Song <maskray@google.com>
    Reviewed-by: Jan Beulich <jbeulich@suse.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drivers/perf: arm_spe: Use perf_allow_kernel() for permissions [+ + +]
Author: James Clark <james.clark@linaro.org>
Date:   Tue Aug 27 15:51:12 2024 +0100

    drivers/perf: arm_spe: Use perf_allow_kernel() for permissions
    
    [ Upstream commit 5e9629d0ae977d6f6916d7e519724804e95f0b07 ]
    
    Use perf_allow_kernel() for 'pa_enable' (physical addresses),
    'pct_enable' (physical timestamps) and context IDs. This means that
    perf_event_paranoid is now taken into account and LSM hooks can be used,
    which is more consistent with other perf_event_open calls. For example
    PERF_SAMPLE_PHYS_ADDR uses perf_allow_kernel() rather than just
    perfmon_capable().
    
    This also indirectly fixes the following error message which is
    misleading because perf_event_paranoid is not taken into account by
    perfmon_capable():
    
      $ perf record -e arm_spe/pa_enable/
    
      Error:
      Access to performance monitoring and observability operations is
      limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid
      setting ...
    
    Suggested-by: Al Grant <al.grant@arm.com>
    Signed-off-by: James Clark <james.clark@linaro.org>
    Link: https://lore.kernel.org/r/20240827145113.1224604-1-james.clark@linaro.org
    Link: https://lore.kernel.org/all/20240807120039.GD37996@noisy.programming.kicks-ass.net/
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drivers/perf: riscv: Align errno for unsupported perf event [+ + +]
Author: Pu Lehui <pulehui@huawei.com>
Date:   Sat Aug 31 07:15:20 2024 +0000

    drivers/perf: riscv: Align errno for unsupported perf event
    
    commit c625154993d0d24a962b1830cd5ed92adda2cf86 upstream.
    
    The RISC-V perf driver does not yet support PERF_TYPE_BREAKPOINT. It would
    be more appropriate to return -EOPNOTSUPP or -ENOENT for this type in
    pmu_sbi_event_map. Considering that other implementations return -ENOENT
    for unsupported perf types, let's synchronize this behavior. The current
    errno causes the riscv bpf testcase perf_skip to fail. While at it, align
    the other places that handle unsupported events in the same way.
    
    Signed-off-by: Pu Lehui <pulehui@huawei.com>
    Reviewed-by: Atish Patra <atishp@rivosinc.com>
    Fixes: 9b3e150e310e ("RISC-V: Add a simple platform driver for RISC-V legacy perf")
    Fixes: 16d3b1af0944 ("perf: RISC-V: Check standard event availability")
    Fixes: e9991434596f ("RISC-V: Add perf platform driver based on SBI PMU extension")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240831071520.1630360-1-pulehui@huaweicloud.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/amd/display: Add HDR workaround for specific eDP [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Fri Sep 6 11:39:18 2024 -0600

    drm/amd/display: Add HDR workaround for specific eDP
    
    commit 05af800704ee7187d9edd461ec90f3679b1c4aba upstream.
    
    [WHY & HOW]
    Some eDP panels suffer from flickering when HDR is enabled in KDE. This
    quirk works around it by skipping VSC that is incompatible with eDP
    panels.
    
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3151
    Cc: Mario Limonciello <mario.limonciello@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit 4d4257280d7957727998ef90ccc7b69c7cca8376)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: Add null check for 'afb' in amdgpu_dm_plane_handle_cursor_update (v2) [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Fri Aug 2 12:35:13 2024 +0530

    drm/amd/display: Add null check for 'afb' in amdgpu_dm_plane_handle_cursor_update (v2)
    
    [ Upstream commit cd9e9e0852d501f169aa3bb34e4b413d2eb48c37 ]
    
    This commit adds a null check for the 'afb' variable in the
    amdgpu_dm_plane_handle_cursor_update function. Previously, 'afb' was
    assumed to be possibly null, but was used later in the code without a
    null check. This could potentially lead to a null pointer dereference.
    
    Changes since v1:
    - Moved the null check for 'afb' to the line where 'afb' is used. (Alex)
    
    Fixes the below:
    drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_plane.c:1298 amdgpu_dm_plane_handle_cursor_update() error: we previously assumed 'afb' could be null (see line 1252)
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Co-developed-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Add null check for 'afb' in amdgpu_dm_update_cursor (v2) [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Fri Aug 2 12:20:36 2024 +0530

    drm/amd/display: Add null check for 'afb' in amdgpu_dm_update_cursor (v2)
    
    [ Upstream commit 0fe20258b4989b9112b5e9470df33a0939403fd4 ]
    
    This commit adds a null check for the 'afb' variable in the
    amdgpu_dm_update_cursor function. Previously, 'afb' was assumed to be
    possibly null at line 8388, but was used later in the code without a
    null check. This could potentially lead to a null pointer dereference.
    
    Changes since v1:
    - Moved the null check for 'afb' to the line where 'afb' is used. (Alex)
    
    Fixes the below:
    drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:8433 amdgpu_dm_update_cursor()
            error: we previously assumed 'afb' could be null (see line 8388)
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Co-developed-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn30_init_hw [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Mon Jul 22 16:21:19 2024 +0530

    drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn30_init_hw
    
    [ Upstream commit cba7fec864172dadd953daefdd26e01742b71a6a ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn30_init_hw` function. The issue could occur when `dc->clk_mgr` or
    `dc->clk_mgr->funcs` is null.
    
    The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` are
    not null before accessing its functions. This prevents a potential null
    pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:789 dcn30_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 628)
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn401_init_hw [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Mon Jul 22 16:58:32 2024 +0530

    drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn401_init_hw
    
    [ Upstream commit 4b6377f0e96085cbec96eb7f0b282430ccdd3d75 ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn401_init_hw` function. The issue could occur when `dc->clk_mgr` or
    `dc->clk_mgr->funcs` is null.
    
    The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` are
    not null before calling through them. This prevents a potential null
    pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn401/dcn401_hwseq.c:416 dcn401_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 225)
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Add NULL check for clk_mgr in dcn32_init_hw [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Mon Jul 22 16:44:40 2024 +0530

    drm/amd/display: Add NULL check for clk_mgr in dcn32_init_hw
    
    [ Upstream commit c395fd47d1565bd67671f45cca281b3acc2c31ef ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn32_init_hw` function. The issue could occur when `dc->clk_mgr` is
    null.
    
    The fix adds a check to ensure `dc->clk_mgr` is not null before
    accessing its functions. This prevents a potential null pointer
    dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn32/dcn32_hwseq.c:961 dcn32_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 782)
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Add NULL check for function pointer in dcn20_set_output_transfer_func [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Wed Jul 31 13:09:28 2024 +0530

    drm/amd/display: Add NULL check for function pointer in dcn20_set_output_transfer_func
    
    [ Upstream commit 62ed6f0f198da04e884062264df308277628004f ]
    
    This commit adds a null check for the set_output_gamma function pointer
    in the dcn20_set_output_transfer_func function. Previously,
    set_output_gamma was being checked for null at line 1030, but then it
    was being dereferenced without any null check at line 1048. This could
    potentially lead to a null pointer dereference error if set_output_gamma
    is null.
    
    To fix this, we now ensure that set_output_gamma is not null before
    dereferencing it. We do this by adding a null check for set_output_gamma
    before the call to set_output_gamma at line 1048.
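
    The pattern can be sketched as follows: an optional hook is tested at
    one use site, so every later use site needs the same test. The hook
    table, its simplified signature and set_output_transfer_func_sketch()
    are hypothetical and only loosely modelled on the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified hook table; only the optional hook is modelled. */
struct mpc_funcs {
	void (*set_output_gamma)(int params, int *programmed);
};

/* Hypothetical hook implementation: records the parameters it was given. */
static void demo_set_gamma(int params, int *programmed)
{
	*programmed = params;
}

/* Returns 1 if the hook was invoked, 0 if it is not implemented. */
static int set_output_transfer_func_sketch(const struct mpc_funcs *funcs,
					   int mpcc_id, int *programmed)
{
	int params = 0;

	/* First use site: this test existed before the fix. */
	if (funcs->set_output_gamma)
		params = mpcc_id;

	/* Second use site: the call is now guarded as well. */
	if (funcs->set_output_gamma) {
		funcs->set_output_gamma(params, programmed);
		return 1;
	}
	return 0;
}
```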
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Add NULL check for function pointer in dcn32_set_output_transfer_func [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Wed Jul 31 13:15:00 2024 +0530

    drm/amd/display: Add NULL check for function pointer in dcn32_set_output_transfer_func
    
    [ Upstream commit 28574b08c70e56d34d6f6379326a860b96749051 ]
    
    This commit adds a null check for the set_output_gamma function pointer
    in the dcn32_set_output_transfer_func function. Previously,
    set_output_gamma was being checked for null, but then it was being
    dereferenced without any null check. This could lead to a null pointer
    dereference if set_output_gamma is null.
    
    To fix this, we now ensure that set_output_gamma is not null before
    dereferencing it. We do this by adding a null check for set_output_gamma
    before the call to set_output_gamma.
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Add NULL check for function pointer in dcn401_set_output_transfer_func [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Wed Jul 31 13:22:06 2024 +0530

    drm/amd/display: Add NULL check for function pointer in dcn401_set_output_transfer_func
    
    [ Upstream commit dd340acd42c24a3f28dd22fae6bf38662334264c ]
    
    This commit adds a null check for the set_output_gamma function pointer
    in the dcn401_set_output_transfer_func function. Previously,
    set_output_gamma was being checked for null, but then it was being
    dereferenced without any null check. This could lead to a null pointer
    dereference if set_output_gamma is null.
    
    To fix this, we now ensure that set_output_gamma is not null before
    dereferencing it. We do this by adding a null check for set_output_gamma
    before the call to set_output_gamma.
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Add null check for head_pipe in dcn201_acquire_free_pipe_for_layer [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Sun Jul 21 19:18:58 2024 +0530

    drm/amd/display: Add null check for head_pipe in dcn201_acquire_free_pipe_for_layer
    
    [ Upstream commit f22f4754aaa47d8c59f166ba3042182859e5dff7 ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn201_acquire_free_pipe_for_layer` function. The issue could occur
    when `head_pipe` is null.
    
    The fix adds a check to ensure `head_pipe` is not null before asserting
    it. If `head_pipe` is null, the function returns NULL to prevent a
    potential null pointer dereference.
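
    A minimal sketch of the early return, assuming a lookup helper that can
    fail. The pipe_ctx layout, find_head_pipe() and
    acquire_free_pipe_sketch() are hypothetical simplifications, not the
    driver's resource code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal pipe context. */
struct pipe_ctx {
	int stream_id;
	int in_use;
};

/* Hypothetical lookup that returns NULL when no pipe drives the stream. */
static struct pipe_ctx *find_head_pipe(struct pipe_ctx *pipes, size_t n,
				       int stream_id)
{
	for (size_t i = 0; i < n; i++)
		if (pipes[i].in_use && pipes[i].stream_id == stream_id)
			return &pipes[i];
	return NULL;
}

/* Returns an idle pipe bound to the stream, or NULL if none can be used. */
static struct pipe_ctx *acquire_free_pipe_sketch(struct pipe_ctx *pipes,
						 size_t n, int stream_id)
{
	struct pipe_ctx *head_pipe = find_head_pipe(pipes, n, stream_id);

	if (!head_pipe)
		return NULL;	/* bail out instead of dereferencing below */

	for (size_t i = 0; i < n; i++) {
		if (!pipes[i].in_use) {
			pipes[i].in_use = 1;
			pipes[i].stream_id = head_pipe->stream_id;
			return &pipes[i];
		}
	}
	return NULL;
}
```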
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn201/dcn201_resource.c:1016 dcn201_acquire_free_pipe_for_layer() error: we previously assumed 'head_pipe' could be null (see line 1010)
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Add null check for head_pipe in dcn32_acquire_idle_pipe_for_head_pipe_in_layer [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Sun Jul 21 19:30:16 2024 +0530

    drm/amd/display: Add null check for head_pipe in dcn32_acquire_idle_pipe_for_head_pipe_in_layer
    
    [ Upstream commit ac2140449184a26eac99585b7f69814bd3ba8f2d ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn32_acquire_idle_pipe_for_head_pipe_in_layer` function. The issue
    could occur when `head_pipe` is null.
    
    The fix adds a check to ensure `head_pipe` is not null before asserting
    it. If `head_pipe` is null, the function returns NULL to prevent a
    potential null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn32/dcn32_resource.c:2690 dcn32_acquire_idle_pipe_for_head_pipe_in_layer() error: we previously assumed 'head_pipe' could be null (see line 2681)
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Add null check for pipe_ctx->plane_state in dcn20_program_pipe [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Thu Jul 25 08:14:56 2024 +0530

    drm/amd/display: Add null check for pipe_ctx->plane_state in dcn20_program_pipe
    
    [ Upstream commit 8e4ed3cf1642df0c4456443d865cff61a9598aa8 ]
    
    This commit addresses a null pointer dereference issue in the
    `dcn20_program_pipe` function. The issue could occur when
    `pipe_ctx->plane_state` is null.
    
    The fix adds a check to ensure `pipe_ctx->plane_state` is not null
    before accessing it. This prevents a null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn20/dcn20_hwseq.c:1925 dcn20_program_pipe() error: we previously assumed 'pipe_ctx->plane_state' could be null (see line 1877)
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Add null check for top_pipe_to_program in commit_planes_for_stream [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Thu Jul 25 07:23:48 2024 +0530

    drm/amd/display: Add null check for top_pipe_to_program in commit_planes_for_stream
    
    [ Upstream commit 66d71a72539e173a9b00ca0b1852cbaa5f5bf1ad ]
    
    This commit addresses a null pointer dereference issue in the
    `commit_planes_for_stream` function at line 4140. The issue could occur
    when `top_pipe_to_program` is null.
    
    The fix adds a check to ensure `top_pipe_to_program` is not null before
    accessing its stream_res. This prevents a null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:4140 commit_planes_for_stream() error: we previously assumed 'top_pipe_to_program' could be null (see line 3906)
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Allow backlight to go below `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT` [+ + +]
Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Sun Sep 15 14:28:37 2024 -0500

    drm/amd/display: Allow backlight to go below `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT`
    
    [ Upstream commit 87d749a6aab73d8069d0345afaa98297816cb220 ]
    
    The issue with panel power savings compatibility below
    `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT` happens at
    `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT` as well.
    
    That issue will be fixed separately, so don't prevent the backlight
    brightness from going that low.
    
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Thomas Weißschuh <linux@weissschuh.net>
    Link: https://lore.kernel.org/amd-gfx/be04226a-a9e3-4a45-a83b-6d263c6557d8@t-8ch.de/T/#m400dee4e2fc61fe9470334d20a7c8c89c9aef44f
    Reviewed-by: Harry Wentland <harry.wentland@amd.com>
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Avoid overflow assignment in link_dp_cts [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Wed Jul 17 09:17:56 2024 -0600

    drm/amd/display: Avoid overflow assignment in link_dp_cts
    
    [ Upstream commit a15268787b79fd183dd526cc16bec9af4f4e49a1 ]
    
    sampling_rate is a uint8_t but is assigned an unsigned int, and thus it
    can overflow. As a result, sampling_rate is changed to uint32_t.
    
    Similarly, LINK_QUAL_PATTERN_SET has a size of 2 bits, and it should
    only be assigned a value less than or equal to 4.
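
    The narrowing problem can be shown in isolation. The helpers below are
    hypothetical and only demonstrate how C converts an unsigned int to
    uint8_t (the value is reduced modulo 256), and that widening the
    destination to uint32_t preserves it on platforms where unsigned int
    is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Narrow destination: values above 255 are reduced modulo 256. */
static uint8_t store_narrow(unsigned int rate)
{
	return (uint8_t)rate;
}

/* Widened destination, as in the fix: the value is preserved. */
static uint32_t store_wide(unsigned int rate)
{
	return (uint32_t)rate;
}
```

    For example, a rate of 48000 stored through store_narrow() comes back
    as 128, while store_wide() returns 48000 unchanged.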
    
    This fixes 2 INTEGER_OVERFLOW issues reported by Coverity.
    
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: avoid set dispclk to 0 [+ + +]
Author: Charlene Liu <Charlene.Liu@amd.com>
Date:   Wed Sep 11 19:45:09 2024 -0400

    drm/amd/display: avoid set dispclk to 0
    
    commit c36df0f5f5e5acec5d78f23c4725cc500df28843 upstream.
    
    [why]
    Setting dispclk to 0 causes stability issues.
    
    Cc: Mario Limonciello <mario.limonciello@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
    Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
    Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit 1c6b16ebf5eb2bc5740be9e37b3a69f1dfe1dded)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: Check null pointer before try to access it [+ + +]
Author: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Date:   Tue Jul 30 20:02:45 2024 -0600

    drm/amd/display: Check null pointer before try to access it
    
    [ Upstream commit 1b686053c06ffb9f4524b288110cf2a831ff7a25 ]
    
    [why & how]
    Change the order of the pipe_ctx->plane_state check to ensure that
    plane_state is not null before accessing it.
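
    Why the order matters can be sketched with short-circuit evaluation.
    The structures and plane_is_visible() below are hypothetical, reduced
    versions of the real ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal versions of the structures involved. */
struct plane_state {
	bool visible;
};

struct pipe_ctx {
	struct plane_state *plane_state;
};

/*
 * The null test comes first, so && short-circuits and the
 * ->visible access is never evaluated for a null plane_state.
 */
static bool plane_is_visible(const struct pipe_ctx *pipe_ctx)
{
	return pipe_ctx->plane_state && pipe_ctx->plane_state->visible;
}
```

    Writing the operands the other way round would dereference plane_state
    before the null test had any effect.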
    
    Reviewed-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Check null pointers before multiple uses [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Tue Jun 25 10:37:35 2024 -0600

    drm/amd/display: Check null pointers before multiple uses
    
    [ Upstream commit fdd5ecbbff751c3b9061d8ebb08e5c96119915b4 ]
    
    [WHAT & HOW]
    Pointers, such as stream_enc and dc->bw_vbios, are null checked earlier
    in the same function, so Coverity warns "implies that stream_enc and
    dc->bw_vbios might be null". They are used multiple times in the
    subsequent code and need to be checked.
    
    This fixes 10 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Check null pointers before used [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Tue Jun 25 10:35:52 2024 -0600

    drm/amd/display: Check null pointers before used
    
    [ Upstream commit be1fb44389ca3038ad2430dac4234669bc177ee3 ]
    
    [WHAT & HOW]
    Pointers, such as dc->clk_mgr, are null checked earlier in the same
    function, so Coverity warns "implies that "dc->clk_mgr" might be null".
    As a result, these pointers need to be checked when used again.
    
    This fixes 10 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Check null pointers before using dc->clk_mgr [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Mon Jul 29 15:29:09 2024 -0600

    drm/amd/display: Check null pointers before using dc->clk_mgr
    
    [ Upstream commit 95d9e0803e51d5a24276b7643b244c7477daf463 ]
    
    [WHY & HOW]
    dc->clk_mgr is null checked previously in the same function, indicating
    it might be null.
    
    "dc" is then passed to "dc->hwss.apply_idle_power_optimizations", which
    dereferences the possibly null "dc->clk_mgr". (The function pointer
    resolves to "dcn35_apply_idle_power_optimizations".)
    
    This fixes 1 FORWARD_NULL issue reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Check null pointers before using them [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Thu Jun 27 17:38:16 2024 -0600

    drm/amd/display: Check null pointers before using them
    
    [ Upstream commit 1ff12bcd7deaeed25efb5120433c6a45dd5504a8 ]
    
    [WHAT & HOW]
    These pointers are null checked previously in the same function,
    indicating they might be null as reported by Coverity. As a result,
    they need to be checked when used again.
    
    This fixes 3 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Check null-initialized variables [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Thu Jun 27 17:34:18 2024 -0600

    drm/amd/display: Check null-initialized variables
    
    [ Upstream commit 367cd9ceba1933b63bc1d87d967baf6d9fd241d2 ]
    
    [WHAT & HOW]
    drr_timing and subvp_pipe are initialized to null and they are not
    always assigned new values. It is necessary to check for null before
    dereferencing.
    
    This fixes 2 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Nevenko Stupar <nevenko.stupar@amd.com>
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Check phantom_stream before it is used [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Thu Jun 20 20:23:41 2024 -0600

    drm/amd/display: Check phantom_stream before it is used
    
    [ Upstream commit 3718a619a8c0a53152e76bb6769b6c414e1e83f4 ]
    
    dcn32_enable_phantom_stream can return null, so the returned value
    must be checked before it is used.
    
    This fixes 1 NULL_RETURNS issue reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Check stream before comparing them [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Thu Jun 27 20:05:14 2024 -0600

    drm/amd/display: Check stream before comparing them
    
    [ Upstream commit 35ff747c86767937ee1e0ca987545b7eed7a0810 ]
    
    [WHAT & HOW]
    amdgpu_dm can pass a null stream to dc_is_stream_unchanged. It is
    necessary to check the streams for null before dereferencing them.
    
    This fixes 1 FORWARD_NULL issue reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Check stream_status before it is used [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Mon Jul 15 10:37:28 2024 -0600

    drm/amd/display: Check stream_status before it is used
    
    [ Upstream commit 58a8ee96f84d2c21abb85ad8c22d2bbdf59bd7a9 ]
    
    [WHAT & HOW]
    dc_state_get_stream_status can return null, and therefore null must be
    checked before stream_status is used.
    
    This fixes 1 NULL_RETURNS issue reported by Coverity.
    
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Deallocate DML memory if allocation fails [+ + +]
Author: Chris Park <chris.park@amd.com>
Date:   Fri Jun 28 15:09:06 2024 -0400

    drm/amd/display: Deallocate DML memory if allocation fails
    
    [ Upstream commit 892abca6877a96c9123bb1c010cafccdf8ca1b75 ]
    
    [Why]
    When the DML memory allocation fails during DC state creation, the
    memory already allocated is not freed afterwards, resulting in an
    uninitialized structure that is not NULL.
    
    [How]
    Deallocate memory if DML memory allocation fails.
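
    A user-space sketch of the cleanup, assuming a two-step allocation.
    The struct names, the fail_dml switch and both helper functions are
    hypothetical; the kernel code uses its own allocators and release
    helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the state object and its DML context. */
struct dml_ctx {
	int scratch[16];
};

struct dc_state_sketch {
	struct dml_ctx *dml;
};

/*
 * Creates a state object. fail_dml simulates a failed second allocation.
 * On any failure everything allocated so far is freed and NULL is
 * returned, so callers never see a half-built, non-NULL object.
 */
static struct dc_state_sketch *state_create_sketch(int fail_dml)
{
	struct dc_state_sketch *state = calloc(1, sizeof(*state));

	if (!state)
		return NULL;

	state->dml = fail_dml ? NULL : calloc(1, sizeof(*state->dml));
	if (!state->dml) {
		free(state);
		return NULL;
	}
	return state;
}

/* Frees a state object created above; accepts NULL. */
static void state_destroy_sketch(struct dc_state_sketch *state)
{
	if (!state)
		return;
	free(state->dml);
	free(state);
}
```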
    
    Reviewed-by: Joshua Aberback <joshua.aberback@amd.com>
    Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
    Signed-off-by: Chris Park <chris.park@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Disable replay if VRR capability is false [+ + +]
Author: Tom Chung <chiahsuan.chung@amd.com>
Date:   Wed Jun 26 16:14:24 2024 +0800

    drm/amd/display: Disable replay if VRR capability is false
    
    [ Upstream commit b68417613d4134b9e39fff95e72ca726268b47db ]
    
    [Why]
    VRR must be supported for the panel replay feature.
    If the VRR capability is false, the panel replay capability also
    needs to be disabled.
    
    [How]
    After updating the VRR capability, check whether the panel replay
    capability also needs to be disabled.
    
    Reviewed-by: Wayne Lin <wayne.lin@amd.com>
    Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
    Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Enable idle workqueue for more IPS modes [+ + +]
Author: Leo Li <sunpeng.li@amd.com>
Date:   Wed Sep 11 17:27:08 2024 -0400

    drm/amd/display: Enable idle workqueue for more IPS modes
    
    commit ef785ca7f7c80891580cafd36c8dd86375684310 upstream.
    
    [Why]
    
    There are more IPS modes other than DMUB_IPS_ENABLE that enable IPS. We
    need to enable the hotplug detect idle workqueue for those modes as
    well.
    
    [How]
    
    Modify the if condition to initialize the workqueue in all IPS modes
    except for DMUB_IPS_DISABLE_ALL.
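
    The change in the condition can be sketched as below. The enum is a
    hypothetical reduction with a single placeholder for the intermediate
    modes, and both predicates are illustrative names, not driver
    functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reduction of the IPS mode enumeration. */
enum ips_mode_sketch {
	SKETCH_IPS_ENABLE,
	SKETCH_IPS_DISABLE_ALL,
	SKETCH_IPS_PARTIAL,	/* placeholder: some IPS states still allowed */
};

/* Old condition: only the fully enabled mode got the workqueue. */
static bool needs_idle_workqueue_old(enum ips_mode_sketch mode)
{
	return mode == SKETCH_IPS_ENABLE;
}

/* New condition: every mode except "disable all" gets the workqueue. */
static bool needs_idle_workqueue_new(enum ips_mode_sketch mode)
{
	return mode != SKETCH_IPS_DISABLE_ALL;
}
```

    Testing for the one excluded mode instead of the one included mode
    means any mode added later is covered by default.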
    
    Fixes: 65444581a4ae ("drm/amd/display: Determine IPS mode by ASIC and PMFW versions")
    Signed-off-by: Leo Li <sunpeng.li@amd.com>
    Reviewed-by: Roman Li <roman.li@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit 181db30bcfed097ecc680539b1eabe935c11f57f)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: fix a UBSAN warning in DML2.1 [+ + +]
Author: Aurabindo Pillai <aurabindo.pillai@amd.com>
Date:   Fri Jul 19 14:10:58 2024 -0400

    drm/amd/display: fix a UBSAN warning in DML2.1
    
    [ Upstream commit eaf3adb8faab611ba57594fa915893fc93a7788c ]
    
    When programming a phantom pipe, cursor_width is explicitly set to 0,
    which causes the calculation logic to overflow an unsigned int and
    trigger the kernel's UBSAN check as shown below:
    
    [   40.962845] UBSAN: shift-out-of-bounds in /tmp/amd.EfpumTkO/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:3312:34
    [   40.962849] shift exponent 4294967170 is too large for 32-bit type 'unsigned int'
    [   40.962852] CPU: 1 PID: 1670 Comm: gnome-shell Tainted: G        W  OE      6.5.0-41-generic #41~22.04.2-Ubuntu
    [   40.962854] Hardware name: Gigabyte Technology Co., Ltd. X670E AORUS PRO X/X670E AORUS PRO X, BIOS F21 01/10/2024
    [   40.962856] Call Trace:
    [   40.962857]  <TASK>
    [   40.962860]  dump_stack_lvl+0x48/0x70
    [   40.962870]  dump_stack+0x10/0x20
    [   40.962872]  __ubsan_handle_shift_out_of_bounds+0x1ac/0x360
    [   40.962878]  calculate_cursor_req_attributes.cold+0x1b/0x28 [amdgpu]
    [   40.963099]  dml_core_mode_support+0x6b91/0x16bc0 [amdgpu]
    [   40.963327]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.963331]  ? CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport+0x18b8/0x2790 [amdgpu]
    [   40.963534]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.963536]  ? dml_core_mode_support+0xb3db/0x16bc0 [amdgpu]
    [   40.963730]  dml2_core_calcs_mode_support_ex+0x2c/0x90 [amdgpu]
    [   40.963906]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.963909]  ? dml2_core_calcs_mode_support_ex+0x2c/0x90 [amdgpu]
    [   40.964078]  core_dcn4_mode_support+0x72/0xbf0 [amdgpu]
    [   40.964247]  dml2_top_optimization_perform_optimization_phase+0x1d3/0x2a0 [amdgpu]
    [   40.964420]  dml2_build_mode_programming+0x23d/0x750 [amdgpu]
    [   40.964587]  dml21_validate+0x274/0x770 [amdgpu]
    [   40.964761]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.964763]  ? resource_append_dpp_pipes_for_plane_composition+0x27c/0x3b0 [amdgpu]
    [   40.964942]  dml2_validate+0x504/0x750 [amdgpu]
    [   40.965117]  ? dml21_copy+0x95/0xb0 [amdgpu]
    [   40.965291]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.965295]  dcn401_validate_bandwidth+0x4e/0x70 [amdgpu]
    [   40.965491]  update_planes_and_stream_state+0x38d/0x5c0 [amdgpu]
    [   40.965672]  update_planes_and_stream_v3+0x52/0x1e0 [amdgpu]
    [   40.965845]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.965849]  dc_update_planes_and_stream+0x71/0xb0 [amdgpu]
    
    Fix this by adding a guard for checking cursor width before triggering
    the size calculation.
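
    The shape of such a guard can be sketched as follows. The formula, the
    2048-byte chunk size and both function names are invented for
    illustration and are not the DML calculation; the point is only that
    the shift exponent is never computed for a zero width.

```c
#include <assert.h>
#include <stdint.h>

/* Integer floor(log2(v)); only meaningful for v > 0. */
static unsigned int ilog2_u32(uint32_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Hypothetical size calculation with a zero-width guard in front of it. */
static unsigned int cursor_lines_per_chunk_sketch(uint32_t cursor_width,
						  uint32_t bytes_per_pixel)
{
	uint32_t bytes_per_line;

	if (cursor_width == 0 || bytes_per_pixel == 0)
		return 0;	/* no cursor: skip the calculation entirely */

	bytes_per_line = cursor_width * bytes_per_pixel;
	if (bytes_per_line >= 2048)
		return 1;

	/* The exponent is at most 11 here, so the shift is well defined. */
	return 1u << ilog2_u32(2048 / bytes_per_line);
}
```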
    
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Signed-off-by: Wayne Lin <wayne.lin@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: fix double free issue during amdgpu module unload [+ + +]
Author: Tim Huang <tim.huang@amd.com>
Date:   Thu Aug 15 18:45:22 2024 -0400

    drm/amd/display: fix double free issue during amdgpu module unload
    
    [ Upstream commit 20b5a8f9f4670a8503aa9fa95ca632e77c6bf55d ]
    
    Flexible endpoints use DIGs from available inflexible endpoints,
    so only the encoders of inflexible links need to be freed.
    Otherwise, a double free issue may occur when unloading the
    amdgpu module.
    
    [  279.190523] RIP: 0010:__slab_free+0x152/0x2f0
    [  279.190577] Call Trace:
    [  279.190580]  <TASK>
    [  279.190582]  ? show_regs+0x69/0x80
    [  279.190590]  ? die+0x3b/0x90
    [  279.190595]  ? do_trap+0xc8/0xe0
    [  279.190601]  ? do_error_trap+0x73/0xa0
    [  279.190605]  ? __slab_free+0x152/0x2f0
    [  279.190609]  ? exc_invalid_op+0x56/0x70
    [  279.190616]  ? __slab_free+0x152/0x2f0
    [  279.190642]  ? asm_exc_invalid_op+0x1f/0x30
    [  279.190648]  ? dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
    [  279.191096]  ? __slab_free+0x152/0x2f0
    [  279.191102]  ? dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
    [  279.191469]  kfree+0x260/0x2b0
    [  279.191474]  dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
    [  279.191821]  link_destroy+0xd7/0x130 [amdgpu]
    [  279.192248]  dc_destruct+0x90/0x270 [amdgpu]
    [  279.192666]  dc_destroy+0x19/0x40 [amdgpu]
    [  279.193020]  amdgpu_dm_fini+0x16e/0x200 [amdgpu]
    [  279.193432]  dm_hw_fini+0x26/0x40 [amdgpu]
    [  279.193795]  amdgpu_device_fini_hw+0x24c/0x400 [amdgpu]
    [  279.194108]  amdgpu_driver_unload_kms+0x4f/0x70 [amdgpu]
    [  279.194436]  amdgpu_pci_remove+0x40/0x80 [amdgpu]
    [  279.194632]  pci_device_remove+0x3a/0xa0
    [  279.194638]  device_remove+0x40/0x70
    [  279.194642]  device_release_driver_internal+0x1ad/0x210
    [  279.194647]  driver_detach+0x4e/0xa0
    [  279.194650]  bus_remove_driver+0x6f/0xf0
    [  279.194653]  driver_unregister+0x33/0x60
    [  279.194657]  pci_unregister_driver+0x44/0x90
    [  279.194662]  amdgpu_exit+0x19/0x1f0 [amdgpu]
    [  279.194939]  __do_sys_delete_module.isra.0+0x198/0x2f0
    [  279.194946]  __x64_sys_delete_module+0x16/0x20
    [  279.194950]  do_syscall_64+0x58/0x120
    [  279.194954]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
    [  279.194980]  </TASK>
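    
    The ownership rule described above can be sketched in plain C. This is
    a minimal user-space illustration, not the driver code: the struct
    names, the is_flexible field, and destroy_link_encoders() are all
    hypothetical stand-ins, and free() stands in for the encoder destroy
    callback.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's link and link encoder objects. */
struct link_encoder_sketch {
	int dig_id;
};

struct link_sketch {
	bool is_flexible;                     /* endpoint borrows a DIG from another link */
	struct link_encoder_sketch *link_enc; /* owned only when !is_flexible */
};

/*
 * Free the encoder of every inflexible link and clear all encoder pointers.
 * Flexible links only drop their borrowed pointer, so each encoder is
 * freed exactly once. Returns the number of encoders freed.
 */
size_t destroy_link_encoders(struct link_sketch *links, size_t n)
{
	size_t freed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!links[i].is_flexible && links[i].link_enc) {
			free(links[i].link_enc);
			freed++;
		}
		links[i].link_enc = NULL;
	}
	return freed;
}
```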
    
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Tim Huang <tim.huang@amd.com>
    Reviewed-by: Roman Li <roman.li@amd.com>
    Signed-off-by: Roman Li <roman.li@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Fix index out of bounds in DCN30 color transformation [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Sat Jul 20 18:05:20 2024 +0530

    drm/amd/display: Fix index out of bounds in DCN30 color transformation
    
    [ Upstream commit d81873f9e715b72d4f8d391c8eb243946f784dfc ]
    
    This commit addresses a potential index out of bounds issue in the
    `cm3_helper_translate_curve_to_hw_format` function in the DCN30 color
    management module. The issue could occur when the index 'i' exceeds the
    number of transfer function points (TRANSFER_FUNC_POINTS).
    
    The fix adds a check to ensure 'i' is within bounds before accessing the
    transfer function points. If 'i' is out of bounds, the function returns
    false to indicate an error.
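    
    A minimal sketch of this kind of bounds check, assuming a point array
    of 1025 entries (the size named in the smatch report below). The struct
    layout and the helper read_tf_point_checked() are hypothetical and only
    illustrate the pattern, not the actual patch.

```c
#include <stdbool.h>

#define TRANSFER_FUNC_POINTS 1025 /* assumed from the smatch report */

/* Hypothetical, simplified stand-in for the transfer function points. */
struct tf_pts_sketch {
	int red[TRANSFER_FUNC_POINTS];
	int green[TRANSFER_FUNC_POINTS];
	int blue[TRANSFER_FUNC_POINTS];
};

/*
 * Read one RGB point. Returns false without touching the outputs when
 * the index falls outside the arrays, true otherwise.
 */
bool read_tf_point_checked(const struct tf_pts_sketch *pts, int i,
			   int *r, int *g, int *b)
{
	if (i < 0 || i >= TRANSFER_FUNC_POINTS)
		return false;

	*r = pts->red[i];
	*g = pts->green[i];
	*b = pts->blue[i];
	return true;
}
```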
    
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:180 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:181 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:182 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Fix index out of bounds in DCN30 degamma hardware format translation [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Sat Jul 20 18:44:02 2024 +0530

    drm/amd/display: Fix index out of bounds in DCN30 degamma hardware format translation
    
    [ Upstream commit bc50b614d59990747dd5aeced9ec22f9258991ff ]
    
    This commit addresses a potential index out of bounds issue in the
    `cm3_helper_translate_curve_to_degamma_hw_format` function in the DCN30
    color management module. The issue could occur when the index 'i'
    exceeds the number of transfer function points (TRANSFER_FUNC_POINTS).
    
    The fix adds a check to ensure 'i' is within bounds before accessing the
    transfer function points. If 'i' is out of bounds, the function returns
    false to indicate an error.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:338 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:339 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:340 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Fix index out of bounds in degamma hardware format translation [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Sat Jul 20 17:48:27 2024 +0530

    drm/amd/display: Fix index out of bounds in degamma hardware format translation
    
    [ Upstream commit b7e99058eb2e86aabd7a10761e76cae33d22b49f ]
    
    Fixes index out of bounds issue in
    `cm_helper_translate_curve_to_degamma_hw_format` function. The issue
    could occur when the index 'i' exceeds the number of transfer function
    points (TRANSFER_FUNC_POINTS).
    
    The fix adds a check to ensure 'i' is within bounds before accessing the
    transfer function points. If 'i' is out of bounds the function returns
    false to indicate an error.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:594 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:595 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:596 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Fix possible overflow in integer multiplication [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Fri Jun 7 22:09:53 2024 -0600

    drm/amd/display: Fix possible overflow in integer multiplication
    
    [ Upstream commit 3f96f545f877ac59d0c967f52d760b4b2b3b9a47 ]
    
    [WHAT & HOW]
    Multiplying two integers may overflow in a context that expects an
    unsigned long long (64-bit) expression. Fix this by casting an operand
    to unsigned long long so that the multiplication is done in 64 bits.
    
    This fixes 2 OVERFLOW_BEFORE_WIDEN issues reported by Coverity.
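    
    The pattern can be illustrated with a small stand-alone example. The
    function names are hypothetical, and uint32_t operands are used here
    (rather than signed int) only so that the wraparound in the narrow
    version is well defined in C.

```c
#include <stdint.h>

/*
 * Overflow before widen: the product is computed in 32 bits and only
 * then converted to 64 bits, so the high bits are already lost.
 */
uint64_t mul_narrow(uint32_t a, uint32_t b)
{
	return a * b;
}

/*
 * Casting one operand first forces the whole multiplication to be
 * carried out in 64 bits.
 */
uint64_t mul_wide(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}
```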
    
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Fix system hang while resume with TBT monitor [+ + +]
Author: Tom Chung <chiahsuan.chung@amd.com>
Date:   Fri Sep 13 15:44:40 2024 +0800

    drm/amd/display: Fix system hang while resume with TBT monitor
    
    commit 52d4e3fb3d340447dcdac0e14ff21a764f326907 upstream.
    
    [Why]
    When a Thunderbolt monitor is connected and the system is suspended,
    the system may hang on resume.
    
    The TBT monitor HPD is triggered during the resume procedure and
    calls drm_client_modeset_probe() while
    struct drm_connector connector->dev->master is NULL.
    
    This messes up the pipe topology after resume.
    
    [How]
    Skip the TBT monitor HPD during the resume procedure because the
    connectors are currently probed after resume by default.
    
    Reviewed-by: Wayne Lin <wayne.lin@amd.com>
    Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit 453f86a26945207a16b8f66aaed5962dc2b95b85)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: Fix VRR cannot enable [+ + +]
Author: Tom Chung <chiahsuan.chung@amd.com>
Date:   Wed Jul 3 16:47:57 2024 +0800

    drm/amd/display: Fix VRR cannot enable
    
    [ Upstream commit f91a9af09dea850d83d4b217b8acbafd97b5c61f ]
    
    [Why]
    Sometimes VRR cannot be enabled after logging in to the desktop.
    
    User space may call DRM_IOCTL_MODE_GETCONNECTOR right after
    DRM_IOCTL_MODE_RMFB.
    
    Calling DRM_IOCTL_MODE_RMFB to remove all the frame buffers causes
    the driver to disable the crtc and disable the link while calling
    link_set_dpms_off().
    
    The dpcd read in amdgpu_dm_update_freesync_caps() then fails while
    trying to get the DP_MSA_TIMING_PAR_IGNORED capability, and the driver
    concludes that the sink side does not support VRR.
    
    [How]
    Use the dpcd_caps.allow_invalid_MSA_timing_param flag instead of
    reading from dpcd directly.
    
    dpcd_caps.allow_invalid_MSA_timing_param flag is updated during HPD.
    It is safe to replace the original method.
    
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
    Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Force enable 3DLUT DMA check for dcn401 in DML [+ + +]
Author: Dillon Varone <dillon.varone@amd.com>
Date:   Tue Jul 23 15:54:23 2024 -0400

    drm/amd/display: Force enable 3DLUT DMA check for dcn401 in DML
    
    [ Upstream commit b8dc6ca028d9a39196a3a066b9ef2d4a5eca475d ]
    
    [WHY]
    Currently TR0 (trip 0) is not properly budgeting for urgent latency in
    DML2.1. This results in overly aggressive prefetch schedules that are
    vulnerable to request return jitter, resulting in severe underflow at
    the start of the frame.
    
    [HOW]
    Forcing 3DLUT DMA check to enable causes urgent latency to be budgeted
    properly into the prefetch schedule, avoiding the vulnerability.
    
    Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
    Signed-off-by: Dillon Varone <dillon.varone@amd.com>
    Signed-off-by: Wayne Lin <wayne.lin@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: guard write a 0 post_divider value to HW [+ + +]
Author: Ahmed, Muhammad <Ahmed.Ahmed@amd.com>
Date:   Tue Aug 13 17:11:55 2024 -0400

    drm/amd/display: guard write a 0 post_divider value to HW
    
    [ Upstream commit 5d666496c24129edeb2bcb500498b87cc64e7f07 ]
    
    [why]
    post_divider_value should not be 0.
    
    Reviewed-by: Charlene Liu <charlene.liu@amd.com>
    Signed-off-by: Ahmed, Muhammad <Ahmed.Ahmed@amd.com>
    Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Handle null 'stream_status' in 'planes_changed_for_existing_stream' [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Fri Jul 26 19:31:55 2024 +0530

    drm/amd/display: Handle null 'stream_status' in 'planes_changed_for_existing_stream'
    
    [ Upstream commit 8141f21b941710ecebe49220b69822cab3abd23d ]
    
    This commit adds a null check for 'stream_status' in the function
    'planes_changed_for_existing_stream'. Previously, the code assumed
    'stream_status' could be null, but did not handle the case where it was
    actually null. This could lead to a null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_resource.c:3784 planes_changed_for_existing_stream() error: we previously assumed 'stream_status' could be null (see line 3774)
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: handle nulled pipe context in DCE110's set_drr() [+ + +]
Author: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Date:   Mon Sep 16 14:54:05 2024 +0200

    drm/amd/display: handle nulled pipe context in DCE110's set_drr()
    
    [ Upstream commit e7d4e1438533abe448813bdc45691f9c230aa307 ]
    
    As set_drr() is called from IRQ context, it can happen that the
    pipe context has been nulled by dc_state_destruct().
    
    Apply the same protection here that is already present for
    dcn35_set_drr() and dcn10_set_drr(). I.e. fetch the tg pointer
    first (to avoid a race with dc_state_destruct()), and then
    check the local copy before using it.
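    
    The fetch-then-check shape can be sketched as follows. This is a
    single-threaded illustration with hypothetical types and names; it
    does not model the IRQ race itself, only the rule that the pointer is
    read once into a local variable and every later use goes through that
    local copy.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the timing generator and pipe context. */
struct timing_generator_sketch {
	int drr_programmed;
};

struct pipe_ctx_sketch {
	struct timing_generator_sketch *tg; /* may have been nulled */
};

/*
 * Program DRR on every pipe that still has a timing generator.
 * Returns the number of pipes that were programmed.
 */
int set_drr_sketch(struct pipe_ctx_sketch **pipe_ctx, int num_pipes)
{
	int programmed = 0;

	for (int i = 0; i < num_pipes; i++) {
		/* Fetch the pointer once, then only use the local copy. */
		struct timing_generator_sketch *tg = pipe_ctx[i]->tg;

		if (!tg)
			continue;

		tg->drr_programmed = 1;
		programmed++;
	}
	return programmed;
}
```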
    
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3142
    Fixes: 06ad7e164256 ("drm/amd/display: Destroy DC context while keeping DML and DML2")
    Acked-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Implement bounds check for stream encoder creation in DCN401 [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Fri Jul 19 21:39:57 2024 +0530

    drm/amd/display: Implement bounds check for stream encoder creation in DCN401
    
    [ Upstream commit bdf606810210e8e07a0cdf1af3c467291363b295 ]
    
    'stream_enc_regs' array is an array of dcn10_stream_enc_registers
    structures. The array is initialized with four elements, corresponding
    to the four calls to stream_enc_regs() in the array initializer. This
    means that valid indices for this array are 0, 1, 2, and 3.
    
    The error message 'stream_enc_regs' 4 <= 5 below indicates an attempt
    to access this array with an index of 5, which is out of bounds. This
    could lead to undefined behavior.
    
    Here, eng_id is used as an index to access the stream_enc_regs array. If
    eng_id is 5, this would result in an out-of-bounds access on the
    stream_enc_regs array.
    
    This fixes the buffer overflow error in dcn401_stream_encoder_create().
    
    Found by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn401/dcn401_resource.c:1209 dcn401_stream_encoder_create() error: buffer overflow 'stream_enc_regs' 4 <= 5
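    
    A minimal sketch of validating an engine id against a four-element
    register table before indexing it. The table contents, the struct, and
    lookup_stream_enc_regs() are hypothetical; only the array size of four
    is taken from the description above.

```c
#include <stddef.h>

#define ARRAY_SIZE_SKETCH(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, simplified register block descriptor. */
struct stream_enc_regs_sketch {
	unsigned int base;
};

/* Four entries, so the valid indices are 0, 1, 2 and 3. */
static const struct stream_enc_regs_sketch stream_enc_regs_table[4] = {
	{ 0x100 }, { 0x200 }, { 0x300 }, { 0x400 },
};

/* Return the register block for eng_id, or NULL if it is out of range. */
const struct stream_enc_regs_sketch *lookup_stream_enc_regs(int eng_id)
{
	if (eng_id < 0 ||
	    (size_t)eng_id >= ARRAY_SIZE_SKETCH(stream_enc_regs_table))
		return NULL;

	return &stream_enc_regs_table[eng_id];
}
```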
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Increase array size of dummy_boolean [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Wed Jul 3 10:50:35 2024 -0600

    drm/amd/display: Increase array size of dummy_boolean
    
    [ Upstream commit 6d64d39486197083497a01b39e23f2f8474b35d3 ]
    
    [WHY]
    dml2_core_shared_mode_support and dml_core_mode_support access the third
    element of dummy_boolean, i.e. hw_debug5 = &s->dummy_boolean[2], when
    dummy_boolean has size of 2. Any assignment to hw_debug5 causes an
    OVERRUN.
    
    [HOW]
    Increase dummy_boolean's array size to 3.
    
    This fixes 2 OVERRUN issues reported by Coverity.
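    
    As an illustration, a compile-time check can tie the array size to the
    highest index that is used. The struct, the macros, and the accessor
    below are hypothetical; only the index 2 and the new size 3 come from
    the description above.

```c
#include <stdbool.h>

#define DUMMY_BOOLEAN_COUNT 3 /* the size after the fix; it was 2 before */
#define HW_DEBUG5_INDEX     2 /* third element, as described above */

/* Hypothetical stand-in for the scratch structure holding the array. */
struct scratch_sketch {
	bool dummy_boolean[DUMMY_BOOLEAN_COUNT];
};

/* Fails to compile if the index no longer fits inside the array. */
_Static_assert(HW_DEBUG5_INDEX < DUMMY_BOOLEAN_COUNT,
	       "hw_debug5 must point inside dummy_boolean");

/* Return a pointer to the element used as hw_debug5. */
bool *get_hw_debug5(struct scratch_sketch *s)
{
	return &s->dummy_boolean[HW_DEBUG5_INDEX];
}
```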
    
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Initialize denominators' default to 1 [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Tue Jun 18 14:05:08 2024 -0600

    drm/amd/display: Initialize denominators' default to 1
    
    [ Upstream commit b995c0a6de6c74656a0c39cd57a0626351b13e3c ]
    
    [WHAT & HOW]
    Variables that are used as denominators and may not be assigned other
    values should not be 0. Change their default to 1 so they are never 0.
    
    This fixes 10 DIVIDE_BY_ZERO issues reported by Coverity.
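    
    A small sketch of the idea, with a hypothetical function and format
    codes: the denominator starts at 1, so a path that never assigns it
    still divides safely.

```c
/*
 * Hypothetical example: number of rows that fit in a chunk.
 * bytes_per_row defaults to 1 rather than 0, so an unknown format
 * cannot cause a division by zero.
 */
unsigned int rows_per_chunk(unsigned int chunk_bytes, int format)
{
	unsigned int bytes_per_row = 1; /* safe default for a denominator */

	switch (format) {
	case 0:
		bytes_per_row = 64;
		break;
	case 1:
		bytes_per_row = 128;
		break;
	default:
		/* Unknown format: keep the default of 1. */
		break;
	}

	return chunk_bytes / bytes_per_row;
}
```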
    
    Reviewed-by: Harry Wentland <harry.wentland@amd.com>
    Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Initialize get_bytes_per_element's default to 1 [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Mon Jul 15 09:57:01 2024 -0600

    drm/amd/display: Initialize get_bytes_per_element's default to 1
    
    [ Upstream commit 4067f4fa0423a89fb19a30b57231b384d77d2610 ]
    
    Variables that are used as denominators and may not be assigned other
    values should not be 0. bytes_per_element_y & bytes_per_element_c are
    initialized by get_bytes_per_element() which should never return 0.
    
    This fixes 10 DIVIDE_BY_ZERO issues reported by Coverity.
    
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Pass non-null to dcn20_validate_apply_pipe_split_flags [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Thu Jun 27 11:51:27 2024 -0600

    drm/amd/display: Pass non-null to dcn20_validate_apply_pipe_split_flags
    
    [ Upstream commit 5559598742fb4538e4c51c48ef70563c49c2af23 ]
    
    [WHAT & HOW]
    "dcn20_validate_apply_pipe_split_flags" dereferences merge, and thus it
    cannot be a null pointer. Let's pass a valid pointer to avoid null
    dereference.
    
    This fixes 2 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Re-enable panel replay feature [+ + +]
Author: Tom Chung <chiahsuan.chung@amd.com>
Date:   Wed Jun 26 17:02:23 2024 +0800

    drm/amd/display: Re-enable panel replay feature
    
    [ Upstream commit be64336307a6c3ee71fe1337c1b9f0495aa83c50 ]
    
    [Why & How]
    The replay issues have been fixed, so re-enable the panel replay feature.
    
    Reported-by: Arthur Borsboom <arthurborsboom@gmail.com>
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3344
    Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
    Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Wayne Lin <wayne.lin@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Restore Optimized pbn Value if Failed to Disable DSC [+ + +]
Author: Fangzhi Zuo <Jerry.Zuo@amd.com>
Date:   Wed Sep 4 15:29:24 2024 -0400

    drm/amd/display: Restore Optimized pbn Value if Failed to Disable DSC
    
    commit d51160ab00969ee6758ed2dcbc0f81dd476a181c upstream.
    
    The existing last step of the dsc policy restores the pbn value under
    minimum compression when greedily disabling dsc for a stream fails to
    fit in the MST bw. The dsc params that result from the optimization step
    are not necessarily the minimum compression, so it is not correct to
    restore the pbn under the minimum compression rate.
    
    Restoring the pbn under minimum compression instead of the optimized pbn
    value could leave the dsc params incorrect at the modeset, where
    atomic_check fails due to insufficient bw. One or more connected
    monitors could fail to light up in that case.
    
    Restore the optimized pbn value instead of the pbn value under minimum
    compression.
    
    Cc: Mario Limonciello <mario.limonciello@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Reviewed-by: Wayne Lin <wayne.lin@amd.com>
    Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
    Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit 352c3165d2b75030169e012461a16bcf97f392fc)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: Revert Avoid overflow assignment [+ + +]
Author: Gabe Teeger <Gabe.Teeger@amd.com>
Date:   Thu Jul 25 18:42:21 2024 -0400

    drm/amd/display: Revert Avoid overflow assignment
    
    commit e80f8f491df873ea2e07c941c747831234814612 upstream.
    
    This reverts commit a15268787b79 ("drm/amd/display: Avoid overflow assignment in link_dp_cts")
    due to a regression causing a DPMS hang.
    
    Reviewed-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Gabe Teeger <Gabe.Teeger@amd.com>
    Signed-off-by: Wayne Lin <wayne.lin@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: Underflow Seen on DCN401 eGPU [+ + +]
Author: Daniel Sa <Daniel.Sa@amd.com>
Date:   Fri Jul 19 13:39:09 2024 -0400

    drm/amd/display: Underflow Seen on DCN401 eGPU
    
    [ Upstream commit ca0fb243c3bb53dbbd71d16c76f319bf923ee3d4 ]
    
    [WHY]
    In dcn401 we read clock values before FW is loaded. These incorrect
    values cause the driver to believe that we are running higher clocks
    than what we actually have. This then causes corruption/underflow for
    the eGPU.
    
    [HOW]
    When new values are read from HW, update internal structures to
    propagate the new/correct value. This fixes the issue.
    
    Signed-off-by: Daniel Sa <Daniel.Sa@amd.com>
    Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Unlock Pipes Based On DET Allocation [+ + +]
Author: Austin Zheng <Austin.Zheng@amd.com>
Date:   Tue Jul 30 11:55:23 2024 -0400

    drm/amd/display: Unlock Pipes Based On DET Allocation
    
    [ Upstream commit 4af0d8ebf74ccbb60d33fdd410891283dd6cb109 ]
    
    [Why]
    DML21 does not allocate DET evenly between pipes.
    This may result in underflow when unlocking the pipes, as DET could
    be overallocated.
    
    [How]
    1. Unlock pipes that have a decreased amount of DET allocation
    2. Wait for the double buffer to be updated.
    3. Unlock the remaining pipes.
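    
    The three steps above can be sketched as a two-pass loop. Everything
    here is hypothetical (the struct, the field names, the wait callback);
    it only shows the ordering, with pipes whose DET allocation shrinks
    being unlocked before the rest.

```c
#include <stddef.h>

/* Hypothetical per-pipe bookkeeping for the sketch. */
struct pipe_det_sketch {
	unsigned int old_det_kb;
	unsigned int new_det_kb;
	int unlock_phase; /* 0 = still locked, 1 = first pass, 2 = second pass */
};

typedef void (*wait_fn_sketch)(void);

/*
 * Unlock pipes in two passes: first the pipes that give DET back, then,
 * after the double-buffered update has landed, all remaining pipes.
 */
void unlock_pipes_by_det(struct pipe_det_sketch *pipes, size_t n,
			 wait_fn_sketch wait_for_double_buffer)
{
	size_t i;

	/* Step 1: unlock pipes with a decreased DET allocation. */
	for (i = 0; i < n; i++)
		if (pipes[i].new_det_kb < pipes[i].old_det_kb)
			pipes[i].unlock_phase = 1;

	/* Step 2: wait for the double buffer to be updated. */
	if (wait_for_double_buffer)
		wait_for_double_buffer();

	/* Step 3: unlock the remaining pipes. */
	for (i = 0; i < n; i++)
		if (pipes[i].unlock_phase == 0)
			pipes[i].unlock_phase = 2;
}
```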
    
    Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
    Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
    Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35 [+ + +]
Author: Yihan Zhu <Yihan.Zhu@amd.com>
Date:   Sat Sep 7 13:25:19 2024 -0400

    drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35
    
    commit 0d5e5e8a0aa49ea2163abf128da3b509a6c58286 upstream.
    
    [WHY & HOW]
    A mismatch in DCN35 DML2 causes bw validation to fail and an unexpected DPP pipe to be
    acquired, resulting in a grey screen and system hang. Remove the
    EnhancedPrefetchScheduleAccelerationFinal value override to match the HW spec.
    
    Cc: Mario Limonciello <mario.limonciello@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Reviewed-by: Charlene Liu <charlene.liu@amd.com>
    Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com>
    Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit 9dad21f910fcea2bdcff4af46159101d7f9cd8ba)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: Use gpuvm_min_page_size_kbytes for DML2 surfaces [+ + +]
Author: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Date:   Thu Jul 18 11:53:31 2024 -0400

    drm/amd/display: Use gpuvm_min_page_size_kbytes for DML2 surfaces
    
    [ Upstream commit 31663521ede2edb622ee1b397ae3ac666d6351c5 ]
    
    [Why]
    It's currently hard coded to 256 when it should be using the SOC
    provided values. This can result in corruption with linear surfaces
    where we prefetch more PTE than the buffer can hold.
    
    [How]
    Update the min page size correctly for the plane.
    
    Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
    Reviewed-by: Jun Lei <jun.lei@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/amd/pm: ensure the fw_info is not null before using it [+ + +]
Author: Tim Huang <tim.huang@amd.com>
Date:   Wed Aug 7 17:15:12 2024 +0800

    drm/amd/pm: ensure the fw_info is not null before using it
    
    [ Upstream commit 186fb12e7a7b038c2710ceb2fb74068f1b5d55a4 ]
    
    This resolves the dereference null return value warning
    reported by Coverity.
    
    Signed-off-by: Tim Huang <tim.huang@amd.com>
    Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/amdgpu/gfx10: use rlc safe mode for soft recovery [+ + +]
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Wed Jul 24 18:20:34 2024 -0400

    drm/amdgpu/gfx10: use rlc safe mode for soft recovery
    
    [ Upstream commit ead60e9c4e29c8574cae1be4fe3af1d9a978fb0f ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/amdgpu/gfx11: enter safe mode before touching CP_INT_CNTL [+ + +]
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Fri Jul 12 15:36:19 2024 -0400

    drm/amdgpu/gfx11: enter safe mode before touching CP_INT_CNTL
    
    [ Upstream commit b5be054c585110b2c5c1b180136800e8c41c7bb4 ]
    
    Need to enter safe mode before touching GC MMIO.
    
    Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu/gfx11: use rlc safe mode for soft recovery [+ + +]
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Wed Jul 24 18:20:23 2024 -0400

    drm/amdgpu/gfx11: use rlc safe mode for soft recovery
    
    [ Upstream commit 3f2d35c325534c1b7ac5072173f0dc7ca969dec2 ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/amdgpu/gfx12: properly handle error ints on all pipes [+ + +]
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Mon Jul 1 17:40:55 2024 -0400

    drm/amdgpu/gfx12: properly handle error ints on all pipes
    
    [ Upstream commit 39879321769cc2d9a690725959ef76af92a38ac1 ]
    
    Need to handle the interrupt enables for all pipes.
    
    v2: fix indexing (Jessie)
    
    Acked-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu/gfx12: use rlc safe mode for soft recovery [+ + +]
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Wed Jul 24 18:20:13 2024 -0400

    drm/amdgpu/gfx12: use rlc safe mode for soft recovery
    
    [ Upstream commit 21818f39beda2e843199e5d8d9e3f9e43c8080a3 ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/amdgpu/gfx9: properly handle error ints on all pipes [+ + +]
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Tue Jul 2 10:24:59 2024 -0400

    drm/amdgpu/gfx9: properly handle error ints on all pipes
    
    [ Upstream commit 48695573d2feaf42812c1ad54e01caff0d1c2d71 ]
    
    Need to handle the interrupt enables for all pipes.
    
    Acked-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu/gfx9: use rlc safe mode for soft recovery [+ + +]
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Wed Jul 24 18:20:57 2024 -0400

    drm/amdgpu/gfx9: use rlc safe mode for soft recovery
    
    [ Upstream commit 3ec2ad7c34c412bd9264cd1ff235d0812be90e82 ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/amdgpu: add list empty check to avoid null pointer issue [+ + +]
Author: Yang Wang <kevinyang.wang@amd.com>
Date:   Wed Aug 21 14:42:41 2024 +0800

    drm/amdgpu: add list empty check to avoid null pointer issue
    
    [ Upstream commit 4416377ae1fdc41a90b665943152ccd7ff61d3c5 ]
    
    Add list empty check to avoid null pointer issues in some corner cases.
    - list_for_each_entry_safe()
    
    Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
    Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu: add raven1 gfxoff quirk [+ + +]
Author: Peng Liu <liupeng01@kylinos.cn>
Date:   Fri Aug 30 15:25:54 2024 +0800

    drm/amdgpu: add raven1 gfxoff quirk
    
    [ Upstream commit 0126c0ae11e8b52ecfde9d1b174ee2f32d6c3a5d ]
    
    Fix screen corruption with openkylin.
    
    Link: https://bbs.openkylin.top/t/topic/171497
    Signed-off-by: Peng Liu <liupeng01@kylinos.cn>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu: Block MMR_READ IOCTL in reset [+ + +]
Author: Victor Skvortsov <victor.skvortsov@amd.com>
Date:   Thu Aug 8 13:40:23 2024 -0400

    drm/amdgpu: Block MMR_READ IOCTL in reset
    
    [ Upstream commit 9e823f307074c0f82b5f6044943b0086e3079bed ]
    
    Register access from userspace should be blocked until
    reset is complete.
    
    Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit [+ + +]
Author: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Date:   Tue Jul 2 11:54:30 2024 +0200

    drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit
    
    [ Upstream commit fec5f8e8c6bcf83ed7a392801d7b44c5ecfc1e82 ]
    
    Before this commit, only submits with both a BO_HANDLES chunk and a
    'bo_list_handle' would be rejected (by amdgpu_cs_parser_bos).
    
    But if UMD sent multiple BO_HANDLES, what would happen is:
    * only the last one would be really used
    * all the others would leak memory as amdgpu_cs_p1_bo_handles would
      overwrite the previous p->bo_list value
    
    This commit rejects submissions with multiple BO_HANDLES chunks to
    match the implementation of the parser.
    
    Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
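    
    The rejection rule described in this entry can be sketched in plain
    userspace C. This is only an illustration under stated assumptions: the
    chunk IDs, struct parser, and parse_chunks() are hypothetical stand-ins
    and do not reproduce the amdgpu parser or its UAPI values.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical chunk IDs; the real amdgpu UAPI values are not reproduced. */
enum chunk_id { CHUNK_IB, CHUNK_BO_HANDLES };

/* Hypothetical parser state for one submission. */
struct parser {
	int have_bo_list;	/* set once a BO_HANDLES chunk has been accepted */
};

/*
 * Walk the chunks of one submission and reject it with -EINVAL as soon as
 * a second BO_HANDLES chunk shows up, instead of silently overwriting
 * (and leaking) the list built from the first one.
 */
int parse_chunks(struct parser *p, const enum chunk_id *chunks, size_t n)
{
	size_t i;

	p->have_bo_list = 0;
	for (i = 0; i < n; i++) {
		if (chunks[i] != CHUNK_BO_HANDLES)
			continue;
		if (p->have_bo_list)
			return -EINVAL;
		p->have_bo_list = 1;
	}
	return 0;
}
```

    Rejecting the whole submission keeps the invariant simple: at most one
    buffer list is ever attached to the parser, so nothing is left to leak.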

drm/amdgpu: enable gfxoff quirk on HP 705G4 [+ + +]
Author: Peng Liu <liupeng01@kylinos.cn>
Date:   Fri Aug 30 15:27:08 2024 +0800

    drm/amdgpu: enable gfxoff quirk on HP 705G4
    
    [ Upstream commit 2c7795e245d993bcba2f716a8c93a5891ef910c9 ]
    
    Enabling the gfxoff quirk results in a perfectly usable
    graphical user interface on HP 705G4 DM with R5 2400G.
    
    Without the quirk, the X server is completely unusable, as
    every few seconds there is a GPU reset due to a ring gfx timeout.
    
    Signed-off-by: Peng Liu <liupeng01@kylinos.cn>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu: Fix get each xcp macro [+ + +]
Author: Asad Kamal <asad.kamal@amd.com>
Date:   Mon Jul 22 19:45:11 2024 +0800

    drm/amdgpu: Fix get each xcp macro
    
    [ Upstream commit ef126c06a98bde1a41303970eb0fc0ac33c3cc02 ]
    
    Fix get each xcp macro to loop over each partition correctly
    
    Fixes: 4bdca2057933 ("drm/amdgpu: Add utility functions for xcp")
    Signed-off-by: Asad Kamal <asad.kamal@amd.com>
    Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu: fix ptr check warning in gfx10 ip_dump [+ + +]
Author: Sunil Khatri <sunil.khatri@amd.com>
Date:   Wed Aug 7 17:25:24 2024 +0530

    drm/amdgpu: fix ptr check warning in gfx10 ip_dump
    
    [ Upstream commit 98df5a7732e3b78bf8824d2938a8865a45cfc113 ]
    
    Change condition, if (ptr == NULL) to if (!ptr)
    for a better format and fix the warning.
    
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu: fix ptr check warning in gfx11 ip_dump [+ + +]
Author: Sunil Khatri <sunil.khatri@amd.com>
Date:   Wed Aug 7 17:27:10 2024 +0530

    drm/amdgpu: fix ptr check warning in gfx11 ip_dump
    
    [ Upstream commit bd15f805cdc503ac229a14f5fe21db12e6e7f84a ]
    
    Change condition, if (ptr == NULL) to if (!ptr)
    for a better format and fix the warning.
    
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu: fix ptr check warning in gfx9 ip_dump [+ + +]
Author: Sunil Khatri <sunil.khatri@amd.com>
Date:   Wed Aug 7 17:21:53 2024 +0530

    drm/amdgpu: fix ptr check warning in gfx9 ip_dump
    
    [ Upstream commit 07f4f9c00ec545dfa6251a44a09d2c48a76e7ee5 ]
    
    Change if (ptr == NULL) to if (!ptr) for a better
    format and fix the warning.
    
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu: fix unchecked return value warning for amdgpu_atombios [+ + +]
Author: Tim Huang <tim.huang@amd.com>
Date:   Thu Aug 1 13:47:55 2024 +0800

    drm/amdgpu: fix unchecked return value warning for amdgpu_atombios
    
    [ Upstream commit 92549780e32718d64a6d08bbbb3c6fffecb541c7 ]
    
    This resolves the unchecked return value warning reported by Coverity.
    
    Signed-off-by: Tim Huang <tim.huang@amd.com>
    Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu: fix unchecked return value warning for amdgpu_gfx [+ + +]
Author: Tim Huang <tim.huang@amd.com>
Date:   Thu Aug 1 10:38:37 2024 +0800

    drm/amdgpu: fix unchecked return value warning for amdgpu_gfx
    
    [ Upstream commit c0277b9d7c2ee9ee5dbc948548984f0fbb861301 ]
    
    This resolves the unchecked return value warning reported by Coverity.
    
    Signed-off-by: Tim Huang <tim.huang@amd.com>
    Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/amdkfd: amdkfd_free_gtt_mem clear the correct pointer [+ + +]
Author: Philip Yang <Philip.Yang@amd.com>
Date:   Sun Jul 14 11:11:05 2024 -0400

    drm/amdkfd: amdkfd_free_gtt_mem clear the correct pointer
    
    [ Upstream commit c86ad39140bbcb9dc75a10046c2221f657e8083b ]
    
    Pass a pointer reference to amdgpu_bo_unref so that it clears the correct
    pointer. Otherwise amdgpu_bo_unref clears only the local variable and the
    original pointer is not set to NULL, which could cause a use-after-free bug.
    
    Signed-off-by: Philip Yang <Philip.Yang@amd.com>
    Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
    Acked-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
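    
    The pointer mix-up described in this entry can be illustrated with a
    small userspace model. Everything here is an assumption made for the
    sketch: struct obj, obj_unref(), free_mem_buggy(), and free_mem_fixed()
    are hypothetical names, not the amdgpu or amdkfd functions.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object standing in for a buffer object. */
struct obj {
	int refcount;
};

/* Drop one reference and clear the pointer whose address was passed in. */
void obj_unref(struct obj **objp)
{
	struct obj *o = *objp;

	if (!o)
		return;
	if (--o->refcount == 0)
		free(o);
	*objp = NULL;
}

/* Buggy pattern: only the local copy is cleared; the caller's pointer dangles. */
void free_mem_buggy(struct obj *mem)
{
	obj_unref(&mem);
}

/* Fixed pattern: the address of the caller's pointer is forwarded and cleared. */
void free_mem_fixed(struct obj **mem)
{
	obj_unref(mem);
}
```

    With the buggy variant the caller keeps a non-NULL pointer to an object
    that may already have been freed, which is the use-after-free hazard the
    entry refers to.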

drm/amdkfd: Check int source id for utcl2 poison event [+ + +]
Author: Hawking Zhang <Hawking.Zhang@amd.com>
Date:   Tue Aug 20 13:56:32 2024 +0800

    drm/amdkfd: Check int source id for utcl2 poison event
    
    [ Upstream commit db6341a9168d2a24ded526277eeab29724d76e9d ]
    
    Traditional utcl2 fault_status polling does not
    work in an SRIOV environment. Polling of the fault
    status register from the guest side will be dropped
    by hardware.
    
    The driver should switch to checking the utcl2 interrupt
    source id to identify a utcl2 poison event. It is
    set to 1 when poisoned data interrupts are
    signaled.
    
    v2: drop the unused local variable (Tao)
    
    Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
    Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdkfd: Fix resource leak in criu restore queue [+ + +]
Author: Jesse Zhang <jesse.zhang@amd.com>
Date:   Fri Sep 6 11:29:55 2024 +0800

    drm/amdkfd: Fix resource leak in criu restore queue
    
    [ Upstream commit aa47fe8d3595365a935921a90d00bc33ee374728 ]
    
    To avoid memory leaks, release q_extra_data when exiting the restore queue.
    v2: Correct the proto (Alex)
    
    Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
    Reviewed-by: Tim Huang <tim.huang@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/connector: hdmi: Fix writing Dynamic Range Mastering infoframes [+ + +]
Author: Derek Foreman <derek.foreman@collabora.com>
Date:   Tue Aug 27 11:39:04 2024 -0500

    drm/connector: hdmi: Fix writing Dynamic Range Mastering infoframes
    
    [ Upstream commit f0fa69b5011a45394554fb8061d74fee4d7cd72c ]
    
    The largest infoframe we create is the DRM (Dynamic Range Mastering)
    infoframe which is 26 bytes + a 4 byte header, for a total of 30
    bytes.
    
    With HDMI_MAX_INFOFRAME_SIZE set to 29 bytes, as it is now, we
    allocate too little space to pack a DRM infoframe in
    write_device_infoframe(), leading to an ENOSPC return from
    hdmi_infoframe_pack(), and never calling the connector's
    write_infoframe() vfunc.
    
    Instead of having HDMI_MAX_INFOFRAME_SIZE defined in two places,
    replace HDMI_MAX_INFOFRAME_SIZE with HDMI_INFOFRAME_SIZE(MAX) and make
    MAX 27 bytes - which is defined by the HDMI specification to be the
    largest infoframe payload.
    
    Fixes: f378b77227bc ("drm/connector: hdmi: Add Infoframes generation")
    Fixes: c602e4959a0c ("drm/connector: hdmi: Create Infoframe DebugFS entries")
    
    Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
    Acked-by: Maxime Ripard <mripard@kernel.org>
    Reviewed-by: Jani Nikula <jani.nikula@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240827163918.48160-1-derek.foreman@collabora.com
    Signed-off-by: Maxime Ripard <mripard@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
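    
    The size arithmetic in this entry can be checked with a short sketch.
    The macro and function names below are hypothetical and only mirror the
    numbers quoted above (4-byte header, 26-byte DRM payload, 27-byte maximum
    payload, old limit of 29 bytes); they are not the kernel's definitions.

```c
#include <stddef.h>

/* Illustrative constants taken from the sizes quoted in the entry above. */
#define INFOFRAME_HEADER_SIZE	4
#define DRM_INFOFRAME_PAYLOAD	26
#define MAX_INFOFRAME_PAYLOAD	27
#define OLD_MAX_INFOFRAME_SIZE	29

/* Header plus payload, following the "size of a given type" idea. */
#define INFOFRAME_SIZE(payload)	(INFOFRAME_HEADER_SIZE + (payload))

/* Returns 1 if a frame with this payload fits in buf_size bytes, else 0. */
int infoframe_fits(size_t buf_size, size_t payload)
{
	return INFOFRAME_SIZE(payload) <= buf_size;
}
```

    Under these assumptions a DRM infoframe needs 30 bytes, so it does not
    fit in the old 29-byte buffer but does fit once the buffer is sized as
    header plus the 27-byte maximum payload.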

 
drm/i915/display: BMG supports UHBR13.5 [+ + +]
Author: Arun R Murthy <arun.r.murthy@intel.com>
Date:   Tue Aug 27 13:42:05 2024 +0530

    drm/i915/display: BMG supports UHBR13.5
    
    [ Upstream commit fcd33d434d31a210bc9f209b5bfd92f3b91a2dda ]
    
    UHBR20 is not supported by Battlemage; the maximum supported link rate
    is UHBR13.5.
    
    v2: Replace IS_DGFX with IS_BATTLEMAGE (Jani)
    
    HSD: 16023263677
    Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com>
    Reviewed-by: Mika Kahola <mika.kahola@intel.com>
    Fixes: 98b1c87a5e51 ("drm/i915/xe2hpd: Set maximum DP rate to UHBR13.5")
    Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240827081205.136569-1-arun.r.murthy@intel.com
    (cherry picked from commit 9c2338ac4543e0fab3a1e0f9f025591e0f0d9f8f)
    Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/i915/dp: Fix AUX IO power enabling for eDP PSR [+ + +]
Author: Imre Deak <imre.deak@intel.com>
Date:   Tue Sep 10 14:18:47 2024 +0300

    drm/i915/dp: Fix AUX IO power enabling for eDP PSR
    
    [ Upstream commit ec2231b8dd2dc515912ff7816c420153b4a95e92 ]
    
    Panel Self Refresh on eDP requires the AUX IO power to be enabled
    whenever the output (main link) is enabled. This is required by the
    AUX_PHY_WAKE/ML_PHY_LOCK signaling initiated by the HW automatically to
    re-enable the main link after it got disabled in power saving states
    (see eDP v1.4b, sections 5.1, 6.1.3.3.1.1).
    
    The Panel Replay mode on non-eDP outputs on the other hand is only
    supported by keeping the main link active, thus not requiring the above
    AUX_PHY_WAKE/ML_PHY_LOCK signaling (eDP v1.4b, section 6.1.3.3.1.2).
    Thus enabling the AUX IO power for this case is not required either.
    
    Based on the above enable the AUX IO power only for eDP/PSR outputs.
    
    Bspec: 49274, 53370
    
    v2:
    - Add a TODO comment to adjust the requirement for AUX IO based on
      whether the ALPM/main-link off mode gets enabled. (Rodrigo)
    
    Cc: Animesh Manna <animesh.manna@intel.com>
    Fixes: b8cf5b5d266e ("drm/i915/panelreplay: Initializaton and compute config for panel replay")
    Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Signed-off-by: Imre Deak <imre.deak@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240910111847.2995725-1-imre.deak@intel.com
    (cherry picked from commit f7c2ed9d4ce80a2570c492825de239dc8b500f2e)
    Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915/dp: Fix colorimetry detection [+ + +]
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Wed Sep 18 22:04:39 2024 +0300

    drm/i915/dp: Fix colorimetry detection
    
    [ Upstream commit e860513f56d8428fcb2bd0282ac8ab691a53fc6c ]
    
    intel_dp_init_connector() is no place for detecting stuff via
    DPCD (except perhaps for eDP). Move the colorimetry stuff into
    a more appropriate place.
    
    Cc: Jouni Högander <jouni.hogander@intel.com>
    Fixes: 00076671a648 ("drm/i915/display: Move colorimetry_support from intel_psr to intel_dp")
    Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240918190441.29071-1-ville.syrjala@linux.intel.com
    Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
    (cherry picked from commit 35dba4834bded843d5416e8caadfe82bd0ce1904)
    Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/i915/gem: fix bitwise and logical AND mixup [+ + +]
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Wed Sep 18 20:35:43 2024 +0300

    drm/i915/gem: fix bitwise and logical AND mixup
    
    commit 394b52462020b6cceff1f7f47fdebd03589574f3 upstream.
    
    CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND is an int, defaulting to 250. When
    the wakeref is non-zero, it's either -1 or a dynamically allocated
    pointer, depending on CONFIG_DRM_I915_DEBUG_RUNTIME_PM. It's likely that
    the code works by coincidence with the bitwise AND, but with
    CONFIG_DRM_I915_DEBUG_RUNTIME_PM=y, there's the off chance that the
    condition evaluates to false, and intel_wakeref_auto() doesn't get
    called. Switch to the intended logical AND.
    
    v2: Use != to avoid clang -Wconstant-logical-operand (Nathan)
    
    Fixes: ad74457a6b5a ("drm/i915/dgfx: Release mmap on rpm suspend")
    Cc: Matthew Auld <matthew.auld@intel.com>
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Cc: Anshuman Gupta <anshuman.gupta@intel.com>
    Cc: Andi Shyti <andi.shyti@linux.intel.com>
    Cc: Nathan Chancellor <nathan@kernel.org>
    Cc: stable@vger.kernel.org # v6.1+
    Reviewed-by: Matthew Auld <matthew.auld@intel.com>
    Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> # v1
    Link: https://patchwork.freedesktop.org/patch/msgid/643cc0a4d12f47fd8403d42581e83b1e9c4543c7.1726680898.git.jani.nikula@intel.com
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>
    (cherry picked from commit 4c1bfe259ed1d2ade826f95d437e1c41b274df04)
    Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
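    
    Why the bitwise form only works by coincidence can be shown with a
    small sketch. AUTOSUSPEND_MS stands in for the config value of 250 quoted
    above; the function names and the sample wakeref values are hypothetical.

```c
#include <stdint.h>

/* Stand-in for the integer config option (default of 250 quoted above). */
#define AUTOSUSPEND_MS 250

/* Buggy form: true only if the two values happen to share a set bit. */
int should_arm_bitwise(uintptr_t wakeref)
{
	return (AUTOSUSPEND_MS & wakeref) != 0;
}

/* Intended form: true whenever both values are non-zero. */
int should_arm_logical(uintptr_t wakeref)
{
	return AUTOSUSPEND_MS != 0 && wakeref != 0;
}
```

    A wakeref of -1 has every bit set, so both forms agree. A pointer-like
    value such as 0x1000 shares no bits with 250 (0xFA), so the bitwise form
    evaluates to false even though the wakeref is valid.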

 
drm/i915/psr: Do not wait for PSR being idle on on Panel Replay [+ + +]
Author: Jouni Högander <jouni.hogander@intel.com>
Date:   Fri Sep 6 10:00:33 2024 +0300

    drm/i915/psr: Do not wait for PSR being idle on on Panel Replay
    
    [ Upstream commit 9498f2e24ee0133d486667c9fa4c27ecdaadc272 ]
    
    We do not have ALPM on DP Panel Replay. Due to this SRD_STATUS[SRD State]
    doesn't change from SRDENT_ON after Panel Replay is enabled until it gets
    disabled.
    
    On eDP Panel Replay DEEP_SLEEP is not reached.
    _psr2_ready_for_pipe_update_locked waits for the DEEP_SLEEP bit to be reset.
    
    Take these into account in the Panel Replay code by not waiting for PSR to
    become idle after enabling VBI.
    
    Fixes: 29fb595d4875 ("drm/i915/psr: Panel replay uses SRD_STATUS to track it's status")
    Cc: Animesh Manna <animesh.manna@intel.com>
    Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
    Reviewed-by: Animesh Manna <animesh.manna@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240906070033.289015-5-jouni.hogander@intel.com
    (cherry picked from commit a2d98feb4b0013ef4f9db0d8f642a8ac1f5ecbb9)
    Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/mediatek: ovl_adaptor: Add missing of_node_put() [+ + +]
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date:   Mon Jun 24 18:43:47 2024 +0200

    drm/mediatek: ovl_adaptor: Add missing of_node_put()
    
    commit 5beb6fba25db235b52eab34bde8112f07bb31d75 upstream.
    
    Error paths that exit for_each_child_of_node() need to call
    of_node_put() to decrement the child refcount and avoid memory leaks.
    
    Add the missing of_node_put().
    
    Cc: stable@vger.kernel.org
    Fixes: 453c3364632a ("drm/mediatek: Add ovl_adaptor support for MT8195")
    Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
    Reviewed-by: CK Hu <ck.hu@mediatek.com>
    Link: https://patchwork.kernel.org/project/dri-devel/patch/20240624-mtk_disp_ovl_adaptor_scoped-v1-2-9fa1e074d881@gmail.com/
    Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
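    
    The refcount rule behind this entry can be modelled in userspace. This
    is a sketch under assumptions: struct node, node_get(), node_put(), and
    scan_children() are hypothetical and only imitate an iterator that holds
    a reference on the current child and drops it when it advances.

```c
#include <stddef.h>

/* Hypothetical userspace model of a reference-counted child node. */
struct node {
	int refcount;
	int bad;	/* non-zero makes the loop body fail */
};

void node_get(struct node *n) { n->refcount++; }
void node_put(struct node *n) { n->refcount--; }

/*
 * The iterator takes a reference on each child and drops it when moving on.
 * Leaving the loop early skips that drop, so the error path has to call
 * node_put() itself to keep the count balanced.
 */
int scan_children(struct node *children, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct node *child = &children[i];

		node_get(child);		/* taken by the iterator */
		if (child->bad) {
			node_put(child);	/* the put the error path needs */
			return -1;
		}
		node_put(child);		/* dropped when the iterator advances */
	}
	return 0;
}
```

    Removing the put on the error path would leave the failing child with
    one extra reference, which is the leak the entry describes.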

 
drm/msm/adreno: Assign msm_gpu->pdev earlier to avoid nullptrs [+ + +]
Author: Konrad Dybcio <konradybcio@kernel.org>
Date:   Tue Jul 9 13:15:40 2024 +0200

    drm/msm/adreno: Assign msm_gpu->pdev earlier to avoid nullptrs
    
    [ Upstream commit 16007768551d5bfe53426645401435ca8d2ef54f ]
    
    There are some cases, such as the one uncovered by Commit 46d4efcccc68
    ("drm/msm/a6xx: Avoid a nullptr dereference when speedbin setting fails")
    where
    
    msm_gpu_cleanup() : platform_set_drvdata(gpu->pdev, NULL);
    
    is called on gpu->pdev == NULL, as the GPU device has not been fully
    initialized yet.
    
    Turns out that there's more than just the aforementioned path that
    causes this to happen (e.g. the case when there's speedbin data in the
    catalog, but opp-supported-hw is missing in DT).
    
    Assigning msm_gpu->pdev earlier seems like the least painful solution
    to this, therefore do so.
    
    Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Patchwork: https://patchwork.freedesktop.org/patch/602742/
    Signed-off-by: Rob Clark <robdclark@chromium.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/panthor: Don't add write fences to the shared BOs [+ + +]
Author: Boris Brezillon <boris.brezillon@collabora.com>
Date:   Thu Sep 5 09:01:54 2024 +0200

    drm/panthor: Don't add write fences to the shared BOs
    
    commit f9e7ac6e2e9986c2ee63224992cb5c8276e46b2a upstream.
    
    The only user (the mesa gallium driver) is already assuming explicit
    synchronization and doing the export/import dance on shared BOs. The
    only reason we were registering ourselves as writers on external BOs
    is because Xe, which was the reference back when we developed Panthor,
    was doing so. Turns out Xe was wrong, and we really want bookkeep on
    all registered fences, so userspace can explicitly upgrade those to
    read/write when needed.
    
    Fixes: 4bdca1150792 ("drm/panthor: Add the driver frontend block")
    Cc: Matthew Brost <matthew.brost@intel.com>
    Cc: Simona Vetter <simona.vetter@ffwll.ch>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
    Reviewed-by: Steven Price <steven.price@arm.com>
    Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240905070155.3254011-1-boris.brezillon@collabora.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/panthor: Don't declare a queue blocked if deferred operations are pending [+ + +]
Author: Boris Brezillon <boris.brezillon@collabora.com>
Date:   Thu Sep 5 09:19:14 2024 +0200

    drm/panthor: Don't declare a queue blocked if deferred operations are pending
    
    commit 7a1f30afe97294281a2ba05977688385744f9844 upstream.
    
    If deferred operations are pending, we want to wait for those to
    land before declaring the queue blocked on a SYNC_WAIT. We need
    this to deal with the case where the sync object is signalled through
    a deferred SYNC_{ADD,SET} from the same queue. If we don't do that
    and the group gets scheduled out before the deferred SYNC_{SET,ADD}
    is executed, we'll end up with a timeout, because no external
    SYNC_{SET,ADD} will make the scheduler reconsider the group for
    execution.
    
    Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
    Reviewed-by: Steven Price <steven.price@arm.com>
    Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240905071914.3278599-1-boris.brezillon@collabora.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/panthor: Fix access to uninitialized variable in tick_ctx_cleanup() [+ + +]
Author: Boris Brezillon <boris.brezillon@collabora.com>
Date:   Mon Sep 30 18:37:42 2024 +0200

    drm/panthor: Fix access to uninitialized variable in tick_ctx_cleanup()
    
    commit 282864cc5d3f144af0cdea1868ee2dc2c5110f0d upstream.
    
    The group variable can't be used to retrieve ptdev in our second loop,
    because it points to the previously iterated list_head, not a valid
    group. Get the ptdev object from the scheduler instead.
    
    Cc: <stable@vger.kernel.org>
    Fixes: d72f049087d4 ("drm/panthor: Allow driver compilation")
    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: Julia Lawall <julia.lawall@inria.fr>
    Closes: https://lore.kernel.org/r/202409302306.UDikqa03-lkp@intel.com/
    Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
    Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240930163742.87036-1-boris.brezillon@collabora.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/panthor: Fix race when converting group handle to group object [+ + +]
Author: Steven Price <steven.price@arm.com>
Date:   Mon Sep 23 11:34:06 2024 +0100

    drm/panthor: Fix race when converting group handle to group object
    
    [ Upstream commit cac075706f298948898b1f63e81709df42afa75d ]
    
    XArray provides its own internal lock which protects the internal array
    when entries are being simultaneously added and removed. However there
    is still a race between retrieving the pointer from the XArray and
    incrementing the reference count.
    
    To avoid this race simply hold the internal XArray lock when
    incrementing the reference count, this ensures there cannot be a racing
    call to xa_erase().
    
    Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
    Signed-off-by: Steven Price <steven.price@arm.com>
    Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
    Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240923103406.2509906-1-steven.price@arm.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/panthor: Lock the VM resv before calling drm_gpuvm_bo_obtain_prealloc() [+ + +]
Author: Boris Brezillon <boris.brezillon@collabora.com>
Date:   Fri Sep 13 13:27:22 2024 +0200

    drm/panthor: Lock the VM resv before calling drm_gpuvm_bo_obtain_prealloc()
    
    [ Upstream commit fa998a9eac8809da4f219aad49836fcad2a9bf5c ]
    
    drm_gpuvm_bo_obtain_prealloc() will call drm_gpuvm_bo_put() on our
    pre-allocated BO if the <BO,VM> association exists. Given we
    only have one ref on preallocated_vm_bo, drm_gpuvm_bo_destroy() will
    be called immediately, and we have to hold the VM resv lock when
    calling this function.
    
    Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block")
    Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
    Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
    Reviewed-by: Steven Price <steven.price@arm.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240913112722.492144-1-boris.brezillon@collabora.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/printer: Allow NULL data in devcoredump printer [+ + +]
Author: Matthew Brost <matthew.brost@intel.com>
Date:   Thu Aug 1 08:41:17 2024 -0700

    drm/printer: Allow NULL data in devcoredump printer
    
    [ Upstream commit 53369581dc0c68a5700ed51e1660f44c4b2bb524 ]
    
    We want to determine the size of the devcoredump before writing it out.
    To that end, we will run the devcoredump printer with NULL data to get
    the size, alloc data based on the generated offset, then run the
    devcoredump again with a valid data pointer to print.  This necessitates
    not writing data to the data pointer on the initial pass, when it is
    NULL.
    
    v5:
     - Better commit message (Jonathan)
     - Add kernel doc with examples (Jani)
    
    Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    Signed-off-by: Matthew Brost <matthew.brost@intel.com>
    Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240801154118.2547543-3-matthew.brost@intel.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
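    
    The two-pass scheme described in this entry (size first, then print)
    can be sketched with standard C formatting calls. The structure and
    function names are hypothetical and the dumped fields are made up; this
    is not the drm printer implementation.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical printer state; not the layout used by the drm printer. */
struct dump_printer {
	char *data;	/* NULL on the sizing pass */
	size_t size;	/* capacity of data, ignored when data is NULL */
	size_t offset;	/* bytes produced so far */
};

/* Append formatted text; with data == NULL only the offset advances. */
void dump_printf(struct dump_printer *p, const char *fmt, ...)
{
	va_list ap;
	int len;

	va_start(ap, fmt);
	if (p->data && p->offset < p->size)
		len = vsnprintf(p->data + p->offset, p->size - p->offset, fmt, ap);
	else
		len = vsnprintf(NULL, 0, fmt, ap);	/* count only, write nothing */
	va_end(ap);

	if (len > 0)
		p->offset += (size_t)len;
}

/* The same dump routine serves both passes; the field values are made up. */
void dump_state(struct dump_printer *p)
{
	dump_printf(p, "engine: %s\n", "rcs0");
	dump_printf(p, "seqno: %d\n", 42);
}
```

    A caller would run dump_state() once with data set to NULL, allocate
    offset + 1 bytes, then run it again with the real buffer. Both passes
    end at the same offset because they share one code path.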

 
drm/radeon/r100: Handle unknown family in r100_cp_init_microcode() [+ + +]
Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date:   Tue Jul 30 17:58:12 2024 +0200

    drm/radeon/r100: Handle unknown family in r100_cp_init_microcode()
    
    [ Upstream commit c6dbab46324b1742b50dc2fb5c1fee2c28129439 ]
    
    With -Werror:
    
        In function ‘r100_cp_init_microcode’,
            inlined from ‘r100_cp_init’ at drivers/gpu/drm/radeon/r100.c:1136:7:
        include/linux/printk.h:465:44: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
          465 | #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
              |                                            ^
        include/linux/printk.h:437:17: note: in definition of macro ‘printk_index_wrap’
          437 |                 _p_func(_fmt, ##__VA_ARGS__);                           \
              |                 ^~~~~~~
        include/linux/printk.h:508:9: note: in expansion of macro ‘printk’
          508 |         printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
              |         ^~~~~~
        drivers/gpu/drm/radeon/r100.c:1062:17: note: in expansion of macro ‘pr_err’
         1062 |                 pr_err("radeon_cp: Failed to load firmware \"%s\"\n", fw_name);
              |                 ^~~~~~
    
    Fix this by converting the if/else if/... construct into a proper
    switch() statement with a default to handle the error case.
    
    As a bonus, the generated code is ca. 100 bytes smaller (with gcc 11.4.0
    targeting arm32).
    
    Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
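    
    The switch-with-default pattern mentioned in this entry can be sketched
    as follows. The family enum, the firmware file names, and both functions
    are hypothetical; the real radeon family list and firmware paths are not
    reproduced here.

```c
#include <stddef.h>

/* Hypothetical chip families, for illustration only. */
enum chip_family { FAMILY_A, FAMILY_B, FAMILY_C };

/*
 * Map a family to a firmware name. The default case makes the
 * "unknown family" outcome explicit instead of leaving the name unset.
 */
const char *firmware_name(int family)
{
	switch (family) {
	case FAMILY_A:
		return "fw_a.bin";
	case FAMILY_B:
	case FAMILY_C:
		return "fw_bc.bin";
	default:
		return NULL;
	}
}

/* Returns 0 on success, -1 if the family is unknown. */
int init_microcode(int family, const char **name_out)
{
	const char *name = firmware_name(family);

	if (!name)
		return -1;	/* bail out before the name is ever formatted */
	*name_out = name;
	return 0;
}
```

    Because the unknown case returns early, no later "%s" format can
    receive a NULL name, which is the situation the compiler warned about.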

 
drm/rockchip: vop: clear DMA stop bit on RK3066 [+ + +]
Author: Val Packett <val@packett.cool>
Date:   Mon Jun 24 17:40:48 2024 -0300

    drm/rockchip: vop: clear DMA stop bit on RK3066
    
    commit 6b44aa559d6c7f4ea591ef9d2352a7250138d62a upstream.
    
    The RK3066 VOP sets a dma_stop bit when it's done scanning out a frame
    and needs the driver to acknowledge that by clearing the bit.
    
    Unless we clear it "between" frames, the RGB output only shows noise
    instead of the picture. atomic_flush is the place for it that least
    affects other code (doing it on vblank would require converting all
    other usages of the reg_lock to spin_(un)lock_irq, which would affect
    performance for everyone).
    
    This seems to be a redundant synchronization mechanism that was removed
    in later iterations of the VOP hardware block.
    
    Fixes: f4a6de855eae ("drm: rockchip: vop: add rk3066 vop definitions")
    Cc: stable@vger.kernel.org
    Signed-off-by: Val Packett <val@packett.cool>
    Signed-off-by: Heiko Stuebner <heiko@sntech.de>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240624204054.5524-2-val@packett.cool
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/rockchip: vop: enable VOP_FEATURE_INTERNAL_RGB on RK3066 [+ + +]
Author: Val Packett <val@packett.cool>
Date:   Mon Jun 24 17:40:49 2024 -0300

    drm/rockchip: vop: enable VOP_FEATURE_INTERNAL_RGB on RK3066
    
    commit 6ed51ba95e27221ce87979bd2ad5926033b9e1b9 upstream.
    
    The RK3066 does have RGB display output, so it should be marked as such.
    
    Fixes: f4a6de855eae ("drm: rockchip: vop: add rk3066 vop definitions")
    Cc: stable@vger.kernel.org
    Signed-off-by: Val Packett <val@packett.cool>
    Signed-off-by: Heiko Stuebner <heiko@sntech.de>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240624204054.5524-3-val@packett.cool
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/sched: Add locking to drm_sched_entity_modify_sched [+ + +]
Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Date:   Fri Sep 13 17:05:52 2024 +0100

    drm/sched: Add locking to drm_sched_entity_modify_sched
    
    commit 4286cc2c953983d44d248c9de1c81d3a9643345c upstream.
    
    Without the locking amdgpu currently can race between
    amdgpu_ctx_set_entity_priority() (via drm_sched_entity_modify_sched()) and
    drm_sched_job_arm(), leading to the latter accessing a potentially
    inconsistent entity->sched_list and entity->num_sched_list pair.
    
    v2:
     * Improve commit message. (Philipp)
    
    Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
    Fixes: b37aced31eb0 ("drm/scheduler: implement a function to modify sched list")
    Cc: Christian König <christian.koenig@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Cc: Luben Tuikov <ltuikov89@gmail.com>
    Cc: Matthew Brost <matthew.brost@intel.com>
    Cc: David Airlie <airlied@gmail.com>
    Cc: Daniel Vetter <daniel@ffwll.ch>
    Cc: dri-devel@lists.freedesktop.org
    Cc: Philipp Stanner <pstanner@redhat.com>
    Cc: <stable@vger.kernel.org> # v5.7+
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240913160559.49054-2-tursulin@igalia.com
    Signed-off-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/sched: Always increment correct scheduler score [+ + +]
Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Date:   Tue Sep 24 11:19:09 2024 +0100

    drm/sched: Always increment correct scheduler score
    
    commit 087913e0ba2b3b9d7ccbafb2acf5dab9e35ae1d5 upstream.
    
    An entity's run queue can change during drm_sched_entity_push_job(), so make
    sure to update the score consistently.
    
    Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
    Fixes: d41a39dda140 ("drm/scheduler: improve job distribution with multiple queues")
    Cc: Nirmoy Das <nirmoy.das@amd.com>
    Cc: Christian König <christian.koenig@amd.com>
    Cc: Luben Tuikov <ltuikov89@gmail.com>
    Cc: Matthew Brost <matthew.brost@intel.com>
    Cc: David Airlie <airlied@gmail.com>
    Cc: Daniel Vetter <daniel@ffwll.ch>
    Cc: dri-devel@lists.freedesktop.org
    Cc: <stable@vger.kernel.org> # v5.9+
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240924101914.2713-4-tursulin@igalia.com
    Signed-off-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/sched: Always wake up correct scheduler in drm_sched_entity_push_job [+ + +]
Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Date:   Tue Sep 24 11:19:08 2024 +0100

    drm/sched: Always wake up correct scheduler in drm_sched_entity_push_job
    
    commit cbc8764e29c2318229261a679b2aafd0f9072885 upstream.
    
    Since drm_sched_entity_modify_sched() can modify the entity's run queue,
    let's make sure to only dereference the pointer once so both adding and
    waking up are guaranteed to be consistent.
    
    The alternative of moving the spin_unlock to after the wake up would for
    now be more problematic since the same lock is taken inside
    drm_sched_rq_update_fifo().
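
    The "dereference once" idea can be sketched in plain userspace C. This
    is only an illustration under simplifying assumptions: the struct and
    function names below are hypothetical stand-ins, not the scheduler's
    real types, and the locking around the read is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the scheduler structures. */
struct sched {
	int queued;   /* jobs added to this scheduler */
	int wakeups;  /* times this scheduler was woken */
};

struct run_queue {
	struct sched *sched;
};

struct entity {
	struct run_queue *rq; /* may be retargeted by another code path */
};

/*
 * Read entity->rq once and use the cached scheduler for both steps, so a
 * later change of entity->rq cannot split them across two schedulers.
 */
static struct sched *push_job(struct entity *entity)
{
	struct sched *sched;

	assert(entity && entity->rq);
	sched = entity->rq->sched; /* single dereference, result cached */

	sched->queued++;   /* "add" step */
	sched->wakeups++;  /* "wake up" step, same scheduler by construction */
	return sched;
}
```

    Because both steps go through the cached pointer, the scheduler that
    receives the job is always the one that is woken.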
    
    v2:
     * Improve commit message. (Philipp)
     * Cache the scheduler pointer directly. (Christian)
    
    Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
    Fixes: b37aced31eb0 ("drm/scheduler: implement a function to modify sched list")
    Cc: Christian König <christian.koenig@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Cc: Luben Tuikov <ltuikov89@gmail.com>
    Cc: Matthew Brost <matthew.brost@intel.com>
    Cc: David Airlie <airlied@gmail.com>
    Cc: Daniel Vetter <daniel@ffwll.ch>
    Cc: Philipp Stanner <pstanner@redhat.com>
    Cc: dri-devel@lists.freedesktop.org
    Cc: <stable@vger.kernel.org> # v5.7+
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240924101914.2713-3-tursulin@igalia.com
    Signed-off-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/sched: Fix dynamic job-flow control race [+ + +]
Author: Rob Clark <robdclark@chromium.org>
Date:   Fri Sep 13 13:23:01 2024 -0700

    drm/sched: Fix dynamic job-flow control race
    
    commit 440d52b370b03b366fd26ace36bab20552116145 upstream.
    
    Fixes a race condition reported here: https://github.com/AsahiLinux/linux/issues/309#issuecomment-2238968609
    
    The whole premise of lockless access to a single-producer-single-
    consumer queue is that there is just a single producer and single
    consumer.  That means we can't call drm_sched_can_queue() (which is
    about queueing more work to the hw, not to the spsc queue) from
    anywhere other than the consumer (wq).
    
    This call in the producer is just an optimization to avoid scheduling
    the consuming worker if it cannot yet queue more work to the hw.  It
    is safe to drop this optimization to avoid the race condition.
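
    A minimal single-threaded model of the resulting split of duties,
    assuming hypothetical names and fields throughout: the producer always
    requests a consumer run, and only the consumer decides whether the
    hardware has room. It does not model the lockless queue itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: names and fields are illustrative only. */
struct flow {
	int pending;       /* jobs waiting in the software queue */
	int in_flight;     /* jobs currently submitted to the hardware */
	int hw_limit;      /* maximum jobs the hardware accepts at once */
	int wake_requests; /* how often the consumer was asked to run */
};

/* Producer: enqueue and always request a consumer run, no capacity check. */
static void producer_push(struct flow *f)
{
	assert(f);
	f->pending++;
	f->wake_requests++;
}

/* Consumer: the only place that decides whether the hardware has room. */
static bool consumer_run_once(struct flow *f)
{
	assert(f);
	if (f->pending == 0 || f->in_flight >= f->hw_limit)
		return false;
	f->pending--;
	f->in_flight++;
	return true;
}
```

    The cost is an occasional consumer run that finds no room and returns
    without submitting anything.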
    
    Suggested-by: Asahi Lina <lina@asahilina.net>
    Fixes: a78422e9dff3 ("drm/sched: implement dynamic job-flow control")
    Closes: https://github.com/AsahiLinux/linux/issues/309
    Cc: stable@vger.kernel.org
    Signed-off-by: Rob Clark <robdclark@chromium.org>
    Reviewed-by: Danilo Krummrich <dakr@kernel.org>
    Tested-by: Janne Grunau <j@jannau.net>
    Signed-off-by: Danilo Krummrich <dakr@kernel.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240913202301.16772-1-robdclark@gmail.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/sched: revert "Always increment correct scheduler score" [+ + +]
Author: Christian König <christian.koenig@amd.com>
Date:   Mon Sep 30 15:07:49 2024 +0200

    drm/sched: revert "Always increment correct scheduler score"
    
    commit abf201f6ce14c4ceeccde5471bdf59614b83a3d8 upstream.
    
    This reverts commit 087913e0ba2b3b9d7ccbafb2acf5dab9e35ae1d5.
    
    It turned out that the original code was correct since the rq can only
    change when there is no armed job for an entity.
    
    This change here broke the logic since we only incremented the counter
    for the first job, so revert it.
    
    Signed-off-by: Christian König <christian.koenig@amd.com>
    Acked-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240930131451.536150-1-christian.koenig@amd.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/stm: Avoid use-after-free issues with crtc and plane [+ + +]
Author: Katya Orlova <e.orlova@ispras.ru>
Date:   Fri Feb 16 15:50:40 2024 +0300

    drm/stm: Avoid use-after-free issues with crtc and plane
    
    [ Upstream commit 19dd9780b7ac673be95bf6fd6892a184c9db611f ]
    
    ltdc_load() calls functions drm_crtc_init_with_planes(),
    drm_universal_plane_init() and drm_encoder_init(). These functions
    should not be called with parameters allocated with devm_kzalloc()
    to avoid use-after-free issues [1].
    
    Use allocations managed by the DRM framework.
    
    Found by Linux Verification Center (linuxtesting.org).
    
    [1]
    https://lore.kernel.org/lkml/u366i76e3qhh3ra5oxrtngjtm2u5lterkekcz6y2jkndhuxzli@diujon4h7qwb/
    
    Signed-off-by: Katya Orlova <e.orlova@ispras.ru>
    Acked-by: Raphaël Gallais-Pou <raphael.gallais-pou@foss.st.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240216125040.8968-1-e.orlova@ispras.ru
    Signed-off-by: Raphael Gallais-Pou <raphael.gallais-pou@foss.st.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/stm: ltdc: reset plane transparency after plane disable [+ + +]
Author: Yannick Fertre <yannick.fertre@foss.st.com>
Date:   Fri Jul 12 15:13:44 2024 +0200

    drm/stm: ltdc: reset plane transparency after plane disable
    
    [ Upstream commit 02fa62d41c8abff945bae5bfc3ddcf4721496aca ]
    
    The plane's opacity should be reset when the plane is disabled. This
    prevents a global or layer background color that was set earlier from
    being seen.
    
    Signed-off-by: Yannick Fertre <yannick.fertre@foss.st.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240712131344.98113-1-yannick.fertre@foss.st.com
    Signed-off-by: Raphael Gallais-Pou <raphael.gallais-pou@foss.st.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/v3d: Prevent out of bounds access in performance query extensions [+ + +]
Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Date:   Thu Jul 11 14:53:30 2024 +0100

    drm/v3d: Prevent out of bounds access in performance query extensions
    
    commit f32b5128d2c440368b5bf3a7a356823e235caabb upstream.
    
    Check that the number of perfmons userspace is passing in the copy and
    reset extensions is not greater than the internal kernel storage where
    the ids will be copied into.
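
    The kind of check described above can be sketched as follows. This is
    an illustrative userspace example, not the driver's code: MAX_PERFMONS,
    struct perf_query and copy_perfmon_ids() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MAX_PERFMONS 4 /* hypothetical capacity of the internal array */

struct perf_query {
	uint32_t ids[MAX_PERFMONS];
	uint32_t count;
};

/*
 * Copy caller-supplied ids into fixed-size storage. The count is validated
 * before the copy so an oversized request is rejected, not truncated.
 */
static int copy_perfmon_ids(struct perf_query *q, const uint32_t *ids,
			    uint32_t nperfmons)
{
	assert(q && ids);
	if (nperfmons > MAX_PERFMONS)
		return -EINVAL;

	memcpy(q->ids, ids, nperfmons * sizeof(ids[0]));
	q->count = nperfmons;
	return 0;
}
```

    Rejecting the request outright leaves the existing state untouched when
    the count is too large.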
    
    Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
    Fixes: bae7cb5d6800 ("drm/v3d: Create a CPU job extension for the reset performance query job")
    Cc: Maíra Canal <mcanal@igalia.com>
    Cc: Iago Toral Quiroga <itoral@igalia.com>
    Cc: stable@vger.kernel.org # v6.8+
    Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
    Reviewed-by: Maíra Canal <mcanal@igalia.com>
    Signed-off-by: Maíra Canal <mcanal@igalia.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-2-tursulin@igalia.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/xe/fbdev: Limit the usage of stolen for LNL+ [+ + +]
Author: Uma Shankar <uma.shankar@intel.com>
Date:   Wed Jul 17 13:52:52 2024 +0530

    drm/xe/fbdev: Limit the usage of stolen for LNL+
    
    [ Upstream commit 775d0adc01a55fe0458139330415d86bb3533efe ]
    
    As per recommendation in the workarounds:
    WA_22019338487
    
    There is an issue with accessing Stolen memory pages due to a
    hardware limitation. Limit the usage of stolen memory for
    fbdev for LNL+. Don't use BIOS FB from stolen on LNL+ and
    assign the same from system memory.
    
    v2: Corrected the WA Number, limited WA to LNL and
        Adopted XE_WA framework as suggested by Lucas and Matt.
    
    v3: Introduced the waxxx_display to implement display side
        of WA changes on Lunarlake. Used xe_root_mmio_gt and
        avoid the for loop (Suggested by Lucas)
    
    v4: Fixed some nits (Luca)
    
    Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Signed-off-by: Uma Shankar <uma.shankar@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240717082252.3875909-1-uma.shankar@intel.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/xe/guc_submit: add missing locking in wedged_fini [+ + +]
Author: Matthew Auld <matthew.auld@intel.com>
Date:   Tue Sep 24 16:09:48 2024 +0100

    drm/xe/guc_submit: add missing locking in wedged_fini
    
    [ Upstream commit 790533e44bfc7af929842fccd9674c9f424d4627 ]
    
    Any non-wedged queue can have a zero refcount here and can be running
    concurrently with an async queue destroy, therefore dereferencing the
    queue ptr to check wedge status after the lookup can trigger UAF if
    queue is not wedged.  Fix this by keeping the submission_state lock held
    around the check to postpone the free and make the check safe, before
    dropping again around the put() to avoid the deadlock.
    
    Fixes: 8ed9aaae39f3 ("drm/xe: Force wedged state and block GT reset upon any GPU hang")
    Signed-off-by: Matthew Auld <matthew.auld@intel.com>
    Cc: Matthew Brost <matthew.brost@intel.com>
    Reviewed-by: Matthew Brost <matthew.brost@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240924150947.118433-2-matthew.auld@intel.com
    (cherry picked from commit d28af0b6b9580b9f90c265a7da0315b0ad20bbfd)
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/xe/hdcp: Check GSC structure validity [+ + +]
Author: Suraj Kandpal <suraj.kandpal@intel.com>
Date:   Mon Jul 22 12:14:51 2024 +0530

    drm/xe/hdcp: Check GSC structure validity
    
    [ Upstream commit b4224f6bae3801d589f815672ec62800a1501b0d ]
    
    Sometimes xe_gsc is not initialized when checked at HDCP capability
    check. Add gsc structure check to avoid null pointer error.
    
    Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
    Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240722064451.3610512-4-suraj.kandpal@intel.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close [+ + +]
Author: José Roberto de Souza <jose.souza@intel.com>
Date:   Tue Sep 24 14:37:13 2024 -0700

    drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close
    
    commit 8135f1c09dd2eecee7cb637f7ec9a29e57300eb8 upstream.
    
    Mesa testing on Xe2+ revealed that when OA metrics are collected for an
    exec_queue, after the OA stream is closed, future batch buffers submitted
    on that exec_queue do not complete. Not resetting OAC_CONTEXT_ENABLE on OA
    stream close resolves these hangs and should not have any adverse effects.
    
    v2: Make the change that we don't reset the bit clearer (Ashutosh)
        Also make the same fix for OAC as OAR (Ashutosh)
    
    Bspec: 60314
    Fixes: 2f4a730fcd2d ("drm/xe/oa: Add OAR support")
    Fixes: 14e077f8006d ("drm/xe/oa: Add OAC support")
    Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2821
    Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
    Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
    Cc: stable@vger.kernel.org
    Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240924213713.3497992-1-ashutosh.dixit@intel.com
    (cherry picked from commit 0c8650b09a365f4a31fca1d1d1e9d99c56071128)
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/xe/vm: move xa_alloc to prevent UAF [+ + +]
Author: Matthew Auld <matthew.auld@intel.com>
Date:   Wed Sep 25 08:14:27 2024 +0100

    drm/xe/vm: move xa_alloc to prevent UAF
    
    [ Upstream commit 74231870cf4976f69e83aa24f48edb16619f652f ]
    
    An evil user can guess the next id of the vm before the ioctl completes and
    then call vm destroy ioctl to trigger UAF since create ioctl is still
    referencing the same vm. Move the xa_alloc all the way to the end to
    prevent this.
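
    The "publish last" ordering can be illustrated with a small userspace
    sketch. Everything here is an assumption made for illustration: a plain
    array stands in for the id-to-object map, and the names are
    hypothetical. It shows only the ordering, not the concurrency.

```c
#include <assert.h>
#include <stdlib.h>

#define TABLE_SIZE 8 /* hypothetical id space */

struct vm {
	int id;
	int ready; /* set once all setup has finished */
};

static struct vm *table[TABLE_SIZE]; /* stands in for the id-to-object map */

/* Returns the new id, or -1 on failure. */
static int vm_create(void)
{
	struct vm *vm = calloc(1, sizeof(*vm));
	int id;

	if (!vm)
		return -1;

	/* Finish every setup step that touches the object first. */
	vm->ready = 1;
	assert(vm->ready); /* invariant: setup is complete before publication */

	/* Publish last: only now can a lookup (or destroy) find the object. */
	for (id = 0; id < TABLE_SIZE; id++) {
		if (!table[id]) {
			vm->id = id;
			table[id] = vm;
			return id; /* vm is not touched after this point */
		}
	}
	free(vm);
	return -1;
}

static int vm_destroy(int id)
{
	if (id < 0 || id >= TABLE_SIZE || !table[id])
		return -1;
	free(table[id]);
	table[id] = NULL;
	return 0;
}
```

    Once the id is visible, the create path no longer dereferences the
    object, so a destroy issued with a guessed id cannot free memory that
    create is still using.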
    
    v2:
     - Rebase
    
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Signed-off-by: Matthew Auld <matthew.auld@intel.com>
    Cc: Matthew Brost <matthew.brost@intel.com>
    Cc: <stable@vger.kernel.org> # v6.8+
    Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
    Reviewed-by: Matthew Brost <matthew.brost@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240925071426.144015-3-matthew.auld@intel.com
    (cherry picked from commit dcfd3971327f3ee92765154baebbaece833d3ca9)
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/xe/vram: fix ccs offset calculation [+ + +]
Author: Matthew Auld <matthew.auld@intel.com>
Date:   Mon Sep 16 09:49:12 2024 +0100

    drm/xe/vram: fix ccs offset calculation
    
    commit ee06c09ded3c2f722be4e240ed06287e23596bda upstream.
    
    Spec says SW is expected to round up to the nearest 128K, if not already
    aligned for the CC unit view of CCS. We are seeing the assert sometimes
    pop on BMG to tell us that there is a hole between GSM and CCS, as well
    as popping other asserts with having a vram size with strange alignment,
    which is likely caused by misaligned offset here.
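
    For reference, rounding an offset up to the next 128K boundary works as
    in the sketch below. The helper round_up_pow2() is a hypothetical
    userspace stand-in written for this example, not the kernel's
    round_up() macro.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_128K (128u * 1024u)

/* Round x up to the next multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}
```

    An offset that is already aligned is returned unchanged, while anything
    in between moves to the next 128K multiple.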
    
    v2 (Shuicheng):
     - Do the round_up() on final SW address.
    
    BSpec: 68023
    Fixes: b5c2ca0372dc ("drm/xe/xe2hpg: Determine flat ccs offset for vram")
    Signed-off-by: Matthew Auld <matthew.auld@intel.com>
    Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
    Cc: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
    Cc: Lucas De Marchi <lucas.demarchi@intel.com>
    Cc: Shuicheng Lin <shuicheng.lin@intel.com>
    Cc: Matt Roper <matthew.d.roper@intel.com>
    Cc: stable@vger.kernel.org # v6.10+
    Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
    Tested-by: Shuicheng Lin <shuicheng.lin@intel.com>
    Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240916084911.13119-2-matthew.auld@intel.com
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    (cherry picked from commit 37173392741c425191b959acb3adf70c9a4610c0)
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/xe: Add timeout to preempt fences [+ + +]
Author: Matthew Brost <matthew.brost@intel.com>
Date:   Tue Jun 25 17:41:37 2024 -0700

    drm/xe: Add timeout to preempt fences
    
    [ Upstream commit 627c961d672d3304564455ba471f5e4405170eec ]
    
    To adhere to dma fencing rules that fences must signal within a
    reasonable amount of time, add a 5 second timeout to preempt fences. If
    this timeout occurs, kill the associated VM as this is fatal to the VM.
    
    v2:
     - Add comment for smp_wmb (Checkpatch)
     - Fix kernel doc typo (Inspection)
     - Add comment for killed check (Niranjana)
    v3:
     - Drop smp_wmb (Matthew Auld)
     - Don't take vm->lock in preempt fence worker (Matthew Auld)
     - Drop RB given changes to patch
    v4:
     - Add WRITE/READ_ONCE (Niranjana)
     - Don't export xe_vm_kill (Niranjana)
    
    Cc: Matthew Auld <matthew.auld@intel.com>
    Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
    Signed-off-by: Matthew Brost <matthew.brost@intel.com>
    Tested-by: Stuart Summers <stuart.summers@intel.com>
    Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240626004137.4060806-1-matthew.brost@intel.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/xe: Clean up VM / exec queue file lock usage. [+ + +]
Author: Matthew Brost <matthew.brost@intel.com>
Date:   Fri Sep 20 18:17:12 2024 -0700

    drm/xe: Clean up VM / exec queue file lock usage.
    
    [ Upstream commit 9e3c85ddea7a473ed57b6cdfef2dfd468356fc91 ]
    
    The VM and exec queue file locks both protect the lookup of and reference
    to the object, nothing more. These locks are not intended to have
    anything else done underneath them. XAs have their own locking too, so
    there is no need to take the VM / exec queue file lock aside from when
    doing a lookup and reference get.
    
    Add some kernel doc to make this clear and clean up a few typos too.
    
    Signed-off-by: Matthew Brost <matthew.brost@intel.com>
    Reviewed-by: Matthew Auld <matthew.auld@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240921011712.2681510-1-matthew.brost@intel.com
    (cherry picked from commit fe4f5d4b661666a45b48fe7f95443f8fefc09c8c)
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Stable-dep-of: 74231870cf49 ("drm/xe/vm: move xa_alloc to prevent UAF")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/xe: Drop warn on xe_guc_pc_gucrc_disable in guc pc fini [+ + +]
Author: Matthew Brost <matthew.brost@intel.com>
Date:   Tue Aug 20 10:29:55 2024 -0700

    drm/xe: Drop warn on xe_guc_pc_gucrc_disable in guc pc fini
    
    [ Upstream commit a323782567812ee925e9b7926445532c7afe331b ]
    
    Not a big deal if CT is down as driver is unloading, no need to warn.
    
    Signed-off-by: Matthew Brost <matthew.brost@intel.com>
    Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240820172958.1095143-4-matthew.brost@intel.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/xe: Fix memory leak on xe_alloc_pf_queue failure [+ + +]
Author: Nirmoy Das <nirmoy.das@intel.com>
Date:   Mon Aug 26 18:20:35 2024 +0200

    drm/xe: Fix memory leak on xe_alloc_pf_queue failure
    
    [ Upstream commit c5f728de696caa35481fd84202dfbc9fecc18e0b ]
    
    Simplify memory unwinding on error, which also fixes a memory leak
    that can currently happen on the error path.
    
    v2: use devm_kcalloc(Matt A)
    
    Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
    Cc: Matthew Auld <matthew.auld@intel.com>
    Cc: Matthew Brost <matthew.brost@intel.com>
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Cc: Stuart Summers <stuart.summers@intel.com>
    Reviewed-by: Matthew Auld <matthew.auld@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240826162035.20462-1-nirmoy.das@intel.com
    Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/xe: fix UAF around queue destruction [+ + +]
Author: Matthew Auld <matthew.auld@intel.com>
Date:   Mon Sep 23 15:56:48 2024 +0100

    drm/xe: fix UAF around queue destruction
    
    commit 2d2be279f1ca9e7288282d4214f16eea8a727cdb upstream.
    
    We currently do things like queuing the final destruction step on a
    random system wq, which will outlive the driver instance. With bad
    timing we can tear down the driver with one or more work items still
    alive, leading to various UAF splats. Add a fini step to ensure
    user queues are properly torn down. At this point GuC should already be
    nuked, so the queue itself should no longer be referenced from hw pov.
    
    v2 (Matt B)
     - Looks much safer to use a waitqueue and then just wait for the
       xa_array to become empty before triggering the drain.
    
    Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2317
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Signed-off-by: Matthew Auld <matthew.auld@intel.com>
    Cc: Matthew Brost <matthew.brost@intel.com>
    Cc: <stable@vger.kernel.org> # v6.8+
    Reviewed-by: Matthew Brost <matthew.brost@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240923145647.77707-2-matthew.auld@intel.com
    (cherry picked from commit 861108666cc0e999cffeab6aff17b662e68774e3)
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/xe: fixup xe_alloc_pf_queue [+ + +]
Author: Matthew Auld <matthew.auld@intel.com>
Date:   Wed Aug 21 18:19:18 2024 +0100

    drm/xe: fixup xe_alloc_pf_queue
    
    [ Upstream commit 321d6b4b9cbe3dd0bc99937d5e5b4d730b5b5798 ]
    
    kzalloc expects number of bytes, therefore we should convert the number
    of dw into bytes, otherwise we are likely just accessing beyond the
    array causing all kinds of carnage. Also fixup the error handling while
    we are here.
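
    The dword-versus-byte mix-up is easy to show in userspace C. In this
    sketch calloc() plays the role of a count-and-size allocator; the
    function names are hypothetical and chosen for this example only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate a zeroed queue of num_dw 32-bit dwords. Passing the element
 * count and the element size separately means the dword-to-byte
 * conversion cannot be forgotten at the call site.
 */
static uint32_t *alloc_queue_dw(size_t num_dw)
{
	assert(num_dw > 0);
	return calloc(num_dw, sizeof(uint32_t));
}

/* Size in bytes of a queue holding num_dw dwords. */
static size_t queue_bytes(size_t num_dw)
{
	return num_dw * sizeof(uint32_t);
}
```

    A queue of 16 dwords needs 64 bytes; passing 16 to a byte-based
    allocator would leave room for only four entries.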
    
    v2:
     - Prefer kcalloc (dim)
    
    Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
    Signed-off-by: Matthew Auld <matthew.auld@intel.com>
    Cc: Stuart Summers <stuart.summers@intel.com>
    Cc: Matthew Brost <matthew.brost@intel.com>
    Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
    Signed-off-by: Matthew Brost <matthew.brost@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240821171917.417386-2-matthew.auld@intel.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/xe: Generate oob before compiling anything [+ + +]
Author: Lucas De Marchi <lucas.demarchi@intel.com>
Date:   Mon Jul 8 14:29:06 2024 -0700

    drm/xe: Generate oob before compiling anything
    
    commit ea74bf9ccba9ae80fc0766c07c4abaef927e9e63 upstream.
    
    Instead of continuing to add more dependencies as WAs are needed in different
    places of the driver, just add a rule with all the objects so the code
    generation happens before anything else.
    
    While at it, group lines related to wa_oob in the Makefile.
    
    v2: Prefix $(obj) when declaring dependency
    
    Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240708213041.1734028-1-lucas.demarchi@intel.com
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/xe: Name and document Wa_14019789679 [+ + +]
Author: Matt Roper <matthew.d.roper@intel.com>
Date:   Mon Aug 12 11:10:43 2024 -0700

    drm/xe: Name and document Wa_14019789679
    
    [ Upstream commit 1d734a3e5d6bb266f52eaf2b1400c5d3f1875a54 ]
    
    Early in the development of Xe we identified an issue with SVG state
    handling on DG2 and MTL (and later on Xe2 as well).  In
    commit 72ac304769dd ("drm/xe: Emit SVG state on RCS during driver load
    on DG2 and MTL") and commit fb24b858a20d ("drm/xe/xe2: Update SVG state
    handling") we implemented our own workaround to prevent SVG state from
    leaking from context A to context B in cases where context B never
    issues a specific state setting.
    
    The hardware teams have now created official workaround Wa_14019789679
    to cover this issue.  The workaround description only requires emitting
    3DSTATE_MESH_CONTROL, since they believe that's the only SVG instruction
    that would potentially remain unset by a context B, but still cause
    notable issues if unwanted values were inherited from context A.
    However since we already have a more extensive implementation that emits
    the entire SVG state and prevents _any_ SVG state from unintentionally
    leaking, we'll stick with our existing implementation just to be safe.
    
    Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
    Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240812181042.2013508-2-matthew.d.roper@intel.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/xe: Prevent null pointer access in xe_migrate_copy [+ + +]
Author: Zhanjun Dong <zhanjun.dong@intel.com>
Date:   Fri Sep 27 09:13:08 2024 -0700

    drm/xe: Prevent null pointer access in xe_migrate_copy
    
    [ Upstream commit 7257d9c9a3c6cfe26c428e9b7ae21d61f2f55a79 ]
    
    xe_migrate_copy is designed to copy the content of TTM resources. When
    the source resource is NULL, it triggers a NULL pointer dereference in
    xe_migrate_copy. To avoid this, set the lacks-source flag to true in
    this case; the flag then triggers xe_migrate_clear rather than
    xe_migrate_copy.
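
    The decision can be sketched with plain buffers. This is a simplified
    illustration, not the driver's logic: move_buffer() and enum move_op
    are hypothetical, and memset()/memcpy() stand in for the clear and copy
    paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum move_op { MOVE_CLEAR, MOVE_COPY };

/*
 * Fill dst from src, or zero it when there is no source to read from.
 * Treating a NULL source as "lacks source" routes it to the clear path,
 * so the copy path never dereferences a NULL pointer.
 */
static enum move_op move_buffer(unsigned char *dst, const unsigned char *src,
				size_t len)
{
	bool lacks_source = (src == NULL);

	assert(dst);
	if (lacks_source) {
		memset(dst, 0, len);
		return MOVE_CLEAR;
	}
	memcpy(dst, src, len);
	return MOVE_COPY;
}
```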
    
    Issue trace:
    <7> [317.089847] xe 0000:00:02.0: [drm:xe_migrate_copy [xe]] Pass 14,
     sizes: 4194304 & 4194304
    <7> [317.089945] xe 0000:00:02.0: [drm:xe_migrate_copy [xe]] Pass 15,
     sizes: 4194304 & 4194304
    <1> [317.128055] BUG: kernel NULL pointer dereference, address:
     0000000000000010
    <1> [317.128064] #PF: supervisor read access in kernel mode
    <1> [317.128066] #PF: error_code(0x0000) - not-present page
    <6> [317.128069] PGD 0 P4D 0
    <4> [317.128071] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
    <4> [317.128074] CPU: 1 UID: 0 PID: 1440 Comm: kunit_try_catch Tainted:
     G     U           N 6.11.0-rc7-xe #1
    <4> [317.128078] Tainted: [U]=USER, [N]=TEST
    <4> [317.128080] Hardware name: Intel Corporation Lunar Lake Client
     Platform/LNL-M LP5 RVP1, BIOS LNLMFWI1.R00.3221.D80.2407291239 07/29/2024
    <4> [317.128082] RIP: 0010:xe_migrate_copy+0x66/0x13e0 [xe]
    <4> [317.128158] Code: 00 00 48 89 8d e0 fe ff ff 48 8b 40 10 4c 89 85 c8
     fe ff ff 44 88 8d bd fe ff ff 65 48 8b 3c 25 28 00 00 00 48 89 7d d0 31
     ff <8b> 79 10 48 89 85 a0 fe ff ff 48 8b 00 48 89 b5 d8 fe ff ff 83 ff
    <4> [317.128162] RSP: 0018:ffffc9000167f9f0 EFLAGS: 00010246
    <4> [317.128164] RAX: ffff8881120d8028 RBX: ffff88814d070428 RCX:
     0000000000000000
    <4> [317.128166] RDX: ffff88813cb99c00 RSI: 0000000004000000 RDI:
     0000000000000000
    <4> [317.128168] RBP: ffffc9000167fbb8 R08: ffff88814e7b1f08 R09:
     0000000000000001
    <4> [317.128170] R10: 0000000000000001 R11: 0000000000000001 R12:
     ffff88814e7b1f08
    <4> [317.128172] R13: ffff88814e7b1f08 R14: ffff88813cb99c00 R15:
     0000000000000001
    <4> [317.128174] FS:  0000000000000000(0000) GS:ffff88846f280000(0000)
     knlGS:0000000000000000
    <4> [317.128176] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    <4> [317.128178] CR2: 0000000000000010 CR3: 000000011f676004 CR4:
     0000000000770ef0
    <4> [317.128180] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
     0000000000000000
    <4> [317.128182] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7:
     0000000000000400
    <4> [317.128184] PKRU: 55555554
    <4> [317.128185] Call Trace:
    <4> [317.128187]  <TASK>
    <4> [317.128189]  ? show_regs+0x67/0x70
    <4> [317.128194]  ? __die_body+0x20/0x70
    <4> [317.128196]  ? __die+0x2b/0x40
    <4> [317.128198]  ? page_fault_oops+0x15f/0x4e0
    <4> [317.128203]  ? do_user_addr_fault+0x3fb/0x970
    <4> [317.128205]  ? lock_acquire+0xc7/0x2e0
    <4> [317.128209]  ? exc_page_fault+0x87/0x2b0
    <4> [317.128212]  ? asm_exc_page_fault+0x27/0x30
    <4> [317.128216]  ? xe_migrate_copy+0x66/0x13e0 [xe]
    <4> [317.128263]  ? __lock_acquire+0xb9d/0x26f0
    <4> [317.128265]  ? __lock_acquire+0xb9d/0x26f0
    <4> [317.128267]  ? sg_free_append_table+0x20/0x80
    <4> [317.128271]  ? lock_acquire+0xc7/0x2e0
    <4> [317.128273]  ? mark_held_locks+0x4d/0x80
    <4> [317.128275]  ? trace_hardirqs_on+0x1e/0xd0
    <4> [317.128278]  ? _raw_spin_unlock_irqrestore+0x31/0x60
    <4> [317.128281]  ? __pm_runtime_resume+0x60/0xa0
    <4> [317.128284]  xe_bo_move+0x682/0xc50 [xe]
    <4> [317.128315]  ? lock_is_held_type+0xaa/0x120
    <4> [317.128318]  ttm_bo_handle_move_mem+0xe5/0x1a0 [ttm]
    <4> [317.128324]  ttm_bo_validate+0xd1/0x1a0 [ttm]
    <4> [317.128328]  shrink_test_run_device+0x721/0xc10 [xe]
    <4> [317.128360]  ? find_held_lock+0x31/0x90
    <4> [317.128363]  ? lock_release+0xd1/0x2a0
    <4> [317.128365]  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
     [kunit]
    <4> [317.128370]  xe_bo_shrink_kunit+0x11/0x20 [xe]
    <4> [317.128397]  kunit_try_run_case+0x6e/0x150 [kunit]
    <4> [317.128400]  ? trace_hardirqs_on+0x1e/0xd0
    <4> [317.128402]  ? _raw_spin_unlock_irqrestore+0x31/0x60
    <4> [317.128404]  kunit_generic_run_threadfn_adapter+0x1e/0x40 [kunit]
    <4> [317.128407]  kthread+0xf5/0x130
    <4> [317.128410]  ? __pfx_kthread+0x10/0x10
    <4> [317.128412]  ret_from_fork+0x39/0x60
    <4> [317.128415]  ? __pfx_kthread+0x10/0x10
    <4> [317.128416]  ret_from_fork_asm+0x1a/0x30
    <4> [317.128420]  </TASK>
    
    Fixes: 266c85885263 ("drm/xe/xe2: Handle flat ccs move for igfx.")
    Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
    Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
    Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240927161308.862323-2-zhanjun.dong@intel.com
    (cherry picked from commit 59a1c9c7e1d02b43b415ea92627ce095b7c79e47)
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/xe: Restore pci state upon resume [+ + +]
Author: Rodrigo Vivi <rodrigo.vivi@intel.com>
Date:   Thu Sep 12 17:45:07 2024 -0400

    drm/xe: Restore pci state upon resume
    
    [ Upstream commit cffa8e83df9fe525afad1e1099097413f9174f57 ]
    
    The pci state was saved, but not restored. Restore
    right after the power state transition request like
    every other driver.
    
    v2: Use right fixes tag, since this was there initially, but
        accidentally removed.
    
    Fixes: f6761c68c0ac ("drm/xe/display: Improve s2idle handling.")
    Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    Cc: Lucas De Marchi <lucas.demarchi@intel.com>
    Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
    Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240912214507.456897-1-rodrigo.vivi@intel.com
    Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    (cherry picked from commit ec2d1539e159f53eae708e194c449cfefa004994)
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/xe: Resume TDR after GT reset [+ + +]
Author: Matthew Brost <matthew.brost@intel.com>
Date:   Wed Jul 24 16:59:19 2024 -0700

    drm/xe: Resume TDR after GT reset
    
    [ Upstream commit 1b30f87e088b499eb74298db256da5c98e8276e2 ]
    
    Not starting the TDR after GT reset on exec queues which have been
    restarted can lead to jobs being able to run forever. Fix this by
    restarting the TDR.
    
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Signed-off-by: Matthew Brost <matthew.brost@intel.com>
    Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240724235919.1917216-1-matthew.brost@intel.com
    (cherry picked from commit 8ec5a4e5ce97d6ee9f5eb5b4ce4cfc831976fdec)
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/xe: Use topology to determine page fault queue size [+ + +]
Author: Stuart Summers <stuart.summers@intel.com>
Date:   Sat Aug 17 02:47:31 2024 +0000

    drm/xe: Use topology to determine page fault queue size
    
    [ Upstream commit 3338e4f90c143cf32f77d64f464cb7f2c2d24700 ]
    
    Currently the page fault queue size is hard coded. However
    the hardware supports faulting for each EU and each CS.
    For some applications running on hardware with a large
    number of EUs and CSs, this can result in an overflow of
    the page fault queue.
    
    Add a small calculation to determine the page fault queue
    size based on the number of EUs and CSs in the platform as
    determined by fuses.
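
    The commit message does not spell out the formula, so the following is
    only one plausible shape of such a calculation, under stated
    assumptions: one message slot per potential fault source, with a
    hypothetical per-message length PF_MSG_LEN_DW.

```c
#include <assert.h>
#include <stdint.h>

#define PF_MSG_LEN_DW 4u /* hypothetical size of one fault message, in dwords */

/* One queue slot per potential fault source (each EU and each CS). */
static uint32_t pf_queue_num_dw(uint32_t num_eus, uint32_t num_cs)
{
	/* Guard the 32-bit arithmetic against overflow. */
	assert(num_eus <= UINT32_MAX - num_cs);
	assert(num_eus + num_cs <= UINT32_MAX / PF_MSG_LEN_DW);
	return (num_eus + num_cs) * PF_MSG_LEN_DW;
}
```

    The result is a size in dwords, so it still has to be converted to
    bytes before being passed to a byte-based allocator.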
    
    Signed-off-by: Stuart Summers <stuart.summers@intel.com>
    Reviewed-by: Matthew Brost <matthew.brost@intel.com>
    Signed-off-by: Matthew Brost <matthew.brost@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/24d582a3b48c97793b8b6a402f34b4b469471636.1723862633.git.stuart.summers@intel.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm: Consistently use struct drm_mode_rect for FB_DAMAGE_CLIPS [+ + +]
Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Mon Sep 23 09:58:14 2024 +0200

    drm: Consistently use struct drm_mode_rect for FB_DAMAGE_CLIPS
    
    commit 8b0d2f61545545ab5eef923ed6e59fc3be2385e0 upstream.
    
    FB_DAMAGE_CLIPS is a plane property for damage handling. Its UAPI
    should only use UAPI types. Hence replace struct drm_rect with
    struct drm_mode_rect in drm_atomic_plane_set_property(). Both types
    are identical in practice, so there's no change in behavior.
    
    Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Closes: https://lore.kernel.org/dri-devel/Zu1Ke1TuThbtz15E@intel.com/
    Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
    Fixes: d3b21767821e ("drm: Add a new plane property to send damage during plane update")
    Cc: Lukasz Spintzyk <lukasz.spintzyk@displaylink.com>
    Cc: Deepak Rawat <drawat@vmware.com>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Cc: Thomas Hellstrom <thellstrom@vmware.com>
    Cc: David Airlie <airlied@gmail.com>
    Cc: Simona Vetter <simona@ffwll.ch>
    Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    Cc: Maxime Ripard <mripard@kernel.org>
    Cc: Thomas Zimmermann <tzimmermann@suse.de>
    Cc: dri-devel@lists.freedesktop.org
    Cc: <stable@vger.kernel.org> # v5.0+
    Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240923075841.16231-1-tzimmermann@suse.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm: omapdrm: Add missing check for alloc_ordered_workqueue [+ + +]
Author: Ma Ke <make24@iscas.ac.cn>
Date:   Thu Aug 8 14:13:36 2024 +0800

    drm: omapdrm: Add missing check for alloc_ordered_workqueue
    
    commit e794b7b9b92977365c693760a259f8eef940c536 upstream.
    
    alloc_ordered_workqueue() may return a NULL pointer, which would then
    cause a NULL pointer dereference. Add a check for its return value.
    
    Cc: stable@vger.kernel.org
    Fixes: 2f95bc6d324a ("drm: omapdrm: Perform initialization/cleanup at probe/remove time")
    Signed-off-by: Ma Ke <make24@iscas.ac.cn>
    Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240808061336.2796729-1-make24@iscas.ac.cn
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
dt-bindings: clock: exynos7885: Fix duplicated binding [+ + +]
Author: David Virag <virag.david003@gmail.com>
Date:   Tue Aug 6 14:11:44 2024 +0200

    dt-bindings: clock: exynos7885: Fix duplicated binding
    
    commit abf3a3ea9acb5c886c8729191a670744ecd42024 upstream.
    
    The numbering in Exynos7885's FSYS CMU bindings has 4 duplicated by
    accident, with the rest of the bindings continuing with 5.
    
    Fix this by moving CLK_MOUT_FSYS_USB30DRD_USER to the end as 11.
    
    Since CLK_MOUT_FSYS_USB30DRD_USER is not used in any device tree as of
    now, and there are no other clocks affected (maybe apart from
    CLK_MOUT_FSYS_MMC_SDIO_USER which the number was shared with, also not
    used in a device tree), this is the least impactful way to solve this
    problem.
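
    The duplication and the renumbering can be pictured with a small Python
    model. Only the two clock names and the numbers 4 and 11 come from the
    text above; the placeholder entries and the helper name are hypothetical.

```python
from collections import defaultdict


def find_duplicate_ids(bindings):
    """Map each ID that is used more than once to the names sharing it."""
    by_id = defaultdict(list)
    for name, number in bindings.items():
        by_id[number].append(name)
    return {num: names for num, names in by_id.items() if len(names) > 1}


# Simplified table: placeholder names stand in for the unaffected clocks.
fsys_ids = {
    "CLK_MOUT_FSYS_MMC_SDIO_USER": 4,
    "CLK_MOUT_FSYS_USB30DRD_USER": 4,   # accidental duplicate of 4
    "CLK_PLACEHOLDER_FIRST": 5,
    "CLK_PLACEHOLDER_LAST": 10,
}

before = find_duplicate_ids(fsys_ids)

# The fix described above: move the unused clock to the end as 11.
fsys_ids["CLK_MOUT_FSYS_USB30DRD_USER"] = 11
after = find_duplicate_ids(fsys_ids)
```

    In this model `before` reports the clash on 4 and `after` is empty,
    while every other clock keeps its number.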
    
    Fixes: cd268e309c29 ("dt-bindings: clock: Add bindings for Exynos7885 CMU_FSYS")
    Cc: stable@vger.kernel.org
    Signed-off-by: David Virag <virag.david003@gmail.com>
    Link: https://lore.kernel.org/r/20240806121157.479212-2-virag.david003@gmail.com
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dt-bindings: clock: qcom: Add GPLL9 support on gcc-sc8180x [+ + +]
Author: Satya Priya Kakitapalli <quic_skakitap@quicinc.com>
Date:   Mon Aug 12 10:43:02 2024 +0530

    dt-bindings: clock: qcom: Add GPLL9 support on gcc-sc8180x
    
    commit 648b4bde0aca2980ebc0b90cdfbb80d222370c3d upstream.
    
    Add the missing GPLL9 which is required for the gcc sdcc2 clock.
    
    Fixes: 0fadcdfdcf57 ("dt-bindings: clock: Add SC8180x GCC binding")
    Cc: stable@vger.kernel.org
    Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Signed-off-by: Satya Priya Kakitapalli <quic_skakitap@quicinc.com>
    Link: https://lore.kernel.org/r/20240812-gcc-sc8180x-fixes-v2-2-8b3eaa5fb856@quicinc.com
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dt-bindings: net: xlnx,axi-ethernet: Add missing reg minItems [+ + +]
Author: Ravikanth Tuniki <ravikanth.tuniki@amd.com>
Date:   Tue Oct 1 00:43:35 2024 +0530

    dt-bindings: net: xlnx,axi-ethernet: Add missing reg minItems
    
    [ Upstream commit c6929644c1e0d6108e57061d427eb966e1746351 ]
    
    Add the missing reg minItems constraint: per the current binding
    document, a configuration with only the ethernet MAC IO space is supported.
    
    There is a bug in the schema: the current examples use 64-bit
    addressing as well as 32-bit addressing. Schema validation
    passes only incidentally, because one 64-bit reg address is treated as
    two 32-bit reg address entries. If we change the axi_ethernet_eth1
    example node reg addressing to 32-bit, schema validation reports:
    
    Documentation/devicetree/bindings/net/xlnx,axi-ethernet.example.dtb:
    ethernet@40000000: reg: [[1073741824, 262144]] is too short
    
    To fix it, add the missing reg minItems constraint and, to make things
    clearer, stick to 32-bit addressing in the examples.
    
    Fixes: cbb1ca6d5f9a ("dt-bindings: net: xlnx,axi-ethernet: convert bindings document to yaml")
    Signed-off-by: Ravikanth Tuniki <ravikanth.tuniki@amd.com>
    Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
    Acked-by: Conor Dooley <conor.dooley@microchip.com>
    Link: https://patch.msgid.link/1727723615-2109795-1-git-send-email-radhey.shyam.pandey@amd.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
e1000e: avoid failing the system during pm_suspend [+ + +]
Author: Vitaly Lifshits <vitaly.lifshits@intel.com>
Date:   Tue Aug 6 16:23:48 2024 +0300

    e1000e: avoid failing the system during pm_suspend
    
    [ Upstream commit 0a6ad4d9e1690c7faa3a53f762c877e477093657 ]
    
    Occasionally when the system goes into pm_suspend, the suspend might fail
    due to a PHY access error on the network adapter. Previously, this would
    have caused the whole system to fail to go to a low power state.
    An example of this was reported in the following Bugzilla:
    https://bugzilla.kernel.org/show_bug.cgi?id=205015
    
    [ 1663.694828] e1000e 0000:00:19.0 eth0: Failed to disable ULP
    [ 1664.731040] asix 2-3:1.0 eth1: link up, 100Mbps, full-duplex, lpa 0xC1E1
    [ 1665.093513] e1000e 0000:00:19.0 eth0: Hardware Error
    [ 1665.596760] e1000e 0000:00:19.0: pci_pm_resume+0x0/0x80 returned 0 after 2975399 usecs
    
    The system then never recovers, and every subsequent suspend fails because of it:
    [22909.393854] PM: pci_pm_suspend(): e1000e_pm_suspend+0x0/0x760 [e1000e] returns -2
    [22909.393858] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -2
    [22909.393861] PM: Device 0000:00:1f.6 failed to suspend async: error -2
    
    This can be avoided by changing the return values of __e1000_shutdown and
    e1000e_pm_suspend functions so that they always return 0 (success). This
    is consistent with what other drivers do.
    
    If the e1000e driver encounters a hardware error during suspend, potential
    side effects include slightly higher power draw or non-working wake on
    LAN. This is preferred to a system-level suspend failure, and a warning
    message is written to the system log, so that the user can be aware that
    the LAN controller experienced a problem during suspend.
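
    The resulting behaviour can be modelled in a few lines of Python. This
    is a sketch only: the function names are hypothetical stand-ins, and the
    -2 simply mirrors the error code in the log above.

```python
import logging

log = logging.getLogger("e1000e_model")


def shutdown_hw(phy_ok: bool) -> int:
    """Hypothetical stand-in for the hardware shutdown step."""
    return 0 if phy_ok else -2


def pm_suspend(phy_ok: bool) -> int:
    """Report success to the PM core even if the hardware step failed."""
    rc = shutdown_hw(phy_ok)
    if rc:
        # The failure is still recorded so the user can see it.
        log.warning("hardware error during suspend: %d", rc)
    return 0
```

    The hardware error is logged but no longer propagated, so a single
    device cannot block the system-wide suspend.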
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=205015
    Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
    Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
    Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
EINJ, CXL: Fix CXL device SBDF calculation [+ + +]
Author: Ben Cheatham <Benjamin.Cheatham@amd.com>
Date:   Fri Sep 27 11:34:28 2024 -0500

    EINJ, CXL: Fix CXL device SBDF calculation
    
    [ Upstream commit ee1e3c46ed19c096be22472c728fa7f68b1352c4 ]
    
    The SBDF of the target CXL 2.0 compliant root port is required to inject a CXL
    protocol error as per ACPI 6.5. The SBDF given has to be in the
    following format:
    
    31     24 23    16 15    11 10      8  7        0
    +-------------------------------------------------+
    | segment |   bus  | device | function | reserved |
    +-------------------------------------------------+
    
    The SBDF calculated in cxl_dport_get_sbdf() doesn't account for
    the reserved bits currently, causing the wrong SBDF to be used.
    Fix said calculation to properly shift the SBDF.
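
    The layout above can be checked with a short Python sketch. The
    function name is hypothetical; the shifts follow the bit positions in
    the diagram, leaving bits 7:0 reserved.

```python
def cxl_sbdf(segment: int, bus: int, device: int, function: int) -> int:
    """Pack the fields per the layout above; bits 7:0 stay zero."""
    fields = (("segment", segment, 8), ("bus", bus, 8),
              ("device", device, 5), ("function", function, 3))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} out of range: {value}")
    return (segment << 24) | (bus << 16) | (device << 11) | (function << 8)


# Example: segment 1, bus 2, device 3, function 4.
example = cxl_sbdf(1, 2, 3, 4)   # 0x01021c00
```

    A packing that ignores the reserved byte would produce this value
    shifted right by eight bits, which no longer matches the diagram.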
    
    Without this fix, error injection into CXL 2.0 root ports through the
    CXL debugfs interface (<debugfs>/cxl) is broken. Injection
    through the legacy interface (<debugfs>/apei/einj/) will still work
    because the SBDF is manually provided by the user.
    
    Fixes: 12fb28ea6b1cf ("EINJ: Add CXL error type support")
    Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
    Reviewed-by: Dan Williams <dan.j.williams@intel.com>
    Tested-by: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com>
    Reviewed-by: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com>
    Link: https://patch.msgid.link/20240927163428.366557-1-Benjamin.Cheatham@amd.com
    Signed-off-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
exec: don't WARN for racy path_noexec check [+ + +]
Author: Mateusz Guzik <mjguzik@gmail.com>
Date:   Mon Aug 5 15:17:21 2024 +0200

    exec: don't WARN for racy path_noexec check
    
    [ Upstream commit 0d196e7589cefe207d5d41f37a0a28a1fdeeb7c6 ]
    
    Both i_mode and noexec checks wrapped in WARN_ON stem from an artifact
    of the previous implementation. They used to legitimately check for the
    condition, but that got moved up in two commits:
    633fb6ac3980 ("exec: move S_ISREG() check earlier")
    0fd338b2d2cd ("exec: move path_noexec() check earlier")
    
    Rather than being removed, said checks were wrapped in WARN_ON, which
    has some debug value.
    
    However, the spurious path_noexec check is racy, resulting in
    unwarranted warnings should someone race with setting the noexec flag.
    
    One can note there is more to perm-checking whether execve is allowed
    and none of the conditions are guaranteed to still hold after they were
    tested for.
    
    Additionally this does not validate whether the code path did any perm
    checking to begin with -- it will pass if the inode happens to be
    regular.
    
    Keep the redundant path_noexec() check, even though it's mindless
    nonsense checking for a guarantee that isn't given, but drop the WARN.
    
    Reword the commentary and do small tidy ups while here.
    
    Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
    Link: https://lore.kernel.org/r/20240805131721.765484-1-mjguzik@gmail.com
    [brauner: keep redundant path_noexec() check]
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
exfat: fix memory leak in exfat_load_bitmap() [+ + +]
Author: Yuezhang Mo <Yuezhang.Mo@sony.com>
Date:   Tue Sep 3 15:01:09 2024 +0800

    exfat: fix memory leak in exfat_load_bitmap()
    
    commit d2b537b3e533f28e0d97293fe9293161fe8cd137 upstream.
    
    If the first directory entry in the root directory is not a bitmap
    directory entry, 'bh' is reassigned without first being released,
    which causes a memory leak.
    
    Fixes: 1e49a94cf707 ("exfat: add bitmap operations")
    Cc: stable@vger.kernel.org
    Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
    Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com>
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
ext4: aovid use-after-free in ext4_ext_insert_extent() [+ + +]
Author: Baokun Li <libaokun1@huawei.com>
Date:   Thu Aug 22 10:35:26 2024 +0800

    ext4: aovid use-after-free in ext4_ext_insert_extent()
    
    commit a164f3a432aae62ca23d03e6d926b122ee5b860d upstream.
    
    As Ojaswin mentioned in Link, in ext4_ext_insert_extent(), if the path is
    reallocated in ext4_ext_create_new_leaf(), we'll use the stale path and
    cause UAF. Below is a sample trace with dummy values:
    
    ext4_ext_insert_extent
      path = *ppath = 2000
      ext4_ext_create_new_leaf(ppath)
        ext4_find_extent(ppath)
          path = *ppath = 2000
          if (depth > path[0].p_maxdepth)
                kfree(path = 2000);
                *ppath = path = NULL;
          path = kcalloc() = 3000
          *ppath = 3000;
          return path;
      /* here path is still 2000, UAF! */
      eh = path[depth].p_hdr
    
    ==================================================================
    BUG: KASAN: slab-use-after-free in ext4_ext_insert_extent+0x26d4/0x3330
    Read of size 8 at addr ffff8881027bf7d0 by task kworker/u36:1/179
    CPU: 3 UID: 0 PID: 179 Comm: kworker/u6:1 Not tainted 6.11.0-rc2-dirty #866
    Call Trace:
     <TASK>
     ext4_ext_insert_extent+0x26d4/0x3330
     ext4_ext_map_blocks+0xe22/0x2d40
     ext4_map_blocks+0x71e/0x1700
     ext4_do_writepages+0x1290/0x2800
    [...]
    
    Allocated by task 179:
     ext4_find_extent+0x81c/0x1f70
     ext4_ext_map_blocks+0x146/0x2d40
     ext4_map_blocks+0x71e/0x1700
     ext4_do_writepages+0x1290/0x2800
     ext4_writepages+0x26d/0x4e0
     do_writepages+0x175/0x700
    [...]
    
    Freed by task 179:
     kfree+0xcb/0x240
     ext4_find_extent+0x7c0/0x1f70
     ext4_ext_insert_extent+0xa26/0x3330
     ext4_ext_map_blocks+0xe22/0x2d40
     ext4_map_blocks+0x71e/0x1700
     ext4_do_writepages+0x1290/0x2800
     ext4_writepages+0x26d/0x4e0
     do_writepages+0x175/0x700
    [...]
    ==================================================================
    
    So use *ppath to update the path to avoid the above problem.
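
    The stale-pointer pattern can be mimicked in Python, with a one-element
    list standing in for the pointer-to-pointer. All names below are
    hypothetical and the "freed" flag only simulates kfree(); this is a
    model of the pattern, not the ext4 code.

```python
class Path:
    def __init__(self, max_depth):
        self.max_depth = max_depth
        self.freed = False

    def header(self):
        if self.freed:
            raise RuntimeError("use-after-free")
        return f"header(depth={self.max_depth})"


def find_extent(ppath, depth):
    """May free the caller's path and store a new one through ppath."""
    old = ppath[0]
    if depth > old.max_depth:
        old.freed = True            # simulated kfree(path)
        ppath[0] = Path(depth)      # simulated kcalloc() + *ppath = path
    return ppath[0]


def insert_extent_buggy(ppath, depth):
    path = ppath[0]                 # saved before the call
    find_extent(ppath, depth)
    return path.header()            # still refers to the old object


def insert_extent_fixed(ppath, depth):
    find_extent(ppath, depth)
    path = ppath[0]                 # reload after the call
    return path.header()
```

    Calling insert_extent_buggy([Path(0)], 1) raises the simulated
    use-after-free, whereas insert_extent_fixed([Path(0)], 1) reads the
    reallocated path.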
    
    Reported-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
    Closes: https://lore.kernel.org/r/ZqyL6rmtwl6N4MWR@li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com
    Fixes: 10809df84a4d ("ext4: teach ext4_ext_find_extent() to realloc path if necessary")
    Cc: stable@kernel.org
    Signed-off-by: Baokun Li <libaokun1@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://patch.msgid.link/20240822023545.1994557-7-libaokun@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: avoid use-after-free in ext4_ext_show_leaf() [+ + +]
Author: Baokun Li <libaokun1@huawei.com>
Date:   Thu Aug 22 10:35:24 2024 +0800

    ext4: avoid use-after-free in ext4_ext_show_leaf()
    
    [ Upstream commit 4e2524ba2ca5f54bdbb9e5153bea00421ef653f5 ]
    
    In ext4_find_extent(), path may be freed on error or be reallocated, so
    a previously saved *ppath may already have been freed, and using it may
    trigger a use-after-free, as follows:
    
    ext4_split_extent
      path = *ppath;
      ext4_split_extent_at(ppath)
      path = ext4_find_extent(ppath)
      ext4_split_extent_at(ppath)
        // ext4_find_extent fails to free path
        // but zeroout succeeds
      ext4_ext_show_leaf(inode, path)
        eh = path[depth].p_hdr
        // path use-after-free !!!
    
    Similar to ext4_split_extent_at(), we use *ppath directly as an input to
    ext4_ext_show_leaf(). Fix a spelling error by the way.
    
    Same problem in ext4_ext_handle_unwritten_extents(). Since 'path' is only
    used in ext4_ext_show_leaf(), remove 'path' and use *ppath directly.
    
    This issue is triggered only when EXT_DEBUG is defined and therefore does
    not affect functionality.
    
    Signed-off-by: Baokun Li <libaokun1@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
    Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
    Link: https://patch.msgid.link/20240822023545.1994557-5-libaokun@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ext4: correct encrypted dentry name hash when not casefolded [+ + +]
Author: yao.ly <yao.ly@linux.alibaba.com>
Date:   Mon Jul 1 14:43:39 2024 +0800

    ext4: correct encrypted dentry name hash when not casefolded
    
    commit 70dd7b573afeba9b8f8a33f2ae1e4a9a2ec8c1ec upstream.
    
    EXT4_DIRENT_HASH and EXT4_DIRENT_MINOR_HASH access the struct
    ext4_dir_entry_hash that follows ext4_dir_entry. But no ext4_dir_entry_hash
    follows when the inode is encrypted and not casefolded.
    
    Signed-off-by: yao.ly <yao.ly@linux.alibaba.com>
    Link: https://patch.msgid.link/1719816219-128287-1-git-send-email-yao.ly@linux.alibaba.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: dax: fix overflowing extents beyond inode size when partially writing [+ + +]
Author: Zhihao Cheng <chengzhihao1@huawei.com>
Date:   Fri Aug 9 20:15:32 2024 +0800

    ext4: dax: fix overflowing extents beyond inode size when partially writing
    
    commit dda898d7ffe85931f9cca6d702a51f33717c501e upstream.
    
    dax_iomap_rw() does two things in each iteration: map written blocks
    and copy user data to blocks. If the process is killed by the user (see
    signal handling in dax_iomap_iter()), the length of the copied data is
    returned and added to the inode size, which means that the length of written
    extents may exceed the inode size, and fsck will then fail. For example:
    
    dd if=/dev/urandom of=file bs=4M count=1
     dax_iomap_rw
      iomap_iter // round 1
       ext4_iomap_begin
        ext4_iomap_alloc // allocate 0~2M extents(written flag)
      dax_iomap_iter // copy 2M data
      iomap_iter // round 2
       iomap_iter_advance
        iter->pos += iter->processed // iter->pos = 2M
       ext4_iomap_begin
        ext4_iomap_alloc // allocate 2~4M extents(written flag)
      dax_iomap_iter
       fatal_signal_pending
      done = iter->pos - iocb->ki_pos // done = 2M
     ext4_handle_inode_extension
      ext4_update_inode_size // inode size = 2M
    
    fsck reports: Inode 13, i_size is 2097152, should be 4194304.  Fix?
    
    Fix the problem by truncating extents if the written length is smaller
    than expected.
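
    The arithmetic of the example can be sketched in Python. The helper
    name and its flag are hypothetical; the byte counts are the ones in
    the fsck report above.

```python
MIB = 1 << 20


def end_of_extents(allocated_end: int, written_end: int,
                   trim_short_write: bool) -> int:
    """Return where the written extents end after a possibly short write."""
    if trim_short_write and written_end < allocated_end:
        return written_end          # extents past the copied data are dropped
    return allocated_end


i_size = 2 * MIB                    # only 2M was copied before the signal
without_fix = end_of_extents(4 * MIB, 2 * MIB, trim_short_write=False)
with_fix = end_of_extents(4 * MIB, 2 * MIB, trim_short_write=True)
```

    Without trimming, the extents end at 4194304 while i_size is 2097152;
    with trimming, both agree.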
    
    Fixes: 776722e85d3b ("ext4: DAX iomap write support")
    CC: stable@vger.kernel.org
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219136
    Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
    Link: https://patch.msgid.link/20240809121532.2105494-1-chengzhihao@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: drop ppath from ext4_ext_replay_update_ex() to avoid double-free [+ + +]
Author: Baokun Li <libaokun1@huawei.com>
Date:   Thu Aug 22 10:35:27 2024 +0800

    ext4: drop ppath from ext4_ext_replay_update_ex() to avoid double-free
    
    commit 5c0f4cc84d3a601c99bc5e6e6eb1cbda542cce95 upstream.
    
    When calling ext4_force_split_extent_at() in ext4_ext_replay_update_ex(),
    the 'ppath' is updated but it is the 'path' that is freed, thus potentially
    triggering a double-free in the following process:
    
    ext4_ext_replay_update_ex
      ppath = path
      ext4_force_split_extent_at(&ppath)
        ext4_split_extent_at
          ext4_ext_insert_extent
            ext4_ext_create_new_leaf
              ext4_ext_grow_indepth
                ext4_find_extent
                  if (depth > path[0].p_maxdepth)
                    kfree(path)                 ---> path First freed
                    *orig_path = path = NULL    ---> null ppath
      kfree(path)                               ---> path double-free !!!
    
    So drop the unnecessary ppath and use path directly to avoid this problem.
    And use ext4_find_extent() directly to update path, avoiding unnecessary
    memory allocation and freeing. Also, propagate the error returned by
    ext4_find_extent() instead of using strange error codes.
    
    Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
    Cc: stable@kernel.org
    Signed-off-by: Baokun Li <libaokun1@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
    Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
    Link: https://patch.msgid.link/20240822023545.1994557-8-libaokun@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: ext4_search_dir should return a proper error [+ + +]
Author: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Date:   Wed Aug 21 12:23:21 2024 -0300

    ext4: ext4_search_dir should return a proper error
    
    [ Upstream commit cd69f8f9de280e331c9e6ff689ced0a688a9ce8f ]
    
    ext4_search_dir currently returns -1 in case of a failure, while it returns
    0 when the name is not found. In such failure cases, it should return an
    error code instead.
    
    This becomes even more important when ext4_find_inline_entry returns an
    error code as well in the next commit.
    
    -EFSCORRUPTED seems appropriate as such error code as these failures would
    be caused by unexpected record lengths and is in line with other instances
    of ext4_check_dir_entry failures.
    
    In the case of ext4_dx_find_entry, the current use of ERR_BAD_DX_DIR was
    left as is to reduce the risk of regressions.
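
    The return convention can be sketched in Python as follows. The
    function and its record-length check are a simplified, hypothetical
    model; the value 117 reflects Linux defining EFSCORRUPTED as EUCLEAN.

```python
EFSCORRUPTED = 117  # EUCLEAN on Linux


def search_dir(entries, name, buf_len):
    """Return 1 if found, 0 if absent, -EFSCORRUPTED on a bad record."""
    offset = 0
    for entry_name, rec_len in entries:
        # A record that is empty or runs past the buffer is corruption,
        # which must be distinguishable from "name not found".
        if rec_len <= 0 or offset + rec_len > buf_len:
            return -EFSCORRUPTED
        if entry_name == name:
            return 1
        offset += rec_len
    return 0
```

    Callers can then treat any negative value as an error to propagate and
    reserve 0 for a clean miss.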
    
    Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
    Link: https://patch.msgid.link/20240821152324.3621860-2-cascardo@igalia.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ext4: filesystems without casefold feature cannot be mounted with siphash [+ + +]
Author: Lizhi Xu <lizhi.xu@windriver.com>
Date:   Wed Jun 5 09:23:35 2024 +0800

    ext4: filesystems without casefold feature cannot be mounted with siphash
    
    [ Upstream commit 985b67cd86392310d9e9326de941c22fc9340eec ]
    
    When mounting an ext4 filesystem, if the default hash version is set to
    DX_HASH_SIPHASH but the casefold feature is not set, fail the mount.
    
    Reported-by: syzbot+340581ba9dceb7e06fb3@syzkaller.appspotmail.com
    Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
    Link: https://patch.msgid.link/20240605012335.44086-1-lizhi.xu@windriver.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ext4: fix access to uninitialised lock in fc replay path [+ + +]
Author: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Date:   Thu Jul 18 10:43:56 2024 +0100

    ext4: fix access to uninitialised lock in fc replay path
    
    commit 23dfdb56581ad92a9967bcd720c8c23356af74c1 upstream.
    
    The following kernel trace can be triggered with fstest generic/629 when
    executed against a filesystem with fast-commit feature enabled:
    
    INFO: trying to register non-static key.
    The code is fine but needs lockdep annotation, or maybe
    you didn't initialize this object before use?
    turning off the locking correctness validator.
    CPU: 0 PID: 866 Comm: mount Not tainted 6.10.0+ #11
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-prebuilt.qemu.org 04/01/2014
    Call Trace:
     <TASK>
     dump_stack_lvl+0x66/0x90
     register_lock_class+0x759/0x7d0
     __lock_acquire+0x85/0x2630
     ? __find_get_block+0xb4/0x380
     lock_acquire+0xd1/0x2d0
     ? __ext4_journal_get_write_access+0xd5/0x160
     _raw_spin_lock+0x33/0x40
     ? __ext4_journal_get_write_access+0xd5/0x160
     __ext4_journal_get_write_access+0xd5/0x160
     ext4_reserve_inode_write+0x61/0xb0
     __ext4_mark_inode_dirty+0x79/0x270
     ? ext4_ext_replay_set_iblocks+0x2f8/0x450
     ext4_ext_replay_set_iblocks+0x330/0x450
     ext4_fc_replay+0x14c8/0x1540
     ? jread+0x88/0x2e0
     ? rcu_is_watching+0x11/0x40
     do_one_pass+0x447/0xd00
     jbd2_journal_recover+0x139/0x1b0
     jbd2_journal_load+0x96/0x390
     ext4_load_and_init_journal+0x253/0xd40
     ext4_fill_super+0x2cc6/0x3180
    ...
    
    In the replay path there's an attempt to lock sbi->s_bdev_wb_lock in
    function ext4_check_bdev_write_error().  Unfortunately, at this point this
    spinlock has not been initialized yet.  Moving its initialization to an
    earlier point in __ext4_fill_super() fixes this splat.
    
    Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
    Link: https://patch.msgid.link/20240718094356.7863-1-luis.henriques@linux.dev
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: fix double brelse() the buffer of the extents path [+ + +]
Author: Baokun Li <libaokun1@huawei.com>
Date:   Thu Aug 22 10:35:28 2024 +0800

    ext4: fix double brelse() the buffer of the extents path
    
    commit dcaa6c31134c0f515600111c38ed7750003e1b9c upstream.
    
    In ext4_ext_try_to_merge_up(), set path[1].p_bh to NULL after it has been
    released, otherwise it may be released twice. An example of what triggers
    this is as follows:
    
      split2    map    split1
    |--------|-------|--------|
    
    ext4_ext_map_blocks
     ext4_ext_handle_unwritten_extents
      ext4_split_convert_extents
       // path->p_depth == 0
       ext4_split_extent
         // 1. do split1
         ext4_split_extent_at
           |ext4_ext_insert_extent
           |  ext4_ext_create_new_leaf
           |    ext4_ext_grow_indepth
           |      le16_add_cpu(&neh->eh_depth, 1)
           |    ext4_find_extent
           |      // return -ENOMEM
           |// get error and try zeroout
           |path = ext4_find_extent
           |  path->p_depth = 1
           |ext4_ext_try_to_merge
           |  ext4_ext_try_to_merge_up
           |    path->p_depth = 0
           |    brelse(path[1].p_bh)  ---> not set to NULL here
           |// zeroout success
         // 2. update path
         ext4_find_extent
         // 3. do split2
         ext4_split_extent_at
           ext4_ext_insert_extent
             ext4_ext_create_new_leaf
               ext4_ext_grow_indepth
                 le16_add_cpu(&neh->eh_depth, 1)
               ext4_find_extent
                 path[0].p_bh = NULL;
                 path->p_depth = 1
                 read_extent_tree_block  ---> return err
                 // path[1].p_bh is still the old value
                 ext4_free_ext_path
                   ext4_ext_drop_refs
                     // path->p_depth == 1
                     brelse(path[1].p_bh)  ---> brelse a buffer twice
    
    Finally got the following WARNING when removing the buffer from lru:
    
    ============================================
    VFS: brelse: Trying to free free buffer
    WARNING: CPU: 2 PID: 72 at fs/buffer.c:1241 __brelse+0x58/0x90
    CPU: 2 PID: 72 Comm: kworker/u19:1 Not tainted 6.9.0-dirty #716
    RIP: 0010:__brelse+0x58/0x90
    Call Trace:
     <TASK>
     __find_get_block+0x6e7/0x810
     bdev_getblk+0x2b/0x480
     __ext4_get_inode_loc+0x48a/0x1240
     ext4_get_inode_loc+0xb2/0x150
     ext4_reserve_inode_write+0xb7/0x230
     __ext4_mark_inode_dirty+0x144/0x6a0
     ext4_ext_insert_extent+0x9c8/0x3230
     ext4_ext_map_blocks+0xf45/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    ============================================
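
    The effect of clearing the slot can be modelled in Python. The class
    and function names are hypothetical, the reference count is a toy, and
    the exception stands in for the kernel warning quoted above.

```python
class BufferHead:
    def __init__(self):
        self.refcount = 1


def brelse(bh):
    """Drop one reference; a None argument is ignored."""
    if bh is None:
        return
    if bh.refcount == 0:
        raise RuntimeError("VFS: brelse: Trying to free free buffer")
    bh.refcount -= 1


def merge_up(path, clear_slot):
    """Release the buffer in path[1]; optionally forget it afterwards."""
    brelse(path[1])
    if clear_slot:
        path[1] = None


def drop_refs(path):
    """Release every buffer still recorded in the path."""
    for bh in path:
        brelse(bh)
```

    With clear_slot=False a later drop_refs() releases the same buffer a
    second time; with clear_slot=True the slot is skipped.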
    
    Fixes: ecb94f5fdf4b ("ext4: collapse a single extent tree block into the inode if possible")
    Cc: stable@kernel.org
    Signed-off-by: Baokun Li <libaokun1@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
    Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
    Link: https://patch.msgid.link/20240822023545.1994557-9-libaokun@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: fix error message when rejecting the default hash [+ + +]
Author: Gabriel Krisman Bertazi <krisman@suse.de>
Date:   Tue Aug 27 16:16:36 2024 -0400

    ext4: fix error message when rejecting the default hash
    
    [ Upstream commit a2187431c395cdfbf144e3536f25468c64fc7cfa ]
    
    Commit 985b67cd8639 ("ext4: filesystems without casefold feature cannot
    be mounted with siphash") properly rejects volumes where
    s_def_hash_version is set to DX_HASH_SIPHASH, but the check and the
    error message should not look into casefold setup - a filesystem should
    never have DX_HASH_SIPHASH as the default hash.  Fix it and, since we
    are there, move the check to ext4_hash_info_init.
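
    The difference between the two checks can be sketched in Python. Both
    helpers are hypothetical, and the numeric value of the constant is an
    assumption; only the comparison matters here.

```python
DX_HASH_SIPHASH = 6  # assumed value, used only for comparison


def default_hash_ok_before(def_hash_version: int, has_casefold: bool) -> bool:
    """Earlier check: siphash is rejected only when casefold is absent."""
    return not (def_hash_version == DX_HASH_SIPHASH and not has_casefold)


def default_hash_ok_after(def_hash_version: int) -> bool:
    """Revised check: siphash is never accepted as the default hash."""
    return def_hash_version != DX_HASH_SIPHASH
```

    The two differ only for a casefolded filesystem whose default hash is
    siphash, which the earlier check let through.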
    
    Fixes: 985b67cd8639 ("ext4: filesystems without casefold feature cannot be mounted with siphash")
    
    Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
    Link: https://patch.msgid.link/87jzg1en6j.fsf_-_@mailhost.krisman.be
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ext4: fix fast commit inode enqueueing during a full journal commit [+ + +]
Author: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Date:   Wed Jul 17 18:22:20 2024 +0100

    ext4: fix fast commit inode enqueueing during a full journal commit
    
    commit 6db3c1575a750fd417a70e0178bdf6efa0dd5037 upstream.
    
    When a full journal commit is on-going, any fast commit has to be enqueued
    into a different queue: FC_Q_STAGING instead of FC_Q_MAIN.  This enqueueing
    is done only once, i.e. if an inode is already queued in a previous fast
    commit entry it won't be enqueued again.  However, if a full commit starts
    _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
    be done into FC_Q_STAGING.  And this is not being done in function
    ext4_fc_track_template().
    
    This patch fixes the issue by re-enqueuing an inode into the STAGING queue
    during the fast commit clean-up callback when doing a full commit.  However,
    to prevent a race with a fast-commit, the clean-up callback has to be called
    with the journal locked.
    
    This bug was found using fstest generic/047.  This test creates several
    32k-byte files, sync'ing each of them after its creation, and then shutting
    down the filesystem.  Some data may be lost in this operation; for example a
    file may have its size truncated to zero.
    
    Suggested-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://patch.msgid.link/20240717172220.14201-1-luis.henriques@linux.dev
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: fix i_data_sem unlock order in ext4_ind_migrate() [+ + +]
Author: Artem Sadovnikov <ancowi69@gmail.com>
Date:   Thu Aug 29 15:22:09 2024 +0000

    ext4: fix i_data_sem unlock order in ext4_ind_migrate()
    
    [ Upstream commit cc749e61c011c255d81b192a822db650c68b313f ]
    
    Fuzzing reports a possible deadlock in jbd2_log_wait_commit.
    
    This issue is triggered when an EXT4_IOC_MIGRATE ioctl is set to require
    synchronous updates because the file descriptor is opened with O_SYNC.
    This can lead to the jbd2_journal_stop() function calling
    jbd2_might_wait_for_commit(), potentially causing a deadlock if the
    EXT4_IOC_MIGRATE call races with a write(2) system call.
    
    This problem only arises when CONFIG_PROVE_LOCKING is enabled. In this
    case, the jbd2_might_wait_for_commit macro locks jbd2_handle in the
    jbd2_journal_stop function while i_data_sem is locked. This triggers
    lockdep because the jbd2_journal_start function might also lock the same
    jbd2_handle simultaneously.
    
    Found by Linux Verification Center (linuxtesting.org) with syzkaller.
    
    Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
    Co-developed-by: Mikhail Ukhin <mish.uxin2012@yandex.ru>
    Signed-off-by: Mikhail Ukhin <mish.uxin2012@yandex.ru>
    Signed-off-by: Artem Sadovnikov <ancowi69@gmail.com>
    Rule: add
    Link: https://lore.kernel.org/stable/20240404095000.5872-1-mish.uxin2012%40yandex.ru
    Link: https://patch.msgid.link/20240829152210.2754-1-ancowi69@gmail.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space() [+ + +]
Author: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Date:   Wed Jul 24 17:11:16 2024 +0100

    ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space()
    
    commit 972090651ee15e51abfb2160e986fa050cfc7a40 upstream.
    
    Function __jbd2_log_wait_for_space() assumes that '0' is not a valid value
    for transaction IDs, which is incorrect.  Don't assume that and invoke
    jbd2_log_wait_commit() if the journal had a committing transaction instead.
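    
    A minimal userspace sketch of the problem, not the kernel code: tids are
    32-bit sequence numbers that wrap around, so 0 is an ordinary value.
    tid_gt() follows the wraparound-safe comparison style used by jbd2; the
    journal_snapshot struct and both should_wait_*() helpers are hypothetical
    names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t tid_t;

/* Wraparound-safe "x is newer than y" comparison on sequence numbers. */
static bool tid_gt(tid_t x, tid_t y)
{
	int32_t difference = (int32_t)(tid_t)(x - y);

	return difference > 0;
}

/* Hypothetical snapshot of the journal state, taken under a lock. */
struct journal_snapshot {
	bool has_committing;	/* is a transaction committing right now? */
	tid_t committing_tid;	/* only meaningful if has_committing is set */
};

/* Buggy shape: uses tid 0 as a sentinel for "no committing transaction". */
static bool should_wait_buggy(const struct journal_snapshot *s)
{
	tid_t tid = s->has_committing ? s->committing_tid : 0;

	return tid != 0;
}

/* Fixed shape: decide on the existence of the transaction, not its tid. */
static bool should_wait_fixed(const struct journal_snapshot *s)
{
	return s->has_committing;
}
```

    With a committing transaction whose tid is 0, the sentinel-based check
    skips the wait, while the flag-based check does not.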
    
    Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://patch.msgid.link/20240724161119.13448-3-luis.henriques@linux.dev
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: fix incorrect tid assumption in ext4_fc_mark_ineligible() [+ + +]
Author: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Date:   Wed Jul 24 17:11:18 2024 +0100

    ext4: fix incorrect tid assumption in ext4_fc_mark_ineligible()
    
    commit ebc4b2c1ac92fc0f8bf3f5a9c285a871d5084a6b upstream.
    
    Function ext4_fc_mark_ineligible() assumes that '0' is not a valid value
    for transaction IDs, which is incorrect.
    
    Furthermore, the sbi->s_fc_ineligible_tid handling also makes the same
    assumption by being initialised to '0'.  Fortunately, the sb flag
    EXT4_MF_FC_INELIGIBLE can be used to check whether sbi->s_fc_ineligible_tid
    has been previously set instead of comparing it with '0'.
    
    Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://patch.msgid.link/20240724161119.13448-5-luis.henriques@linux.dev
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit() [+ + +]
Author: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Date:   Wed Jul 24 17:11:15 2024 +0100

    ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit()
    
    commit dd589b0f1445e1ea1085b98edca6e4d5dedb98d0 upstream.
    
    Function ext4_wait_for_tail_page_commit() assumes that '0' is not a valid
    value for transaction IDs, which is incorrect.  Don't assume that and invoke
    jbd2_log_wait_commit() if the journal had a committing transaction instead.
    
    Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://patch.msgid.link/20240724161119.13448-2-luis.henriques@linux.dev
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: fix incorrect tid assumption in jbd2_journal_shrink_checkpoint_list() [+ + +]
Author: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Date:   Wed Jul 24 17:11:17 2024 +0100

    ext4: fix incorrect tid assumption in jbd2_journal_shrink_checkpoint_list()
    
    commit 7a6443e1dad70281f99f0bd394d7fd342481a632 upstream.
    
    Function jbd2_journal_shrink_checkpoint_list() assumes that '0' is not a
    valid value for transaction IDs, which is incorrect.  Don't assume that and
    use two extra boolean variables to control the loop iterations and keep
    track of the first and last tid.
    
    Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://patch.msgid.link/20240724161119.13448-4-luis.henriques@linux.dev
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: fix off by one issue in alloc_flex_gd() [+ + +]
Author: Baokun Li <libaokun1@huawei.com>
Date:   Fri Sep 27 21:33:29 2024 +0800

    ext4: fix off by one issue in alloc_flex_gd()
    
    commit 6121258c2b33ceac3d21f6a221452692c465df88 upstream.
    
    Wesley reported an issue:
    
    ==================================================================
    EXT4-fs (dm-5): resizing filesystem from 7168 to 786432 blocks
    ------------[ cut here ]------------
    kernel BUG at fs/ext4/resize.c:324!
    CPU: 9 UID: 0 PID: 3576 Comm: resize2fs Not tainted 6.11.0+ #27
    RIP: 0010:ext4_resize_fs+0x1212/0x12d0
    Call Trace:
     __ext4_ioctl+0x4e0/0x1800
     ext4_ioctl+0x12/0x20
     __x64_sys_ioctl+0x99/0xd0
     x64_sys_call+0x1206/0x20d0
     do_syscall_64+0x72/0x110
     entry_SYSCALL_64_after_hwframe+0x76/0x7e
    ==================================================================
    
    While reviewing the patch, Honza found that when adjusting resize_bg in
    alloc_flex_gd(), it was possible for flex_gd->resize_bg to be bigger than
    flexbg_size.
    
    The reproduction of the problem requires the following:
    
     o_group = flexbg_size * 2 * n;
     o_size = (o_group + 1) * group_size;
     n_group: [o_group + flexbg_size, o_group + flexbg_size * 2)
     n_size = (n_group + 1) * group_size;
    
    Take n=0,flexbg_size=16 as an example:
    
                  last:15
    |o---------------|--------------n-|
    o_group:0    resize to      n_group:30
    
    The corresponding reproducer is:
    
    img=test.img
    rm -f $img
    truncate -s 600M $img
    mkfs.ext4 -F $img -b 1024 -G 16 8M
    dev=`losetup -f --show $img`
    mkdir -p /tmp/test
    mount $dev /tmp/test
    resize2fs $dev 248M
    
    Delete the problematic plus 1 to fix the issue, and add a WARN_ON_ONCE()
    to prevent the issue from happening again.
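    
    The example above can be checked with a small userspace sketch.  It
    assumes the resize_bg adjustment has the shape shown in
    resize_bg_sketch(); that function, fls32(), max_u() and the plus_one
    parameter are illustrative names, not the kernel's.  plus_one = 1 models
    the behaviour before this patch and plus_one = 0 the behaviour after it.

```c
/* Find last set bit, 1-based: fls32(0) == 0, fls32(15) == 4, fls32(16) == 5. */
static unsigned int fls32(unsigned int x)
{
	unsigned int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Assumed shape of the adjustment: shrink resize_bg when fewer than
 * flexbg_size groups need to be added.  flexbg_size is a power of two.
 */
static unsigned int resize_bg_sketch(unsigned int flexbg_size,
				     unsigned int o_group,
				     unsigned int n_group,
				     unsigned int plus_one)
{
	unsigned int bg = flexbg_size;
	unsigned int last_group = o_group | (bg - 1);

	if (n_group <= last_group)
		bg = 1u << fls32(n_group - o_group + plus_one);
	else if (n_group - last_group < bg)
		bg = 1u << max_u(fls32(last_group - o_group + plus_one),
				 fls32(n_group - last_group));
	return bg;
}
```

    For flexbg_size = 16, o_group = 0 and n_group = 30, the sketch yields 32
    with the extra plus 1 (bigger than flexbg_size) and 16 without it.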
    
    [ Note: another reproducer which this commit fixes is:
    
      img=test.img
      rm -f $img
      truncate -s 25MiB $img
      mkfs.ext4 -b 4096 -E nodiscard,lazy_itable_init=0,lazy_journal_init=0 $img
      truncate -s 3GiB $img
      dev=`losetup -f --show $img`
      mkdir -p /tmp/test
      mount $dev /tmp/test
      resize2fs $dev 3G
      umount $dev
      losetup -d $dev
    
      -- TYT ]
    
    Reported-by: Wesley Hershberger <wesley.hershberger@canonical.com>
    Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2081231
    Reported-by: Stéphane Graber <stgraber@stgraber.org>
    Closes: https://lore.kernel.org/all/20240925143325.518508-1-aleksandr.mikhalitsyn@canonical.com/
    Tested-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
    Tested-by: Eric Sandeen <sandeen@redhat.com>
    Fixes: 665d3e0af4d3 ("ext4: reduce unnecessary memory allocation in alloc_flex_gd()")
    Cc: stable@vger.kernel.org
    Signed-off-by: Baokun Li <libaokun1@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://patch.msgid.link/20240927133329.1015041-1-libaokun@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: fix slab-use-after-free in ext4_split_extent_at() [+ + +]
Author: Baokun Li <libaokun1@huawei.com>
Date:   Thu Aug 22 10:35:23 2024 +0800

    ext4: fix slab-use-after-free in ext4_split_extent_at()
    
    commit c26ab35702f8cd0cdc78f96aa5856bfb77be798f upstream.
    
    We hit the following use-after-free:
    
    ==================================================================
    BUG: KASAN: slab-use-after-free in ext4_split_extent_at+0xba8/0xcc0
    Read of size 2 at addr ffff88810548ed08 by task kworker/u20:0/40
    CPU: 0 PID: 40 Comm: kworker/u20:0 Not tainted 6.9.0-dirty #724
    Call Trace:
     <TASK>
     kasan_report+0x93/0xc0
     ext4_split_extent_at+0xba8/0xcc0
     ext4_split_extent.isra.0+0x18f/0x500
     ext4_split_convert_extents+0x275/0x750
     ext4_ext_handle_unwritten_extents+0x73e/0x1580
     ext4_ext_map_blocks+0xe20/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    
    Allocated by task 40:
     __kmalloc_noprof+0x1ac/0x480
     ext4_find_extent+0xf3b/0x1e70
     ext4_ext_map_blocks+0x188/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    
    Freed by task 40:
     kfree+0xf1/0x2b0
     ext4_find_extent+0xa71/0x1e70
     ext4_ext_insert_extent+0xa22/0x3260
     ext4_split_extent_at+0x3ef/0xcc0
     ext4_split_extent.isra.0+0x18f/0x500
     ext4_split_convert_extents+0x275/0x750
     ext4_ext_handle_unwritten_extents+0x73e/0x1580
     ext4_ext_map_blocks+0xe20/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    ==================================================================
    
    The issue is triggered as follows:
    
    ext4_split_extent_at
      path = *ppath
      ext4_ext_insert_extent(ppath)
        ext4_ext_create_new_leaf(ppath)
          ext4_find_extent(orig_path)
            path = *orig_path
            read_extent_tree_block
              // return -ENOMEM or -EIO
            ext4_free_ext_path(path)
              kfree(path)
            *orig_path = NULL
      a. If err is -ENOMEM:
      ext4_ext_dirty(path + path->p_depth)
      // path use-after-free !!!
      b. If err is -EIO and we have EXT_DEBUG defined:
      ext4_ext_show_leaf(path)
        eh = path[depth].p_hdr
        // path also use-after-free !!!
    
    So when trying to zeroout or fix the extent length, call ext4_find_extent()
    to update the path.
    
    In addition we use *ppath directly as an ext4_ext_show_leaf() input to
    avoid possible use-after-free when EXT_DEBUG is defined, and to avoid
    unnecessary path updates.
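    
    The hazard in the flow above is a local copy of *ppath that goes stale
    when a callee frees the path.  The following is a simplified userspace
    model, not the actual fix (which looks the path up again with
    ext4_find_extent()); struct path, insert_may_fail() and split_at() are
    hypothetical names.

```c
#include <stdlib.h>

struct path {
	int depth;
};

/*
 * Hypothetical callee: on failure it frees the caller's path and clears
 * the caller's pointer, like the error path shown in the flow above.
 */
static int insert_may_fail(struct path **ppath, int fail)
{
	if (fail) {
		free(*ppath);
		*ppath = NULL;
		return -1;
	}
	return 0;
}

/* Returns 1 if the path was still usable after the call, 0 otherwise. */
static int split_at(struct path **ppath, int fail)
{
	struct path *path;
	int err = insert_may_fail(ppath, fail);

	/* Reload from *ppath; a copy taken before the call may be freed. */
	path = *ppath;
	if (err && !path)
		return 0;	/* nothing left that is safe to touch */

	path->depth++;
	return 1;
}
```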
    
    Fixes: dfe5080939ea ("ext4: drop EXT4_EX_NOFREE_ON_ERR from rest of extents handling code")
    Cc: stable@kernel.org
    Signed-off-by: Baokun Li <libaokun1@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
    Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
    Link: https://patch.msgid.link/20240822023545.1994557-4-libaokun@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: fix timer use-after-free on failed mount [+ + +]
Author: Xiaxi Shen <shenxiaxi26@gmail.com>
Date:   Sun Jul 14 21:33:36 2024 -0700

    ext4: fix timer use-after-free on failed mount
    
    commit 0ce160c5bdb67081a62293028dc85758a8efb22a upstream.
    
    Syzbot has found an ODEBUG bug in ext4_fill_super.
    
    The del_timer_sync function cancels the s_err_report timer,
    which reminds about filesystem errors daily. We should
    guarantee the timer is no longer active before kfree(sbi).
    
    When filesystem mounting fails, the flow goes to failed_mount3,
    where an error occurs when ext4_stop_mmpd is called, causing
    a read I/O failure. This triggers the ext4_handle_error function
    that ultimately re-arms the timer,
    leaving the s_err_report timer active before kfree(sbi) is called.
    
    Fix the issue by canceling the s_err_report timer after calling ext4_stop_mmpd.
    
    Signed-off-by: Xiaxi Shen <shenxiaxi26@gmail.com>
    Reported-and-tested-by: syzbot+59e0101c430934bc9a36@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=59e0101c430934bc9a36
    Link: https://patch.msgid.link/20240715043336.98097-1-shenxiaxi26@gmail.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: mark fc as ineligible using an handle in ext4_xattr_set() [+ + +]
Author: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Date:   Mon Sep 23 11:49:09 2024 +0100

    ext4: mark fc as ineligible using an handle in ext4_xattr_set()
    
    commit 04e6ce8f06d161399e5afde3df5dcfa9455b4952 upstream.
    
    Calling ext4_fc_mark_ineligible() with a NULL handle is racy and may result
    in a fast-commit being done before the filesystem is effectively marked as
    ineligible.  This patch moves the call to this function so that a handle
    can be used.  If a transaction fails to start, then there's no point in
    trying to mark the filesystem as ineligible, and an error will eventually be
    returned to user-space.
    
    Suggested-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://patch.msgid.link/20240923104909.18342-3-luis.henriques@linux.dev
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: no need to continue when the number of entries is 1 [+ + +]
Author: Edward Adam Davis <eadavis@qq.com>
Date:   Mon Jul 1 22:25:03 2024 +0800

    ext4: no need to continue when the number of entries is 1
    
    commit 1a00a393d6a7fb1e745a41edd09019bd6a0ad64c upstream.
    
    Fixes: ac27a0ec112a ("[PATCH] ext4: initial copy of files from ext3")
    Reported-by: syzbot+ae688d469e36fb5138d0@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=ae688d469e36fb5138d0
    Signed-off-by: Edward Adam Davis <eadavis@qq.com>
    Reported-and-tested-by: syzbot+ae688d469e36fb5138d0@syzkaller.appspotmail.com
    Link: https://patch.msgid.link/tencent_BE7AEE6C7C2D216CB8949CE8E6EE7ECC2C0A@qq.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: propagate errors from ext4_find_extent() in ext4_insert_range() [+ + +]
Author: Baokun Li <libaokun1@huawei.com>
Date:   Thu Aug 22 10:35:30 2024 +0800

    ext4: propagate errors from ext4_find_extent() in ext4_insert_range()
    
    commit 369c944ed1d7c3fb7b35f24e4735761153afe7b3 upstream.
    
    Even though ext4_find_extent() returns an error, ext4_insert_range() still
    returns 0. This may confuse the user as to why fallocate returns success,
    but the contents of the file are not as expected. So propagate the error
    returned by ext4_find_extent() to avoid inconsistencies.
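    
    A small userspace sketch of the two shapes, using plain negative errno
    values instead of the kernel's error pointers.  find_extent_sketch() and
    the insert_range_*() functions are hypothetical stand-ins, not the ext4
    code.

```c
#include <errno.h>

/* Hypothetical stand-in for a lookup that can fail with a negative errno. */
static int find_extent_sketch(int fail)
{
	return fail ? -EIO : 0;
}

/* Buggy shape: the lookup error is detected but never stored in ret. */
static int insert_range_buggy(int fail)
{
	int ret = 0;

	if (find_extent_sketch(fail) < 0)
		goto out;	/* ret is still 0 here */
	/* ... shift extents ... */
out:
	return ret;
}

/* Fixed shape: record the error before jumping to the exit path. */
static int insert_range_fixed(int fail)
{
	int ret = find_extent_sketch(fail);

	if (ret < 0)
		goto out;
	/* ... shift extents ... */
out:
	return ret;
}
```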
    
    Fixes: 331573febb6a ("ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate")
    Cc: stable@kernel.org
    Signed-off-by: Baokun Li <libaokun1@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
    Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
    Link: https://patch.msgid.link/20240822023545.1994557-11-libaokun@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: update orig_path in ext4_find_extent() [+ + +]
Author: Baokun Li <libaokun1@huawei.com>
Date:   Thu Aug 22 10:35:25 2024 +0800

    ext4: update orig_path in ext4_find_extent()
    
    commit 5b4b2dcace35f618fe361a87bae6f0d13af31bc1 upstream.
    
    In ext4_find_extent(), if the path is not big enough, we free it and set
    *orig_path to NULL. But after reallocating and successfully initializing
    the path, we don't update *orig_path, in which case the caller gets a
    valid path but a NULL ppath, and this may cause a NULL pointer dereference
    or a path memory leak. For example:
    
    ext4_split_extent
      path = *ppath = 2000
      ext4_find_extent
        if (depth > path[0].p_maxdepth)
          kfree(path = 2000);
          *orig_path = path = NULL;
          path = kcalloc() = 3000
      ext4_split_extent_at(*ppath = NULL)
        path = *ppath;
        ex = path[depth].p_ext;
        // NULL pointer dereference!
    
    ==================================================================
    BUG: kernel NULL pointer dereference, address: 0000000000000010
    CPU: 6 UID: 0 PID: 576 Comm: fsstress Not tainted 6.11.0-rc2-dirty #847
    RIP: 0010:ext4_split_extent_at+0x6d/0x560
    Call Trace:
     <TASK>
     ext4_split_extent.isra.0+0xcb/0x1b0
     ext4_ext_convert_to_initialized+0x168/0x6c0
     ext4_ext_handle_unwritten_extents+0x325/0x4d0
     ext4_ext_map_blocks+0x520/0xdb0
     ext4_map_blocks+0x2b0/0x690
     ext4_iomap_begin+0x20e/0x2c0
    [...]
    ==================================================================
    
    Therefore, *orig_path is updated when the extent lookup succeeds, so that
    the caller can safely use path or *ppath.
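    
    As a rough userspace model of this rule (struct path and find_path() are
    hypothetical, and the real function does much more), a lookup that may
    reallocate the path publishes the result through the caller's pointer
    before returning:

```c
#include <stdlib.h>

struct path {
	int maxdepth;
};

/*
 * Hypothetical lookup: replaces the path if it is too small and always
 * writes the result back through *orig_path, so the caller's pointer and
 * the returned pointer never disagree on success.
 */
static struct path *find_path(struct path **orig_path, int depth)
{
	struct path *path = orig_path ? *orig_path : NULL;

	if (path && depth > path->maxdepth) {
		free(path);
		*orig_path = path = NULL;
	}
	if (!path) {
		path = calloc((size_t)depth + 1, sizeof(*path));
		if (!path)
			return NULL;
		path->maxdepth = depth;
	}
	if (orig_path)
		*orig_path = path;	/* write-back on success */
	return path;
}
```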
    
    Fixes: 10809df84a4d ("ext4: teach ext4_ext_find_extent() to realloc path if necessary")
    Cc: stable@kernel.org
    Signed-off-by: Baokun Li <libaokun1@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://patch.msgid.link/20240822023545.1994557-6-libaokun@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: use handle to mark fc as ineligible in __track_dentry_update() [+ + +]
Author: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Date:   Mon Sep 23 11:49:08 2024 +0100

    ext4: use handle to mark fc as ineligible in __track_dentry_update()
    
    commit faab35a0370fd6e0821c7a8dd213492946fc776f upstream.
    
    Calling ext4_fc_mark_ineligible() with a NULL handle is racy and may result
    in a fast-commit being done before the filesystem is effectively marked as
    ineligible.  This patch fixes the calls to this function in
    __track_dentry_update() by adding an extra parameter to the callback used in
    ext4_fc_track_template().
    
    Suggested-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://patch.msgid.link/20240923104909.18342-2-luis.henriques@linux.dev
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
f2fs: add write priority option based on zone UFS [+ + +]
Author: Liao Yuanhong <liaoyuanhong@vivo.com>
Date:   Mon Jul 15 20:34:51 2024 +0800

    f2fs: add write priority option based on zone UFS
    
    [ Upstream commit 8444ce524947daf441546b5b3a0c418706dade35 ]
    
    Currently, we are using a mix of traditional UFS and zone UFS to support
    some functionalities that cannot be achieved on zone UFS alone. However,
    there are some issues with this approach. There exists a significant
    performance difference between traditional UFS and zone UFS. Under normal
    usage, we prioritize writes to zone UFS. However, in critical conditions
    (such as when the entire UFS is almost full), we cannot determine whether
    data will be written to traditional UFS or zone UFS. This can lead to
    significant performance fluctuations, which is not conducive to
    development and testing. To address this, we have added an option
    zlu_io_enable under sys with the following three modes:
    1) zlu_io_enable == 0: Normal mode, prioritize writing to zone UFS;
    2) zlu_io_enable == 1: Zone UFS only mode, only allow writing to zone UFS;
    3) zlu_io_enable == 2: Traditional UFS priority mode, prioritize writing
    to traditional UFS.
    
    Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
    Signed-off-by: Wu Bo <bo.wu@vivo.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Stable-dep-of: 65a6ce4726c2 ("f2fs: fix to don't panic system for no free segment fault injection")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

f2fs: do FG_GC when GC boosting is required for zoned devices [+ + +]
Author: Daeho Jeong <daehojeong@google.com>
Date:   Mon Sep 9 15:19:44 2024 -0700

    f2fs: do FG_GC when GC boosting is required for zoned devices
    
    [ Upstream commit 9748c2ddea4a3f46a498bff4cf2bf9a5629e3f8b ]
    
    Under low free section count, we need to use FG_GC instead of BG_GC to
    recover free sections.
    
    Signed-off-by: Daeho Jeong <daehojeong@google.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

f2fs: fix to don't panic system for no free segment fault injection [+ + +]
Author: Chao Yu <chao@kernel.org>
Date:   Tue Sep 10 09:16:19 2024 +0800

    f2fs: fix to don't panic system for no free segment fault injection
    
    [ Upstream commit 65a6ce4726c27b45600303f06496fef46d00b57f ]
    
    syzbot reports a f2fs bug as below:
    
    F2FS-fs (loop0): inject no free segment in get_new_segment of __allocate_new_segment+0x1ce/0x940 fs/f2fs/segment.c:3167
    F2FS-fs (loop0): Stopped filesystem due to reason: 7
    ------------[ cut here ]------------
    kernel BUG at fs/f2fs/segment.c:2748!
    CPU: 0 UID: 0 PID: 5109 Comm: syz-executor304 Not tainted 6.11.0-rc6-syzkaller-00363-g89f5e14d05b4 #0
    RIP: 0010:get_new_segment fs/f2fs/segment.c:2748 [inline]
    RIP: 0010:new_curseg+0x1f61/0x1f70 fs/f2fs/segment.c:2836
    Call Trace:
     __allocate_new_segment+0x1ce/0x940 fs/f2fs/segment.c:3167
     f2fs_allocate_new_section fs/f2fs/segment.c:3181 [inline]
     f2fs_allocate_pinning_section+0xfa/0x4e0 fs/f2fs/segment.c:3195
     f2fs_expand_inode_data+0x5d6/0xbb0 fs/f2fs/file.c:1799
     f2fs_fallocate+0x448/0x960 fs/f2fs/file.c:1903
     vfs_fallocate+0x553/0x6c0 fs/open.c:334
     do_vfs_ioctl+0x2592/0x2e50 fs/ioctl.c:886
     __do_sys_ioctl fs/ioctl.c:905 [inline]
     __se_sys_ioctl+0x81/0x170 fs/ioctl.c:893
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0010:get_new_segment fs/f2fs/segment.c:2748 [inline]
    RIP: 0010:new_curseg+0x1f61/0x1f70 fs/f2fs/segment.c:2836
    
    The root cause is that injecting a no free segment fault into f2fs
    panics the system, which it should not do.  Fix it.
    
    Fixes: 8b10d3653735 ("f2fs: introduce FAULT_NO_SEGMENT")
    Reported-by: syzbot+341e5f32ebafbb46b81c@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000f0ee5b0621ab694b@google.com
    Signed-off-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

f2fs: forcibly migrate to secure space for zoned device file pinning [+ + +]
Author: Daeho Jeong <daehojeong@google.com>
Date:   Thu Sep 12 09:59:58 2024 -0700

    f2fs: forcibly migrate to secure space for zoned device file pinning
    
    [ Upstream commit 5cc69a27abfa91abbb39fc584f82d6c867b60f47 ]
    
    We need to migrate data blocks even though it is full to secure space
    for zoned device file pinning.
    
    Fixes: 9703d69d9d15 ("f2fs: support file pinning for zoned devices")
    Signed-off-by: Daeho Jeong <daehojeong@google.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

f2fs: increase BG GC migration window granularity when boosted for zoned devices [+ + +]
Author: Daeho Jeong <daehojeong@google.com>
Date:   Mon Sep 9 15:19:43 2024 -0700

    f2fs: increase BG GC migration window granularity when boosted for zoned devices
    
    [ Upstream commit 2223fe652f759649ae1d520e47e5f06727c0acbd ]
    
    A bigger BG GC migration window granularity is needed when free sections
    are running low.
    
    Signed-off-by: Daeho Jeong <daehojeong@google.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

f2fs: introduce migration_window_granularity [+ + +]
Author: Daeho Jeong <daehojeong@google.com>
Date:   Mon Sep 9 15:19:41 2024 -0700

    f2fs: introduce migration_window_granularity
    
    [ Upstream commit 8c890c4c60342719526520133fb1b6f69f196ab8 ]
    
    We can control the scanning window granularity for GC migration. For
    more frequent scanning and GC on zoned devices, we need a fine grained
    control knob for it.
    
    Signed-off-by: Daeho Jeong <daehojeong@google.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

f2fs: make BG GC more aggressive for zoned devices [+ + +]
Author: Daeho Jeong <daehojeong@google.com>
Date:   Mon Sep 9 15:19:40 2024 -0700

    f2fs: make BG GC more aggressive for zoned devices
    
    [ Upstream commit 5062b5bed4323275f2f89bc185c6a28d62cfcfd5 ]
    
    Since we don't have any GC on the device side for zoned devices, we need
    more aggressive BG GC. So, tune the parameters for that.
    
    Signed-off-by: Daeho Jeong <daehojeong@google.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
fbdev: efifb: Register sysfs groups through driver core [+ + +]
Author: Thomas Weißschuh <linux@weissschuh.net>
Date:   Tue Aug 27 17:25:13 2024 +0200

    fbdev: efifb: Register sysfs groups through driver core
    
    [ Upstream commit 95cdd538e0e5677efbdf8aade04ec098ab98f457 ]
    
    The driver core can register and clean up sysfs groups already.
    Make use of that functionality to simplify the error handling and
    cleanup.
    
    Also avoid a UAF race during unregistering where the sysfs attributes
    were usable after the info struct was freed.
    
    Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
    Signed-off-by: Helge Deller <deller@gmx.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fbdev: pxafb: Fix possible use after free in pxafb_task() [+ + +]
Author: Kaixin Wang <kxwang23@m.fudan.edu.cn>
Date:   Wed Sep 11 22:29:52 2024 +0800

    fbdev: pxafb: Fix possible use after free in pxafb_task()
    
    [ Upstream commit 4a6921095eb04a900e0000da83d9475eb958e61e ]
    
    In the pxafb_probe function, it calls the pxafb_init_fbinfo function,
    after which &fbi->task is associated with pxafb_task. Moreover,
    within this pxafb_init_fbinfo function, the pxafb_blank function
    within the &pxafb_ops struct is capable of scheduling work.
    
    If the module is removed, pxafb_remove is called to clean up. It calls
    the unregister_framebuffer function, which can call
    do_unregister_framebuffer to free fbi->fb through put_fb_info(fb_info)
    while the work mentioned above is still in use.
    The sequence of operations that may lead to a UAF bug is as follows:
    
    CPU0                                                CPU1
    
                                       | pxafb_task
    pxafb_remove                       |
    unregister_framebuffer(info)       |
    do_unregister_framebuffer(fb_info) |
    put_fb_info(fb_info)               |
    // free fbi->fb                    | set_ctrlr_state(fbi, state)
                                       | __pxafb_lcd_power(fbi, 0)
                                       | fbi->lcd_power(on, &fbi->fb.var)
                                       | //use fbi->fb
    
    Fix it by ensuring that the work is canceled before proceeding
    with the cleanup in pxafb_remove.
    
    Note that only root user can remove the driver at runtime.
    
    Signed-off-by: Kaixin Wang <kxwang23@m.fudan.edu.cn>
    Signed-off-by: Helge Deller <deller@gmx.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
firmware/sysfb: Disable sysfb for firmware buffers with unknown parent [+ + +]
Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Tue Sep 24 10:41:03 2024 +0200

    firmware/sysfb: Disable sysfb for firmware buffers with unknown parent
    
    commit ad604f0a4c040dcb8faf44dc72db25e457c28076 upstream.
    
    The sysfb framebuffer handling only operates on graphics devices
    that provide the system's firmware framebuffer. If that device is
    not known, assume that any graphics device has been initialized by
    firmware.
    
    Fixes a problem on i915 where sysfb does not release the firmware
    framebuffer after the native graphics driver loaded.
    
    Reported-by: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com>
    Closes: https://lore.kernel.org/dri-devel/SJ1PR11MB6129EFB8CE63D1EF6D932F94B96F2@SJ1PR11MB6129.namprd11.prod.outlook.com/
    Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12160
    Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
    Fixes: b49420d6a1ae ("video/aperture: optionally match the device in sysfb_disable()")
    Cc: Javier Martinez Canillas <javierm@redhat.com>
    Cc: Thomas Zimmermann <tzimmermann@suse.de>
    Cc: Helge Deller <deller@gmx.de>
    Cc: Sam Ravnborg <sam@ravnborg.org>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Cc: dri-devel@lists.freedesktop.org
    Cc: Linux regression tracking (Thorsten Leemhuis) <regressions@leemhuis.info>
    Cc: <stable@vger.kernel.org> # v6.11+
    Acked-by: Alex Deucher <alexander.deucher@amd.com>
    Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240924084227.262271-1-tzimmermann@suse.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
firmware: tegra: bpmp: Drop unused mbox_client_to_bpmp() [+ + +]
Author: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Date:   Fri Aug 16 15:57:21 2024 +0200

    firmware: tegra: bpmp: Drop unused mbox_client_to_bpmp()
    
    commit 9c3a62c20f7fb00294a4237e287254456ba8a48b upstream.
    
    mbox_client_to_bpmp() is not used, W=1 builds:
    
      drivers/firmware/tegra/bpmp.c:28:1: error: unused function 'mbox_client_to_bpmp' [-Werror,-Wunused-function]
    
    Fixes: cdfa358b248e ("firmware: tegra: Refactor BPMP driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Signed-off-by: Thierry Reding <treding@nvidia.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
fs/inode: Prevent dump_mapping() accessing invalid dentry.d_name.name [+ + +]
Author: Li Zhijian <lizhijian@fujitsu.com>
Date:   Mon Aug 26 13:55:03 2024 +0800

    fs/inode: Prevent dump_mapping() accessing invalid dentry.d_name.name
    
    [ Upstream commit 7f7b850689ac06a62befe26e1fd1806799e7f152 ]
    
    A crash is observed when hot-removing a memory device while a user is
    accessing hugetlb memory on it. The call trace is as follows:
    
    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 14045 at arch/x86/mm/fault.c:1278 do_user_addr_fault+0x2a0/0x790
    Modules linked in: kmem device_dax cxl_mem cxl_pmem cxl_port cxl_pci dax_hmem dax_pmem nd_pmem cxl_acpi nd_btt cxl_core crc32c_intel nvme virtiofs fuse nvme_core nfit libnvdimm dm_multipath scsi_dh_rdac scsi_dh_emc s
    mirror dm_region_hash dm_log dm_mod
    CPU: 1 PID: 14045 Comm: daxctl Not tainted 6.10.0-rc2-lizhijian+ #492
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
    RIP: 0010:do_user_addr_fault+0x2a0/0x790
    Code: 48 8b 00 a8 04 0f 84 b5 fe ff ff e9 1c ff ff ff 4c 89 e9 4c 89 e2 be 01 00 00 00 bf 02 00 00 00 e8 b5 ef 24 00 e9 42 fe ff ff <0f> 0b 48 83 c4 08 4c 89 ea 48 89 ee 4c 89 e7 5b 5d 41 5c 41 5d 41
    RSP: 0000:ffffc90000a575f0 EFLAGS: 00010046
    RAX: ffff88800c303600 RBX: 0000000000000000 RCX: 0000000000000000
    RDX: 0000000000001000 RSI: ffffffff82504162 RDI: ffffffff824b2c36
    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90000a57658
    R13: 0000000000001000 R14: ffff88800bc2e040 R15: 0000000000000000
    FS:  00007f51cb57d880(0000) GS:ffff88807fd00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000001000 CR3: 00000000072e2004 CR4: 00000000001706f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     ? __warn+0x8d/0x190
     ? do_user_addr_fault+0x2a0/0x790
     ? report_bug+0x1c3/0x1d0
     ? handle_bug+0x3c/0x70
     ? exc_invalid_op+0x14/0x70
     ? asm_exc_invalid_op+0x16/0x20
     ? do_user_addr_fault+0x2a0/0x790
     ? exc_page_fault+0x31/0x200
     exc_page_fault+0x68/0x200
    <...snip...>
    BUG: unable to handle page fault for address: 0000000000001000
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 800000000ad92067 P4D 800000000ad92067 PUD 7677067 PMD 0
     Oops: Oops: 0000 [#1] PREEMPT SMP PTI
     ---[ end trace 0000000000000000 ]---
     BUG: unable to handle page fault for address: 0000000000001000
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 800000000ad92067 P4D 800000000ad92067 PUD 7677067 PMD 0
     Oops: Oops: 0000 [#1] PREEMPT SMP PTI
     CPU: 1 PID: 14045 Comm: daxctl Kdump: loaded Tainted: G        W          6.10.0-rc2-lizhijian+ #492
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
     RIP: 0010:dentry_name+0x1f4/0x440
    <...snip...>
    ? dentry_name+0x2fa/0x440
    vsnprintf+0x1f3/0x4f0
    vprintk_store+0x23a/0x540
    vprintk_emit+0x6d/0x330
    _printk+0x58/0x80
    dump_mapping+0x10b/0x1a0
    ? __pfx_free_object_rcu+0x10/0x10
    __dump_page+0x26b/0x3e0
    ? vprintk_emit+0xe0/0x330
    ? _printk+0x58/0x80
    ? dump_page+0x17/0x50
    dump_page+0x17/0x50
    do_migrate_range+0x2f7/0x7f0
    ? do_migrate_range+0x42/0x7f0
    ? offline_pages+0x2f4/0x8c0
    offline_pages+0x60a/0x8c0
    memory_subsys_offline+0x9f/0x1c0
    ? lockdep_hardirqs_on+0x77/0x100
    ? _raw_spin_unlock_irqrestore+0x38/0x60
    device_offline+0xe3/0x110
    state_store+0x6e/0xc0
    kernfs_fop_write_iter+0x143/0x200
    vfs_write+0x39f/0x560
    ksys_write+0x65/0xf0
    do_syscall_64+0x62/0x130
    
    Although dump_mapping() already performs some sanity checks before the
    print facility parses '%pd', it is still possible to run into an
    invalid dentry.d_name.name.
    
    Since dump_mapping() only needs to dump the filename, retrieve it
    directly in a safer way to prevent an unnecessary crash.
    
    Note that whether the filename is retrieved with '%pd' or with
    strncpy_from_kernel_nofault(), it could be unreliable.
    
    Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
    Link: https://lore.kernel.org/r/20240826055503.1522320-1-lizhijian@fujitsu.com
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
gfs2: fix double destroy_workqueue error [+ + +]
Author: Julian Sun <sunjunchao2870@gmail.com>
Date:   Tue Aug 20 11:31:48 2024 +0800

    gfs2: fix double destroy_workqueue error
    
    commit 6cb9df81a2c462b89d2f9611009ab43ae8717841 upstream.
    
    When gfs2_fill_super() fails, destroy_workqueue() is called within
    gfs2_gl_hash_clear(), and the subsequent code path calls
    destroy_workqueue() on the same work queue again.
    
    This issue can be fixed by setting the work queue pointer to NULL after
    the first destroy_workqueue() call and checking for a NULL pointer
    before attempting to destroy the work queue again.
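
    A minimal userspace sketch of the NULL-after-destroy pattern described
    above. It is not the gfs2 code: struct fs_info, struct workqueue and
    both helpers are hypothetical names, and free() merely stands in for
    destroy_workqueue().

```c
#include <stdlib.h>

/* Hypothetical stand-ins for a workqueue and the object that owns it. */
struct workqueue { int pending; };
struct fs_info { struct workqueue *wq; int destroy_calls; };

static int fs_info_init(struct fs_info *fs)
{
	fs->destroy_calls = 0;
	fs->wq = calloc(1, sizeof(*fs->wq));
	return fs->wq ? 0 : -1;
}

/* Safe to call more than once: a second call sees NULL and returns. */
static void fs_info_destroy_wq(struct fs_info *fs)
{
	if (!fs->wq)
		return;
	free(fs->wq);		/* stands in for destroy_workqueue() */
	fs->wq = NULL;		/* mark as destroyed for later callers */
	fs->destroy_calls++;
}
```

    With this shape, an error path and a later cleanup path can both call
    fs_info_destroy_wq() without tearing the queue down twice.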
    
    Reported-by: syzbot+d34c2a269ed512c531b0@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=d34c2a269ed512c531b0
    Fixes: 30e388d57367 ("gfs2: Switch to a per-filesystem glock workqueue")
    Cc: stable@vger.kernel.org
    Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
    Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
gpio: davinci: fix lazy disable [+ + +]
Author: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Date:   Wed Aug 28 15:32:07 2024 +0200

    gpio: davinci: fix lazy disable
    
    commit 3360d41f4ac490282fddc3ccc0b58679aa5c065d upstream.
    
    On a few platforms such as TI's AM69 device, disable_irq() fails to keep
    track of the interrupts that happen between disable_irq() and
    enable_irq() and those interrupts are missed. Use the ->irq_unmask() and
    ->irq_mask() methods instead of ->irq_enable() and ->irq_disable() to
    correctly keep track of edges when disable_irq is called.
    
    This solves the issue of disable_irq() not working as expected on such
    platforms.
    
    Fixes: 23265442b02b ("ARM: davinci: irq_data conversion.")
    Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
    Signed-off-by: Parth Pancholi <parth.pancholi@toradex.com>
    Acked-by: Keerthy <j-keerthy@ti.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240828133207.493961-1-parth105105@gmail.com
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
gpiolib: Fix potential NULL pointer dereference in gpiod_get_label() [+ + +]
Author: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Date:   Thu Oct 3 14:13:51 2024 +0100

    gpiolib: Fix potential NULL pointer dereference in gpiod_get_label()
    
    [ Upstream commit 7b99b5ab885993bff010ebcd93be5e511c56e28a ]
    
    In `gpiod_get_label()`, it is possible that `srcu_dereference_check()` may
    return a NULL pointer, leading to a scenario where `label->str` is accessed
    without verifying if `label` itself is NULL.
    
    This patch adds a proper NULL check for `label` before accessing
    `label->str`. The check for `label->str != NULL` is removed because
    `label->str` can never be NULL if `label` is not NULL.
    
    This fixes the issue where the label name was being printed as `(efault)`
    when dumping the sysfs GPIO file when `label == NULL`.
    
    Fixes: 5a646e03e956 ("gpiolib: Return label, if set, for IRQ only line")
    Fixes: a86d27693066 ("gpiolib: fix the speed of descriptor label setting with SRCU")
    Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
    Link: https://lore.kernel.org/r/20241003131351.472015-1-prabhakar.mahadev-lad.rj@bp.renesas.com
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
gso: fix udp gso fraglist segmentation after pull from frag_list [+ + +]
Author: Willem de Bruijn <willemb@google.com>
Date:   Tue Oct 1 13:17:46 2024 -0400

    gso: fix udp gso fraglist segmentation after pull from frag_list
    
    commit a1e40ac5b5e9077fe1f7ae0eb88034db0f9ae1ab upstream.
    
    Detect gso fraglist skbs with corrupted geometry (see below) and
    pass these to skb_segment instead of skb_segment_list, as the first
    can segment them correctly.
    
    Valid SKB_GSO_FRAGLIST skbs
    - consist of two or more segments
    - the head_skb holds the protocol headers plus first gso_size
    - one or more frag_list skbs hold exactly one segment
    - all but the last must be gso_size
    
    Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
    modify these skbs, breaking these invariants.
    
    In extreme cases they pull all data into skb linear. For UDP, this
    causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
    udp_hdr(seg->next)->dest.
    
    Detect invalid geometry due to pull, by checking head_skb size.
    Don't just drop, as this may blackhole a destination. Convert to be
    able to pass to regular skb_segment.
    
    Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
    Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
    Signed-off-by: Willem de Bruijn <willemb@google.com>
    Cc: stable@vger.kernel.org
    Link: https://patch.msgid.link/20241001171752.107580-1-willemdebruijn.kernel@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
HID: bpf: fix cfi stubs for hid_bpf_ops [+ + +]
Author: Benjamin Tissoires <bentiss@kernel.org>
Date:   Fri Sep 27 16:17:41 2024 +0200

    HID: bpf: fix cfi stubs for hid_bpf_ops
    
    commit acd5f76fd5292c91628e04da83e8b78c986cfa2b upstream.
    
    With the introduction of commit e42ac1418055 ("bpf: Check unsupported ops
    from the bpf_struct_ops's cfi_stubs"), a HID-BPF struct_ops containing
    a .hid_hw_request() or a .hid_hw_output_report() was failing to load
    as the cfi stubs were not defined.
    
    Fix that by defining those simple static functions and restore HID-BPF
    functionality.
    
    This was detected with the HID selftests suddenly failing on Linus' tree.
    
    Cc: stable@vger.kernel.org # v6.11+
    Fixes: 9286675a2aed ("HID: bpf: add HID-BPF hooks for hid_hw_output_report")
    Fixes: 8bd0488b5ea5 ("HID: bpf: add HID-BPF hooks for hid_hw_raw_requests")
    Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
    Signed-off-by: Jiri Kosina <jkosina@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

HID: i2c-hid: ensure various commands do not interfere with each other [+ + +]
Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date:   Mon Sep 9 13:37:40 2024 -0700

    HID: i2c-hid: ensure various commands do not interfere with each other
    
    [ Upstream commit b4ed18a3d56eabd18cfd9841ff05111e3cfbe8f9 ]
    
    i2c-hid uses 2 shared buffers, the command buffer and the "raw" input
    buffer, for sending requests to peripherals and reading data from
    peripherals when executing a variety of commands. Such commands include
    reading of HID registers, requesting a particular power mode, getting
    and setting reports and so on. Because all such requests use the same 2
    buffers they should not execute simultaneously.
    
    Fix this by introducing a "cmd_lock" mutex and acquiring it whenever
    we need to access ihid->cmdbuf or ihid->rawbuf.
    
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Jiri Kosina <jkosina@suse.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

HID: Ignore battery for all ELAN I2C-HID devices [+ + +]
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Mon Aug 5 16:51:47 2024 +0200

    HID: Ignore battery for all ELAN I2C-HID devices
    
    [ Upstream commit bcc31692a1d1e21f0d06c5f727c03ee299d2264e ]
    
    Before this change there were 16 vid:pid based quirks to ignore the battery
    reported by Elan I2C-HID touchscreens on various Asus and HP laptops.
    
    And a report has been received that the 04F3:2A00 I2C touchscreen on
    the HP ProBook x360 11 G5 EE/86CF also reports a non present battery.
    
    Since I2C-HID devices are always built into laptops they are not battery
    powered, so it should be safe to just ignore the battery on all Elan
    I2C-HID devices, rather than adding a 17th quirk for the 04F3:2A00
    touchscreen.
    
    As reported in the changelog of commit a3a5a37efba1 ("HID: Ignore battery
    for ELAN touchscreens 2F2C and 4116"), which added 2 new Elan touchscreen
    quirks about a month ago, the HID reported battery seems to be related
    to a stylus being used. But even when a stylus is in use it does not
    properly report the charge of the stylus battery, instead the reported
    battery charge jumps from 0% to 1%. So it is best to just ignore the
    HID battery.
    
    Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2302776
    Cc: Louis Dalibard <ontake@ontake.dev>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Jiri Kosina <jkosina@suse.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

HID: multitouch: Add support for Thinkpad X12 Gen 2 Kbd Portfolio [+ + +]
Author: Vishnu Sankar <vishnuocv@gmail.com>
Date:   Sun Aug 18 16:27:29 2024 +0900

    HID: multitouch: Add support for Thinkpad X12 Gen 2 Kbd Portfolio
    
    [ Upstream commit 65b72ea91a257a5f0cb5a26b01194d3dd4b85298 ]
    
    This applies quirks similar to those used by the previous generation
    device, so that the Trackpoint and buttons on the touchpad work. The new
    USB KBD PID 0x61AE for the Thinkpad X12 Tab is added.
    
    Signed-off-by: Vishnu Sankar <vishnuocv@gmail.com>
    Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
    Signed-off-by: Jiri Kosina <jkosina@suse.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
hwmon: (nct6775) add G15CF to ASUS WMI monitoring list [+ + +]
Author: Denis Pauk <pauk.denis@gmail.com>
Date:   Mon Aug 12 18:26:38 2024 +0300

    hwmon: (nct6775) add G15CF to ASUS WMI monitoring list
    
    [ Upstream commit 1f432e4cf1dd3ecfec5ed80051b4611632a0fd51 ]
    
    G15CF boards have an nct6775 chip, but by default it is not used
    because of a resource conflict with the WMI method.
    
    Add the board to the WMI monitoring list.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=204807
    Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
    Tested-by: Attila <attila@fulop.one>
    Message-ID: <20240812152652.1303-1-pauk.denis@gmail.com>
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
i2c: core: Lock address during client device instantiation [+ + +]
Author: Heiner Kallweit <hkallweit1@gmail.com>
Date:   Thu Aug 15 21:44:50 2024 +0200

    i2c: core: Lock address during client device instantiation
    
    commit 8d3cefaf659265aa82b0373a563fdb9d16a2b947 upstream.
    
    Krzysztof reported an issue [0] which is caused by parallel attempts to
    instantiate the same I2C client device. This can happen if a driver
    supports auto-detection, but certain devices are also instantiated
    explicitly.
    The original change isn't actually wrong, it just revealed that I2C core
    isn't prepared yet to handle this scenario.
    Calls to i2c_new_client_device() can be nested, therefore we can't use a
    simple mutex here. Parallel instantiation of devices at different addresses
    is ok, so we just have to prevent parallel instantiation at the same address.
    We can use a bitmap with one bit per 7-bit I2C client address, and atomic
    bit operations to set/check/clear bits.
    Now a parallel attempt to instantiate a device at the same address will
    result in -EBUSY being returned, avoiding the "sysfs: cannot create duplicate
    filename" splash.
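
    A minimal userspace sketch of the per-address locking idea described
    above, written with C11 atomics rather than the kernel's own atomic bit
    operations. The names addr_lock(), addr_unlock() and addr_in_use are
    hypothetical and do not come from the I2C core.

```c
#include <errno.h>
#include <limits.h>
#include <stdatomic.h>

#define ADDR_COUNT	128	/* one bit per 7-bit client address */
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define WORD_COUNT	((ADDR_COUNT + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical bitmap: a set bit means "instantiation in progress". */
static atomic_ulong addr_in_use[WORD_COUNT];

/* Returns 0 if the address was free and is now locked, -EBUSY if taken. */
static int addr_lock(unsigned int addr)
{
	unsigned long mask = 1UL << (addr % BITS_PER_WORD);
	unsigned long old;

	if (addr >= ADDR_COUNT)
		return -EINVAL;
	/* Atomically set the bit and learn whether it was already set. */
	old = atomic_fetch_or(&addr_in_use[addr / BITS_PER_WORD], mask);
	return (old & mask) ? -EBUSY : 0;
}

/* Clears the bit so the address can be instantiated again later. */
static void addr_unlock(unsigned int addr)
{
	unsigned long mask = 1UL << (addr % BITS_PER_WORD);

	if (addr < ADDR_COUNT)
		atomic_fetch_and(&addr_in_use[addr / BITS_PER_WORD], ~mask);
}
```

    Because each address has its own bit, nested or parallel calls for
    different addresses never block each other; only a second attempt at
    the same address sees -EBUSY.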
    
    Note: This patch version includes small cosmetic changes to the Tested-by
          version; the only functional change is that address locking is
          supported for slave addresses too.
    
    [0] https://lore.kernel.org/linux-i2c/9479fe4e-eb0c-407e-84c0-bd60c15baf74@ans.pl/T/#m12706546e8e2414d8f1a0dc61c53393f731685cc
    
    Fixes: caba40ec3531 ("eeprom: at24: Probe for DDR3 thermal sensor in the SPD case")
    Cc: stable@vger.kernel.org
    Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
    Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
    Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled [+ + +]
Author: Kimriver Liu <kimriver.liu@siengine.com>
Date:   Fri Sep 13 11:31:46 2024 +0800

    i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled
    
    commit 5d69d5a00f80488ddcb4dee7d1374a0709398178 upstream.
    
    It was observed that issuing the ABORT bit (IC_ENABLE[1]) will not
    work when IC_ENABLE is already disabled.
    
    Check if the ENABLE bit (IC_ENABLE[0]) is disabled when the controller
    is holding SCL low. If the ENABLE bit is disabled, the software needs
    to enable it before trying to issue the ABORT bit. Otherwise,
    the controller ignores any write to the ABORT bit.
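
    A minimal sketch of the enable-before-abort ordering, run against a
    toy register model rather than real hardware. struct fake_ctrl, the
    macro names and both helpers are hypothetical; the model simply assumes
    that an ABORT write only takes effect while ENABLE is already set.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAKE_IC_ENABLE_ENABLE	(1u << 0)	/* models IC_ENABLE[0] */
#define FAKE_IC_ENABLE_ABORT	(1u << 1)	/* models IC_ENABLE[1] */

/* Hypothetical controller model, not the DesignWare register layout. */
struct fake_ctrl { uint32_t ic_enable; bool aborted; };

/* Models a register write: ABORT is ignored unless ENABLE was already set. */
static void fake_ctrl_write(struct fake_ctrl *c, uint32_t val)
{
	if ((val & FAKE_IC_ENABLE_ABORT) &&
	    (c->ic_enable & FAKE_IC_ENABLE_ENABLE))
		c->aborted = true;
	c->ic_enable = val & FAKE_IC_ENABLE_ENABLE;
}

/* Enable the controller first if needed, then issue the abort. */
static void fake_ctrl_abort(struct fake_ctrl *c)
{
	if (!(c->ic_enable & FAKE_IC_ENABLE_ENABLE))
		fake_ctrl_write(c, c->ic_enable | FAKE_IC_ENABLE_ENABLE);
	fake_ctrl_write(c, c->ic_enable | FAKE_IC_ENABLE_ABORT);
}
```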
    
    These kernel logs show up whenever an I2C transaction is
    attempted after this failure.
    i2c_designware e95e0000.i2c: timeout waiting for bus ready
    i2c_designware e95e0000.i2c: timeout in disabling adapter
    
    The patch fixes the issue where the controller cannot be disabled
    while SCL is held low if the ENABLE bit is already disabled.
    
    Fixes: 2409205acd3c ("i2c: designware: fix __i2c_dw_disable() in case master is holding SCL low")
    Signed-off-by: Kimriver Liu <kimriver.liu@siengine.com>
    Cc: <stable@vger.kernel.org> # v6.6+
    Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
    Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
    Reviewed-by: Andy Shevchenko <andy@kernel.org>
    Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: qcom-geni: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]
Author: Jinjie Ruan <ruanjinjie@huawei.com>
Date:   Thu Sep 12 11:34:59 2024 +0800

    i2c: qcom-geni: Use IRQF_NO_AUTOEN flag in request_irq()
    
    commit e2c85d85a05f16af2223fcc0195ff50a7938b372 upstream.
    
    Calling disable_irq() after request_irq() still leaves a time gap in
    which interrupts can arrive. Passing the IRQF_NO_AUTOEN flag to
    request_irq() keeps the IRQ from being auto-enabled when it is requested.
    
    Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
    Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Cc: <stable@vger.kernel.org> # v4.19+
    Acked-by: Mukesh Kumar Savaliya <quic_msavaliy@quicinc.com>
    Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
    Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: stm32f7: Do not prepare/unprepare clock during runtime suspend/resume [+ + +]
Author: Marek Vasut <marex@denx.de>
Date:   Mon Sep 30 21:27:41 2024 +0200

    i2c: stm32f7: Do not prepare/unprepare clock during runtime suspend/resume
    
    commit 048bbbdbf85e5e00258dfb12f5e368f908801d7b upstream.
    
    In case there is any sort of clock controller attached to this I2C bus
    controller, for example Versaclock or even an AIC32x4 I2C codec, then
    an I2C transfer triggered from the clock controller clk_ops .prepare
    callback may trigger a deadlock on drivers/clk/clk.c prepare_lock mutex.
    
    This is because the clock controller first grabs the prepare_lock mutex
    and then performs the prepare operation, including its I2C access. The
    I2C access resumes this I2C bus controller via .runtime_resume callback,
    which calls clk_prepare_enable(), which attempts to grab the prepare_lock
    mutex again and deadlocks.
    
    Since the clock is already prepared in probe() and unprepared in
    remove(), use simple clk_enable()/clk_disable() calls to enable and
    disable the clock on runtime suspend and resume, to avoid hitting the
    prepare_lock mutex.
    
    Acked-by: Alain Volmat <alain.volmat@foss.st.com>
    Signed-off-by: Marek Vasut <marex@denx.de>
    Fixes: 4e7bca6fc07b ("i2c: i2c-stm32f7: add PM Runtime support")
    Cc: <stable@vger.kernel.org> # v5.0+
    Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: synquacer: Deal with optional PCLK correctly [+ + +]
Author: Ard Biesheuvel <ardb@kernel.org>
Date:   Thu Sep 12 12:46:31 2024 +0200

    i2c: synquacer: Deal with optional PCLK correctly
    
    commit f2990f8630531a99cad4dc5c44cb2a11ded42492 upstream.
    
    ACPI boot does not provide clocks and regulators, but instead, provides
    the PCLK rate directly, and enables the clock in firmware. So deal
    gracefully with this.
    
    Fixes: 55750148e559 ("i2c: synquacer: Fix an error handling path in synquacer_i2c_probe()")
    Cc: stable@vger.kernel.org # v6.10+
    Cc: Andi Shyti <andi.shyti@kernel.org>
    Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled [+ + +]
Author: Jinjie Ruan <ruanjinjie@huawei.com>
Date:   Mon Sep 23 11:42:50 2024 +0800

    i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled
    
    commit 0c8d604dea437b69a861479b413d629bc9b3da70 upstream.
    
    It is not valid to call pm_runtime_set_suspended() for devices
    with runtime PM enabled because it returns -EAGAIN if it is enabled
    already and working. So, call pm_runtime_disable() first to fix it.
    
    Fixes: 36ecbcab84d0 ("i2c: xiic: Implement power management")
    Cc: <stable@vger.kernel.org> # v4.6+
    Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: xiic: Wait for TX empty to avoid missed TX NAKs [+ + +]
Author: Robert Hancock <robert.hancock@calian.com>
Date:   Tue Nov 21 18:11:16 2023 +0000

    i2c: xiic: Wait for TX empty to avoid missed TX NAKs
    
    commit 521da1e9225450bd323db5fa5bca942b1dc485b7 upstream.
    
    Frequently an I2C write will be followed by a read, such as a register
    address write followed by a read of the register value. In this driver,
    when the TX FIFO half empty interrupt was raised and it was determined
    that there was enough space in the TX FIFO to send the following read
    command, it would do so without waiting for the TX FIFO to actually
    empty.
    
    Unfortunately it appears that in some cases this can result in a NAK
    that was raised by the target device on the write, such as due to an
    unsupported register address, being ignored and the subsequent read
    being done anyway. This can potentially put the I2C bus into an
    invalid state and/or result in invalid read data being processed.
    
    To avoid this, once a message has been fully written to the TX FIFO,
    wait for the TX FIFO empty interrupt before moving on to the next
    message, to ensure NAKs are handled properly.
    
    Fixes: e1d5b6598cdc ("i2c: Add support for Xilinx XPS IIC Bus Interface")
    Signed-off-by: Robert Hancock <robert.hancock@calian.com>
    Cc: <stable@vger.kernel.org> # v2.6.34+
    Reviewed-by: Manikanta Guntupalli <manikanta.guntupalli@amd.com>
    Acked-by: Michal Simek <michal.simek@amd.com>
    Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
i3c: master: svc: Fix use after free vulnerability in svc_i3c_master Driver Due to Race Condition [+ + +]
Author: Kaixin Wang <kxwang23@m.fudan.edu.cn>
Date:   Sun Sep 15 00:39:33 2024 +0800

    i3c: master: svc: Fix use after free vulnerability in svc_i3c_master Driver Due to Race Condition
    
    commit 61850725779709369c7e907ae8c7c75dc7cec4f3 upstream.
    
    In the svc_i3c_master_probe function, &master->hj_work is bound with
    svc_i3c_master_hj_work, &master->ibi_work is bound with
    svc_i3c_master_ibi_work. And svc_i3c_master_ibi_work can start the
    hj_work, svc_i3c_master_irq_handler can start the ibi_work.
    
    If we remove the module, svc_i3c_master_remove is called to clean up.
    It frees master->base through i3c_master_unregister while the work
    mentioned above may still be in use. The sequence of operations
    that may lead to a UAF bug is as follows:
    
    CPU0                                         CPU1
    
                                        | svc_i3c_master_hj_work
    svc_i3c_master_remove               |
    i3c_master_unregister(&master->base)|
    device_unregister(&master->dev)     |
    device_release                      |
    //free master->base                 |
                                        | i3c_master_do_daa(&master->base)
                                        | //use master->base
    
    Fix it by ensuring that the work is canceled before proceeding with the
    cleanup in svc_i3c_master_remove.
    
    Fixes: 0f74f8b6675c ("i3c: Make i3c_master_unregister() return void")
    Cc: stable@vger.kernel.org
    Signed-off-by: Kaixin Wang <kxwang23@m.fudan.edu.cn>
    Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Reviewed-by: Frank Li <Frank.Li@nxp.com>
    Link: https://lore.kernel.org/stable/20240914154030.180-1-kxwang23%40m.fudan.edu.cn
    Link: https://lore.kernel.org/r/20240914163932.253-1-kxwang23@m.fudan.edu.cn
    Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
ice: Adjust over allocation of memory in ice_sched_add_root_node() and ice_sched_add_node() [+ + +]
Author: Aleksandr Mishin <amishin@t-argos.ru>
Date:   Wed Jul 10 15:39:49 2024 +0300

    ice: Adjust over allocation of memory in ice_sched_add_root_node() and ice_sched_add_node()
    
    [ Upstream commit 62fdaf9e8056e9a9e6fe63aa9c816ec2122d60c6 ]
    
    In ice_sched_add_root_node() and ice_sched_add_node() there are calls to
    devm_kcalloc() in order to allocate memory for array of pointers to
    'ice_sched_node' structure. But incorrect types are used as sizeof()
    arguments in these calls (structures instead of pointers) which leads to
    over allocation of memory.
    
    Adjust over allocation of memory by correcting types in devm_kcalloc()
    sizeof() arguments.
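
    A minimal userspace sketch of the sizing mistake described above, with
    calloc() standing in for devm_kcalloc(). struct node and the helper
    names are hypothetical; only the size of the structure relative to a
    pointer matters here.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type, deliberately larger than a pointer. */
struct node { int id; char payload[60]; };

/* Bytes requested when sizeof() is given the structure by mistake. */
static size_t bytes_wrong(size_t n) { return n * sizeof(struct node); }

/* Bytes actually needed for an array of n pointers. */
static size_t bytes_right(size_t n) { return n * sizeof(struct node *); }

/* Allocates a zeroed array of n node pointers, sized from the element. */
static struct node **alloc_children(size_t n)
{
	struct node **children = calloc(n, sizeof(*children));

	return children;
}
```

    Writing sizeof(*children) ties the element size to the array's own
    type, so it cannot drift to the size of the pointed-to structure.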
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ieee802154: Fix build error [+ + +]
Author: Jinjie Ruan <ruanjinjie@huawei.com>
Date:   Mon Sep 9 21:17:40 2024 +0800

    ieee802154: Fix build error
    
    [ Upstream commit addf89774e48c992316449ffab4f29c2309ebefb ]
    
    If REGMAP_SPI is m and IEEE802154_MCR20A is y,
    
            mcr20a.c:(.text+0x3ed6c5b): undefined reference to `__devm_regmap_init_spi'
            ld: mcr20a.c:(.text+0x3ed6cb5): undefined reference to `__devm_regmap_init_spi'
    
    Select REGMAP_SPI for IEEE802154_MCR20A to fix it.
    
    Fixes: 8c6ad9cc5157 ("ieee802154: Add NXP MCR20A IEEE 802.15.4 transceiver driver")
    Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Link: https://lore.kernel.org/20240909131740.1296608-1-ruanjinjie@huawei.com
    Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
iio: magnetometer: ak8975: Fix reading for ak099xx sensors [+ + +]
Author: Barnabás Czémán <barnabas.czeman@mainlining.org>
Date:   Mon Aug 19 00:29:40 2024 +0200

    iio: magnetometer: ak8975: Fix reading for ak099xx sensors
    
    commit 129464e86c7445a858b790ac2d28d35f58256bbe upstream.
    
    Move ST2 reading with overflow handling after measurement data
    reading.
    The ST2 register has to be read after the measurement data is read,
    because that read marks the end of the reading and releases the lock
    on the data.
    Remove the ST2 read skip on interrupt based waiting because ST2 is
    required to be read out at the end of the axis read.
    
    Fixes: 57e73a423b1e ("iio: ak8975: add ak09911 and ak09912 support")
    Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org>
    Link: https://patch.msgid.link/20240819-ak09918-v4-2-f0734d14cfb9@mainlining.org
    Cc: <Stable@vger.kernel.org>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: pressure: bmp280: Fix regmap for BMP280 device [+ + +]
Author: Vasileios Amoiridis <vassilisamir@gmail.com>
Date:   Thu Jul 11 23:15:49 2024 +0200

    iio: pressure: bmp280: Fix regmap for BMP280 device
    
    commit b9065b0250e1705935445ede0a18c1850afe7b75 upstream.
    
    Up to now, the BMP280 device is using the regmap of the BME280 which
    has registers that exist only in the BME280 device.
    
    Fixes: 14e8015f8569 ("iio: pressure: bmp280: split driver in logical parts")
    Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com>
    Link: https://patch.msgid.link/20240711211558.106327-2-vassilisamir@gmail.com
    Cc: <Stable@vger.kernel.org>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: pressure: bmp280: Fix waiting time for BMP3xx configuration [+ + +]
Author: Vasileios Amoiridis <vassilisamir@gmail.com>
Date:   Thu Jul 11 23:15:50 2024 +0200

    iio: pressure: bmp280: Fix waiting time for BMP3xx configuration
    
    commit 262a6634bcc4f0c1c53d13aa89882909f281a6aa upstream.
    
    According to the datasheet, both pressure and temperature can go up to
    oversampling x32. With this option, the maximum measurement time is not
    80ms (this is for press x32 and temp x2), but it is 130ms nominal
    (calculated from table 3.9.2) and since most of the maximum values
    are around +15%, it is configured to 150ms.
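
    The arithmetic above can be sketched as follows: 130 ms nominal plus a
    15% margin is 149.5 ms, which rounds up to 150 ms. The helper name
    with_margin_ms() is hypothetical and is not part of the bmp280 driver.

```c
/* Nominal time plus a percentage margin, rounded up to whole milliseconds. */
static unsigned int with_margin_ms(unsigned int nominal_ms,
				   unsigned int margin_pct)
{
	return (nominal_ms * (100 + margin_pct) + 99) / 100;
}
```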
    
    Fixes: 8d329309184d ("iio: pressure: bmp280: Add support for BMP380 sensor family")
    Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com>
    Link: https://patch.msgid.link/20240711211558.106327-3-vassilisamir@gmail.com
    Cc: <Stable@vger.kernel.org>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Input: adp5589-keys - fix adp5589_gpio_get_value() [+ + +]
Author: Nuno Sa <nuno.sa@analog.com>
Date:   Tue Oct 1 07:47:23 2024 -0700

    Input: adp5589-keys - fix adp5589_gpio_get_value()
    
    commit c684771630e64bc39bddffeb65dd8a6612a6b249 upstream.
    
    The adp5589 seems to have the same behavior as similar devices as
    explained in commit 910a9f5636f5 ("Input: adp5588-keys - get value from
    data out when dir is out").
    
    Basically, when the gpio is set as output we need to get the value from
    ADP5589_GPO_DATA_OUT_A register instead of ADP5589_GPI_STATUS_A.
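
    A minimal sketch of the direction-dependent read described above, using
    an in-memory register image instead of I2C accesses. struct gpio_bank
    and gpio_get_value() are hypothetical names and the field layout is an
    assumption, not the ADP5589 register map.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register image for one 8-pin GPIO bank. */
struct gpio_bank {
	uint8_t dir;		/* bit set: pin is configured as output */
	uint8_t data_out;	/* value driven on output pins */
	uint8_t status;		/* value sampled on input pins */
};

/* Reads the data-out image for outputs and the status image for inputs. */
static bool gpio_get_value(const struct gpio_bank *bank, unsigned int pin)
{
	uint8_t bit = (uint8_t)(1u << (pin & 7));
	uint8_t reg = (bank->dir & bit) ? bank->data_out : bank->status;

	return (reg & bit) != 0;
}
```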
    
    Fixes: 9d2e173644bb ("Input: ADP5589 - new driver for I2C Keypad Decoder and I/O Expander")
    Signed-off-by: Nuno Sa <nuno.sa@analog.com>
    Link: https://lore.kernel.org/r/20241001-b4-dev-adp5589-fw-conversion-v1-2-fca0149dfc47@analog.com
    Cc: stable@vger.kernel.org
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Input: adp5589-keys - fix NULL pointer dereference [+ + +]
Author: Nuno Sa <nuno.sa@analog.com>
Date:   Tue Oct 1 07:46:44 2024 -0700

    Input: adp5589-keys - fix NULL pointer dereference
    
    commit fb5cc65f973661241e4a2b7390b429aa7b330c69 upstream.
    
    We register a devm action to call adp5589_clear_config() and then pass
    the i2c client as argument so that we can call i2c_get_clientdata() in
    order to get our device object. However, i2c_set_clientdata() is only
    called at the end of the probe function, which means that we'll get a
    NULL pointer dereference in case the probe function fails early.
    
    Fixes: 30df385e35a4 ("Input: adp5589-keys - use devm_add_action_or_reset() for register clear")
    Signed-off-by: Nuno Sa <nuno.sa@analog.com>
    Link: https://lore.kernel.org/r/20241001-b4-dev-adp5589-fw-conversion-v1-1-fca0149dfc47@analog.com
    Cc: stable@vger.kernel.org
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
intel_idle: Disable promotion to C1E on Jasper Lake and Elkhart Lake [+ + +]
Author: Kai-Heng Feng <kai.heng.feng@canonical.com>
Date:   Tue Aug 20 12:11:28 2024 +0800

    intel_idle: Disable promotion to C1E on Jasper Lake and Elkhart Lake
    
    [ Upstream commit 5bb33212b5c664396e5de4cd5a2999abb84a3978 ]
    
    PCIe ethernet throughput is sub-optimal on Jasper Lake and Elkhart Lake.
    
    The CPU can take a long time to exit to C0 to handle an IRQ and perform DMA
    when C1E has been entered.
    
    For this reason, adjust intel_idle to disable promotion to C1E and still
    use C-states from ACPI _CST on those two platforms.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219023
    Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
    Link: https://patch.msgid.link/20240820041128.102452-1-kai.heng.feng@canonical.com
    [ rjw: Subject and changelog edits ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
io_uring/net: harden multishot termination case for recv [+ + +]
Author: Jens Axboe <axboe@kernel.dk>
Date:   Thu Sep 26 07:08:10 2024 -0600

    io_uring/net: harden multishot termination case for recv
    
    commit c314094cb4cfa6fc5a17f4881ead2dfebfa717a7 upstream.
    
    If the recv returns zero, or an error, then it doesn't matter if more
    data has already been received for this buffer. A condition like that
    should terminate the multishot receive. Rather than pass in the
    collected return value, pass in whether to terminate or keep the recv
    going separately.
    
    Note that this isn't a bug right now, as the only way to get there is
    via setting MSG_WAITALL with multishot receive. And if an application
    does that, then -EINVAL is returned anyway. But it seems like an easy
    bug to introduce, so let's make it a bit more explicit.
    
    Link: https://github.com/axboe/liburing/issues/1246
    Cc: stable@vger.kernel.org
    Fixes: b3fdea6ecb55 ("io_uring: multishot recv")
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
io_uring: fix memory leak when cache init fail [+ + +]
Author: Guixin Liu <kanie@linux.alibaba.com>
Date:   Mon Sep 23 18:05:12 2024 +0800

    io_uring: fix memory leak when cache init fail
    
    [ Upstream commit 3a87e264290d71ec86a210ab3e8d23b715ad266d ]
    
    Exit the percpu ref when cache init fails to free the data memory
    within struct percpu_ref.
    
    Fixes: 206aefde4f88 ("io_uring: reduce/pack size of io_ring_ctx")
    Signed-off-by: Guixin Liu <kanie@linux.alibaba.com>
    Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
    Link: https://lore.kernel.org/r/20240923100512.64638-1-kanie@linux.alibaba.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
iomap: constrain the file range passed to iomap_file_unshare [+ + +]
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Wed Oct 2 08:02:13 2024 -0700

    iomap: constrain the file range passed to iomap_file_unshare
    
    [ Upstream commit a311a08a4237241fb5b9d219d3e33346de6e83e0 ]
    
    File contents can only be shared (i.e. reflinked) below EOF, so it makes
    no sense to try to unshare ranges beyond EOF.  Constrain the file range
    parameters here so that we don't have to do that in the callers.
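    
    The range constraint can be sketched in user space as follows. This is
    an illustration under assumed names and types, not the kernel
    implementation.

```c
#include <stdint.h>

/* Hypothetical helper: clamp the byte range [pos, pos + len) to a file of
 * `size` bytes. Returns the number of bytes left to process, or 0 if the
 * range is empty, starts at or beyond EOF, or has a negative offset. */
int64_t clamp_len_to_eof(int64_t pos, int64_t len, int64_t size)
{
	if (pos < 0 || pos >= size || len <= 0)
		return 0;
	return len < size - pos ? len : size - pos;
}
```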
    
    Fixes: 5f4e5752a8a3 ("fs: add iomap_file_dirty")
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Link: https://lore.kernel.org/r/20241002150213.GC21853@frogsfrogsfrogs
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Brian Foster <bfoster@redhat.com>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iomap: handle a post-direct I/O invalidate race in iomap_write_delalloc_release [+ + +]
Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Sep 10 07:39:03 2024 +0300

    iomap: handle a post-direct I/O invalidate race in iomap_write_delalloc_release
    
    [ Upstream commit 7a9d43eace888a0ee6095035997bb138425844d3 ]
    
    When a direct I/O completion invalidates the page cache it holds neither the
    i_rwsem nor the invalidate_lock so it can be racing with
    iomap_write_delalloc_release.  If the search for the end of the region that
    contains data returns the start offset we hit such a race and just need to
    look for the end of the newly created hole instead.
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Link: https://lore.kernel.org/r/20240910043949.3481298-2-hch@lst.de
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
iommu/arm-smmu-v3: Do not use devm for the cd table allocations [+ + +]
Author: Jason Gunthorpe <jgg@ziepe.ca>
Date:   Fri Sep 6 12:47:52 2024 -0300

    iommu/arm-smmu-v3: Do not use devm for the cd table allocations
    
    [ Upstream commit 47b2de35cab2b683f69d03515c2658c2d8515323 ]
    
    The master->cd_table is entirely contained within the struct
    arm_smmu_master which is guaranteed to be freed by the core code under
    arm_smmu_release_device().
    
    There is no reason to use devm here, arm_smmu_free_cd_tables() is reliably
    called to free the CD related memory. Remove it and save some memory.
    
    Tested-by: Nicolin Chen <nicolinc@nvidia.com>
    Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Link: https://lore.kernel.org/r/5-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iommu/arm-smmu-v3: Match Stall behaviour for S2 [+ + +]
Author: Mostafa Saleh <smostafa@google.com>
Date:   Fri Aug 30 11:03:47 2024 +0000

    iommu/arm-smmu-v3: Match Stall behaviour for S2
    
    [ Upstream commit ce7cb08e22e09f43649b025c849a3ae3b80833c4 ]
    
    According to the spec (ARM IHI 0070 F.b), in
    "5.5 Fault configuration (A, R, S bits)":
        A STE with stage 2 translation enabled and STE.S2S == 0 is
        considered ILLEGAL if SMMU_IDR0.STALL_MODEL == 0b10.
    
    Also described in the pseudocode “SteIllegal()”
        if STE.Config == '11x' then
            [..]
            if eff_idr0_stall_model == '10' && STE.S2S == '0' then
                // stall_model forcing stall, but S2S == 0
                return TRUE;
    
    Which means, S2S must be set when stall model is
    "ARM_SMMU_FEAT_STALL_FORCE", but currently the driver ignores that.
    
    Although the driver could do the minimum and only set S2S for
    “ARM_SMMU_FEAT_STALL_FORCE”, it is more consistent to match S1
    behaviour, which also sets it for “ARM_SMMU_FEAT_STALL” if the
    master has requested stalls.
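    
    The rule and the chosen policy can be modelled with two predicates.
    This is a sketch only: the function names and boolean parameters are
    assumptions, and the first predicate covers just the one condition
    quoted from the pseudocode above.

```c
#include <stdbool.h>
#include <stdint.h>

#define STALL_MODEL_FORCE 0x2u /* SMMU_IDR0.STALL_MODEL == 0b10 */

/* Mirrors only the quoted rule: a stage-2 STE is illegal when the stall
 * model forces stalling but S2S is clear. */
bool ste_s2_stall_illegal(bool s2_enabled, uint8_t stall_model, bool s2s)
{
	return s2_enabled && stall_model == STALL_MODEL_FORCE && !s2s;
}

/* Policy described above: set S2S when stalls are forced, or when stalls
 * are supported and the master has requested them. */
bool want_s2s(bool feat_stall_force, bool feat_stall, bool master_wants_stall)
{
	return feat_stall_force || (feat_stall && master_wants_stall);
}
```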
    
    Also, since S2 stalls are enabled now, report them to the IOMMU layer
    and for VFIO devices it will fail anyway as VFIO doesn’t register an
    iopf handler.
    
    Signed-off-by: Mostafa Saleh <smostafa@google.com>
    Link: https://lore.kernel.org/r/20240830110349.797399-2-smostafa@google.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
iommu/vt-d: Always reserve a domain ID for identity setup [+ + +]
Author: Lu Baolu <baolu.lu@linux.intel.com>
Date:   Mon Sep 2 10:27:13 2024 +0800

    iommu/vt-d: Always reserve a domain ID for identity setup
    
    [ Upstream commit 2c13012e09190174614fd6901857a1b8c199e17d ]
    
    We will use a global static identity domain. Reserve a static domain ID
    for it.
    
    Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Reviewed-by: Kevin Tian <kevin.tian@intel.com>
    Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
    Link: https://lore.kernel.org/r/20240809055431.36513-4-baolu.lu@linux.intel.com
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iommu/vt-d: Fix potential lockup if qi_submit_sync called with 0 count [+ + +]
Author: Sanjay K Kumar <sanjay.k.kumar@intel.com>
Date:   Mon Sep 2 10:27:18 2024 +0800

    iommu/vt-d: Fix potential lockup if qi_submit_sync called with 0 count
    
    [ Upstream commit 3cf74230c139f208b7fb313ae0054386eee31a81 ]
    
    If qi_submit_sync() is invoked with 0 invalidation descriptors (for
    instance, for DMA draining purposes), we can run into a bug where a
    submitting thread fails to detect the completion of invalidation_wait.
    This subsequently leads to a soft lockup. Currently, there is no impact
    by this bug on the existing users because no callers are submitting
    invalidations with 0 descriptors. This fix will enable future users
    (such as DMA drain) calling qi_submit_sync() with 0 count.
    
    Suppose thread T1 invokes qi_submit_sync() with non-zero descriptors, while
    concurrently, thread T2 calls qi_submit_sync() with zero descriptors. Both
    threads then enter a while loop, waiting for their respective descriptors
    to complete. T1 detects its completion (i.e., T1's invalidation_wait status
    changes to QI_DONE by HW) and proceeds to call reclaim_free_desc() to
    reclaim all descriptors, potentially including adjacent ones of other
    threads that are also marked as QI_DONE.
    
    During this time, while T2 is waiting to acquire the qi->q_lock, the IOMMU
    hardware may complete the invalidation for T2, setting its status to
    QI_DONE. However, if T1's execution of reclaim_free_desc() frees T2's
    invalidation_wait descriptor and changes its status to QI_FREE, T2 will
    not observe the QI_DONE status for its invalidation_wait and will
    indefinitely remain stuck.
    
    This soft lockup does not occur when only non-zero descriptors are
    submitted. In such cases, invalidation descriptors are interspersed among
    wait descriptors with the status QI_IN_USE, acting as barriers. These
    barriers prevent the reclaim code from mistakenly freeing descriptors
    belonging to other submitters.
    
    Consider the following example timeline:
            T1                      T2
    ========================================
            ID1
            WD1
            while(WD1!=QI_DONE)
            unlock
                                    lock
            WD1=QI_DONE*            WD2
                                    while(WD2!=QI_DONE)
                                    unlock
            lock
            WD1==QI_DONE?
            ID1=QI_DONE             WD2=DONE*
            reclaim()
            ID1=FREE
            WD1=FREE
            WD2=FREE
            unlock
                                    soft lockup! T2 never sees QI_DONE in WD2
    
    Where:
    ID = invalidation descriptor
    WD = wait descriptor
    * Written by hardware
    
    The root of the problem is that the descriptor status QI_DONE flag is used
    for two conflicting purposes:
    1. signal a descriptor is ready for reclaim (to be freed)
    2. signal by the hardware that a wait descriptor is complete
    
    The solution (in this patch) is state separation by using QI_FREE flag
    for #1.
    
    Once a thread's invalidation descriptors are complete, their status would
    be set to QI_FREE. The reclaim_free_desc() function would then only
    free descriptors marked as QI_FREE instead of those marked as
    QI_DONE. This change ensures that T2 (from the previous example) will
    correctly observe the completion of its invalidation_wait (marked as
    QI_DONE).
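    
    The effect of the state separation can be illustrated with a toy ring
    model. It is a simplified sketch: the ring layout, field names and the
    sketch_reclaim() helper are assumptions, and locking and hardware
    writes are left out.

```c
enum qi_state { QI_FREE, QI_IN_USE, QI_DONE };

#define RING_LEN 8

struct ring {
	enum qi_state status[RING_LEN];
	int free_head; /* next slot to allocate */
	int free_tail; /* oldest slot not yet reclaimed */
};

/* Reclaim from the tail while slots carry the given "reclaimable" state.
 * Passing QI_DONE models the old behaviour, QI_FREE the fixed one.
 * Returns the number of slots reclaimed. */
int sketch_reclaim(struct ring *r, enum qi_state reclaimable)
{
	int n = 0;

	while (r->free_tail != r->free_head &&
	       r->status[r->free_tail] == reclaimable) {
		r->status[r->free_tail] = QI_FREE;
		r->free_tail = (r->free_tail + 1) % RING_LEN;
		n++;
	}
	return n;
}
```

    With slots ID1, WD1, WD2 all at QI_DONE, reclaiming on QI_DONE also
    frees WD2, which T2 is still polling. If T1 instead marks only its own
    slots QI_FREE and reclaim keys on QI_FREE, WD2 stays at QI_DONE.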
    
    Signed-off-by: Sanjay K Kumar <sanjay.k.kumar@intel.com>
    Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
    Reviewed-by: Kevin Tian <kevin.tian@intel.com>
    Link: https://lore.kernel.org/r/20240728210059.1964602-1-jacob.jun.pan@linux.intel.com
    Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iommu/vt-d: Unconditionally flush device TLB for pasid table updates [+ + +]
Author: Lu Baolu <baolu.lu@linux.intel.com>
Date:   Mon Sep 2 10:27:20 2024 +0800

    iommu/vt-d: Unconditionally flush device TLB for pasid table updates
    
    [ Upstream commit 1f5e307ca16c0c19186cbd56ac460a687e6daba0 ]
    
    The caching mode of an IOMMU is irrelevant to the behavior of the device
    TLB. Previously, commit <304b3bde24b5> ("iommu/vt-d: Remove caching mode
    check before device TLB flush") removed this redundant check in the
    domain unmap path.
    
    Checking the caching mode before flushing the device TLB after a pasid
    table entry is updated is unnecessary and can lead to inconsistent
    behavior.
    
    Extend this consistency by removing the caching mode check in the pasid
    table update path.
    
    Suggested-by: Yi Liu <yi.l.liu@intel.com>
    Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
    Link: https://lore.kernel.org/r/20240820030208.20020-1-baolu.lu@linux.intel.com
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ipv4: Check !in_dev earlier for ioctl(SIOCSIFADDR). [+ + +]
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Fri Aug 9 16:54:02 2024 -0700

    ipv4: Check !in_dev earlier for ioctl(SIOCSIFADDR).
    
    [ Upstream commit e3af3d3c5b26c33a7950e34e137584f6056c4319 ]
    
    dev->ip_ptr could be NULL if we set an invalid MTU.
    
    Even then, if we issue ioctl(SIOCSIFADDR) for a new IPv4 address,
    devinet_ioctl() allocates struct in_ifaddr and fails later in
    inet_set_ifa() because in_dev is NULL.
    
    Let's move the check earlier.
    
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Link: https://patch.msgid.link/20240809235406.50187-2-kuniyu@amazon.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ipv4: ip_gre: Fix drops of small packets in ipgre_xmit [+ + +]
Author: Anton Danilov <littlesmilingcloud@gmail.com>
Date:   Wed Sep 25 02:51:59 2024 +0300

    ipv4: ip_gre: Fix drops of small packets in ipgre_xmit
    
    [ Upstream commit c4a14f6d9d17ad1e41a36182dd3b8a5fd91efbd7 ]
    
    Regression Description:
    
    Depending on the options specified for the GRE tunnel device, small
    packets may be dropped. This occurs because the pskb_network_may_pull
    function fails due to the packet's insufficient length.
    
    For example, if only the okey option is specified for the tunnel device,
    original (before encapsulation) packets smaller than 28 bytes (including
    the IPv4 header) will be dropped. This happens because the required
    length is calculated relative to the network header, not the skb->head.
    
    Here is how the required length is computed and checked:
    
    * The pull_len variable is set to 28 bytes, consisting of:
      * IPv4 header: 20 bytes
      * GRE header with Key field: 8 bytes
    
    * The pskb_network_may_pull function adds the network offset, shifting
    the checkable space further to the beginning of the network header and
    extending it to the beginning of the packet. As a result, the end of
    the checkable space occurs beyond the actual end of the packet.
    
    Instead of ensuring that 28 bytes are present in skb->head, the function
    is requesting these 28 bytes starting from the network header. For small
    packets, this requested length exceeds the actual packet size, causing
    the check to fail and the packets to be dropped.
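    
    The length arithmetic can be sketched with a stand-in for the skb. The
    struct and helper names are hypothetical, and the model assumes the
    data pointer sits at the 28-byte tunnel header with the network header
    right after it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the parts of an skb this check cares about. */
struct fake_skb {
	size_t len;        /* bytes available starting at the data pointer */
	size_t net_offset; /* network header offset from the data pointer */
};

/* Models pskb_may_pull(): are `need` bytes present from the data pointer? */
bool fake_may_pull(const struct fake_skb *skb, size_t need)
{
	return need <= skb->len;
}

/* Models pskb_network_may_pull(): same check, shifted by the network
 * header offset. */
bool fake_network_may_pull(const struct fake_skb *skb, size_t need)
{
	return fake_may_pull(skb, skb->net_offset + need);
}
```

    For a 21-byte original packet behind a 28-byte tunnel header (49 bytes
    in total), the shifted check asks for 56 bytes and fails, while the
    plain check asks for 28 and passes.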
    
    This issue affects both locally originated and forwarded packets in
    DMVPN-like setups.
    
    How to reproduce (for local originated packets):
    
      ip link add dev gre1 type gre ikey 1.9.8.4 okey 1.9.8.4 \
              local <your-ip> remote 0.0.0.0
    
      ip link set mtu 1400 dev gre1
      ip link set up dev gre1
      ip address add 192.168.13.1/24 dev gre1
      ip neighbor add 192.168.13.2 lladdr <remote-ip> dev gre1
      ping -s 1374 -c 10 192.168.13.2
      tcpdump -vni gre1
      tcpdump -vni <your-ext-iface> 'ip proto 47'
      ip -s -s -d link show dev gre1
    
    Solution:
    
    Use the pskb_may_pull function instead of pskb_network_may_pull.
    
    Fixes: 80d875cfc9d3 ("ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit()")
    Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/20240924235158.106062-1-littlesmilingcloud@gmail.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ipv4: Mask upper DSCP bits and ECN bits in NETLINK_FIB_LOOKUP family [+ + +]
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Wed Aug 14 15:52:22 2024 +0300

    ipv4: Mask upper DSCP bits and ECN bits in NETLINK_FIB_LOOKUP family
    
    [ Upstream commit 8fed54758cd248cd311a2b5c1e180abef1866237 ]
    
    The NETLINK_FIB_LOOKUP netlink family can be used to perform a FIB
    lookup according to user provided parameters and communicate the result
    back to user space.
    
    However, unlike other users of the FIB lookup API, the upper DSCP bits
    and the ECN bits of the DS field are not masked, which can result in the
    wrong result being returned.
    
    Solve this by masking the upper DSCP bits and the ECN bits using
    IPTOS_RT_MASK.
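    
    As an illustration of the masking, the sketch below restates the mask
    values locally so that it is self-contained. The SKETCH_ names are
    hypothetical and the numeric values should be treated as assumptions
    to check against the headers in use.

```c
#include <stdint.h>

/* Assumed values: the TOS mask, and the same mask with the two ECN bits
 * cleared. */
#define SKETCH_IPTOS_TOS_MASK 0x1E
#define SKETCH_IPTOS_RT_MASK  (SKETCH_IPTOS_TOS_MASK & ~3)

/* Drop the upper DSCP bits and the ECN bits from a DS field value. */
uint8_t sketch_mask_tos(uint8_t ds_field)
{
	return (uint8_t)(ds_field & SKETCH_IPTOS_RT_MASK);
}
```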
    
    The structure that communicates the request and the response is not
    exported to user space, so it is unlikely that this netlink family is
    actually in use [1].
    
    [1] https://lore.kernel.org/netdev/ZpqpB8vJU%2FQ6LSqa@debian/
    
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: Guillaume Nault <gnault@redhat.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
jbd2: correctly compare tids with tid_geq function in jbd2_fc_begin_commit [+ + +]
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Thu Aug 1 09:38:08 2024 +0800

    jbd2: correctly compare tids with tid_geq function in jbd2_fc_begin_commit
    
    commit f0e3c14802515f60a47e6ef347ea59c2733402aa upstream.
    
    Use tid_geq to compare tids to work over sequence number wraps.
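    
    A wrap-aware comparison of this kind can be sketched as below. The
    sketch_ names are hypothetical, and the code assumes 32-bit sequence
    numbers and two's-complement conversion.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t sketch_tid_t;

/* Wrap-aware "x >= y": interpret the unsigned difference as signed. */
bool sketch_tid_geq(sketch_tid_t x, sketch_tid_t y)
{
	int32_t difference = (int32_t)(x - y);

	return difference >= 0;
}
```

    A plain "x >= y" treats tid 1 as older than tid 0xFFFFFFFF, whereas
    the signed difference treats it as two steps newer.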
    
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
    Cc: stable@kernel.org
    Link: https://patch.msgid.link/20240801013815.2393869-2-shikemeng@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error [+ + +]
Author: Baokun Li <libaokun1@huawei.com>
Date:   Thu Jul 18 19:53:36 2024 +0800

    jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error
    
    commit f5cacdc6f2bb2a9bf214469dd7112b43dd2dd68a upstream.
    
    In __jbd2_log_wait_for_space(), we might call jbd2_cleanup_journal_tail()
    to recover some journal space. But if an error occurs while executing
    jbd2_cleanup_journal_tail() (e.g., an EIO), we don't stop waiting for free
    space right away, we try other branches, and if j_committing_transaction
    is NULL (i.e., the tid is 0), we will get the following complaint:
    
    ============================================
    JBD2: I/O error when updating journal superblock for sdd-8.
    __jbd2_log_wait_for_space: needed 256 blocks and only had 217 space available
    __jbd2_log_wait_for_space: no way to get more journal space in sdd-8
    ------------[ cut here ]------------
    WARNING: CPU: 2 PID: 139804 at fs/jbd2/checkpoint.c:109 __jbd2_log_wait_for_space+0x251/0x2e0
    Modules linked in:
    CPU: 2 PID: 139804 Comm: kworker/u8:3 Not tainted 6.6.0+ #1
    RIP: 0010:__jbd2_log_wait_for_space+0x251/0x2e0
    Call Trace:
     <TASK>
     add_transaction_credits+0x5d1/0x5e0
     start_this_handle+0x1ef/0x6a0
     jbd2__journal_start+0x18b/0x340
     ext4_dirty_inode+0x5d/0xb0
     __mark_inode_dirty+0xe4/0x5d0
     generic_update_time+0x60/0x70
    [...]
    ============================================
    
    So only if jbd2_cleanup_journal_tail() returns 1, i.e., there is nothing to
    clean up at the moment, continue to try to reclaim free space in other ways.
    
    Note that this fix relies on commit 6f6a6fda2945 ("jbd2: fix ocfs2 corrupt
    when updating journal superblock fails") to make jbd2_cleanup_journal_tail
    return the correct error code.
    
    Fixes: 8c3f25d8950c ("jbd2: don't give up looking for space so easily in __jbd2_log_wait_for_space")
    Cc: stable@kernel.org
    Signed-off-by: Baokun Li <libaokun1@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://patch.msgid.link/20240718115336.2554501-1-libaokun@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
jfs: check if leafidx greater than num leaves per dmap tree [+ + +]
Author: Edward Adam Davis <eadavis@qq.com>
Date:   Sat Aug 24 09:25:23 2024 +0800

    jfs: check if leafidx greater than num leaves per dmap tree
    
    [ Upstream commit d64ff0d2306713ff084d4b09f84ed1a8c75ecc32 ]
    
    syzbot reported an out-of-bounds access in dbSplit. It occurs because
    dmt_leafidx is greater than the number of leaves per dmap tree. Add a
    check for dmt_leafidx in dbFindLeaf.
    
    Shaggy:
    Modified sanity check to apply to control pages as well as leaf pages.
    
    Reported-and-tested-by: syzbot+dca05492eff41f604890@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=dca05492eff41f604890
    Signed-off-by: Edward Adam Davis <eadavis@qq.com>
    Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

jfs: Fix uaf in dbFreeBits [+ + +]
Author: Edward Adam Davis <eadavis@qq.com>
Date:   Sat Aug 24 10:50:48 2024 +0800

    jfs: Fix uaf in dbFreeBits
    
    [ Upstream commit d6c1b3599b2feb5c7291f5ac3a36e5fa7cedb234 ]
    
    [syzbot reported]
    ==================================================================
    BUG: KASAN: slab-use-after-free in __mutex_lock_common kernel/locking/mutex.c:587 [inline]
    BUG: KASAN: slab-use-after-free in __mutex_lock+0xfe/0xd70 kernel/locking/mutex.c:752
    Read of size 8 at addr ffff8880229254b0 by task syz-executor357/5216
    
    CPU: 0 UID: 0 PID: 5216 Comm: syz-executor357 Not tainted 6.11.0-rc3-syzkaller-00156-gd7a5aa4b3c00 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:93 [inline]
     dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
     print_address_description mm/kasan/report.c:377 [inline]
     print_report+0x169/0x550 mm/kasan/report.c:488
     kasan_report+0x143/0x180 mm/kasan/report.c:601
     __mutex_lock_common kernel/locking/mutex.c:587 [inline]
     __mutex_lock+0xfe/0xd70 kernel/locking/mutex.c:752
     dbFreeBits+0x7ea/0xd90 fs/jfs/jfs_dmap.c:2390
     dbFreeDmap fs/jfs/jfs_dmap.c:2089 [inline]
     dbFree+0x35b/0x680 fs/jfs/jfs_dmap.c:409
     dbDiscardAG+0x8a9/0xa20 fs/jfs/jfs_dmap.c:1650
     jfs_ioc_trim+0x433/0x670 fs/jfs/jfs_discard.c:100
     jfs_ioctl+0x2d0/0x3e0 fs/jfs/ioctl.c:131
     vfs_ioctl fs/ioctl.c:51 [inline]
     __do_sys_ioctl fs/ioctl.c:907 [inline]
     __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
    
    Freed by task 5218:
     kasan_save_stack mm/kasan/common.c:47 [inline]
     kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
     kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
     poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
     __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
     kasan_slab_free include/linux/kasan.h:184 [inline]
     slab_free_hook mm/slub.c:2252 [inline]
     slab_free mm/slub.c:4473 [inline]
     kfree+0x149/0x360 mm/slub.c:4594
     dbUnmount+0x11d/0x190 fs/jfs/jfs_dmap.c:278
     jfs_mount_rw+0x4ac/0x6a0 fs/jfs/jfs_mount.c:247
     jfs_remount+0x3d1/0x6b0 fs/jfs/super.c:454
     reconfigure_super+0x445/0x880 fs/super.c:1083
     vfs_cmd_reconfigure fs/fsopen.c:263 [inline]
     vfs_fsconfig_locked fs/fsopen.c:292 [inline]
     __do_sys_fsconfig fs/fsopen.c:473 [inline]
     __se_sys_fsconfig+0xb6e/0xf80 fs/fsopen.c:345
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    [Analysis]
    There are two paths (dbUnmount and jfs_ioc_trim) that race when
    accessing bmap, which leads to a use-after-free.
    
    Use the s_umount lock to synchronize them and avoid the use-after-free
    caused by the race.
    
    Reported-and-tested-by: syzbot+3c010e21296f33a5dc16@syzkaller.appspotmail.com
    Signed-off-by: Edward Adam Davis <eadavis@qq.com>
    Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

jfs: Fix uninit-value access of new_ea in ea_buffer [+ + +]
Author: Zhao Mengmeng <zhaomengmeng@kylinos.cn>
Date:   Wed Sep 4 09:07:58 2024 +0800

    jfs: Fix uninit-value access of new_ea in ea_buffer
    
    [ Upstream commit 2b59ffad47db1c46af25ccad157bb3b25147c35c ]
    
    syzbot reports that lzo1x_1_do_compress is using uninit-value:
    
    =====================================================
    BUG: KMSAN: uninit-value in lzo1x_1_do_compress+0x19f9/0x2510 lib/lzo/lzo1x_compress.c:178
    
    ...
    
    Uninit was stored to memory at:
     ea_put fs/jfs/xattr.c:639 [inline]
    
    ...
    
    Local variable ea_buf created at:
     __jfs_setxattr+0x5d/0x1ae0 fs/jfs/xattr.c:662
     __jfs_xattr_set+0xe6/0x1f0 fs/jfs/xattr.c:934
    
    =====================================================
    
    The reason is ea_buf->new_ea is not initialized properly.
    
    Fix this by using memset to empty its content at the beginning
    in ea_get().
    
    Reported-by: syzbot+02341e0daa42a15ce130@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=02341e0daa42a15ce130
    Signed-off-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn>
    Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

jfs: UBSAN: shift-out-of-bounds in dbFindBits [+ + +]
Author: Remington Brasga <rbrasga@uci.edu>
Date:   Wed Jul 10 00:12:44 2024 +0000

    jfs: UBSAN: shift-out-of-bounds in dbFindBits
    
    [ Upstream commit b0b2fc815e514221f01384f39fbfbff65d897e1c ]
    
    Fix issue with UBSAN throwing shift-out-of-bounds warning.
    
    Reported-by: syzbot+e38d703eeb410b17b473@syzkaller.appspotmail.com
    Signed-off-by: Remington Brasga <rbrasga@uci.edu>
    Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
jump_label: Fix static_key_slow_dec() yet again [+ + +]
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Mon Sep 9 12:50:09 2024 +0200

    jump_label: Fix static_key_slow_dec() yet again
    
    [ Upstream commit 1d7f856c2ca449f04a22d876e36b464b7a9d28b6 ]
    
    While commit 83ab38ef0a0b ("jump_label: Fix concurrency issues in
    static_key_slow_dec()") fixed one problem, it created yet another,
    notably the following is now possible:
    
      slow_dec
        if (try_dec) // dec_not_one-ish, false
        // enabled == 1
                                    slow_inc
                                      if (inc_not_disabled) // inc_not_zero-ish
                                      // enabled == 2
                                        return
    
        guard(mutex)(&jump_label_mutex);
        if (atomic_cmpxchg(1,0)==1) // false, we're 2
    
                                    slow_dec
                                      if (try-dec) // dec_not_one, true
                                      // enabled == 1
                                        return
        else
          try_dec() // dec_not_one, false
          WARN
    
    Use dec_and_test instead of cmpxchg(), like it was prior to
    83ab38ef0a0b. Add a few WARNs for the paranoid.
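    
    The difference between the two primitives can be shown with C11
    atomics. This is a user-space sketch with hypothetical helper names; it
    models only the final step, not the mutex or the fast paths.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Old final step: only succeeds if the count is exactly 1. */
bool sketch_cmpxchg_1_to_0(atomic_int *enabled)
{
	int expected = 1;

	return atomic_compare_exchange_strong(enabled, &expected, 0);
}

/* Replacement: always drops one reference and reports whether it was the
 * last one. */
bool sketch_dec_and_test(atomic_int *enabled)
{
	return atomic_fetch_sub(enabled, 1) == 1;
}
```

    Starting from a count of 2, the compare-and-exchange fails and leaves
    the count untouched, while dec_and_test drops it to 1 so that the
    other decrementer sees a consistent value.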
    
    Fixes: 83ab38ef0a0b ("jump_label: Fix concurrency issues in static_key_slow_dec()")
    Reported-by: "Darrick J. Wong" <djwong@kernel.org>
    Tested-by: Klara Modin <klarasmodin@gmail.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
kconfig: fix infinite loop in sym_calc_choice() [+ + +]
Author: Masahiro Yamada <masahiroy@kernel.org>
Date:   Wed Sep 25 20:25:31 2024 +0900

    kconfig: fix infinite loop in sym_calc_choice()
    
    [ Upstream commit 4d46b5b623e0adee1153b1d80689211e5094ae44 ]
    
    Since commit f79dc03fe68c ("kconfig: refactor choice value calculation"),
    Kconfig for ARCH=powerpc may result in an infinite loop. This occurs
    because there are two entries for POWERPC64_CPU in a choice block.
    
    If the same symbol appears twice in a choice block, the ->choice_link
    node is added twice to ->choice_members, resulting in a corrupted linked
    list.
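    
    Why a double insertion leads to an endless walk can be seen in a
    minimal circular list sketch. The node type and sketch_ helpers are
    hypothetical stand-ins written in the style of a kernel list, not the
    Kconfig code.

```c
#include <stddef.h>

struct node {
	struct node *next, *prev;
};

void sketch_list_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert `item` before `head`, i.e. at the tail of a circular list. */
void sketch_list_add_tail(struct node *item, struct node *head)
{
	struct node *prev = head->prev;

	head->prev = item;
	item->next = head;
	item->prev = prev;
	prev->next = item;
}

/* Count entries, giving up after `limit` steps. Returns -1 if the walk
 * does not get back to `head` within the limit. */
int sketch_list_count(const struct node *head, int limit)
{
	int n = 0;

	for (const struct node *p = head->next; p != head; p = p->next)
		if (++n > limit)
			return -1;
	return n;
}
```

    Adding the same node a second time makes its next pointer refer to
    itself, so an unbounded traversal never returns to the list head.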
    
    A simple test case is:
    
        choice
                prompt "choice"
    
        config A
                bool "A"
    
        config B
                bool "B 1"
    
        config B
                bool "B 2"
    
        endchoice
    
    Running 'make defconfig' results in an infinite loop.
    
    One solution is to replace the current two entries:
    
        config POWERPC64_CPU
                bool "Generic (POWER5 and PowerPC 970 and above)"
                depends on PPC_BOOK3S_64 && !CPU_LITTLE_ENDIAN
                select PPC_64S_HASH_MMU
    
        config POWERPC64_CPU
                bool "Generic (POWER8 and above)"
                depends on PPC_BOOK3S_64 && CPU_LITTLE_ENDIAN
                select ARCH_HAS_FAST_MULTIPLIER
                select PPC_64S_HASH_MMU
                select PPC_HAS_LBARX_LHARX
    
    with the following single entry:
    
        config POWERPC64_CPU
                bool "Generic 64 bit powerpc"
                depends on PPC_BOOK3S_64
                select ARCH_HAS_FAST_MULTIPLIER if CPU_LITTLE_ENDIAN
                select PPC_64S_HASH_MMU
                select PPC_HAS_LBARX_LHARX if CPU_LITTLE_ENDIAN
    
    In my opinion, the latter looks cleaner, but PowerPC maintainers may
    prefer to display different prompts depending on CPU_LITTLE_ENDIAN.
    
    For now, this commit fixes the issue in Kconfig, restoring the original
    behavior. I will reconsider whether such a use case is worth supporting.
    
    Fixes: f79dc03fe68c ("kconfig: refactor choice value calculation")
    Reported-by: Marco Bonelli <marco@mebeim.net>
    Closes: https://lore.kernel.org/all/1763151587.3581913.1727224126288@privateemail.com/
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kconfig: qconf: fix buffer overflow in debug links [+ + +]
Author: Masahiro Yamada <masahiroy@kernel.org>
Date:   Tue Oct 1 18:02:22 2024 +0900

    kconfig: qconf: fix buffer overflow in debug links
    
    [ Upstream commit 984ed20ece1c6c20789ece040cbff3eb1a388fa9 ]
    
    If you enable "Option -> Show Debug Info" and click a link, the program
    terminates with the following error:
    
        *** buffer overflow detected ***: terminated
    
    The buffer overflow is caused by the following line:
    
        strcat(data, "$");
    
    The buffer needs one more byte to accommodate the additional character.
    
    Fixes: c4f7398bee9c ("kconfig: qconf: make debug links work again")
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kconfig: qconf: move conf_read() before drawing tree pain [+ + +]
Author: Masahiro Yamada <masahiroy@kernel.org>
Date:   Tue Oct 1 02:02:23 2024 +0900

    kconfig: qconf: move conf_read() before drawing tree pain
    
    [ Upstream commit da724c33b685463720b1c625ac440e894dc57ec0 ]
    
    The constructor of ConfigMainWindow() calls show*View(), which needs
    to calculate symbol values. conf_read() must be called before that.
    
    Fixes: 060e05c3b422 ("kconfig: qconf: remove initial call to conf_changed()")
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
kselftest/devices/probe: Fix SyntaxWarning in regex strings for Python3 [+ + +]
Author: Alessandro Zanni <alessandro.zanni87@gmail.com>
Date:   Tue Aug 6 14:14:50 2024 +0200

    kselftest/devices/probe: Fix SyntaxWarning in regex strings for Python3
    
    [ Upstream commit a19008256d05e726f29f43c6a307e45482c082c3 ]
    
    Use raw strings for the regex literals so that Python3 no longer
    reports "\d" as an invalid escape sequence.
    
    Fix the warnings:
    
    tools/testing/selftests/devices/probe/test_discoverable_devices.py:48:
    SyntaxWarning: invalid escape sequence '\d'
      usb_controller_sysfs_dir = "usb[\d]+"
    
    tools/testing/selftests/devices/probe/test_discoverable_devices.py:94:
    SyntaxWarning: invalid escape sequence '\d'
      re_usb_version = re.compile("PRODUCT=.*/(\d)/.*")
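    
    For illustration, the two patterns written as raw strings might look
    as follows. Only the two regex literals come from the warnings above;
    the sample uevent line and the variable names derived from it are
    hypothetical.

```python
import re

# Raw strings pass the backslash through to the regex engine unchanged,
# so Python never tries to interpret "\d" as a string escape.
usb_controller_sysfs_dir = r"usb[\d]+"
re_usb_version = re.compile(r"PRODUCT=.*/(\d)/.*")

# Hypothetical inputs, used only to exercise the patterns.
sample_uevent = "PRODUCT=1d6b/3/611"
match = re_usb_version.search(sample_uevent)
usb_version = match.group(1)
is_controller_dir = re.fullmatch(usb_controller_sysfs_dir, "usb12") is not None
```

    A raw literal such as r"\d" holds the same two characters as "\\d",
    so the compiled pattern is unchanged and only the warning goes away.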
    
    Fixes: dacf1d7a78bf ("kselftest: Add test to verify probe of devices from discoverable buses")
    
    Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
    Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com>
    Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
kselftests: mm: fix wrong __NR_userfaultfd value [+ + +]
Author: Muhammad Usama Anjum <usama.anjum@collabora.com>
Date:   Mon Sep 23 10:38:36 2024 +0500

    kselftests: mm: fix wrong __NR_userfaultfd value
    
    commit f30beffd977e98c33550bbeb6f278d157ff54844 upstream.
    
    grep -rnIF "#define __NR_userfaultfd"
    tools/include/uapi/asm-generic/unistd.h:681:#define __NR_userfaultfd 282
    arch/x86/include/generated/uapi/asm/unistd_32.h:374:#define __NR_userfaultfd 374
    arch/x86/include/generated/uapi/asm/unistd_64.h:327:#define __NR_userfaultfd 323
    arch/x86/include/generated/uapi/asm/unistd_x32.h:282:#define __NR_userfaultfd (__X32_SYSCALL_BIT + 323)
    arch/arm/include/generated/uapi/asm/unistd-eabi.h:347:#define __NR_userfaultfd (__NR_SYSCALL_BASE + 388)
    arch/arm/include/generated/uapi/asm/unistd-oabi.h:359:#define __NR_userfaultfd (__NR_SYSCALL_BASE + 388)
    include/uapi/asm-generic/unistd.h:681:#define __NR_userfaultfd 282
    
    The number is dependent on the architecture. The above data shows that:
    x86     374
    x86_64  323
    
    The value of __NR_userfaultfd changed to 282 when asm-generic/unistd.h
    was included.  This makes the test fail every time, because the correct
    number of this syscall on x86_64 is 323.  Fix it by including
    asm/unistd.h instead.
    
    Link: https://lkml.kernel.org/r/20240923053836.3270393-1-usama.anjum@collabora.com
    Fixes: a5c6bc590094 ("selftests/mm: remove local __NR_* definitions")
    Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
    Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: John Hubbard <jhubbard@nvidia.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
ksmbd: add refcnt to ksmbd_conn struct [+ + +]
Author: Namjae Jeon <linkinjeon@kernel.org>
Date:   Tue Sep 3 20:28:08 2024 +0900

    ksmbd: add refcnt to ksmbd_conn struct
    
    [ Upstream commit ee426bfb9d09b29987369b897fe9b6485ac2be27 ]
    
    When sending an oplock break request, opinfo->conn is used, but with
    multichannel an already freed ->conn can be used.
    This patch adds a reference count to the ksmbd_conn struct
    so that it is freed only when it is no longer used.
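    
    The idea can be sketched in a few lines. This is a generic
    reference-counting illustration, not the ksmbd implementation: the
    Conn class and its get()/put() methods are hypothetical names, and
    setting the freed flag stands in for releasing the structure.

```python
class Conn:
    """Hypothetical connection object protected by a reference count."""

    def __init__(self):
        self.refcnt = 1        # reference held by the connection handler
        self.freed = False

    def get(self):
        """Take an extra reference; only valid while the object is alive."""
        assert not self.freed, "use after free"
        self.refcnt += 1
        return self

    def put(self):
        """Drop one reference and release the object on the last one."""
        self.refcnt -= 1
        if self.refcnt == 0:
            self.freed = True  # stands in for freeing the structure


conn = Conn()
oplock_conn = conn.get()       # the oplock info keeps its own reference
conn.put()                     # the connection handler exits first
still_usable = not oplock_conn.freed
oplock_conn.put()              # oplock break done, last reference dropped
freed_at_end = conn.freed
```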
    
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ksmbd: fix warning: comparison of distinct pointer types lacks a cast [+ + +]
Author: Namjae Jeon <linkinjeon@kernel.org>
Date:   Thu Sep 19 09:22:57 2024 +0900

    ksmbd: fix warning: comparison of distinct pointer types lacks a cast
    
    [ Upstream commit 289ebd9afeb94862d96c89217068943f1937df5b ]
    
    smb2pdu.c: In function ‘smb2_open’:
    ./include/linux/minmax.h:20:28: warning: comparison of distinct
    pointer types lacks a cast
       20 |  (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
          |                            ^~
    ./include/linux/minmax.h:26:4: note: in expansion of macro ‘__typecheck’
       26 |   (__typecheck(x, y) && __no_side_effects(x, y))
          |    ^~~~~~~~~~~
    ./include/linux/minmax.h:36:24: note: in expansion of macro ‘__safe_cmp’
       36 |  __builtin_choose_expr(__safe_cmp(x, y), \
          |                        ^~~~~~~~~~
    ./include/linux/minmax.h:45:19: note: in expansion of macro ‘__careful_cmp’
       45 | #define min(x, y) __careful_cmp(x, y, <)
          |                   ^~~~~~~~~~~~~
    /home/linkinjeon/git/smbd_work/ksmbd/smb2pdu.c:3713:27: note: in
    expansion of macro ‘min’
     3713 |     fp->durable_timeout = min(dh_info.timeout,
    
    Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2")
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
KVM: arm64: Fix kvm_has_feat*() handling of negative features [+ + +]
Author: Marc Zyngier <maz@kernel.org>
Date:   Wed Oct 2 21:42:39 2024 +0100

    KVM: arm64: Fix kvm_has_feat*() handling of negative features
    
    commit a1d402abf8e3ff1d821e88993fc5331784fac0da upstream.
    
    Oliver reports that the kvm_has_feat() helper is not behaving as
    expected for negative features. On investigation, the main issue
    seems to be caused by the following construct:
    
     #define get_idreg_field(kvm, id, fld)                          \
            (id##_##fld##_SIGNED ?                                  \
             get_idreg_field_signed(kvm, id, fld) :                 \
             get_idreg_field_unsigned(kvm, id, fld))
    
    where one side of the expression evaluates as something signed,
    and the other as something unsigned. In retrospect, this is totally
    braindead, as the compiler converts this into an unsigned expression.
    When compared to something that is 0, the test is simply elided.
    
    Epic fail. A similar issue exists in the expand_field_sign() macro.
    
    The correct way to handle this is to choose between signed and unsigned
    comparisons, so that both sides of the ternary expression are of the
    same type (bool).
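    
    The effect can be reproduced with a rough emulation. The sketch below
    models a 4-bit ID register field and mimics C's usual arithmetic
    conversions by masking the result to 64 bits. All helper names and the
    sample field value are hypothetical; this is not the kernel macro.

```python
U64_MASK = (1 << 64) - 1


def sign_extend(value, bits=4):
    """Interpret the low `bits` bits of `value` as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def broken_get_field(raw, signed):
    """Emulate `signed ? (s64)x : (u64)y`, which C evaluates as u64."""
    value = sign_extend(raw) if signed else raw & 0xF
    return value & U64_MASK


def fixed_field_ge(raw, signed, limit):
    """Choose the comparison per signedness, so both arms yield a bool."""
    return sign_extend(raw) >= limit if signed else (raw & 0xF) >= limit


raw_nimp = 0xF  # hypothetical signed field holding -1 ("not implemented")
broken = broken_get_field(raw_nimp, signed=True) >= 0
fixed = fixed_field_ge(raw_nimp, signed=True, limit=0)
```

    In the broken variant the -1 turns into a huge unsigned value, so a
    ">= 0" test can never fail. The fixed variant compares before any
    conversion takes place.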
    
    In order to keep the code readable (sort of), we introduce new
    comparison primitives taking an operator as a parameter, and
    rewrite the kvm_has_feat*() helpers in terms of these primitives.
    
    Fixes: c62d7a23b947 ("KVM: arm64: Add feature checking helpers")
    Reported-by: Oliver Upton <oliver.upton@linux.dev>
    Tested-by: Oliver Upton <oliver.upton@linux.dev>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20241002204239.2051637-1-maz@kernel.org
    Signed-off-by: Marc Zyngier <maz@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
l2tp: free sessions using rcu [+ + +]
Author: James Chapman <jchapman@katalix.com>
Date:   Mon Jul 29 16:38:08 2024 +0100

    l2tp: free sessions using rcu
    
    [ Upstream commit d17e89999574aca143dd4ede43e4382d32d98724 ]
    
    l2tp sessions may be accessed under an rcu read lock. Have them freed
    via rcu and remove the now unneeded synchronize_rcu when a session is
    removed.
    
    Signed-off-by: James Chapman <jchapman@katalix.com>
    Signed-off-by: Tom Parkin <tparkin@katalix.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

l2tp: prevent possible tunnel refcount underflow [+ + +]
Author: James Chapman <jchapman@katalix.com>
Date:   Mon Jul 29 16:38:10 2024 +0100

    l2tp: prevent possible tunnel refcount underflow
    
    [ Upstream commit 24256415d18695b46da06c93135f5b51c548b950 ]
    
    When a session is created, it sets a backpointer to its tunnel. When
    the session refcount drops to 0, l2tp_session_free drops the tunnel
    refcount if session->tunnel is non-NULL. However, session->tunnel is
    set in l2tp_session_create, before the tunnel refcount is incremented
    by l2tp_session_register, which leaves a small window where
    session->tunnel is non-NULL when the tunnel refcount hasn't been
    bumped.
    
    Moving the assignment to l2tp_session_register is trivial but
    l2tp_session_create calls l2tp_session_set_header_len which uses
    session->tunnel to get the tunnel's encap. Add an encap arg to
    l2tp_session_set_header_len to avoid using session->tunnel.
    
    If l2tpv3 sessions have colliding IDs, it is possible for
    l2tp_v3_session_get to race with l2tp_session_register and fetch a
    session which doesn't yet have session->tunnel set. Add a check for
    this case.
    
    Signed-off-by: James Chapman <jchapman@katalix.com>
    Signed-off-by: Tom Parkin <tparkin@katalix.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

l2tp: use rcu list add/del when updating lists [+ + +]
Author: James Chapman <jchapman@katalix.com>
Date:   Mon Jul 29 16:38:11 2024 +0100

    l2tp: use rcu list add/del when updating lists
    
    [ Upstream commit 89b768ec2dfefaeba5212de14fc71368e12d06ba ]
    
    l2tp_v3_session_htable and tunnel->session_list are read by lockless
    getters using RCU. Use rcu list variants when adding or removing list
    items.
    
    Signed-off-by: James Chapman <jchapman@katalix.com>
    Signed-off-by: Tom Parkin <tparkin@katalix.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
leds: pca9532: Remove irrelevant blink configuration error message [+ + +]
Author: Bastien Curutchet <bastien.curutchet@bootlin.com>
Date:   Mon Aug 26 15:32:37 2024 +0200

    leds: pca9532: Remove irrelevant blink configuration error message
    
    commit 2aad93b6de0d874038d3d7958be05011284cd6b9 upstream.
    
    The update_hw_blink() function prints an error message when hardware is
    not able to handle a blink configuration on its own. IMHO, this isn't a
    'real' error since the software fallback is used afterwards.
    
    Remove the error messages to avoid flooding the logs with unnecessary
    messages.
    
    Cc: stable@vger.kernel.org
    Fixes: 48ca7f302cfc ("leds: pca9532: Use PWM1 for hardware blinking")
    Signed-off-by: Bastien Curutchet <bastien.curutchet@bootlin.com>
    Link: https://lore.kernel.org/r/20240826133237.134604-1-bastien.curutchet@bootlin.com
    Signed-off-by: Lee Jones <lee@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
lib/buildid: harden build ID parsing logic [+ + +]
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Aug 29 10:42:23 2024 -0700

    lib/buildid: harden build ID parsing logic
    
    commit 905415ff3ffb1d7e5afa62bacabd79776bd24606 upstream.
    
    Harden build ID parsing logic, adding explicit READ_ONCE() where it's
    important to have a consistent value read and validated just once.
    
    Also, as pointed out by Andi Kleen, we need to make sure that the entire
    ELF note is within page bounds, so move the overflow check up and add
    an extra note_size bounds validation.
    
    The Fixes tag below points to the commit that moved this code into
    lib/buildid.c. The code was subsequently used in the perf subsystem,
    which exposed it to perf_event_open() users in v5.12+.
    
    Cc: stable@vger.kernel.org
    Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
    Reviewed-by: Jann Horn <jannh@google.com>
    Suggested-by: Andi Kleen <ak@linux.intel.com>
    Fixes: bd7525dacd7e ("bpf: Move stack_map_get_build_id into lib")
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20240829174232.3133883-2-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Linux: Linux 6.11.3 [+ + +]
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Thu Oct 10 12:04:18 2024 +0200

    Linux 6.11.3
    
    Link: https://lore.kernel.org/r/20241008115702.214071228@linuxfoundation.org
    Tested-by: Pavel Machek (CIP) <pavel@denx.de>
    Tested-by: Peter Schneider <pschneider1968@googlemail.com>
    Tested-by: Ronald Warsow <rwarsow@gmx.de>
    Tested-by: Markus Reichelt <lkt+2023@mareichelt.com>
    Tested-by: Mark Brown <broonie@kernel.org>
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Tested-by: Christian Heusel <christian@heusel.eu>
    Tested-by: Kexy Biscuit <kexybiscuit@aosc.io>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: kernelci.org bot <bot@kernelci.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mac802154: Fix potential RCU dereference issue in mac802154_scan_worker [+ + +]
Author: Jiawei Ye <jiawei.ye@foxmail.com>
Date:   Tue Sep 24 06:58:05 2024 +0000

    mac802154: Fix potential RCU dereference issue in mac802154_scan_worker
    
    commit bff1709b3980bd7f80be6786f64cc9a9ee9e56da upstream.
    
    In the `mac802154_scan_worker` function, the `scan_req->type` field was
    accessed after the RCU read-side critical section was unlocked. According
    to RCU usage rules, this is illegal and can lead to unpredictable
    behavior, such as accessing memory that has been updated or causing
    use-after-free issues.
    
    This possible bug was identified using a static analysis tool developed
    by myself, specifically designed to detect RCU-related issues.
    
    To address this, the `scan_req->type` value is now stored in a local
    variable `scan_req_type` while still within the RCU read-side critical
    section. The `scan_req_type` is then used after the RCU lock is released,
    ensuring that the type value is safely accessed without violating RCU
    rules.
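    
    The pattern can be illustrated with a toy model. This is not real RCU:
    the Guarded class and read_section() helper are hypothetical and only
    make out-of-section reads fail loudly, so that the difference between
    the old and the new access pattern is visible.

```python
from contextlib import contextmanager


class Guarded:
    """Hypothetical object that may only be read inside a read section."""

    def __init__(self, scan_type):
        self._scan_type = scan_type
        self.readable = False

    @property
    def type(self):
        if not self.readable:
            raise RuntimeError("read outside the read-side critical section")
        return self._scan_type


@contextmanager
def read_section(obj):
    """Very rough stand-in for rcu_read_lock()/rcu_read_unlock()."""
    obj.readable = True
    try:
        yield obj
    finally:
        obj.readable = False


scan_req = Guarded("active")

with read_section(scan_req) as req:
    scan_req_type = req.type      # snapshot taken while still protected

late_read_failed = False
try:
    scan_req.type                 # the access pattern the patch removes
except RuntimeError:
    late_read_failed = True
```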
    
    Fixes: e2c3e6f53a7a ("mac802154: Handle active scanning")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jiawei Ye <jiawei.ye@foxmail.com>
    Acked-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Link: https://lore.kernel.org/tencent_3B2F4F2B4DA30FAE2F51A9634A16B3AD4908@qq.com
    Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mailbox: ARM_MHU_V3 should depend on ARM64 [+ + +]
Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date:   Thu Aug 29 15:58:53 2024 +0200

    mailbox: ARM_MHU_V3 should depend on ARM64
    
    [ Upstream commit 0e4ed48292c55eeb0afab22f8930b556f17eaad2 ]
    
    The ARM MHUv3 controller is only present on ARM64 SoCs.  Hence add a
    dependency on ARM64, to prevent asking the user about this driver when
    configuring a kernel for a different architecture than ARM64.
    
    Fixes: ca1a8680b134b5e6 ("mailbox: arm_mhuv3: Add driver")
    Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Acked-by: Sudeep Holla <sudeep.holla@arm.com>
    Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mailbox: bcm2835: Fix timeout during suspend mode [+ + +]
Author: Stefan Wahren <wahrenst@gmx.net>
Date:   Wed Aug 21 23:40:44 2024 +0200

    mailbox: bcm2835: Fix timeout during suspend mode
    
    [ Upstream commit dc09f007caed3b2f6a3b6bd7e13777557ae22bfd ]
    
    During the noirq suspend phase, the Raspberry Pi power driver suffers
    from firmware property timeouts. The reason is that the IRQ of the
    underlying BCM2835 mailbox is disabled, so rpi_firmware_property_list()
    will always run into a timeout [1].
    
    Since the VideoCore side isn't considered a wakeup source, set the
    IRQF_NO_SUSPEND flag for the mailbox IRQ in order to keep it enabled
    during the suspend-resume cycle.
    
    [1]
    PM: late suspend of devices complete after 1.754 msecs
    WARNING: CPU: 0 PID: 438 at drivers/firmware/raspberrypi.c:128
     rpi_firmware_property_list+0x204/0x22c
    Firmware transaction 0x00028001 timeout
    Modules linked in:
    CPU: 0 PID: 438 Comm: bash Tainted: G         C         6.9.3-dirty #17
    Hardware name: BCM2835
    Call trace:
    unwind_backtrace from show_stack+0x18/0x1c
    show_stack from dump_stack_lvl+0x34/0x44
    dump_stack_lvl from __warn+0x88/0xec
    __warn from warn_slowpath_fmt+0x7c/0xb0
    warn_slowpath_fmt from rpi_firmware_property_list+0x204/0x22c
    rpi_firmware_property_list from rpi_firmware_property+0x68/0x8c
    rpi_firmware_property from rpi_firmware_set_power+0x54/0xc0
    rpi_firmware_set_power from _genpd_power_off+0xe4/0x148
    _genpd_power_off from genpd_sync_power_off+0x7c/0x11c
    genpd_sync_power_off from genpd_finish_suspend+0xcc/0xe0
    genpd_finish_suspend from dpm_run_callback+0x78/0xd0
    dpm_run_callback from device_suspend_noirq+0xc0/0x238
    device_suspend_noirq from dpm_suspend_noirq+0xb0/0x168
    dpm_suspend_noirq from suspend_devices_and_enter+0x1b8/0x5ac
    suspend_devices_and_enter from pm_suspend+0x254/0x2e4
    pm_suspend from state_store+0xa8/0xd4
    state_store from kernfs_fop_write_iter+0x154/0x1a0
    kernfs_fop_write_iter from vfs_write+0x12c/0x184
    vfs_write from ksys_write+0x78/0xc0
    ksys_write from ret_fast_syscall+0x0/0x54
    Exception stack(0xcc93dfa8 to 0xcc93dff0)
    [...]
    PM: noirq suspend of devices complete after 3095.584 msecs
    
    Link: https://github.com/raspberrypi/firmware/issues/1894
    Fixes: 0bae6af6d704 ("mailbox: Enable BCM2835 mailbox support")
    Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
    Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mailbox: rockchip: fix a typo in module autoloading [+ + +]
Author: Liao Chen <liaochen4@huawei.com>
Date:   Wed Aug 14 02:51:47 2024 +0000

    mailbox: rockchip: fix a typo in module autoloading
    
    [ Upstream commit e92d87c9c5d769e4cb1dd7c90faa38dddd7e52e3 ]
    
    MODULE_DEVICE_TABLE(of, rockchip_mbox_of_match) lets the module be
    autoloaded properly based on the alias from the of_device_id table. The
    table name should be 'rockchip_mbox_of_match' instead of
    'rockchp_mbox_of_match'; fix the typo.
    
    Fixes: f70ed3b5dc8b ("mailbox: rockchip: Add Rockchip mailbox driver")
    Signed-off-by: Liao Chen <liaochen4@huawei.com>
    Reviewed-by: Heiko Stuebner <heiko@sntech.de>
    Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
media: i2c: ar0521: Use cansleep version of gpiod_set_value() [+ + +]
Author: Alexander Shiyan <eagle.alexander923@gmail.com>
Date:   Thu Aug 29 08:48:49 2024 +0300

    media: i2c: ar0521: Use cansleep version of gpiod_set_value()
    
    commit bee1aed819a8cda47927436685d216906ed17f62 upstream.
    
    If the reset GPIO is provided by an I2C port expander, we must use the
    *_cansleep() variant of the GPIO functions.
    This was not done in the ar0521_power_on()/ar0521_power_off() functions.
    Let's fix that.
    
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 11 at drivers/gpio/gpiolib.c:3496 gpiod_set_value+0x74/0x7c
    Modules linked in:
    CPU: 0 PID: 11 Comm: kworker/u16:0 Not tainted 6.10.0 #53
    Hardware name: Diasom DS-RK3568-SOM-EVB (DT)
    Workqueue: events_unbound deferred_probe_work_func
    pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    pc : gpiod_set_value+0x74/0x7c
    lr : ar0521_power_on+0xcc/0x290
    sp : ffffff8001d7ab70
    x29: ffffff8001d7ab70 x28: ffffff80027dcc90 x27: ffffff8003c82000
    x26: ffffff8003ca9250 x25: ffffffc080a39c60 x24: ffffff8003ca9088
    x23: ffffff8002402720 x22: ffffff8003ca9080 x21: ffffff8003ca9088
    x20: 0000000000000000 x19: ffffff8001eb2a00 x18: ffffff80efeeac80
    x17: 756d2d6332692f30 x16: 0000000000000000 x15: 0000000000000000
    x14: ffffff8001d91d40 x13: 0000000000000016 x12: ffffffc080e98930
    x11: ffffff8001eb2880 x10: 0000000000000890 x9 : ffffff8001d7a9f0
    x8 : ffffff8001d92570 x7 : ffffff80efeeac80 x6 : 000000003fc6e780
    x5 : ffffff8001d91c80 x4 : 0000000000000002 x3 : 0000000000000000
    x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000001
    Call trace:
     gpiod_set_value+0x74/0x7c
     ar0521_power_on+0xcc/0x290
    ...
    
    Signed-off-by: Alexander Shiyan <eagle.alexander923@gmail.com>
    Fixes: 852b50aeed15 ("media: On Semi AR0521 sensor driver")
    Cc: stable@vger.kernel.org
    Acked-by: Krzysztof Hałasa <khalasa@piap.pl>
    Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: imx335: Fix reset-gpio handling [+ + +]
Author: Umang Jain <umang.jain@ideasonboard.com>
Date:   Fri Aug 30 11:41:52 2024 +0530

    media: imx335: Fix reset-gpio handling
    
    commit 99d30e2fdea4086be4e66e2deb10de854b547ab8 upstream.
    
    Rectify the logical value of reset-gpio so that it is set to
    0 (disabled) during power-on and to 1 (enabled) during power-off.
    
    Set the reset-gpio to GPIO_OUT_HIGH at initialization time to make
    sure it starts off in reset. Also drop the "Set XCLR" comment, which
    is not very informative.
    
    Existing users of imx335 had the reset-gpios polarity inverted
    (GPIO_ACTIVE_HIGH) in their device-tree sources. With this patch
    included, those DTS will not be able to stream imx335 anymore. The
    reset-gpio polarity will need to be rectified in the device-tree
    sources, as shown in the example in [1], in order to get imx335
    functional again (as it remains in reset prior to this fix).
    
    Cc: stable@vger.kernel.org
    Fixes: 45d19b5fb9ae ("media: i2c: Add imx335 camera sensor driver")
    Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
    Link: https://lore.kernel.org/linux-media/20240729110437.199428-1-umang.jain@ideasonboard.com/
    Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
    Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: ov5675: Fix power on/off delay timings [+ + +]
Author: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Date:   Sat Jul 13 23:33:29 2024 +0100

    media: ov5675: Fix power on/off delay timings
    
    commit 719ec29fceda2f19c833d2784b1574638320400f upstream.
    
    The ov5675 specification says that the gap between XSHUTDN deassert and the
    first I2C transaction should be a minimum of 8192 XVCLK cycles.
    
    Right now we use a usleep_range() that gives a sleep time of between about
    430 and 860 microseconds.
    
    On the Lenovo X13s we have observed that in about 1/20 cases the current
    timing is too tight and we start transacting before the ov5675's reset
    cycle completes, leading to I2C bus transaction failures.
    
    The reset racing is sometimes triggered at initial chip probe but more
    usually on a subsequent power-off/power-on cycle, e.g.
    
    [   71.451662] ov5675 24-0010: failed to write reg 0x0103. error = -5
    [   71.451686] ov5675 24-0010: failed to set plls
    
    The current quiescence period we have is too tight. Instead of expressing
    the post reset delay in terms of the current XVCLK this patch converts the
    power-on and power-off delays to the maximum theoretical delay @ 6 MHz with
    an additional buffer.
    
    1.365 milliseconds on the power-on path is 1.5 milliseconds with grace.
    85.3 microseconds on the power-off path is 90 microseconds with grace.
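    
    The two figures above can be checked with a short calculation. The
    8192-cycle count and the 6 MHz clock come from this message; the
    512-cycle power-off count is an assumption inferred from 85.3
    microseconds at 6 MHz and is not stated here. The helper name is
    hypothetical.

```python
XVCLK_MIN_HZ = 6_000_000    # slowest XVCLK considered in the message
POWER_ON_CYCLES = 8192      # minimum gap quoted from the ov5675 specification
POWER_OFF_CYCLES = 512      # assumption: inferred from 85.3 us at 6 MHz


def cycles_to_us(cycles, hz):
    """Convert a number of clock cycles at `hz` into microseconds."""
    return cycles * 1_000_000 / hz


power_on_us = cycles_to_us(POWER_ON_CYCLES, XVCLK_MIN_HZ)
power_off_us = cycles_to_us(POWER_OFF_CYCLES, XVCLK_MIN_HZ)

# The delays chosen by the patch leave some headroom over both values.
power_on_margin_us = 1500 - power_on_us
power_off_margin_us = 90 - power_off_us
```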
    
    Fixes: 49d9ad719e89 ("media: ov5675: add device-tree support and support runtime PM")
    Cc: stable@vger.kernel.org
    Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
    Tested-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Quentin Schulz <quentin.schulz@cherry.de>
    Tested-by: Quentin Schulz <quentin.schulz@cherry.de> # RK3399 Puma with
    Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: qcom: camss: Fix ordering of pm_runtime_enable [+ + +]
Author: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Date:   Mon Jul 29 13:42:03 2024 +0100

    media: qcom: camss: Fix ordering of pm_runtime_enable
    
    commit a151766bd3688f6803e706c6433a7c8d3c6a6a94 upstream.
    
    pm_runtime_enable() should happen prior to vfe_get() since vfe_get() calls
    pm_runtime_resume_and_get().
    
    This is a basic race condition that doesn't show up for most users, so it
    is not widely reported. If you blacklist qcom-camss in modules.d and then
    modprobe the module post-boot, it is possible to reproduce this error
    reliably.
    
    The kernel log for this error looks like this:
    
    qcom-camss ac5a000.camss: Failed to power up pipeline: -13
    
    Fixes: 02afa816dbbf ("media: camss: Add basic runtime PM support")
    Reported-by: Johan Hovold <johan+linaro@kernel.org>
    Closes: https://lore.kernel.org/lkml/ZoVNHOTI0PKMNt4_@hovoldconsulting.com/
    Tested-by: Johan Hovold <johan+linaro@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
    Reviewed-by: Konrad Dybcio <konradybcio@kernel.org>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: qcom: camss: Remove use_count guard in stop_streaming [+ + +]
Author: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Date:   Mon Jul 29 13:42:02 2024 +0100

    media: qcom: camss: Remove use_count guard in stop_streaming
    
    commit 25f18cb1b673220b76a86ebef8e7fb79bd303b27 upstream.
    
    The use_count check was introduced so that multiple concurrent Raw Data
    Interfaces (RDIs) could be driven by different virtual channels (VCs) on
    the CSIPHY input driving the video pipeline.
    
    This is an invalid use of use_count though, as use_count pertains to the
    number of times a video entity has been opened by user-space, not the
    number of active streams.
    
    If use_count and stream-on count don't agree then stop_streaming() will
    break, as is currently the case and as has become apparent when using
    CAMSS with libcamera's released softisp 0.3.
    
    The use of use_count like this is a bit hacky and right now breaks regular
    usage of CAMSS for a single stream case. Stopping qcam results in the splat
    below, after which it cannot be started again and any attempt to do so
    fails with -EBUSY.
    
    [ 1265.509831] WARNING: CPU: 5 PID: 919 at drivers/media/common/videobuf2/videobuf2-core.c:2183 __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
    ...
    [ 1265.510630] Call trace:
    [ 1265.510636]  __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
    [ 1265.510648]  vb2_core_streamoff+0x24/0xcc [videobuf2_common]
    [ 1265.510660]  vb2_ioctl_streamoff+0x5c/0xa8 [videobuf2_v4l2]
    [ 1265.510673]  v4l_streamoff+0x24/0x30 [videodev]
    [ 1265.510707]  __video_do_ioctl+0x190/0x3f4 [videodev]
    [ 1265.510732]  video_usercopy+0x304/0x8c4 [videodev]
    [ 1265.510757]  video_ioctl2+0x18/0x34 [videodev]
    [ 1265.510782]  v4l2_ioctl+0x40/0x60 [videodev]
    ...
    [ 1265.510944] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 0 in active state
    [ 1265.511175] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 1 in active state
    [ 1265.511398] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 2 in active st
    
    One CAMSS specific way to handle multiple VCs on the same RDI might be:
    
    - Reference count each pipeline enable for CSIPHY, CSID, VFE and RDIx.
    - The video buffers are already associated with msm_vfeN_rdiX so
      release video buffers when told to do so by stop_streaming.
    - Only release the power-domains for the CSIPHY, CSID and VFE when
      their internal refcounts drop.
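    
    The three points above can be sketched as follows. This is only an
    illustration of the proposed scheme, not CAMSS code: the Block and Rdi
    classes, their methods and the buffer names are all hypothetical.

```python
class Block:
    """Hypothetical shared block (CSIPHY, CSID or VFE) with a refcount."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.powered = False

    def enable(self):
        self.refcount += 1
        self.powered = True

    def disable(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.powered = False   # release the power domain on last user


class Rdi:
    """Hypothetical RDI video node that owns its queued buffers."""

    def __init__(self, blocks):
        self.blocks = blocks
        self.buffers = []

    def start_streaming(self, buffers):
        self.buffers = list(buffers)
        for block in self.blocks:
            block.enable()

    def stop_streaming(self):
        # Buffers are always handed back, regardless of other users.
        returned, self.buffers = self.buffers, []
        for block in self.blocks:
            block.disable()
        return returned


shared = [Block("csiphy"), Block("csid"), Block("vfe")]
rdi0, rdi1 = Rdi(shared), Rdi(shared)
rdi0.start_streaming(["buf0", "buf1"])
rdi1.start_streaming(["buf2"])
returned = rdi0.stop_streaming()
```

    Stopping rdi0 returns its buffers immediately, while the shared blocks
    stay powered until rdi1 stops as well.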
    
    Either way, refusing to release video buffers based on use_count is
    erroneous and should be reverted. The silicon enabling code for selecting
    VCs is perfectly fine. It's a "known missing feature" that concurrent VCs
    won't work with CAMSS right now.
    
    Initial testing with this code didn't show an error, but SoftISP and "real"
    usage with Google Hangouts break the upstream code pretty quickly. We need
    to do a partial revert and take another pass at VCs.
    
    This commit partially reverts commit 89013969e232 ("media: camss: sm8250:
    Pipeline starting and stopping for multiple virtual channels")
    
    Fixes: 89013969e232 ("media: camss: sm8250: Pipeline starting and stopping for multiple virtual channels")
    Reported-by: Johan Hovold <johan+linaro@kernel.org>
    Closes: https://lore.kernel.org/lkml/ZoVNHOTI0PKMNt4_@hovoldconsulting.com/
    Tested-by: Johan Hovold <johan+linaro@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: sun4i_csi: Implement link validate for sun4i_csi subdev [+ + +]
Author: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Date:   Wed Jun 19 02:46:16 2024 +0300

    media: sun4i_csi: Implement link validate for sun4i_csi subdev
    
    commit 2dc5d5d401f5c6cecd97800ffef82e8d17d228f0 upstream.
    
    The sun4i_csi driver doesn't implement link validation for the subdev it
    registers, leaving the link between the subdev and its source
    unvalidated. Fix it, using the v4l2_subdev_link_validate() helper.
    
    Fixes: 577bbf23b758 ("media: sunxi: Add A10 CSI driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
    Acked-by: Chen-Yu Tsai <wens@csie.org>
    Reviewed-by: Tomi Valkeinen <tomi.valkeinen+renesas@ideasonboard.com>
    Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: uapi/linux/cec.h: cec_msg_set_reply_to: zero flags [+ + +]
Author: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Date:   Wed Aug 7 09:22:10 2024 +0200

    media: uapi/linux/cec.h: cec_msg_set_reply_to: zero flags
    
    commit 599f6899051cb70c4e0aa9fd591b9ee220cb6f14 upstream.
    
    The cec_msg_set_reply_to() helper function never zeroed the
    struct cec_msg flags field. This can cause unexpected behavior
    if flags was uninitialized to begin with.
    
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Fixes: 0dbacebede1e ("[media] cec: move the CEC framework out of staging and to media")
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: venus: fix use after free bug in venus_remove due to race condition [+ + +]
Author: Zheng Wang <zyytlz.wz@163.com>
Date:   Tue Jun 18 14:55:59 2024 +0530

    media: venus: fix use after free bug in venus_remove due to race condition
    
    commit c5a85ed88e043474161bbfe54002c89c1cb50ee2 upstream.
    
    In venus_probe, core->work is bound to venus_sys_error_handler, which is
    used to handle errors. The code uses core->sys_err_done to synchronize the
    work. The core->work is started in venus_event_notify.
    
    If we call venus_remove, there might be unfinished work. The possible
    sequence is as follows:
    
    CPU0                  CPU1
    
                         |venus_sys_error_handler
    venus_remove         |
    hfi_destroy          |
    venus_hfi_destroy    |
    kfree(hdev);         |
                         |hfi_reinit
                         |venus_hfi_queues_reinit
                         |//use hdev
    
    Fix it by canceling the work in venus_remove.
    
    Cc: stable@vger.kernel.org
    Fixes: af2c3834c8ca ("[media] media: venus: adding core part and helper functions")
    Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
    Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
    Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: videobuf2: Drop minimum allocation requirement of 2 buffers [+ + +]
Author: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Date:   Mon Aug 26 02:24:49 2024 +0300

    media: videobuf2: Drop minimum allocation requirement of 2 buffers
    
    commit e5700c9037727d5a69a677d6dba25010b485d65b upstream.
    
    When introducing the ability for drivers to indicate the minimum number
    of buffers they require an application to allocate, commit 6662edcd32cc
    ("media: videobuf2: Add min_reqbufs_allocation field to vb2_queue
    structure") also introduced a global minimum of 2 buffers. It turns out
    this breaks the Renesas R-Car VSP test suite, where a test that
    allocates a single buffer fails when two buffers are used.
    
    It is debatable whether test suite failures without failures in
    production use cases should be considered a regression, but
    operation with a single buffer is a valid use case. While full frame
    rate can't be maintained, memory-to-memory devices can still be used
    with a decent efficiency, and requiring applications to allocate
    multiple buffers for single-shot use cases with capture devices would
    just waste memory.
    
    For those reasons, fix the regression by dropping the global minimum of
    buffers. Individual drivers can still set their own minimum.
    
    Fixes: 6662edcd32cc ("media: videobuf2: Add min_reqbufs_allocation field to vb2_queue structure")
    Cc: stable@vger.kernel.org
    Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
    Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Acked-by: Tomasz Figa <tfiga@chromium.org>
    Link: https://lore.kernel.org/r/20240825232449.25905-1-laurent.pinchart+renesas@ideasonboard.com
    Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
memory: tegra186-emc: drop unused to_tegra186_emc() [+ + +]
Author: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Date:   Mon Aug 12 14:30:55 2024 +0200

    memory: tegra186-emc: drop unused to_tegra186_emc()
    
    commit 67dd9e861add38755a7c5d29e25dd0f6cb4116ab upstream.
    
    to_tegra186_emc() is not used, which breaks W=1 builds:
    
      tegra186-emc.c:38:36: error: unused function 'to_tegra186_emc' [-Werror,-Wunused-function]
    
    Fixes: 9a38cb27668e ("memory: tegra: Add interconnect support for DRAM scaling in Tegra234")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240812123055.124123-1-krzysztof.kozlowski@linaro.org
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mm, slub: avoid zeroing kmalloc redzone [+ + +]
Author: Peng Fan <peng.fan@nxp.com>
Date:   Thu Aug 29 11:29:11 2024 +0800

    mm, slub: avoid zeroing kmalloc redzone
    
    commit 59090e479ac78ae18facd4c58eb332562a23020e upstream.
    
    Since commit 946fa0dbf2d8 ("mm/slub: extend redzone check to extra
    allocated kmalloc space than requested"), setting orig_size treats
    the wasted space (object_size - orig_size) as a redzone. However with
    init_on_free=1 we clear the full object->size, including the redzone.
    
    Additionally we clear the object metadata, including the stored orig_size,
    making it zero, which makes check_object() treat the whole object as a
    redzone.
    
    These issues lead to the following BUG report with "slub_debug=FUZ
    init_on_free=1":
    
    [    0.000000] =============================================================================
    [    0.000000] BUG kmalloc-8 (Not tainted): kmalloc Redzone overwritten
    [    0.000000] -----------------------------------------------------------------------------
    [    0.000000]
    [    0.000000] 0xffff000010032858-0xffff00001003285f @offset=2136. First byte 0x0 instead of 0xcc
    [    0.000000] FIX kmalloc-8: Restoring kmalloc Redzone 0xffff000010032858-0xffff00001003285f=0xcc
    [    0.000000] Slab 0xfffffdffc0400c80 objects=36 used=23 fp=0xffff000010032a18 flags=0x3fffe0000000200(workingset|node=0|zone=0|lastcpupid=0x1ffff)
    [    0.000000] Object 0xffff000010032858 @offset=2136 fp=0xffff0000100328c8
    [    0.000000]
    [    0.000000] Redzone  ffff000010032850: cc cc cc cc cc cc cc cc                          ........
    [    0.000000] Object   ffff000010032858: cc cc cc cc cc cc cc cc                          ........
    [    0.000000] Redzone  ffff000010032860: cc cc cc cc cc cc cc cc                          ........
    [    0.000000] Padding  ffff0000100328b4: 00 00 00 00 00 00 00 00 00 00 00 00              ............
    [    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.11.0-rc3-next-20240814-00004-g61844c55c3f4 #144
    [    0.000000] Hardware name: NXP i.MX95 19X19 board (DT)
    [    0.000000] Call trace:
    [    0.000000]  dump_backtrace+0x90/0xe8
    [    0.000000]  show_stack+0x18/0x24
    [    0.000000]  dump_stack_lvl+0x74/0x8c
    [    0.000000]  dump_stack+0x18/0x24
    [    0.000000]  print_trailer+0x150/0x218
    [    0.000000]  check_object+0xe4/0x454
    [    0.000000]  free_to_partial_list+0x2f8/0x5ec
    
    To address the issue, use orig_size to clear the used area, and restore
    the value of orig_size after clearing the remaining area.
    
    When CONFIG_SLUB_DEBUG is not defined, get_orig_size() directly returns
    s->object_size. So when using memset to init the area, the size can simply
    be orig_size, as orig_size returns object_size when CONFIG_SLUB_DEBUG is not
    enabled. And orig_size can never be bigger than object_size.
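    
    The effect of clearing by orig_size rather than by the full object size
    can be illustrated with a small userspace sketch. This is not the SLUB
    code: OBJECT_SIZE, the helper names, and the 5-byte request used in the
    usage note are assumptions made for illustration, and only the 0xcc
    poison value is taken from the report above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OBJECT_SIZE  8     /* hypothetical kmalloc-8 bucket size */
#define REDZONE_BYTE 0xcc  /* poison value seen in the report above */

/* Fill the slack between orig_size and OBJECT_SIZE with the redzone poison. */
static void poison_slack(unsigned char *obj, size_t orig_size)
{
	memset(obj + orig_size, REDZONE_BYTE, OBJECT_SIZE - orig_size);
}

/* Return 1 if every slack byte still holds the poison value, 0 otherwise. */
static int slack_intact(const unsigned char *obj, size_t orig_size)
{
	for (size_t i = orig_size; i < OBJECT_SIZE; i++)
		if (obj[i] != REDZONE_BYTE)
			return 0;
	return 1;
}

/* Behaviour described as broken: zero the whole object, slack included. */
static void zero_full_object(unsigned char *obj)
{
	memset(obj, 0, OBJECT_SIZE);
}

/* Behaviour described as the fix: zero only the bytes the caller requested. */
static void zero_used_area(unsigned char *obj, size_t orig_size)
{
	memset(obj, 0, orig_size);
}
```

    With a 5-byte request in an 8-byte object, zero_full_object() wipes the
    three poisoned slack bytes, which is what a later redzone check would
    flag, while zero_used_area() leaves them intact.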
    
    Fixes: 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested")
    Cc: <stable@vger.kernel.org>
    Reviewed-by: Feng Tang <feng.tang@intel.com>
    Acked-by: David Rientjes <rientjes@google.com>
    Signed-off-by: Peng Fan <peng.fan@nxp.com>
    Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mm/filemap: fix filemap_get_folios_contig THP panic [+ + +]
Author: Steve Sistare <steven.sistare@oracle.com>
Date:   Tue Sep 3 07:25:17 2024 -0700

    mm/filemap: fix filemap_get_folios_contig THP panic
    
    commit c225c4f6056b46a8a5bf2ed35abf17a2d6887691 upstream.
    
    Patch series "memfd-pin huge page fixes".
    
    Fix multiple bugs that occur when using memfd_pin_folios with hugetlb
    pages and THP.  The hugetlb bugs only bite when the page is not yet
    faulted in when memfd_pin_folios is called.  The THP bug bites when the
    starting offset passed to memfd_pin_folios is not huge page aligned.  See
    the commit messages for details.
    
    
    This patch (of 5):
    
    memfd_pin_folios on memory backed by THP panics if the requested start
    offset is not huge page aligned:
    
    BUG: kernel NULL pointer dereference, address: 0000000000000036
    RIP: 0010:filemap_get_folios_contig+0xdf/0x290
    RSP: 0018:ffffc9002092fbe8 EFLAGS: 00010202
    RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000002
    
    The fault occurs here, because xas_load returns a folio with value 2:
    
        filemap_get_folios_contig()
            for (folio = xas_load(&xas); folio && xas.xa_index <= end;
                            folio = xas_next(&xas)) {
                    ...
                    if (!folio_try_get(folio))   <-- BOOM
    
    "2" is an xarray sibling entry.  We get it because memfd_pin_folios does
    not round the indices passed to filemap_get_folios_contig to huge page
    boundaries for THP, so we load from the middle of a huge page range and
    see a sibling.  (It does round for hugetlbfs, at the is_file_hugepages test).
    
    To fix, if the folio is a sibling, then return the next index as the
    starting point for the next call to filemap_get_folios_contig.
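    
    Why the value 2 reads as a sibling entry can be sketched in userspace.
    The helpers below are stand-ins modelled on the internal-entry encoding
    in include/linux/xarray.h as I understand it (low two bits set to binary
    10, payload shifted left by two); the sketch_ names and the chunk size
    of 64 are assumptions, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CHUNK_SIZE 64  /* assumed number of slots per xarray node */

/* Internal entries carry their payload above two tag bits equal to 0b10. */
static void *sketch_mk_internal(uintptr_t v)
{
	return (void *)((v << 2) | 2);
}

static bool sketch_is_internal(const void *entry)
{
	return ((uintptr_t)entry & 3) == 2;
}

/* A sibling entry records the slot offset of the canonical entry. */
static void *sketch_mk_sibling(unsigned int offset)
{
	return sketch_mk_internal(offset);
}

/* Siblings are the internal entries whose payload is a valid slot offset. */
static bool sketch_is_sibling(const void *entry)
{
	return sketch_is_internal(entry) &&
	       (uintptr_t)entry <
	       (uintptr_t)sketch_mk_sibling(SKETCH_CHUNK_SIZE - 1);
}
```

    Under these assumptions a sibling pointing at slot 0 encodes to exactly
    2, so treating the loaded value as a folio pointer dereferences an
    address just above zero.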
    
    Link: https://lkml.kernel.org/r/1725373521-451395-1-git-send-email-steven.sistare@oracle.com
    Link: https://lkml.kernel.org/r/1725373521-451395-2-git-send-email-steven.sistare@oracle.com
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mm/gup: fix memfd_pin_folios alloc race panic [+ + +]
Author: Steve Sistare <steven.sistare@oracle.com>
Date:   Tue Sep 3 07:25:21 2024 -0700

    mm/gup: fix memfd_pin_folios alloc race panic
    
    commit ce645b9fdc78ec5d28067286e92871ddae6817d5 upstream.
    
    If memfd_pin_folios tries to create a hugetlb page, but someone else
    already did, then folio gets the value -EEXIST here:
    
            folio = memfd_alloc_folio(memfd, start_idx);
            if (IS_ERR(folio)) {
                    ret = PTR_ERR(folio);
                    if (ret != -EEXIST)
                            goto err;
    
    then on the next trip through the "while start_idx" loop we panic here:
    
            if (folio) {
                    folio_put(folio);
    
    To fix, set the folio to NULL on error.
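    
    The failure mode can be reproduced in a simplified userspace loop. This
    is only a sketch: the sketch_ helpers imitate the kernel's error-pointer
    convention, the two-pass loop is a heavily reduced stand-in for the
    retry logic in memfd_pin_folios, and none of the names below exist in
    the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

struct sketch_folio { int refcount; };

/* Encode a negative errno in a pointer, like the kernel's ERR_PTR(). */
static void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True for pointers in the top SKETCH_MAX_ERRNO bytes of the address space. */
static int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/*
 * Run two passes of a retry loop and count how often the "drop the old
 * reference" step would have been applied to an error pointer.
 */
static int sketch_pin_loop(int clear_on_error)
{
	struct sketch_folio real = { .refcount = 1 };
	struct sketch_folio *folio = NULL;
	int bad_puts = 0;

	for (int pass = 0; pass < 2; pass++) {
		if (folio) {
			if (sketch_is_err(folio))
				bad_puts++;        /* the kernel would crash here */
			else
				folio->refcount--;
		}
		/* Pass 0 loses the allocation race; pass 1 finds the page. */
		folio = pass == 0 ? sketch_err_ptr(-EEXIST) : &real;
		if (sketch_is_err(folio) && clear_on_error)
			folio = NULL;              /* the fix: forget the error value */
	}
	return bad_puts;
}
```

    sketch_pin_loop(0) reports one bad reference drop, matching the panic
    described above, while sketch_pin_loop(1) reports none.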
    
    Link: https://lkml.kernel.org/r/1725373521-451395-6-git-send-email-steven.sistare@oracle.com
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
    Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/gup: fix memfd_pin_folios hugetlb page allocation [+ + +]
Author: Steve Sistare <steven.sistare@oracle.com>
Date:   Tue Sep 3 07:25:20 2024 -0700

    mm/gup: fix memfd_pin_folios hugetlb page allocation
    
    commit 9289f020da47ef04b28865589eeee3d56d4bafea upstream.
    
    When memfd_pin_folios -> memfd_alloc_folio creates a hugetlb page, the
    index is wrong.  The subsequent call to filemap_get_folios_contig thus
    cannot find it, and fails, and memfd_pin_folios loops forever.  To fix,
    adjust the index for the huge_page_order.
    
    memfd_alloc_folio also forgets to unlock the folio, so the next touch of
    the page calls hugetlb_fault which blocks forever trying to take the lock.
    Unlock it.
    
    Link: https://lkml.kernel.org/r/1725373521-451395-5-git-send-email-steven.sistare@oracle.com
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
    Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mm/hugetlb: fix memfd_pin_folios free_huge_pages leak [+ + +]
Author: Steve Sistare <steven.sistare@oracle.com>
Date:   Tue Sep 3 07:25:18 2024 -0700

    mm/hugetlb: fix memfd_pin_folios free_huge_pages leak
    
    commit c56b6f3d801d7ec8965993342bdd9e2972b6cb8e upstream.
    
    memfd_pin_folios followed by unpin_folios fails to restore free_huge_pages
    if the pages were not already faulted in, because the folio refcount for
    pages created by memfd_alloc_folio never goes to 0.  memfd_pin_folios
    needs another folio_put to undo the folio_try_get below:
    
    memfd_alloc_folio()
      alloc_hugetlb_folio_nodemask()
        dequeue_hugetlb_folio_nodemask()
          dequeue_hugetlb_folio_node_exact()
            folio_ref_unfreeze(folio, 1);    ; adds 1 refcount
      folio_try_get()                        ; adds 1 refcount
      hugetlb_add_to_page_cache()            ; adds 512 refcount (on x86)
    
    With the fix, after memfd_pin_folios + unpin_folios, the refcount for the
    (unfaulted) page is 512, which is correct, as the refcount for a faulted
    unpinned page is 513.
    
    Link: https://lkml.kernel.org/r/1725373521-451395-3-git-send-email-steven.sistare@oracle.com
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
    Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/hugetlb: fix memfd_pin_folios resv_huge_pages leak [+ + +]
Author: Steve Sistare <steven.sistare@oracle.com>
Date:   Tue Sep 3 07:25:19 2024 -0700

    mm/hugetlb: fix memfd_pin_folios resv_huge_pages leak
    
    commit 26a8ea80929c518bdec5e53a5776f95919b7c88e upstream.
    
    memfd_pin_folios followed by unpin_folios leaves resv_huge_pages elevated
    if the pages were not already faulted in.  During a normal page fault,
    resv_huge_pages is consumed here:
    
    hugetlb_fault()
      alloc_hugetlb_folio()
        dequeue_hugetlb_folio_vma()
          dequeue_hugetlb_folio_nodemask()
            dequeue_hugetlb_folio_node_exact()
              free_huge_pages--
          resv_huge_pages--
    
    During memfd_pin_folios, the page is created by calling
    alloc_hugetlb_folio_nodemask instead of alloc_hugetlb_folio, and
    resv_huge_pages is not modified:
    
    memfd_alloc_folio()
      alloc_hugetlb_folio_nodemask()
        dequeue_hugetlb_folio_nodemask()
          dequeue_hugetlb_folio_node_exact()
            free_huge_pages--
    
    alloc_hugetlb_folio_nodemask has other callers that must not modify
    resv_huge_pages.  Therefore, to fix, define an alternate version of
    alloc_hugetlb_folio_nodemask for this call site that adjusts
    resv_huge_pages.
    
    Link: https://lkml.kernel.org/r/1725373521-451395-4-git-send-email-steven.sistare@oracle.com
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
    Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/hugetlb: simplify refs in memfd_alloc_folio [+ + +]
Author: Steve Sistare <steven.sistare@oracle.com>
Date:   Wed Sep 4 12:41:08 2024 -0700

    mm/hugetlb: simplify refs in memfd_alloc_folio
    
    commit dc677b5f3765cfd0944c8873d1ea57f1a3439676 upstream.
    
    The folio_try_get in memfd_alloc_folio is not necessary.  Delete it, and
    delete the matching folio_put in memfd_pin_folios.  This also avoids
    leaking a ref if the memfd_alloc_folio call to hugetlb_add_to_page_cache
    fails.  That error path is also broken in a second way -- when its
    folio_put causes the ref to become 0, it will implicitly call
    free_huge_folio, but then the path *explicitly* calls free_huge_folio.
    Delete the latter.
    
    This is a continuation of the fix
      "mm/hugetlb: fix memfd_pin_folios free_huge_pages leak"
    
    [steven.sistare@oracle.com: remove explicit call to free_huge_folio(), per Matthew]
      Link: https://lkml.kernel.org/r/Zti-7nPVMcGgpcbi@casper.infradead.org
      Link: https://lkml.kernel.org/r/1725481920-82506-1-git-send-email-steven.sistare@oracle.com
    Link: https://lkml.kernel.org/r/1725478868-61732-1-git-send-email-steven.sistare@oracle.com
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
    Suggested-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mm: krealloc: consider spare memory for __GFP_ZERO [+ + +]
Author: Danilo Krummrich <dakr@kernel.org>
Date:   Tue Aug 13 00:34:34 2024 +0200

    mm: krealloc: consider spare memory for __GFP_ZERO
    
    commit 1a83a716ec233990e1fd5b6fbb1200ade63bf450 upstream.
    
    As long as krealloc() is called with __GFP_ZERO consistently, starting
    with the initial memory allocation, __GFP_ZERO should be fully honored.
    
    However, if for an existing allocation krealloc() is called with a
    decreased size, it is not ensured that the spare portion of the allocation
    is zeroed.  Thus, if krealloc() is subsequently called with a larger size
    again, __GFP_ZERO can't be fully honored, since we don't know the previous
    size, but only the bucket size.
    
    Example:
    
            buf = kzalloc(64, GFP_KERNEL);
            memset(buf, 0xff, 64);
    
            buf = krealloc(buf, 48, GFP_KERNEL | __GFP_ZERO);
    
            /* After this call the last 16 bytes are still 0xff. */
            buf = krealloc(buf, 64, GFP_KERNEL | __GFP_ZERO);
    
    Fix this, by explicitly setting spare memory to zero, when shrinking an
    allocation with __GFP_ZERO flag set or init_on_alloc enabled.
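    
    The shrink-then-grow sequence from the example can be modelled in
    userspace as follows. This is a sketch rather than the slab code: the
    64-byte bucket, the in-place resize helper, and all sketch_ names are
    assumptions chosen to mirror the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUCKET_SIZE 64  /* hypothetical slab bucket backing the allocation */

/*
 * Stand-in for an in-place krealloc(): the buffer never moves because the
 * new size still fits in the same bucket. With zero_spare set, the bytes
 * beyond new_size are cleared, which is the behaviour the fix adds.
 */
static void sketch_resize_in_place(unsigned char *buf, size_t new_size,
				   int zero_spare)
{
	if (zero_spare && new_size < BUCKET_SIZE)
		memset(buf + new_size, 0, BUCKET_SIZE - new_size);
}

/* Count the non-zero bytes in the half-open range [from, to). */
static size_t sketch_count_nonzero(const unsigned char *buf, size_t from,
				   size_t to)
{
	size_t n = 0;

	for (size_t i = from; i < to; i++)
		if (buf[i] != 0)
			n++;
	return n;
}
```

    Filling a bucket with 0xff, shrinking it to 48 bytes and growing it back
    to 64 leaves 16 stale bytes without the spare-memory clearing and none
    with it; the first 48 bytes are untouched in both cases.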
    
    Link: https://lkml.kernel.org/r/20240812223707.32049-1-dakr@kernel.org
    Signed-off-by: Danilo Krummrich <dakr@kernel.org>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: David Rientjes <rientjes@google.com>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: Roman Gushchin <roman.gushchin@linux.dev>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: z3fold: deprecate CONFIG_Z3FOLD [+ + +]
Author: Yosry Ahmed <yosryahmed@google.com>
Date:   Mon Oct 7 19:21:16 2024 +0000

    mm: z3fold: deprecate CONFIG_Z3FOLD
    
    [ Upstream commit 7a2369b74abf76cd3e54c45b30f6addb497f831b ]
    
    The z3fold compressed pages allocator is rarely used; most users use
    zsmalloc.  The only disadvantage of zsmalloc in comparison is the
    dependency on MMU, and zbud is a more common option for !MMU as it was the
    default zswap allocator for a long time.
    
    Historically, zsmalloc had worse latency than zbud and z3fold but offered
    better memory savings.  This is no longer the case as shown by a simple
    recent analysis [1].  That analysis showed that z3fold does not have any
    advantage over zsmalloc or zbud considering both performance and memory
    usage.  In a kernel build test on tmpfs in a limited cgroup, z3fold took
    3% more time and used 1.8% more memory.  The latency of zswap_load() was
    7% higher, and that of zswap_store() was 10% higher.  Zsmalloc is better
    in all metrics.
    
    Moreover, z3fold apparently has latent bugs, which were made noticeable by
    a recent soft lockup bug report with z3fold [2].  Switching to zsmalloc
    not only fixed the problem, but also reduced the swap usage from 6~8G to
    1~2G.  Other users have also reported being bitten by mistakenly enabling
    z3fold.
    
    Other than hurting users, z3fold is repeatedly causing wasted engineering
    effort.  Apart from investigating the above bug, it came up in multiple
    development discussions (e.g.  [3]) as something we need to handle, when
    there aren't any legit users (at least not intentionally).
    
    The natural course of action is to deprecate z3fold, and remove in a few
    cycles if no objections are raised from active users.  Next on the list
    should be zbud, as it offers marginal latency gains at the cost of huge
    memory waste when compared to zsmalloc.  That one will need to wait until
    zsmalloc does not depend on MMU.
    
    Rename the user-visible config option from CONFIG_Z3FOLD to
    CONFIG_Z3FOLD_DEPRECATED so that users with CONFIG_Z3FOLD=y get a new
    prompt with explanation during make oldconfig.  Also, remove
    CONFIG_Z3FOLD=y from defconfigs.
    
    [1]https://lore.kernel.org/lkml/CAJD7tkbRF6od-2x_L8-A1QL3=2Ww13sCj4S3i4bNndqF+3+_Vg@mail.gmail.com/
    [2]https://lore.kernel.org/lkml/EF0ABD3E-A239-4111-A8AB-5C442E759CF3@gmail.com/
    [3]https://lore.kernel.org/lkml/CAJD7tkbnmeVugfunffSovJf9FAgy9rhBVt_tx=nxUveLUfqVsA@mail.gmail.com/
    
    [arnd@arndb.de: deprecate ZSWAP_ZPOOL_DEFAULT_Z3FOLD as well]
      Link: https://lkml.kernel.org/r/20240909202625.1054880-1-arnd@kernel.org
    Link: https://lkml.kernel.org/r/20240904233343.933462-1-yosryahmed@google.com
    Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Acked-by: Chris Down <chris@chrisdown.name>
    Acked-by: Nhat Pham <nphamcs@gmail.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Vitaly Wool <vitaly.wool@konsulko.com>
    Acked-by: Christoph Hellwig <hch@lst.de>
    Cc: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
    Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
    Cc: Huacai Chen <chenhuacai@kernel.org>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: WANG Xuerui <kernel@xen0n.name>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    (cherry picked from commit 7a2369b74abf76cd3e54c45b30f6addb497f831b)
    Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net/mlx5: Added cond_resched() to crdump collection [+ + +]
Author: Mohamed Khalfella <mkhalfella@purestorage.com>
Date:   Wed Sep 4 22:02:48 2024 -0600

    net/mlx5: Added cond_resched() to crdump collection
    
    [ Upstream commit ec793155894140df7421d25903de2e6bc12c695b ]
    
    Collecting crdump involves reading vsc registers from pci config space
    of mlx device, which can take a long time to complete. This might result
    in starving other threads waiting to run on the cpu.
    
    Numbers I got from testing ConnectX-5 Ex MCX516A-CDAT in the lab:
    
    - mlx5_vsc_gw_read_block_fast() was called with length = 1310716.
    - mlx5_vsc_gw_read_fast() reads 4 bytes at a time. It was not used to
      read the entire 1310716 bytes. It was called 53813 times because
      there are jumps in read_addr.
    - On average mlx5_vsc_gw_read_fast() took 35284.4ns.
    - In total mlx5_vsc_wait_on_flag() called vsc_read() 54707 times.
      The average time for each call was 17548.3ns. In some instances
      vsc_read() was called more than one time when the flag was not set.
      As expected the thread released the cpu after 16 iterations in
      mlx5_vsc_wait_on_flag().
    - Total time to read crdump was 35284.4ns * 53813 ~= 1.898s.
    
    It was seen in the field that crdump can take more than 5 seconds to
    complete. During that time mlx5_vsc_wait_on_flag() did not release the
    cpu because it did not complete 16 iterations. It is believed that pci
    config reads were slow. Adding cond_resched() every 128 register reads
    improves the situation. In the common case, where crdump takes ~1.8989s,
    the thread yields the cpu every ~4.51ms. If crdump takes ~5s, the thread
    yields the cpu every ~18.0ms.
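    
    The common-case figures follow from the lab measurements quoted above
    and can be re-derived with a few lines of arithmetic. The macro and
    function names are illustrative only; the inputs are the call count,
    the average read time, and the 128-read interval given in this message.

```c
#include <assert.h>

#define READ_COUNT      53813    /* calls to mlx5_vsc_gw_read_fast() */
#define AVG_READ_NS     35284.4  /* average duration of one call, in ns */
#define READS_PER_YIELD 128      /* register reads between cond_resched() */

/* Total time spent reading the crdump, in seconds. */
static double total_dump_seconds(void)
{
	return READ_COUNT * AVG_READ_NS / 1e9;
}

/* Time between two cond_resched() calls, in milliseconds. */
static double yield_interval_ms(double avg_read_ns)
{
	return READS_PER_YIELD * avg_read_ns / 1e6;
}
```

    With these inputs the total comes out just under 1.9 seconds and the
    yield interval at roughly 4.5 milliseconds, in line with the common-case
    numbers above.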
    
    Fixes: 8b9d8baae1de ("net/mlx5: Add Crdump support")
    Reviewed-by: Yuanyuan Zhong <yzhong@purestorage.com>
    Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
    Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5: Fix error path in multi-packet WQE transmit [+ + +]
Author: Gerd Bayer <gbayer@linux.ibm.com>
Date:   Tue Sep 10 10:53:51 2024 +0200

    net/mlx5: Fix error path in multi-packet WQE transmit
    
    [ Upstream commit 2bcae12c795f32ddfbf8c80d1b5f1d3286341c32 ]
    
    Remove the erroneous unmap in case no DMA mapping was established.
    
    The multi-packet WQE transmit code attempts to obtain a DMA mapping for
    the skb. This could fail, e.g. under memory pressure, when the IOMMU
    driver just can't allocate more memory for page tables. While the code
    tries to handle this in the path below the err_unmap label it erroneously
    unmaps one entry from the sq's FIFO list of active mappings. Since the
    current map attempt failed this unmap is removing some random DMA mapping
    that might still be required. If the PCI function now presents that IOVA,
    the IOMMU may assume a rogue DMA access and, e.g. on s390, put the PCI
    function in error state.
    
    The erroneous behavior was seen in a stress-test environment that created
    memory pressure.
    
    Fixes: 5af75c747e2a ("net/mlx5e: Enhanced TX MPWQE for SKBs")
    Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
    Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
    Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice [+ + +]
Author: Jianbo Liu <jianbol@nvidia.com>
Date:   Mon Sep 2 09:40:58 2024 +0300

    net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice
    
    [ Upstream commit 7b124695db40d5c9c5295a94ae928a8d67a01c3d ]
    
    The km.state is not checked in driver's delayed work. When
    xfrm_state_check_expire() is called, the state can be reset to
    XFRM_STATE_EXPIRED, even if it is XFRM_STATE_DEAD already. This
    happens when xfrm state is deleted, but not freed yet. As
    __xfrm_state_delete() is called again in xfrm timer, the following
    crash occurs.
    
    To fix this issue, skip xfrm_state_check_expire() if km.state is not
    XFRM_STATE_VALID.
    
     Oops: general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] SMP
     CPU: 5 UID: 0 PID: 7448 Comm: kworker/u102:2 Not tainted 6.11.0-rc2+ #1
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
     Workqueue: mlx5e_ipsec: eth%d mlx5e_ipsec_handle_sw_limits [mlx5_core]
     RIP: 0010:__xfrm_state_delete+0x3d/0x1b0
     Code: 0f 84 8b 01 00 00 48 89 fd c6 87 c8 00 00 00 05 48 8d bb 40 10 00 00 e8 11 04 1a 00 48 8b 95 b8 00 00 00 48 8b 85 c0 00 00 00 <48> 89 42 08 48 89 10 48 8b 55 10 48 b8 00 01 00 00 00 00 ad de 48
     RSP: 0018:ffff88885f945ec8 EFLAGS: 00010246
     RAX: dead000000000122 RBX: ffffffff82afa940 RCX: 0000000000000036
     RDX: dead000000000100 RSI: 0000000000000000 RDI: ffffffff82afb980
     RBP: ffff888109a20340 R08: ffff88885f945ea0 R09: 0000000000000000
     R10: 0000000000000000 R11: ffff88885f945ff8 R12: 0000000000000246
     R13: ffff888109a20340 R14: ffff88885f95f420 R15: ffff88885f95f400
     FS:  0000000000000000(0000) GS:ffff88885f940000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 00007f2163102430 CR3: 00000001128d6001 CR4: 0000000000370eb0
     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
     Call Trace:
      <IRQ>
      ? die_addr+0x33/0x90
      ? exc_general_protection+0x1a2/0x390
      ? asm_exc_general_protection+0x22/0x30
      ? __xfrm_state_delete+0x3d/0x1b0
      ? __xfrm_state_delete+0x2f/0x1b0
      xfrm_timer_handler+0x174/0x350
      ? __xfrm_state_delete+0x1b0/0x1b0
      __hrtimer_run_queues+0x121/0x270
      hrtimer_run_softirq+0x88/0xd0
      handle_softirqs+0xcc/0x270
      do_softirq+0x3c/0x50
      </IRQ>
      <TASK>
      __local_bh_enable_ip+0x47/0x50
      mlx5e_ipsec_handle_sw_limits+0x7d/0x90 [mlx5_core]
      process_one_work+0x137/0x2d0
      worker_thread+0x28d/0x3a0
      ? rescuer_thread+0x480/0x480
      kthread+0xb8/0xe0
      ? kthread_park+0x80/0x80
      ret_from_fork+0x2d/0x50
      ? kthread_park+0x80/0x80
      ret_from_fork_asm+0x11/0x20
      </TASK>
    
    Fixes: b2f7b01d36a9 ("net/mlx5e: Simulate missing IPsec TX limits hardware functionality")
    Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
    Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc() [+ + +]
Author: Elena Salomatkina <esalomatkina@ispras.ru>
Date:   Tue Sep 24 19:00:18 2024 +0300

    net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc()
    
    [ Upstream commit f25389e779500cf4a59ef9804534237841bce536 ]
    
    In mlx5e_tir_builder_alloc() kvzalloc() may return NULL
    which is dereferenced on the next line in a reference
    to the modify field.
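
    A minimal userspace sketch of the pattern this fix restores: check the
    allocation result before touching any field. The struct layout, the
    64-byte buffer and the function names are hypothetical stand-ins, and
    calloc() stands in for kvzalloc(); this is not the driver code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the TIR builder; fields are illustrative. */
struct tir_builder {
    void *in;     /* command buffer */
    int modify;   /* nonzero when building a modify command */
};

/* Returns NULL on allocation failure instead of dereferencing NULL. */
struct tir_builder *tir_builder_alloc(int modify)
{
    struct tir_builder *builder = calloc(1, sizeof(*builder));

    if (!builder)
        return NULL;    /* check before touching builder->modify */

    builder->modify = modify;
    builder->in = calloc(1, 64);    /* arbitrary size for this sketch */
    if (!builder->in) {
        free(builder);
        return NULL;
    }
    return builder;
}

/* Releases a builder; accepts NULL so error paths stay simple. */
void tir_builder_free(struct tir_builder *builder)
{
    if (!builder)
        return;
    free(builder->in);
    free(builder);
}
```

    Callers treat a NULL return as an out-of-memory error and bail out
    before using the builder.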
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: a6696735d694 ("net/mlx5e: Convert TIR to a dedicated object")
    Signed-off-by: Elena Salomatkina <esalomatkina@ispras.ru>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
    Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
    Reviewed-by: Gal Pressman <gal@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5e: SHAMPO, Fix overflow of hd_per_wq [+ + +]
Author: Dragos Tatulea <dtatulea@nvidia.com>
Date:   Tue Aug 13 13:34:54 2024 +0300

    net/mlx5e: SHAMPO, Fix overflow of hd_per_wq
    
    [ Upstream commit 023d2a43ed0d9ab73d4a35757121e4c8e01298e5 ]
    
    With larger RQ sizes and small MTU sizes, the hd_per_wq variable
    can overflow, as in the following case:
    
    $> ethtool --set-ring eth1 rx 8192
    $> ip link set dev eth1 mtu 144
    $> ethtool --features eth1 rx-gro-hw on
    
    ... yields in dmesg:
    
    mlx5_core 0000:08:00.1: mlx5_cmd_out_err:808:(pid 194797): CREATE_MKEY(0x200) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x3bf6f), err(-22)
    
    because hd_per_wq is 64K which overflows to 0 and makes the command
    fail.
    
    This patch increases the variable size to 32 bit.
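
    The wraparound can be reproduced with plain integers. The sketch below
    assumes the old field was 16 bits wide, as the text implies; the helper
    names are hypothetical and only show how 64K is stored in each width.

```c
#include <stdint.h>

/* Stores a count in a 16-bit field: 65536 does not fit and wraps to 0. */
uint16_t store_hd_per_wq_u16(uint32_t count)
{
    return (uint16_t)count;
}

/* Stores the same count in a 32-bit field: the value is preserved. */
uint32_t store_hd_per_wq_u32(uint32_t count)
{
    return count;
}
```

    With a count of 65536, the 16-bit variant yields 0 while the 32-bit
    variant keeps 65536.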
    
    Fixes: 99be56171fa9 ("net/mlx5e: SHAMPO, Re-enable HW-GRO")
    Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
    Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net/ncsi: Disable the ncsi work before freeing the associated structure [+ + +]
Author: Eddie James <eajames@linux.ibm.com>
Date:   Wed Sep 25 10:55:23 2024 -0500

    net/ncsi: Disable the ncsi work before freeing the associated structure
    
    [ Upstream commit a0ffa68c70b367358b2672cdab6fa5bc4c40de2c ]
    
    The work function can run after the ncsi device is freed, resulting
    in use-after-free bugs or kernel panic.
    
    Fixes: 2d283bdd079c ("net/ncsi: Resource management")
    Signed-off-by: Eddie James <eajames@linux.ibm.com>
    Link: https://patch.msgid.link/20240925155523.1017097-1-eajames@linux.ibm.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net/xen-netback: prevent UAF in xenvif_flush_hash() [+ + +]
Author: Jeongjun Park <aha310510@gmail.com>
Date:   Fri Aug 23 03:11:09 2024 +0900

    net/xen-netback: prevent UAF in xenvif_flush_hash()
    
    [ Upstream commit 0fa5e94a1811d68fbffa0725efe6d4ca62c03d12 ]
    
    During the list_for_each_entry_rcu iteration call of xenvif_flush_hash,
    kfree_rcu does not exist inside the rcu read critical section, so if
    kfree_rcu is called when the rcu grace period ends during the iteration,
    UAF occurs when accessing head->next after the entry becomes free.
    
    Therefore, switch the iteration to list_for_each_entry_safe.
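
    The idea behind the "safe" iterator can be sketched in userspace with a
    plain singly linked list: the successor is saved before the current
    entry is freed, so the loop never reads freed memory. The types and
    function names are hypothetical; this is not the kernel list_head or
    RCU API.

```c
#include <stdlib.h>

struct entry {
    struct entry *next;
    int tag;
};

/* Pushes a new entry onto the list; returns 0 on success, -1 on failure. */
int cache_add(struct entry **head, int tag)
{
    struct entry *e = malloc(sizeof(*e));

    if (!e)
        return -1;
    e->tag = tag;
    e->next = *head;
    *head = e;
    return 0;
}

/* Frees every entry and returns how many were freed. */
unsigned int cache_flush(struct entry **head)
{
    struct entry *pos = *head;
    struct entry *n;
    unsigned int freed = 0;

    while (pos) {
        n = pos->next;    /* save the successor before freeing pos */
        free(pos);
        pos = n;
        freed++;
    }
    *head = NULL;
    return freed;
}
```

    Reading pos->next after free(pos) would be the userspace analogue of
    the use-after-free described above.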
    
    Signed-off-by: Jeongjun Park <aha310510@gmail.com>
    Link: https://patch.msgid.link/20240822181109.2577354-1-aha310510@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net: add more sanity checks to qdisc_pkt_len_init() [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Sep 24 15:02:57 2024 +0000

    net: add more sanity checks to qdisc_pkt_len_init()
    
    [ Upstream commit ab9a9a9e9647392a19e7a885b08000e89c86b535 ]
    
    One path takes care of SKB_GSO_DODGY, assuming
    skb->len is bigger than hdr_len.
    
    virtio_net_hdr_to_skb() does not fully dissect TCP headers,
    it only makes sure the header is at least 20 bytes.
    
    It is possible for a user to provide a malicious 'GSO' packet
    with a total length of 80 bytes:
    
    - 20 bytes of IPv4 header
    - 60 bytes TCP header
    - a small gso_size like 8
    
    virtio_net_hdr_to_skb() would declare this packet as a normal
    GSO packet, because it would see 40 bytes of payload,
    bigger than gso_size.
    
    We need to detect this case to avoid underflowing
    qdisc_skb_cb(skb)->pkt_len.
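
    The mismatch can be worked through with the numbers from the example.
    The helpers below are a hypothetical sketch, not kernel code: one
    computes the payload assuming the minimum 20-byte TCP header, the other
    uses the real header length.

```c
#define IPV4_HDR_LEN    20u
#define TCP_MIN_HDR_LEN 20u

/* Payload as seen by a check that assumes a 20-byte TCP header. */
int payload_assuming_min_tcp(unsigned int skb_len)
{
    return (int)skb_len - (int)(IPV4_HDR_LEN + TCP_MIN_HDR_LEN);
}

/* Payload once the real TCP header length is taken into account. */
int payload_with_real_tcp(unsigned int skb_len, unsigned int tcp_hdr_len)
{
    return (int)skb_len - (int)(IPV4_HDR_LEN + tcp_hdr_len);
}

/* A packet only qualifies for GSO if payload remains after the headers. */
int has_gso_payload(unsigned int skb_len, unsigned int tcp_hdr_len)
{
    return payload_with_real_tcp(skb_len, tcp_hdr_len) > 0;
}
```

    For the 80-byte packet, the first helper reports 40 bytes of payload
    (larger than gso_size 8), while the second reports 0 once the 60-byte
    TCP header is counted.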
    
    Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: Add netif_get_gro_max_size helper for GRO [+ + +]
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Mon Sep 23 23:22:41 2024 +0200

    net: Add netif_get_gro_max_size helper for GRO
    
    [ Upstream commit e8d4d34df715133c319fabcf63fdec684be75ff8 ]
    
    Add a small netif_get_gro_max_size() helper which returns the maximum IPv4
    or IPv6 GRO size of the netdevice.
    
    We later add a netif_get_gso_max_size() equivalent as well for GSO, so that
    these helpers can be used consistently instead of open-coded checks.
    
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Cc: Eric Dumazet <edumazet@google.com>
    Cc: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/20240923212242.15669-1-daniel@iogearbox.net
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Stable-dep-of: e609c959a939 ("net: Fix gso_features_check to check for both dev->gso_{ipv4_,}max_size")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: atlantic: Avoid warning about potential string truncation [+ + +]
Author: Simon Horman <horms@kernel.org>
Date:   Wed Aug 21 16:58:57 2024 +0100

    net: atlantic: Avoid warning about potential string truncation
    
    [ Upstream commit 5874e0c9f25661c2faefe4809907166defae3d7f ]
    
    W=1 builds with GCC 14.2.0 warn that:
    
    .../aq_ethtool.c:278:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 6 [-Wformat-truncation=]
      278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
          |                                                           ^~
    .../aq_ethtool.c:278:56: note: directive argument in the range [-2147483641, 254]
      278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
          |                                                        ^~~~~~~
    .../aq_ethtool.c:278:33: note: ‘snprintf’ output between 5 and 15 bytes into a destination of size 8
      278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
          |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    tc is always in the range 0 - cfg->tcs. And as cfg->tcs is a u8,
    the range is 0 - 255. Further, on inspecting the code, it seems
    that cfg->tcs will never be more than AQ_CFG_TCS_MAX (8), so
    the range is actually 0 - 8.
    
    So, it seems that the condition that GCC flags will not occur.
    But, nonetheless, it would be nice if it didn't emit the warning.
    
    It seems that this can be achieved by changing the format specifier
    from %d to %u, in which case I believe GCC recognises an upper bound
    on the range of tc of 0 - 255. After some experimentation I think
    this is due to the combination of the use of %u and the type of
    cfg->tcs (u8).
    
    Empirically, updating the type of the tc variable to unsigned int
    has the same effect.
    
    As both of these changes seem to make sense in relation to what the code
    is actually doing - iterating over unsigned values - do both.
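
    A small userspace sketch of the resulting call, assuming an 8-byte
    destination as in the warning above. The wrapper name is hypothetical.
    With an unsigned value no larger than 255, the longest output is
    "TC255 " (6 characters plus the terminating NUL), which fits.

```c
#include <stdio.h>

/* Formats "TC<n> " and returns the length snprintf() wanted to write. */
int format_tc(char *buf, size_t size, unsigned int tc)
{
    return snprintf(buf, size, "TC%u ", tc);
}
```

    A return value smaller than the buffer size means nothing was
    truncated.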
    
    Compile tested only.
    
    Signed-off-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20240821-atlantic-str-v1-1-fa2cfe38ca00@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: avoid potential underflow in qdisc_pkt_len_init() with UFO [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Sep 24 15:02:56 2024 +0000

    net: avoid potential underflow in qdisc_pkt_len_init() with UFO
    
    [ Upstream commit c20029db28399ecc50e556964eaba75c43b1e2f1 ]
    
    After commit 7c6d2ecbda83 ("net: be more gentle about silly gso
    requests coming from user") virtio_net_hdr_to_skb() had sanity check
    to detect malicious attempts from user space to cook a bad GSO packet.
    
    Then commit cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count
    transport header in UFO") while fixing one issue, allowed user space
    to cook a GSO packet with the following characteristic :
    
    IPv4 SKB_GSO_UDP, gso_size=3, skb->len = 28.
    
    When this packet arrives in qdisc_pkt_len_init(), we end up
    with hdr_len = 28 (IPv4 header + UDP header), matching skb->len
    
    Then the following sets gso_segs to 0 :
    
    gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
                            shinfo->gso_size);
    
    Then later we set qdisc_skb_cb(skb)->pkt_len back to zero :/
    
    qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
    
    This leads to the following crash in fq_codel [1]
    
    qdisc_pkt_len_init() is best effort, we only want an estimation
    of the bytes sent on the wire, not crashing the kernel.
    
    This patch is fixing this particular issue, a following one
    adds more sanity checks for another potential bug.
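
    The arithmetic above can be replayed with plain unsigned integers. This
    is a userspace sketch assuming a 32-bit unsigned int; the function
    names are hypothetical, and the guard in the second function only
    illustrates the idea and is not the upstream diff.

```c
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Unguarded computation: wraps to 0 when skb_len equals hdr_len. */
unsigned int pkt_len_unguarded(unsigned int skb_len, unsigned int hdr_len,
                               unsigned int gso_size)
{
    unsigned int pkt_len = skb_len;
    unsigned int gso_segs = DIV_ROUND_UP(skb_len - hdr_len, gso_size);

    /* gso_segs == 0 makes (gso_segs - 1) * hdr_len wrap around. */
    pkt_len += (gso_segs - 1) * hdr_len;
    return pkt_len;
}

/* Illustrative guard: keep the initial estimate if there is no payload. */
unsigned int pkt_len_guarded(unsigned int skb_len, unsigned int hdr_len,
                             unsigned int gso_size)
{
    unsigned int pkt_len = skb_len;
    unsigned int gso_segs;

    if (skb_len <= hdr_len)
        return pkt_len;

    gso_segs = DIV_ROUND_UP(skb_len - hdr_len, gso_size);
    pkt_len += (gso_segs - 1) * hdr_len;
    return pkt_len;
}
```

    With skb_len = 28, hdr_len = 28 and gso_size = 3, the unguarded version
    returns 0 while the guarded one keeps 28.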
    
    [1]
    [   70.724101] BUG: kernel NULL pointer dereference, address: 0000000000000000
    [   70.724561] #PF: supervisor read access in kernel mode
    [   70.724561] #PF: error_code(0x0000) - not-present page
    [   70.724561] PGD 10ac61067 P4D 10ac61067 PUD 107ee2067 PMD 0
    [   70.724561] Oops: Oops: 0000 [#1] SMP NOPTI
    [   70.724561] CPU: 11 UID: 0 PID: 2163 Comm: b358537762 Not tainted 6.11.0-virtme #991
    [   70.724561] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
    [   70.724561] RIP: 0010:fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
    [ 70.724561] Code: 24 08 49 c1 e1 06 44 89 7c 24 18 45 31 ed 45 31 c0 31 ff 89 44 24 14 4c 03 8b 90 01 00 00 eb 04 39 ca 73 37 4d 8b 39 83 c7 01 <49> 8b 17 49 89 11 41 8b 57 28 45 8b 5f 34 49 c7 07 00 00 00 00 49
    All code
    ========
       0:   24 08                   and    $0x8,%al
       2:   49 c1 e1 06             shl    $0x6,%r9
       6:   44 89 7c 24 18          mov    %r15d,0x18(%rsp)
       b:   45 31 ed                xor    %r13d,%r13d
       e:   45 31 c0                xor    %r8d,%r8d
      11:   31 ff                   xor    %edi,%edi
      13:   89 44 24 14             mov    %eax,0x14(%rsp)
      17:   4c 03 8b 90 01 00 00    add    0x190(%rbx),%r9
      1e:   eb 04                   jmp    0x24
      20:   39 ca                   cmp    %ecx,%edx
      22:   73 37                   jae    0x5b
      24:   4d 8b 39                mov    (%r9),%r15
      27:   83 c7 01                add    $0x1,%edi
      2a:*  49 8b 17                mov    (%r15),%rdx              <-- trapping instruction
      2d:   49 89 11                mov    %rdx,(%r9)
      30:   41 8b 57 28             mov    0x28(%r15),%edx
      34:   45 8b 5f 34             mov    0x34(%r15),%r11d
      38:   49 c7 07 00 00 00 00    movq   $0x0,(%r15)
      3f:   49                      rex.WB
    
    Code starting with the faulting instruction
    ===========================================
       0:   49 8b 17                mov    (%r15),%rdx
       3:   49 89 11                mov    %rdx,(%r9)
       6:   41 8b 57 28             mov    0x28(%r15),%edx
       a:   45 8b 5f 34             mov    0x34(%r15),%r11d
       e:   49 c7 07 00 00 00 00    movq   $0x0,(%r15)
      15:   49                      rex.WB
    [   70.724561] RSP: 0018:ffff95ae85e6fb90 EFLAGS: 00000202
    [   70.724561] RAX: 0000000002000000 RBX: ffff95ae841de000 RCX: 0000000000000000
    [   70.724561] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
    [   70.724561] RBP: ffff95ae85e6fbf8 R08: 0000000000000000 R09: ffff95b710a30000
    [   70.724561] R10: 0000000000000000 R11: bdf289445ce31881 R12: ffff95ae85e6fc58
    [   70.724561] R13: 0000000000000000 R14: 0000000000000040 R15: 0000000000000000
    [   70.724561] FS:  000000002c5c1380(0000) GS:ffff95bd7fcc0000(0000) knlGS:0000000000000000
    [   70.724561] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   70.724561] CR2: 0000000000000000 CR3: 000000010c568000 CR4: 00000000000006f0
    [   70.724561] Call Trace:
    [   70.724561]  <TASK>
    [   70.724561] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
    [   70.724561] ? page_fault_oops (arch/x86/mm/fault.c:715)
    [   70.724561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
    [   70.724561] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
    [   70.724561] ? fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
    [   70.724561] dev_qdisc_enqueue (net/core/dev.c:3784)
    [   70.724561] __dev_queue_xmit (net/core/dev.c:3880 (discriminator 2) net/core/dev.c:4390 (discriminator 2))
    [   70.724561] ? irqentry_enter (kernel/entry/common.c:237)
    [   70.724561] ? sysvec_apic_timer_interrupt (./arch/x86/include/asm/hardirq.h:74 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2))
    [   70.724561] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
    [   70.724561] ? asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702)
    [   70.724561] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/virtio_net.h:129 (discriminator 1))
    [   70.724561] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
    [   70.724561] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
    [   70.724561] ? netdev_name_node_lookup_rcu (net/core/dev.c:325 (discriminator 1))
    [   70.724561] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
    [   70.724561] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
    [   70.724561] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
    [   70.724561] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
    [   70.724561] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
    [   70.724561] RIP: 0033:0x41ae09
    
    Fixes: cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count transport header in UFO")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Jonathan Davies <jonathan.davies@nutanix.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Reviewed-by: Jonathan Davies <jonathan.davies@nutanix.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dsa: improve shutdown sequence [+ + +]
Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Fri Sep 13 23:35:49 2024 +0300

    net: dsa: improve shutdown sequence
    
    [ Upstream commit 6c24a03a61a245fe34d47582898331fa034b6ccd ]
    
    Alexander Sverdlin presents 2 problems during shutdown with the
    lan9303 driver. One is specific to lan9303 and the other just happens
    to reproduce there.
    
    The first problem is that lan9303 is unique among DSA drivers in that it
    calls dev_get_drvdata() at "arbitrary runtime" (not probe, not shutdown,
    not remove):
    
    phy_state_machine()
    -> ...
       -> dsa_user_phy_read()
          -> ds->ops->phy_read()
             -> lan9303_phy_read()
                -> chip->ops->phy_read()
                   -> lan9303_mdio_phy_read()
                      -> dev_get_drvdata()
    
    But we never stop the phy_state_machine(), so it may continue to run
    after dsa_switch_shutdown(). Our common pattern in all DSA drivers is
    to set drvdata to NULL to suppress the remove() method that may come
    afterwards. But in this case it will result in an NPD.
    
    The second problem is that the way in which we set
    dp->conduit->dsa_ptr = NULL; is concurrent with receive packet
    processing. dsa_switch_rcv() checks once whether dev->dsa_ptr is NULL,
    but afterwards, rather than continuing to use that non-NULL value,
    dev->dsa_ptr is dereferenced again and again without NULL checks:
    dsa_conduit_find_user() and many other places. In between dereferences,
    there is no locking to ensure that what was valid once continues to be
    valid.
    
    Both problems have the common aspect that closing the conduit interface
    solves them.
    
    In the first case, dev_close(conduit) triggers the NETDEV_GOING_DOWN
    event in dsa_user_netdevice_event() which closes user ports as well.
    dsa_port_disable_rt() calls phylink_stop(), which synchronously stops
    the phylink state machine, and ds->ops->phy_read() will thus no longer
    call into the driver after this point.
    
    In the second case, dev_close(conduit) should do this, as per
    Documentation/networking/driver.rst:
    
    | Quiescence
    | ----------
    |
    | After the ndo_stop routine has been called, the hardware must
    | not receive or transmit any data.  All in flight packets must
    | be aborted. If necessary, poll or wait for completion of
    | any reset commands.
    
    So it should be sufficient to ensure that later, when we zeroize
    conduit->dsa_ptr, there will be no concurrent dsa_switch_rcv() call
    on this conduit.
    
    The addition of the netif_device_detach() function is to ensure that
    ioctls, rtnetlinks and ethtool requests on the user ports no longer
    propagate down to the driver - we're no longer prepared to handle them.
    
    The race condition actually did not exist when commit 0650bf52b31f
    ("net: dsa: be compatible with masters which unregister on shutdown")
    first introduced dsa_switch_shutdown(). It was created later, when we
    stopped unregistering the user interfaces from a bad spot, and we just
    replaced that sequence with a racy zeroization of conduit->dsa_ptr
    (one which doesn't ensure that the interfaces aren't up).
    
    Reported-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
    Closes: https://lore.kernel.org/netdev/2d2e3bba17203c14a5ffdabc174e3b6bbb9ad438.camel@siemens.com/
    Closes: https://lore.kernel.org/netdev/c1bf4de54e829111e0e4a70e7bd1cf523c9550ff.camel@siemens.com/
    Fixes: ee534378f005 ("net: dsa: fix panic when DSA master device unbinds on shutdown")
    Reviewed-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
    Tested-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Link: https://patch.msgid.link/20240913203549.3081071-1-vladimir.oltean@nxp.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ethernet: lantiq_etop: fix memory disclosure [+ + +]
Author: Aleksander Jan Bajkowski <olek2@wp.pl>
Date:   Mon Sep 23 23:49:49 2024 +0200

    net: ethernet: lantiq_etop: fix memory disclosure
    
    [ Upstream commit 45c0de18ff2dc9af01236380404bbd6a46502c69 ]
    
    When applying padding, the buffer is not zeroed, which results in memory
    disclosure. The mentioned data is observed on the wire. This patch uses
    skb_put_padto() to pad Ethernet frames properly. The mentioned function
    zeroes the expanded buffer.
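
    The effect of zero-padding can be sketched in userspace. pad_frame() is
    a hypothetical helper, not skb_put_padto(): it extends a short frame to
    the 60-byte Ethernet minimum and clears the added bytes so stale buffer
    contents never reach the wire.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ZLEN 60    /* minimum Ethernet frame length without FCS */

/*
 * Pads the frame in place. Returns the new length, or 0 when the buffer
 * is too small to hold a minimum-size frame (the caller drops the frame).
 */
size_t pad_frame(uint8_t *buf, size_t len, size_t cap)
{
    if (len >= ETH_ZLEN)
        return len;
    if (cap < ETH_ZLEN)
        return 0;
    memset(buf + len, 0, ETH_ZLEN - len);    /* zero only the padding */
    return ETH_ZLEN;
}
```

    Only the bytes between the original length and ETH_ZLEN are written,
    so the frame data itself is left untouched.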
    
    In case the packet cannot be padded it is silently dropped. Statistics
    are also not incremented. This driver does not support statistics in the
    old 32-bit format or the new 64-bit format. These will be added in the
    future. In its current form, the patch should be easily backported to
    stable versions.
    
    Ethernet MACs on Amazon-SE and Danube cannot do padding of the packets
    in hardware, so software padding must be applied.
    
    Fixes: 504d4721ee8e ("MIPS: Lantiq: Add ethernet driver")
    Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
    Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
    Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Link: https://patch.msgid.link/20240923214949.231511-2-olek2@wp.pl
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: fec: Reload PTP registers after link-state change [+ + +]
Author: Csókás, Bence <csokas.bence@prolan.hu>
Date:   Tue Sep 24 11:37:06 2024 +0200

    net: fec: Reload PTP registers after link-state change
    
    [ Upstream commit d9335d0232d2da605585eea1518ac6733518f938 ]
    
    On link-state change, the controller gets reset,
    which clears all PTP registers, including PHC time,
    calibrated clock correction values etc. For correct
    IEEE 1588 operation we need to restore these after
    the reset.
    
    Fixes: 6605b730c061 ("FEC: Add time stamping code and a PTP hardware clock")
    Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu>
    Reviewed-by: Wei Fang <wei.fang@nxp.com>
    Link: https://patch.msgid.link/20240924093705.2897329-2-csokas.bence@prolan.hu
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: fec: Restart PPS after link state change [+ + +]
Author: Csókás, Bence <csokas.bence@prolan.hu>
Date:   Tue Sep 24 11:37:04 2024 +0200

    net: fec: Restart PPS after link state change
    
    [ Upstream commit a1477dc87dc4996dcf65a4893d4e2c3a6b593002 ]
    
    On link state change, the controller gets reset,
    causing PPS to drop out. Re-enable PPS if it was
    enabled before the controller reset.
    
    Fixes: 6605b730c061 ("FEC: Add time stamping code and a PTP hardware clock")
    Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu>
    Link: https://patch.msgid.link/20240924093705.2897329-1-csokas.bence@prolan.hu
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: Fix gso_features_check to check for both dev->gso_{ipv4_,}max_size [+ + +]
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Mon Sep 23 23:22:42 2024 +0200

    net: Fix gso_features_check to check for both dev->gso_{ipv4_,}max_size
    
    [ Upstream commit e609c959a939660c7519895f853dfa5624c6827a ]
    
    Commit 24ab059d2ebd ("net: check dev->gso_max_size in gso_features_check()")
    added a dev->gso_max_size test to gso_features_check() in order to fall
    back to GSO when needed.
    
    This was added as it was noticed that some drivers could misbehave if TSO
    packets get too big. However, the check doesn't respect dev->gso_ipv4_max_size
    limit. For instance, a device could be configured with BIG TCP for IPv4,
    but not IPv6.
    
    Therefore, add a netif_get_gso_max_size() equivalent to netif_get_gro_max_size()
    and use the helper to respect both limits before falling back to GSO engine.
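
    A trimmed-down userspace sketch of the per-protocol lookup. The struct,
    the enum and the function names are hypothetical stand-ins for
    net_device, skb->protocol and the kernel helpers; the mapping of
    protocol to field is an assumption for illustration.

```c
#include <stdbool.h>

enum l3_proto { L3_IPV4, L3_IPV6 };

/* Hypothetical subset of the device limits discussed above. */
struct dev_limits {
    unsigned int gso_max_size;         /* used for IPv6 in this sketch */
    unsigned int gso_ipv4_max_size;    /* used for IPv4 in this sketch */
};

/* Picks the limit that applies to the packet's protocol. */
unsigned int get_gso_max_size(const struct dev_limits *dev,
                              enum l3_proto proto)
{
    return proto == L3_IPV6 ? dev->gso_max_size : dev->gso_ipv4_max_size;
}

/* True when the packet is too big for the device and needs software GSO. */
bool needs_sw_gso(const struct dev_limits *dev, enum l3_proto proto,
                  unsigned int len)
{
    return len >= get_gso_max_size(dev, proto);
}
```

    For a device with BIG TCP enabled for IPv4 only, a 100000-byte IPv4
    packet stays within its limit, while an IPv6 packet of the same size
    falls back to software segmentation.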
    
    Fixes: 24ab059d2ebd ("net: check dev->gso_max_size in gso_features_check()")
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Cc: Eric Dumazet <edumazet@google.com>
    Cc: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/20240923212242.15669-2-daniel@iogearbox.net
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: gso: fix tcp fraglist segmentation after pull from frag_list [+ + +]
Author: Felix Fietkau <nbd@nbd.name>
Date:   Thu Sep 26 10:53:14 2024 +0200

    net: gso: fix tcp fraglist segmentation after pull from frag_list
    
    commit 17bd3bd82f9f79f3feba15476c2b2c95a9b11ff8 upstream.
    
    Detect tcp gso fraglist skbs with corrupted geometry (see below) and
    pass these to skb_segment instead of skb_segment_list, as the first
    can segment them correctly.
    
    Valid SKB_GSO_FRAGLIST skbs
    - consist of two or more segments
    - the head_skb holds the protocol headers plus first gso_size
    - one or more frag_list skbs hold exactly one segment
    - all but the last must be gso_size
    
    Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
    modify these skbs, breaking these invariants.
    
    In extreme cases they pull all data into skb linear. For TCP, this
    causes a NULL ptr deref in __tcpv4_gso_segment_list_csum at
    tcp_hdr(seg->next).
    
    Detect invalid geometry due to pull, by checking head_skb size.
    Don't just drop, as this may blackhole a destination. Convert to be
    able to pass to regular skb_segment.
    
    Approach and description based on a patch by Willem de Bruijn.
    
    Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
    Link: https://lore.kernel.org/netdev/20240922150450.3873767-1-willemdebruijn.kernel@gmail.com/
    Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets")
    Cc: stable@vger.kernel.org
    Signed-off-by: Felix Fietkau <nbd@nbd.name>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Link: https://patch.msgid.link/20240926085315.51524-1-nbd@nbd.name
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: hisilicon: hip04: fix OF node leak in probe() [+ + +]
Author: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Date:   Tue Aug 27 16:44:19 2024 +0200

    net: hisilicon: hip04: fix OF node leak in probe()
    
    [ Upstream commit 17555297dbd5bccc93a01516117547e26a61caf1 ]
    
    Driver is leaking OF node reference from
    of_parse_phandle_with_fixed_args() in probe().
    
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20240827144421.52852-2-krzysztof.kozlowski@linaro.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info() [+ + +]
Author: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Date:   Tue Aug 27 16:44:20 2024 +0200

    net: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info()
    
    [ Upstream commit 5680cf8d34e1552df987e2f4bb1bff0b2a8c8b11 ]
    
    Driver is leaking OF node reference from
    of_parse_phandle_with_fixed_args() in hns_mac_get_info().
    
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20240827144421.52852-3-krzysztof.kozlowski@linaro.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hisilicon: hns_mdio: fix OF node leak in probe() [+ + +]
Author: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Date:   Tue Aug 27 16:44:21 2024 +0200

    net: hisilicon: hns_mdio: fix OF node leak in probe()
    
    [ Upstream commit e62beddc45f487b9969821fad3a0913d9bc18a2f ]
    
    Driver is leaking OF node reference from
    of_parse_phandle_with_fixed_args() in probe().
    
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20240827144421.52852-4-krzysztof.kozlowski@linaro.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ieee802154: mcr20a: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]
Author: Jinjie Ruan <ruanjinjie@huawei.com>
Date:   Wed Sep 11 17:42:34 2024 +0800

    net: ieee802154: mcr20a: Use IRQF_NO_AUTOEN flag in request_irq()
    
    [ Upstream commit 09573b1cc76e7ff8f056ab29ea1cdc152ec8c653 ]
    
    Calling disable_irq() after request_irq() still leaves a time gap in
    which interrupts can arrive. Passing the IRQF_NO_AUTOEN flag to
    request_irq() prevents the IRQ from being auto-enabled when it is
    requested.
    
    Fixes: 8c6ad9cc5157 ("ieee802154: Add NXP MCR20A IEEE 802.15.4 transceiver driver")
    Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Link: https://lore.kernel.org/20240911094234.1922418-1-ruanjinjie@huawei.com
    Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: mvpp2: Increase size of queue_name buffer [+ + +]
Author: Simon Horman <horms@kernel.org>
Date:   Tue Aug 6 12:28:24 2024 +0100

    net: mvpp2: Increase size of queue_name buffer
    
    [ Upstream commit 91d516d4de48532d967a77967834e00c8c53dfe6 ]
    
    Increase size of queue_name buffer from 30 to 31 to accommodate
    the largest string written to it. This avoids truncation in
    the possibly unlikely case where the name is the
    maximum size.
    
    Flagged by gcc-14:
    
      .../mvpp2_main.c: In function 'mvpp2_probe':
      .../mvpp2_main.c:7636:32: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=]
       7636 |                  "stats-wq-%s%s", netdev_name(priv->port_list[0]->dev),
            |                                ^
      .../mvpp2_main.c:7635:9: note: 'snprintf' output between 10 and 31 bytes into a destination of size 30
       7635 |         snprintf(priv->queue_name, sizeof(priv->queue_name),
            |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       7636 |                  "stats-wq-%s%s", netdev_name(priv->port_list[0]->dev),
            |                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       7637 |                  priv->port_count > 1 ? "+" : "");
            |                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    Introduced by commit 118d6298f6f0 ("net: mvpp2: add ethtool GOP statistics").
    I am not flagging this as a bug as I am not aware that it is one.
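
    The size arithmetic can be checked in userspace through snprintf()'s
    return value. The wrapper name is hypothetical, and the 20-character
    name used in the usage note is an assumption chosen to reach GCC's
    31-byte upper bound (9 + 20 + 1 characters plus the terminating NUL).

```c
#include <stdio.h>
#include <stdbool.h>

/* Formats the workqueue name; returns true if the output was truncated. */
bool format_queue_name(char *buf, size_t size, const char *name, bool multi)
{
    int n = snprintf(buf, size, "stats-wq-%s%s", name, multi ? "+" : "");

    return n < 0 || (size_t)n >= size;
}
```

    With a 20-character name and the "+" suffix, a 30-byte buffer reports
    truncation and a 31-byte buffer does not.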
    
    Compile tested only.
    
    Signed-off-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Marcin Wojtas <marcin.s.wojtas@gmail.com>
    Link: https://patch.msgid.link/20240806-mvpp2-namelen-v1-1-6dc773653f2f@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: napi: Prevent overflow of napi_defer_hard_irqs [+ + +]
Author: Joe Damato <jdamato@fastly.com>
Date:   Wed Sep 4 15:34:30 2024 +0000

    net: napi: Prevent overflow of napi_defer_hard_irqs
    
    [ Upstream commit 08062af0a52107a243f7608fd972edb54ca5b7f8 ]
    
    In commit 6f8b12d661d0 ("net: napi: add hard irqs deferral feature")
    napi_defer_irqs was added to net_device and napi_defer_irqs_count was
    added to napi_struct, both as type int.
    
    This value never goes below zero, so there is no reason for it to be a
    signed int. Change the type for both from int to u32, and add an
    overflow check to sysfs to limit the value to S32_MAX.
    
    The limit of S32_MAX was chosen because the practical limit before this
    patch was S32_MAX (anything larger was an overflow) and thus there are
    no behavioral changes introduced. If the extra bit is needed in the
    future, the limit can be raised.
    
    Before this patch:
    
    $ sudo bash -c 'echo 2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
    $ cat /sys/class/net/eth4/napi_defer_hard_irqs
    -2147483647
    
    After this patch:
    
    $ sudo bash -c 'echo 2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
    bash: line 0: echo: write error: Numerical result out of range
    
    Similarly, /sys/class/net/XXXXX/tx_queue_len is defined as unsigned:
    
    include/linux/netdevice.h:      unsigned int            tx_queue_len;
    
    And has an overflow check:
    
    dev_change_tx_queue_len(..., unsigned long new_len):
    
      if (new_len != (unsigned int)new_len)
              return -ERANGE;
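
    The same kind of range check can be sketched in plain C. This is an
    illustration only: set_defer_hard_irqs() is a hypothetical stand-in for
    the sysfs store path, and INT32_MAX plays the role of S32_MAX.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the sysfs store path: accept a parsed
 * value only if it fits in the non-negative range of a signed
 * 32-bit integer, mirroring the S32_MAX limit described above.
 */
static int set_defer_hard_irqs(uint32_t *field, unsigned long long val)
{
    if (val > INT32_MAX)
        return -ERANGE;

    *field = (uint32_t)val;
    return 0;
}
```

    Rejecting the value before the assignment means the stored field is left
    untouched when the write fails.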
    
    Suggested-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Joe Damato <jdamato@fastly.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/20240904153431.307932-1-jdamato@fastly.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: pcs: xpcs: fix the wrong register that was written back [+ + +]
Author: Jiawen Wu <jiawenwu@trustnetic.com>
Date:   Tue Sep 24 10:28:57 2024 +0800

    net: pcs: xpcs: fix the wrong register that was written back
    
    commit 93ef6ee5c20e9330477930ec6347672c9e0cf5a6 upstream.
    
    The value is read from the register TXGBE_RX_GEN_CTL3, and it should be
    written back to TXGBE_RX_GEN_CTL3 when it changes some fields.
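
    The read-modify-write pattern at issue can be sketched as follows. The
    register file, the REG_A/REG_B names and update_bits() are hypothetical
    and only illustrate that the modified value has to go back to the
    register it was read from.

```c
#include <assert.h>
#include <stdint.h>

enum { REG_A, REG_B, REG_COUNT }; /* hypothetical register indices */

/*
 * Read a register, change only the bits selected by mask, and write
 * the result back to the same register index it was read from.
 */
static void update_bits(uint16_t *regs, int reg, uint16_t mask, uint16_t val)
{
    uint16_t v = regs[reg];

    v = (uint16_t)((v & (uint16_t)~mask) | (val & mask));
    regs[reg] = v;
}
```

    Writing the result to a different index would both clobber that register
    and leave the intended one unchanged.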
    
    Cc: stable@vger.kernel.org
    Fixes: f629acc6f210 ("net: pcs: xpcs: support to switch mode for Wangxun NICs")
    Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
    Reported-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Link: https://patch.msgid.link/20240924022857.865422-1-jiawenwu@trustnetic.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: phy: Check for read errors in SIOCGMIIREG [+ + +]
Author: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Date:   Tue Sep 3 19:15:36 2024 +0200

    net: phy: Check for read errors in SIOCGMIIREG
    
    [ Upstream commit 569bf6d481b0b823c3c9c3b8be77908fd7caf66b ]
    
    When reading registers from the PHY using the SIOCGMIIREG IOCTL any
    errors returned from either mdiobus_read() or mdiobus_c45_read() are
    ignored, and part of the returned error is passed as the register value
    back to user-space.
    
    For example, if mdiobus_c45_read() is used with a bus that does not
    implement the read_c45() callback, -EOPNOTSUPP is returned. This is,
    however, stored directly in mii_data->val_out and returned as the
    register's content. As val_out is a u16, the error code is truncated and
    returned as a plausible register value.
    
    Fix this by first checking the return value for errors before returning
    it as the register content.
    
    Before this patch,
    
        # phytool read eth0/0:1/0
        0xffa1
    
    After this change,
    
        $ phytool read eth0/0:1/0
        error: phy_read (-95)
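
    The 0xffa1 value above is what -95 becomes when stored in a 16-bit
    unsigned field. A small sketch of the buggy and the checked pattern,
    using hypothetical helper names and assuming the error code -95 shown in
    the output above:

```c
#include <assert.h>
#include <stdint.h>

/* Buggy pattern: the int result is stored in a u16 without a check. */
static uint16_t read_unchecked(int ret)
{
    return (uint16_t)ret;
}

/* Checked pattern: pass negative error codes on, else store the value. */
static int read_checked(int ret, uint16_t *val_out)
{
    if (ret < 0)
        return ret;

    *val_out = (uint16_t)ret;
    return 0;
}
```

    Conversion to an unsigned 16-bit type is modulo 65536, so -95 maps to
    65441, which is 0xffa1.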
    
    Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
    Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
    Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
    Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Link: https://patch.msgid.link/20240903171536.628930-1-niklas.soderlund+renesas@ragnatech.se
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: phy: realtek: Check the index value in led_hw_control_get [+ + +]
Author: Hui Wang <hui.wang@canonical.com>
Date:   Fri Sep 27 19:46:10 2024 +0800

    net: phy: realtek: Check the index value in led_hw_control_get
    
    [ Upstream commit c283782fc5d60c4d8169137c6f955aa3553d3b3d ]
    
    Just like rtl8211f_led_hw_is_supported() and
    rtl8211f_led_hw_control_set(), rtl8211f_led_hw_control_get() also
    needs to check the index value; otherwise the caller is likely to get
    incorrect rules.
    
    Fixes: 17784801d888 ("net: phy: realtek: Add support for PHY LEDs on RTL8211F")
    Signed-off-by: Hui Wang <hui.wang@canonical.com>
    Reviewed-by: Marek Vasut <marex@denx.de>
    Link: https://patch.msgid.link/20240927114610.1278935-1-hui.wang@canonical.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: sched: consistently use rcu_replace_pointer() in taprio_change() [+ + +]
Author: Dmitry Antipov <dmantipov@yandex.ru>
Date:   Wed Sep 4 14:54:01 2024 +0300

    net: sched: consistently use rcu_replace_pointer() in taprio_change()
    
    [ Upstream commit d5c4546062fd6f5dbce575c7ea52ad66d1968678 ]
    
    According to Vinicius (and carefully looking through the whole
    https://syzkaller.appspot.com/bug?extid=b65e0af58423fc8a73aa
    once again), txtime branch of 'taprio_change()' is not going to
    race against 'advance_sched()'. But using 'rcu_replace_pointer()'
    in the former may be a good idea as well.
    
    Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
    Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
    Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: skbuff: sprinkle more __GFP_NOWARN on ingress allocs [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Thu Aug 1 17:19:56 2024 -0700

    net: skbuff: sprinkle more __GFP_NOWARN on ingress allocs
    
    [ Upstream commit c89cca307b20917da739567a255a68a0798ee129 ]
    
    build_skb() and frag allocations done with GFP_ATOMIC will
    fail in real life when the system is under memory pressure,
    and there's nothing we can do about that. So there is no point
    in printing warnings.
    
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: sparx5: Fix invalid timestamps [+ + +]
Author: Aakash Menon <aakash.r.menon@gmail.com>
Date:   Mon Sep 16 22:18:29 2024 -0700

    net: sparx5: Fix invalid timestamps
    
    [ Upstream commit 151ac45348afc5b56baa584c7cd4876addf461ff ]
    
    Bits 270-271 are occasionally unexpectedly set by the hardware. This issue
    was observed with 10G SFPs causing huge time errors (> 30ms) in PTP. Only
    30 bits are needed for the nanosecond part of the timestamp, so clear the
    2 most significant bits before extracting the timestamp from the internal
    frame header.
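
    Why 30 bits suffice can be checked with a short sketch. It assumes the
    nanosecond field has already been read into a 32-bit value;
    extract_nsec() and the macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000u
#define NSEC_MASK    0x3fffffffu /* low 30 bits; 2^30 > NSEC_PER_SEC */

/* Hypothetical helper: drop the two most significant bits of the field. */
static uint32_t extract_nsec(uint32_t raw)
{
    return raw & NSEC_MASK;
}
```

    Any valid nanosecond count is below 10^9, which is less than 2^30, so the
    mask never removes meaningful bits.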
    
    Fixes: 70dfe25cd866 ("net: sparx5: Update extraction/injection for timestamping")
    Signed-off-by: Aakash Menon <aakash.menon@protempis.com>
    Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: stmmac: dwmac4: extend timeout for VLAN Tag register busy bit check [+ + +]
Author: Shenwei Wang <shenwei.wang@nxp.com>
Date:   Tue Sep 24 15:54:24 2024 -0500

    net: stmmac: dwmac4: extend timeout for VLAN Tag register busy bit check
    
    [ Upstream commit 4c1b56671b68ffcbe6b78308bfdda6bcce6491ae ]
    
    Increase the timeout for checking the busy bit of the VLAN Tag register
    from 10µs to 500ms. This change is necessary to accommodate scenarios
    where Energy Efficient Ethernet (EEE) is enabled.
    
    Overnight testing revealed that when EEE is active, the busy bit can
    remain set for up to approximately 300ms. The new 500ms timeout provides
    a safety margin.
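
    The effect of the longer timeout can be illustrated with a simulated
    register. Everything in this sketch is hypothetical (struct fake_reg,
    wait_not_busy() and the polling step); time is advanced by hand rather
    than by sleeping so that the behavior is deterministic.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated device: the busy bit clears once busy_until_us has elapsed. */
struct fake_reg {
    unsigned long now_us;
    unsigned long busy_until_us;
};

static bool reg_busy(const struct fake_reg *r)
{
    return r->now_us < r->busy_until_us;
}

/*
 * Poll until the busy bit clears or timeout_us has elapsed.
 * Returns 0 on success and -1 on timeout.
 */
static int wait_not_busy(struct fake_reg *r, unsigned long step_us,
                         unsigned long timeout_us)
{
    unsigned long waited = 0;

    while (reg_busy(r)) {
        if (waited >= timeout_us)
            return -1;
        r->now_us += step_us;
        waited += step_us;
    }
    return 0;
}
```

    With a busy period of 300ms, a 10µs budget times out while a 500ms
    budget succeeds.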
    
    Fixes: ed64639bc1e0 ("net: stmmac: Add support for VLAN Rx filtering")
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
    Link: https://patch.msgid.link/20240924205424.573913-1-shenwei.wang@nxp.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: stmmac: Fix zero-division error when disabling tc cbs [+ + +]
Author: KhaiWenTan <khai.wen.tan@linux.intel.com>
Date:   Wed Sep 18 14:14:22 2024 +0800

    net: stmmac: Fix zero-division error when disabling tc cbs
    
    commit 675faf5a14c14a2be0b870db30a70764df81e2df upstream.
    
    The commit b8c43360f6e4 ("net: stmmac: No need to calculate speed divider
    when offload is disabled") allows the "port_transmit_rate_kbps" to be
    set to a value of 0, which is then passed to the "div_s64" function when
    tc-cbs is disabled. This leads to a zero-division error.
    
    When tc-cbs is disabled, the idleslope, sendslope, and credit values are
    not required to be configured. Therefore, adding a return statement after
    setting the txQ mode to DCB when tc-cbs is disabled prevents the
    zero-division error.
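
    The shape of such an early return can be sketched as below. This is a
    simplified, hypothetical model: cbs_setup(), its parameters and the
    scaling factor are illustrative and do not reproduce the driver code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical, simplified CBS setup. When offload is disabled the
 * rate may be 0 and the slope values are not needed, so return before
 * any division takes place.
 */
static int cbs_setup(int enable, int64_t idleslope_kbps,
                     int64_t port_rate_kbps, int64_t *scaled_idleslope)
{
    if (!enable)
        return 0; /* queue falls back to DCB mode; nothing to compute */

    if (port_rate_kbps <= 0)
        return -1; /* defensive check, not part of the described fix */

    *scaled_idleslope = idleslope_kbps * 1024 / port_rate_kbps;
    return 0;
}
```

    The disabled path never reaches the division, so a rate of 0 is harmless
    there.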
    
    Fixes: b8c43360f6e4 ("net: stmmac: No need to calculate speed divider when offload is disabled")
    Cc: <stable@vger.kernel.org>
    Co-developed-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
    Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
    Signed-off-by: KhaiWenTan <khai.wen.tan@linux.intel.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20240918061422.1589662-1-khai.wen.tan@linux.intel.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: test for not too small csum_start in virtio_net_hdr_to_skb() [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Sep 26 16:58:36 2024 +0000

    net: test for not too small csum_start in virtio_net_hdr_to_skb()
    
    [ Upstream commit 49d14b54a527289d09a9480f214b8c586322310a ]
    
    syzbot was able to trigger this warning [1], after injecting a
    malicious packet through af_packet, setting skb->csum_start and thus
    the transport header to an incorrect value.
    
    We can at least make sure the transport header is after
    the end of the network header (with an estimated minimal size).
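
    The idea of the check can be sketched in isolation. This is a simplified
    illustration, not the code of the patch: validate_csum_start() is
    hypothetical, offsets are measured from the start of the packet data, and
    20 bytes is assumed as the minimal IPv4 header size.

```c
#include <assert.h>
#include <errno.h>

#define MIN_IPV4_HDR_LEN 20 /* size of an IPv4 header without options */

/*
 * Hypothetical check: the checksum start (and thus the transport
 * header) must not precede the end of a minimal network header.
 */
static int validate_csum_start(unsigned int net_offset,
                               unsigned int min_nh_len,
                               unsigned int csum_start)
{
    if (csum_start < net_offset + min_nh_len)
        return -EINVAL;
    return 0;
}
```

    A checksum start that points in front of the network header, as in the
    report below, is rejected by this kind of test.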
    
    [1]
    [   67.873027] skb len=4096 headroom=16 headlen=14 tailroom=0
    mac=(-1,-1) mac_len=0 net=(16,-6) trans=10
    shinfo(txflags=0 nr_frags=1 gso(size=0 type=0 segs=0))
    csum(0xa start=10 offset=0 ip_summed=3 complete_sw=0 valid=0 level=0)
    hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=0
    priority=0x0 mark=0x0 alloc_cpu=10 vlan_all=0x0
    encapsulation=0 inner(proto=0x0000, mac=0, net=0, trans=0)
    [   67.877172] dev name=veth0_vlan feat=0x000061164fdd09e9
    [   67.877764] sk family=17 type=3 proto=0
    [   67.878279] skb linear:   00000000: 00 00 10 00 00 00 00 00 0f 00 00 00 08 00
    [   67.879128] skb frag:     00000000: 0e 00 07 00 00 00 28 00 08 80 1c 00 04 00 00 02
    [   67.879877] skb frag:     00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.880647] skb frag:     00000020: 00 00 02 00 00 00 08 00 1b 00 00 00 00 00 00 00
    [   67.881156] skb frag:     00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.881753] skb frag:     00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.882173] skb frag:     00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.882790] skb frag:     00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.883171] skb frag:     00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.883733] skb frag:     00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.884206] skb frag:     00000090: 00 00 00 00 00 00 00 00 00 00 69 70 76 6c 61 6e
    [   67.884704] skb frag:     000000a0: 31 00 00 00 00 00 00 00 00 00 2b 00 00 00 00 00
    [   67.885139] skb frag:     000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.885677] skb frag:     000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.886042] skb frag:     000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.886408] skb frag:     000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.887020] skb frag:     000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.887384] skb frag:     00000100: 00 00
    [   67.887878] ------------[ cut here ]------------
    [   67.887908] offset (-6) >= skb_headlen() (14)
    [   67.888445] WARNING: CPU: 10 PID: 2088 at net/core/dev.c:3332 skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.889353] Modules linked in: macsec macvtap macvlan hsr wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 libchacha poly1305_x86_64 dummy bridge sr_mod cdrom evdev pcspkr i2c_piix4 9pnet_virtio 9p 9pnet netfs
    [   67.890111] CPU: 10 UID: 0 PID: 2088 Comm: b363492833 Not tainted 6.11.0-virtme #1011
    [   67.890183] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
    [   67.890309] RIP: 0010:skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891043] Call Trace:
    [   67.891173]  <TASK>
    [   67.891274] ? __warn (kernel/panic.c:741)
    [   67.891320] ? skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891333] ? report_bug (lib/bug.c:180 lib/bug.c:219)
    [   67.891348] ? handle_bug (arch/x86/kernel/traps.c:239)
    [   67.891363] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1))
    [   67.891372] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621)
    [   67.891388] ? skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891399] ? skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891416] ip_do_fragment (net/ipv4/ip_output.c:777 (discriminator 1))
    [   67.891448] ? __ip_local_out (./include/linux/skbuff.h:1146 ./include/net/l3mdev.h:196 ./include/net/l3mdev.h:213 net/ipv4/ip_output.c:113)
    [   67.891459] ? __pfx_ip_finish_output2 (net/ipv4/ip_output.c:200)
    [   67.891470] ? ip_route_output_flow (./arch/x86/include/asm/preempt.h:84 (discriminator 13) ./include/linux/rcupdate.h:96 (discriminator 13) ./include/linux/rcupdate.h:871 (discriminator 13) net/ipv4/route.c:2625 (discriminator 13) ./include/net/route.h:141 (discriminator 13) net/ipv4/route.c:2852 (discriminator 13))
    [   67.891484] ipvlan_process_v4_outbound (drivers/net/ipvlan/ipvlan_core.c:445 (discriminator 1))
    [   67.891581] ipvlan_queue_xmit (drivers/net/ipvlan/ipvlan_core.c:542 drivers/net/ipvlan/ipvlan_core.c:604 drivers/net/ipvlan/ipvlan_core.c:670)
    [   67.891596] ipvlan_start_xmit (drivers/net/ipvlan/ipvlan_main.c:227)
    [   67.891607] dev_hard_start_xmit (./include/linux/netdevice.h:4916 ./include/linux/netdevice.h:4925 net/core/dev.c:3588 net/core/dev.c:3604)
    [   67.891620] __dev_queue_xmit (net/core/dev.h:168 (discriminator 25) net/core/dev.c:4425 (discriminator 25))
    [   67.891630] ? skb_copy_bits (./include/linux/uaccess.h:233 (discriminator 1) ./include/linux/uaccess.h:260 (discriminator 1) ./include/linux/highmem-internal.h:230 (discriminator 1) net/core/skbuff.c:3018 (discriminator 1))
    [   67.891645] ? __pskb_pull_tail (net/core/skbuff.c:2848 (discriminator 4))
    [   67.891655] ? skb_partial_csum_set (net/core/skbuff.c:5657)
    [   67.891666] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/skbuff.h:2791 (discriminator 3) ./include/linux/skbuff.h:2799 (discriminator 3) ./include/linux/virtio_net.h:109 (discriminator 3))
    [   67.891684] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
    [   67.891700] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
    [   67.891716] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
    [   67.891734] ? do_sock_setsockopt (net/socket.c:2335)
    [   67.891747] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
    [   67.891761] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
    [   67.891772] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
    [   67.891785] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
    
    Fixes: 9181d6f8a2bb ("net: add more sanity check in virtio_net_hdr_to_skb()")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Link: https://patch.msgid.link/20240926165836.3797406-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: wwan: qcom_bam_dmux: Fix missing pm_runtime_disable() [+ + +]
Author: Jinjie Ruan <ruanjinjie@huawei.com>
Date:   Mon Sep 23 19:57:43 2024 +0800

    net: wwan: qcom_bam_dmux: Fix missing pm_runtime_disable()
    
    [ Upstream commit d505d3593b52b6c43507f119572409087416ba28 ]
    
    It's important to undo pm_runtime_use_autosuspend() with
    pm_runtime_dont_use_autosuspend() at driver exit time.
    
    But the pm_runtime_disable() and pm_runtime_dont_use_autosuspend()
    calls are missing in the error path for bam_dmux_probe(). So add them.
    
    Found by code review. Compile-tested only.
    
    Fixes: 21a0ffd9b38c ("net: wwan: Add Qualcomm BAM-DMUX WWAN network driver")
    Suggested-by: Stephan Gerhold <stephan.gerhold@linaro.org>
    Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Reviewed-by: Stephan Gerhold <stephan.gerhold@linaro.org>
    Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
netdev-genl: Set extack and fix error on napi-get [+ + +]
Author: Joe Damato <jdamato@fastly.com>
Date:   Sat Aug 31 12:17:04 2024 +0000

    netdev-genl: Set extack and fix error on napi-get
    
    [ Upstream commit 4e3a024b437ec0aee82550cc66a0f4e1a7a88a67 ]
    
    In commit 27f91aaf49b3 ("netdev-genl: Add netlink framework functions
    for napi"), when an invalid NAPI ID is specified the return value
    -EINVAL is used and no extack is set.
    
    Change the return value to -ENOENT and set the extack.
    
    Before this commit:
    
    $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                              --do napi-get --json='{"id": 451}'
    Netlink error: Invalid argument
    nl_len = 36 (20) nl_flags = 0x100 nl_type = 2
            error: -22
    
    After this commit:
    
    $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                             --do napi-get --json='{"id": 451}'
    Netlink error: No such file or directory
    nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
            error: -2
            extack: {'bad-attr': '.id'}
    
    Suggested-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Joe Damato <jdamato@fastly.com>
    Link: https://patch.msgid.link/20240831121707.17562-1-jdamato@fastly.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
netfilter: nf_tables: do not remove elements if set backend implements .abort [+ + +]
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Mon Jul 15 13:32:31 2024 +0200

    netfilter: nf_tables: do not remove elements if set backend implements .abort
    
    [ Upstream commit c9526aeb4998393171d85225ff540e28c7d4ab86 ]
    
    The pipapo set backend maintains two copies of the data structure.
    Removing the elements from the copy that is going to be discarded slows
    down the abort path significantly; this patch reduces it from several
    minutes to a few seconds.
    
    This patch was previously reverted by
    
      f86fb94011ae ("netfilter: nf_tables: revert do not remove elements if set backend implements .abort")
    
    but it is now possible since recent work by Florian Westphal to perform
    on-demand clone from insert/remove path:
    
      532aec7e878b ("netfilter: nft_set_pipapo: remove dirty flag")
      3f1d886cc7c3 ("netfilter: nft_set_pipapo: move cloning of match info to insert/removal path")
      a238106703ab ("netfilter: nft_set_pipapo: prepare pipapo_get helper for on-demand clone")
      c5444786d0ea ("netfilter: nft_set_pipapo: merge deactivate helper into caller")
      6c108d9bee44 ("netfilter: nft_set_pipapo: prepare walk function for on-demand clone")
      8b8a2417558c ("netfilter: nft_set_pipapo: prepare destroy function for on-demand clone")
      80efd2997fb9 ("netfilter: nft_set_pipapo: make pipapo_clone helper return NULL")
      a590f4760922 ("netfilter: nft_set_pipapo: move prove_locking helper around")
    
    After this series, the clone is fully released once aborted, so there is
    no need to take it back to its previous state. Thus, no stale reference to
    elements can occur.
    
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: prevent nf_skb_duplicated corruption [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Sep 26 18:56:11 2024 +0000

    netfilter: nf_tables: prevent nf_skb_duplicated corruption
    
    [ Upstream commit 92ceba94de6fb4cee2bf40b485979c342f44a492 ]
    
    syzbot found that nf_dup_ipv4() or nf_dup_ipv6() could write
    per-cpu variable nf_skb_duplicated in an unsafe way [1].
    
    Disabling preemption as hinted by the splat is not enough,
    we have to disable soft interrupts as well.
    
    [1]
    BUG: using __this_cpu_write() in preemptible [00000000] code: syz.4.282/6316
     caller is nf_dup_ipv4+0x651/0x8f0 net/ipv4/netfilter/nf_dup_ipv4.c:87
    CPU: 0 UID: 0 PID: 6316 Comm: syz.4.282 Not tainted 6.11.0-rc7-syzkaller-00104-g7052622fccb1 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
    Call Trace:
     <TASK>
      __dump_stack lib/dump_stack.c:93 [inline]
      dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
      check_preemption_disabled+0x10e/0x120 lib/smp_processor_id.c:49
      nf_dup_ipv4+0x651/0x8f0 net/ipv4/netfilter/nf_dup_ipv4.c:87
      nft_dup_ipv4_eval+0x1db/0x300 net/ipv4/netfilter/nft_dup_ipv4.c:30
      expr_call_ops_eval net/netfilter/nf_tables_core.c:240 [inline]
      nft_do_chain+0x4ad/0x1da0 net/netfilter/nf_tables_core.c:288
      nft_do_chain_ipv4+0x202/0x320 net/netfilter/nft_chain_filter.c:23
      nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
      nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626
      nf_hook+0x2c4/0x450 include/linux/netfilter.h:269
      NF_HOOK_COND include/linux/netfilter.h:302 [inline]
      ip_output+0x185/0x230 net/ipv4/ip_output.c:433
      ip_local_out net/ipv4/ip_output.c:129 [inline]
      ip_send_skb+0x74/0x100 net/ipv4/ip_output.c:1495
      udp_send_skb+0xacf/0x1650 net/ipv4/udp.c:981
      udp_sendmsg+0x1c21/0x2a60 net/ipv4/udp.c:1269
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0x1a6/0x270 net/socket.c:745
      ____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
      ___sys_sendmsg net/socket.c:2651 [inline]
      __sys_sendmmsg+0x3b2/0x740 net/socket.c:2737
      __do_sys_sendmmsg net/socket.c:2766 [inline]
      __se_sys_sendmmsg net/socket.c:2763 [inline]
      __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2763
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0033:0x7f4ce4f7def9
    Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007f4ce5d4a038 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
    RAX: ffffffffffffffda RBX: 00007f4ce5135f80 RCX: 00007f4ce4f7def9
    RDX: 0000000000000001 RSI: 0000000020005d40 RDI: 0000000000000006
    RBP: 00007f4ce4ff0b76 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
    R13: 0000000000000000 R14: 00007f4ce5135f80 R15: 00007ffd4cbc6d68
     </TASK>
    
    Fixes: d877f07112f1 ("netfilter: nf_tables: add nft_dup expression")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: uapi: NFTA_FLOWTABLE_HOOK is NLA_NESTED [+ + +]
Author: Phil Sutter <phil@nwl.cc>
Date:   Wed Sep 25 20:01:20 2024 +0200

    netfilter: uapi: NFTA_FLOWTABLE_HOOK is NLA_NESTED
    
    [ Upstream commit 76f1ed087b562a469f2153076f179854b749c09a ]
    
    Fix the comment which incorrectly defines it as NLA_U32.
    
    Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend")
    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
netfs: Cancel dirty folios that have no storage destination [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Mon Jul 29 12:23:11 2024 +0100

    netfs: Cancel dirty folios that have no storage destination
    
    [ Upstream commit 8f246b7c0a1be0882374f2ff831a61f0dbe77678 ]
    
    Kafs wants to be able to cache the contents of directories (and symlinks),
    but whilst these are downloaded from the server with the FS.FetchData RPC
    op and similar, the same as for regular files, they can't be updated by
    FS.StoreData, but rather have special operations (FS.MakeDir, etc.).
    
    Now, rather than redownloading a directory's content after each change made
    to that directory, kafs modifies the local blob.  This blob can be saved
    out to the cache, and since it's using netfslib, kafs just marks the folios
    dirty and lets ->writepages() on the directory take care of it, as for a
    regular file.
    
    This is fine as long as there's a cache as although the upload stream is
    disabled, there's a cache stream to drive the procedure.  But if the cache
    goes away in the meantime, suddenly there's no way to do any writes and the
    code gets confused, complains "R=%x: No submit" to dmesg and leaves the
    dirty folio hanging.
    
    Fix this by just cancelling the store of the folio if neither stream is
    active.  (If there's no cache at the time of dirtying, we should just not
    mark the folio dirty).
    
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Jeff Layton <jlayton@kernel.org>
    cc: netfs@lists.linux.dev
    cc: linux-fsdevel@vger.kernel.org
    Link: https://lore.kernel.org/r/20240814203850.2240469-23-dhowells@redhat.com/ # v2
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfs: Fix missing wakeup after issuing writes [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Wed Oct 2 15:45:50 2024 +0100

    netfs: Fix missing wakeup after issuing writes
    
    [ Upstream commit 1ca4169c391c370e0f3a92938df2862900575096 ]
    
    After dividing up a proposed write into subrequests, netfslib sets
    NETFS_RREQ_ALL_QUEUED to indicate to the collector that it can move on to
    the final cleanup once it has emptied the subrequest queues.
    
    Now, whilst the collector will normally end up running at least once after
    this bit is set just because it takes a while to process all the write
    subrequests before the collector runs out of subrequests, there exists the
    possibility that the issuing thread will be forced to sleep and the
    collector thread will clean up all the subrequests before ALL_QUEUED gets
    set.
    
    In such a case, the collector thread will not get triggered again and will
    never clear NETFS_RREQ_IN_PROGRESS thus leaving a request uncompleted and
    causing a potential future hang.
    
    Fix this by scheduling the write collector if all the subrequest queues are
    empty (and thus no writes pending issuance).
    
    Note that we'd do this ideally before queuing the subrequest, but in the
    case of buffered writeback, at least, we can't find out that we've run out
    of folios until after we've called writeback_iter() and it has returned
    NULL - at which point we might not actually have any subrequests still
    under construction.
    
    Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
    Signed-off-by: David Howells <dhowells@redhat.com>
    Link: https://lore.kernel.org/r/3317784.1727880350@warthog.procyon.org.uk
    cc: Jeff Layton <jlayton@kernel.org>
    cc: netfs@lists.linux.dev
    cc: linux-fsdevel@vger.kernel.org
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
netpoll: Ensure clean state on setup failures [+ + +]
Author: Breno Leitao <leitao@debian.org>
Date:   Thu Aug 22 04:10:47 2024 -0700

    netpoll: Ensure clean state on setup failures
    
    [ Upstream commit ae5a0456e0b4cfd7e61619e55251ffdf1bc7adfb ]
    
    Modify netpoll_setup() and __netpoll_setup() to ensure that the netpoll
    structure (np) is left in a clean state if setup fails for any reason.
    This prevents carrying over misconfigured fields in case of partial
    setup success.
    
    Key changes:
    - np->dev is now set only after successful setup, ensuring it's always
      NULL if netpoll is not configured or if netpoll_setup() fails.
    - np->local_ip is zeroed if netpoll setup doesn't complete successfully.
    - Added DEBUG_NET_WARN_ON_ONCE() checks to catch unexpected states.
    - Reordered some operations in __netpoll_setup() for better logical flow.
    
    These changes improve the reliability of netpoll configuration, since it
    assures that the structure is fully initialized or totally unset.
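
    The "fully initialized or totally unset" rule can be sketched with a toy
    structure. The types and fake_netpoll_setup() below are hypothetical and
    only model the two fields named in the key changes above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct fake_dev {
    int usable; /* stands in for every check that setup may fail on */
};

struct fake_netpoll {
    struct fake_dev *dev;
    unsigned char local_ip[16];
};

/*
 * Hypothetical setup routine: the device pointer is published only
 * after every step succeeded, and the failure path zeroes whatever
 * was filled in along the way.
 */
static int fake_netpoll_setup(struct fake_netpoll *np, struct fake_dev *dev,
                              const unsigned char ip[16])
{
    memcpy(np->local_ip, ip, sizeof(np->local_ip));

    if (!dev->usable)
        goto fail;

    np->dev = dev;
    return 0;

fail:
    memset(np->local_ip, 0, sizeof(np->local_ip));
    np->dev = NULL;
    return -1;
}
```

    A caller can then treat a NULL dev pointer as a reliable sign that the
    structure is not configured.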
    
    Suggested-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Breno Leitao <leitao@debian.org>
    Link: https://patch.msgid.link/20240822111051.179850-2-leitao@debian.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
nfp: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]
Author: Jinjie Ruan <ruanjinjie@huawei.com>
Date:   Wed Sep 11 17:44:45 2024 +0800

    nfp: Use IRQF_NO_AUTOEN flag in request_irq()
    
    [ Upstream commit daaba19d357f0900b303a530ced96c78086267ea ]
    
    disable_irq() after request_irq() still has a time gap in which
    interrupts can arrive. request_irq() with the IRQF_NO_AUTOEN flag
    disables IRQ auto-enable when the IRQ is requested.
    
    Reviewed-by: Louis Peens <louis.peens@corigine.com>
    Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Link: https://patch.msgid.link/20240911094445.1922476-4-ruanjinjie@huawei.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
NFSD: Async COPY result needs to return a write verifier [+ + +]
Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Aug 28 13:40:03 2024 -0400

    NFSD: Async COPY result needs to return a write verifier
    
    [ Upstream commit 9ed666eba4e0a2bb8ffaa3739d830b64d4f2aaad ]
    
    Currently, when NFSD handles an asynchronous COPY, it returns a
    zero write verifier, relying on the subsequent CB_OFFLOAD callback
    to pass the write verifier and a stable_how4 value to the client.
    
    However, if the CB_OFFLOAD never arrives at the client (for example,
    if a network partition occurs just as the server sends the
    CB_OFFLOAD operation), the client will never receive this verifier.
    Thus, if the client sends a follow-up COMMIT, there is no way for
    the client to assess the COMMIT result.
    
    The usual recovery for a missing CB_OFFLOAD is for the client to
    send an OFFLOAD_STATUS operation, but that operation does not carry
    a write verifier in its result. Neither does it carry a stable_how4
    value, so the client /must/ send a COMMIT in this case -- which will
    always fail because currently there's still no write verifier in the
    COPY result.
    
    Thus the server needs to return a normal write verifier in its COPY
    result even if the COPY operation is to be performed asynchronously.
    
    If the server recognizes the callback stateid in subsequent
    OFFLOAD_STATUS operations, then obviously it has not restarted, and
    the write verifier the client received in the COPY result is still
    valid and can be used to assess a COMMIT of the copied data, if one
    is needed.
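
    The client-side check that depends on this verifier can be sketched
    as below. It is a simplified userspace model, not NFSD or NFS client
    code; demo_commit_verified() is a hypothetical name, and the only
    protocol fact assumed is that an NFSv4 write verifier is 8 opaque
    bytes.

```c
#include <assert.h>
#include <string.h>

#define DEMO_VERIFIER_SIZE 8   /* an NFSv4 verifier is 8 opaque bytes */

/*
 * Returns 1 when the verifier remembered from the COPY result equals the
 * one returned by COMMIT, meaning the copied data reached stable storage
 * without an intervening server restart. Returns 0 otherwise.
 */
int demo_commit_verified(const unsigned char *copy_verf,
			 const unsigned char *commit_verf)
{
	assert(copy_verf != NULL && commit_verf != NULL);
	return memcmp(copy_verf, commit_verf, DEMO_VERIFIER_SIZE) == 0;
}
```

    With an all-zero verifier in the COPY result, this comparison fails
    against any real server verifier, which is the failure described above.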
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Stable-dep-of: aadc3bbea163 ("NFSD: Limit the number of concurrent async COPY operations")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
nfsd: fix delegation_blocked() to block correctly for at least 30 seconds [+ + +]
Author: NeilBrown <neilb@suse.de>
Date:   Mon Sep 9 15:06:36 2024 +1000

    nfsd: fix delegation_blocked() to block correctly for at least 30 seconds
    
    commit 45bb63ed20e02ae146336412889fe5450316a84f upstream.
    
    The pair of bloom filters used by delegation_blocked() was intended to
    block delegations on given filehandles for between 30 and 60 seconds.  A
    new filehandle would be recorded in the "new" bit set.  That would then
    be switched to the "old" bit set between 0 and 30 seconds later, and it
    would remain as the "old" bit set for 30 seconds.
    
    Unfortunately the code intended to clear the old bit set once it reached
    30 seconds old, preparing it to be the next new bit set, instead cleared
    the *new* bit set before switching it to be the old bit set.  This means
    that the "old" bit set is always empty and delegations are blocked
    between 0 and 30 seconds.
    
    This patch updates bd->new before clearing the set with that index,
    instead of afterwards.
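
    The effect of the ordering can be reproduced with a small userspace
    model. The sketch below is an assumption-laden simplification: struct
    demo_filter and the demo_*() helpers are hypothetical, each "bit set"
    is a tiny byte array, and the 30-second timer is replaced by explicit
    swap calls.

```c
#include <assert.h>
#include <string.h>

#define DEMO_BITS 8

/* Hypothetical, simplified model of the pair of bit sets. */
struct demo_filter {
	unsigned char set[2][DEMO_BITS];
	int new;                    /* index of the "new" set */
};

/* Record a filehandle hash in the "new" set. */
void demo_block(struct demo_filter *bd, unsigned int hash)
{
	assert(bd != NULL);
	bd->set[bd->new][hash % DEMO_BITS] = 1;
}

/* A filehandle is blocked while either set remembers it. */
int demo_is_blocked(const struct demo_filter *bd, unsigned int hash)
{
	return bd->set[0][hash % DEMO_BITS] || bd->set[1][hash % DEMO_BITS];
}

/* Ordering before the fix: wipes the set that holds the fresh entries. */
void demo_swap_buggy(struct demo_filter *bd)
{
	memset(bd->set[bd->new], 0, sizeof(bd->set[0]));
	bd->new = 1 - bd->new;
}

/* Ordering after the fix: flip the index first, then clear the expired set. */
void demo_swap_fixed(struct demo_filter *bd)
{
	bd->new = 1 - bd->new;
	memset(bd->set[bd->new], 0, sizeof(bd->set[0]));
}
```

    In this model an entry survives exactly one demo_swap_fixed() call and
    is dropped by the second, whereas demo_swap_buggy() drops it at the
    first swap.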
    
    Reported-by: Olga Kornievskaia <okorniev@redhat.com>
    Cc: stable@vger.kernel.org
    Fixes: 6282cd565553 ("NFSD: Don't hand out delegations for 30 seconds after recalling them.")
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
NFSD: Fix NFSv4's PUTPUBFH operation [+ + +]
Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Aug 11 13:11:07 2024 -0400

    NFSD: Fix NFSv4's PUTPUBFH operation
    
    commit 202f39039a11402dcbcd5fece8d9fa6be83f49ae upstream.
    
    According to RFC 8881, all minor versions of NFSv4 support PUTPUBFH.
    
    Replace the XDR decoder for PUTPUBFH with a "noop" since we no
    longer want the minorversion check, and PUTPUBFH has no arguments to
    decode. (Ideally nfsd4_decode_noop should really be called
    nfsd4_decode_void).
    
    PUTPUBFH should now behave just like PUTROOTFH.
    
    Reported-by: Cedric Blancher <cedric.blancher@gmail.com>
    Fixes: e1a90ebd8b23 ("NFSD: Combine decode operations for v4 and v4.1")
    Cc: Dan Shelton <dan.f.shelton@gmail.com>
    Cc: Roland Mainz <roland.mainz@nrubsig.org>
    Cc: stable@vger.kernel.org
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFSD: Limit the number of concurrent async COPY operations [+ + +]
Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Aug 28 13:40:04 2024 -0400

    NFSD: Limit the number of concurrent async COPY operations
    
    [ Upstream commit aadc3bbea163b6caaaebfdd2b6c4667fbc726752 ]
    
    Nothing appears to limit the number of concurrent async COPY
    operations that clients can start. In addition, AFAICT each async
    COPY can copy an unlimited number of 4MB chunks, so can run for a
    long time. Thus IMO async COPY can become a DoS vector.
    
    Add a restriction mechanism that bounds the number of concurrent
    background COPY operations. Start simple and try to be fair -- this
    patch implements a per-namespace limit.
    
    An async COPY request that occurs while this limit is exceeded gets
    NFS4ERR_DELAY. The requesting client can choose to send the request
    again after a delay or fall back to a traditional read/write style
    copy.
    
    If there is need to make the mechanism more sophisticated, we can
    visit that in future patches.
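
    A bounded counter of this kind can be sketched in userspace C11 as
    shown below. The names struct demo_net, max_async_copies and the
    demo_*() functions are hypothetical, and the way the limit value is
    chosen is an assumption left to the caller; only the NFS4ERR_DELAY
    status value (10008) is taken from the NFSv4 protocol.

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_NFS4_OK        0
#define DEMO_NFS4ERR_DELAY  10008   /* NFS4ERR_DELAY protocol value */

/* Hypothetical per-namespace state. */
struct demo_net {
	atomic_int pending_async_copies;
	int max_async_copies;           /* assumption: set by the caller */
};

/* Try to admit one more background COPY; refuse when the limit is reached. */
int demo_start_async_copy(struct demo_net *nn)
{
	assert(nn != 0 && nn->max_async_copies > 0);

	if (atomic_fetch_add(&nn->pending_async_copies, 1) >=
	    nn->max_async_copies) {
		/* Over the limit: undo the increment and ask the client to retry. */
		atomic_fetch_sub(&nn->pending_async_copies, 1);
		return DEMO_NFS4ERR_DELAY;
	}
	return DEMO_NFS4_OK;
}

/* Called when a background COPY completes or is cancelled. */
void demo_finish_async_copy(struct demo_net *nn)
{
	atomic_fetch_sub(&nn->pending_async_copies, 1);
}
```

    Incrementing first and backing out on failure keeps the check and the
    reservation in a single atomic step, so two racing requests cannot both
    take the last slot.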
    
    Cc: stable@vger.kernel.org
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
nfsd: map the EBADMSG to nfserr_io to avoid warning [+ + +]
Author: Li Lingfeng <lilingfeng3@huawei.com>
Date:   Sat Aug 17 14:27:13 2024 +0800

    nfsd: map the EBADMSG to nfserr_io to avoid warning
    
    commit 340e61e44c1d2a15c42ec72ade9195ad525fd048 upstream.
    
    Ext4 will throw -EBADMSG through ext4_readdir when a checksum error
    occurs, resulting in the following WARNING.
    
    Fix it by mapping EBADMSG to nfserr_io.
    
    nfsd_buffered_readdir
     iterate_dir // -EBADMSG -74
      ext4_readdir // .iterate_shared
       ext4_dx_readdir
        ext4_htree_fill_tree
         htree_dirblock_to_tree
          ext4_read_dirblock
           __ext4_read_dirblock
            ext4_dirblock_csum_verify
             warn_no_space_for_csum
              __warn_no_space_for_csum
            return ERR_PTR(-EFSBADCRC) // -EBADMSG -74
     nfserrno // WARNING
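
    The mapping step at the bottom of this chain can be modelled with a
    small lookup table. This is a reduced userspace sketch, not the real
    nfserrno() table: demo_errtbl and demo_nfserrno() are hypothetical,
    only a few entries are shown, and an unmapped value returns -1 here
    where the kernel function would emit the WARNING.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_NFS_OK        0
#define DEMO_NFSERR_NOENT  2
#define DEMO_NFSERR_IO     5

/* Hypothetical, much-reduced errno-to-NFS-status table. */
static const struct {
	int nfserr;
	int syserr;
} demo_errtbl[] = {
	{ DEMO_NFS_OK, 0 },
	{ DEMO_NFSERR_NOENT, -ENOENT },
	{ DEMO_NFSERR_IO, -EIO },
	{ DEMO_NFSERR_IO, -EBADMSG },   /* the kind of entry this fix adds */
};

/* Returns the mapped status, or -1 for a "non-standard errno". */
int demo_nfserrno(int errno_val)
{
	size_t i;

	assert(errno_val <= 0);         /* kernel-style negative errno expected */
	for (i = 0; i < sizeof(demo_errtbl) / sizeof(demo_errtbl[0]); i++) {
		if (demo_errtbl[i].syserr == errno_val)
			return demo_errtbl[i].nfserr;
	}
	return -1;
}
```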
    
    [  161.115610] ------------[ cut here ]------------
    [  161.116465] nfsd: non-standard errno: -74
    [  161.117315] WARNING: CPU: 1 PID: 780 at fs/nfsd/nfsproc.c:878 nfserrno+0x9d/0xd0
    [  161.118596] Modules linked in:
    [  161.119243] CPU: 1 PID: 780 Comm: nfsd Not tainted 5.10.0-00014-g79679361fd5d #138
    [  161.120684] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
    [  161.123601] RIP: 0010:nfserrno+0x9d/0xd0
    [  161.124676] Code: 0f 87 da 30 dd 00 83 e3 01 b8 00 00 00 05 75 d7 44 89 ee 48 c7 c7 c0 57 24 98 89 44 24 04 c6 05 ce 2b 61 03 01 e8 99 20 d8 00 <0f> 0b 8b 44 24 04 eb b5 4c 89 e6 48 c7 c7 a0 6d a4 99 e8 cc 15 33
    [  161.127797] RSP: 0018:ffffc90000e2f9c0 EFLAGS: 00010286
    [  161.128794] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
    [  161.130089] RDX: 1ffff1103ee16f6d RSI: 0000000000000008 RDI: fffff520001c5f2a
    [  161.131379] RBP: 0000000000000022 R08: 0000000000000001 R09: ffff8881f70c1827
    [  161.132664] R10: ffffed103ee18304 R11: 0000000000000001 R12: 0000000000000021
    [  161.133949] R13: 00000000ffffffb6 R14: ffff8881317c0000 R15: ffffc90000e2fbd8
    [  161.135244] FS:  0000000000000000(0000) GS:ffff8881f7080000(0000) knlGS:0000000000000000
    [  161.136695] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  161.137761] CR2: 00007fcaad70b348 CR3: 0000000144256006 CR4: 0000000000770ee0
    [  161.139041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  161.140291] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [  161.141519] PKRU: 55555554
    [  161.142076] Call Trace:
    [  161.142575]  ? __warn+0x9b/0x140
    [  161.143229]  ? nfserrno+0x9d/0xd0
    [  161.143872]  ? report_bug+0x125/0x150
    [  161.144595]  ? handle_bug+0x41/0x90
    [  161.145284]  ? exc_invalid_op+0x14/0x70
    [  161.146009]  ? asm_exc_invalid_op+0x12/0x20
    [  161.146816]  ? nfserrno+0x9d/0xd0
    [  161.147487]  nfsd_buffered_readdir+0x28b/0x2b0
    [  161.148333]  ? nfsd4_encode_dirent_fattr+0x380/0x380
    [  161.149258]  ? nfsd_buffered_filldir+0xf0/0xf0
    [  161.150093]  ? wait_for_concurrent_writes+0x170/0x170
    [  161.151004]  ? generic_file_llseek_size+0x48/0x160
    [  161.151895]  nfsd_readdir+0x132/0x190
    [  161.152606]  ? nfsd4_encode_dirent_fattr+0x380/0x380
    [  161.153516]  ? nfsd_unlink+0x380/0x380
    [  161.154256]  ? override_creds+0x45/0x60
    [  161.155006]  nfsd4_encode_readdir+0x21a/0x3d0
    [  161.155850]  ? nfsd4_encode_readlink+0x210/0x210
    [  161.156731]  ? write_bytes_to_xdr_buf+0x97/0xe0
    [  161.157598]  ? __write_bytes_to_xdr_buf+0xd0/0xd0
    [  161.158494]  ? lock_downgrade+0x90/0x90
    [  161.159232]  ? nfs4svc_decode_voidarg+0x10/0x10
    [  161.160092]  nfsd4_encode_operation+0x15a/0x440
    [  161.160959]  nfsd4_proc_compound+0x718/0xe90
    [  161.161818]  nfsd_dispatch+0x18e/0x2c0
    [  161.162586]  svc_process_common+0x786/0xc50
    [  161.163403]  ? nfsd_svc+0x380/0x380
    [  161.164137]  ? svc_printk+0x160/0x160
    [  161.164846]  ? svc_xprt_do_enqueue.part.0+0x365/0x380
    [  161.165808]  ? nfsd_svc+0x380/0x380
    [  161.166523]  ? rcu_is_watching+0x23/0x40
    [  161.167309]  svc_process+0x1a5/0x200
    [  161.168019]  nfsd+0x1f5/0x380
    [  161.168663]  ? nfsd_shutdown_threads+0x260/0x260
    [  161.169554]  kthread+0x1c4/0x210
    [  161.170224]  ? kthread_insert_work_sanity_check+0x80/0x80
    [  161.171246]  ret_from_fork+0x1f/0x30
    
    Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Cc: stable@vger.kernel.org
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
nvme-keyring: restrict match length for version '1' identifiers [+ + +]
Author: Hannes Reinecke <hare@kernel.org>
Date:   Mon Jul 22 14:02:18 2024 +0200

    nvme-keyring: restrict match length for version '1' identifiers
    
    [ Upstream commit 79559c75332458985ab8a21f11b08bf7c9b833b0 ]
    
    TP8018 introduced a new TLS PSK identifier version (version 1), which appended
    a PSK hash value to the existing identifier (cf NVMe TCP specification v1.1,
    section 3.6.1.3 'TLS PSK and PSK Identity Derivation').
    An original (version 0) identifier has the form:
    
    NVMe0<type><hmac> <hostnqn> <subsysnqn>
    
    and a version 1 identifier has the form:
    
    NVMe1<type><hmac> <hostnqn> <subsysnqn> <hash>
    
    This patch modifies the lookup algorithm to compare only the first part
    of the identifier (excluding the hash value) to handle both version 0 and
    version 1 identifiers.
    The spec also declares 'version 0' identifiers obsolete, so the lookup
    algorithm is modified to prefer v1 identifiers.
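
    Restricting the comparison to the length of the lookup identity can
    be sketched as a plain prefix match. The function name
    demo_psk_match() and the sample identity strings in the usage note
    are hypothetical; this is a userspace illustration, not the keyring
    match callback itself.

```c
#include <assert.h>
#include <string.h>

/*
 * Compare only as many bytes as the lookup identity has, so that a
 * stored version 1 description carrying a trailing " <hash>" still
 * matches an identity built without the hash.
 * Returns 1 on match, 0 otherwise.
 */
int demo_psk_match(const char *key_desc, const char *match_id)
{
	size_t match_len;

	assert(key_desc != NULL && match_id != NULL);
	match_len = strlen(match_id);
	return strncmp(key_desc, match_id, match_len) == 0;
}
```

    For example, a stored description "NVMe1R01 hostnqn subnqn HASH"
    matches the lookup identity "NVMe1R01 hostnqn subnqn", while a
    description starting with "NVMe0" does not.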
    
    Signed-off-by: Hannes Reinecke <hare@kernel.org>
    Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Keith Busch <kbusch@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
nvme-tcp: check for invalidated or revoked key [+ + +]
Author: Hannes Reinecke <hare@kernel.org>
Date:   Mon Jul 22 14:02:20 2024 +0200

    nvme-tcp: check for invalidated or revoked key
    
    [ Upstream commit 5bc46b49c828a6dfaab80b71ecb63fe76a1096d2 ]
    
    key_lookup() will always return a key, even if that key is revoked
    or invalidated. So check for invalid keys before continuing.
    
    Signed-off-by: Hannes Reinecke <hare@kernel.org>
    Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Keith Busch <kbusch@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nvme-tcp: fix link failure for TCP auth [+ + +]
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Mon Sep 9 20:21:09 2024 +0000

    nvme-tcp: fix link failure for TCP auth
    
    [ Upstream commit 2d5a333e09c388189238291577e443221baacba0 ]
    
    The nvme fabric driver calls the nvme_tls_key_lookup() function from
    nvmf_parse_key() when the keyring is enabled, but this is broken in a
    configuration with CONFIG_NVME_FABRICS=y and CONFIG_NVME_TCP=m because
    this leads to the function definition being in a loadable module:
    
    x86_64-linux-ld: vmlinux.o: in function `nvmf_parse_key':
    fabrics.c:(.text+0xb1bdec): undefined reference to `nvme_tls_key_lookup'
    
    Move the 'select' up to CONFIG_NVME_FABRICS itself to force this
    part to be built-in as well if needed.
    
    Fixes: 5bc46b49c828 ("nvme-tcp: check for invalidated or revoked key")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Keith Busch <kbusch@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nvme-tcp: sanitize TLS key handling [+ + +]
Author: Hannes Reinecke <hare@kernel.org>
Date:   Mon Jul 22 14:02:19 2024 +0200

    nvme-tcp: sanitize TLS key handling
    
    [ Upstream commit 363895767fbfa05891b0b4d9e06ebde7a10c6a07 ]
    
    There is a difference between TLS configured (ie the user has
    provisioned/requested a key) and TLS enabled (ie the connection
    is encrypted with TLS). This becomes important for secure concatenation,
    where the initial authentication is run on an unencrypted connection
    (ie with TLS configured, but not enabled), and then the queue is reset to
    run over TLS (ie TLS configured _and_ enabled).
    So to differentiate between those two states store the generated
    key in opts->tls_key (as we're using the same TLS key for all queues),
    the key serial of the resulting TLS handshake in ctrl->tls_pskid
    (to signal that TLS on the admin queue is enabled), and a simple
    flag for the queues to indicate that TLS has been enabled.
    
    Signed-off-by: Hannes Reinecke <hare@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Keith Busch <kbusch@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
nvme: fix metadata handling in nvme-passthrough [+ + +]
Author: Puranjay Mohan <pjy@amazon.com>
Date:   Thu Aug 29 13:32:17 2024 +0000

    nvme: fix metadata handling in nvme-passthrough
    
    [ Upstream commit 7c2fd76048e95dd267055b5f5e0a48e6e7c81fd9 ]
    
    On an NVMe namespace that does not support metadata, it is possible to
    send an IO command with metadata through io-passthru. This allows issues
    like [1] to trigger in the completion code path.
    nvme_map_user_request() doesn't check if the namespace supports metadata
    before sending it forward. It also allows admin commands with metadata to
    be processed as it ignores metadata when bdev == NULL and may report
    success.
    
    Reject an IO command with metadata when the NVMe namespace doesn't
    support it and reject an admin command if it has metadata.
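
    The two rejection rules can be expressed as a single predicate. The
    sketch below is a userspace model under stated assumptions: struct
    demo_passthru and demo_check_metadata() are hypothetical, and the use
    of -EINVAL as the error code is an assumption of this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request description; not the driver's real types. */
struct demo_passthru {
	bool is_admin;              /* admin command: no namespace attached */
	bool ns_has_metadata;       /* namespace is formatted with metadata */
	const void *meta_buffer;
	size_t meta_len;
};

/* Returns 0 if the request may proceed, -EINVAL if it must be rejected. */
int demo_check_metadata(const struct demo_passthru *req)
{
	bool has_metadata, supports_metadata;

	assert(req != NULL);
	has_metadata = req->meta_buffer != NULL && req->meta_len != 0;
	supports_metadata = !req->is_admin && req->ns_has_metadata;

	if (has_metadata && !supports_metadata)
		return -EINVAL;
	return 0;
}
```

    Requests without metadata pass in every case; only the combination of
    supplied metadata and a target that cannot accept it is refused.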
    
    [1] https://lore.kernel.org/all/mb61pcylvnym8.fsf@amazon.com/
    
    Suggested-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Puranjay Mohan <pjy@amazon.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    Reviewed-by: Anuj Gupta <anuj20.g@samsung.com>
    Signed-off-by: Keith Busch <kbusch@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ocfs2: cancel dqi_sync_work before freeing oinfo [+ + +]
Author: Joseph Qi <joseph.qi@linux.alibaba.com>
Date:   Wed Sep 4 15:10:03 2024 +0800

    ocfs2: cancel dqi_sync_work before freeing oinfo
    
    commit 35fccce29feb3706f649726d410122dd81b92c18 upstream.
    
    ocfs2_global_read_info() will initialize and schedule dqi_sync_work at the
    end. If an error occurs after successfully reading the global quota, it
    will trigger the following warning with CONFIG_DEBUG_OBJECTS_* enabled:
    
    ODEBUG: free active (active state 0) object: 00000000d8b0ce28 object type: timer_list hint: qsync_work_fn+0x0/0x16c
    
    This reports that there is an active delayed work when freeing oinfo in
    error handling, so cancel dqi_sync_work first.  BTW, return status instead
    of -1 when .read_file_info fails.
    
    Link: https://syzkaller.appspot.com/bug?extid=f7af59df5d6b25f0febd
    Link: https://lkml.kernel.org/r/20240904071004.2067695-1-joseph.qi@linux.alibaba.com
    Fixes: 171bf93ce11f ("ocfs2: Periodic quota syncing")
    Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Reviewed-by: Heming Zhao <heming.zhao@suse.com>
    Reported-by: syzbot+f7af59df5d6b25f0febd@syzkaller.appspotmail.com
    Tested-by: syzbot+f7af59df5d6b25f0febd@syzkaller.appspotmail.com
    Cc: Mark Fasheh <mark@fasheh.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Cc: Changwei Ge <gechangwei@live.cn>
    Cc: Gang He <ghe@suse.com>
    Cc: Jun Piao <piaojun@huawei.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ocfs2: fix null-ptr-deref when journal load failed. [+ + +]
Author: Julian Sun <sunjunchao2870@gmail.com>
Date:   Mon Sep 2 11:08:44 2024 +0800

    ocfs2: fix null-ptr-deref when journal load failed.
    
    commit 5784d9fcfd43bd853654bb80c87ef293b9e8e80a upstream.
    
    During the mounting process, if journal_reset() fails because the journal
    is too short, jbd2_journal_load() fails with a NULL j_sb_buffer.
    Subsequently, ocfs2_journal_shutdown() calls
    jbd2_journal_flush()->jbd2_cleanup_journal_tail()->
    __jbd2_update_log_tail()->jbd2_journal_update_sb_log_tail()
    ->lock_buffer(journal->j_sb_buffer), resulting in a null-pointer
    dereference error.
    
    To resolve this issue, we should check the JBD2_LOADED flag to ensure the
    journal was properly loaded.  Additionally, use journal instead of
    osb->journal directly to simplify the code.
    
    Link: https://syzkaller.appspot.com/bug?extid=05b9b39d8bdfe1a0861f
    Link: https://lkml.kernel.org/r/20240902030844.422725-1-sunjunchao2870@gmail.com
    Fixes: f6f50e28f0cb ("jbd2: Fail to load a journal if it is too short")
    Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
    Reported-by: syzbot+05b9b39d8bdfe1a0861f@syzkaller.appspotmail.com
    Suggested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Cc: Mark Fasheh <mark@fasheh.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Cc: Changwei Ge <gechangwei@live.cn>
    Cc: Gang He <ghe@suse.com>
    Cc: Jun Piao <piaojun@huawei.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate [+ + +]
Author: Lizhi Xu <lizhi.xu@windriver.com>
Date:   Mon Sep 2 10:36:36 2024 +0800

    ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate
    
    commit 33b525cef4cff49e216e4133cc48452e11c0391e upstream.
    
    When doing cleanup, if flags does not include OCFS2_BH_READAHEAD, the
    subsequent ocfs2_set_buffer_uptodate() call may trigger a NULL pointer
    dereference if bh is NULL.
    
    Link: https://lkml.kernel.org/r/20240902023636.1843422-3-joseph.qi@linux.alibaba.com
    Fixes: cf76c78595ca ("ocfs2: don't put and assigning null to bh allocated outside")
    Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
    Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Reported-by: Heming Zhao <heming.zhao@suse.com>
    Suggested-by: Heming Zhao <heming.zhao@suse.com>
    Cc: <stable@vger.kernel.org>    [4.20+]
    Cc: Changwei Ge <gechangwei@live.cn>
    Cc: Gang He <ghe@suse.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Jun Piao <piaojun@huawei.com>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Cc: Mark Fasheh <mark@fasheh.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ocfs2: fix the la space leak when unmounting an ocfs2 volume [+ + +]
Author: Heming Zhao <heming.zhao@suse.com>
Date:   Fri Jul 19 19:43:10 2024 +0800

    ocfs2: fix the la space leak when unmounting an ocfs2 volume
    
    commit dfe6c5692fb525e5e90cefe306ee0dffae13d35f upstream.
    
    This bug has existed since the initial OCFS2 code.  The code logic in
    ocfs2_sync_local_to_main() is wrong, as it ignores the last contiguous
    free bits, which causes an OCFS2 volume to lose the last free clusters of
    LA window on each umount command.
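
    The class of bug described here, a scan that forgets the run still
    open when the loop ends, can be shown with a generic example. This is
    not the ocfs2_sync_local_to_main() code: demo_count_free() and its
    one-byte-per-bit map are hypothetical simplifications, and the
    flush_tail switch exists only to contrast the two behaviours.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the entries that belong to runs of free (zero) entries in
 * map[0..n). flush_tail selects whether the run that is still open when
 * the scan reaches the end is counted as well.
 */
size_t demo_count_free(const unsigned char *map, size_t n, int flush_tail)
{
	size_t run = 0, released = 0, i;

	assert(map != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (map[i] == 0) {
			run++;              /* extend the current free run */
		} else {
			released += run;    /* run ended: hand it back */
			run = 0;
		}
	}
	if (flush_tail)
		released += run;            /* the step a tail-ignoring scan lacks */
	return released;
}
```

    For the map {1, 0, 0, 1, 0, 0, 0}, the scan without the final flush
    accounts for 2 free entries and silently drops the last 3.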
    
    Link: https://lkml.kernel.org/r/20240719114310.14245-1-heming.zhao@suse.com
    Signed-off-by: Heming Zhao <heming.zhao@suse.com>
    Reviewed-by: Su Yue <glass.su@suse.com>
    Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Cc: Mark Fasheh <mark@fasheh.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Cc: Changwei Ge <gechangwei@live.cn>
    Cc: Gang He <ghe@suse.com>
    Cc: Jun Piao <piaojun@huawei.com>
    Cc: Heming Zhao <heming.zhao@suse.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ocfs2: fix uninit-value in ocfs2_get_block() [+ + +]
Author: Joseph Qi <joseph.qi@linux.alibaba.com>
Date:   Wed Sep 25 17:06:00 2024 +0800

    ocfs2: fix uninit-value in ocfs2_get_block()
    
    commit 2af148ef8549a12f8025286b8825c2833ee6bcb8 upstream.
    
    syzbot reported an uninit-value BUG:
    
    BUG: KMSAN: uninit-value in ocfs2_get_block+0xed2/0x2710 fs/ocfs2/aops.c:159
    ocfs2_get_block+0xed2/0x2710 fs/ocfs2/aops.c:159
    do_mpage_readpage+0xc45/0x2780 fs/mpage.c:225
    mpage_readahead+0x43f/0x840 fs/mpage.c:374
    ocfs2_readahead+0x269/0x320 fs/ocfs2/aops.c:381
    read_pages+0x193/0x1110 mm/readahead.c:160
    page_cache_ra_unbounded+0x901/0x9f0 mm/readahead.c:273
    do_page_cache_ra mm/readahead.c:303 [inline]
    force_page_cache_ra+0x3b1/0x4b0 mm/readahead.c:332
    force_page_cache_readahead mm/internal.h:347 [inline]
    generic_fadvise+0x6b0/0xa90 mm/fadvise.c:106
    vfs_fadvise mm/fadvise.c:185 [inline]
    ksys_fadvise64_64 mm/fadvise.c:199 [inline]
    __do_sys_fadvise64 mm/fadvise.c:214 [inline]
    __se_sys_fadvise64 mm/fadvise.c:212 [inline]
    __x64_sys_fadvise64+0x1fb/0x3a0 mm/fadvise.c:212
    x64_sys_call+0xe11/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:222
    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
    do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
    entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    This is because when ocfs2_extent_map_get_blocks() fails, p_blkno is
    uninitialized.  So the error log will trigger the above uninit-value
    access.
    
    The error log is out-of-date since get_blocks() was removed a long time ago.
    And the error code will be logged in ocfs2_extent_map_get_blocks() once
    ocfs2_get_cluster() fails, so fix this by only logging inode and block.
    
    Link: https://syzkaller.appspot.com/bug?extid=9709e73bae885b05314b
    Link: https://lkml.kernel.org/r/20240925090600.3643376-1-joseph.qi@linux.alibaba.com
    Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
    Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Reported-by: syzbot+9709e73bae885b05314b@syzkaller.appspotmail.com
    Tested-by: syzbot+9709e73bae885b05314b@syzkaller.appspotmail.com
    Cc: Heming Zhao <heming.zhao@suse.com>
    Cc: Mark Fasheh <mark@fasheh.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Cc: Changwei Ge <gechangwei@live.cn>
    Cc: Gang He <ghe@suse.com>
    Cc: Jun Piao <piaojun@huawei.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ocfs2: remove unreasonable unlock in ocfs2_read_blocks [+ + +]
Author: Lizhi Xu <lizhi.xu@windriver.com>
Date:   Mon Sep 2 10:36:35 2024 +0800

    ocfs2: remove unreasonable unlock in ocfs2_read_blocks
    
    commit c03a82b4a0c935774afa01fd6d128b444fd930a1 upstream.
    
    Patch series "Misc fixes for ocfs2_read_blocks", v5.
    
    This series contains 2 fixes for ocfs2_read_blocks().  The first patch fixes
    the issue reported by syzbot, which detects bad unlock balance in
    ocfs2_read_blocks().  The second patch fixes an issue reported by Heming
    Zhao when reviewing the above fix.
    
    
    This patch (of 2):
    
    There was a lock release before exiting, so remove the unreasonable unlock.
    
    Link: https://lkml.kernel.org/r/20240902023636.1843422-1-joseph.qi@linux.alibaba.com
    Link: https://lkml.kernel.org/r/20240902023636.1843422-2-joseph.qi@linux.alibaba.com
    Fixes: cf76c78595ca ("ocfs2: don't put and assigning null to bh allocated outside")
    Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
    Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Reviewed-by: Heming Zhao <heming.zhao@suse.com>
    Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Reported-by: syzbot+ab134185af9ef88dfed5@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=ab134185af9ef88dfed5
    Tested-by: syzbot+ab134185af9ef88dfed5@syzkaller.appspotmail.com
    Cc: Mark Fasheh <mark@fasheh.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Cc: Changwei Ge <gechangwei@live.cn>
    Cc: Gang He <ghe@suse.com>
    Cc: Jun Piao <piaojun@huawei.com>
    Cc: <stable@vger.kernel.org>    [4.20+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ocfs2: reserve space for inline xattr before attaching reflink tree [+ + +]
Author: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
Date:   Wed Sep 18 06:38:44 2024 +0000

    ocfs2: reserve space for inline xattr before attaching reflink tree
    
    commit 5ca60b86f57a4d9648f68418a725b3a7de2816b0 upstream.
    
    One of our customers reported a crash and a corrupted ocfs2 filesystem.
    The crash was due to the detection of corruption.  Upon troubleshooting,
    the fsck -fn output showed the following corruption:
    
    [EXTENT_LIST_FREE] Extent list in owner 33080590 claims 230 as the next free chain record,
    but fsck believes the largest valid value is 227.  Clamp the next record value? n
    
    The stat output from the debugfs.ocfs2 showed the following corruption
    where the "Next Free Rec:" had overshot the "Count:" in the root metadata
    block.
    
            Inode: 33080590   Mode: 0640   Generation: 2619713622 (0x9c25a856)
            FS Generation: 904309833 (0x35e6ac49)
            CRC32: 00000000   ECC: 0000
            Type: Regular   Attr: 0x0   Flags: Valid
            Dynamic Features: (0x16) HasXattr InlineXattr Refcounted
            Extended Attributes Block: 0  Extended Attributes Inline Size: 256
            User: 0 (root)   Group: 0 (root)   Size: 281320357888
            Links: 1   Clusters: 141738
            ctime: 0x66911b56 0x316edcb8 -- Fri Jul 12 06:02:30.829349048 2024
            atime: 0x66911d6b 0x7f7a28d -- Fri Jul 12 06:11:23.133669517 2024
            mtime: 0x66911b56 0x12ed75d7 -- Fri Jul 12 06:02:30.317552087 2024
            dtime: 0x0 -- Wed Dec 31 17:00:00 1969
            Refcount Block: 2777346
            Last Extblk: 2886943   Orphan Slot: 0
            Sub Alloc Slot: 0   Sub Alloc Bit: 14
            Tree Depth: 1   Count: 227   Next Free Rec: 230
            ## Offset        Clusters       Block#
            0  0             2310           2776351
            1  2310          2139           2777375
            2  4449          1221           2778399
            3  5670          731            2779423
            4  6401          566            2780447
            .......          ....           .......
            .......          ....           .......
    
    The issue was in the reflink workflow while reserving space for inline
    xattr.  The problematic function is ocfs2_reflink_xattr_inline().  By the
    time this function is called the reflink tree is already recreated at the
    destination inode from the source inode.  At this point, this function
    reserves space for inline xattrs at the destination inode without even
    checking if there is space at the root metadata block.  It simply reduces
    the l_count from 243 to 227 thereby making space of 256 bytes for inline
    xattr whereas the inode already has extents beyond this index (in this
    case up to 230), thereby causing corruption.
    
    The fix for this is to reserve space for inline metadata at the destination
    inode before the reflink tree gets recreated. The customer has verified the
    fix.
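
    The arithmetic behind the corruption can be worked through in a few
    lines. The sketch assumes a record size of 16 bytes, which is inferred
    from the numbers above (243 - 227 = 16 records for 256 bytes); the
    demo_*() functions are hypothetical. The actual fix reorders the
    reservation rather than adding a check like this one.

```c
#include <assert.h>

#define DEMO_REC_SIZE 16   /* assumption: 256 bytes / (243 - 227) records */

/* Extent records that must be given up to free inline_bytes of space. */
int demo_recs_for_inline(int inline_bytes)
{
	assert(inline_bytes >= 0);
	return (inline_bytes + DEMO_REC_SIZE - 1) / DEMO_REC_SIZE;
}

/*
 * Returns the reduced l_count, or -1 if records already in use
 * (next_free_rec) would fall outside the shrunken extent list.
 */
int demo_reserve_inline(int l_count, int next_free_rec, int inline_bytes)
{
	int new_count = l_count - demo_recs_for_inline(inline_bytes);

	if (new_count < 0 || next_free_rec > new_count)
		return -1;
	return new_count;
}
```

    With the values from the report, shrinking a 243-record list by 256
    bytes gives 227, which cannot hold a tree that already uses 230
    records. Reserving on an empty list first yields 227 with no conflict.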
    
    Link: https://lkml.kernel.org/r/20240918063844.1830332-1-gautham.ananthakrishna@oracle.com
    Fixes: ef962df057aa ("ocfs2: xattr: fix inlined xattr reflink")
    Signed-off-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
    Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Cc: Mark Fasheh <mark@fasheh.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Cc: Changwei Ge <gechangwei@live.cn>
    Cc: Gang He <ghe@suse.com>
    Cc: Jun Piao <piaojun@huawei.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
of/irq: Refer to actual buffer size in of_irq_parse_one() [+ + +]
Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date:   Tue Aug 20 14:16:53 2024 +0200

    of/irq: Refer to actual buffer size in of_irq_parse_one()
    
    [ Upstream commit 39ab331ab5d377a18fbf5a0e0b228205edfcc7f4 ]
    
    Replace two open-coded calculations of the buffer size by invocations of
    sizeof() on the buffer itself, to make sure the code will always use the
    actual buffer size.
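
    The idiom can be illustrated with a userspace sketch. The names
    struct demo_buf, DEMO_MAX_CELLS and demo_fill() are hypothetical and
    do not come from of_irq_parse_one(); the point is only that the bound
    is taken from the buffer with sizeof() instead of being recomputed
    from a constant.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_CELLS 16   /* hypothetical capacity */

struct demo_buf {
	uint32_t cells[DEMO_MAX_CELLS];
};

/*
 * Copy n_cells values into the buffer. Returns the number of bytes
 * copied, or 0 if the data does not fit.
 */
size_t demo_fill(struct demo_buf *b, const uint32_t *src, size_t n_cells)
{
	size_t bytes = n_cells * sizeof(*src);

	assert(b != NULL && src != NULL);
	if (bytes > sizeof(b->cells))       /* bound comes from the buffer itself */
		return 0;
	memcpy(b->cells, src, bytes);
	return bytes;
}
```

    If the array is later resized, the check follows automatically, which
    is the property the open-coded calculation cannot guarantee.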
    
    Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Link: https://lore.kernel.org/r/817c0b9626fd30790fc488c472a3398324cfcc0c.1724156125.git.geert+renesas@glider.be
    Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

of/irq: Support #msi-cells=<0> in of_msi_get_domain [+ + +]
Author: Andrew Jones <ajones@ventanamicro.com>
Date:   Sat Aug 17 09:41:08 2024 +0200

    of/irq: Support #msi-cells=<0> in of_msi_get_domain
    
    commit db8e81132cf051843c9a59b46fa5a071c45baeb3 upstream.
    
    An 'msi-parent' property with a single entry and no accompanying
    '#msi-cells' property is considered the legacy definition as opposed
    to its definition after being expanded with commit 126b16e2ad98
    ("Docs: dt: add generic MSI bindings"). However, the legacy
    definition is completely compatible with the current definition and,
    since of_phandle_iterator_next() tolerates missing and present-but-
    zero *cells properties since commit e42ee61017f5 ("of: Let
    of_for_each_phandle fallback to non-negative cell_count"), there's no
    need anymore to special case the legacy definition in
    of_msi_get_domain().
    
    Indeed, special casing has turned out to be harmful, because, as of
    commit 7c025238b47a ("dt-bindings: irqchip: Describe the IMX MU block
    as a MSI controller"), MSI controller DT bindings have started
    specifying '#msi-cells' as a required property (even when the value
    must be zero) as an effort to make the bindings more explicit. But,
    since the special casing of 'msi-parent' only uses the existence of
    '#msi-cells' for its heuristic, and not whether or not it's also
    nonzero, the legacy path is not taken. Furthermore, the path to
    support the new, broader definition isn't taken either since that
    path has been restricted to the platform-msi bus.
    
    But, neither the definition of 'msi-parent' nor the definition of
    '#msi-cells' is platform-msi-specific (the platform-msi bus was just
    the first bus that needed '#msi-cells'), so remove both the special
    casing and the restriction. The code removal also requires changing
    to of_parse_phandle_with_optional_args() in order to ensure the
    legacy (but compatible) use of 'msi-parent' remains supported. This
    not only simplifies the code but also resolves an issue with PCI
    devices finding their MSI controllers on riscv, as the riscv,imsics
    binding requires '#msi-cells=<0>'.
    
    Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
    Link: https://lore.kernel.org/r/20240817074107.31153-2-ajones@ventanamicro.com
    Cc: stable@vger.kernel.org
    Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
of: address: Report error on resource bounds overflow [+ + +]
Author: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Date:   Thu Sep 5 09:46:01 2024 +0200

    of: address: Report error on resource bounds overflow
    
    commit 000f6d588a8f3d128f89351058dc04d38e54a327 upstream.
    
    The members "start" and "end" of struct resource are of type
    "resource_size_t" which can be 32bit wide.
    Values read from OF however are always 64bit wide.
    Avoid silently truncating the value and instead return an error value.
    
    This can happen on real systems when the DT was created for a
    PAE-enabled kernel and a non-PAE kernel is actually running.
    For example with an arm defconfig and "qemu-system-arm -M virt".
    
    Link: https://bugs.launchpad.net/qemu/+bug/1790975
    Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
    Tested-by: Nam Cao <namcao@linutronix.de>
    Reviewed-by: Nam Cao <namcao@linutronix.de>
    Link: https://lore.kernel.org/r/20240905-of-resource-overflow-v1-1-0cd8bb92cc1f@linutronix.de
    Cc: stable@vger.kernel.org
    Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
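
A minimal userspace sketch of the idea, assuming a 32-bit resource type: a 64-bit start and size are converted into an inclusive 32-bit range, and an error is returned when a bound does not fit. The names res32_t, struct range32 and range32_from_u64(), and the choice of error codes, are assumptions of this sketch rather than the kernel's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 32-bit resource_size_t; the name is hypothetical. */
typedef uint32_t res32_t;

struct range32 {
	res32_t start;
	res32_t end; /* inclusive */
};

/*
 * Convert a 64-bit (start, size) pair into a 32-bit inclusive range.
 * Returns 0 on success, -EINVAL for an empty range, and -EOVERFLOW when
 * a bound does not fit, instead of silently truncating it.
 */
int range32_from_u64(uint64_t start, uint64_t size, struct range32 *out)
{
	uint64_t end;

	assert(out != NULL);
	if (size == 0)
		return -EINVAL;
	if (start > UINT64_MAX - (size - 1))
		return -EOVERFLOW; /* start + size - 1 would wrap in 64 bits */
	end = start + size - 1;
	if (start > UINT32_MAX || end > UINT32_MAX)
		return -EOVERFLOW; /* does not fit the 32-bit resource type */
	out->start = (res32_t)start;
	out->end = (res32_t)end;
	return 0;
}
```

A region starting at or crossing the 4 GiB boundary is rejected here, whereas a plain cast would have produced a small, wrong address.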

 
ovl: fail if trusted xattrs are needed but caller lacks permission [+ + +]
Author: Mike Baynton <mike@mbaynton.com>
Date:   Wed Jul 10 22:52:04 2024 -0500

    ovl: fail if trusted xattrs are needed but caller lacks permission
    
    commit 6c4a5f96450415735c31ed70ff354f0ee5cbf67b upstream.
    
    Some overlayfs features require permission to read/write trusted.*
    xattrs. These include redirect_dir, verity, metacopy, and data-only
    layers. This patch adds additional validations at mount time to stop
    overlays from mounting in certain cases where the resulting mount would
    not function according to the user's expectations because they lack
    permission to access trusted.* xattrs (for example, not global root.)
    
    Similar checks in ovl_make_workdir() that disable features instead of
    failing are still relevant and used in cases where the resulting mount
    can still work "reasonably well." Generally, if the feature was enabled
    through kernel config or module option, any mount that worked before
    will still work the same; this applies to redirect_dir and metacopy. The
    user must explicitly request these features in order to generate a mount
    failure. Verity and data-only layers on the other hand must be explicitly
    requested and have no "reasonable" disabled or degraded alternative, so
    mounts attempting either always fail.
    
    "lower data-only dirs require metacopy support" moved down in case
    userxattr is set, which disables metacopy.
    
    Cc: stable@vger.kernel.org # v6.6+
    Signed-off-by: Mike Baynton <mike@mbaynton.com>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ovl: fsync after metadata copy-up [+ + +]
Author: Amir Goldstein <amir73il@gmail.com>
Date:   Thu Aug 29 17:51:08 2024 +0200

    ovl: fsync after metadata copy-up
    
    [ Upstream commit 7d6899fb69d25e1bc6f4700b7c1d92e6b608593d ]
    
    For upper filesystems which do not use strict ordering of persisting
    metadata changes (e.g. ubifs), when overlayfs file is modified for
    the first time, copy up will create a copy of the lower file and
    its parent directories in the upper layer. Loss of permissions on the
    new upper parent directory was observed during a power-cut stress test.
    
    Fix by moving the fsync call to after the metadata copy, to make sure
    that the metadata of the copied-up directories and files persists to disk
    before the rename from tmp to the final destination.
    
    With metacopy enabled, this change will hurt performance of workloads
    such as chown -R, so we keep the legacy behavior of fsync only on copyup
    of data.
    
    Link: https://lore.kernel.org/linux-unionfs/CAOQ4uxj-pOvmw1-uXR3qVdqtLjSkwcR9nVKcNU_vC10Zyf2miQ@mail.gmail.com/
    Reported-and-tested-by: Fei Lv <feilv@asrmicro.com>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
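
The ordering described above (copy data, set metadata, fsync, then rename) has a familiar userspace analogue. The sketch below is that analogue only, not overlayfs code; publish_file() and its parameters are hypothetical names.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Write data and mode to a temporary file, fsync it, and only then rename
 * it to its final name. Returns 0 on success, -1 on failure.
 */
int publish_file(const char *tmp, const char *final,
		 const void *data, size_t len, mode_t mode)
{
	int fd;

	assert(tmp && final && (data || len == 0));
	fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;
	if (write(fd, data, len) != (ssize_t)len || /* data */
	    fchmod(fd, mode) != 0 ||                /* metadata */
	    fsync(fd) != 0) {                       /* persist both before rename */
		close(fd);
		unlink(tmp);
		return -1;
	}
	if (close(fd) != 0 || rename(tmp, final) != 0) {
		unlink(tmp);
		return -1;
	}
	return 0;
}
```

Calling fsync() before fchmod() instead would leave a window in which the rename could reach disk while the final mode has not, which resembles the permission loss reported above.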

 
parisc: Allow mmap(MAP_STACK) memory to automatically expand upwards [+ + +]
Author: Helge Deller <deller@kernel.org>
Date:   Sun Sep 8 20:51:17 2024 +0200

    parisc: Allow mmap(MAP_STACK) memory to automatically expand upwards
    
    commit 5d698966fa7b452035c44c937d704910bf3440dd upstream.
    
    When userspace allocates memory with mmap() in order to be used for stack,
    allow this memory region to automatically expand upwards up until the
    current maximum process stack size.
    The fault handler checks if the VM_GROWSUP bit is set in the vm_flags field
    of a memory area before it allows it to expand.
    This patch modifies the parisc specific code only.
    An RFC for a generic patch to modify mmap() for all architectures was sent
    to the mailing list but did not get enough Acks.
    
    Reported-by: Camm Maguire <camm@maguirefamily.org>
    Signed-off-by: Helge Deller <deller@gmx.de>
    Cc: stable@vger.kernel.org      # v5.10+
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
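
From userspace, the allocation this commit refers to looks roughly like the sketch below. alloc_stack_region() is a hypothetical helper; the code only shows how the MAP_STACK flag is passed to mmap(), and how the kernel treats the flag remains architecture-specific.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map an anonymous region flagged as stack memory; returns NULL on failure. */
void *alloc_stack_region(size_t len)
{
	void *p;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}
```

The caller releases the region with munmap() using the same length.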

parisc: Fix 64-bit userspace syscall path [+ + +]
Author: Helge Deller <deller@kernel.org>
Date:   Sun Sep 8 00:40:38 2024 +0200

    parisc: Fix 64-bit userspace syscall path
    
    commit d24449864da5838936669618356b0e30ca2999c3 upstream.
    
    Currently the glibc isn't yet ported to 64-bit for hppa, so
    there is no usable userspace available yet.
    But it's possible to manually build a static 64-bit binary
    and run that for testing. One such 64-bit test program is
    available at http://ftp.parisc-linux.org/src/64bit.tar.gz
    and it shows various issues with the existing 64-bit syscall
    path in the kernel.
    This patch fixes those issues.
    
    Signed-off-by: Helge Deller <deller@gmx.de>
    Cc: stable@vger.kernel.org      # v4.19+
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

parisc: Fix itlb miss handler for 64-bit programs [+ + +]
Author: Helge Deller <deller@gmx.de>
Date:   Tue Sep 10 18:32:24 2024 +0200

    parisc: Fix itlb miss handler for 64-bit programs
    
    commit 9542130937e9dc707dd7c6b7af73326437da2d50 upstream.
    
    For an itlb miss when executing code above 4 GB on ILP64 adjust the
    iasq/iaoq in the same way isr/ior was adjusted.  This fixes signal
    delivery for the 64-bit static test program from
    http://ftp.parisc-linux.org/src/64bit.tar.gz.  Note that signals are
    handled by the signal trampoline code in the 64-bit VDSO which is mapped
    into high userspace memory region above 4GB for 64-bit processes.
    
    Signed-off-by: Helge Deller <deller@gmx.de>
    Cc: stable@vger.kernel.org      # v4.19+
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

parisc: Fix stack start for ADDR_NO_RANDOMIZE personality [+ + +]
Author: Helge Deller <deller@gmx.de>
Date:   Sat Sep 7 18:28:11 2024 +0200

    parisc: Fix stack start for ADDR_NO_RANDOMIZE personality
    
    commit f31b256994acec6929306dfa86ac29716e7503d6 upstream.
    
    Fix the stack start address calculation for the parisc architecture in
    setup_arg_pages() when address randomization is disabled. When the
    ADDR_NO_RANDOMIZE process personality is set there is no need to add
    additional space for the stack.
    Note that this patch touches code inside an #ifdef CONFIG_STACK_GROWSUP hunk,
    which is why only the parisc architecture is affected since it's the
    only Linux architecture where the stack grows upwards.
    
    Without this patch you will find the stack in the middle of some
    mapped libraries and suddenly limited to 6MB instead of 8MB:
    
    root@parisc:~# setarch -R /bin/bash -c "cat /proc/self/maps"
    00010000-00019000 r-xp 00000000 08:05 1182034           /usr/bin/cat
    00019000-0001a000 rwxp 00009000 08:05 1182034           /usr/bin/cat
    0001a000-0003b000 rwxp 00000000 00:00 0                 [heap]
    f90c4000-f9283000 r-xp 00000000 08:05 1573004           /usr/lib/hppa-linux-gnu/libc.so.6
    f9283000-f9285000 r--p 001bf000 08:05 1573004           /usr/lib/hppa-linux-gnu/libc.so.6
    f9285000-f928a000 rwxp 001c1000 08:05 1573004           /usr/lib/hppa-linux-gnu/libc.so.6
    f928a000-f9294000 rwxp 00000000 00:00 0
    f9301000-f9323000 rwxp 00000000 00:00 0                 [stack]
    f98b4000-f98e4000 r-xp 00000000 08:05 1572869           /usr/lib/hppa-linux-gnu/ld.so.1
    f98e4000-f98e5000 r--p 00030000 08:05 1572869           /usr/lib/hppa-linux-gnu/ld.so.1
    f98e5000-f98e9000 rwxp 00031000 08:05 1572869           /usr/lib/hppa-linux-gnu/ld.so.1
    f9ad8000-f9b00000 rw-p 00000000 00:00 0
    f9b00000-f9b01000 r-xp 00000000 00:00 0                 [vdso]
    
    With the patch the stack gets correctly mapped at the end
    of the process memory map:
    
    root@panama:~# setarch -R /bin/bash -c "cat /proc/self/maps"
    00010000-00019000 r-xp 00000000 08:13 16385582          /usr/bin/cat
    00019000-0001a000 rwxp 00009000 08:13 16385582          /usr/bin/cat
    0001a000-0003b000 rwxp 00000000 00:00 0                 [heap]
    fef29000-ff0eb000 r-xp 00000000 08:13 16122400          /usr/lib/hppa-linux-gnu/libc.so.6
    ff0eb000-ff0ed000 r--p 001c2000 08:13 16122400          /usr/lib/hppa-linux-gnu/libc.so.6
    ff0ed000-ff0f2000 rwxp 001c4000 08:13 16122400          /usr/lib/hppa-linux-gnu/libc.so.6
    ff0f2000-ff0fc000 rwxp 00000000 00:00 0
    ff4b4000-ff4e4000 r-xp 00000000 08:13 16121913          /usr/lib/hppa-linux-gnu/ld.so.1
    ff4e4000-ff4e6000 r--p 00030000 08:13 16121913          /usr/lib/hppa-linux-gnu/ld.so.1
    ff4e6000-ff4ea000 rwxp 00032000 08:13 16121913          /usr/lib/hppa-linux-gnu/ld.so.1
    ff6d7000-ff6ff000 rw-p 00000000 00:00 0
    ff6ff000-ff700000 r-xp 00000000 00:00 0                 [vdso]
    ff700000-ff722000 rwxp 00000000 00:00 0                 [stack]
    
    Reported-by: Camm Maguire <camm@maguirefamily.org>
    Signed-off-by: Helge Deller <deller@gmx.de>
    Fixes: d045c77c1a69 ("parisc,metag: Fix crashes due to stack randomization on stack-grows-upwards architectures")
    Fixes: 17d9822d4b4c ("parisc: Consider stack randomization for mmap base only when necessary")
    Cc: stable@vger.kernel.org      # v5.2+
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
perf callchain: Fix stitch LBR memory leaks [+ + +]
Author: Ian Rogers <irogers@google.com>
Date:   Wed Aug 7 22:46:43 2024 -0700

    perf callchain: Fix stitch LBR memory leaks
    
    [ Upstream commit 599c19397b17d197fc1184bbc950f163a292efc9 ]
    
    The 'struct callchain_cursor_node' has a 'struct map_symbol' whose maps
    and map members are reference counted. Ensure these values use a _get
    routine to increment the reference counts and use map_symbol__exit() to
    release the reference counts.
    
    Do similar for 'struct thread's prev_lbr_cursor, but save the size of
    the prev_lbr_cursor array so that it may be iterated.
    
    Ensure that when stitch_nodes are placed on the free list the
    map_symbols are exited.
    
    Fix resolve_lbr_callchain_sample() by replacing list_replace_init() with
    list_splice_init(), so the whole list is moved and nodes aren't leaked.
    
    A reproduction of the memory leaks is possible with a leak sanitizer
    build in the perf report command of:
    
      ```
      $ perf record -e cycles --call-graph lbr perf test -w thloop
      $ perf report --stitch-lbr
      ```
    
    Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
    Fixes: ff165628d72644e3 ("perf callchain: Stitch LBR call stack")
    Signed-off-by: Ian Rogers <irogers@google.com>
    [ Basic tests after applying the patch, repeating the example above ]
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Anne Macedo <retpolanne@posteo.net>
    Cc: Changbin Du <changbin.du@huawei.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: https://lore.kernel.org/r/20240808054644.1286065-1-irogers@google.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
perf hist: Update hist symbol when updating maps [+ + +]
Author: Matt Fleming <matt@readmodwrite.com>
Date:   Thu Aug 15 15:22:12 2024 +0100

    perf hist: Update hist symbol when updating maps
    
    commit ac01c8c4246546fd8340a232f3ada1921dc0ee48 upstream.
    
    AddressSanitizer found a use-after-free bug in the symbol code which
    manifested as 'perf top' segfaulting.
    
      ==1238389==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b00c48844b at pc 0x5650d8035961 bp 0x7f751aaecc90 sp 0x7f751aaecc80
      READ of size 1 at 0x60b00c48844b thread T193
          #0 0x5650d8035960 in _sort__sym_cmp util/sort.c:310
          #1 0x5650d8043744 in hist_entry__cmp util/hist.c:1286
          #2 0x5650d8043951 in hists__findnew_entry util/hist.c:614
          #3 0x5650d804568f in __hists__add_entry util/hist.c:754
          #4 0x5650d8045bf9 in hists__add_entry util/hist.c:772
          #5 0x5650d8045df1 in iter_add_single_normal_entry util/hist.c:997
          #6 0x5650d8043326 in hist_entry_iter__add util/hist.c:1242
          #7 0x5650d7ceeefe in perf_event__process_sample /home/matt/src/linux/tools/perf/builtin-top.c:845
          #8 0x5650d7ceeefe in deliver_event /home/matt/src/linux/tools/perf/builtin-top.c:1208
          #9 0x5650d7fdb51b in do_flush util/ordered-events.c:245
          #10 0x5650d7fdb51b in __ordered_events__flush util/ordered-events.c:324
          #11 0x5650d7ced743 in process_thread /home/matt/src/linux/tools/perf/builtin-top.c:1120
          #12 0x7f757ef1f133 in start_thread nptl/pthread_create.c:442
          #13 0x7f757ef9f7db in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
    
    When updating hist maps it's also necessary to update the hist symbol
    reference because the old one gets freed in map__put().
    
    While this bug was probably introduced with 5c24b67aae72f54c ("perf
    tools: Replace map->referenced & maps->removed_maps with map->refcnt"),
    the symbol objects were leaked until c087e9480cf33672 ("perf machine:
    Fix refcount usage when processing PERF_RECORD_KSYMBOL") was merged so
    the bug was masked.
    
    Fixes: c087e9480cf33672 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL")
    Reported-by: Yunzhao Li <yunzhao@cloudflare.com>
    Signed-off-by: Matt Fleming (Cloudflare) <matt@readmodwrite.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: kernel-team@cloudflare.com
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Riccardo Mancini <rickyman7@gmail.com>
    Cc: stable@vger.kernel.org # v5.13+
    Link: https://lore.kernel.org/r/20240815142212.3834625-1-matt@readmodwrite.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
perf python: Allow checking for the existence of warning options in clang [+ + +]
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Thu Aug 22 14:13:49 2024 -0300

    perf python: Allow checking for the existence of warning options in clang
    
    commit b81162302001f41157f6e93654aaccc30e817e2a upstream.
    
    We'll need to check if a warning option introduced in clang 19 is
    available on the clang version being used, so cover the error message
    emitted when testing for a -W option.
    
    Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Nathan Chancellor <nathan@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: https://lore.kernel.org/lkml/CA+icZUVtHn8X1Tb_Y__c-WswsO0K8U9uy3r2MzKXwTA5THtL7w@mail.gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf python: Disable -Wno-cast-function-type-mismatch if present on clang [+ + +]
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Thu Aug 22 14:13:49 2024 -0300

    perf python: Disable -Wno-cast-function-type-mismatch if present on clang
    
    commit 00dc514612fe98cfa117193b9df28f15e7c9db9c upstream.
    
    The -Wcast-function-type-mismatch option was introduced in clang 19 and
    it's enabled by default. Since we use -Werror, and the python bindings do
    casts that are valid but trip this warning, disable it if present.
    
    Closes: https://lore.kernel.org/all/CA+icZUXoJ6BS3GMhJHV3aZWyb5Cz2haFneX0C5pUMUUhG-UVKQ@mail.gmail.com
    Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
    Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Nathan Chancellor <nathan@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: stable@vger.kernel.org # To allow building with the upcoming clang 19
    Link: https://lore.kernel.org/lkml/CA+icZUVtHn8X1Tb_Y__c-WswsO0K8U9uy3r2MzKXwTA5THtL7w@mail.gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
perf report: Fix segfault when 'sym' sort key is not used [+ + +]
Author: Namhyung Kim <namhyung@kernel.org>
Date:   Mon Aug 26 15:10:42 2024 -0700

    perf report: Fix segfault when 'sym' sort key is not used
    
    commit 9af2efee41b27a0f386fb5aa95d8d0b4b5d9fede upstream.
    
    The fields in the hist_entry are filled on-demand which means they only
    have meaningful values when relevant sort keys are used.
    
    So if neither the 'dso' nor the 'sym' sort key is used, the map/symbols
    in the hist entry can be garbage, and the code shouldn't access them
    unconditionally.
    
    I got a segfault, when I wanted to see cgroup profiles.
    
      $ sudo perf record -a --all-cgroups --synth=cgroup true
    
      $ sudo perf report -s cgroup
    
      Program received signal SIGSEGV, Segmentation fault.
      0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
      48            return RC_CHK_ACCESS(map)->dso;
      (gdb) bt
      #0  0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
      #1  0x00005555557aa39b in map__load (map=0x0) at util/map.c:344
      #2  0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385
      #3  0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true)
          at util/hist.c:644
      #4  0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
          block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761
      #5  0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
          sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779
      #6  0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015
      #7  0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0)
          at util/hist.c:1260
      #8  0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0,
          machine=0x5555560388e8) at builtin-report.c:334
      #9  0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128,
          sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232
      #10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128,
          sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271
      #11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0,
          file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354
      #12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132
      #13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245
      #14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324
      #15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342
      #16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60)
          at util/session.c:780
      #17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688,
          file_path=0x555556038ff0 "perf.data") at util/session.c:1406
    
    As you can see, entry->ms.map was NULL even though he->ms.map has a
    value.  This is because the 'sym' sort key is not given, so the code
    cannot assume that he->ms.sym and entry->ms.sym are the same.  I only
    checked the 'sym' sort key here as it implies 'dso' behavior (so maps
    are the same).
    
    Fixes: ac01c8c4246546fd ("perf hist: Update hist symbol when updating maps")
    Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Kan Liang <kan.liang@linux.intel.com>
    Cc: Matt Fleming <matt@readmodwrite.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Link: https://lore.kernel.org/r/20240826221045.1202305-2-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
perf,x86: avoid missing caller address in stack traces captured in uprobe [+ + +]
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Mon Jul 29 10:52:23 2024 -0700

    perf,x86: avoid missing caller address in stack traces captured in uprobe
    
    [ Upstream commit cfa7f3d2c526c224a6271cc78a4a27a0de06f4f0 ]
    
    When tracing user functions with uprobe functionality, it's common to
    install the probe (e.g., a BPF program) at the first instruction of the
    function. This is often going to be `push %rbp` instruction in function
    preamble, which means that within that function frame pointer hasn't
    been established yet. This leads to consistently missing an actual
    caller of the traced function, because perf_callchain_user() only
    records current IP (capturing traced function) and then following frame
    pointer chain (which would be caller's frame, containing the address of
    caller's caller).
    
    So when we have target_1 -> target_2 -> target_3 call chain and we are
    tracing an entry to target_3, captured stack trace will report
    target_1 -> target_3 call chain, which is wrong and confusing.
    
    This patch proposes an x86-64-specific heuristic to detect a `push %rbp`
    (`push %ebp` on 32-bit architecture) instruction being traced. Given
    that the entire kernel implementation of user space stack trace capturing
    works under the assumption that user space code was compiled with frame
    pointer register (%rbp/%ebp) preservation, it seems pretty reasonable to
    use this instruction as a strong indicator that this is the entry to the
    function. In that case, the return address is still pointed to by
    %rsp/%esp, so we fetch it and add it to the stack trace before
    proceeding to unwind the rest using frame pointer-based logic.
    
    We also check for `endbr64` (for 64-bit modes) as another common pattern
    for function entry, as suggested by Josh Poimboeuf. Even if we get this
    wrong sometimes for uprobes attached not at the function entry, it's OK
    because the stack trace will still be meaningful overall, just with one
    extra bogus entry. If we don't detect this, the caller's entry is
    guaranteed to be missing from the stack trace, which is worse overall.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/20240729175223.23914-1-andrii@kernel.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>
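
The byte-level check behind this heuristic can be sketched in userspace C. The encodings are the standard x86 ones (`push %rbp` / `push %ebp` is the single byte 0x55, `endbr64` is f3 0f 1e fa); the helper name looks_like_function_entry() is hypothetical and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x86 encoding of `endbr64`. */
static const uint8_t ENDBR64[] = { 0xf3, 0x0f, 0x1e, 0xfa };

/*
 * Hypothetical helper: do the bytes at the probed address look like the
 * first instruction of a function (`push %rbp` or `endbr64`)?
 */
bool looks_like_function_entry(const uint8_t *insn, size_t len)
{
	assert(insn != NULL);
	if (len >= 1 && insn[0] == 0x55) /* push %rbp / push %ebp */
		return true;
	if (len >= sizeof(ENDBR64) &&
	    memcmp(insn, ENDBR64, sizeof(ENDBR64)) == 0)
		return true;
	return false;
}
```

When the check succeeds, the frame pointer has not been set up yet, so the return address is read from the top of the stack rather than from the frame pointer chain.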

 
perf/core: Fix small negative period being ignored [+ + +]
Author: Luo Gengkun <luogengkun@huaweicloud.com>
Date:   Sat Aug 31 07:43:15 2024 +0000

    perf/core: Fix small negative period being ignored
    
    commit 62c0b1061593d7012292f781f11145b2d46f43ab upstream.
    
    In perf_adjust_period, we will first calculate period, and then use
    this period to calculate delta. However, when delta is less than 0,
    there will be a deviation compared to when delta is greater than or
    equal to 0. For example, when delta is in the range of [-14,-1], the
    range of delta = delta + 7 is between [-7,6], so the final value of
    delta/8 is 0. Therefore, the impact of -1 and -2 will be ignored.
    This is unacceptable when the target period is very short, because
    we will lose a lot of samples.
    
    Here are some tests and analyses:
    before:
      # perf record -e cs -F 1000  ./a.out
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.022 MB perf.data (518 samples) ]
    
      # perf script
      ...
      a.out     396   257.956048:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.957891:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.959730:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.961545:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.963355:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.965163:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.966973:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.968785:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.970593:         23 cs:  ffffffff81f4eeec schedul>
      ...
    
    after:
      # perf record -e cs -F 1000  ./a.out
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.058 MB perf.data (1466 samples) ]
    
      # perf script
      ...
      a.out     395    59.338813:         11 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.339707:         12 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.340682:         13 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.341751:         13 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.342799:         12 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.343765:         11 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.344651:         11 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.345539:         12 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.346502:         13 cs:  ffffffff81f4eeec schedul>
      ...
    
    test.c
    
    #include <unistd.h>
    
    int main() {
            for (int i = 0; i < 20000; i++)
                    usleep(10);
    
            return 0;
    }
    
      # time ./a.out
      real    0m1.583s
      user    0m0.040s
      sys     0m0.298s
    
    The above results were tested on x86-64 qemu with KVM enabled using
    test.c as test program. Ideally, we should have around 1500 samples,
    but the previous algorithm had only about 500, whereas the modified
    algorithm now has about 1400. Furthermore, the new version shows 1
    sample per 0.001s, while the previous one is 1 sample per 0.002s. This
    indicates that the new algorithm is more sensitive to small negative
    values than the old algorithm.
    
    Fixes: bd2b5b12849a ("perf_counter: More aggressive frequency adjustment")
    Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
    Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20240831074316.2106159-2-luogengkun@huaweicloud.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
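
The rounding asymmetry described in this commit can be reproduced with plain integer arithmetic. In the sketch below, filter_add7() mirrors the "delta + 7, then divide by 8" step from the description. filter_symmetric() is one possible symmetric alternative, given here as an assumption for illustration and not necessarily the exact change that was merged.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding as described above: always add 7, then divide by 8. */
int64_t filter_add7(int64_t delta)
{
	assert(delta <= INT64_MAX - 7);
	return (delta + 7) / 8; /* C division truncates toward zero */
}

/* Assumed symmetric variant: round away from zero on both sides. */
int64_t filter_symmetric(int64_t delta)
{
	assert(delta <= INT64_MAX - 7 && delta >= INT64_MIN + 7);
	if (delta >= 0)
		delta += 7;
	else
		delta -= 7;
	return delta / 8;
}
```

With filter_add7(), every delta in [-14, -1] yields 0, so small negative corrections are dropped, while a delta of 1 yields 1. With filter_symmetric(), a delta of -1 yields -1 and a delta of 1 yields 1.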

 
perf: Fix event_function_call() locking [+ + +]
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Wed Aug 7 13:29:27 2024 +0200

    perf: Fix event_function_call() locking
    
    [ Upstream commit 558abc7e3f895049faa46b08656be4c60dc6e9fd ]
    
    All the event_function/@func call context already uses perf_ctx_lock()
    except for the !ctx->is_active case. Make it all consistent.
    
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
    Reviewed-by: Namhyung Kim <namhyung@kernel.org>
    Link: https://lore.kernel.org/r/20240807115550.138301094@infradead.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>

perf: Really fix event_function_call() locking [+ + +]
Author: Namhyung Kim <namhyung@kernel.org>
Date:   Tue Aug 13 22:55:11 2024 +0200

    perf: Really fix event_function_call() locking
    
    [ Upstream commit fe826cc2654e8561b64246325e6a51b62bf2488c ]
    
    Commit 558abc7e3f89 ("perf: Fix event_function_call() locking") lost
    IRQ disabling by mistake.
    
    Fixes: 558abc7e3f89 ("perf: Fix event_function_call() locking")
    Reported-by: Pengfei Xu <pengfei.xu@intel.com>
    Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
    Tested-by: Pengfei Xu <pengfei.xu@intel.com>
    Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
pidfs: check for valid pid namespace [+ + +]
Author: Christian Brauner <brauner@kernel.org>
Date:   Thu Sep 26 18:51:46 2024 +0200

    pidfs: check for valid pid namespace
    
    commit 8a46067783bdff222d1fb8f8c20e3b7b711e3ce5 upstream.
    
    When we access a non-current task's pid namespace we need to check that
    the task hasn't been reaped in the meantime, in which case its pid
    namespace isn't accessible anymore.
    
    The user namespace is fine because it is only released when the last
    reference to struct task_struct is put and exit_creds() is called.
    
    Link: https://lore.kernel.org/r/20240926-klebt-altgedienten-0415ad4d273c@brauner
    Fixes: 5b08bd408534 ("pidfs: allow retrieval of namespace file descriptors")
    CC: stable@vger.kernel.org # v6.11
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
platform/mellanox: mlxbf-pmc: fix lockdep warning [+ + +]
Author: Luiz Capitulino <luizcap@redhat.com>
Date:   Thu Sep 12 15:05:32 2024 -0400

    platform/mellanox: mlxbf-pmc: fix lockdep warning
    
    [ Upstream commit 305790dd91057a3f7497c9d128614a4f8486b62b ]
    
    It seems the mlxbf-pmc driver does not initialize its sysfs attributes,
    which causes the warning below when CONFIG_LOCKDEP and
    CONFIG_DEBUG_LOCK_ALLOC are enabled. This commit fixes it.
    
    [  155.380843] BUG: key ffff470f45dfa6d8 has not been registered!
    [  155.386749] ------------[ cut here ]------------
    [  155.391361] DEBUG_LOCKS_WARN_ON(1)
    [  155.391381] WARNING: CPU: 4 PID: 1828 at kernel/locking/lockdep.c:4894 lockdep_init_map_type+0x1d0/0x288
    [  155.404254] Modules linked in: mlxbf_pmc(+) xfs libcrc32c mmc_block mlx5_core crct10dif_ce mlxfw ghash_ce virtio_net tls net_failover sha2_ce failover psample sha256_arm64 dw_mmc_bluefield pci_hyperv_intf sha1_ce dw_mmc_pltfm sbsa_gwdt dw_mmc micrel mmc_core nfit i2c_mlxbf pwr_mlxbf gpio_generic libnvdimm mlxbf_tmfifo mlxbf_gige dm_mirror dm_region_hash dm_log dm_mod
    [  155.436786] CPU: 4 UID: 0 PID: 1828 Comm: modprobe Kdump: loaded Not tainted 6.11.0-rc7-rep1+ #1
    [  155.445562] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.8.0.13249 Aug  7 2024
    [  155.455463] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [  155.462413] pc : lockdep_init_map_type+0x1d0/0x288
    [  155.467196] lr : lockdep_init_map_type+0x1d0/0x288
    [  155.471976] sp : ffff80008a1734e0
    [  155.475279] x29: ffff80008a1734e0 x28: ffff470f45df0240 x27: 00000000ffffee4b
    [  155.482406] x26: 00000000000011b4 x25: 0000000000000000 x24: 0000000000000000
    [  155.489532] x23: ffff470f45dfa6d8 x22: 0000000000000000 x21: ffffd54ef6bea000
    [  155.496659] x20: ffff470f45dfa6d8 x19: ffff470f49cdc638 x18: ffffffffffffffff
    [  155.503784] x17: 2f30303a31444642 x16: ffffd54ef48a65e8 x15: ffff80010a172fe7
    [  155.510911] x14: 0000000000000000 x13: 284e4f5f4e524157 x12: 5f534b434f4c5f47
    [  155.518037] x11: 0000000000000001 x10: 0000000000000001 x9 : ffffd54ef3f48a14
    [  155.525163] x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 00000000002bffa8
    [  155.532289] x5 : ffff4712bdcb6088 x4 : 0000000000000000 x3 : 0000000000000027
    [  155.539416] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff470f43e5be00
    [  155.546542] Call trace:
    [  155.548976]  lockdep_init_map_type+0x1d0/0x288
    [  155.553410]  __kernfs_create_file+0x80/0x138
    [  155.557673]  sysfs_add_file_mode_ns+0x94/0x150
    [  155.562106]  create_files+0xb0/0x248
    [  155.565672]  internal_create_group+0x10c/0x328
    [  155.570105]  internal_create_groups.part.0+0x50/0xc8
    [  155.575060]  sysfs_create_groups+0x20/0x38
    [  155.579146]  device_add_attrs+0x1b8/0x228
    [  155.583146]  device_add+0x2a4/0x690
    [  155.586625]  device_register+0x24/0x38
    [  155.590362]  __hwmon_device_register+0x1e0/0x3c8
    [  155.594969]  devm_hwmon_device_register_with_groups+0x78/0xe0
    [  155.600703]  mlxbf_pmc_probe+0x224/0x3a0 [mlxbf_pmc]
    [  155.605669]  platform_probe+0x6c/0xe0
    [  155.609320]  really_probe+0xc4/0x398
    [  155.612887]  __driver_probe_device+0x80/0x168
    [  155.617233]  driver_probe_device+0x44/0x120
    [  155.621405]  __driver_attach+0xf4/0x200
    [  155.625230]  bus_for_each_dev+0x7c/0xe8
    [  155.629055]  driver_attach+0x28/0x38
    [  155.632619]  bus_add_driver+0x110/0x238
    [  155.636445]  driver_register+0x64/0x128
    [  155.640270]  __platform_driver_register+0x2c/0x40
    [  155.644965]  pmc_driver_init+0x24/0xff8 [mlxbf_pmc]
    [  155.649833]  do_one_initcall+0x70/0x3d0
    [  155.653660]  do_init_module+0x64/0x220
    [  155.657400]  load_module+0x628/0x6a8
    [  155.660964]  init_module_from_file+0x8c/0xd8
    [  155.665222]  idempotent_init_module+0x194/0x290
    [  155.669742]  __arm64_sys_finit_module+0x6c/0xd8
    [  155.674261]  invoke_syscall.constprop.0+0x74/0xd0
    [  155.678957]  do_el0_svc+0xb4/0xd0
    [  155.682262]  el0_svc+0x5c/0x248
    [  155.685394]  el0t_64_sync_handler+0x134/0x150
    [  155.689739]  el0t_64_sync+0x17c/0x180
    [  155.693390] irq event stamp: 6407
    [  155.696693] hardirqs last  enabled at (6407): [<ffffd54ef3f48564>] console_unlock+0x154/0x1b8
    [  155.705207] hardirqs last disabled at (6406): [<ffffd54ef3f485ac>] console_unlock+0x19c/0x1b8
    [  155.713719] softirqs last  enabled at (6404): [<ffffd54ef3e9740c>] handle_softirqs+0x4f4/0x518
    [  155.722320] softirqs last disabled at (6395): [<ffffd54ef3df0160>] __do_softirq+0x18/0x20
    [  155.730484] ---[ end trace 0000000000000000 ]---
    
    Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
    Link: https://lore.kernel.org/r/20240912190532.377097-1-luizcap@redhat.com
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
platform/x86/amd: pmf: Add quirk for TUF Gaming A14 [+ + +]
Author: aln8 <aln8un@gmail.com>
Date:   Thu Sep 12 15:36:01 2024 +0800

    platform/x86/amd: pmf: Add quirk for TUF Gaming A14
    
    [ Upstream commit 06369503d644068abd9e90918c6611274d94c126 ]
    
    The ASUS TUF Gaming A14 has the same issue as the ROG Zephyrus G14
    where it advertises SPS support but doesn't use it.
    
    Signed-off-by: aln8 <aln8un@gmail.com>
    Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
    Link: https://lore.kernel.org/r/20240912073601.65656-1-aln8un@gmail.com
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
platform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug [+ + +]
Author: Zach Wade <zachwade.k@gmail.com>
Date:   Mon Sep 23 22:45:08 2024 +0800

    platform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug
    
    commit 7d59ac07ccb58f8f604f8057db63b8efcebeb3de upstream.
    
    Attaching SST PCI device to VM causes "BUG: KASAN: slab-out-of-bounds".
    kasan report:
    [   19.411889] ==================================================================
    [   19.413702] BUG: KASAN: slab-out-of-bounds in _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.415634] Read of size 8 at addr ffff888829e65200 by task cpuhp/16/113
    [   19.417368]
    [   19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G            E      6.9.0 #10
    [   19.420435] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713 07/28/2022
    [   19.422687] Call Trace:
    [   19.424091]  <TASK>
    [   19.425448]  dump_stack_lvl+0x5d/0x80
    [   19.426963]  ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.428694]  print_report+0x19d/0x52e
    [   19.430206]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
    [   19.431837]  ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.433539]  kasan_report+0xf0/0x170
    [   19.435019]  ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.436709]  _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.438379]  ? __pfx_sched_clock_cpu+0x10/0x10
    [   19.439910]  isst_if_cpu_online+0x406/0x58f [isst_if_common]
    [   19.441573]  ? __pfx_isst_if_cpu_online+0x10/0x10 [isst_if_common]
    [   19.443263]  ? ttwu_queue_wakelist+0x2c1/0x360
    [   19.444797]  cpuhp_invoke_callback+0x221/0xec0
    [   19.446337]  cpuhp_thread_fun+0x21b/0x610
    [   19.447814]  ? __pfx_cpuhp_thread_fun+0x10/0x10
    [   19.449354]  smpboot_thread_fn+0x2e7/0x6e0
    [   19.450859]  ? __pfx_smpboot_thread_fn+0x10/0x10
    [   19.452405]  kthread+0x29c/0x350
    [   19.453817]  ? __pfx_kthread+0x10/0x10
    [   19.455253]  ret_from_fork+0x31/0x70
    [   19.456685]  ? __pfx_kthread+0x10/0x10
    [   19.458114]  ret_from_fork_asm+0x1a/0x30
    [   19.459573]  </TASK>
    [   19.460853]
    [   19.462055] Allocated by task 1198:
    [   19.463410]  kasan_save_stack+0x30/0x50
    [   19.464788]  kasan_save_track+0x14/0x30
    [   19.466139]  __kasan_kmalloc+0xaa/0xb0
    [   19.467465]  __kmalloc+0x1cd/0x470
    [   19.468748]  isst_if_cdev_register+0x1da/0x350 [isst_if_common]
    [   19.470233]  isst_if_mbox_init+0x108/0xff0 [isst_if_mbox_msr]
    [   19.471670]  do_one_initcall+0xa4/0x380
    [   19.472903]  do_init_module+0x238/0x760
    [   19.474105]  load_module+0x5239/0x6f00
    [   19.475285]  init_module_from_file+0xd1/0x130
    [   19.476506]  idempotent_init_module+0x23b/0x650
    [   19.477725]  __x64_sys_finit_module+0xbe/0x130
    [   19.478920]  do_syscall_64+0x82/0x160
    [   19.480036]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
    [   19.481292]
    [   19.482205] The buggy address belongs to the object at ffff888829e65000
     which belongs to the cache kmalloc-512 of size 512
    [   19.484818] The buggy address is located 0 bytes to the right of
     allocated 512-byte region [ffff888829e65000, ffff888829e65200)
    [   19.487447]
    [   19.488328] The buggy address belongs to the physical page:
    [   19.489569] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888829e60c00 pfn:0x829e60
    [   19.491140] head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
    [   19.492466] anon flags: 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff)
    [   19.493914] page_type: 0xffffffff()
    [   19.494988] raw: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001
    [   19.496451] raw: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000
    [   19.497906] head: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001
    [   19.499379] head: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000
    [   19.500844] head: 0057ffffc0000003 ffffea0020a79801 ffffea0020a79848 00000000ffffffff
    [   19.502316] head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000
    [   19.503784] page dumped because: kasan: bad access detected
    [   19.505058]
    [   19.505970] Memory state around the buggy address:
    [   19.507172]  ffff888829e65100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   19.508599]  ffff888829e65180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [   19.510014]                    ^
    [   19.510016]  ffff888829e65280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [   19.510018]  ffff888829e65300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [   19.515367] ==================================================================
    
    The reason for this error is that the physical_package_ids assigned by
    the VMware VMM are not contiguous and have gaps. This can cause the value
    returned by topology_physical_package_id() to be more than
    topology_max_packages().
    
    Here the allocation uses topology_max_packages(). The call to
    topology_max_packages() returns maximum logical package ID not physical
    ID. Hence use topology_logical_package_id() instead of
    topology_physical_package_id().
    
    Fixes: 9a1aac8a96dc ("platform/x86: ISST: PUNIT device mapping with Sub-NUMA clustering")
    Cc: stable@vger.kernel.org
    Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    Signed-off-by: Zach Wade <zachwade.k@gmail.com>
    Link: https://lore.kernel.org/r/20240923144508.1764-1-zachwade.k@gmail.com
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
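
    The indexing problem described in this entry can be illustrated with a
    minimal userspace sketch. All names here (MAX_PACKAGES, physical_ids,
    logical_id_of, index_in_bounds) are hypothetical and are not the
    driver's code; the sketch only assumes a table sized by the number of
    packages and a topology whose physical IDs have a gap.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical topology: 2 packages whose physical IDs have a gap (0 and 4). */
#define MAX_PACKAGES 2

static const int physical_ids[MAX_PACKAGES] = { 0, 4 };

/* Dense logical ID: position of the package in the table, or -1 if unknown. */
static int logical_id_of(int physical_id)
{
    assert(physical_id >= 0);
    for (int i = 0; i < MAX_PACKAGES; i++)
        if (physical_ids[i] == physical_id)
            return i;
    return -1;
}

/* True when idx is a safe index into an array of MAX_PACKAGES entries. */
static bool index_in_bounds(int idx)
{
    return idx >= 0 && idx < MAX_PACKAGES;
}
```

    With this layout, physical ID 4 is out of bounds for a two-entry
    array, while its logical ID (1) is a valid index.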

platform/x86: lenovo-ymc: Ignore the 0x0 state [+ + +]
Author: Gergo Koteles <soyer@irl.hu>
Date:   Thu Aug 22 17:38:57 2024 +0200

    platform/x86: lenovo-ymc: Ignore the 0x0 state
    
    [ Upstream commit d9dca215708d32e7f88ac0591fbb187cbf368adb ]
    
    While booting, the Lenovo 14ARB7 reports a 'lenovo-ymc: Unknown key 0
    pressed' warning. This is caused by lenovo_ymc_probe() calling
    lenovo_ymc_notify() at probe time to get the initial tablet-mode-switch
    state, while the key code that lenovo_ymc_notify() reads from the
    firmware is not yet initialized at probe time on the Lenovo 14ARB7.
    
    The hardware/firmware does an ACPI notify on the WMI device itself when
    it initializes the tablet-mode-switch state later on.
    
    Add 0x0 YMC state to the sparse keymap to silence the warning.
    
    Signed-off-by: Gergo Koteles <soyer@irl.hu>
    Link: https://lore.kernel.org/r/08ab73bb74c4ad448409f2ce707b1148874a05ce.1724340562.git.soyer@irl.hu
    [hdegoede@redhat.com: Reword commit message]
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

platform/x86: touchscreen_dmi: add nanote-next quirk [+ + +]
Author: Ckath <ckath@yandex.ru>
Date:   Wed Sep 11 21:12:40 2024 +0200

    platform/x86: touchscreen_dmi: add nanote-next quirk
    
    [ Upstream commit c11619af35bae5884029bd14170c3e4b55ddf6f3 ]
    
    Add touchscreen info for the nanote next (UMPC-03-SR).
    
    After checking with multiple owners the DMI info really is this generic.
    
    Signed-off-by: Ckath <ckath@yandex.ru>
    Link: https://lore.kernel.org/r/e8dda83a-10ae-42cf-a061-5d29be0d193a@yandex.ru
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

platform/x86: x86-android-tablets: Adjust Xiaomi Pad 2 bottom bezel touch buttons LED [+ + +]
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Mon Sep 16 11:02:55 2024 +0200

    platform/x86: x86-android-tablets: Adjust Xiaomi Pad 2 bottom bezel touch buttons LED
    
    [ Upstream commit df40a23cc34c200cfde559eda7ca540f3ae7bd9e ]
    
    The "input-events" LED trigger used to turn on the backlight LEDs had to
    be rewritten to use led_trigger_register_simple() + led_trigger_event()
    to fix a serious locking issue.
    
    This means it no longer supports using blink_brightness to set a per LED
    brightness for the trigger and it no longer sets LED_CORE_SUSPENDRESUME.
    
    Adjust the MiPad 2 bottom bezel touch buttons LED class device to match:
    
    1. Make LED_FULL the maximum brightness to fix the LED brightness
       being very low when on.
    2. Set flags = LED_CORE_SUSPENDRESUME.
    
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://lore.kernel.org/r/20240916090255.35548-1-hdegoede@redhat.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

platform/x86: x86-android-tablets: Fix use after free on platform_device_register() errors [+ + +]
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Sat Oct 5 15:05:45 2024 +0200

    platform/x86: x86-android-tablets: Fix use after free on platform_device_register() errors
    
    commit 2fae3129c0c08e72b1fe93e61fd8fd203252094a upstream.
    
    x86_android_tablet_remove() frees the pdevs[] array, so it should not
    be used after calling x86_android_tablet_remove().
    
    When platform_device_register() fails, store the pdevs[x] PTR_ERR() value
    into the local ret variable before calling x86_android_tablet_remove()
    to avoid using pdevs[] after it has been freed.
    
    Fixes: 5eba0141206e ("platform/x86: x86-android-tablets: Add support for instantiating platform-devs")
    Fixes: e2200d3f26da ("platform/x86: x86-android-tablets: Add gpio_keys support to x86_android_tablet_init()")
    Cc: stable@vger.kernel.org
    Reported-by: Aleksandr Burakov <a.burakov@rosalinux.ru>
    Closes: https://lore.kernel.org/platform-driver-x86/20240917120458.7300-1-a.burakov@rosalinux.ru/
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://lore.kernel.org/r/20241005130545.64136-1-hdegoede@redhat.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
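
    The ordering rule from this entry (copy the error code out before the
    cleanup frees the array) can be sketched in plain userspace C. This is
    an illustration under assumptions, not the driver's code: pdevs here is
    an int array, and cleanup(), register_one() and init_devices() are
    hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's array; freed by cleanup(). */
static int *pdevs;

static void cleanup(void)
{
    free(pdevs);
    pdevs = NULL;
}

/* Simulated registration: fails with -ENODEV for slot fail_at. */
static int register_one(int i, int fail_at)
{
    return i == fail_at ? -ENODEV : 0;
}

int init_devices(int count, int fail_at)
{
    assert(count > 0);

    pdevs = calloc((size_t)count, sizeof(*pdevs));
    if (!pdevs)
        return -ENOMEM;

    for (int i = 0; i < count; i++) {
        pdevs[i] = register_one(i, fail_at);
        if (pdevs[i] < 0) {
            int ret = pdevs[i];   /* copy the error out first */

            cleanup();            /* pdevs[] is invalid from here on */
            return ret;           /* never read pdevs[i] after cleanup() */
        }
    }
    return 0;
}
```

    Reading pdevs[i] after cleanup() instead of the saved ret would be the
    use after free that the entry describes.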

 
pmdomain: core: Don't hold the genpd-lock when calling dev_pm_domain_set() [+ + +]
Author: Ulf Hansson <ulf.hansson@linaro.org>
Date:   Mon May 27 16:25:52 2024 +0200

    pmdomain: core: Don't hold the genpd-lock when calling dev_pm_domain_set()
    
    [ Upstream commit b87eee38605c396f0e1fa435939960e5c6cd41d6 ]
    
    There is no need to hold the genpd-lock, while assigning the
    dev->pm_domain. In fact, it becomes a problem on a PREEMPT_RT based
    configuration as the genpd-lock may be a raw spinlock, while the lock
    acquired through the call to dev_pm_domain_set() is a regular spinlock.
    
    To fix the problem, let's simply move the calls to dev_pm_domain_set()
    outside the genpd-lock.
    
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com>  # qcm6490 with PREEMPT_RT set
    Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Link: https://lore.kernel.org/r/20240527142557.321610-3-ulf.hansson@linaro.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>

pmdomain: core: Reduce debug summary table width [+ + +]
Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date:   Wed Sep 4 16:30:48 2024 +0200

    pmdomain: core: Reduce debug summary table width
    
    commit c6ccb691d484544636bc4a097574c5c135ccccda upstream.
    
    Commit 9094e53ff5c86ebe ("pmdomain: core: Use dev_name() instead of
    kobject_get_path() in debugfs") severely shortened the names of devices
    in a PM Domain.  Now the most common format[1] consists of a 32-bit
    unit-address (8 characters), followed by a dot and a node name (20
    characters for "air-pollution-sensor" and "interrupt-controller", which
    are the longest generic node names documented in the Devicetree
    Specification), for a typical maximum of 29 characters.
    
    This offers a good opportunity to reduce the table width of the debug
    summary:
      - Reduce the device name field width from 50 to 30 characters, which
        matches the PM Domain name width,
      - Reduce the large inter-column space between the "performance" and
        "managed by" columns.
    
    Visual impact:
      - The "performance" column now starts at a position that is a
        multiple of 16, just like the "status" and "children" columns,
      - All of the "/device", "runtime status", and "managed by" columns are
        now indented 4 characters more than the columns right above them,
      - Everything fits in (one less than) 80 characters again ;-)
    
    [1] Note that some device names (e.g. TI AM335x interconnect target
        modules) do not follow this convention, and may be much longer, but
        these didn't fit in the old 50-character column width either.
    
    Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Link: https://lore.kernel.org/r/f8e1821364b6d5d11350447c128f6d2b470f33fe.1725459707.git.geert+renesas@glider.be
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
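
    The 29-character figure above is simple arithmetic: 8 hex digits, one
    dot, and a 20-character node name. A small sketch, assuming a
    hypothetical helper format_dev_name() and an example unit address that
    uses all 8 hex digits, shows the calculation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Builds "<unit-address>.<node-name>" into buf and returns the length the
 * full name needs (snprintf semantics). Hypothetical helper for illustration.
 */
int format_dev_name(char *buf, size_t len, uint32_t unit_addr, const char *node)
{
    assert(buf && node);
    return snprintf(buf, len, "%x.%s", (unsigned int)unit_addr, node);
}
```

    For a unit address of 0xe6180000 and the node name
    "interrupt-controller" the result is 8 + 1 + 20 = 29 characters, which
    fits in the new 30-character column. Shorter unit addresses produce
    shorter names.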

pmdomain: core: Use dev_name() instead of kobject_get_path() in debugfs [+ + +]
Author: Ulf Hansson <ulf.hansson@linaro.org>
Date:   Mon May 27 16:25:53 2024 +0200

    pmdomain: core: Use dev_name() instead of kobject_get_path() in debugfs
    
    [ Upstream commit 9094e53ff5c86ebe372ad3960c3216c9817a1a04 ]
    
    Using kobject_get_path() means a dynamic memory allocation gets done, which
    doesn't work on a PREEMPT_RT based configuration while holding genpd's raw
    spinlock.
    
    To fix the problem, let's convert into using the simpler dev_name(). This
    means the information about the path doesn't get presented in debugfs, but
    hopefully this shouldn't be an issue.
    
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com>  # qcm6490 with PREEMPT_RT set
    Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Link: https://lore.kernel.org/r/20240527142557.321610-4-ulf.hansson@linaro.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
power: reset: brcmstb: Do not go into infinite loop if reset fails [+ + +]
Author: Andrew Davis <afd@ti.com>
Date:   Mon Jun 10 09:28:36 2024 -0500

    power: reset: brcmstb: Do not go into infinite loop if reset fails
    
    [ Upstream commit cf8c39b00e982fa506b16f9d76657838c09150cb ]
    
    There may be other backup reset methods available. Do not halt
    here, so that other reset methods can be tried.
    
    Signed-off-by: Andrew Davis <afd@ti.com>
    Reviewed-by: Dhruva Gole <d-gole@ti.com>
    Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Link: https://lore.kernel.org/r/20240610142836.168603-5-afd@ti.com
    Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

power: supply: Drop use_cnt check from power_supply_property_is_writeable() [+ + +]
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Sun Sep 8 20:53:36 2024 +0200

    power: supply: Drop use_cnt check from power_supply_property_is_writeable()
    
    commit 78f281e5bdeb6476fab97a2c3fcece1094b42aaf upstream.
    
    power_supply_property_is_writeable() gets called from the is_visible()
    callback for the sysfs attributes of power_supply class devices and for
    the sysfs attributes of power_supply core instantiated hwmon class devices.
    
    These sysfs attributes get registered by the device_add() respectively
    power_supply_add_hwmon_sysfs() calls in power_supply_register().
    
    use_cnt gets initialized to 0 and is incremented only after these calls.
    So when power_supply_property_is_writeable() gets called it always returns
    -ENODEV because of use_cnt == 0.
    
    This causes all the attributes to have permissions of 444 even those which
    should be writable. This used to be a problem only for hwmon sysfs
    attributes but since commit be6299c6e55e ("power: supply: sysfs: use
    power_supply_property_is_writeable()") this now also impacts power_supply
    class sysfs attributes.
    
    Fixes: be6299c6e55e ("power: supply: sysfs: use power_supply_property_is_writeable()")
    Fixes: e67d4dfc9ff1 ("power: supply: Add HWMON compatibility layer")
    Cc: stable@vger.kernel.org
    Cc: Thomas Weißschuh <linux@weissschuh.net>
    Cc: Andrey Smirnov <andrew.smirnov@gmail.com>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://lore.kernel.org/stable/20240908185337.103696-1-hdegoede%40redhat.com
    Link: https://lore.kernel.org/r/20240908185337.103696-1-hdegoede@redhat.com
    Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
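
    The effect of the ordering described in this entry can be modelled in a
    few lines of userspace C. This is a sketch under assumptions: struct
    psy, is_writeable_old(), is_writeable_new(), attr_mode() and
    mode_at_registration() are hypothetical names, and the only behaviour
    taken from the entry is that the attribute mode is decided while
    use_cnt is still 0.

```c
#include <assert.h>
#include <errno.h>

struct psy {
    int use_cnt;        /* 0 until registration has finished */
    int prop_writable;  /* what the driver's callback would report */
};

/* Old behaviour: refuse to answer while use_cnt is still 0. */
static int is_writeable_old(const struct psy *p)
{
    assert(p);
    if (p->use_cnt <= 0)
        return -ENODEV;
    return p->prop_writable;
}

/* Behaviour without the use_cnt check: just ask the driver. */
static int is_writeable_new(const struct psy *p)
{
    assert(p);
    return p->prop_writable;
}

/* Mode chosen once, when the attribute is created during registration. */
static unsigned int attr_mode(int writeable)
{
    return writeable > 0 ? 0644 : 0444;
}

unsigned int mode_at_registration(int use_fixed_check)
{
    /* Attributes are created before use_cnt is raised, so it is still 0. */
    struct psy p = { .use_cnt = 0, .prop_writable = 1 };
    int w = use_fixed_check ? is_writeable_new(&p) : is_writeable_old(&p);

    return attr_mode(w);
}
```

    In this model a writable property ends up with mode 0444 under the old
    check and 0644 once the check is dropped.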

power: supply: hwmon: Fix missing temp1_max_alarm attribute [+ + +]
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Sun Sep 8 20:53:37 2024 +0200

    power: supply: hwmon: Fix missing temp1_max_alarm attribute
    
    commit e50a57d16f897e45de1112eb6478577b197fab52 upstream.
    
    Temp channel 0 aka temp1 can have a temp1_max_alarm attribute for
    power_supply devices which have a POWER_SUPPLY_PROP_TEMP_ALERT_MAX
    property.
    
    HWMON_T_MAX_ALARM was missing from power_supply_hwmon_info for
    temp channel 0, causing the hwmon temp1_max_alarm attribute to be
    missing from such power_supply devices.
    
    Add this to power_supply_hwmon_info to fix this.
    
    Fixes: f1d33ae806ec ("power: supply: remove duplicated argument in power_supply_hwmon_info")
    Cc: stable@vger.kernel.org
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://lore.kernel.org/r/20240908185337.103696-2-hdegoede@redhat.com
    Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
powerpc/pseries: Use correct data types from pseries_hp_errorlog struct [+ + +]
Author: Haren Myneni <haren@linux.ibm.com>
Date:   Wed Aug 21 19:50:26 2024 -0700

    powerpc/pseries: Use correct data types from pseries_hp_errorlog struct
    
    [ Upstream commit b76e0d4215b6b622127ebcceaa7f603313ceaec4 ]
    
    The __be32 type is defined for some elements in the pseries_hp_errorlog
    struct, but they are also used as u32 after be32_to_cpu() conversion.
    
    Example: In handle_dlpar_errorlog()
    hp_elog->_drc_u.drc_index = be32_to_cpu(hp_elog->_drc_u.drc_index);
    
    And later assigned to u32 type
    dlpar_cpu() - u32 drc_index = hp_elog->_drc_u.drc_index;
    
    This incorrect usage gives the following warnings, and the
    patch resolves them with the correct assignment.
    
    arch/powerpc/platforms/pseries/dlpar.c:398:53: sparse: sparse:
    incorrect type in argument 1 (different base types) @@
    expected unsigned int [usertype] drc_index @@
    got restricted __be32 [usertype] drc_index @@
    ...
    arch/powerpc/platforms/pseries/dlpar.c:418:43: sparse: sparse:
    incorrect type in assignment (different base types) @@
    expected restricted __be32 [usertype] drc_count @@
    got unsigned int [usertype] @@
    
    Reported-by: kernel test robot <lkp@intel.com>
    Closes: https://lore.kernel.org/oe-kbuild-all/202408182142.wuIKqYae-lkp@intel.com/
    Closes: https://lore.kernel.org/oe-kbuild-all/202408182302.o7QRO45S-lkp@intel.com/
    Signed-off-by: Haren Myneni <haren@linux.ibm.com>
    
    v3:
    - Fix warnings from using incorrect data types in pseries_hp_errorlog
      struct
    v2:
    - Remove pr_info() and TODO comments
    - Update more information in the commit logs
    
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://msgid.link/20240822025028.938332-1-haren@linux.ibm.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
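
    One way to avoid this kind of type confusion is to leave the
    big-endian field untouched and convert into a separate host-order
    variable at the point of use. The sketch below is userspace C under
    assumptions: struct be32, be32_to_host(), struct hp_errorlog and
    get_drc_index() are hypothetical stand-ins for the kernel's __be32,
    be32_to_cpu() and pseries_hp_errorlog, and it is not the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a big-endian field: raw bytes in wire order. */
struct be32 {
    uint8_t b[4];
};

/* Decode big-endian bytes into a host-order value on any host. */
static uint32_t be32_to_host(struct be32 v)
{
    return ((uint32_t)v.b[0] << 24) | ((uint32_t)v.b[1] << 16) |
           ((uint32_t)v.b[2] << 8) | (uint32_t)v.b[3];
}

struct hp_errorlog {
    struct be32 drc_index;  /* stays big-endian for its whole lifetime */
};

uint32_t get_drc_index(const struct hp_errorlog *elog)
{
    assert(elog);
    /* Convert into a host-order result; the struct field is not modified. */
    return be32_to_host(elog->drc_index);
}
```

    Because the two representations have different types, storing the
    converted value back into the field would not compile here, which is
    the same mistake that sparse reports for __be32.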

 
powerpc/vdso: Fix VDSO data access when running in a non-root time namespace [+ + +]
Author: Christophe Leroy <christophe.leroy@csgroup.eu>
Date:   Fri Sep 6 10:33:43 2024 +0200

    powerpc/vdso: Fix VDSO data access when running in a non-root time namespace
    
    [ Upstream commit c73049389e58c01e2e3bbfae900c8daeee177191 ]
    
    When running in a non-root time namespace, the global VDSO data page
    is replaced by a dedicated namespace data page and the global data
    page is mapped next to it. Detailed explanations can be found at
    commit 660fd04f9317 ("lib/vdso: Prepare for time namespace support").
    
    When that happens, __kernel_get_syscall_map, __kernel_get_tbfreq
    and __kernel_sync_dicache no longer work because they read 0
    instead of the data they need.
    
    To address that, clock_mode has to be read. When it is set to
    VDSO_CLOCKMODE_TIMENS, it means it is a dedicated namespace data page
    and the global data is located on the following page.
    
    Add a macro called get_realdatapage which reads clock_mode and adds
    PAGE_SIZE to the pointer provided by the get_datapage macro when
    clock_mode is equal to VDSO_CLOCKMODE_TIMENS. Use this new macro
    instead of the get_datapage macro except for time functions, as they
    handle it internally.
    
    Fixes: 74205b3fc2ef ("powerpc/vdso: Add support for time namespaces")
    Reported-by: Jason A. Donenfeld <Jason@zx2c4.com>
    Closes: https://lore.kernel.org/all/ZtnYqZI-nrsNslwy@zx2c4.com/
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Acked-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
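
    The page selection described in this entry can be sketched in C. The
    real get_realdatapage is an assembly macro; the version below is only
    an illustration under assumptions: SKETCH_PAGE_SIZE, the value of
    SKETCH_CLOCKMODE_TIMENS, struct data_page and its tb_freq field are
    hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE        4096        /* assumed page size */
#define SKETCH_CLOCKMODE_TIMENS 0x7fffffff  /* hypothetical marker value */

/* One page of data; clock_mode tells which kind of page this is. */
struct data_page {
    int32_t clock_mode;
    uint32_t tb_freq;                   /* hypothetical payload field */
    uint8_t pad[SKETCH_PAGE_SIZE - 8];
};

_Static_assert(sizeof(struct data_page) == SKETCH_PAGE_SIZE,
               "struct data_page must fill exactly one page");

/*
 * If p is a time-namespace page, the global data lives on the following
 * page, so advance by one page. Otherwise p already is the global data.
 */
const struct data_page *get_realdatapage(const struct data_page *p)
{
    assert(p);
    if (p->clock_mode == SKETCH_CLOCKMODE_TIMENS)
        return p + 1;   /* p + 1 advances by sizeof(*p), i.e. one page */
    return p;
}
```

    A reader that skips this step and dereferences the namespace page
    directly sees 0 in fields such as tb_freq, which matches the failure
    mode the entry describes.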

 
ppp: do not assume bh is held in ppp_channel_bridge_input() [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Sep 27 07:45:53 2024 +0000

    ppp: do not assume bh is held in ppp_channel_bridge_input()
    
    [ Upstream commit aec7291003df78cb71fd461d7b672912bde55807 ]
    
    The networking receive path is usually handled from a BH handler.
    However, some protocols need to acquire the socket lock, and
    packets might be stored in the socket backlog if the socket was
    owned by a user process.
    
    In this case, release_sock(), __release_sock(), and sk_backlog_rcv()
    might call the sk->sk_backlog_rcv() handler in process context.
    
    syzbot caught that ppp was not considering this case in
    ppp_channel_bridge_input():
    
    WARNING: inconsistent lock state
    6.11.0-rc7-syzkaller-g5f5673607153 #0 Not tainted
    --------------------------------
    inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
    ksoftirqd/1/24 [HC0[0]:SC1[1]:HE1:SE0] takes:
     ffff0000db7f11e0 (&pch->downl){+.?.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
     ffff0000db7f11e0 (&pch->downl){+.?.}-{2:2}, at: ppp_channel_bridge_input drivers/net/ppp/ppp_generic.c:2272 [inline]
     ffff0000db7f11e0 (&pch->downl){+.?.}-{2:2}, at: ppp_input+0x16c/0x854 drivers/net/ppp/ppp_generic.c:2304
    {SOFTIRQ-ON-W} state was registered at:
       lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
       __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
       _raw_spin_lock+0x48/0x60 kernel/locking/spinlock.c:154
       spin_lock include/linux/spinlock.h:351 [inline]
       ppp_channel_bridge_input drivers/net/ppp/ppp_generic.c:2272 [inline]
       ppp_input+0x16c/0x854 drivers/net/ppp/ppp_generic.c:2304
       pppoe_rcv_core+0xfc/0x314 drivers/net/ppp/pppoe.c:379
       sk_backlog_rcv include/net/sock.h:1111 [inline]
       __release_sock+0x1a8/0x3d8 net/core/sock.c:3004
       release_sock+0x68/0x1b8 net/core/sock.c:3558
       pppoe_sendmsg+0xc8/0x5d8 drivers/net/ppp/pppoe.c:903
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg net/socket.c:745 [inline]
       __sys_sendto+0x374/0x4f4 net/socket.c:2204
       __do_sys_sendto net/socket.c:2216 [inline]
       __se_sys_sendto net/socket.c:2212 [inline]
       __arm64_sys_sendto+0xd8/0xf8 net/socket.c:2212
       __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
       invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
       el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
       do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
       el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
       el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
       el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
    irq event stamp: 282914
     hardirqs last  enabled at (282914): [<ffff80008b42e30c>] __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline]
     hardirqs last  enabled at (282914): [<ffff80008b42e30c>] _raw_spin_unlock_irqrestore+0x38/0x98 kernel/locking/spinlock.c:194
     hardirqs last disabled at (282913): [<ffff80008b42e13c>] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
     hardirqs last disabled at (282913): [<ffff80008b42e13c>] _raw_spin_lock_irqsave+0x2c/0x7c kernel/locking/spinlock.c:162
     softirqs last  enabled at (282904): [<ffff8000801f8e88>] softirq_handle_end kernel/softirq.c:400 [inline]
     softirqs last  enabled at (282904): [<ffff8000801f8e88>] handle_softirqs+0xa3c/0xbfc kernel/softirq.c:582
     softirqs last disabled at (282909): [<ffff8000801fbdf8>] run_ksoftirqd+0x70/0x158 kernel/softirq.c:928
    
    other info that might help us debug this:
     Possible unsafe locking scenario:
    
           CPU0
           ----
      lock(&pch->downl);
      <Interrupt>
        lock(&pch->downl);
    
     *** DEADLOCK ***
    
    1 lock held by ksoftirqd/1/24:
      #0: ffff80008f74dfa0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x10/0x4c include/linux/rcupdate.h:325
    
    stack backtrace:
    CPU: 1 UID: 0 PID: 24 Comm: ksoftirqd/1 Not tainted 6.11.0-rc7-syzkaller-g5f5673607153 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
    Call trace:
      dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:319
      show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:326
      __dump_stack lib/dump_stack.c:93 [inline]
      dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:119
      dump_stack+0x1c/0x28 lib/dump_stack.c:128
      print_usage_bug+0x698/0x9ac kernel/locking/lockdep.c:4000
     mark_lock_irq+0x980/0xd2c
      mark_lock+0x258/0x360 kernel/locking/lockdep.c:4677
      __lock_acquire+0xf48/0x779c kernel/locking/lockdep.c:5096
      lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
      __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
      _raw_spin_lock+0x48/0x60 kernel/locking/spinlock.c:154
      spin_lock include/linux/spinlock.h:351 [inline]
      ppp_channel_bridge_input drivers/net/ppp/ppp_generic.c:2272 [inline]
      ppp_input+0x16c/0x854 drivers/net/ppp/ppp_generic.c:2304
      ppp_async_process+0x98/0x150 drivers/net/ppp/ppp_async.c:495
      tasklet_action_common+0x318/0x3f4 kernel/softirq.c:785
      tasklet_action+0x68/0x8c kernel/softirq.c:811
      handle_softirqs+0x2e4/0xbfc kernel/softirq.c:554
      run_ksoftirqd+0x70/0x158 kernel/softirq.c:928
      smpboot_thread_fn+0x4b0/0x90c kernel/smpboot.c:164
      kthread+0x288/0x310 kernel/kthread.c:389
      ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860
    
    Fixes: 4cf476ced45d ("ppp: add PPPIOCBRIDGECHAN and PPPIOCUNBRIDGECHAN ioctls")
    Reported-by: syzbot+bd8d55ee2acd0a71d8ce@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/netdev/66f661e2.050a0220.38ace9.000f.GAE@google.com/T/#u
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Tom Parkin <tparkin@katalix.com>
    Cc: James Chapman <jchapman@katalix.com>
    Link: https://patch.msgid.link/20240927074553.341910-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
proc: add config & param to block forcing mem writes [+ + +]
Author: Adrian Ratiu <adrian.ratiu@collabora.com>
Date:   Fri Aug 2 11:02:25 2024 +0300

    proc: add config & param to block forcing mem writes
    
    [ Upstream commit 41e8149c8892ed1962bd15350b3c3e6e90cba7f4 ]
    
    This adds a Kconfig option and boot param to allow removing
    the FOLL_FORCE flag from /proc/pid/mem write calls because
    it can be abused.
    
    The traditional forcing behavior is kept as default because
    it can break GDB and some other use cases.
    
    Previously we tried a more sophisticated approach allowing
    distributions to fine-tune /proc/pid/mem behavior, however
    that got NAK-ed by Linus [1], who prefers this simpler
    approach with semantics also easier to understand for users.
    
    Link: https://lore.kernel.org/lkml/CAHk-=wiGWLChxYmUA5HrT5aopZrB7_2VTa0NLZcxORgkUe5tEQ@mail.gmail.com/ [1]
    Cc: Doug Anderson <dianders@chromium.org>
    Cc: Jeff Xu <jeffxu@google.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Kees Cook <kees@kernel.org>
    Cc: Ard Biesheuvel <ardb@kernel.org>
    Cc: Christian Brauner <brauner@kernel.org>
    Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
    Link: https://lore.kernel.org/r/20240802080225.89408-1-adrian.ratiu@collabora.com
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
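
As a rough illustration of the policy described in this commit, the sketch below models in plain userspace C how a boot-time setting could decide whether FOLL_FORCE is added to a /proc/pid/mem write. The enum, the demo_parse_force_override() and demo_mem_write_flags() helpers, the "never" string and the flag bit values are all assumptions made for this example; they are not the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits; the real FOLL_* values live in kernel headers. */
#define DEMO_FOLL_WRITE 0x1u
#define DEMO_FOLL_FORCE 0x2u

/* Assumed policy names for this sketch. */
enum demo_force_mode { DEMO_FORCE_ALWAYS, DEMO_FORCE_NEVER };

/* Map a boot-parameter string to a policy. Anything other than "never"
 * keeps the traditional default, in which forcing stays enabled. */
static enum demo_force_mode demo_parse_force_override(const char *arg)
{
	if (arg != NULL && strcmp(arg, "never") == 0)
		return DEMO_FORCE_NEVER;
	return DEMO_FORCE_ALWAYS;
}

/* Compute the flags for a memory write under the given policy. */
static unsigned int demo_mem_write_flags(enum demo_force_mode mode)
{
	unsigned int flags = DEMO_FOLL_WRITE;

	assert(mode == DEMO_FORCE_ALWAYS || mode == DEMO_FORCE_NEVER);
	if (mode == DEMO_FORCE_ALWAYS)
		flags |= DEMO_FOLL_FORCE;
	return flags;
}
```

Keeping the default at "always force" mirrors the commit's stated reason for not changing behavior for GDB and similar tools.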

 
r8169: add tally counter fields added with RTL8125 [+ + +]
Author: Heiner Kallweit <hkallweit1@gmail.com>
Date:   Tue Sep 17 23:04:46 2024 +0200

    r8169: add tally counter fields added with RTL8125
    
    [ Upstream commit ced8e8b8f40accfcce4a2bbd8b150aa76d5eff9a ]
    
    RTL8125 added fields to the tally counter, which may result in the chip
    DMA'ing these new fields to unallocated memory. Therefore make sure
    that the allocated memory area is big enough to hold all of the
    tally counter values, even if we use only parts of it.
    
    Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125")
    Cc: stable@vger.kernel.org
    Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/741d26a9-2b2b-485d-91d9-ecb302e345b5@gmail.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
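
The sizing rule in this commit can be sketched as follows: the buffer handed to the device is sized for the largest layout the hardware may write, not for the fields the driver reads. Both structs and their field names are hypothetical stand-ins, and calloc() stands in for a DMA allocation; this is not the r8169 counter layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical counter block as an older chip would write it. */
struct demo_counters_legacy {
	uint64_t tx_packets;
	uint64_t rx_packets;
	uint16_t tx_aborted;
	uint16_t tx_underrun;
};

/* Hypothetical block for a newer chip: the same prefix plus extra fields
 * that the device writes even if the driver never reads them. */
struct demo_counters_extended {
	struct demo_counters_legacy base;
	uint64_t extra_a;
	uint64_t extra_b;
};

/* Allocate the device-written buffer using the largest layout, so that
 * no field the device may write lands outside the allocation. */
static void *demo_alloc_counter_buffer(size_t *size_out)
{
	assert(size_out != NULL);
	*size_out = sizeof(struct demo_counters_extended);
	return calloc(1, *size_out);
}
```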

r8169: Fix spelling mistake: "tx_underun" -> "tx_underrun" [+ + +]
Author: Colin Ian King <colin.i.king@gmail.com>
Date:   Mon Sep 9 15:00:21 2024 +0100

    r8169: Fix spelling mistake: "tx_underun" -> "tx_underrun"
    
    [ Upstream commit 8df9439389a44fb2cc4ef695e08d6a8870b1616c ]
    
    There is a spelling mistake in the struct field tx_underun, rename
    it to tx_underrun.
    
    Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
    Link: https://patch.msgid.link/20240909140021.64884-1-colin.i.king@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: ced8e8b8f40a ("r8169: add tally counter fields added with RTL8125")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
rcu-tasks: Fix access non-existent percpu rtpcp variable in rcu_tasks_need_gpcb() [+ + +]
Author: Zqiang <qiang.zhang1211@gmail.com>
Date:   Wed Jul 10 12:45:42 2024 +0800

    rcu-tasks: Fix access non-existent percpu rtpcp variable in rcu_tasks_need_gpcb()
    
    [ Upstream commit fd70e9f1d85f5323096ad313ba73f5fe3d15ea41 ]
    
    For kernels built with CONFIG_FORCE_NR_CPUS=y, nr_cpu_ids is
    defined as NR_CPUS instead of the number of possible CPUs. This
    causes the following system panic:
    
    smpboot: Allowing 4 CPUs, 0 hotplug CPUs
    ...
    setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:512 nr_node_ids:1
    ...
    BUG: unable to handle page fault for address: ffffffff9911c8c8
    Oops: 0000 [#1] PREEMPT SMP PTI
    CPU: 0 PID: 15 Comm: rcu_tasks_trace Tainted: G W
    6.6.21 #1 5dc7acf91a5e8e9ac9dcfc35bee0245691283ea6
    RIP: 0010:rcu_tasks_need_gpcb+0x25d/0x2c0
    RSP: 0018:ffffa371c00a3e60 EFLAGS: 00010082
    CR2: ffffffff9911c8c8 CR3: 000000040fa20005 CR4: 00000000001706f0
    Call Trace:
    <TASK>
    ? __die+0x23/0x80
    ? page_fault_oops+0xa4/0x180
    ? exc_page_fault+0x152/0x180
    ? asm_exc_page_fault+0x26/0x40
    ? rcu_tasks_need_gpcb+0x25d/0x2c0
    ? __pfx_rcu_tasks_kthread+0x40/0x40
    rcu_tasks_one_gp+0x69/0x180
    rcu_tasks_kthread+0x94/0xc0
    kthread+0xe8/0x140
    ? __pfx_kthread+0x40/0x40
    ret_from_fork+0x34/0x80
    ? __pfx_kthread+0x40/0x40
    ret_from_fork_asm+0x1b/0x80
    </TASK>
    
    Considering that there may be holes in the CPU numbers, use the
    maximum possible cpu number, instead of nr_cpu_ids, for configuring
    enqueue and dequeue limits.
    
    [ neeraj.upadhyay: Fix htmldocs build error reported by Stephen Rothwell ]
    
    Closes: https://lore.kernel.org/linux-input/CALMA0xaTSMN+p4xUXkzrtR5r6k7hgoswcaXx7baR_z9r5jjskw@mail.gmail.com/T/#u
    Reported-by: Zhixu Liu <zhixu.liu@gmail.com>
    Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
    Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
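
A minimal sketch of the idea behind this fix, under simplifying assumptions: the set of possible CPUs is modeled as a 64-bit mask, and the scan limit is derived from the highest possible CPU number instead of a compile-time ceiling. The function names are hypothetical, and the kernel works on cpumasks rather than a single integer.

```c
#include <assert.h>
#include <stdint.h>

/* Return the highest CPU number set in a possible-CPU bitmask.
 * Bit i set means CPU i is possible. The mask must be non-zero. */
static int demo_max_possible_cpu(uint64_t possible_mask)
{
	int cpu = -1;

	assert(possible_mask != 0);
	while (possible_mask != 0) {
		possible_mask >>= 1;
		cpu++;
	}
	return cpu;
}

/* Number of per-CPU slots worth scanning: one past the highest possible
 * CPU. Holes in the numbering are covered because the limit follows the
 * highest number, not the count of CPUs. */
static int demo_scan_limit(uint64_t possible_mask)
{
	return demo_max_possible_cpu(possible_mask) + 1;
}
```

With four contiguous CPUs the limit is 4, far below a compile-time ceiling such as the NR_CPUS:512 shown in the log above.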

 
rcuscale: Provide clear error when async specified without primitives [+ + +]
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Thu Aug 1 17:43:03 2024 -0700

    rcuscale: Provide clear error when async specified without primitives
    
    [ Upstream commit 11377947b5861fa59bf77c827e1dd7c081842cc9 ]
    
    Currently, if the rcuscale module's async module parameter is specified
    for RCU implementations that do not have async primitives such as RCU
    Tasks Rude (which now lacks a call_rcu_tasks_rude() function), there
    will be a series of splats due to calls to a NULL pointer.  This commit
    therefore warns of this situation, but switches to non-async testing.
    
    Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org>
    Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
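
The warn-and-fall-back behavior described in this commit might look roughly like the sketch below. The demo_scale_ops structure and demo_effective_async() are hypothetical names for illustration and do not reproduce the rcuscale module's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical operations table; async may be NULL for flavors that
 * have no asynchronous grace-period primitive. */
struct demo_scale_ops {
	const char *name;
	void (*sync)(void);
	void (*async)(void (*cb)(void));
};

/* Decide whether async testing can proceed. If it was requested but the
 * flavor has no async primitive, warn and fall back to non-async testing
 * instead of calling through a NULL pointer later. */
static bool demo_effective_async(const struct demo_scale_ops *ops, bool requested)
{
	assert(ops != NULL);
	if (requested && ops->async == NULL) {
		fprintf(stderr, "%s: async requested but unavailable; using sync\n",
			ops->name);
		return false;
	}
	return requested;
}

/* Trivial primitives used to fill in an operations table. */
static void demo_sync(void) { }
static void demo_async(void (*cb)(void)) { if (cb) cb(); }
```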

 
RDMA/mana_ib: use the correct page size for mapping user-mode doorbell page [+ + +]
Author: Long Li <longli@microsoft.com>
Date:   Fri Aug 30 08:16:33 2024 -0700

    RDMA/mana_ib: use the correct page size for mapping user-mode doorbell page
    
    commit 4a3b99bc04e501b816db78f70064e26a01257910 upstream.
    
    When mapping the doorbell page from user mode, the driver should use the
    system page size, as this memory is allocated via mmap() from user mode.
    
    Cc: stable@vger.kernel.org
    Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
    Signed-off-by: Long Li <longli@microsoft.com>
    Link: https://patch.msgid.link/1725030993-16213-2-git-send-email-longli@linuxonhyperv.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

RDMA/mana_ib: use the correct page table index based on hardware page size [+ + +]
Author: Long Li <longli@microsoft.com>
Date:   Fri Aug 30 08:16:32 2024 -0700

    RDMA/mana_ib: use the correct page table index based on hardware page size
    
    commit 9e517a8e9d9a303bf9bde35e5c5374795544c152 upstream.
    
    MANA hardware uses 4k page size. When calculating the page table index,
    it should use the hardware page size, not the system page size.
    
    Cc: stable@vger.kernel.org
    Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
    Signed-off-by: Long Li <longli@microsoft.com>
    Link: https://patch.msgid.link/1725030993-16213-1-git-send-email-longli@linuxonhyperv.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
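
To make the difference concrete, the sketch below compares a page index computed with the 4k hardware page size against one computed with a larger system page size. The helper names are hypothetical, and the 64k system page size used in the comments is only an assumed example of a configuration where the two sizes differ.

```c
#include <assert.h>
#include <stdint.h>

/* The commit states that the hardware uses a 4k page size. */
#define DEMO_HW_PAGE_SHIFT 12u

/* Index of the hardware page that contains byte offset off. */
static uint64_t demo_hw_page_index(uint64_t off)
{
	return off >> DEMO_HW_PAGE_SHIFT;
}

/* The index that results if the system page size is used instead;
 * sys_page_shift would be 16 on a kernel built with 64k pages. */
static uint64_t demo_sys_page_index(uint64_t off, unsigned int sys_page_shift)
{
	assert(sys_page_shift < 64);
	return off >> sys_page_shift;
}
```

The two indices agree only when the system page size is also 4k, which is why a bug of this kind can stay hidden on many systems.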

 
remoteproc: k3-r5: Acquire mailbox handle during probe routine [+ + +]
Author: Beleswar Padhi <b-padhi@ti.com>
Date:   Thu Aug 8 13:11:26 2024 +0530

    remoteproc: k3-r5: Acquire mailbox handle during probe routine
    
    [ Upstream commit f3f11cfe890733373ddbb1ce8991ccd4ee5e79e1 ]
    
    Acquire the mailbox handle during device probe and do not release the
    handle in the stop/detach routines or error paths. This removes the
    redundant requests for the mbox handle later during rproc start/attach.
    It also allows the remoteproc driver's probe to be deferred if the
    mailbox has not been probed yet.
    
    Signed-off-by: Beleswar Padhi <b-padhi@ti.com>
    Link: https://lore.kernel.org/r/20240808074127.2688131-3-b-padhi@ti.com
    Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
    Stable-dep-of: 8fa052c29e50 ("remoteproc: k3-r5: Delay notification of wakeup event")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

remoteproc: k3-r5: Delay notification of wakeup event [+ + +]
Author: Udit Kumar <u-kumar1@ti.com>
Date:   Tue Aug 20 16:20:04 2024 +0530

    remoteproc: k3-r5: Delay notification of wakeup event
    
    [ Upstream commit 8fa052c29e509f3e47d56d7fc2ca28094d78c60a ]
    
    A few times, core1 was scheduled to boot before core0, which leads to
    the error:
    
    'k3_r5_rproc_start: can not start core 1 before core 0'.
    
    This happened because of scheduling between the prepare and start
    callbacks. The probe function waits for an event, which was being
    triggered by the prepare callback. To avoid the above condition, move
    the event trigger from the prepare callback to the start callback.
    
    Fixes: 61f6f68447ab ("remoteproc: k3-r5: Wait for core0 power-up before powering up core1")
    Signed-off-by: Udit Kumar <u-kumar1@ti.com>
    [ Applied wakeup event trigger only for Split-Mode booted rprocs ]
    Signed-off-by: Beleswar Padhi <b-padhi@ti.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240820105004.2788327-1-b-padhi@ti.com
    Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

remoteproc: k3-r5: Fix error handling when power-up failed [+ + +]
Author: Jan Kiszka <jan.kiszka@siemens.com>
Date:   Mon Aug 19 17:24:51 2024 +0200

    remoteproc: k3-r5: Fix error handling when power-up failed
    
    commit 9ab27eb5866ccbf57715cfdba4b03d57776092fb upstream.
    
    By simply bailing out, the driver was violating its rule and internal
    assumptions that either both or no rproc should be initialized. E.g.,
    this could cause the first core to be available but not the second one,
    leading to crashes on its shutdown later on while trying to dereference
    that second instance.
    
    Fixes: 61f6f68447ab ("remoteproc: k3-r5: Wait for core0 power-up before powering up core1")
    Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    Acked-by: Beleswar Padhi <b-padhi@ti.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/9f481156-f220-4adf-b3d9-670871351e26@siemens.com
    Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
resource: fix region_intersects() vs add_memory_driver_managed() [+ + +]
Author: Huang Ying <ying.huang@intel.com>
Date:   Fri Sep 6 11:07:11 2024 +0800

    resource: fix region_intersects() vs add_memory_driver_managed()
    
    commit b4afe4183ec77f230851ea139d91e5cf2644c68b upstream.
    
    On a system with CXL memory, the resource tree (/proc/iomem) related to
    CXL memory may look something like the following.
    
    490000000-50fffffff : CXL Window 0
      490000000-50fffffff : region0
        490000000-50fffffff : dax0.0
          490000000-50fffffff : System RAM (kmem)
    
    This is because drivers/dax/kmem.c calls add_memory_driver_managed() while
    onlining CXL memory, which makes "System RAM (kmem)" a descendant of "CXL
    Window X".  This confuses region_intersects(), which expects all "System
    RAM" resources to be at the top level of iomem_resource.  This can lead to
    bugs.
    
    For example, when the following command line is executed to write some
    memory in CXL memory range via /dev/mem,
    
     $ dd if=data of=/dev/mem bs=$((1 << 10)) seek=$((0x490000000 >> 10)) count=1
     dd: error writing '/dev/mem': Bad address
     1+0 records in
     0+0 records out
     0 bytes copied, 0.0283507 s, 0.0 kB/s
    
    the command fails as expected.  However, the error code is wrong.  It
    should be "Operation not permitted" instead of "Bad address".  More
    seriously, the /dev/mem permission checking in devmem_is_allowed() passes
    incorrectly.  Although the access is prevented later because ioremap()
    isn't allowed to map system RAM, it is a potential security issue.  While
    the command executes, the following warning is reported in the kernel log
    for calling ioremap() on system RAM.
    
     ioremap on RAM at 0x0000000490000000 - 0x0000000490000fff
     WARNING: CPU: 2 PID: 416 at arch/x86/mm/ioremap.c:216 __ioremap_caller.constprop.0+0x131/0x35d
     Call Trace:
      memremap+0xcb/0x184
      xlate_dev_mem_ptr+0x25/0x2f
      write_mem+0x94/0xfb
      vfs_write+0x128/0x26d
      ksys_write+0xac/0xfe
      do_syscall_64+0x9a/0xfd
      entry_SYSCALL_64_after_hwframe+0x4b/0x53
    
    The details of the command execution are as follows.  In the above
    resource tree, "System RAM" is a descendant of "CXL Window 0" instead of a
    top level resource.  So, region_intersects() incorrectly reports no System
    RAM resources in the CXL memory region, because it only checks the top
    level resources.  Consequently, devmem_is_allowed() incorrectly returns 1
    (allow access via /dev/mem) for the CXL memory region.  Fortunately,
    ioremap() doesn't allow mapping System RAM and rejects the access.
    
    So, region_intersects() needs to be fixed to work correctly with a
    resource tree in which "System RAM" is not at the top level, as above.  To
    fix it, if we find an unmatched resource at the top level, we continue to
    search for matched resources among its descendants.  So, we no longer miss
    any matched resources in the resource tree.
    
    In the new implementation, an example resource tree
    
    |------------- "CXL Window 0" ------------|
    |-- "System RAM" --|
    
    will behave similarly to the following fake resource tree for
    region_intersects(, IORESOURCE_SYSTEM_RAM, ),
    
    |-- "System RAM" --||-- "CXL Window 0a" --|
    
    Where "CXL Window 0a" is part of the original "CXL Window 0" that
    isn't covered by "System RAM".
    
    Link: https://lkml.kernel.org/r/20240906030713.204292-2-ying.huang@intel.com
    Fixes: c221c0b0308f ("device-dax: "Hotplug" persistent memory for use like normal RAM")
    Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
    Cc: Dave Jiang <dave.jiang@intel.com>
    Cc: Alison Schofield <alison.schofield@intel.com>
    Cc: Vishal Verma <vishal.l.verma@intel.com>
    Cc: Ira Weiny <ira.weiny@intel.com>
    Cc: Alistair Popple <apopple@nvidia.com>
    Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Cc: Bjorn Helgaas <bhelgaas@google.com>
    Cc: Baoquan He <bhe@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
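
The difference between checking only top-level resources and descending into unmatched ones can be sketched with a toy resource tree. This is a deliberately simplified model: it only answers whether any resource with a given flag overlaps a range, whereas the kernel's region_intersects() also distinguishes full, mixed and disjoint results. All demo_* names and the DEMO_SYSRAM flag value are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SYSRAM 0x1u /* stand-in for IORESOURCE_SYSTEM_RAM */

struct demo_res {
	uint64_t start, end;              /* inclusive range */
	unsigned int flags;
	struct demo_res *child, *sibling;
};

static bool demo_overlaps(const struct demo_res *r, uint64_t start, uint64_t end)
{
	return r->start <= end && r->end >= start;
}

/* Old-style check: only looks at the direct children of root. */
static bool demo_intersects_top_only(const struct demo_res *root, uint64_t start,
				     uint64_t end, unsigned int flag)
{
	assert(root != NULL);
	for (const struct demo_res *r = root->child; r; r = r->sibling)
		if (demo_overlaps(r, start, end) && (r->flags & flag))
			return true;
	return false;
}

/* Descending check: an overlapping but unmatched resource is searched
 * for matching descendants instead of being skipped. */
static bool demo_intersects_descend(const struct demo_res *root, uint64_t start,
				    uint64_t end, unsigned int flag)
{
	assert(root != NULL);
	for (const struct demo_res *r = root->child; r; r = r->sibling) {
		if (!demo_overlaps(r, start, end))
			continue;
		if ((r->flags & flag) ||
		    demo_intersects_descend(r, start, end, flag))
			return true;
	}
	return false;
}

/* Build the nesting from the commit message: n[0] is the root, followed by
 * CXL Window 0 > region0 > dax0.0 > System RAM (kmem). */
static void demo_build_cxl_tree(struct demo_res n[5])
{
	for (int i = 0; i < 5; i++) {
		n[i].start = i ? 0x490000000ULL : 0;
		n[i].end = i ? 0x50fffffffULL : UINT64_MAX;
		n[i].flags = (i == 4) ? DEMO_SYSRAM : 0;
		n[i].child = (i < 4) ? &n[i + 1] : NULL;
		n[i].sibling = NULL;
	}
}
```

On this tree, the top-level-only check misses the nested System RAM for the range written by the dd command, while the descending check finds it.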

 
Revert "ALSA: hda: Conditionally use snooping for AMD HDMI" [+ + +]
Author: Takashi Iwai <tiwai@suse.de>
Date:   Wed Oct 2 17:59:39 2024 +0200

    Revert "ALSA: hda: Conditionally use snooping for AMD HDMI"
    
    commit 3f7f36a4559ef78a6418c5f0447fbfbdcf671956 upstream.
    
    This reverts commit 478689b5990deb626a0b3f1ebf165979914d6be4.
    
    The fix seems to be leading to regressions on other systems.
    Also, checking for the presence of an IOMMU via get_dma_ops() isn't
    reliable, and it's no longer applicable for 6.12.  After all, it's not
    the right fix, so let's revert it first.
    
    To be noted, the PCM buffer allocation has been changed to try
    contiguous pages first since 6.12, so the problem could already be
    addressed without this hackish workaround.
    
    Reported-by: Salvatore Bonaccorso <carnil@debian.org>
    Closes: https://lore.kernel.org/ZvgCdYfKgwHpJXGE@eldamar.lan
    Link: https://patch.msgid.link/20241002155948.4859-1-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Revert "drm/amd/display: Skip Recompute DSC Params if no Stream on Link" [+ + +]
Author: Jonathan Gray <jsg@jsg.id.au>
Date:   Mon Oct 7 14:59:22 2024 +1100

    Revert "drm/amd/display: Skip Recompute DSC Params if no Stream on Link"
    
    This reverts commit d45c64d933586d409d3f1e0ecaca4da494b1d9c6.
    
    duplicated a change made in 6.11-rc3
    50e376f1fe3bf571d0645ddf48ad37eb58323919
    
    Cc: stable@vger.kernel.org # 6.11
    Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
riscv: define ILLEGAL_POINTER_VALUE for 64bit [+ + +]
Author: Jisheng Zhang <jszhang@kernel.org>
Date:   Sat Jul 6 01:02:10 2024 +0800

    riscv: define ILLEGAL_POINTER_VALUE for 64bit
    
    commit 5c178472af247c7b50f962495bb7462ba453b9fb upstream.
    
    This is used in poison.h for the poison pointer offset. Based on the
    current SV39, SV48 and SV57 vm layouts, 0xdead000000000000 is a proper
    value that is not mappable; this can avoid potentially turning an oops
    into an exploit.
    
    Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
    Fixes: fbe934d69eb7 ("RISC-V: Build Infrastructure")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240705170210.3236-1-jszhang@kernel.org
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
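
One way to see why this value cannot be mapped is to check it against the sign-extension rule for virtual addresses: with a virtual address width of N bits, bits 63 down to N-1 must all be equal. The sketch below applies that check for widths of 39, 48 and 57 bits. Reading "not mappable" as "not a sign-extended address" is an interpretation made for this example, and demo_is_canonical() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ILLEGAL_POINTER_VALUE 0xdead000000000000ULL

/* True if va is a sign-extended address for a virtual address width of
 * va_bits, that is, bits 63..va_bits-1 are all zero or all one. */
static bool demo_is_canonical(uint64_t va, unsigned int va_bits)
{
	assert(va_bits >= 2 && va_bits <= 64);
	uint64_t top = va >> (va_bits - 1);
	uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

	return top == 0 || top == all_ones;
}
```

Small offsets added to the base, as poison pointers are built, leave the upper bits unchanged, so the resulting values fail the same check.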

riscv: Fix kernel stack size when KASAN is enabled [+ + +]
Author: Alexandre Ghiti <alexghiti@rivosinc.com>
Date:   Tue Sep 17 17:03:28 2024 +0200

    riscv: Fix kernel stack size when KASAN is enabled
    
    commit cfb10de18538e383dbc4f3ce7f477ce49287ff3d upstream.
    
    We use Kconfig to select the kernel stack size, doubling the default
    size if KASAN is enabled.
    
    But that actually only works if KASAN is selected from the beginning,
    meaning that if KASAN config is added later (for example using
    menuconfig), CONFIG_THREAD_SIZE_ORDER won't be updated, keeping the
    default size, which is not enough for KASAN as reported in [1].
    
    So fix this by moving the logic to compute the right kernel stack into a
    header.
    
    Fixes: a7555f6b62e7 ("riscv: stack: Add config of thread stack size")
    Reported-by: syzbot+ba9eac24453387a9d502@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/all/000000000000eb301906222aadc2@google.com/ [1]
    Cc: stable@vger.kernel.org
    Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/r/20240917150328.59831-1-alexghiti@rivosinc.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
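
Computing the stack size in a header, as this commit describes, could look roughly like the sketch below. The DEMO_ prefixed macros, the 4096-byte page size and the base order of 2 are assumptions chosen for illustration; the point is that the KASAN adjustment is derived at build time from whether KASAN is enabled, so it cannot go stale in a saved configuration.

```c
#include <assert.h>

/* Hypothetical configuration inputs for this sketch. */
#define DEMO_PAGE_SIZE 4096UL
#define DEMO_CONFIG_THREAD_SIZE_ORDER 2

/* Derive the KASAN adjustment here, so that it tracks the KASAN setting
 * on every build instead of being frozen into a stored config value. */
#ifdef DEMO_CONFIG_KASAN
#define DEMO_KASAN_STACK_ORDER 1
#else
#define DEMO_KASAN_STACK_ORDER 0
#endif

#define DEMO_THREAD_SIZE_ORDER \
	(DEMO_CONFIG_THREAD_SIZE_ORDER + DEMO_KASAN_STACK_ORDER)
#define DEMO_THREAD_SIZE (DEMO_PAGE_SIZE << DEMO_THREAD_SIZE_ORDER)

/* Return the stack size in bytes for the current build settings. */
static unsigned long demo_thread_size(void)
{
	assert(DEMO_THREAD_SIZE_ORDER >= DEMO_CONFIG_THREAD_SIZE_ORDER);
	return DEMO_THREAD_SIZE;
}
```

Defining DEMO_CONFIG_KASAN before these lines doubles the result, with no change to the base order.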

 
rtc: at91sam9: fix OF node leak in probe() error path [+ + +]
Author: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Date:   Sun Aug 25 20:31:03 2024 +0200

    rtc: at91sam9: fix OF node leak in probe() error path
    
    commit 73580e2ee6adfb40276bd420da3bb1abae204e10 upstream.
    
    Driver is leaking an OF node reference obtained from
    of_parse_phandle_with_fixed_args().
    
    Fixes: 43e112bb3dea ("rtc: at91sam9: make use of syscon/regmap to access GPBR registers")
    Cc: stable@vger.kernel.org
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Link: https://lore.kernel.org/r/20240825183103.102904-1-krzysztof.kozlowski@linaro.org
    Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
rtla: Fix the help text in osnoise and timerlat top tools [+ + +]
Author: Eder Zulian <ezulian@redhat.com>
Date:   Tue Aug 13 17:58:31 2024 +0200

    rtla: Fix the help text in osnoise and timerlat top tools
    
    commit 3d7b8ea7a8a20a45d019382c4dc6ed79e8bb95cf upstream.
    
    The help text in osnoise top and timerlat top had some minor errors
    and omissions. The -d option was missing the 's' (second) abbreviation and
    the error message for '-d' used '-D'.
    
    Cc: stable@vger.kernel.org
    Fixes: 1eceb2fc2ca54 ("rtla/osnoise: Add osnoise top mode")
    Fixes: a828cd18bc4ad ("rtla: Add timerlat tool and timelart top mode")
    Link: https://lore.kernel.org/20240813155831.384446-1-ezulian@redhat.com
    Suggested-by: Tomas Glozar <tglozar@redhat.com>
    Reviewed-by: Tomas Glozar <tglozar@redhat.com>
    Signed-off-by: Eder Zulian <ezulian@redhat.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
rust: kbuild: auto generate helper exports [+ + +]
Author: Gary Guo <gary@garyguo.net>
Date:   Sat Aug 17 17:51:32 2024 +0100

    rust: kbuild: auto generate helper exports
    
    [ Upstream commit e26fa546042add70944d018b930530d16b3cf626 ]
    
    This removes the need to explicitly export all symbols.
    
    Generate helper exports similarly to what's currently done for Rust
    crates. These helpers are exclusively called from within Rust code and
    therefore can be treated similarly to other Rust symbols.
    
    Signed-off-by: Gary Guo <gary@garyguo.net>
    Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
    Tested-by: Boqun Feng <boqun.feng@gmail.com>
    Link: https://lore.kernel.org/r/20240817165302.3852499-1-gary@garyguo.net
    [ Fixed dependency path, reworded slightly, edited comment a bit and
      rebased on top of the changes made when applying Andreas' patch
      (e.g. no `README.md` anymore, so moved the edits).  - Miguel ]
    Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
    Stable-dep-of: d065cc76054d ("rust: mutex: fix __mutex_init() usage in case of PREEMPT_RT")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

rust: kbuild: split up helpers.c [+ + +]
Author: Andreas Hindborg <a.hindborg@kernel.org>
Date:   Thu Aug 15 10:30:26 2024 +0000

    rust: kbuild: split up helpers.c
    
    [ Upstream commit 876346536c1b59a5b1b5e44477b1b3ece77647fd ]
    
    This patch splits up the rust helpers C file. When rebasing patch sets on
    upstream linux, merge conflicts in helpers.c are common and time consuming
    [1]. Thus, split the file so that each kernel component can live in a
    separate file.
    
    This patch lists helper files explicitly and thus conflicts in the file
    list are still likely. However, they should be simpler to resolve than
    the conflicts usually seen in helpers.c.
    
    [ Removed `README.md` and undeleted the original comment since now,
      in v3 of the series, we have a `helpers.c` again; which also allows
      us to keep the "Sorted alphabetically" line and makes the diff easier.
    
      In addition, updated the Documentation/ mentions of the file, reworded
      title and removed blank lines at the end of `page.c`.  - Miguel ]
    
    Link: https://rust-for-linux.zulipchat.com/#narrow/stream/288089-General/topic/Splitting.20up.20helpers.2Ec/near/426694012 [1]
    Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com>
    Reviewed-by: Gary Guo <gary@garyguo.net>
    Acked-by: Dirk Behme <dirk.behme@de.bosch.com>
    Reviewed-by: Alice Ryhl <aliceryhl@google.com>
    Reviewed-by: Benno Lossin <benno.lossin@proton.me>
    Link: https://lore.kernel.org/r/20240815103016.2771842-1-nmi@metaspace.dk
    Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
    Stable-dep-of: d065cc76054d ("rust: mutex: fix __mutex_init() usage in case of PREEMPT_RT")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

rust: mutex: fix __mutex_init() usage in case of PREEMPT_RT [+ + +]
Author: Dirk Behme <dirk.behme@de.bosch.com>
Date:   Mon Sep 16 09:37:52 2024 +0200

    rust: mutex: fix __mutex_init() usage in case of PREEMPT_RT
    
    [ Upstream commit d065cc76054d21e48a839a2a19ba99dbc51a4d11 ]
    
    In case CONFIG_PREEMPT_RT is enabled __mutex_init() becomes a macro
    instead of an extern function (simplified from
    include/linux/mutex.h):
    
        #ifndef CONFIG_PREEMPT_RT
        extern void __mutex_init(struct mutex *lock, const char *name,
                             struct lock_class_key *key);
        #else
        #define __mutex_init(mutex, name, key)              \
        do {                                                \
            rt_mutex_base_init(&(mutex)->rtmutex);          \
            __mutex_rt_init((mutex), name, key);            \
        } while (0)
        #endif
    
    bindgen does not resolve the macro, which results in a build
    error:
    
    error[E0425]: cannot find function `__mutex_init` in crate `bindings`
         --> rust/kernel/sync/lock/mutex.rs:104:28
          |
    104   |           unsafe { bindings::__mutex_init(ptr, name, key) }
          |                              ^^^^^^^^^^^^ help: a function with a similar name exists: `__mutex_rt_init`
          |
         ::: rust/bindings/bindings_generated.rs:23722:5
          |
    23722 | /     pub fn __mutex_rt_init(
    23723 | |         lock: *mut mutex,
    23724 | |         name: *const core::ffi::c_char,
    23725 | |         key: *mut lock_class_key,
    23726 | |     );
          | |_____- similarly named function `__mutex_rt_init` defined here
    
    Fix this by adding a helper.
    
    As explained by Gary Guo in [1], no #ifdef CONFIG_PREEMPT_RT
    is needed here, as rust/bindings/lib.rs prefers an extern function to
    a helper if an extern function exists.
    
    Reported-by: Conor Dooley <conor@kernel.org>
    Link: https://lore.kernel.org/rust-for-linux/20240913-shack-estate-b376a65921b1@spud/
    Link: https://lore.kernel.org/rust-for-linux/20240915123626.1a170103.gary@garyguo.net/ [1]
    Fixes: 6d20d629c6d8 ("rust: lock: introduce `Mutex`")
    Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
    Tested-by: Conor Dooley <conor.dooley@microchip.com>
    Reviewed-by: Gary Guo <gary@garyguo.net>
    Link: https://lore.kernel.org/r/20240916073752.3123484-1-dirk.behme@de.bosch.com
    [ Reworded to include the proper example by Dirk. - Miguel ]
    Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
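
The reason a helper resolves the build error can be shown with a self-contained model: a macro produces no linkable symbol, while a small wrapper function that expands the macro does. The types, the demo_mutex_init() macro and demo_helper_mutex_init() below are simplified stand-ins written for this example and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel types, for illustration only. */
struct demo_lock_class_key { int unused; };
struct demo_mutex {
	const char *name;
	struct demo_lock_class_key *key;
	int initialized;
};

/* Stand-in for the PREEMPT_RT case: the initializer is a macro, so no
 * function symbol exists for a bindings generator to pick up. */
#define demo_mutex_init(m, n, k)        \
do {                                    \
	(m)->name = (n);                \
	(m)->key = (k);                 \
	(m)->initialized = 1;           \
} while (0)

/* The helper: a real function that expands the macro, giving foreign
 * code a symbol to call however the initializer happens to be defined. */
void demo_helper_mutex_init(struct demo_mutex *m, const char *name,
			    struct demo_lock_class_key *key)
{
	assert(m != NULL);
	demo_mutex_init(m, name, key);
}
```

The wrapper compiles the same way whether the wrapped name is a macro or a function, which fits the commit's remark that no #ifdef is needed.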

rust: sync: require `T: Sync` for `LockedBy::access` [+ + +]
Author: Alice Ryhl <aliceryhl@google.com>
Date:   Sun Sep 15 14:41:28 2024 +0000

    rust: sync: require `T: Sync` for `LockedBy::access`
    
    commit a8ee30f45d5d57467ddb7877ed6914d0eba0af7f upstream.
    
    The `LockedBy::access` method only requires a shared reference to the
    owner, so if we have shared access to the `LockedBy` from several
    threads at once, then two threads could call `access` in parallel and
    both obtain a shared reference to the inner value. Thus, require that
    `T: Sync` when calling the `access` method.
    
    An alternative is to require `T: Sync` in the `impl Sync for LockedBy`.
    This patch does not choose that approach as it gives up the ability to
    use `LockedBy` with `!Sync` types, which is okay as long as you only use
    `access_mut`.
    
    Cc: stable@vger.kernel.org
    Fixes: 7b1f55e3a984 ("rust: sync: introduce `LockedBy`")
    Signed-off-by: Alice Ryhl <aliceryhl@google.com>
    Suggested-by: Boqun Feng <boqun.feng@gmail.com>
    Reviewed-by: Gary Guo <gary@garyguo.net>
    Link: https://lore.kernel.org/r/20240915-locked-by-sync-fix-v2-1-1a8d89710392@google.com
    Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
rxrpc: Fix a race between socket set up and I/O thread creation [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Tue Oct 1 14:26:58 2024 +0100

    rxrpc: Fix a race between socket set up and I/O thread creation
    
    commit bc212465326e8587325f520a052346f0b57360e6 upstream.
    
    rxrpc_open_socket() sets up the socket and then sets up the I/O thread
    that will handle it.  This is a problem, however, as there's a gap
    between the two phases in which a packet may come into rxrpc_encap_rcv()
    from the UDP socket, and we oops when trying to wake the not-yet-created
    I/O thread.
    
    As a quick fix, just make rxrpc_encap_rcv() discard the packet if there's
    no I/O thread yet.
    
    A better, but more intrusive fix would perhaps be to rearrange things such
    that the socket creation is done by the I/O thread.
    
    Fixes: a275da62e8c1 ("rxrpc: Create a per-local endpoint receive queue and I/O thread")
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: yuxuanzhe@outlook.com
    cc: Marc Dionne <marc.dionne@auristor.com>
    cc: Simon Horman <horms@kernel.org>
    cc: linux-afs@lists.infradead.org
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/20241001132702.3122709-2-dhowells@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
sched/core: Add clearing of ->dl_server in put_prev_task_balance() [+ + +]
Author: Joel Fernandes (Google) <joel@joelfernandes.org>
Date:   Mon May 27 14:06:48 2024 +0200

    sched/core: Add clearing of ->dl_server in put_prev_task_balance()
    
    commit c245910049d04fbfa85bb2f5acd591c24e9907c7 upstream.
    
    Paths using put_prev_task_balance() need to do a pick shortly
    after. Make sure they also clear the ->dl_server on prev as a
    part of that.
    
    Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers")
    Signed-off-by: "Joel Fernandes (Google)" <joel@joelfernandes.org>
    Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Tested-by: Juri Lelli <juri.lelli@redhat.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/d184d554434bedbad0581cb34656582d78655150.1716811044.git.bristot@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sched/core: Clear prev->dl_server in CFS pick fast path [+ + +]
Author: Youssef Esmat <youssefesmat@google.com>
Date:   Mon May 27 14:06:49 2024 +0200

    sched/core: Clear prev->dl_server in CFS pick fast path
    
    commit a741b82423f41501e301eb6f9820b45ca202e877 upstream.
    
    In case the previous pick was a DL server pick, ->dl_server might be
    set. Clear it in the fast path as well.
    
    Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers")
    Signed-off-by: Youssef Esmat <youssefesmat@google.com>
    Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Tested-by: Juri Lelli <juri.lelli@redhat.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/7f7381ccba09efcb4a1c1ff808ed58385eccc222.1716811044.git.bristot@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
sched/deadline: Comment sched_dl_entity::dl_server variable [+ + +]
Author: Daniel Bristot de Oliveira <bristot@kernel.org>
Date:   Mon May 27 14:06:47 2024 +0200

    sched/deadline: Comment sched_dl_entity::dl_server variable
    
    commit f23c042ce34ba265cf3129d530702b5d218e3f4b upstream.
    
    Add an explanation for the newly added variable.
    
    Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers")
    Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Tested-by: Juri Lelli <juri.lelli@redhat.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/147f7aa8cb8fd925f36aa8059af6a35aad08b45a.1716811044.git.bristot@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
sched: psi: fix bogus pressure spikes from aggregation race [+ + +]
Author: Johannes Weiner <hannes@cmpxchg.org>
Date:   Thu Oct 3 07:29:05 2024 -0400

    sched: psi: fix bogus pressure spikes from aggregation race
    
    commit 3840cbe24cf060ea05a585ca497814609f5d47d1 upstream.
    
    Brandon reports sporadic, non-sensical spikes in cumulative pressure
    time (total=) when reading cpu.pressure at a high rate. This is due to
    a race condition between reader aggregation and tasks changing states.
    
    While it affects all states and all resources captured by PSI, in
    practice it most likely triggers with CPU pressure, since scheduling
    events are so frequent compared to other resource events.
    
    The race context is the live snooping of ongoing stalls during a
    pressure read. The read aggregates per-cpu records for stalls that
    have concluded, but will also incorporate ad-hoc the duration of any
    active state that hasn't been recorded yet. This is important to get
    timely measurements of ongoing stalls. Those ad-hoc samples are
    calculated on-the-fly up to the current time on that CPU; since the
    stall hasn't concluded, it's expected that this is the minimum amount
    of stall time that will enter the per-cpu records once it does.
    
    The problem is that the path that concludes the state uses a CPU clock
    read that is not synchronized against aggregators; the clock is read
    outside of the seqlock protection. This allows aggregators to race and
    snoop a stall with a longer duration than will actually be recorded.
    
    With the recorded stall time being less than the last snapshot
    remembered by the aggregator, a subsequent sample will underflow and
    observe a bogus delta value, resulting in an erratic jump in pressure.
    
    Fix this by moving the clock read of the state change into the seqlock
    protection. This ensures no aggregation can snoop live stalls past the
    time that's recorded when the state concludes.
    
    Reported-by: Brandon Duffany <brandon@buildbuddy.io>
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219194
    Link: https://lore.kernel.org/lkml/20240827121851.GB438928@cmpxchg.org/
    Fixes: df77430639c9 ("psi: Reduce calls to sched_clock() in psi")
    Cc: stable@vger.kernel.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
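
Illustration (not part of the upstream commit): a small arithmetic sketch
of the underflow described above, assuming the stall counter behaves like
an unsigned 32-bit value. The numbers and the helper delta_u32 are made up
for the example.

```python
MASK32 = 0xFFFFFFFF  # assumption: the stall counter is an unsigned 32-bit value


def delta_u32(sample, last_snapshot):
    """Difference between two samples with unsigned 32-bit wraparound."""
    return (sample - last_snapshot) & MASK32


recorded = 1000          # stall time already in the per-cpu record
snooped = recorded + 50  # racy reader adds a live stall it measured as 50
final = recorded + 30    # the stall concludes, but only 30 gets recorded

good_delta = delta_u32(snooped, recorded)   # 50, the expected growth
bogus_delta = delta_u32(final, snooped)     # wraps around to a huge value
```

Because the next sample (1030) is smaller than the remembered snapshot
(1050), the subtraction wraps and the reader sees an enormous jump. Reading
the clock under the same protection as the record removes that window.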

 
scripts/gdb: add iteration function for rbtree [+ + +]
Author: Kuan-Ying Lee <kuan-ying.lee@canonical.com>
Date:   Tue Jul 23 14:48:58 2024 +0800

    scripts/gdb: add iteration function for rbtree
    
    commit 0c77e103c45fa1b119f5d3bb4625eee081c1a6cf upstream.
    
    Add inorder iteration function for rbtree usage.
    
    This is a preparation patch for the next patch to fix the gdb mounts
    issue.
    
    Link: https://lkml.kernel.org/r/20240723064902.124154-3-kuan-ying.lee@canonical.com
    Fixes: 2eea9ce4310d ("mounts: keep list of mounts in an rbtree")
    Signed-off-by: Kuan-Ying Lee <kuan-ying.lee@canonical.com>
    Cc: Jan Kiszka <jan.kiszka@siemens.com>
    Cc: Kieran Bingham <kbingham@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
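
Illustration (not part of the upstream commit): a self-contained sketch of
in-order iteration over a binary tree using parent pointers, in the style
of a first/next walk. The Node class and the helpers insert, first,
next_node and inorder are hypothetical and operate on plain Python objects,
not on gdb values; the toy insert does no red-black rebalancing.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None


def insert(root, key):
    """Plain binary-search-tree insert (no rebalancing); returns the root."""
    node = Node(key)
    if root is None:
        return node
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = node
                break
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = node
                break
            cur = cur.right
    node.parent = cur
    return root


def first(root):
    """Leftmost node of the tree, or None for an empty tree."""
    node = root
    while node is not None and node.left is not None:
        node = node.left
    return node


def next_node(node):
    """In-order successor of node, or None if node is the last one."""
    if node.right is not None:
        node = node.right
        while node.left is not None:
            node = node.left
        return node
    # Climb until we leave a left subtree; that parent is the successor.
    while node.parent is not None and node is node.parent.right:
        node = node.parent
    return node.parent


def inorder(root):
    """Yield every node in ascending key order without recursion."""
    node = first(root)
    while node is not None:
        yield node
        node = next_node(node)
```

The walk needs no recursion or explicit stack, which suits a debugger
script that only follows pointers from one node to the next.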

scripts/gdb: fix lx-mounts command error [+ + +]
Author: Kuan-Ying Lee <kuan-ying.lee@canonical.com>
Date:   Tue Jul 23 14:48:59 2024 +0800

    scripts/gdb: fix lx-mounts command error
    
    commit 4b183f613924ad536be2f8bd12b307e9c5a96bf6 upstream.
    
    (gdb) lx-mounts
          mount          super_block     devname pathname fstype options
    Python Exception <class 'gdb.error'>: There is no member named list.
    Error occurred in Python: There is no member named list.
    
    We encounter the above issue after commit 2eea9ce4310d ("mounts: keep
    list of mounts in an rbtree"). That commit moved mounts from a list into
    an rbtree.
    
    So use the rbtree to iterate over all mount information instead.
    
    Link: https://lkml.kernel.org/r/20240723064902.124154-4-kuan-ying.lee@canonical.com
    Fixes: 2eea9ce4310d ("mounts: keep list of mounts in an rbtree")
    Signed-off-by: Kuan-Ying Lee <kuan-ying.lee@canonical.com>
    Cc: Jan Kiszka <jan.kiszka@siemens.com>
    Cc: Kieran Bingham <kbingham@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scripts/gdb: fix timerlist parsing issue [+ + +]
Author: Kuan-Ying Lee <kuan-ying.lee@canonical.com>
Date:   Tue Jul 23 14:48:57 2024 +0800

    scripts/gdb: fix timerlist parsing issue
    
    commit a633a4b8001a7f2a12584f267a3280990d9ababa upstream.
    
    Patch series "Fix some GDB command error and add some GDB commands", v3.
    
    Fix some GDB command errors and add some useful GDB commands.
    
    
    This patch (of 5):
    
    Commit a478ffb2ae23 ("tick: Move individual bit features to debuggable
    mask accesses") and commit 7988e5ae2be7 ("tick: Split nohz and highres
    features from nohz_mode") move 'tick_stopped' and 'nohz_mode' to the
    flags field, which breaks the gdb lx-timerlist command:
    
    (gdb) lx-timerlist
    Python Exception <class 'gdb.error'>: There is no member named nohz_mode.
    Error occurred in Python: There is no member named nohz_mode.
    
    (gdb) lx-timerlist
    Python Exception <class 'gdb.error'>: There is no member named tick_stopped.
    Error occurred in Python: There is no member named tick_stopped.
    
    Update the gdb script to read 'tick_stopped' and 'nohz_mode' from the
    flags field instead.
    
    Link: https://lkml.kernel.org/r/20240723064902.124154-1-kuan-ying.lee@canonical.com
    Link: https://lkml.kernel.org/r/20240723064902.124154-2-kuan-ying.lee@canonical.com
    Fixes: a478ffb2ae23 ("tick: Move individual bit features to debuggable mask accesses")
    Fixes: 7988e5ae2be7 ("tick: Split nohz and highres features from nohz_mode")
    Signed-off-by: Kuan-Ying Lee <kuan-ying.lee@canonical.com>
    Cc: Jan Kiszka <jan.kiszka@siemens.com>
    Cc: Kieran Bingham <kbingham@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
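
Illustration (not part of the upstream commit): a sketch of decoding
individual features out of a single flags word. The bit positions below are
assumptions chosen for the example, and decode_tick_flags is a hypothetical
helper; the authoritative values are the kernel's own TS_FLAG_* definitions.

```python
# Assumed bit positions, for illustration only; the authoritative values
# are the TS_FLAG_* definitions in the kernel's tick-sched header.
TS_FLAG_STOPPED = 1 << 1
TS_FLAG_NOHZ = 1 << 4
TS_FLAG_HIGHRES = 1 << 5


def decode_tick_flags(flags):
    """Split a flags word into the booleans that used to be separate members."""
    return {
        "tick_stopped": bool(flags & TS_FLAG_STOPPED),
        "nohz": bool(flags & TS_FLAG_NOHZ),
        "highres": bool(flags & TS_FLAG_HIGHRES),
    }
```

In a gdb script the input would presumably come from converting the
structure's flags member to an integer before masking.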

 
scsi: aacraid: Rearrange order of struct aac_srb_unit [+ + +]
Author: Kees Cook <kees@kernel.org>
Date:   Thu Jul 11 14:57:37 2024 -0700

    scsi: aacraid: Rearrange order of struct aac_srb_unit
    
    [ Upstream commit 6e5860b0ad4934baee8c7a202c02033b2631bb44 ]
    
    struct aac_srb_unit contains struct aac_srb, which contains struct sgmap,
    which ends in a (currently) "fake" (1-element) flexible array.  Converting
    this to a flexible array is needed so that runtime bounds checking won't
    think the array is fixed size (i.e. under CONFIG_FORTIFY_SOURCE=y and/or
    CONFIG_UBSAN_BOUNDS=y), as other parts of aacraid use struct sgmap as a
    flexible array.
    
    It is not legal to have a flexible array in the middle of a structure, so
    it either needs to be split up or rearranged so that it is at the end of
    the structure. Luckily, struct aac_srb_unit, which is exclusively
    consumed/updated by aac_send_safw_bmic_cmd(), does not depend on member
    ordering.
    
    The values set in the on-stack struct aac_srb_unit instance "srbu" by the
    only two callers, aac_issue_safw_bmic_identify() and
    aac_get_safw_ciss_luns(), do not contain anything in srbu.srb.sgmap.sg, and
    they both implicitly initialize srbu.srb.sgmap.count to 0 during
    memset(). For example:
    
            memset(&srbu, 0, sizeof(struct aac_srb_unit));
    
            srbcmd = &srbu.srb;
            srbcmd->flags   = cpu_to_le32(SRB_DataIn);
            srbcmd->cdb[0]  = CISS_REPORT_PHYSICAL_LUNS;
            srbcmd->cdb[1]  = 2; /* extended reporting */
            srbcmd->cdb[8]  = (u8)(datasize >> 8);
            srbcmd->cdb[9]  = (u8)(datasize);
    
            rcode = aac_send_safw_bmic_cmd(dev, &srbu, phys_luns, datasize);
    
    During aac_send_safw_bmic_cmd(), a separate srb is mapped into DMA, and has
    srbu.srb copied into it:
    
            srb = fib_data(fibptr);
            memcpy(srb, &srbu->srb, sizeof(struct aac_srb));
    
    Only then is srb.sgmap.count written and srb->sg populated:
    
            srb->count              = cpu_to_le32(xfer_len);
    
            sg64 = (struct sgmap64 *)&srb->sg;
            sg64->count             = cpu_to_le32(1);
            sg64->sg[0].addr[1]     = cpu_to_le32(upper_32_bits(addr));
            sg64->sg[0].addr[0]     = cpu_to_le32(lower_32_bits(addr));
            sg64->sg[0].count       = cpu_to_le32(xfer_len);
    
    But this is happening in the DMA memory, not in srbu.srb. An attempt to
    copy the changes back to srbu does happen:
    
            /*
             * Copy the updated data for other dumping or other usage if
             * needed
             */
            memcpy(&srbu->srb, srb, sizeof(struct aac_srb));
    
    But this was never correct: the sg64 (3 u32s) overlap of srb.sg (2 u32s)
    always meant that srbu.srb would have held truncated information and any
    attempt to walk srbu.srb.sg.sg based on the value of srbu.srb.sg.count
    would result in attempting to parse past the end of srbu.srb.sg.sg[0] into
    srbu.srb_reply.
    
    After getting a reply from hardware, the reply is copied into
    srbu.srb_reply:
    
            srb_reply = (struct aac_srb_reply *)fib_data(fibptr);
            memcpy(&srbu->srb_reply, srb_reply, sizeof(struct aac_srb_reply));
    
    This has always been fixed-size, so there's no issue here. It is worth
    noting that the two callers _never check_ srbu contents -- neither
    srbu.srb nor srbu.srb_reply is examined. (They depend on the mapped
    xfer_buf instead.)
    
    Therefore, the ordering of members in struct aac_srb_unit does not matter,
    and the flexible array member can be moved to the end.
    
    (Additionally, the two memcpy()s that update srbu could be entirely
    removed as they are never consumed, but I left that as-is.)
    
    Signed-off-by: Kees Cook <kees@kernel.org>
    Link: https://lore.kernel.org/r/20240711215739.208776-1-kees@kernel.org
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
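
Illustration (not part of the upstream commit): a sketch of the size
mismatch mentioned above ("sg64 (3 u32s)" versus "srb.sg (2 u32s)"), using
ctypes to model the two entry layouts. The class names SgEntry and
SgEntry64 are hypothetical; the field layout is an assumption based on the
counts given in the commit message.

```python
import ctypes


class SgEntry(ctypes.Structure):
    """Assumed 32-bit scatter-gather entry: one address word plus a count."""
    _fields_ = [("addr", ctypes.c_uint32), ("count", ctypes.c_uint32)]


class SgEntry64(ctypes.Structure):
    """Assumed 64-bit entry: two address words plus a count."""
    _fields_ = [("addr", ctypes.c_uint32 * 2), ("count", ctypes.c_uint32)]


small = ctypes.sizeof(SgEntry)     # 2 x u32
large = ctypes.sizeof(SgEntry64)   # 3 x u32
overhang = large - small           # bytes that do not fit in a 1-element slot
```

Writing one 64-bit entry into storage sized for one 32-bit entry overruns
it by one u32, which is why a fixed 1-element array in the middle of the
structure could never hold the copied-back data intact.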

scsi: lpfc: Fix unsolicited FLOGI kref imbalance when in direct attached topology [+ + +]
Author: Justin Tee <justin.tee@broadcom.com>
Date:   Fri Jul 26 16:15:09 2024 -0700

    scsi: lpfc: Fix unsolicited FLOGI kref imbalance when in direct attached topology
    
    [ Upstream commit b5c18c9dd138733c16893613345af44deadcf05e ]
    
    In direct attached topology, certain target vendors that are quick to issue
    FLOGI followed by a cable pull for more than dev_loss_tmo may result in a
    kref imbalance for the remote port ndlp object.
    
    Add an nlp_get when the defer_flogi_acc flag is set.  This is expected to
    balance the nlp_put in the defer_flogi_acc clause in the
    lpfc_issue_els_flogi() routine.  Because we need to retain the ndlp ptr,
    reorganize all of the defer_flogi_acc information into one
    lpfc_defer_flogi_acc struct.
    
    Signed-off-by: Justin Tee <justin.tee@broadcom.com>
    Link: https://lore.kernel.org/r/20240726231512.92867-6-justintee8345@gmail.com
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
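
Illustration (not part of the upstream commit): a toy reference-count model
of the imbalance described above. RemotePort and deferred_accept are
hypothetical names; the model only shows why a put needs a matching get.

```python
class RemotePort:
    """Toy reference-counted object standing in for a remote port node."""

    def __init__(self):
        self.refs = 1          # reference held by the owner
        self.released = False

    def get(self):
        self.refs += 1

    def put(self):
        if self.refs <= 0:
            raise RuntimeError("put without a matching get")
        self.refs -= 1
        if self.refs == 0:
            self.released = True


def deferred_accept(port, take_ref):
    """Defer an accept, then handle it; returns the remaining references."""
    if take_ref:
        port.get()   # reference held while the accept is deferred
    port.put()       # dropped when the deferred accept is finally handled
    return port.refs
```

With take_ref=False the owner's only reference is consumed and the object
is released early; with take_ref=True the count returns to where it began.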

scsi: lpfc: Update PRLO handling in direct attached topology [+ + +]
Author: Justin Tee <justin.tee@broadcom.com>
Date:   Fri Jul 26 16:15:10 2024 -0700

    scsi: lpfc: Update PRLO handling in direct attached topology
    
    [ Upstream commit 1f0f7679ad8942f810b0f19ee9cf098c3502d66a ]
    
    A kref imbalance occurs when handling an unsolicited PRLO in direct
    attached topology.
    
    Rework PRLO rcv handling when in MAPPED state.  Save the state that we were
    handling a PRLO by setting nlp_last_elscmd to ELS_CMD_PRLO.  Then in the
    lpfc_cmpl_els_logo_acc() completion routine, manually restart discovery.
    By issuing the PLOGI, which takes an nlp_get, before the nlp_put at the
    end of the lpfc_cmpl_els_logo_acc() routine, we avoid a final nlp_put
    while still allowing the unreg_rpi to happen.
    
    Signed-off-by: Justin Tee <justin.tee@broadcom.com>
    Link: https://lore.kernel.org/r/20240726231512.92867-7-justintee8345@gmail.com
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

scsi: lpfc: Validate hdwq pointers before dereferencing in reset/errata paths [+ + +]
Author: Justin Tee <justin.tee@broadcom.com>
Date:   Fri Jul 26 16:15:07 2024 -0700

    scsi: lpfc: Validate hdwq pointers before dereferencing in reset/errata paths
    
    [ Upstream commit 2be1d4f11944cd6283cb97268b3e17c4424945ca ]
    
    When the HBA is undergoing a reset or is handling an errata event, NULL ptr
    dereference crashes may occur in routines such as
    lpfc_sli_flush_io_rings(), lpfc_dev_loss_tmo_callbk(), or
    lpfc_abort_handler().
    
    Add NULL ptr checks before dereferencing hdwq pointers that may have been
    freed due to operations colliding with a reset or errata event handler.
    
    Signed-off-by: Justin Tee <justin.tee@broadcom.com>
    Link: https://lore.kernel.org/r/20240726231512.92867-4-justintee8345@gmail.com
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

scsi: NCR5380: Initialize buffer for MSG IN and STATUS transfers [+ + +]
Author: Finn Thain <fthain@linux-m68k.org>
Date:   Wed Aug 7 13:36:28 2024 +1000

    scsi: NCR5380: Initialize buffer for MSG IN and STATUS transfers
    
    [ Upstream commit 1c71065df2df693d208dd32758171c1dece66341 ]
    
    Following an incomplete transfer in MSG IN phase, the driver would not
    notice the problem and would make use of invalid data. Initialize 'tmp'
    appropriately and bail out if no message was received. For STATUS phase,
    preserve the existing status code unless a new value was transferred.
    
    Tested-by: Stan Johnson <userm57@yahoo.com>
    Signed-off-by: Finn Thain <fthain@linux-m68k.org>
    Link: https://lore.kernel.org/r/52e02a8812ae1a2d810d7f9f7fd800c3ccc320c4.1723001788.git.fthain@linux-m68k.org
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

scsi: pm8001: Do not overwrite PCI queue mapping [+ + +]
Author: Daniel Wagner <dwagner@suse.de>
Date:   Thu Sep 12 10:58:28 2024 +0200

    scsi: pm8001: Do not overwrite PCI queue mapping
    
    [ Upstream commit a141c17a543332fc1238eb5cba562bfc66879126 ]
    
    blk_mq_pci_map_queues() maps all queues but right after this, we overwrite
    these mappings by calling blk_mq_map_queues(). Just use one helper but not
    both.
    
    Fixes: 42f22fe36d51 ("scsi: pm8001: Expose hardware queues for pm80xx")
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: John Garry <john.g.garry@oracle.com>
    Signed-off-by: Daniel Wagner <dwagner@suse.de>
    Link: https://lore.kernel.org/r/20240912-do-not-overwrite-pci-mapping-v1-1-85724b6cec49@suse.de
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

scsi: smartpqi: add new controller PCI IDs [+ + +]
Author: David Strahan <David.Strahan@microchip.com>
Date:   Tue Aug 27 13:54:58 2024 -0500

    scsi: smartpqi: add new controller PCI IDs
    
    [ Upstream commit dbc39b84540f746cc814e69b21e53e6d3e12329a ]
    
    All PCI ID entries in hex.
    
    Add new cisco PCI IDs:
                                                 VID  / DID  / SVID / SDID
                                                 ----   ----   ----   ----
                                                 9005 / 028f / 1137 / 02fe
                                                 9005 / 028f / 1137 / 02ff
                                                 9005 / 028f / 1137 / 0300
    
    Add new h3c PCI IDs:
                                                 VID  / DID  / SVID / SDID
                                                 ----   ----   ----   ----
                                                 9005 / 028f / 193d / 0462
                                                 9005 / 028f / 193d / 8462
    
    Add new ieit PCI IDs:
                                                 VID  / DID  / SVID / SDID
                                                 ----   ----   ----   ----
                                                 9005 / 028f / 1ff9 / 00a3
    
    Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
    Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
    Signed-off-by: David Strahan <David.Strahan@microchip.com>
    Signed-off-by: Don Brace <don.brace@microchip.com>
    Link: https://lore.kernel.org/r/20240827185501.692804-5-don.brace@microchip.com
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

scsi: smartpqi: Add new controller PCI IDs [+ + +]
Author: David Strahan <David.Strahan@microchip.com>
Date:   Thu Jul 11 14:47:00 2024 -0500

    scsi: smartpqi: Add new controller PCI IDs
    
    [ Upstream commit 0e21e73384d324f75ea16f3d622cfc433fa6209b ]
    
    All PCI ID entries in hex.
    
    Add new inagile PCI IDs:
                                                 VID  / DID  / SVID / SDID
                                                 ----   ----   ----   ----
                SMART-HBA 8242-24i               9005 / 028f / 1ff9 / 0045
                RAID 8236-16i                    9005 / 028f / 1ff9 / 0046
                RAID 8240-24i                    9005 / 028f / 1ff9 / 0047
                SMART-HBA 8238-16i               9005 / 028f / 1ff9 / 0048
                PM8222-SHBA                      9005 / 028f / 1ff9 / 004a
                RAID PM8204-2GB                  9005 / 028f / 1ff9 / 004b
                RAID PM8204-4GB                  9005 / 028f / 1ff9 / 004c
                PM8222-HBA                       9005 / 028f / 1ff9 / 004f
                MT0804M6R                        9005 / 028f / 1ff9 / 0051
                MT0801M6E                        9005 / 028f / 1ff9 / 0052
                MT0808M6R                        9005 / 028f / 1ff9 / 0053
                MT0800M6H                        9005 / 028f / 1ff9 / 0054
                RS0800M5H24i                     9005 / 028f / 1ff9 / 006b
                RS0800M5E8i                      9005 / 028f / 1ff9 / 006c
                RS0800M5H8i                      9005 / 028f / 1ff9 / 006d
                RS0804M5R16i                     9005 / 028f / 1ff9 / 006f
                RS0800M5E24i                     9005 / 028f / 1ff9 / 0070
                RS0800M5H16i                     9005 / 028f / 1ff9 / 0071
                RS0800M5E16i                     9005 / 028f / 1ff9 / 0072
                RT0800M7E                        9005 / 028f / 1ff9 / 0086
                RT0800M7H                        9005 / 028f / 1ff9 / 0087
                RT0804M7R                        9005 / 028f / 1ff9 / 0088
                RT0808M7R                        9005 / 028f / 1ff9 / 0089
                RT1608M6R16i                     9005 / 028f / 1ff9 / 00a1
    
    Add new h3c pci_id:
                                                 VID  / DID  / SVID / SDID
                                                 ----   ----   ----   ----
                UN RAID P4408-Mr-2               9005 / 028f / 193d / 1110
    
    Add new powerleader pci ids:
                                                 VID  / DID  / SVID / SDID
                                                 ----   ----   ----   ----
                PL SmartROC PM8204               9005 / 028f / 1f3a / 0104
    
    Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
    Reviewed-by: Scott Teel <scott.teel@microchip.com>
    Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
    Signed-off-by: David Strahan <David.Strahan@microchip.com>
    Signed-off-by: Don Brace <don.brace@microchip.com>
    Link: https://lore.kernel.org/r/20240711194704.982400-2-don.brace@microchip.com
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

scsi: smartpqi: correct stream detection [+ + +]
Author: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Date:   Tue Aug 27 13:54:56 2024 -0500

    scsi: smartpqi: correct stream detection
    
    [ Upstream commit 4c76114932d1d6fad2e72823e7898a3c960cf2a7 ]
    
    Correct stream detection by initializing the structure
    pqi_scsi_dev_raid_map_data to 0s.
    
    When the OS issues SCSI READ commands, the driver erroneously considers
    them as SCSI WRITES. If they are identified as sequential IOs, the driver
    then submits those requests via the RAID path instead of the AIO path.
    
    The 'is_write' flag might be set for SCSI READ commands also.  The driver
    may interpret SCSI READ commands as SCSI WRITE commands, resulting in IOs
    being submitted through the RAID path.
    
    Note: This does not cause data corruption.
    
    Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
    Reviewed-by: Scott Teel <scott.teel@microchip.com>
    Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
    Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
    Signed-off-by: Don Brace <don.brace@microchip.com>
    Link: https://lore.kernel.org/r/20240827185501.692804-3-don.brace@microchip.com
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

scsi: st: Fix input/output error on empty drive reset [+ + +]
Author: Rafael Rocha <rrochavi@fnal.gov>
Date:   Thu Sep 5 12:39:21 2024 -0500

    scsi: st: Fix input/output error on empty drive reset
    
    [ Upstream commit 3d882cca73be830549833517ddccb3ac4668c04e ]
    
    A previous change was introduced to prevent data loss during a power-on
    reset when a tape is present inside the drive. This commit set the
    "pos_unknown" flag to true to avoid operations that could compromise data
    by performing actions from an untracked position. The relevant change is
    commit 9604eea5bd3a ("scsi: st: Add third party poweron reset handling")
    
    As a consequence of this change, a new issue has surfaced: the driver now
    returns an "Input/output error" even for empty drives when the drive, host,
    or bus is reset. This issue stems from the "flush_buffer" function, which
    first checks whether the "pos_unknown" flag is set. If the flag is set, the
    user will encounter an "Input/output error" until the tape position is
    known again. This behavior differs from the previous implementation, where
    empty drives were not affected at system start up time, allowing tape
    software to send commands to the driver to retrieve the drive's status and
    other information.
    
    The current behavior prioritizes the "pos_unknown" flag over the
    "ST_NO_TAPE" status, leading to issues for software that detects drives
    during system startup. This software will receive an "Input/output error"
    until a tape is loaded and its position is known.
    
    To resolve this, the "ST_NO_TAPE" status should take priority when the
    drive is empty, allowing communication with the drive following a power-on
    reset. At the same time, the change should continue to protect data by
    maintaining the "pos_unknown" flag when the drive contains a tape and its
    position is unknown.
    
    Signed-off-by: Rafael Rocha <rrochavi@fnal.gov>
    Link: https://lore.kernel.org/r/20240905173921.10944-1-rrochavi@fnal.gov
    Fixes: 9604eea5bd3a ("scsi: st: Add third party poweron reset handling")
    Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
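
Illustration (not part of the upstream commit): a sketch of the check
ordering described above, where the "no tape" status takes priority over
the "position unknown" flag. The constants and flush_buffer_check are
hypothetical stand-ins for the driver's state and are not its actual code.

```python
import errno

ST_READY = 0     # assumed stand-ins for the driver's status values
ST_NO_TAPE = 2


def flush_buffer_check(ready, pos_unknown):
    """Return 0 to proceed, or -EIO when a loaded tape's position is unknown."""
    if ready == ST_NO_TAPE:
        return 0              # empty drive: nothing on tape to protect
    if pos_unknown:
        return -errno.EIO     # tape loaded but position untracked: refuse
    return 0
```

An empty drive after a reset stays reachable for status queries, while a
loaded tape with an unknown position is still protected.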

 
sctp: set sk_state back to CLOSED if autobind fails in sctp_listen_start [+ + +]
Author: Xin Long <lucien.xin@gmail.com>
Date:   Mon Sep 30 16:49:51 2024 -0400

    sctp: set sk_state back to CLOSED if autobind fails in sctp_listen_start
    
    [ Upstream commit 8beee4d8dee76b67c75dc91fd8185d91e845c160 ]
    
    In sctp_listen_start() invoked by sctp_inet_listen(), it should set the
    sk_state back to CLOSED if sctp_autobind() fails due to whatever reason.
    
    Otherwise, next time when calling sctp_inet_listen(), if sctp_sk(sk)->reuse
    is already set via setsockopt(SCTP_REUSE_PORT), sctp_sk(sk)->bind_hash will
    be dereferenced as sk_state is LISTENING, which causes a crash as bind_hash
    is NULL.
    
      KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      RIP: 0010:sctp_inet_listen+0x7f0/0xa20 net/sctp/socket.c:8617
      Call Trace:
       <TASK>
       __sys_listen_socket net/socket.c:1883 [inline]
       __sys_listen+0x1b7/0x230 net/socket.c:1894
       __do_sys_listen net/socket.c:1902 [inline]
    
    Fixes: 5e8f3f703ae4 ("sctp: simplify sctp listening code")
    Reported-by: syzbot+f4e0f821e3a3b7cee51d@syzkaller.appspotmail.com
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Link: https://patch.msgid.link/a93e655b3c153dc8945d7a812e6d8ab0d52b7aa0.1727729391.git.lucien.xin@gmail.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
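
Illustration (not part of the upstream commit): a toy state machine showing
the rollback described above. Sock, listen_start and the autobind callback
are hypothetical; the sketch only models the state transition, not the
socket code.

```python
CLOSED, LISTENING = "CLOSED", "LISTENING"


class Sock:
    """Toy socket with just the state needed for the example."""

    def __init__(self):
        self.state = CLOSED
        self.bound = False


def listen_start(sk, autobind):
    """Enter LISTENING, binding first if needed; roll back on bind failure."""
    sk.state = LISTENING
    if not sk.bound and not autobind(sk):
        sk.state = CLOSED     # undo the transition so a retry starts clean
        return False
    return True
```

Without the rollback, a failed call would leave the socket marked LISTENING
while unbound, and a later listen call would act on bind state that was
never set up.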

 
selftest: hid: add missing run-hid-tools-tests.sh [+ + +]
Author: Yun Lu <luyun@kylinos.cn>
Date:   Sun Sep 29 16:55:49 2024 +0800

    selftest: hid: add missing run-hid-tools-tests.sh
    
    [ Upstream commit 160c826b4dd0d570f0f51cf002cb49bda807e9f5 ]
    
    HID test cases run tests using the run-hid-tools-tests.sh script.
    When installed with "make install", the run-hid-tools-tests.sh
    script will not be copied over, resulting in the following error message.
    
      make -C tools/testing/selftests/ TARGETS=hid install \
              INSTALL_PATH=$KSFT_INSTALL_PATH
    
      cd $KSFT_INSTALL_PATH
      ./run_kselftest.sh -c hid
    
    selftests: hid: hid-core.sh
    bash: ./run-hid-tools-tests.sh: No such file or directory
    
    Add the run-hid-tools-tests.sh script to the TEST_FILES in the Makefile
    for it to be installed.
    
    Fixes: ffb85d5c9e80 ("selftests: hid: import hid-tools hid-core tests")
    Signed-off-by: Yun Lu <luyun@kylinos.cn>
    Acked-by: Benjamin Tissoires <bentiss@kernel.org>
    Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
selftests/bpf: fix uprobe.path leak in bpf_testmod [+ + +]
Author: Jiri Olsa <olsajiri@gmail.com>
Date:   Thu Aug 1 15:27:24 2024 +0200

    selftests/bpf: fix uprobe.path leak in bpf_testmod
    
    [ Upstream commit db61e6a4eee5a7884b2cafeaf407895f253bbaa7 ]
    
    testmod_unregister_uprobe() forgets to path_put(&uprobe.path).
    
    Signed-off-by: Jiri Olsa <olsajiri@gmail.com>
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lore.kernel.org/r/20240801132724.GA8791@redhat.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
selftests/mm: fix charge_reserved_hugetlb.sh test [+ + +]
Author: David Hildenbrand <david@redhat.com>
Date:   Wed Aug 21 14:31:15 2024 +0200

    selftests/mm: fix charge_reserved_hugetlb.sh test
    
    [ Upstream commit c41a701d18efe6b8aa402efab16edbaba50c9548 ]
    
    Currently, running the charge_reserved_hugetlb.sh selftest we can
    sometimes observe something like:
    
      $ ./charge_reserved_hugetlb.sh -cgroup-v2
      ...
      write_result is 0
      After write:
      hugetlb_usage=0
      reserved_usage=10485760
      killing write_to_hugetlbfs
      Received 2.
      Deleting the memory
      Detach failure: Invalid argument
      umount: /mnt/huge: target is busy.
    
    Both cases are issues in the test.
    
    While the unmount error seems to be racy, it will make the test fail:
            $ ./run_vmtests.sh -t hugetlb
            ...
            # [FAIL]
            not ok 10 charge_reserved_hugetlb.sh -cgroup-v2 # exit=32
    
    The issue is that we are not waiting for the write_to_hugetlbfs process to
    quit.  So it might still have a hugetlbfs file open, about which umount is
    not happy.  Fix that by making "killall" wait for the process to quit.
    
    The other error ("Detach failure: Invalid argument") does not seem to
    result in a test error, but is misleading.  Turns out write_to_hugetlbfs.c
    unconditionally tries to clean up using shmdt(), even when we only
    mmap()'ed a hugetlb file.  Even worse, shmaddr is never even set for the
    SHM case.  Fix that as well.
    
    With this change it seems to work as expected.
    
    Link: https://lkml.kernel.org/r/20240821123115.2068812-1-david@redhat.com
    Fixes: 29750f71a9b4 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests")
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Reported-by: Mario Casquero <mcasquer@redhat.com>
    Reviewed-by: Mina Almasry <almasrymina@google.com>
    Tested-by: Mario Casquero <mcasquer@redhat.com>
    Cc: Shuah Khan <shuah@kernel.org>
    Cc: Muchun Song <muchun.song@linux.dev>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
selftests/nolibc: avoid passing NULL to printf("%s") [+ + +]
Author: Thomas Weißschuh <linux@weissschuh.net>
Date:   Wed Aug 7 23:51:44 2024 +0200

    selftests/nolibc: avoid passing NULL to printf("%s")
    
    [ Upstream commit f1a58f61d88642ae1e6e97e9d72d73bc70a93cb8 ]
    
    Clang on higher optimization levels detects that NULL is passed to
    printf("%s") and warns about it.
    While printf() from nolibc gracefully handles that NULL,
    it is undefined behavior as per POSIX, so the warning is reasonable.
    Avoid the warning by transforming NULL into a non-NULL placeholder.
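
    A minimal sketch of the idea, not the code from the patch. The helper
    names and the "(null)" placeholder text are assumptions chosen for
    illustration:

```c
#include <stdio.h>

/*
 * Hypothetical helper: map NULL to a non-NULL placeholder so that the
 * result can always be passed to printf("%s").
 */
static const char *example_printable(const char *s)
{
	return s ? s : "(null)";
}

/* Print a possibly-NULL string; returns what printf() returns. */
static int example_print_value(const char *value)
{
	return printf("value = <%s>\n", example_printable(value));
}
```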
    
    Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
    Acked-by: Willy Tarreau <w@1wt.eu>
    Link: https://lore.kernel.org/r/20240807-nolibc-llvm-v2-8-c20f2f5fc7c2@weissschuh.net
    Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
selftests: breakpoints: use remaining time to check if suspend succeed [+ + +]
Author: Yifei Liu <yifei.l.liu@oracle.com>
Date:   Mon Sep 30 15:40:25 2024 -0700

    selftests: breakpoints: use remaining time to check if suspend succeed
    
    [ Upstream commit c66be905cda24fb782b91053b196bd2e966f95b7 ]
    
    step_after_suspend_test fails with device busy error while
    writing to /sys/power/state to start suspend. The test believes
    it failed to enter suspend state with
    
    $ sudo ./step_after_suspend_test
    TAP version 13
    Bail out! Failed to enter Suspend state
    
    However, in the kernel message, I indeed see the system get
    suspended and then wake up later.
    
    [611172.033108] PM: suspend entry (s2idle)
    [611172.044940] Filesystems sync: 0.006 seconds
    [611172.052254] Freezing user space processes
    [611172.059319] Freezing user space processes completed (elapsed 0.001 seconds)
    [611172.067920] OOM killer disabled.
    [611172.072465] Freezing remaining freezable tasks
    [611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
    [611172.089724] printk: Suspending console(s) (use no_console_suspend to debug)
    [611172.117126] serial 00:03: disabled
    some other hardware get reconnected
    [611203.136277] OOM killer enabled.
    [611203.140637] Restarting tasks ...
    [611203.141135] usb 1-8.1: USB disconnect, device number 7
    [611203.141755] done.
    [611203.155268] random: crng reseeded on system resumption
    [611203.162059] PM: suspend exit
    
    After investigation, I noticed that for the code block
    if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
            ksft_exit_fail_msg("Failed to enter Suspend state\n");
    
    The write returns -1 and errno is set to 16 (EBUSY, device busy).
    This is probably because the write does not return successfully
    before the system suspends, and the return value is no longer valid
    when the system wakes up. It may therefore be better to check how much
    time passed across those few instructions to determine whether the
    suspend happened, since it is very unlikely that those few lines take
    5 seconds to execute.
    
    The timer that wakes the system up is set to expire after 5 seconds
    and is not re-armed. If the timer's remaining time is 0 seconds and 0
    nanoseconds, the timer expired and woke the system up. Otherwise, if
    any time remains, the system can be considered to have failed to enter
    the suspend state.
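
    The check described above can be sketched as follows. This is an
    illustration rather than the test's code: example_timer_expired() is a
    hypothetical helper, and "remaining" is assumed to be the it_value
    reported for the one-shot wake-up timer (for instance by a call such
    as timerfd_gettime()):

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper: returns true when the one-shot timer has no time
 * left, which is taken to mean that it expired and woke the system up.
 * Any remaining time is treated as "suspend did not happen".
 */
static bool example_timer_expired(const struct timespec *remaining)
{
	return remaining->tv_sec == 0 && remaining->tv_nsec == 0;
}
```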
    
    After applying this patch, the test no longer fails because it wrongly
    believes that the system did not suspend. It can now continue with the
    rest of the test after suspend.
    
    Fixes: bfd092b8c272 ("selftests: breakpoint: add step_after_suspend_test")
    Reported-by: Sinadin Shan <sinadin.shan@oracle.com>
    Signed-off-by: Yifei Liu <yifei.l.liu@oracle.com>
    Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: netfilter: Add missing return value [+ + +]
Author: zhang jiao <zhangjiao2@cmss.chinamobile.com>
Date:   Fri Sep 27 11:22:05 2024 +0800

    selftests: netfilter: Add missing return value
    
    [ Upstream commit 10dbd23633f0433f8d13c2803d687b36a675ef60 ]
    
    There is no return value in count_entries, just add it.
    
    Fixes: eff3c558bb7e ("netfilter: ctnetlink: support filtering by zone")
    Signed-off-by: zhang jiao <zhangjiao2@cmss.chinamobile.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: netfilter: Fix nft_audit.sh for newer nft binaries [+ + +]
Author: Phil Sutter <phil@nwl.cc>
Date:   Thu Sep 26 18:56:31 2024 +0200

    selftests: netfilter: Fix nft_audit.sh for newer nft binaries
    
    [ Upstream commit 8a89015644513ef69193a037eb966f2d55fe385a ]
    
    As a side-effect of nftables' commit dbff26bfba833 ("cache: consolidate
    reset command"), audit logs changed when more objects were reset than
    fit into a single netlink message.
    
    Since the objects' distribution in netlink messages is not relevant,
    implement a summarizing function which combines repeated audit logs into
    a single one with summed up 'entries=' value.
    
    Fixes: 203bb9d39866 ("selftests: netfilter: Extend nft_audit.sh")
    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: vDSO: fix ELF hash table entry size for s390x [+ + +]
Author: Jens Remus <jremus@linux.ibm.com>
Date:   Wed Sep 11 10:50:14 2024 +0200

    selftests: vDSO: fix ELF hash table entry size for s390x
    
    [ Upstream commit 14be4e6f35221c4731b004553ecf7cbc6dc1d2d8 ]
    
    The vDSO self tests fail on s390x for a vDSO linked with the GNU linker
    ld as follows:
    
      # ./vdso_test_gettimeofday
      Floating point exception (core dumped)
    
    On s390x the ELF hash table entries are 64 bits instead of 32 bits in
    size (see Glibc sysdeps/unix/sysv/linux/s390/bits/elfclass.h).
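
    A minimal sketch of selecting the entry width per architecture. The
    type and function names are hypothetical, and treating every
    non-s390x build as using 32-bit entries is a simplification made for
    this example:

```c
#include <stdint.h>

/* Width of one ELF hash table entry for the architecture being built. */
#if defined(__s390x__)
typedef uint64_t example_hash_entry_t;	/* 64-bit entries on s390x */
#else
typedef uint32_t example_hash_entry_t;	/* assumed 32-bit in this sketch */
#endif

/* Read entry "index" of a hash table located at "table". */
static example_hash_entry_t example_hash_read(const void *table,
					       unsigned int index)
{
	return ((const example_hash_entry_t *)table)[index];
}
```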
    
    Fixes: 40723419f407 ("kselftest: Enable vDSO test on non x86 platforms")
    Reported-by: Heiko Carstens <hca@linux.ibm.com>
    Tested-by: Heiko Carstens <hca@linux.ibm.com>
    Signed-off-by: Jens Remus <jremus@linux.ibm.com>
    Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: vDSO: fix vDSO name for powerpc [+ + +]
Author: Christophe Leroy <christophe.leroy@csgroup.eu>
Date:   Fri Aug 30 14:28:35 2024 +0200

    selftests: vDSO: fix vDSO name for powerpc
    
    [ Upstream commit 59eb856c3ed9b3552befd240c0c339f22eed3fa1 ]
    
    The following error occurs when running vdso_test_correctness on powerpc:
    
    ~ # ./vdso_test_correctness
    [WARN]  failed to find vDSO
    [SKIP]  No vDSO, so skipping clock_gettime() tests
    [SKIP]  No vDSO, so skipping clock_gettime64() tests
    [RUN]   Testing getcpu...
    [OK]    CPU 0: syscall: cpu 0, node 0
    
    On powerpc, vDSO is neither called linux-vdso.so.1 nor linux-gate.so.1
    but linux-vdso32.so.1 or linux-vdso64.so.1.
    
    Also search those two names before giving up.
    
    Fixes: c7e5789b24d3 ("kselftest: Move test_vdso to the vDSO test suite")
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Acked-by: Shuah Khan <skhan@linuxfoundation.org>
    Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: vDSO: fix vDSO symbols lookup for powerpc64 [+ + +]
Author: Christophe Leroy <christophe.leroy@csgroup.eu>
Date:   Fri Aug 30 14:28:37 2024 +0200

    selftests: vDSO: fix vDSO symbols lookup for powerpc64
    
    [ Upstream commit ba83b3239e657469709d15dcea5f9b65bf9dbf34 ]
    
    On powerpc64, the following tests fail to locate vDSO functions:
    
      ~ # ./vdso_test_abi
      TAP version 13
      1..16
      # [vDSO kselftest] VDSO_VERSION: LINUX_2.6.15
      # Couldn't find __kernel_gettimeofday
      ok 1 # SKIP __kernel_gettimeofday
      # clock_id: CLOCK_REALTIME
      # Couldn't find __kernel_clock_gettime
      ok 2 # SKIP __kernel_clock_gettime CLOCK_REALTIME
      # Couldn't find __kernel_clock_getres
      ok 3 # SKIP __kernel_clock_getres CLOCK_REALTIME
      ...
      # Couldn't find __kernel_time
      ok 16 # SKIP __kernel_time
      # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:16 error:0
    
      ~ # ./vdso_test_getrandom
      __kernel_getrandom is missing!
    
      ~ # ./vdso_test_gettimeofday
      Could not find __kernel_gettimeofday
    
      ~ # ./vdso_test_getcpu
      Could not find __kernel_getcpu
    
    On powerpc64, as shown below by readelf, vDSO function symbols have
    type NOTYPE, so also accept that type when looking for symbols.
    
    $ powerpc64-linux-gnu-readelf -a arch/powerpc/kernel/vdso/vdso64.so.dbg
    ELF Header:
      Magic:   7f 45 4c 46 02 02 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, big endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              DYN (Shared object file)
      Machine:                           PowerPC64
      Version:                           0x1
    ...
    
    Symbol table '.dynsym' contains 12 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000000524    84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         2: 00000000000005f0    36 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         3: 0000000000000578    68 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         4: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  ABS LINUX_2.6.15
         5: 00000000000006c0    48 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         6: 0000000000000614   172 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         7: 00000000000006f0    84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         8: 000000000000047c    84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         9: 0000000000000454    12 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
        10: 00000000000004d0    84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
        11: 00000000000005bc    52 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
    
    Symbol table '.symtab' contains 56 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
    ...
        45: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  ABS LINUX_2.6.15
        46: 00000000000006c0    48 NOTYPE  GLOBAL DEFAULT    8 __kernel_getcpu
        47: 0000000000000524    84 NOTYPE  GLOBAL DEFAULT    8 __kernel_clock_getres
        48: 00000000000005f0    36 NOTYPE  GLOBAL DEFAULT    8 __kernel_get_tbfreq
        49: 000000000000047c    84 NOTYPE  GLOBAL DEFAULT    8 __kernel_gettimeofday
        50: 0000000000000614   172 NOTYPE  GLOBAL DEFAULT    8 __kernel_sync_dicache
        51: 00000000000006f0    84 NOTYPE  GLOBAL DEFAULT    8 __kernel_getrandom
        52: 0000000000000454    12 NOTYPE  GLOBAL DEFAULT    8 __kernel_sigtram[...]
        53: 0000000000000578    68 NOTYPE  GLOBAL DEFAULT    8 __kernel_time
        54: 00000000000004d0    84 NOTYPE  GLOBAL DEFAULT    8 __kernel_clock_g[...]
        55: 00000000000005bc    52 NOTYPE  GLOBAL DEFAULT    8 __kernel_get_sys[...]
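
    Accepting both symbol types can be sketched as below. The helper name
    is hypothetical and this is not the parser's code; it only uses the
    ELF64_ST_TYPE() macro and the STT_* constants from <elf.h>:

```c
#include <elf.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether a symbol may be a vDSO function.
 * st_info is the st_info byte of an ELF symbol table entry.
 */
static bool example_vdso_sym_type_ok(unsigned char st_info)
{
	unsigned char type = ELF64_ST_TYPE(st_info);

	/* STT_NOTYPE is accepted in addition to STT_FUNC. */
	return type == STT_FUNC || type == STT_NOTYPE;
}
```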
    
    Fixes: 98eedc3a9dbf ("Document the vDSO and add a reference parser")
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Acked-by: Shuah Khan <skhan@linuxfoundation.org>
    Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: vDSO: fix vdso_config for powerpc [+ + +]
Author: Christophe Leroy <christophe.leroy@csgroup.eu>
Date:   Fri Aug 30 14:28:36 2024 +0200

    selftests: vDSO: fix vdso_config for powerpc
    
    [ Upstream commit 7d297c419b08eafa69ce27243ee9bbecab4fcaa4 ]
    
    Running vdso_test_correctness on powerpc64 gives the following warning:
    
      ~ # ./vdso_test_correctness
      Warning: failed to find clock_gettime64 in vDSO
    
    This is because vdso_test_correctness was built with VDSO_32BIT defined.
    
    __powerpc__ macro is defined on both powerpc32 and powerpc64 so
    __powerpc64__ needs to be checked first in vdso_config.h
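
    The ordering issue can be illustrated with a small sketch.
    EXAMPLE_VDSO_BITS and example_vdso_bits() are hypothetical names and
    do not appear in vdso_config.h:

```c
/*
 * __powerpc64__ is tested first because __powerpc__ is defined on both
 * powerpc32 and powerpc64. With the opposite order, the 64-bit branch
 * could never be reached.
 */
#if defined(__powerpc64__)
#define EXAMPLE_VDSO_BITS 64
#elif defined(__powerpc__)
#define EXAMPLE_VDSO_BITS 32
#else
#define EXAMPLE_VDSO_BITS 0	/* not a powerpc build */
#endif

static int example_vdso_bits(void)
{
	return EXAMPLE_VDSO_BITS;
}
```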
    
    Fixes: 693f5ca08ca0 ("kselftest: Extend vDSO selftest")
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Acked-by: Shuah Khan <skhan@linuxfoundation.org>
    Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: vDSO: fix vdso_config for s390 [+ + +]
Author: Heiko Carstens <hca@linux.ibm.com>
Date:   Wed Sep 11 10:50:15 2024 +0200

    selftests: vDSO: fix vdso_config for s390
    
    [ Upstream commit a6e23fb8d3c0e3904da70beaf5d7e840a983c97f ]
    
    Running vdso_test_correctness on s390x (aka s390 64 bit) emits a warning:
    
    Warning: failed to find clock_gettime64 in vDSO
    
    This is caused by the "#elif defined (__s390__)" check in vdso_config.h
    which then defines VDSO_32BIT.
    
    If __s390x__ is defined, __s390__ is also defined. Therefore the correct
    check must make sure that only __s390__ is defined.
    
    Therefore add the missing !defined(__s390x__). Also use common
    __s390x__ define instead of __s390X__.
    
    Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    Fixes: 693f5ca08ca0 ("kselftest: Extend vDSO selftest")
    Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
smb3: fix incorrect mode displayed for read-only files [+ + +]
Author: Steve French <stfrench@microsoft.com>
Date:   Sat Sep 21 23:28:32 2024 -0500

    smb3: fix incorrect mode displayed for read-only files
    
    commit 2f3017e7cc7515e0110a3733d8dca84de2a1d23d upstream.
    
    Commands like "chmod 0444" mark a file readonly via the attribute flag
    (when mapping of mode bits into the ACL are not set, or POSIX extensions
    are not negotiated), but they were not reported correctly for stat of
    directories (they were reported ok for files and for "ls").  See example
    below:
    
        root:~# ls /mnt2 -l
        total 12
        drwxr-xr-x 2 root root         0 Sep 21 18:03 normaldir
        -rwxr-xr-x 1 root root         0 Sep 21 23:24 normalfile
        dr-xr-xr-x 2 root root         0 Sep 21 17:55 readonly-dir
        -r-xr-xr-x 1 root root 209716224 Sep 21 18:15 readonly-file
        root:~# stat -c %a /mnt2/readonly-dir
        755
        root:~# stat -c %a /mnt2/readonly-file
        555
    
    This fixes the stat of directories when ATTR_READONLY is set
    (in cases where the mode can not be obtained other ways).
    
        root:~# stat -c %a /mnt2/readonly-dir
        555
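
    The arithmetic behind the 755 -> 555 change can be sketched as below.
    This is only an illustration of clearing the write bits, not the
    client code; example_reported_mode() is a hypothetical helper and
    "readonly" stands for ATTR_READONLY being set:

```c
#include <stdbool.h>

/* Hypothetical helper: compute the mode to report for an inode. */
static unsigned int example_reported_mode(unsigned int mode, bool readonly)
{
	if (readonly)
		mode &= ~0222u;	/* clear write bits for user, group, other */
	return mode;
}
```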
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
smb: client: use actual path when queryfs [+ + +]
Author: wangrong <wangrong@uniontech.com>
Date:   Thu Jun 20 16:37:29 2024 +0800

    smb: client: use actual path when queryfs
    
    commit a421e3fe0e6abe27395078f4f0cec5daf466caea upstream.
    
    Due to server permission control, the client does not have access to
    the shared root directory, but can access subdirectories normally, so
    users usually mount the shared subdirectories directly. In this case,
    queryfs should use the actual path instead of the root directory to
    avoid the call returning an error (EACCES).
    
    Signed-off-by: wangrong <wangrong@uniontech.com>
    Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
spi: bcm63xx: Fix missing pm_runtime_disable() [+ + +]
Author: Jinjie Ruan <ruanjinjie@huawei.com>
Date:   Mon Aug 19 20:33:49 2024 +0800

    spi: bcm63xx: Fix missing pm_runtime_disable()
    
    commit 265697288ec2160ca84707565d6641d46f69b0ff upstream.
    
    The pm_runtime_disable() is missing in the remove function, fix it
    by using devm_pm_runtime_enable(), so the pm_runtime_disable() in
    the probe error path can also be removed.
    
    Fixes: 2d13f2ff6073 ("spi: bcm63xx-spi: fix pm_runtime")
    Cc: stable@vger.kernel.org # v5.13+
    Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Suggested-by: Jonas Gorski <jonas.gorski@gmail.com>
    Link: https://patch.msgid.link/20240819123349.4020472-3-ruanjinjie@huawei.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

spi: bcm63xx: Fix module autoloading [+ + +]
Author: Jinjie Ruan <ruanjinjie@huawei.com>
Date:   Mon Aug 19 20:33:48 2024 +0800

    spi: bcm63xx: Fix module autoloading
    
    commit 909f34f2462a99bf876f64c5c61c653213e32fce upstream.
    
    Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded
    based on the alias from platform_device_id table.
    
    Fixes: 44d8fb30941d ("spi/bcm63xx: move register definitions into the driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Reviewed-by: Jonas Gorski <jonas.gorski@gmail.com>
    Link: https://patch.msgid.link/20240819123349.4020472-2-ruanjinjie@huawei.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

spi: rpc-if: Add missing MODULE_DEVICE_TABLE [+ + +]
Author: Biju Das <biju.das.jz@bp.renesas.com>
Date:   Wed Jul 31 08:29:53 2024 +0100

    spi: rpc-if: Add missing MODULE_DEVICE_TABLE
    
    [ Upstream commit 0880f669436028c5499901e5acd8f4b4ea0e0c6a ]
    
    Add missing MODULE_DEVICE_TABLE definition for automatic loading of the
    driver when it is built as a module.
    
    Fixes: eb8d6d464a27 ("spi: add Renesas RPC-IF driver")
    Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
    Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Link: https://patch.msgid.link/20240731072955.224125-1-biju.das.jz@bp.renesas.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: s3c64xx: fix timeout counters in flush_fifo [+ + +]
Author: Ben Dooks <ben.dooks@codethink.co.uk>
Date:   Tue Sep 24 14:40:08 2024 +0100

    spi: s3c64xx: fix timeout counters in flush_fifo
    
    [ Upstream commit 68a16708d2503b6303d67abd43801e2ca40c208d ]
    
    In the s3c64xx_flush_fifo() code, the loops counter is post-decremented
    in the do { } while(test && loops--) condition. This means loops is
    left at the unsigned equivalent of -1 if the loop times out. The test
    afterwards will never pass, as it tests for loops == 0.
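
    The effect can be reproduced in a small userspace sketch. The function
    names and the always-busy stub are hypothetical, and the pre-decrement
    variant is shown only as one way to end at zero on timeout:

```c
#include <limits.h>

/* Hypothetical stand-in for a FIFO that never drains. */
static int example_fifo_busy(void)
{
	return 1;
}

/* Post-decrement: on timeout the counter wraps around to UINT_MAX. */
static unsigned int example_wait_post_decrement(unsigned int loops)
{
	do {
		/* poll the hardware here */
	} while (example_fifo_busy() && loops--);

	return loops;
}

/* Pre-decrement: on timeout the counter is exactly 0. */
static unsigned int example_wait_pre_decrement(unsigned int loops)
{
	do {
		/* poll the hardware here */
	} while (example_fifo_busy() && --loops);

	return loops;
}
```

    With a start value of 3, the first variant returns UINT_MAX, so a
    following "loops == 0" check never reports the timeout, while the
    second returns 0.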
    
    Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
    Fixes: 230d42d422e7 ("spi: Add s3c64xx SPI Controller driver")
    Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
    Link: https://patch.msgid.link/20240924134009.116247-2-ben.dooks@codethink.co.uk
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: spi-cadence: Fix missing spi_controller_is_target() check [+ + +]
Author: Jinjie Ruan <ruanjinjie@huawei.com>
Date:   Mon Sep 23 12:00:15 2024 +0800

    spi: spi-cadence: Fix missing spi_controller_is_target() check
    
    [ Upstream commit 3eae4a916fc0eb6f85b5d399e10335dbd24dd765 ]
    
    The spi_controller_is_target() check is missing for pm_runtime_disable()
    in cdns_spi_remove(), add it.
    
    Fixes: b1b90514eaa3 ("spi: spi-cadence: Add support for Slave mode")
    Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Link: https://patch.msgid.link/20240923040015.3009329-4-ruanjinjie@huawei.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: spi-cadence: Fix pm_runtime_set_suspended() with runtime pm enabled [+ + +]
Author: Jinjie Ruan <ruanjinjie@huawei.com>
Date:   Mon Sep 23 12:00:14 2024 +0800

    spi: spi-cadence: Fix pm_runtime_set_suspended() with runtime pm enabled
    
    [ Upstream commit 67d4a70faa662df07451e83db1546d3ca0695e08 ]
    
    It is not valid to call pm_runtime_set_suspended() for devices
    with runtime PM enabled because it returns -EAGAIN if it is enabled
    already and working. So, call pm_runtime_disable() first to fix it.
    
    Fixes: d36ccd9f7ea4 ("spi: cadence: Runtime pm adaptation")
    Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Link: https://patch.msgid.link/20240923040015.3009329-3-ruanjinjie@huawei.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: spi-imx: Fix pm_runtime_set_suspended() with runtime pm enabled [+ + +]
Author: Jinjie Ruan <ruanjinjie@huawei.com>
Date:   Mon Sep 23 12:00:13 2024 +0800

    spi: spi-imx: Fix pm_runtime_set_suspended() with runtime pm enabled
    
    [ Upstream commit b6e05ba0844139dde138625906015c974c86aa93 ]
    
    It is not valid to call pm_runtime_set_suspended() for devices
    with runtime PM enabled because it returns -EAGAIN if it is enabled
    already and working. So, call pm_runtime_disable() first to fix it.
    
    Fixes: 43b6bf406cd0 ("spi: imx: fix runtime pm support for !CONFIG_PM")
    Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Link: https://patch.msgid.link/20240923040015.3009329-2-ruanjinjie@huawei.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
static_call: Handle module init failure correctly in static_call_del_module() [+ + +]
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed Sep 4 11:09:07 2024 +0200

    static_call: Handle module init failure correctly in static_call_del_module()
    
    [ Upstream commit 4b30051c4864234ec57290c3d142db7c88f10d8a ]
    
    Module insertion invokes static_call_add_module() to initialize the static
    calls in a module. static_call_add_module() invokes __static_call_init(),
    which allocates a struct static_call_mod to either encapsulate the built-in
    static call sites of the associated key into it so further modules can be
    added or to append the module to the module chain.
    
    If that allocation fails the function returns with an error code and the
    module core invokes static_call_del_module() to clean up eventually added
    static_call_mod entries.
    
    This works correctly, when all keys used by the module were converted over
    to a module chain before the failure. If not then static_call_del_module()
    causes a #GP as it blindly assumes that key::mods points to a valid struct
    static_call_mod.
    
    The problem is that key::mods is not an individual struct member of struct
    static_call_key, it's part of a union to save space:
    
            union {
                    /* bit 0: 0 = mods, 1 = sites */
                    unsigned long type;
                    struct static_call_mod *mods;
                    struct static_call_site *sites;
            };
    
    key::sites is a pointer to the list of built-in usage sites of the static
    call. The type of the pointer is differentiated by bit 0. A mods pointer
    has the bit clear, the sites pointer has the bit set.
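
    A userspace sketch of such a tagged pointer and of the check it calls
    for. All example_* names are hypothetical, and uintptr_t is used in
    place of unsigned long purely for portability of the example:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct example_mod { int unused; };
struct example_site { int unused; };

union example_key {
	uintptr_t type;			/* bit 0: 0 = mods, 1 = sites */
	struct example_mod *mods;
	struct example_site *sites;
};

#define EXAMPLE_FLAG_SITES ((uintptr_t)1)

/* Store a sites pointer and tag it by setting bit 0. */
static void example_key_set_sites(union example_key *key,
				  struct example_site *sites)
{
	key->type = (uintptr_t)sites | EXAMPLE_FLAG_SITES;
}

/* True only when bit 0 is clear, i.e. the key holds a mods pointer. */
static bool example_key_has_mods(const union example_key *key)
{
	return !(key->type & EXAMPLE_FLAG_SITES);
}

/* Return the mods pointer, or NULL when the key holds a sites pointer. */
static struct example_mod *example_key_mods(const union example_key *key)
{
	return example_key_has_mods(key) ? key->mods : NULL;
}
```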
    
    As static_call_del_module() blindly assumes that the pointer is a valid
    static_call_mod type, it fails to check for this failure case and
    dereferences the pointer to the list of built-in call sites, which is
    obviously bogus.
    
    Cure it by checking whether the key has a sites or a mods pointer.
    
    If it's a sites pointer then the key is not to be touched. As the sites are
    walked in the same order as in __static_call_init() the site walk can be
    terminated because all subsequent sites have not been touched by the init
    code due to the error exit.
    
    If it was converted before the allocation fail, then the inner loop which
    searches for a module match will find nothing.
    
    A fail in the second allocation in __static_call_init() is harmless and
    does not require special treatment. The first allocation succeeded and
    converted the key to a module chain. That first entry has mod::mod == NULL
    and mod::next == NULL, so the inner loop of static_call_del_module() will
    neither find a module match nor a module chain. The next site in the walk
    was either already converted, but can't match the module, or it will exit
    the outer loop because it has a static_call_site pointer and not a
    static_call_mod pointer.
    
    Fixes: 9183c3f9ed71 ("static_call: Add inline static call infrastructure")
    Closes: https://lore.kernel.org/all/20230915082126.4187913-1-ruanjinjie@huawei.com
    Reported-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Tested-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Link: https://lore.kernel.org/r/87zfon6b0s.ffs@tglx
    Signed-off-by: Sasha Levin <sashal@kernel.org>

static_call: Replace pointless WARN_ON() in static_call_module_notify() [+ + +]
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed Sep 4 11:08:28 2024 +0200

    static_call: Replace pointless WARN_ON() in static_call_module_notify()
    
    [ Upstream commit fe513c2ef0a172a58f158e2e70465c4317f0a9a2 ]
    
    static_call_module_notify() triggers a WARN_ON(), when memory allocation
    fails in __static_call_add_module().
    
    That's not really justified, because the failure case must be correctly
    handled by the well known call chain and the error code is passed
    through to the initiating userspace application.
    
    A memory allocation fail is not a fatal problem, but the WARN_ON() takes
    the machine out when panic_on_warn is set.
    
    Replace it with a pr_warn().
    
    Fixes: 9183c3f9ed71 ("static_call: Add inline static call infrastructure")
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/8734mf7pmb.ffs@tglx
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
sunrpc: change sp_nrthreads from atomic_t to unsigned int. [+ + +]
Author: NeilBrown <neilb@suse.de>
Date:   Mon Jul 15 17:14:18 2024 +1000

    sunrpc: change sp_nrthreads from atomic_t to unsigned int.
    
    [ Upstream commit 60749cbe3d8ae572a6c7dda675de3e8b25797a18 ]
    
    sp_nrthreads is only ever accessed under the service mutex
      nlmsvc_mutex nfs_callback_mutex nfsd_mutex
    so there is no need for it to be an atomic_t.
    
    The fact that all code using it is single-threaded means that we can
    simplify svc_pool_victim and remove the temporary elevation of
    sp_nrthreads.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Stable-dep-of: aadc3bbea163 ("NFSD: Limit the number of concurrent async COPY operations")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
sysctl: avoid spurious permanent empty tables [+ + +]
Author: Thomas Weißschuh <linux@weissschuh.net>
Date:   Mon Aug 5 11:39:35 2024 +0200

    sysctl: avoid spurious permanent empty tables
    
    commit 559d4c6a9d3b60f239493239070eb304edaea594 upstream.
    
    The test for whether a table is permanently empty inspects the address
    of the registered ctl_table argument.
    However as sysctl_mount_point is an empty array and does not occupy any
    space it can end up sharing an address with another object in memory.
    If that other object itself is a "struct ctl_table" then registering
    that table will fail as it's incorrectly recognized as permanently empty.
    
    Avoid this issue by adding a dummy element to the array so that it is
    not empty anymore.
    Explicitly register the table with zero elements as otherwise the dummy
    element would be recognized as a sentinel element which would lead to a
    runtime warning from the sysctl core.
    
    While the issue does not seem to be encountered at this time, this
    seems to be mostly due to luck.
    Also a future change, constifying sysctl_mount_point and root_table, can
    reliably trigger this issue on clang 18.
    
    Given that empty arrays are non-standard in the first place it seems
    prudent to avoid them if possible.
    
    Fixes: 4a7b29f65094 ("sysctl: move sysctl type to ctl_table_header")
    Fixes: a35dd3a786f5 ("sysctl: drop now unnecessary out-of-bounds check")
    Cc: stable@vger.kernel.org
    Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
    Closes: https://lore.kernel.org/oe-lkp/202408051453.f638857e-lkp@intel.com
    Signed-off-by: Joel Granados <j.granados@samsung.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
tcp: avoid reusing FIN_WAIT2 when trying to find port in connect() process [+ + +]
Author: Jason Xing <kernelxing@tencent.com>
Date:   Fri Aug 23 08:11:52 2024 +0800

    tcp: avoid reusing FIN_WAIT2 when trying to find port in connect() process
    
    [ Upstream commit 0d9e5df4a257afc3a471a82961ace9a22b88295a ]
    
    We found that one close-wait socket was reset by the other side
    due to a new connection reusing the same port which is beyond our
    expectation, so we have to investigate the underlying reason.
    
    The following experiment is conducted in the test environment. We
    limit the port range from 40000 to 40010 and delay the time to close()
    after receiving a fin from the active close side, which can help us
    easily reproduce like what happened in production.
    
    Here are three connections captured by tcpdump:
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [S], seq 2965525191
    127.0.0.1.9999 > 127.0.0.1.40002: Flags [S.], seq 2769915070
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [.], ack 1
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [F.], seq 1, ack 1
    // a few seconds later, within 60 seconds
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [S], seq 2965590730
    127.0.0.1.9999 > 127.0.0.1.40002: Flags [.], ack 2
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [R], seq 2965525193
    // later, very quickly
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [S], seq 2965590730
    127.0.0.1.9999 > 127.0.0.1.40002: Flags [S.], seq 3120990805
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [.], ack 1
    
    As we can see, the first flow is reset because:
    1) client starts a new connection, I mean, the second one
    2) client tries to find a suitable port which is a timewait socket
       (its state is timewait, substate is fin_wait2)
    3) client occupies that timewait port to send a SYN
    4) server finds a corresponding close-wait socket in ehash table,
       then replies with a challenge ack
    5) client sends an RST to terminate this old close-wait socket.
    
    I don't think the port selection algorithm should choose a FIN_WAIT2
    socket when tcp_tw_reuse is turned on, because unread data remains on
    the server side. In some cases, if one side hasn't called close() yet,
    we should not consider the socket expendable and treat it at will.
    
    Even though the server sometimes cannot call close() as soon as we
    expect, its connection should not be terminated easily, especially
    not because of a second, unrelated connection.
    
    After this patch, we can see the expected failure if we start a
    connection when all the ports are occupied in fin_wait2 state:
    "Ncat: Cannot assign requested address."
    
    Reported-by: Jade Dong <jadedong@tencent.com>
    Signed-off-by: Jason Xing <kernelxing@tencent.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/20240823001152.31004-1-kerneljasonxing@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
tipc: guard against string buffer overrun [+ + +]
Author: Simon Horman <horms@kernel.org>
Date:   Thu Aug 1 19:35:37 2024 +0100

    tipc: guard against string buffer overrun
    
    [ Upstream commit 6555a2a9212be6983d2319d65276484f7c5f431a ]
    
    Smatch reports that copying media_name and if_name to name_parts may
    overwrite the destination.
    
     .../bearer.c:166 bearer_name_validate() error: strcpy() 'media_name' too large for 'name_parts->media_name' (32 vs 16)
     .../bearer.c:167 bearer_name_validate() error: strcpy() 'if_name' too large for 'name_parts->if_name' (1010102 vs 16)
    
    This does seem to be the case so guard against this possibility by using
    strscpy() and failing if truncation occurs.
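
    As a rough userspace illustration of the "copy, and fail on
    truncation" pattern (the kernel's strscpy() reports truncation by
    returning -E2BIG), a minimal sketch follows. copy_or_fail() and
    NAME_MAX_LEN are hypothetical names for this example and are not
    taken from the TIPC code.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    #define NAME_MAX_LEN 16 /* hypothetical destination size */

    /*
     * Copy src into dst (capacity dst_size, including the NUL).
     * Returns the copied length, or -1 if src does not fit. dst is
     * always NUL-terminated when dst_size > 0.
     */
    static long copy_or_fail(char *dst, size_t dst_size, const char *src)
    {
        size_t i;

        if (dst_size == 0)
            return -1;
        for (i = 0; i < dst_size - 1 && src[i] != '\0'; i++)
            dst[i] = src[i];
        dst[i] = '\0';
        /* If src still has characters left, the copy was truncated. */
        return src[i] == '\0' ? (long)i : -1;
    }
    ```

    A caller would treat a negative return as a validation failure
    rather than continuing with a silently shortened name.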
    
    Introduced by commit b97bf3fd8f6a ("[TIPC] Initial merge")
    
    Compile tested only.
    
    Reviewed-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20240801-tipic-overrun-v2-1-c5b869d1f074@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
tomoyo: fallback to realpath if symlink's pathname does not exist [+ + +]
Author: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date:   Wed Sep 25 22:30:59 2024 +0900

    tomoyo: fallback to realpath if symlink's pathname does not exist
    
    commit ada1986d07976d60bed5017aa38b7f7cf27883f7 upstream.
    
    Alfred Agrell found that TOMOYO cannot handle execveat(AT_EMPTY_PATH)
    inside chroot environment where /dev and /proc are not mounted, for
    commit 51f39a1f0cea ("syscalls: implement execveat() system call") missed
    that TOMOYO tries to canonicalize argv[0] when the filename fed to the
    executed program as argv[0] is supplied using potentially nonexistent
    pathname.
    
    Since "/dev/fd/<fd>" already lost symlink information used for obtaining
    that <fd>, it is too late to reconstruct symlink's pathname. Although
    <filename> part of "/dev/fd/<fd>/<filename>" might not be canonicalized,
    TOMOYO cannot use tomoyo_realpath_nofollow() when /dev or /proc is not
    mounted. Therefore, fallback to tomoyo_realpath_from_path() when
    tomoyo_realpath_nofollow() failed.
    
    Reported-by: Alfred Agrell <blubban@gmail.com>
    Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1082001
    Fixes: 51f39a1f0cea ("syscalls: implement execveat() system call")
    Cc: stable@vger.kernel.org # v3.19+
    Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
tools/hv: Add memory allocation check in hv_fcopy_start [+ + +]
Author: Zhu Jun <zhujun2@cmss.chinamobile.com>
Date:   Fri Sep 6 02:13:33 2024 -0700

    tools/hv: Add memory allocation check in hv_fcopy_start
    
    [ Upstream commit 94e86b174d103d941b4afc4f016af8af9e5352fa ]
    
    Added error handling for memory allocation failures
    of file_name and path_name.
    
    Signed-off-by: Zhu Jun <zhujun2@cmss.chinamobile.com>
    Reviewed-by: Dexuan Cui <decui@microsoft.com>
    Tested-by: Saurabh Sengar <ssengar@linux.microsoft.com>
    Link: https://lore.kernel.org/r/20240906091333.11419-1-zhujun2@cmss.chinamobile.com
    Signed-off-by: Wei Liu <wei.liu@kernel.org>
    Message-ID: <20240906091333.11419-1-zhujun2@cmss.chinamobile.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
tools/nolibc: powerpc: limit stack-protector workaround to GCC [+ + +]
Author: Thomas Weißschuh <linux@weissschuh.net>
Date:   Wed Aug 7 23:51:39 2024 +0200

    tools/nolibc: powerpc: limit stack-protector workaround to GCC
    
    [ Upstream commit 1daea158d0aae0770371f3079305a29fdb66829e ]
    
    As mentioned in the comment, the workaround for
    __attribute__((no_stack_protector)) is only necessary on GCC.
    Avoid applying the workaround on clang, as clang does not recognize
    __attribute__((__optimize__)) and would fail.
    
    Acked-by: Willy Tarreau <w@1wt.eu>
    Link: https://lore.kernel.org/r/20240807-nolibc-llvm-v2-3-c20f2f5fc7c2@weissschuh.net
    Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
tools/rtla: Fix installation from out-of-tree build [+ + +]
Author: Ben Hutchings <benh@debian.org>
Date:   Mon Sep 16 01:31:58 2024 +0200

    tools/rtla: Fix installation from out-of-tree build
    
    [ Upstream commit f771d5369f1dbfe32c93bcb4f5d7ca8322b15389 ]
    
    rtla now supports out-of-tree builds, but installation fails as it
    still tries to install the rtla binary from the source tree.  Use the
    existing macro $(RTLA) to refer to the binary.
    
    Link: https://lore.kernel.org/ZudubuoU_JHjPZ7w@decadent.org.uk
    Fixes: 01474dc706ca ("tools/rtla: Use tools/build makefiles to build rtla")
    Reviewed-by: Tomas Glozar <tglozar@redhat.com>
    Tested-by: Tomas Glozar <tglozar@redhat.com>
    Signed-off-by: Ben Hutchings <benh@debian.org>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
tools/x86/kcpuid: Protect against faulty "max subleaf" values [+ + +]
Author: Ahmed S. Darwish <darwi@linutronix.de>
Date:   Thu Jul 18 15:47:44 2024 +0200

    tools/x86/kcpuid: Protect against faulty "max subleaf" values
    
    [ Upstream commit cf96ab1a966b87b09fdd9e8cc8357d2d00776a3a ]
    
    Protect against the kcpuid code parsing faulty max subleaf numbers
    through a min() expression, thus ensuring that max_subleaf will always
    be ≤ MAX_SUBLEAF_NUM.
    
    Use "u32" for the subleaf numbers since kcpuid is compiled with -Wextra,
    which includes signed/unsigned comparison warnings.
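
    A minimal sketch of the clamping idea, assuming an illustrative
    limit value (the real MAX_SUBLEAF_NUM in kcpuid may differ) and a
    hypothetical helper clamp_max_subleaf(). Both operands are unsigned
    32-bit, so no signed/unsigned comparison is involved.

    ```c
    #include <assert.h>
    #include <stdint.h>

    #define MAX_SUBLEAF_NUM 64u /* assumed limit, for illustration only */

    static uint32_t min_u32(uint32_t a, uint32_t b)
    {
        return a < b ? a : b;
    }

    /* Clamp a subleaf count reported by CPUID to the table size. */
    static uint32_t clamp_max_subleaf(uint32_t reported)
    {
        return min_u32(reported, MAX_SUBLEAF_NUM);
    }
    ```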
    
    Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Link: https://lore.kernel.org/all/20240718134755.378115-5-darwi@linutronix.de
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
tracing/hwlat: Fix a race during cpuhp processing [+ + +]
Author: Wei Li <liwei391@huawei.com>
Date:   Tue Sep 24 17:45:14 2024 +0800

    tracing/hwlat: Fix a race during cpuhp processing
    
    commit 2a13ca2e8abb12ee43ada8a107dadca83f140937 upstream.
    
    In theory, the cpuhp online/offline processing race also exists in the
    percpu-mode hwlat tracer, so apply the fix there too. That is:
    
        T1                       | T2
        [CPUHP_ONLINE]           | cpu_device_down()
         hwlat_hotplug_workfn()  |
                                 |     cpus_write_lock()
                                 |     takedown_cpu(1)
                                 |     cpus_write_unlock()
        [CPUHP_OFFLINE]          |
            cpus_read_lock()     |
            start_kthread(1)     |
            cpus_read_unlock()   |
    
    Cc: stable@vger.kernel.org
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Link: https://lore.kernel.org/20240924094515.3561410-5-liwei391@huawei.com
    Fixes: ba998f7d9531 ("trace/hwlat: Support hotplug operations")
    Signed-off-by: Wei Li <liwei391@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
tracing/timerlat: Drop interface_lock in stop_kthread() [+ + +]
Author: Wei Li <liwei391@huawei.com>
Date:   Tue Sep 24 17:45:12 2024 +0800

    tracing/timerlat: Drop interface_lock in stop_kthread()
    
    commit b484a02c9cedf8703eff8f0756f94618004bd165 upstream.
    
    stop_kthread() is the offline callback for "trace/osnoise:online". Since
    commit 5bfbcd1ee57b ("tracing/timerlat: Add interface_lock around clearing
    of kthread in stop_kthread()"), the following ABBA deadlock scenario is
    introduced:
    
    T1                            | T2 [BP]               | T3 [AP]
    osnoise_hotplug_workfn()      | work_for_cpu_fn()     | cpuhp_thread_fun()
                                  |   _cpu_down()         |   osnoise_cpu_die()
      mutex_lock(&interface_lock) |                       |     stop_kthread()
                                  |     cpus_write_lock() |       mutex_lock(&interface_lock)
      cpus_read_lock()            |     cpuhp_kick_ap()   |
    
    As the interface_lock here is just for protecting the "kthread" field of
    the osn_var, use xchg() instead to fix this issue. Also go back to using
    for_each_online_cpu() in stop_per_cpu_kthreads(), as it can take
    cpus_read_lock() again.
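
    The lock-free "take the pointer and leave NULL behind" idea can be
    sketched in userspace with C11 atomics standing in for the kernel's
    xchg(). The names struct worker, slot and take_worker() are
    hypothetical and only illustrate the pattern.

    ```c
    #include <assert.h>
    #include <stdatomic.h>
    #include <stddef.h>

    struct worker {
        int id;
    };

    /* Hypothetical per-CPU slot holding the worker pointer. */
    static _Atomic(struct worker *) slot;

    /*
     * Atomically take ownership of the stored pointer and leave NULL
     * behind. Of several concurrent callers, at most one gets a
     * non-NULL result, so no mutex is needed to clear the field.
     */
    static struct worker *take_worker(void)
    {
        return atomic_exchange(&slot, (struct worker *)NULL);
    }
    ```

    Because the exchange is a single atomic step, the caller that stops
    the thread never has to hold a lock that another hotplug path may
    also be waiting on.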
    
    Cc: stable@vger.kernel.org
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Link: https://lore.kernel.org/20240924094515.3561410-3-liwei391@huawei.com
    Fixes: 5bfbcd1ee57b ("tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread()")
    Signed-off-by: Wei Li <liwei391@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tracing/timerlat: Fix a race during cpuhp processing [+ + +]
Author: Wei Li <liwei391@huawei.com>
Date:   Tue Sep 24 17:45:13 2024 +0800

    tracing/timerlat: Fix a race during cpuhp processing
    
    commit 829e0c9f0855f26b3ae830d17b24aec103f7e915 upstream.
    
    Another exception was found: the "timerlat/1" thread was scheduled on
    CPU0, which eventually led to timer corruption:
    
    ```
    ODEBUG: init active (active state 0) object: ffff888237c2e108 object type: hrtimer hint: timerlat_irq+0x0/0x220
    WARNING: CPU: 0 PID: 426 at lib/debugobjects.c:518 debug_print_object+0x7d/0xb0
    Modules linked in:
    CPU: 0 UID: 0 PID: 426 Comm: timerlat/1 Not tainted 6.11.0-rc7+ #45
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
    RIP: 0010:debug_print_object+0x7d/0xb0
    ...
    Call Trace:
     <TASK>
     ? __warn+0x7c/0x110
     ? debug_print_object+0x7d/0xb0
     ? report_bug+0xf1/0x1d0
     ? prb_read_valid+0x17/0x20
     ? handle_bug+0x3f/0x70
     ? exc_invalid_op+0x13/0x60
     ? asm_exc_invalid_op+0x16/0x20
     ? debug_print_object+0x7d/0xb0
     ? debug_print_object+0x7d/0xb0
     ? __pfx_timerlat_irq+0x10/0x10
     __debug_object_init+0x110/0x150
     hrtimer_init+0x1d/0x60
     timerlat_main+0xab/0x2d0
     ? __pfx_timerlat_main+0x10/0x10
     kthread+0xb7/0xe0
     ? __pfx_kthread+0x10/0x10
     ret_from_fork+0x2d/0x40
     ? __pfx_kthread+0x10/0x10
     ret_from_fork_asm+0x1a/0x30
     </TASK>
    ```
    
    After tracing the scheduling event, it was discovered that the migration
    of the "timerlat/1" thread was performed during thread creation. Further
    analysis confirmed that it is because the CPU online processing for
    osnoise is implemented through workers, which is asynchronous with the
    offline processing. When the worker was scheduled to create a thread, the
    CPU may have already been removed from the cpu_online_mask during the offline
    process, resulting in the inability to select the right CPU:
    
    T1                       | T2
    [CPUHP_ONLINE]           | cpu_device_down()
    osnoise_hotplug_workfn() |
                             |     cpus_write_lock()
                             |     takedown_cpu(1)
                             |     cpus_write_unlock()
    [CPUHP_OFFLINE]          |
        cpus_read_lock()     |
        start_kthread(1)     |
        cpus_read_unlock()   |
    
    To fix this, skip online processing if the CPU is already offline.
    
    Cc: stable@vger.kernel.org
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Link: https://lore.kernel.org/20240924094515.3561410-4-liwei391@huawei.com
    Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations")
    Signed-off-by: Wei Li <liwei391@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tracing/timerlat: Fix duplicated kthread creation due to CPU online/offline [+ + +]
Author: Wei Li <liwei391@huawei.com>
Date:   Tue Sep 24 17:45:11 2024 +0800

    tracing/timerlat: Fix duplicated kthread creation due to CPU online/offline
    
    commit 0bb0a5c12ecf36ad561542bbb95f96355e036a02 upstream.
    
    osnoise_hotplug_workfn() is the asynchronous online callback for
    "trace/osnoise:online". It may be congested when a CPU goes online and
    offline repeatedly, and may then be invoked multiple times after a
    single online event.
    
    This will lead to kthread leak and timer corruption. Add a check
    in start_kthread() to prevent this situation.
    
    Cc: stable@vger.kernel.org
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Link: https://lore.kernel.org/20240924094515.3561410-2-liwei391@huawei.com
    Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations")
    Signed-off-by: Wei Li <liwei391@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
uprobes: fix kernel info leak via "[uprobes]" vma [+ + +]
Author: Oleg Nesterov <oleg@redhat.com>
Date:   Mon Oct 7 19:46:01 2024 +0200

    uprobes: fix kernel info leak via "[uprobes]" vma
    
    commit 34820304cc2cd1804ee1f8f3504ec77813d29c8e upstream.
    
    xol_add_vma() maps the uninitialized page allocated by __create_xol_area()
    into userspace. On some architectures (x86) this memory is readable even
    without VM_READ, because VM_EXEC results in the same pgprot_t as
    VM_EXEC|VM_READ. This doesn't really matter, though, since a debugger can
    read this memory anyway.
    
    Link: https://lore.kernel.org/all/20240929162047.GA12611@redhat.com/
    
    Reported-by: Will Deacon <will@kernel.org>
    Fixes: d4b3b6384f98 ("uprobes/core: Allocate XOL slots for uprobes use")
    Cc: stable@vger.kernel.org
    Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
vfs: use RCU in ilookup [+ + +]
Author: Mateusz Guzik <mjguzik@gmail.com>
Date:   Mon Jul 15 09:13:24 2024 +0200

    vfs: use RCU in ilookup
    
    [ Upstream commit 122381a46954ad592ee93d7da2bef5074b396247 ]
    
    A soft lockup in ilookup was reported when stress-testing a 512-way
    system [1] (see [2] for full context) and it was verified that not
    taking the lock shifts issues back to mm.
    
    [1] https://lore.kernel.org/linux-mm/56865e57-c250-44da-9713-cf1404595bcc@amd.com/
    [2] https://lore.kernel.org/linux-mm/d2841226-e27b-4d3d-a578-63587a3aa4f3@amd.com/
    
    Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
    Link: https://lore.kernel.org/r/20240715071324.265879-1-mjguzik@gmail.com
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
vhost/scsi: null-ptr-dereference in vhost_scsi_get_req() [+ + +]
Author: Haoran Zhang <wh1sper@zju.edu.cn>
Date:   Tue Oct 1 15:14:15 2024 -0500

    vhost/scsi: null-ptr-dereference in vhost_scsi_get_req()
    
    commit 221af82f606d928ccef19a16d35633c63026f1be upstream.
    
    Since commit 3f8ca2e115e5 ("vhost/scsi: Extract common handling code
    from control queue handler") a null pointer dereference bug can be
    triggered when guest sends an SCSI AN request.
    
    In vhost_scsi_ctl_handle_vq(), `vc.target` is assigned with
    `&v_req.tmf.lun[1]` within a switch-case block and is then passed to
    vhost_scsi_get_req() which extracts `vc->req` and `tpg`. However, for
    a `VIRTIO_SCSI_T_AN_*` request, tpg is not required, so `vc.target` is
    set to NULL in this branch. Later, in vhost_scsi_get_req(),
    `vc->target` is dereferenced without being checked, leading to a null
    pointer dereference bug. This bug can be triggered from guest.
    
    When this bug occurs, the vhost_worker process is killed while holding
    `vq->mutex` and the corresponding tpg will remain occupied
    indefinitely.
    
    Below is the KASAN report:
    Oops: general protection fault, probably for non-canonical address
    0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI
    KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
    CPU: 1 PID: 840 Comm: poc Not tainted 6.10.0+ #1
    Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS
    1.16.3-debian-1.16.3-2 04/01/2014
    RIP: 0010:vhost_scsi_get_req+0x165/0x3a0
    Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 2b 02 00 00
    48 b8 00 00 00 00 00 fc ff df 4d 8b 65 30 4c 89 e2 48 c1 ea 03 <0f> b6
    04 02 4c 89 e2 83 e2 07 38 d0 7f 08 84 c0 0f 85 be 01 00 00
    RSP: 0018:ffff888017affb50 EFLAGS: 00010246
    RAX: dffffc0000000000 RBX: ffff88801b000000 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff888017affcb8
    RBP: ffff888017affb80 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
    R13: ffff888017affc88 R14: ffff888017affd1c R15: ffff888017993000
    FS:  000055556e076500(0000) GS:ffff88806b100000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000200027c0 CR3: 0000000010ed0004 CR4: 0000000000370ef0
    Call Trace:
     <TASK>
     ? show_regs+0x86/0xa0
     ? die_addr+0x4b/0xd0
     ? exc_general_protection+0x163/0x260
     ? asm_exc_general_protection+0x27/0x30
     ? vhost_scsi_get_req+0x165/0x3a0
     vhost_scsi_ctl_handle_vq+0x2a4/0xca0
     ? __pfx_vhost_scsi_ctl_handle_vq+0x10/0x10
     ? __switch_to+0x721/0xeb0
     ? __schedule+0xda5/0x5710
     ? __kasan_check_write+0x14/0x30
     ? _raw_spin_lock+0x82/0xf0
     vhost_scsi_ctl_handle_kick+0x52/0x90
     vhost_run_work_list+0x134/0x1b0
     vhost_task_fn+0x121/0x350
    ...
     </TASK>
    ---[ end trace 0000000000000000 ]---
    
    Let's add a check in vhost_scsi_get_req.
    
    Fixes: 3f8ca2e115e5 ("vhost/scsi: Extract common handling code from control queue handler")
    Signed-off-by: Haoran Zhang <wh1sper@zju.edu.cn>
    [whitespace fixes]
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Message-Id: <b26d7ddd-b098-4361-88f8-17ca7f90adf7@oracle.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
virt: sev-guest: Ensure the SNP guest messages do not exceed a page [+ + +]
Author: Nikunj A Dadhania <nikunj@amd.com>
Date:   Wed Jul 31 20:37:55 2024 +0530

    virt: sev-guest: Ensure the SNP guest messages do not exceed a page
    
    [ Upstream commit 2b9ac0b84c2cae91bbaceab62df4de6d503421ec ]
    
    Currently, struct snp_guest_msg includes a message header (96 bytes) and
    a payload (4000 bytes). There is an implicit assumption here that the
    SNP message header will always be 96 bytes, and with that assumption the
    payload array size has been set to 4000 bytes - a magic number. If any
    new member is added to the SNP message header, the SNP guest message
    will span more than a page.
    
    Instead of using a magic number for the payload, declare struct
    snp_guest_msg in a way that payload plus the message header do not
    exceed a page.
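
    A minimal sketch of the idea, using hypothetical stand-in types and
    an assumed 4096-byte page: the payload size is derived from the
    header size, and a compile-time assertion catches any layout that
    would not fit in one page.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    #define PAGE_SIZE_BYTES 4096u /* assumed page size */

    /* Hypothetical stand-in for the 96-byte message header. */
    struct msg_hdr {
        uint8_t bytes[96];
    };

    struct guest_msg {
        struct msg_hdr hdr;
        /* Derived size: if the header grows, the payload shrinks. */
        uint8_t payload[PAGE_SIZE_BYTES - sizeof(struct msg_hdr)];
    };

    _Static_assert(sizeof(struct guest_msg) == PAGE_SIZE_BYTES,
                   "message must occupy exactly one page");
    ```

    With a 96-byte header this yields the same 4000-byte payload as
    before, but without the magic number.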
    
      [ bp: Massage. ]
    
    Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
    Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
    Link: https://lore.kernel.org/r/20240731150811.156771-5-nikunj@amd.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
vrf: revert "vrf: Remove unnecessary RCU-bh critical section" [+ + +]
Author: Willem de Bruijn <willemb@google.com>
Date:   Sun Sep 29 02:18:20 2024 -0400

    vrf: revert "vrf: Remove unnecessary RCU-bh critical section"
    
    commit b04c4d9eb4f25b950b33218e33b04c94e7445e51 upstream.
    
    This reverts commit 504fc6f4f7f681d2a03aa5f68aad549d90eab853.
    
    dev_queue_xmit_nit is expected to be called with BH disabled.
    __dev_queue_xmit has the following:
    
            /* Disable soft irqs for various locks below. Also
             * stops preemption for RCU.
             */
            rcu_read_lock_bh();
    
    VRF must follow this invariant. The referenced commit removed this
    protection, which triggered a lockdep warning:
    
            ================================
            WARNING: inconsistent lock state
            6.11.0 #1 Tainted: G        W
            --------------------------------
            inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
            btserver/134819 [HC0[0]:SC0[0]:HE1:SE1] takes:
            ffff8882da30c118 (rlock-AF_PACKET){+.?.}-{2:2}, at: tpacket_rcv+0x863/0x3b30
            {IN-SOFTIRQ-W} state was registered at:
              lock_acquire+0x19a/0x4f0
              _raw_spin_lock+0x27/0x40
              packet_rcv+0xa33/0x1320
              __netif_receive_skb_core.constprop.0+0xcb0/0x3a90
              __netif_receive_skb_list_core+0x2c9/0x890
              netif_receive_skb_list_internal+0x610/0xcc0
              [...]
    
            other info that might help us debug this:
             Possible unsafe locking scenario:
    
                   CPU0
                   ----
              lock(rlock-AF_PACKET);
              <Interrupt>
                lock(rlock-AF_PACKET);
    
             *** DEADLOCK ***
    
            Call Trace:
             <TASK>
             dump_stack_lvl+0x73/0xa0
             mark_lock+0x102e/0x16b0
             __lock_acquire+0x9ae/0x6170
             lock_acquire+0x19a/0x4f0
             _raw_spin_lock+0x27/0x40
             tpacket_rcv+0x863/0x3b30
             dev_queue_xmit_nit+0x709/0xa40
             vrf_finish_direct+0x26e/0x340 [vrf]
             vrf_l3_out+0x5f4/0xe80 [vrf]
             __ip_local_out+0x51e/0x7a0
              [...]
    
    Fixes: 504fc6f4f7f6 ("vrf: Remove unnecessary RCU-bh critical section")
    Link: https://lore.kernel.org/netdev/20240925185216.1990381-1-greearb@candelatech.com/
    Reported-by: Ben Greear <greearb@candelatech.com>
    Signed-off-by: Willem de Bruijn <willemb@google.com>
    Cc: stable@vger.kernel.org
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Tested-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Link: https://patch.msgid.link/20240929061839.1175300-1-willemdebruijn.kernel@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
wifi: ath11k: fix array out-of-bound access in SoC stats [+ + +]
Author: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Date:   Thu Jul 4 12:38:11 2024 +0530

    wifi: ath11k: fix array out-of-bound access in SoC stats
    
    [ Upstream commit 69f253e46af98af17e3efa3e5dfa72fcb7d1983d ]
    
    Currently, the ath11k_soc_dp_stats::hal_reo_error array is defined with a
    maximum size of DP_REO_DST_RING_MAX. However, the ath11k_dp_process_rx()
    function accesses ath11k_soc_dp_stats::hal_reo_error using the REO
    destination SRNG ring ID, which is incorrect. The SRNG ring ID differs from
    the normal ring ID, and this usage leads to out-of-bounds array access. To
    fix this issue, modify ath11k_dp_process_rx() to use the normal ring ID
    directly instead of the SRNG ring ID.
    
    Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1
    
    Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
    Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
    Link: https://patch.msgid.link/20240704070811.4186543-3-quic_periyasa@quicinc.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: ath12k: fix array out-of-bound access in SoC stats [+ + +]
Author: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Date:   Thu Jul 4 12:38:10 2024 +0530

    wifi: ath12k: fix array out-of-bound access in SoC stats
    
    [ Upstream commit e106b7ad13c1d246adaa57df73edb8f8b8acb240 ]
    
    Currently, the ath12k_soc_dp_stats::hal_reo_error array is defined with a
    maximum size of DP_REO_DST_RING_MAX. However, the ath12k_dp_rx_process()
    function accesses ath12k_soc_dp_stats::hal_reo_error using the REO
    destination SRNG ring ID, which is incorrect. The SRNG ring ID differs from
    the normal ring ID, and this usage leads to out-of-bounds array access. To
    fix this issue, modify ath12k_dp_rx_process() to use the normal ring ID
    directly instead of the SRNG ring ID.
    
    Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
    
    Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
    Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
    Link: https://patch.msgid.link/20240704070811.4186543-2-quic_periyasa@quicinc.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: ath9k: fix possible integer overflow in ath9k_get_et_stats() [+ + +]
Author: Dmitry Kandybka <d.kandybka@gmail.com>
Date:   Thu Jul 25 14:17:43 2024 +0300

    wifi: ath9k: fix possible integer overflow in ath9k_get_et_stats()
    
    [ Upstream commit 3f66f26703093886db81f0610b97a6794511917c ]
    
    In 'ath9k_get_et_stats()', promote TX stats counters to 'u64'
    to avoid possible integer overflow. Compile tested only.
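
    To illustrate why the promotion matters, here is a minimal sketch
    with a hypothetical helper sum_counters(): adding 32-bit counters in
    32-bit arithmetic wraps around, while accumulating into a 64-bit
    total does not. It does not reproduce the actual ath9k macros.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Sum hypothetical per-queue 32-bit counters into a 64-bit total. */
    static uint64_t sum_counters(const uint32_t *counters, int n)
    {
        uint64_t total = 0;
        int i;

        for (i = 0; i < n; i++)
            total += counters[i]; /* each term is widened before adding */
        return total;
    }
    ```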
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Signed-off-by: Dmitry Kandybka <d.kandybka@gmail.com>
    Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
    Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
    Link: https://patch.msgid.link/20240725111743.14422-1-d.kandybka@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: ath9k_htc: Use __skb_set_length() for resetting urb before resubmit [+ + +]
Author: Toke Høiland-Jørgensen <toke@redhat.com>
Date:   Mon Aug 12 16:24:46 2024 +0200

    wifi: ath9k_htc: Use __skb_set_length() for resetting urb before resubmit
    
    [ Upstream commit 94745807f3ebd379f23865e6dab196f220664179 ]
    
    Syzbot points out that skb_trim() has a sanity check on the existing length of
    the skb, which can be uninitialised in some error paths. The intent here is
    clearly just to reset the length to zero before resubmitting, so switch to
    calling __skb_set_length(skb, 0) directly. In addition, __skb_set_length()
    already contains a call to skb_reset_tail_pointer(), so remove the redundant
    call.
    
    The syzbot report came from ath9k_hif_usb_reg_in_cb(), but there's a similar
    usage of skb_trim() in ath9k_hif_usb_rx_cb(), change both while we're at it.
    
    Reported-by: syzbot+98afa303be379af6cdb2@syzkaller.appspotmail.com
    Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
    Link: https://patch.msgid.link/20240812142447.12328-1-toke@toke.dk
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: cfg80211: Set correct chandef when starting CAC [+ + +]
Author: Issam Hamdi <ih@simonwunderlich.de>
Date:   Fri Aug 16 16:24:18 2024 +0200

    wifi: cfg80211: Set correct chandef when starting CAC
    
    [ Upstream commit 20361712880396e44ce80aaeec2d93d182035651 ]
    
    When starting CAC in a mode other than AP mode, it returns a
    "WARNING: CPU: 0 PID: 63 at cfg80211_chandef_dfs_usable+0x20/0xaf [cfg80211]"
    caused by chandef.chan being NULL at the end of CAC.
    
    Solution: Ensure the channel definition is set for the different modes
    when starting CAC to avoid getting a NULL 'chan' at the end of CAC.
    
     Call Trace:
      ? show_regs.part.0+0x14/0x16
      ? __warn+0x67/0xc0
      ? cfg80211_chandef_dfs_usable+0x20/0xaf [cfg80211]
      ? report_bug+0xa7/0x130
      ? exc_overflow+0x30/0x30
      ? handle_bug+0x27/0x50
      ? exc_invalid_op+0x18/0x60
      ? handle_exception+0xf6/0xf6
      ? exc_overflow+0x30/0x30
      ? cfg80211_chandef_dfs_usable+0x20/0xaf [cfg80211]
      ? exc_overflow+0x30/0x30
      ? cfg80211_chandef_dfs_usable+0x20/0xaf [cfg80211]
      ? regulatory_propagate_dfs_state.cold+0x1b/0x4c [cfg80211]
      ? cfg80211_propagate_cac_done_wk+0x1a/0x30 [cfg80211]
      ? process_one_work+0x165/0x280
      ? worker_thread+0x120/0x3f0
      ? kthread+0xc2/0xf0
      ? process_one_work+0x280/0x280
      ? kthread_complete_and_exit+0x20/0x20
      ? ret_from_fork+0x19/0x24
    
    Reported-by: Kretschmer Mathias <mathias.kretschmer@fit.fraunhofer.de>
    Signed-off-by: Issam Hamdi <ih@simonwunderlich.de>
    Link: https://patch.msgid.link/20240816142418.3381951-1-ih@simonwunderlich.de
    [shorten subject, remove OCB, reorder cases to match previous list]
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: allow only CN mcc from WRDD [+ + +]
Author: Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com>
Date:   Thu Aug 8 23:22:49 2024 +0300

    wifi: iwlwifi: allow only CN mcc from WRDD
    
    [ Upstream commit ff5aabe7c2a4a4b089a9ced0cb3d0e284963a7dd ]
    
    Block other mcc except CN from WRDD ACPI.
    
    Signed-off-by: Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com>
    Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
    Link: https://patch.msgid.link/20240808232017.fe6ea7aa4b39.I86004687a2963fe26f990770aca103e2f5cb1628@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: mvm: avoid NULL pointer dereference [+ + +]
Author: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Date:   Sun Aug 25 19:17:09 2024 +0300

    wifi: iwlwifi: mvm: avoid NULL pointer dereference
    
    [ Upstream commit 557a6cd847645e667f3b362560bd7e7c09aac284 ]
    
    iwl_mvm_tx_skb_sta() and iwl_mvm_tx_mpdu() verify that the mvmsta
    pointer is not NULL.
    They retrieve this pointer using iwl_mvm_sta_from_mac80211, which
    dereferences the ieee80211_sta pointer.
    If sta is NULL, iwl_mvm_sta_from_mac80211 will dereference a NULL
    pointer.
    Fix this by checking the sta pointer before retrieving the mvmsta
    from it. If sta is not NULL, then mvmsta isn't either.
    
    Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
    Reviewed-by: Johannes Berg <johannes.berg@intel.com>
    Link: https://patch.msgid.link/20240825191257.880921ce23b7.I340052d70ab6d3410724ce955eb00da10e08188f@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: mvm: drop wrong STA selection in TX [+ + +]
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Thu Aug 8 23:22:48 2024 +0300

    wifi: iwlwifi: mvm: drop wrong STA selection in TX
    
    [ Upstream commit 1c7e1068a7c9c39ed27636db93e71911e0045419 ]
    
    This shouldn't happen at all, since in station mode all MMPDUs
    go through the TXQ for the STA, and not this function. There
    may or may not be a race in mac80211 through which this might
    happen for some frames while a station is being added, but in
    that case we can also just drop the frame and pretend the STA
    didn't exist yet.
    
    Also, the code is simply wrong since it uses deflink, and it's
    not easy to fix it since the mvmvif->ap_sta pointer cannot be
    used without the mutex, and perhaps the right link might not
    even be known.
    
    Just drop the frame at that point instead of trying to fix it
    up.
    
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
    Link: https://patch.msgid.link/20240808232017.45ad105dc7fe.I6d45c82e5758395d9afb8854057ded03c7dc81d7@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: mvm: Fix a race in scan abort flow [+ + +]
Author: Ilan Peer <ilan.peer@intel.com>
Date:   Sun Aug 25 08:56:37 2024 +0300

    wifi: iwlwifi: mvm: Fix a race in scan abort flow
    
    [ Upstream commit 87c1c28a9aa149489e1667f5754fc24f4973d2d0 ]
    
    When the upper layer requests to cancel an ongoing scan, a race
    is possible: by the time the driver starts to handle the upper
    layer's scan cancel flow, the FW may have already completed the
    scan request, and the driver may have received the scan complete
    notification without having handled it yet. In such a case the FW
    simply ignores the scan abort request coming from the driver, no
    notification arrives from the FW, and the entire abort flow is
    considered a failure.
    
    To better handle this, check the status code returned by the FW for
    the scan abort command. In case the status indicates that
    no scan was aborted, complete the scan abort flow with success, i.e.,
    the scan was aborted, as the flow is expected to consume the scan
    complete notification.
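
    The status handling described above could be sketched as follows. The
    status codes and action names are hypothetical; the text does not give
    the firmware's real values, and this is not the driver's code.

```c
/* Hypothetical firmware status codes for the scan abort command. */
enum abort_status {
	ABORT_STATUS_SUCCESS = 0,
	ABORT_STATUS_NOT_FOUND = 1,   /* firmware had no scan left to abort */
	ABORT_STATUS_ERROR = 2
};

enum abort_action {
	WAIT_FOR_SCAN_COMPLETE,   /* firmware will still send a notification */
	ALREADY_DONE,             /* nothing aborted: treat the abort as success */
	ABORT_FAILED
};

/* Map the firmware's answer to what the abort flow should do next. */
static enum abort_action handle_abort_status(enum abort_status status)
{
	switch (status) {
	case ABORT_STATUS_SUCCESS:
		return WAIT_FOR_SCAN_COMPLETE;
	case ABORT_STATUS_NOT_FOUND:
		return ALREADY_DONE;
	default:
		return ABORT_FAILED;
	}
}
```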
    
    Signed-off-by: Ilan Peer <ilan.peer@intel.com>
    Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
    Link: https://patch.msgid.link/20240825085558.483989d3baef.I3340556a222388504c6330b333360bf77d10f9e2@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: mvm: use correct key iteration [+ + +]
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Mon Jul 29 20:20:05 2024 +0300

    wifi: iwlwifi: mvm: use correct key iteration
    
    [ Upstream commit 4f1591d292277eec51d027405a92f0d4ef5e299e ]
    
    In the cases changed here, key iteration isn't done from
    an RCU critical section, but rather using the wiphy lock
    as protection. Therefore, just use ieee80211_iter_keys().
    The link switch case can therefore also use sync commands.
    
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
    Link: https://patch.msgid.link/20240729201718.69a2d18580c1.I2148e04d4b467d0b100beac8f7e449bfaaf775a5@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mac80211: fix RCU list iterations [+ + +]
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Tue Aug 27 09:49:40 2024 +0200

    wifi: mac80211: fix RCU list iterations
    
    [ Upstream commit ac35180032fbc5d80b29af00ba4881815ceefcb6 ]
    
    There are a number of places where RCU list iteration is
    used but which aren't (always) called with the RCU read lock
    held. Use plain list_for_each_entry() in most of them, and
    annotate iface iteration with the required locks.
    
    Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
    Link: https://patch.msgid.link/20240827094939.ed8ac0b2f897.I8443c9c3c0f8051841353491dae758021b53115e@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7915: add dummy HW offload of IEEE 802.11 fragmentation [+ + +]
Author: Benjamin Lin <benjamin-jw.lin@mediatek.com>
Date:   Tue Aug 27 11:30:03 2024 +0200

    wifi: mt76: mt7915: add dummy HW offload of IEEE 802.11 fragmentation
    
    [ Upstream commit f2cc859149240d910fdc6405717673e0b84bfda8 ]
    
    Currently, the CONNAC2 series does not support encryption for fragmented
    Tx frames. Therefore, add a dummy mt7915_set_frag_threshold() function to
    prevent SW IEEE 802.11 fragmentation.
    
    Signed-off-by: Benjamin Lin <benjamin-jw.lin@mediatek.com>
    Link: https://patch.msgid.link/20240827093011.18621-16-nbd@nbd.name
    Signed-off-by: Felix Fietkau <nbd@nbd.name>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7915: disable tx worker during tx BA session enable/disable [+ + +]
Author: Felix Fietkau <nbd@nbd.name>
Date:   Tue Aug 27 11:29:54 2024 +0200

    wifi: mt76: mt7915: disable tx worker during tx BA session enable/disable
    
    [ Upstream commit 256cbd26fbafb30ba3314339106e5c594e9bd5f9 ]
    
    Avoids firmware race condition.
    
    Link: https://patch.msgid.link/20240827093011.18621-7-nbd@nbd.name
    Signed-off-by: Felix Fietkau <nbd@nbd.name>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt7915: hold dev->mt76.mutex while disabling tx worker [+ + +]
Author: Felix Fietkau <nbd@nbd.name>
Date:   Tue Aug 27 11:30:04 2024 +0200

    wifi: mt76: mt7915: hold dev->mt76.mutex while disabling tx worker
    
    [ Upstream commit 8f7152f10cb434f954aeff85ca1be9cd4d01912b ]
    
    Prevent racing against other functions disabling the same worker.
    
    Link: https://patch.msgid.link/20240827093011.18621-17-nbd@nbd.name
    Signed-off-by: Felix Fietkau <nbd@nbd.name>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mwifiex: Fix memcpy() field-spanning write warning in mwifiex_cmd_802_11_scan_ext() [+ + +]
Author: Gustavo A. R. Silva <gustavoars@kernel.org>
Date:   Wed Aug 21 15:23:51 2024 -0600

    wifi: mwifiex: Fix memcpy() field-spanning write warning in mwifiex_cmd_802_11_scan_ext()
    
    [ Upstream commit 498365e52bebcbc36a93279fe7e9d6aec8479cee ]
    
    Replace one-element array with a flexible-array member in
    `struct host_cmd_ds_802_11_scan_ext`.
    
    With this, fix the following warning:
    
    elo 16 17:51:58 surfacebook kernel: ------------[ cut here ]------------
    elo 16 17:51:58 surfacebook kernel: memcpy: detected field-spanning write (size 243) of single field "ext_scan->tlv_buffer" at drivers/net/wireless/marvell/mwifiex/scan.c:2239 (size 1)
    elo 16 17:51:58 surfacebook kernel: WARNING: CPU: 0 PID: 498 at drivers/net/wireless/marvell/mwifiex/scan.c:2239 mwifiex_cmd_802_11_scan_ext+0x83/0x90 [mwifiex]
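
    To illustrate why the flexible-array member avoids the warning, here is
    a small userspace sketch. The struct layout and the helper name
    scan_ext_cmd_alloc() are assumptions for illustration and do not
    reproduce the real struct host_cmd_ds_802_11_scan_ext.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical command layout, loosely modelled on the struct named above. */
struct scan_ext_cmd {
	uint32_t reserved;
	uint8_t tlv_buffer[];   /* flexible-array member: sized at allocation */
};

/* Allocate a command with room for len TLV bytes and copy them in. */
static struct scan_ext_cmd *scan_ext_cmd_alloc(const uint8_t *tlv, size_t len)
{
	struct scan_ext_cmd *cmd = malloc(sizeof(*cmd) + len);

	if (!cmd)
		return NULL;
	cmd->reserved = 0;
	memcpy(cmd->tlv_buffer, tlv, len);   /* no fixed one-byte field to overrun */
	return cmd;
}
```

    With a one-element array the compiler sees a destination of size 1, so
    a 243-byte copy looks like a write spanning past the field; with a
    flexible-array member the field has no fixed size to exceed.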
    
    Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Closes: https://lore.kernel.org/linux-hardening/ZsZNgfnEwOcPdCly@black.fi.intel.com/
    Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
    Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Acked-by: Brian Norris <briannorris@chromium.org>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://patch.msgid.link/ZsZa5xRcsLq9D+RX@elsanto
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: rtw88: select WANT_DEV_COREDUMP [+ + +]
Author: Zong-Zhe Yang <kevin_yang@realtek.com>
Date:   Thu Jul 18 15:06:15 2024 +0800

    wifi: rtw88: select WANT_DEV_COREDUMP
    
    [ Upstream commit 7e989b0c1e33210c07340bf5228aa83ea52515b5 ]
    
    The driver invokes device coredump when the firmware crashes, so it
    should select WANT_DEV_COREDUMP itself.
    
    Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
    Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
    Link: https://patch.msgid.link/20240718070616.42217-1-pkshih@realtek.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: rtw89: 885xb: reset IDMEM mode to prevent download firmware failure [+ + +]
Author: Ping-Ke Shih <pkshih@realtek.com>
Date:   Wed Jul 24 13:26:25 2024 +0800

    wifi: rtw89: 885xb: reset IDMEM mode to prevent download firmware failure
    
    [ Upstream commit 80fb81bb46a57daedd5decbcc253ea48428a254e ]
    
    Different firmware types can change the IDMEM mode, so reset it to the
    default to avoid errors on RTL8851B/RTL8852B/RTL8852BT when that kind of
    firmware was downloaded before.
    
        rtw89_8851be 0000:02:00.0: Firmware version 0.29.41.3, cmd version 0, type 5
        rtw89_8851be 0000:02:00.0: Firmware version 0.29.41.3, cmd version 0, type 3
        rtw89_8851be 0000:02:00.0: MAC has already powered on
        rtw89_8851be 0000:02:00.0: fw security fail
        rtw89_8851be 0000:02:00.0: download firmware fail
        rtw89_8851be 0000:02:00.0: [ERR]fwdl 0x1E0 = 0x62
        rtw89_8851be 0000:02:00.0: [ERR]fwdl 0x83F2 = 0x8
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f51c
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f524
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f51c
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f500
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f51c
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f53c
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f520
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f520
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f508
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f534
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f520
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f534
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f508
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f53c
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f524
        rtw89_8851be 0000:02:00.0: failed to setup chip information
        rtw89_8851be: probe of 0000:02:00.0 failed with error -16
    
    Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
    Link: https://patch.msgid.link/20240724052626.12774-4-pkshih@realtek.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: rtw89: avoid reading out of bounds when loading TX power FW elements [+ + +]
Author: Zong-Zhe Yang <kevin_yang@realtek.com>
Date:   Mon Sep 2 09:58:03 2024 +0800

    wifi: rtw89: avoid reading out of bounds when loading TX power FW elements
    
    [ Upstream commit ed2e4bb17a4884cf29c3347353d8aabb7265b46c ]
    
    Because the loop-expression runs once more before the cond-expression
    evaluates to false, the original code copied one entry beyond the valid
    region.
    
    Fix it by moving the entry copy into the loop body.
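
    The pattern can be shown with a generic copy loop. This is only an
    illustration of the loop shape, not the rtw89 code; copy_entries() and
    its parameters are made up for the example.

```c
#include <stddef.h>

/*
 * Buggy shape: the read sits in the loop-expression, so it runs once more
 * (with idx == n) before the condition is re-checked and found false.
 *
 *     for (cur = src[0]; idx < n; cur = src[++idx])   // reads src[n]
 *         dst[idx] = cur;
 */

/* Fixed shape: the copy lives in the loop body and only runs while idx < n. */
static size_t copy_entries(int *dst, const int *src, size_t n)
{
	size_t reads = 0;
	size_t idx;

	for (idx = 0; idx < n; idx++) {
		dst[idx] = src[idx];
		reads++;
	}
	return reads;   /* number of source entries actually read */
}
```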
    
    Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
    Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
    Link: https://patch.msgid.link/20240902015803.20420-1-pkshih@realtek.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: rtw89: avoid to add interface to list twice when SER [+ + +]
Author: Chih-Kang Chang <gary.chang@realtek.com>
Date:   Wed Jul 31 15:05:04 2024 +0800

    wifi: rtw89: avoid to add interface to list twice when SER
    
    [ Upstream commit 7dd5d2514a8ea58f12096e888b0bd050d7eae20a ]
    
    If SER L2 occurs during the WoWLAN resume flow, the add interface flow
    is triggered by ieee80211_reconfig(). However, because
    rtw89_wow_resume() returns failure, the add interface flow is executed
    again, resulting in a double list add and causing a kernel panic.
    Therefore, add a check to prevent adding the interface to the list
    twice.
    
    list_add double add: new=ffff99d6992e2010, prev=ffff99d6992e2010, next=ffff99d695302628.
    ------------[ cut here ]------------
    kernel BUG at lib/list_debug.c:37!
    invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
    CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W  O       6.6.30-02659-gc18865c4dfbd #1 770df2933251a0e3c888ba69d1053a817a6376a7
    Hardware name: HP Grunt/Grunt, BIOS Google_Grunt.11031.169.0 06/24/2021
    Workqueue: events_freezable ieee80211_restart_work [mac80211]
    RIP: 0010:__list_add_valid_or_report+0x5e/0xb0
    Code: c7 74 18 48 39 ce 74 13 b0 01 59 5a 5e 5f 41 58 41 59 41 5a 5d e9 e2 d6 03 00 cc 48 c7 c7 8d 4f 17 83 48 89 c2 e8 02 c0 00 00 <0f> 0b 48 c7 c7 aa 8c 1c 83 e8 f4 bf 00 00 0f 0b 48 c7 c7 c8 bc 12
    RSP: 0018:ffffa91b8007bc50 EFLAGS: 00010246
    RAX: 0000000000000058 RBX: ffff99d6992e0900 RCX: a014d76c70ef3900
    RDX: ffffa91b8007bae8 RSI: 00000000ffffdfff RDI: 0000000000000001
    RBP: ffffa91b8007bc88 R08: 0000000000000000 R09: ffffa91b8007bae0
    R10: 00000000ffffdfff R11: ffffffff83a79800 R12: ffff99d695302060
    R13: ffff99d695300900 R14: ffff99d6992e1be0 R15: ffff99d6992e2010
    FS:  0000000000000000(0000) GS:ffff99d6aac00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000078fbdba43480 CR3: 000000010e464000 CR4: 00000000001506f0
    Call Trace:
     <TASK>
     ? __die_body+0x1f/0x70
     ? die+0x3d/0x60
     ? do_trap+0xa4/0x110
     ? __list_add_valid_or_report+0x5e/0xb0
     ? do_error_trap+0x6d/0x90
     ? __list_add_valid_or_report+0x5e/0xb0
     ? handle_invalid_op+0x30/0x40
     ? __list_add_valid_or_report+0x5e/0xb0
     ? exc_invalid_op+0x3c/0x50
     ? asm_exc_invalid_op+0x16/0x20
     ? __list_add_valid_or_report+0x5e/0xb0
     rtw89_ops_add_interface+0x309/0x310 [rtw89_core 7c32b1ee6854761c0321027c8a58c5160e41f48f]
     drv_add_interface+0x5c/0x130 [mac80211 83e989e6e616bd5b4b8a2b0a9f9352a2c385a3bc]
     ieee80211_reconfig+0x241/0x13d0 [mac80211 83e989e6e616bd5b4b8a2b0a9f9352a2c385a3bc]
     ? finish_wait+0x3e/0x90
     ? synchronize_rcu_expedited+0x174/0x260
     ? sync_rcu_exp_done_unlocked+0x50/0x50
     ? wake_bit_function+0x40/0x40
     ieee80211_restart_work+0xf0/0x140 [mac80211 83e989e6e616bd5b4b8a2b0a9f9352a2c385a3bc]
     process_scheduled_works+0x1e5/0x480
     worker_thread+0xea/0x1e0
     kthread+0xdb/0x110
     ? move_linked_works+0x90/0x90
     ? kthread_associate_blkcg+0xa0/0xa0
     ret_from_fork+0x3b/0x50
     ? kthread_associate_blkcg+0xa0/0xa0
     ret_from_fork_asm+0x11/0x20
     </TASK>
    Modules linked in: dm_integrity async_xor xor async_tx lz4 lz4_compress zstd zstd_compress zram zsmalloc rfcomm cmac uinput algif_hash algif_skcipher af_alg btusb btrtl iio_trig_hrtimer industrialio_sw_trigger btmtk industrialio_configfs btbcm btintel uvcvideo videobuf2_vmalloc iio_trig_sysfs videobuf2_memops videobuf2_v4l2 videobuf2_common uvc snd_hda_codec_hdmi veth snd_hda_intel snd_intel_dspcfg acpi_als snd_hda_codec industrialio_triggered_buffer kfifo_buf snd_hwdep industrialio i2c_piix4 snd_hda_core designware_i2s ip6table_nat snd_soc_max98357a xt_MASQUERADE xt_cgroup snd_soc_acp_rt5682_mach fuse rtw89_8922ae(O) rtw89_8922a(O) rtw89_pci(O) rtw89_core(O) 8021q mac80211(O) bluetooth ecdh_generic ecc cfg80211 r8152 mii joydev
    gsmi: Log Shutdown Reason 0x03
    ---[ end trace 0000000000000000 ]---
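
    One way to guard against a double add is sketched below with a minimal
    userspace list. It assumes an unlinked node points to itself; the
    helper names are hypothetical and the driver's actual check may be
    implemented differently.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list, similar in spirit to list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/* A node that still points to itself is not on any list. */
static bool list_unlinked(const struct list_node *n)
{
	return n->next == n;
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical helper: add only if the node is not already on a list. */
static bool add_interface_once(struct list_node *iface, struct list_node *head)
{
	if (!list_unlinked(iface))
		return false;   /* already listed: skip the second add */
	list_add_tail_node(iface, head);
	return true;
}
```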
    
    Signed-off-by: Chih-Kang Chang <gary.chang@realtek.com>
    Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
    Link: https://patch.msgid.link/20240731070506.46100-4-pkshih@realtek.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: rtw89: correct base HT rate mask for firmware [+ + +]
Author: Ping-Ke Shih <pkshih@realtek.com>
Date:   Fri Aug 9 15:20:10 2024 +0800

    wifi: rtw89: correct base HT rate mask for firmware
    
    [ Upstream commit 45742881f9eee2a4daeb6008e648a460dd3742cd ]
    
    Coverity reported that in "u8 rx_mask << 24" the operand is promoted to a
    signed 32-bit int, so casting the result to an unsigned 64-bit type
    performs sign extension. For example, storing 0x80000000 (signed 32 bits)
    in a u64 variable yields 0xFFFFFFFF_80000000.
    
    The real case we hit is:
      rx_mask[0...3] = ff ff 00 00
      ra_mask = 0xffffffff_ff0ff000
    
    After this fix:
      rx_mask[0...3] = ff ff 00 00
      ra_mask = 0x00000000_ff0ff000
    
    Fortunately, the driver afterward does a bitwise AND of the incorrect
    ra_mask with the supported rates (1ss and 2ss rates only), so the final
    rate mask of the original code is still correct.
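
    The sign extension can be reproduced in a few lines of userspace C. The
    function names are made up for the example, and the buggy variant
    emulates the int promotion through an explicit int32_t conversion
    (assuming the usual two's-complement wraparound) so that the sketch
    itself avoids shifting into the sign bit of a signed int.

```c
#include <stdint.h>

/* Buggy shape: the shifted value passes through a signed 32-bit type. */
static uint64_t ht_mask_buggy(uint8_t rx_mask)
{
	int32_t shifted = (int32_t)((uint32_t)rx_mask << 24);

	return (uint64_t)shifted;        /* sign-extends when bit 31 is set */
}

/* Fixed shape: widen to 64 bits before shifting. */
static uint64_t ht_mask_fixed(uint8_t rx_mask)
{
	return (uint64_t)rx_mask << 24;
}
```

    For rx_mask = 0xff the buggy variant yields 0xffffffff_ff000000 and the
    fixed one yields 0x00000000_ff000000; for values with the top bit clear,
    such as 0x7f, both agree.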
    
    Addresses-Coverity-ID: 1504762 ("Unintended sign extension")
    
    Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
    Link: https://patch.msgid.link/20240809072012.84152-5-pkshih@realtek.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: wilc1000: Do not operate uninitialized hardware during suspend/resume [+ + +]
Author: Marek Vasut <marex@denx.de>
Date:   Wed Aug 21 20:36:03 2024 +0200

    wifi: wilc1000: Do not operate uninitialized hardware during suspend/resume
    
    [ Upstream commit b0dc7018477e8fbb7e40c908c29cf663d06b17a7 ]
    
    In case the hardware is not initialized, do not operate it during the
    suspend/resume cycle; the hardware is already off, so there is no
    reason to access it.
    
    In fact, wilc_sdio_enable_interrupt() in the resume callback does
    interfere with the same call when initializing the hardware after
    resume and makes such initialization after resume fail. Fix this
    by not operating uninitialized hardware during suspend/resume.
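
    A minimal sketch of such a guard, assuming a hypothetical device struct
    with an "initialized" flag; the field and function names are
    illustrative and are not taken from the wilc1000 driver.

```c
#include <stdbool.h>

/* Hypothetical device state; the field names are illustrative only. */
struct wifi_dev {
	bool initialized;        /* set once the hardware has been brought up */
	int irq_enable_calls;    /* counts hardware accesses for the example */
};

static int dev_resume(struct wifi_dev *dev)
{
	if (!dev->initialized)
		return 0;            /* hardware is off: nothing to restore */

	dev->irq_enable_calls++; /* stands in for re-enabling interrupts */
	return 0;
}
```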
    
    Signed-off-by: Marek Vasut <marex@denx.de>
    Reviewed-by: Alexis Lothoré <alexis.lothore@bootlin.com>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://patch.msgid.link/20240821183639.163187-1-marex@denx.de
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
x86/apic: Remove logical destination mode for 64-bit [+ + +]
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sun Jul 28 13:06:10 2024 +0200

    x86/apic: Remove logical destination mode for 64-bit
    
    [ Upstream commit 838ba7733e4e3a94a928e8d0a058de1811a58621 ]
    
    Logical destination mode of the local APIC is used for systems with up to
    8 CPUs. It has an advantage over physical destination mode because it
    allows targeting multiple CPUs at once with IPIs.
    
    That advantage was definitely worth it when systems with up to 8 CPUs
    were state of the art for servers and workstations, but that's history.
    
    Aside from that, there are systems which fail to work with logical
    destination mode, as the ACPI/DMI quirks show, and there are AMD Zen1
    systems out there which fail when interrupt remapping is enabled, as
    reported by Rob and Christian. The latter problem can be cured by firmware
    updates, but not all OEMs distribute the required changes.
    
    Physical destination mode is guaranteed to work because it is the only way
    to get a CPU up and running via the INIT/INIT/STARTUP sequence.
    
    As the number of CPUs keeps increasing, logical destination mode becomes a
    less used code path so there is no real good reason to keep it around.
    
    Therefore remove logical destination mode support for 64-bit and default to
    physical destination mode.
    
    Reported-by: Rob Newcater <rob@durendal.co.uk>
    Reported-by: Christian Heusel <christian@heusel.eu>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
    Tested-by: Rob Newcater <rob@durendal.co.uk>
    Link: https://lore.kernel.org/all/877cd5u671.ffs@tglx
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
x86/bugs: Add missing NO_SSB flag [+ + +]
Author: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Date:   Thu Aug 29 12:24:37 2024 -0700

    x86/bugs: Add missing NO_SSB flag
    
    [ Upstream commit 23e12b54acf621f4f03381dca91cc5f1334f21fd ]
    
    The Moorefield and Lightning Mountain Atom processors are
    missing the NO_SSB flag in the vulnerabilities whitelist.
    This will cause unaffected parts to incorrectly be reported
    as vulnerable. Add the missing flag.
    
    These parts are currently out of service and were verified
    internally with archived documentation that they need the
    NO_SSB flag.
    
    Closes: https://lore.kernel.org/lkml/CAEJ9NQdhh+4GxrtG1DuYgqYhvc0hi-sKZh-2niukJ-MyFLntAA@mail.gmail.com/
    Reported-by: Shanavas.K.S <shanavasks@gmail.com>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Link: https://lore.kernel.org/r/20240829192437.4074196-1-daniel.sneddon@linux.intel.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/bugs: Fix handling when SRSO mitigation is disabled [+ + +]
Author: David Kaplan <david.kaplan@amd.com>
Date:   Wed Sep 4 10:07:11 2024 -0500

    x86/bugs: Fix handling when SRSO mitigation is disabled
    
    [ Upstream commit 1dbb6b1495d472806fef1f4c94f5b3e4c89a3c1d ]
    
    When the SRSO mitigation is disabled, either via mitigations=off or
    spec_rstack_overflow=off, the warning about the lack of IBPB-enhancing
    microcode is printed anyway.
    
    This is unnecessary since the user has turned off the mitigation.
    
      [ bp: Massage, drop SBPB rationale as it doesn't matter because when
        mitigations are disabled x86_pred_cmd is not being used anyway. ]
    
    Signed-off-by: David Kaplan <david.kaplan@amd.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Link: https://lore.kernel.org/r/20240904150711.193022-1-david.kaplan@amd.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
x86/ioapic: Handle allocation failures gracefully [+ + +]
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Fri Aug 2 18:15:34 2024 +0200

    x86/ioapic: Handle allocation failures gracefully
    
    [ Upstream commit 830802a0fea8fb39d3dc9fb7d6b5581e1343eb1f ]
    
    Breno observed panics when using failslab under certain conditions during
    runtime:
    
       can not alloc irq_pin_list (-1,0,20)
       Kernel panic - not syncing: IO-APIC: failed to add irq-pin. Can not proceed
    
       panic+0x4e9/0x590
       mp_irqdomain_alloc+0x9ab/0xa80
       irq_domain_alloc_irqs_locked+0x25d/0x8d0
       __irq_domain_alloc_irqs+0x80/0x110
       mp_map_pin_to_irq+0x645/0x890
       acpi_register_gsi_ioapic+0xe6/0x150
       hpet_open+0x313/0x480
    
    That's a pointless panic which is a leftover of the historic IO/APIC code
    which panic'ed during early boot when the interrupt allocation failed.
    
    The only place which might justify a panic is the PIT/HPET timer_check()
    code which tries to figure out whether the timer interrupt is delivered
    through the IO/APIC. But that code does not need to handle interrupt
    allocation failures. If the interrupt cannot be allocated then timer
    delivery fails and it either panics due to that or falls back to legacy
    mode.
    
    Cure this by removing the panic wrapper around __add_pin_to_irq_node() and
    making mp_irqdomain_alloc() aware of the failure condition and handle it as
    any other failure in this function gracefully.
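
    The shape of the change, returning an error to the caller instead of
    panicking, could be sketched like this. The struct, the function names
    and the fail_alloc parameter (which simulates an injected allocation
    failure) are assumptions for the example, not the kernel's code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical pin list entry; not the kernel's irq_pin_list layout. */
struct pin_entry {
	int apic;
	int pin;
	struct pin_entry *next;
};

/*
 * Returns 0 on success or -ENOMEM. fail_alloc simulates an injected
 * allocation failure (the failslab case from the report).
 */
static int add_pin(struct pin_entry **head, int apic, int pin, int fail_alloc)
{
	struct pin_entry *entry = fail_alloc ? NULL : malloc(sizeof(*entry));

	if (!entry)
		return -ENOMEM;   /* report the failure instead of panicking */
	entry->apic = apic;
	entry->pin = pin;
	entry->next = *head;
	*head = entry;
	return 0;
}

/* The caller treats the failure like any other error on its path. */
static int domain_alloc(struct pin_entry **head, int apic, int pin, int fail_alloc)
{
	int ret = add_pin(head, apic, pin, fail_alloc);

	if (ret)
		return ret;       /* nothing was linked, so nothing to unwind */
	return 0;
}
```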
    
    Reported-by: Breno Leitao <leitao@debian.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Breno Leitao <leitao@debian.org>
    Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    Link: https://lore.kernel.org/all/ZqfJmUF8sXIyuSHN@gmail.com
    Link: https://lore.kernel.org/all/20240802155440.275200843@linutronix.de
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
x86/kexec: Add EFI config table identity mapping for kexec kernel [+ + +]
Author: Tao Liu <ltao@redhat.com>
Date:   Wed Jul 17 16:31:20 2024 -0500

    x86/kexec: Add EFI config table identity mapping for kexec kernel
    
    [ Upstream commit 5760929f6545c651682de3c2c6c6786816b17bb1 ]
    
    A kexec kernel boot failure is sometimes observed on AMD CPUs due to an
    unmapped EFI config table array.  This can be seen when "nogbpages" is on
    the kernel command line, and has been observed as a full BIOS reboot rather
    than a successful kexec.
    
    This was also the cause of reported regressions attributed to Commit
    7143c5f4cf20 ("x86/mm/ident_map: Use gbpages only where full GB page should
    be mapped.") which was subsequently reverted.
    
    To avoid this page fault, explicitly include the EFI config table array in
    the kexec identity map.
    
    Further explanation:
    
    The following 2 commits caused the EFI config table array to be
    accessed when enabling SEV at kernel startup.
    
        commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
                              earlier during boot")
        commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
                              detection/setup")
    
    This is in the code that examines whether SEV should be enabled or not, so
    it can even affect systems that are not SEV capable.
    
    This may result in a page fault if the EFI config table array's address is
    unmapped. Since the page fault occurs before the new kernel establishes its
    own identity map and page fault routines, it is unrecoverable and kexec
    fails.
    
    Most often, this problem is not seen because the EFI config table array
    gets included in the map by the luck of being placed at a memory address
    close enough to other memory areas that *are* included in the map created
    by kexec.
    
    Both the "nogbpages" command line option and the "use gbpages only where
    full GB page should be mapped" change greatly reduce the chance of being
    included in the map by luck, which is why the problem appears.
    
    Signed-off-by: Tao Liu <ltao@redhat.com>
    Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Pavin Joseph <me@pavinjoseph.com>
    Tested-by: Sarah Brofeldt <srhb@dbc.dk>
    Tested-by: Eric Hagberg <ehagberg@gmail.com>
    Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
    Link: https://lore.kernel.org/all/20240717213121.3064030-2-steve.wahl@hpe.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
x86/mm/ident_map: Use gbpages only where full GB page should be mapped. [+ + +]
Author: Steve Wahl <steve.wahl@hpe.com>
Date:   Wed Jul 17 16:31:21 2024 -0500

    x86/mm/ident_map: Use gbpages only where full GB page should be mapped.
    
    [ Upstream commit cc31744a294584a36bf764a0ffa3255a8e69f036 ]
    
    When ident_pud_init() uses only GB pages to create identity maps, large
    ranges of addresses not actually requested can be included in the resulting
    table; a 4K request will map a full GB.  This can include a lot of extra
    address space past that requested, including areas marked reserved by the
    BIOS.  That allows processor speculation into reserved regions, which on
    UV systems can cause system halts.
    
    Only use GB pages when map creation requests include the full GB page of
    space.  Fall back to using smaller 2M pages when only portions of a GB page
    are included in the request.
    
    No attempt is made to coalesce mapping requests. If a request requires a
    map entry at the 2M (pmd) level, subsequent mapping requests within the
    same 1G region will also be at the pmd level, even if adjacent or
    overlapping such requests could have been combined to map a full GB page.
    Existing usage starts with larger regions and then adds smaller regions, so
    this should not have any great consequence.
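
    The decision rule can be sketched as a small predicate: use a GB page
    only when the whole 1G-aligned region lies inside the requested range.
    The helper name covers_full_gb() and its parameters are assumptions for
    illustration; the real ident_pud_init() logic may differ in detail, and
    the sketch ignores overflow at the very top of the address space.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_1G (1ULL << 30)

/* True when the 1G-aligned region containing addr lies inside [start, end). */
static bool covers_full_gb(uint64_t addr, uint64_t start, uint64_t end)
{
	uint64_t gb_start = addr & ~(SZ_1G - 1);
	uint64_t gb_end = gb_start + SZ_1G;

	return start <= gb_start && gb_end <= end;
}
```

    A 4K request such as [0x1000, 0x2000) fails the test and would fall back
    to smaller pages, while a request spanning exactly [1G, 2G) passes.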
    
    Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Pavin Joseph <me@pavinjoseph.com>
    Tested-by: Sarah Brofeldt <srhb@dbc.dk>
    Tested-by: Eric Hagberg <ehagberg@gmail.com>
    Link: https://lore.kernel.org/all/20240717213121.3064030-3-steve.wahl@hpe.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
x86/pkeys: Add PKRU as a parameter in signal handling functions [+ + +]
Author: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
Date:   Fri Aug 2 06:13:14 2024 +0000

    x86/pkeys: Add PKRU as a parameter in signal handling functions
    
    [ Upstream commit 24cf2bc982ffe02aeffb4a3885c71751a2c7023b ]
    
    Assume there's a multithreaded application that runs untrusted user
    code. Each thread has its stack/code protected by a non-zero PKEY, and the
    PKRU register is set up such that only that particular non-zero PKEY is
    enabled. Each thread also sets up an alternate signal stack to handle
    signals, which is protected by PKEY zero. The PKEYs man page documents that
    the PKRU will be reset to init_pkru when the signal handler is invoked,
    which means that PKEY zero access will be enabled.  But this reset happens
    after the kernel attempts to push fpu state to the alternate stack, which
    is not (yet) accessible by the kernel, which leads to a new SIGSEGV being
    sent to the application, terminating it.
    
    Enabling both the non-zero PKEY (for the thread) and PKEY zero in
    userspace will not work for this use case. It cannot have the alt stack
    writeable by all - the rationale here is that the code running in that
    thread (using a non-zero PKEY) is untrusted and should not have access
    to the alternate signal stack (that uses PKEY zero), to prevent the
    return address of a function from being changed. The expectation is that
    the kernel should be able to set up the alternate signal stack and deliver
    the signal to the application even if PKEY zero is explicitly disabled
    by the application. The signal handler accessibility should not be
    dictated by whatever PKRU value the thread sets up.
    
    The PKRU register is managed by XSAVE, which means the sigframe contents
    must match the register contents - which is not the case here. It's
    required that the signal frame contains the user-defined PKRU value (so
    that it is restored correctly from sigcontext) but the actual register must
    be reset to init_pkru so that the alt stack is accessible and the signal
    can be delivered to the application. It seems that the proper fix here
    would be to remove PKRU from the XSAVE framework and manage it separately,
    which is quite complicated. As a workaround, do this:
    
            orig_pkru = rdpkru();
            wrpkru(orig_pkru & init_pkru_value);
            xsave_to_user_sigframe();
            put_user(orig_pkru, pkru_sigframe_addr);
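
    The effect of the "orig_pkru & init_pkru_value" step can be checked with
    plain integer arithmetic. The sketch assumes the usual PKRU layout of
    two bits per key with the access-disable bit first, and assumes the
    default init_pkru_value of 0x55555554 (access disabled for keys 1..15,
    key 0 open); the helper names are made up for the example.

```c
#include <stdint.h>

/* Two bits per key; the low bit of each pair is access-disable (AD). */
#define PKRU_AD_BIT        0x1u
#define PKRU_BITS_PER_PKEY 2

/* Assumed default init_pkru_value: AD set for keys 1..15, key 0 open. */
#define INIT_PKRU_VALUE    0x55555554u

/* PKRU value used while the kernel writes the signal frame. */
static uint32_t pkru_for_sigframe_write(uint32_t orig_pkru)
{
	return orig_pkru & INIT_PKRU_VALUE;
}

/* Nonzero if the given key is readable under this PKRU value. */
static int pkey_accessible(uint32_t pkru, int pkey)
{
	return !(pkru & (PKRU_AD_BIT << (pkey * PKRU_BITS_PER_PKEY)));
}
```

    For a thread that enabled only key 1 (orig_pkru = 0xfffffff3), the
    masked value is 0x55555550: key 0 becomes accessible for the altstack
    write, key 1 stays accessible, and all other keys stay disabled.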
    
    In preparation for writing PKRU to sigframe, pass PKRU as an additional
    parameter down the call chain from get_sigframe().
    
    No functional change.
    
    Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Link: https://lore.kernel.org/all/20240802061318.2140081-2-aruna.ramakrishna@oracle.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/pkeys: Restore altstack access in sigreturn() [+ + +]
Author: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
Date:   Fri Aug 2 06:13:17 2024 +0000

    x86/pkeys: Restore altstack access in sigreturn()
    
    [ Upstream commit d10b554919d4cc8fa8fe2e95b57ad2624728c8e4 ]
    
    A process can disable access to the alternate signal stack by not
    enabling the altstack's PKEY in the PKRU register.
    
    Nevertheless, the kernel updates the PKRU temporarily for signal
    handling. However, in sigreturn(), restore_sigcontext() will restore the
    PKRU to the user-defined PKRU value.
    
    This will cause restore_altstack() to fail with a SIGSEGV as it needs read
    access to the altstack which is prohibited by the user-defined PKRU value.
    
    Fix this by restoring altstack before restoring PKRU.
    
    Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Link: https://lore.kernel.org/all/20240802061318.2140081-5-aruna.ramakrishna@oracle.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
x86/syscall: Avoid memcpy() for ia32 syscall_get_arguments() [+ + +]
Author: Kees Cook <kees@kernel.org>
Date:   Mon Jul 8 13:22:06 2024 -0700

    x86/syscall: Avoid memcpy() for ia32 syscall_get_arguments()
    
    [ Upstream commit d19d638b1e6cf746263ef60b7d0dee0204d8216a ]
    
    Modern (fortified) memcpy() prefers to avoid writing (or reading) beyond
    the end of the addressed destination (or source) struct member:
    
    In function ‘fortify_memcpy_chk’,
        inlined from ‘syscall_get_arguments’ at ./arch/x86/include/asm/syscall.h:85:2,
        inlined from ‘populate_seccomp_data’ at kernel/seccomp.c:258:2,
        inlined from ‘__seccomp_filter’ at kernel/seccomp.c:1231:3:
    ./include/linux/fortify-string.h:580:25: error: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning]
      580 |                         __read_overflow2_field(q_size_field, size);
          |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    As already done for x86_64 and compat mode, do not use memcpy() to
    extract syscall arguments from struct pt_regs but rather just perform
    direct assignments. Binary output differences are negligible, and the
    result actually ends up using less stack space:
    
    -       sub    $0x84,%esp
    +       sub    $0x6c,%esp
    
    and less text size:
    
       text    data     bss     dec     hex filename
      10794     252       0   11046    2b26 gcc-32b/kernel/seccomp.o.stock
      10714     252       0   10966    2ad6 gcc-32b/kernel/seccomp.o.after
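
    The direct-assignment approach can be sketched as below. struct regs32
    is a trimmed, hypothetical stand-in for the 32-bit struct pt_regs (the
    field order is an assumption), and get_args() is not the kernel's
    syscall_get_arguments().

```c
/* Hypothetical, trimmed-down register frame for the example. */
struct regs32 {
	unsigned long bx, cx, dx, si, di, bp, ax;
};

/*
 * Copy the six syscall arguments field by field. A single
 * memcpy(args, &regs->bx, 6 * sizeof(args[0])) would read past the end
 * of the bx member, which is what fortified memcpy() complains about.
 */
static void get_args(const struct regs32 *regs, unsigned long args[6])
{
	args[0] = regs->bx;
	args[1] = regs->cx;
	args[2] = regs->dx;
	args[3] = regs->si;
	args[4] = regs->di;
	args[5] = regs->bp;
}
```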
    
    Closes: https://lore.kernel.org/lkml/9b69fb14-df89-4677-9c82-056ea9e706f5@gmail.com/
    Reported-by: Mirsad Todorovac <mtodorovac69@gmail.com>
    Signed-off-by: Kees Cook <kees@kernel.org>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
    Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
    Tested-by: Mirsad Todorovac <mtodorovac69@gmail.com>
    Link: https://lore.kernel.org/all/20240708202202.work.477-kees%40kernel.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>