Changelog in Linux kernel 6.1.121

 
acpi: nfit: vmalloc-out-of-bounds Read in acpi_nfit_ctl [+ + +]
Author: Suraj Sonawane <surajsonawane0215@gmail.com>
Date:   Mon Nov 18 21:56:09 2024 +0530

    acpi: nfit: vmalloc-out-of-bounds Read in acpi_nfit_ctl
    
    [ Upstream commit 265e98f72bac6c41a4492d3e30a8e5fd22fe0779 ]
    
    Fix an issue detected by syzbot with KASAN:
    
    BUG: KASAN: vmalloc-out-of-bounds in cmd_to_func
    drivers/acpi/nfit/core.c:416 [inline]
    BUG: KASAN: vmalloc-out-of-bounds in acpi_nfit_ctl+0x20e8/0x24a0
    drivers/acpi/nfit/core.c:459
    
    The issue occurs in cmd_to_func when the call_pkg->nd_reserved2
    array is accessed without verifying that call_pkg points to a buffer
    that is appropriately sized as a struct nd_cmd_pkg. This can lead
    to out-of-bounds access and undefined behavior if the buffer does not
    have sufficient space.
    
    To address this, a check was added in acpi_nfit_ctl() to ensure that
    buf is not NULL and that buf_len is not less than sizeof(*call_pkg)
    before accessing it. This ensures safe access to the members of
    call_pkg, including the nd_reserved2 array.
    
    Reported-by: syzbot+7534f060ebda6b8b51b3@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=7534f060ebda6b8b51b3
    Tested-by: syzbot+7534f060ebda6b8b51b3@syzkaller.appspotmail.com
    Fixes: ebe9f6f19d80 ("acpi/nfit: Fix bus command validation")
    Signed-off-by: Suraj Sonawane <surajsonawane0215@gmail.com>
    Reviewed-by: Alison Schofield <alison.schofield@intel.com>
    Reviewed-by: Dave Jiang <dave.jiang@intel.com>
    Link: https://patch.msgid.link/20241118162609.29063-1-surajsonawane0215@gmail.com
    Signed-off-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
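
The size check described in this entry can be sketched in userspace C. This is a minimal illustration, not the kernel code: struct pkg_hdr and pkg_command() are hypothetical stand-ins for struct nd_cmd_pkg and cmd_to_func(), and the field layout is assumed for the example only.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct nd_cmd_pkg; layout is illustrative. */
struct pkg_hdr {
	uint64_t family;
	uint64_t command;
	uint32_t size_in;
	uint32_t size_out;
	uint32_t reserved2[9];
};

/*
 * Store the command number in *cmd and return 0, or return -EINVAL when
 * buf is missing, too small to hold a header, or has reserved bits set.
 */
int pkg_command(const void *buf, size_t buf_len, uint64_t *cmd)
{
	const struct pkg_hdr *pkg = buf;
	size_t i;

	/* Reject the buffer before any header field is dereferenced. */
	if (!buf || buf_len < sizeof(*pkg))
		return -EINVAL;

	for (i = 0; i < sizeof(pkg->reserved2) / sizeof(pkg->reserved2[0]); i++)
		if (pkg->reserved2[i])
			return -EINVAL;

	*cmd = pkg->command;
	return 0;
}
```

The point of the ordering is that the length comparison runs before the first read of reserved2, so a short buffer is never indexed.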

 
ACPI: resource: Fix memory resource type union access [+ + +]
Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Date:   Mon Dec 2 12:06:13 2024 +0200

    ACPI: resource: Fix memory resource type union access
    
    [ Upstream commit 7899ca9f3bd2b008e9a7c41f2a9f1986052d7e96 ]
    
    In acpi_decode_space() addr->info.mem.caching is checked on main level
    for any resource type but addr->info.mem is part of union and thus
    valid only if the resource type is memory range.
    
    Move the check inside the preceding switch/case to only execute it
    when the union is of correct type.
    
    Fixes: fcb29bbcd540 ("ACPI: Add prefetch decoding to the address space parser")
    Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Link: https://patch.msgid.link/20241202100614.20731-1-ilpo.jarvinen@linux.intel.com
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
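
A small sketch of the pattern this entry describes: a union member is read only inside the switch case that proves it is the active one. All names here (struct addr_res, decode_flags(), the FLAG_* and CACHE_PREFETCHABLE values) are hypothetical and do not mirror the ACPI structures.

```c
enum res_type { RES_MEMORY, RES_IO };

/* Hypothetical tagged union; only the member matching 'type' is valid. */
struct addr_res {
	enum res_type type;
	union {
		struct { int caching; } mem;
		struct { int range_type; } io;
	} info;
};

#define CACHE_PREFETCHABLE 3	/* assumed value, for illustration */
#define FLAG_MEM	0x1u
#define FLAG_IO		0x2u
#define FLAG_PREFETCH	0x4u

unsigned int decode_flags(const struct addr_res *r)
{
	unsigned int flags = 0;

	switch (r->type) {
	case RES_MEMORY:
		flags |= FLAG_MEM;
		/* info.mem is only meaningful for memory ranges. */
		if (r->info.mem.caching == CACHE_PREFETCHABLE)
			flags |= FLAG_PREFETCH;
		break;
	case RES_IO:
		flags |= FLAG_IO;
		break;
	}
	return flags;
}
```

With the caching test at the top level instead, an I/O resource whose union bytes happen to equal the prefetchable value would wrongly gain the prefetch flag.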

 
ACPICA: events/evxfregn: don't release the ContextMutex that was never acquired [+ + +]
Author: Daniil Tatianin <d-tatianin@yandex-team.ru>
Date:   Fri Nov 22 11:29:54 2024 +0300

    ACPICA: events/evxfregn: don't release the ContextMutex that was never acquired
    
    [ Upstream commit c53d96a4481f42a1635b96d2c1acbb0a126bfd54 ]
    
    This bug was first introduced in c27f3d011b08, where the author of the
    patch probably meant to do DeleteMutex instead of ReleaseMutex. The
    mutex leak was noticed later on and fixed in e4dfe108371, but the bogus
    MutexRelease line was never removed, so do it now.
    
    Link: https://github.com/acpica/acpica/pull/982
    Fixes: c27f3d011b08 ("ACPICA: Fix race in generic_serial_bus (I2C) and GPIO op_region parameter handling")
    Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
    Link: https://patch.msgid.link/20241122082954.658356-1-d-tatianin@yandex-team.ru
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ALSA: usb-audio: Add implicit feedback quirk for Yamaha THR5 [+ + +]
Author: Jaakko Salo <jaakkos@gmail.com>
Date:   Fri Dec 6 18:44:48 2024 +0200

    ALSA: usb-audio: Add implicit feedback quirk for Yamaha THR5
    
    commit 82fdcf9b518b205da040046fbe7747fb3fd18657 upstream.
    
    Use implicit feedback from the capture endpoint to fix popping
    sounds during playback.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219567
    Signed-off-by: Jaakko Salo <jaakkos@gmail.com>
    Cc: <stable@vger.kernel.org>
    Link: https://patch.msgid.link/20241206164448.8136-1-jaakkos@gmail.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: usb-audio: Fix a DMA to stack memory bug [+ + +]
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date:   Mon Dec 2 15:57:54 2024 +0300

    ALSA: usb-audio: Fix a DMA to stack memory bug
    
    commit f7d306b47a24367302bd4fe846854e07752ffcd9 upstream.
    
    The usb_get_descriptor() function does DMA so we're not allowed
    to use a stack buffer for that.  Doing DMA to the stack is not portable
    to all architectures.  Move the "new_device_descriptor" from being
    stored on the stack and allocate it with kmalloc() instead.
    
    Fixes: b909df18ce2a ("ALSA: usb-audio: Fix potential out-of-bound accesses for Extigy and Mbox devices")
    Cc: stable@kernel.org
    Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
    Link: https://patch.msgid.link/60e3aa09-039d-46d2-934c-6f123026c2eb@stanley.mountain
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Benoît Sevens <bsevens@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
amdgpu/uvd: get ring reference from rq scheduler [+ + +]
Author: David (Ming Qiang) Wu <David.Wu3@amd.com>
Date:   Wed Dec 4 11:30:01 2024 -0500

    amdgpu/uvd: get ring reference from rq scheduler
    
    [ Upstream commit 47f402a3e08113e0f5d8e1e6fcc197667a16022f ]
    
    base.sched may not be set for each instance and should not
    be used for cases such as non-IB tests.
    
    Fixes: 2320c9e6a768 ("drm/sched: memset() 'job' in drm_sched_job_init()")
    Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ASoC: amd: yc: Fix the wrong return value [+ + +]
Author: Venkata Prasad Potturu <venkataprasad.potturu@amd.com>
Date:   Tue Dec 10 14:40:25 2024 +0530

    ASoC: amd: yc: Fix the wrong return value
    
    [ Upstream commit 984795e76def5c903724b8d6a8228e356bbdf2af ]
    
    With the current implementation, when the ACP driver fails to read
    the ACPI _WOV entry, the DMI override code is not invoked, which
    may cause regressions for some BIOS versions.
    
    Add a condition check to jump to the DMI entry check in case the
    ACP driver fails to read the ACPI _WOV method.
    
    Fixes: 4095cf872084 (ASoC: amd: yc: Fix for enabling DMIC on acp6x via _DSD entry)
    
    Signed-off-by: Venkata Prasad Potturu <venkataprasad.potturu@amd.com>
    Link: https://patch.msgid.link/20241210091026.996860-1-venkataprasad.potturu@amd.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ata: sata_highbank: fix OF node reference leak in highbank_initialize_phys() [+ + +]
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date:   Thu Dec 5 19:30:14 2024 +0900

    ata: sata_highbank: fix OF node reference leak in highbank_initialize_phys()
    
    commit 676fe1f6f74db988191dab5df3bf256908177072 upstream.
    
    The OF node reference obtained by of_parse_phandle_with_args() is not
    released on early return. Add a of_node_put() call before returning.
    
    Fixes: 8996b89d6bc9 ("ata: add platform driver for Calxeda AHCI controller")
    Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
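
The leak pattern in this entry can be modelled with a plain counter. This is a sketch under stated assumptions: struct node, node_get(), node_put(), init_phy() and MAX_PORTS are hypothetical, with node_get() standing in for the reference that the phandle lookup takes.

```c
/* Hypothetical refcounted node standing in for a device tree node. */
struct node {
	int refcount;
};

static void node_get(struct node *n) { n->refcount++; }
static void node_put(struct node *n) { n->refcount--; }

#define MAX_PORTS 8

/* Record 'lane' for 'port'; every exit path drops the lookup reference. */
int init_phy(struct node *n, unsigned int port,
	     unsigned char lanes[MAX_PORTS], unsigned char lane)
{
	node_get(n);		/* models the reference taken by the lookup */

	if (port >= MAX_PORTS) {
		node_put(n);	/* the early return must release it as well */
		return -1;
	}

	lanes[port] = lane;
	node_put(n);
	return 0;
}
```

After either return the counter is back at its starting value, which is the property the added of_node_put() call restores.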

 
batman-adv: Do not let TT changes list grows indefinitely [+ + +]
Author: Remi Pommarel <repk@triplefau.lt>
Date:   Fri Nov 22 16:52:50 2024 +0100

    batman-adv: Do not let TT changes list grows indefinitely
    
    [ Upstream commit fff8f17c1a6fc802ca23bbd3a276abfde8cc58e6 ]
    
    When the TT changes list is too big to fit in a packet due to MTU size,
    an empty OGM is sent, expecting other nodes to send a TT request to get
    the changes. The issue is that tt.last_changeset was not built, thus the
    originator was responding with previous changes to those TT requests
    (see batadv_send_my_tt_response). Also, the changes list was never
    cleaned up, so it effectively grew without bound from this point
    onwards, repeatedly sending the same TT response changes over and over,
    and creating a new empty OGM every OGM interval while waiting for the
    local changes to be purged.
    
    When there are more TT changes than can fit in a packet, drop all
    changes, send an empty OGM and wait for a TT request so we can respond
    with a full table instead.
    
    Fixes: e1bf0c14096f ("batman-adv: tvlv - convert tt data sent within OGMs")
    Signed-off-by: Remi Pommarel <repk@triplefau.lt>
    Acked-by: Antonio Quartulli <Antonio@mandelbit.com>
    Signed-off-by: Sven Eckelmann <sven@narfation.org>
    Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
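
The behaviour described in this entry, reduced to a sketch: the pending list is always emptied, and nothing is sent when the changes would not fit. struct change_list and take_changes() are hypothetical names, and the list is modelled as a bare counter rather than a linked list.

```c
#include <stddef.h>

/* Hypothetical model of the pending changes list: only its length. */
struct change_list {
	size_t count;
};

/*
 * Return how many changes go into the next OGM. When they do not fit in
 * the MTU, none are sent, but the list is still emptied so that it
 * cannot keep growing between OGM intervals.
 */
size_t take_changes(struct change_list *list, size_t entry_size, size_t mtu)
{
	size_t n = list->count;

	if (n * entry_size > mtu)
		n = 0;		/* too many: send an empty OGM instead */

	list->count = 0;	/* consumed in both cases */
	return n;
}
```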

batman-adv: Do not send uninitialized TT changes [+ + +]
Author: Remi Pommarel <repk@triplefau.lt>
Date:   Fri Nov 22 16:52:48 2024 +0100

    batman-adv: Do not send uninitialized TT changes
    
    [ Upstream commit f2f7358c3890e7366cbcb7512b4bc8b4394b2d61 ]
    
    The number of TT changes can be less than initially expected in
    batadv_tt_tvlv_container_update() (changes can be removed by
    batadv_tt_local_event() in ADD+DEL sequence between reading
    tt_diff_entries_num and actually iterating the change list under lock).
    
    Thus tt_diff_len could be bigger than the actual changes size that needs
    to be sent. Because batadv_send_my_tt_response sends the whole
    packet, uninitialized data can be interpreted as TT changes on other
    nodes leading to weird TT global entries on those nodes such as:
    
     * 00:00:00:00:00:00   -1 [....] (  0) 88:12:4e:ad:7e:ba (179) (0x45845380)
     * 00:00:00:00:78:79 4092 [.W..] (  0) 88:12:4e:ad:7e:3c (145) (0x8ebadb8b)
    
    All of the above also applies to OGM tvlv container buffer's tvlv_len.
    
    Remove the extra allocated space to avoid sending uninitialized TT
    changes in batadv_send_my_tt_response() and batadv_v_ogm_send_softif().
    
    Fixes: e1bf0c14096f ("batman-adv: tvlv - convert tt data sent within OGMs")
    Signed-off-by: Remi Pommarel <repk@triplefau.lt>
    Signed-off-by: Sven Eckelmann <sven@narfation.org>
    Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
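
A sketch of the sizing rule behind this fix: the length that gets sent is derived from the number of entries actually copied, not from the count read earlier. copy_changes() and its parameters are hypothetical, and entries are modelled as 32-bit words.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy at most 'expected' entries from 'list' into 'out' and return the
 * number of bytes that were really filled. The caller sends only that
 * many bytes, so the unfilled tail of 'out' never leaves the node.
 */
size_t copy_changes(uint32_t *out, size_t expected,
		    const uint32_t *list, size_t list_len)
{
	size_t n = 0;

	while (n < expected && n < list_len) {
		out[n] = list[n];
		n++;
	}
	return n * sizeof(*out);
}
```

If the list shrank after 'expected' was read, the returned length shrinks with it.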

batman-adv: Remove uninitialized data in full table TT response [+ + +]
Author: Remi Pommarel <repk@triplefau.lt>
Date:   Fri Nov 22 16:52:49 2024 +0100

    batman-adv: Remove uninitialized data in full table TT response
    
    [ Upstream commit 8038806db64da15721775d6b834990cacbfcf0b2 ]
    
    The number of entries filled by batadv_tt_tvlv_generate() can be less
    than initially expected in batadv_tt_prepare_tvlv_{global,local}_data()
    (changes can be removed by batadv_tt_local_event() in ADD+DEL sequence
    in the meantime as the lock held during the whole tvlv global/local data
    generation).
    
    Thus tvlv_len could be bigger than the actual TT entry size that needs
    to be sent, so a full table TT_RESPONSE could hold invalid TT entries
    such as below.
    
     * 00:00:00:00:00:00   -1 [....] (  0) 88:12:4e:ad:7e:ba (179) (0x45845380)
     * 00:00:00:00:78:79 4092 [.W..] (  0) 88:12:4e:ad:7e:3c (145) (0x8ebadb8b)
    
    Remove the extra allocated space to avoid sending uninitialized entries
    for full table TT_RESPONSE in both batadv_send_other_tt_response() and
    batadv_send_my_tt_response().
    
    Fixes: 7ea7b4a14275 ("batman-adv: make the TT CRC logic VLAN specific")
    Signed-off-by: Remi Pommarel <repk@triplefau.lt>
    Signed-off-by: Sven Eckelmann <sven@narfation.org>
    Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
blk-cgroup: Fix UAF in blkcg_unpin_online() [+ + +]
Author: Tejun Heo <tj@kernel.org>
Date:   Fri Dec 6 07:59:51 2024 -1000

    blk-cgroup: Fix UAF in blkcg_unpin_online()
    
    commit 86e6ca55b83c575ab0f2e105cf08f98e58d3d7af upstream.
    
    blkcg_unpin_online() walks up the blkcg hierarchy putting the online pin. To
    walk up, it uses blkcg_parent(blkcg) but it was calling that after
    blkcg_destroy_blkgs(blkcg) which could free the blkcg, leading to the
    following UAF:
    
      ==================================================================
      BUG: KASAN: slab-use-after-free in blkcg_unpin_online+0x15a/0x270
      Read of size 8 at addr ffff8881057678c0 by task kworker/9:1/117
    
      CPU: 9 UID: 0 PID: 117 Comm: kworker/9:1 Not tainted 6.13.0-rc1-work-00182-gb8f52214c61a-dirty #48
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 02/02/2022
      Workqueue: cgwb_release cgwb_release_workfn
      Call Trace:
       <TASK>
       dump_stack_lvl+0x27/0x80
       print_report+0x151/0x710
       kasan_report+0xc0/0x100
       blkcg_unpin_online+0x15a/0x270
       cgwb_release_workfn+0x194/0x480
       process_scheduled_works+0x71b/0xe20
       worker_thread+0x82a/0xbd0
       kthread+0x242/0x2c0
       ret_from_fork+0x33/0x70
       ret_from_fork_asm+0x1a/0x30
       </TASK>
      ...
      Freed by task 1944:
       kasan_save_track+0x2b/0x70
       kasan_save_free_info+0x3c/0x50
       __kasan_slab_free+0x33/0x50
       kfree+0x10c/0x330
       css_free_rwork_fn+0xe6/0xb30
       process_scheduled_works+0x71b/0xe20
       worker_thread+0x82a/0xbd0
       kthread+0x242/0x2c0
       ret_from_fork+0x33/0x70
       ret_from_fork_asm+0x1a/0x30
    
    Note that the UAF is not easy to trigger as the free path is indirected
    behind a couple of RCU grace periods and a work item execution. I could
    only trigger it with an artificial msleep() injected in
    blkcg_unpin_online().
    
    Fix it by reading the parent pointer before destroying the blkcg's blkgs.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reported-by: Abagail ren <renzezhongucas@gmail.com>
    Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
    Fixes: 4308a434e5e0 ("blkcg: don't offline parent blkcg first")
    Cc: stable@vger.kernel.org # v5.7+
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
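
The ordering fix in this entry can be shown with a userspace model. struct cg, cg_destroy(), cg_unpin_online() and cg_freed_count are hypothetical; free() stands in for the deferred kernel free path, which makes the hazard immediate rather than rare.

```c
#include <stdlib.h>

/* Hypothetical hierarchy node with a pin count. */
struct cg {
	struct cg *parent;
	int online_pin;
};

int cg_freed_count;	/* lets a caller observe how many nodes were freed */

/* Stands in for the destroy step, after which 'c' may be gone. */
static void cg_destroy(struct cg *c)
{
	free(c);
	cg_freed_count++;
}

/* Walk up the hierarchy, dropping one pin per level. */
void cg_unpin_online(struct cg *c)
{
	while (c) {
		struct cg *parent;

		if (--c->online_pin > 0)
			break;

		parent = c->parent;	/* read before c can be freed */
		cg_destroy(c);
		c = parent;
	}
}
```

Reading c->parent after cg_destroy(c) would touch freed memory, which is the use-after-free the report above shows.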

 
blk-iocost: Avoid using clamp() on inuse in __propagate_weights() [+ + +]
Author: Nathan Chancellor <nathan@kernel.org>
Date:   Thu Dec 12 10:13:29 2024 -0700

    blk-iocost: Avoid using clamp() on inuse in __propagate_weights()
    
    [ Upstream commit 57e420c84f9ab55ba4c5e2ae9c5f6c8e1ea834d2 ]
    
    After a recent change to clamp() and its variants [1] that increases the
    coverage of the check that high is greater than low because it can be
    done through inlining, certain build configurations (such as s390
    defconfig) fail to build with clang with:
    
      block/blk-iocost.c:1101:11: error: call to '__compiletime_assert_557' declared with 'error' attribute: clamp() low limit 1 greater than high limit active
       1101 |                 inuse = clamp_t(u32, inuse, 1, active);
            |                         ^
      include/linux/minmax.h:218:36: note: expanded from macro 'clamp_t'
        218 | #define clamp_t(type, val, lo, hi) __careful_clamp(type, val, lo, hi)
            |                                    ^
      include/linux/minmax.h:195:2: note: expanded from macro '__careful_clamp'
        195 |         __clamp_once(type, val, lo, hi, __UNIQUE_ID(v_), __UNIQUE_ID(l_), __UNIQUE_ID(h_))
            |         ^
      include/linux/minmax.h:188:2: note: expanded from macro '__clamp_once'
        188 |         BUILD_BUG_ON_MSG(statically_true(ulo > uhi),                            \
            |         ^
    
    __propagate_weights() is called with an active value of zero in
    ioc_check_iocgs(), which results in the high value being less than the
    low value, which is undefined because the value returned depends on the
    order of the comparisons.
    
    The purpose of this expression is to ensure inuse is not more than
    active and at least 1. This could be written more simply with a ternary
    expression that uses min(inuse, active) as the condition so that the
    value of that condition can be used if it is not zero and one if it is.
    Do this conversion to resolve the error and add a comment to deter
    people from turning this back into clamp().
    
    Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost")
    Link: https://lore.kernel.org/r/34d53778977747f19cce2abb287bb3e6@AcuMS.aculab.com/ [1]
    Suggested-by: David Laight <david.laight@aculab.com>
    Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Closes: https://lore.kernel.org/llvm/CA+G9fYsD7mw13wredcZn0L-KBA3yeoVSTuxnss-AEWMN3ha0cA@mail.gmail.com/
    Reported-by: kernel test robot <lkp@intel.com>
    Closes: https://lore.kernel.org/oe-kbuild-all/202412120322.3GfVe3vF-lkp@intel.com/
    Signed-off-by: Nathan Chancellor <nathan@kernel.org>
    Acked-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
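
The replacement expression described above can be written in portable C as follows. bound_inuse() is a hypothetical helper name, and the kernel's min() macro is spelled out as a conditional here.

```c
#include <stdint.h>

/*
 * Keep inuse no larger than active and never zero. Unlike a clamp with a
 * lower limit of 1, this stays well defined when active is 0.
 */
uint32_t bound_inuse(uint32_t inuse, uint32_t active)
{
	uint32_t m = inuse < active ? inuse : active;

	return m ? m : 1;
}
```

For example, bound_inuse(5, 3) yields 3, while bound_inuse(5, 0) yields 1 instead of depending on the comparison order inside a clamp.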

 
Bluetooth: iso: Fix recursive locking warning [+ + +]
Author: Iulia Tanasescu <iulia.tanasescu@nxp.com>
Date:   Wed Dec 4 14:28:49 2024 +0200

    Bluetooth: iso: Fix recursive locking warning
    
    [ Upstream commit 9bde7c3b3ad0e1f39d6df93dd1c9caf63e19e50f ]
    
    This updates iso_sock_accept to use nested locking for the parent
    socket, to avoid lockdep warnings caused because the parent and
    child sockets are locked by the same thread:
    
    [   41.585683] ============================================
    [   41.585688] WARNING: possible recursive locking detected
    [   41.585694] 6.12.0-rc6+ #22 Not tainted
    [   41.585701] --------------------------------------------
    [   41.585705] iso-tester/3139 is trying to acquire lock:
    [   41.585711] ffff988b29530a58 (sk_lock-AF_BLUETOOTH)
                   at: bt_accept_dequeue+0xe3/0x280 [bluetooth]
    [   41.585905]
                   but task is already holding lock:
    [   41.585909] ffff988b29533a58 (sk_lock-AF_BLUETOOTH)
                   at: iso_sock_accept+0x61/0x2d0 [bluetooth]
    [   41.586064]
                   other info that might help us debug this:
    [   41.586069]  Possible unsafe locking scenario:
    
    [   41.586072]        CPU0
    [   41.586076]        ----
    [   41.586079]   lock(sk_lock-AF_BLUETOOTH);
    [   41.586086]   lock(sk_lock-AF_BLUETOOTH);
    [   41.586093]
                    *** DEADLOCK ***
    
    [   41.586097]  May be due to missing lock nesting notation
    
    [   41.586101] 1 lock held by iso-tester/3139:
    [   41.586107]  #0: ffff988b29533a58 (sk_lock-AF_BLUETOOTH)
                    at: iso_sock_accept+0x61/0x2d0 [bluetooth]
    
    Fixes: ccf74f2390d6 ("Bluetooth: Add BTPROTO_ISO socket type")
    Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: SCO: Add support for 16 bits transparent voice setting [+ + +]
Author: Frédéric Danis <frederic.danis@collabora.com>
Date:   Thu Dec 5 16:51:59 2024 +0100

    Bluetooth: SCO: Add support for 16 bits transparent voice setting
    
    [ Upstream commit 29a651451e6c264f58cd9d9a26088e579d17b242 ]
    
    The voice setting is used by sco_connect() or sco_conn_defer_accept()
    after being set by sco_sock_setsockopt().
    
    The PCM part of the voice setting is used for offload mode through PCM
    chipset port.
    This commit adds support for mSBC 16 bits offloading, i.e. audio data
    not transported over HCI.
    
    The BCM4349B1 supports 16 bits transparent data on its I2S port.
    If BT_VOICE_TRANSPARENT is used when accepting a SCO connection, this
    gives only garbage audio while using BT_VOICE_TRANSPARENT_16BIT gives
    correct audio.
    This has been tested with connection to iPhone 14 and Samsung S24.
    
    Fixes: ad10b1a48754 ("Bluetooth: Add Bluetooth socket voice option")
    Signed-off-by: Frédéric Danis <frederic.danis@collabora.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bonding: Fix feature propagation of NETIF_F_GSO_ENCAP_ALL [+ + +]
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Tue Dec 10 15:12:43 2024 +0100

    bonding: Fix feature propagation of NETIF_F_GSO_ENCAP_ALL
    
    [ Upstream commit 77b11c8bf3a228d1c63464534c2dcc8d9c8bf7ff ]
    
    Drivers like mlx5 expose NIC's vlan_features such as
    NETIF_F_GSO_UDP_TUNNEL & NETIF_F_GSO_UDP_TUNNEL_CSUM which are
    later not propagated when the underlying devices are bonded and
    a vlan device created on top of the bond.
    
    Right now, the more cumbersome workaround for this is to create
    the vlan on top of the mlx5 and then enslave the vlan devices
    to a bond.
    
    To fix this, add NETIF_F_GSO_ENCAP_ALL to BOND_VLAN_FEATURES
    such that bond_compute_features() can probe and propagate the
    vlan_features from the slave devices up to the vlan device.
    
    Given the following bond:
    
      # ethtool -i enp2s0f{0,1}np{0,1}
      driver: mlx5_core
      [...]
    
      # ethtool -k enp2s0f0np0 | grep udp
      tx-udp_tnl-segmentation: on
      tx-udp_tnl-csum-segmentation: on
      tx-udp-segmentation: on
      rx-udp_tunnel-port-offload: on
      rx-udp-gro-forwarding: off
    
      # ethtool -k enp2s0f1np1 | grep udp
      tx-udp_tnl-segmentation: on
      tx-udp_tnl-csum-segmentation: on
      tx-udp-segmentation: on
      rx-udp_tunnel-port-offload: on
      rx-udp-gro-forwarding: off
    
      # ethtool -k bond0 | grep udp
      tx-udp_tnl-segmentation: on
      tx-udp_tnl-csum-segmentation: on
      tx-udp-segmentation: on
      rx-udp_tunnel-port-offload: off [fixed]
      rx-udp-gro-forwarding: off
    
    Before:
    
      # ethtool -k bond0.100 | grep udp
      tx-udp_tnl-segmentation: off [requested on]
      tx-udp_tnl-csum-segmentation: off [requested on]
      tx-udp-segmentation: on
      rx-udp_tunnel-port-offload: off [fixed]
      rx-udp-gro-forwarding: off
    
    After:
    
      # ethtool -k bond0.100 | grep udp
      tx-udp_tnl-segmentation: on
      tx-udp_tnl-csum-segmentation: on
      tx-udp-segmentation: on
      rx-udp_tunnel-port-offload: off [fixed]
      rx-udp-gro-forwarding: off
    
    Various users have run into this reporting performance issues when
    configuring Cilium in vxlan tunneling mode and having the combination
    of bond & vlan for the core devices connecting the Kubernetes cluster
    to the outside world.
    
    Fixes: a9b3ace44c7d ("bonding: fix vlan_features computing")
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Cc: Nikolay Aleksandrov <razor@blackwall.org>
    Cc: Ido Schimmel <idosch@idosch.org>
    Cc: Jiri Pirko <jiri@nvidia.com>
    Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
    Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
    Link: https://patch.msgid.link/20241210141245.327886-3-daniel@iogearbox.net
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bpf, sockmap: Fix update element with same [+ + +]
Author: Michal Luczaj <mhal@rbox.co>
Date:   Mon Dec 2 12:29:23 2024 +0100

    bpf, sockmap: Fix update element with same
    
    commit 75e072a390da9a22e7ae4a4e8434dfca5da499fb upstream.
    
    Consider a sockmap entry being updated with the same socket:
    
            osk = stab->sks[idx];
            sock_map_add_link(psock, link, map, &stab->sks[idx]);
            stab->sks[idx] = sk;
            if (osk)
                    sock_map_unref(osk, &stab->sks[idx]);
    
    Due to sock_map_unref(), which invokes sock_map_del_link(), all the
    psock's links for stab->sks[idx] are torn:
    
            list_for_each_entry_safe(link, tmp, &psock->link, list) {
                    if (link->link_raw == link_raw) {
                            ...
                            list_del(&link->list);
                            sk_psock_free_link(link);
                    }
            }
    
    And that includes the new link sock_map_add_link() added just before
    the unref.
    
    This results in a sockmap holding a socket, but without the respective
    link. This in turn means that close(sock) won't trigger the cleanup,
    i.e. a closed socket will not be automatically removed from the sockmap.
    
    Stop tearing the links when a matching link_raw is found.
    
    Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
    Signed-off-by: Michal Luczaj <mhal@rbox.co>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: John Fastabend <john.fastabend@gmail.com>
    Link: https://lore.kernel.org/bpf/20241202-sockmap-replace-v1-1-1e88579e7bd5@rbox.co
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
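
A sketch of the "stop after the first match" rule from this entry. The link list is modelled as an array in insertion order, and struct link and del_link() are hypothetical names, not the sockmap API.

```c
/* Hypothetical link record: 'raw' identifies the map slot it refers to. */
struct link {
	const void *raw;
	int live;
};

/*
 * Remove the oldest live link that refers to 'raw' and stop there, so a
 * newer link to the same slot (added just before the unref) survives.
 * Returns the number of links removed (0 or 1).
 */
int del_link(struct link *links, int n, const void *raw)
{
	int i;

	for (i = 0; i < n; i++) {
		if (links[i].live && links[i].raw == raw) {
			links[i].live = 0;
			return 1;	/* do not tear later matches */
		}
	}
	return 0;
}
```

Without the early return, updating a slot with the same socket would remove both the old link and the one just added.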

 
bpf,perf: Fix invalid prog_array access in perf_event_detach_bpf_prog [+ + +]
Author: Jiri Olsa <jolsa@kernel.org>
Date:   Sun Dec 8 15:25:07 2024 +0100

    bpf,perf: Fix invalid prog_array access in perf_event_detach_bpf_prog
    
    commit 978c4486cca5c7b9253d3ab98a88c8e769cb9bbd upstream.
    
    Syzbot reported [1] a crash that happens in the following tracing scenario:
    
      - create tracepoint perf event with attr.inherit=1, attach it to the
        process and set bpf program to it
      - attached process forks -> child creates inherited event
    
        the new child event shares the parent's bpf program and tp_event
        (hence prog_array) which is global for tracepoint
    
      - exit both process and its child -> release both events
      - first perf_event_detach_bpf_prog call will release tp_event->prog_array
        and second perf_event_detach_bpf_prog will crash, because
        tp_event->prog_array is NULL
    
    The fix makes sure the perf_event_detach_bpf_prog checks prog_array
    is valid before it tries to remove the bpf program from it.
    
    [1] https://lore.kernel.org/bpf/Z1MR6dCIKajNS6nU@krava/T/#m91dbf0688221ec7a7fc95e896a7ef9ff93b0b8ad
    
    Fixes: 0ee288e69d03 ("bpf,perf: Fix perf_event_detach_bpf_prog error handling")
    Reported-by: syzbot+2e0d2840414ce817aaac@syzkaller.appspotmail.com
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20241208142507.1207698-1-jolsa@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
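
The scenario above, reduced to a sketch: two events share one array, and the second detach must tolerate the array already being gone. struct tp_event and detach_prog() are hypothetical simplifications that ignore locking and RCU.

```c
#include <stdlib.h>

/* Hypothetical shared tracepoint state. */
struct tp_event {
	int *prog_array;
};

/*
 * Release the shared array on behalf of one event. Returns 1 if this
 * call released it, 0 if an earlier detach already did.
 */
int detach_prog(struct tp_event *tp)
{
	if (!tp->prog_array)
		return 0;	/* nothing left to remove the program from */

	free(tp->prog_array);
	tp->prog_array = NULL;
	return 1;
}
```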

 
bpf: Fix UAF via mismatching bpf_prog/attachment RCU flavors [+ + +]
Author: Jann Horn <jannh@google.com>
Date:   Tue Dec 10 17:32:13 2024 +0100

    bpf: Fix UAF via mismatching bpf_prog/attachment RCU flavors
    
    commit ef1b808e3b7c98612feceedf985c2fbbeb28f956 upstream.
    
    Uprobes always use bpf_prog_run_array_uprobe() under tasks-trace-RCU
    protection. But it is possible to attach a non-sleepable BPF program to a
    uprobe, and non-sleepable BPF programs are freed via normal RCU (see
    __bpf_prog_put_noref()). This leads to UAF of the bpf_prog because a normal
    RCU grace period does not imply a tasks-trace-RCU grace period.
    
    Fix it by explicitly waiting for a tasks-trace-RCU grace period after
    removing the attachment of a bpf_prog to a perf_event.
    
    Fixes: 8c7dcb84e3b7 ("bpf: implement sleepable uprobes by chaining gps")
    Suggested-by: Andrii Nakryiko <andrii@kernel.org>
    Suggested-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Jann Horn <jannh@google.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/bpf/20241210-bpf-fix-actual-uprobe-uaf-v1-1-19439849dd44@google.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bpf: sync_linked_regs() must preserve subreg_def [+ + +]
Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Sep 24 14:08:43 2024 -0700

    bpf: sync_linked_regs() must preserve subreg_def
    
    commit e9bd9c498cb0f5843996dbe5cbce7a1836a83c70 upstream.
    
    Range propagation must not affect subreg_def marks, otherwise the
    following example is rewritten by verifier incorrectly when
    BPF_F_TEST_RND_HI32 flag is set:
    
      0: call bpf_ktime_get_ns                   call bpf_ktime_get_ns
      1: r0 &= 0x7fffffff       after verifier   r0 &= 0x7fffffff
      2: w1 = w0                rewrites         w1 = w0
      3: if w0 < 10 goto +0     -------------->  r11 = 0x2f5674a6     (r)
      4: r1 >>= 32                               r11 <<= 32           (r)
      5: r0 = r1                                 r1 |= r11            (r)
      6: exit;                                   if w0 < 0xa goto pc+0
                                                 r1 >>= 32
                                                 r0 = r1
                                                 exit
    
    (or zero extension of w1 at (2) is missing for architectures that
     require zero extension for upper register half).
    
    The following happens w/o this patch:
    - r0 is marked as not a subreg at (0);
    - w1 is marked as subreg at (2);
    - w1 subreg_def is overridden at (3) by copy_register_state();
    - w1 is read at (5) but mark_insn_zext() does not mark (2)
      for zero extension, because w1 subreg_def is not set;
    - because of BPF_F_TEST_RND_HI32 flag verifier inserts random
      value for hi32 bits of (2) (marked (r));
    - this random value is read at (5).
    
    Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
    Reported-by: Lonial Con <kongln9170@gmail.com>
    Signed-off-by: Lonial Con <kongln9170@gmail.com>
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Daniel Borkmann <daniel@iogearbox.net>
    Closes: https://lore.kernel.org/bpf/7e2aa30a62d740db182c170fdd8f81c596df280d.camel@gmail.com
    Link: https://lore.kernel.org/bpf/20240924210844.1758441-1-eddyz87@gmail.com
    [ shung-hsi.yu: sync_linked_regs() was called find_equal_scalars() before commit
      4bf79f9be434 ("bpf: Track equal scalars history on per-instruction level"), and
      modification is done because there is only a single call to
      copy_register_state() before commit 98d7ca374ba4 ("bpf: Track delta between
      "linked" registers."). ]
    Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
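
The save-and-restore idea behind this fix, as a minimal sketch. struct reg_state and sync_linked() are hypothetical and carry only the fields needed to show the point; the real verifier state has many more.

```c
/* Hypothetical, heavily reduced register state. */
struct reg_state {
	int id;			/* registers with equal ids are linked */
	long umin, umax;	/* known value range */
	int subreg_def;		/* per-register mark, not range information */
};

/* Propagate the range from 'known' to a linked 'reg'. */
void sync_linked(struct reg_state *reg, const struct reg_state *known)
{
	int saved_subreg_def;

	if (reg->id != known->id)
		return;

	saved_subreg_def = reg->subreg_def;
	*reg = *known;				/* copies everything ... */
	reg->subreg_def = saved_subreg_def;	/* ... so put the mark back */
}
```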

 
cxgb4: use port number to set mac addr [+ + +]
Author: Anumula Murali Mohan Reddy <anumula@chelsio.com>
Date:   Fri Dec 6 11:50:14 2024 +0530

    cxgb4: use port number to set mac addr
    
    [ Upstream commit 356983f569c1f5991661fc0050aa263792f50616 ]
    
    t4_set_vf_mac_acl() uses pf to set mac addr, but t4vf_get_vf_mac_acl()
    uses port number to get mac addr. This leads to an error when
    attempting to set the MAC address on VFs of PF2 and PF3.
    This patch fixes the issue by using port number to set mac address.
    
    Fixes: e0cdac65ba26 ("cxgb4vf: configure ports accessible by the VF")
    Signed-off-by: Anumula Murali Mohan Reddy <anumula@chelsio.com>
    Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20241206062014.49414-1-anumula@chelsio.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
Documentation: PM: Clarify pm_runtime_resume_and_get() return value [+ + +]
Author: Paul Barker <paul.barker.ct@bp.renesas.com>
Date:   Tue Dec 3 14:37:29 2024 +0000

    Documentation: PM: Clarify pm_runtime_resume_and_get() return value
    
    [ Upstream commit ccb84dc8f4a02e7d30ffd388522996546b4d00e1 ]
    
    Update the documentation to match the behaviour of the code.
    
    pm_runtime_resume_and_get() always returns 0 on success, even if
    __pm_runtime_resume() returns 1.
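
A minimal userspace sketch of the return-value behaviour described above. `struct mock_dev`, `mock_pm_runtime_resume()` and `mock_resume_and_get()` are hypothetical stand-ins, not kernel API; the sketch only assumes that the underlying resume call can return 0, 1 or a negative errno, and that the wrapper drops its reference again on failure.

```c
#include <errno.h>

/* Hypothetical stand-in for a device: only what the sketch needs. */
struct mock_dev {
	int usage_count;   /* models the runtime PM usage counter */
	int resume_result; /* value the mocked resume call will return */
};

/* Mock of the underlying resume call: takes a reference, then returns
 * 0 (resumed), 1 (already active) or a negative errno. */
static int mock_pm_runtime_resume(struct mock_dev *dev)
{
	dev->usage_count++;
	return dev->resume_result;
}

/* Sketch of the wrapper semantics: a negative value is an error and the
 * reference is dropped again; 0 and 1 are both reported as 0. */
static int mock_resume_and_get(struct mock_dev *dev)
{
	int ret = mock_pm_runtime_resume(dev);

	if (ret < 0) {
		dev->usage_count--;
		return ret;
	}
	return 0;
}
```

Callers of such a wrapper can therefore test for success with `if (ret)` or `if (ret < 0)` interchangeably, which is the point the documentation update makes.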
    
    Fixes: 2c412337cfe6 ("PM: runtime: Add documentation for pm_runtime_resume_and_get()")
    Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
    Link: https://patch.msgid.link/20241203143729.478-1-paul.barker.ct@bp.renesas.com
    [ rjw: Subject and changelog edits, adjusted new comment formatting ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/i915: Fix memory leak by correcting cache object name in error handler [+ + +]
Author: Jiasheng Jiang <jiashengjiangcool@outlook.com>
Date:   Wed Nov 27 20:10:42 2024 +0000

    drm/i915: Fix memory leak by correcting cache object name in error handler
    
    commit 2828e5808bcd5aae7fdcd169cac1efa2701fa2dd upstream.
    
    Replace "slab_priorities" with "slab_dependencies" in the error handler
    to avoid a memory leak.
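
The class of bug can be sketched in userspace as follows. This is an illustration under assumptions, not the i915 code: `cache_dependencies`, `cache_priorities`, `fail_second` and `caches_init()` are hypothetical names, and `malloc()`/`free()` stand in for cache creation and destruction.

```c
#include <stdlib.h>

static void *cache_dependencies; /* created first */
static void *cache_priorities;   /* created second */
static int fail_second;          /* hook to force the second allocation to fail */

static int caches_init(void)
{
	cache_dependencies = malloc(64);
	if (!cache_dependencies)
		return -1;

	cache_priorities = fail_second ? NULL : malloc(64);
	if (!cache_priorities)
		goto err_priorities;

	return 0;

err_priorities:
	/* The second cache was never created, so the object to release here
	 * is the first one. Releasing cache_priorities (NULL) instead would
	 * be a no-op and leak cache_dependencies. */
	free(cache_dependencies);
	cache_dependencies = NULL;
	return -1;
}
```

The error label is named after the step that failed, so the cleanup under it has to undo the steps that succeeded before it.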
    
    Fixes: 32eb6bcfdda9 ("drm/i915: Make request allocation caches global")
    Cc: <stable@vger.kernel.org> # v5.2+
    Signed-off-by: Jiasheng Jiang <jiashengjiangcool@outlook.com>
    Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
    Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
    Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20241127201042.29620-1-jiashengjiangcool@gmail.com
    (cherry picked from commit 9bc5e7dc694d3112bbf0fa4c46ef0fa0f114937a)
    Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
exfat: fix potential deadlock on __exfat_get_dentry_set [+ + +]
Author: Sungjong Seo <sj1557.seo@samsung.com>
Date:   Fri May 31 19:14:44 2024 +0900

    exfat: fix potential deadlock on __exfat_get_dentry_set
    
    commit 89fc548767a2155231128cb98726d6d2ea1256c9 upstream.
    
    When accessing a file with more entries than ES_MAX_ENTRY_NUM, the bh-array
    is allocated in __exfat_get_dentry_set. The problem is that the bh-array is
    allocated with GFP_KERNEL, which allows the allocation to enter filesystem
    reclaim while sbi->s_lock is held. In the following case, a deadlock on
    sbi->s_lock between the two processes may occur.
    
           CPU0                CPU1
           ----                ----
      kswapd
       balance_pgdat
        lock(fs_reclaim)
                          exfat_iterate
                           lock(&sbi->s_lock)
                           exfat_readdir
                            exfat_get_uniname_from_ext_entry
                             exfat_get_dentry_set
                              __exfat_get_dentry_set
                               kmalloc_array
                                ...
                                lock(fs_reclaim)
        ...
        evict
         exfat_evict_inode
          lock(&sbi->s_lock)
    
    To fix this, let's allocate bh-array with GFP_NOFS.
    
    Fixes: a3ff29a95fde ("exfat: support dynamic allocate bh for exfat_entry_set_cache")
    Cc: stable@vger.kernel.org # v6.2+
    Reported-by: syzbot+412a392a2cd4a65e71db@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/lkml/000000000000fef47e0618c0327f@google.com
    Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com>
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

exfat: support dynamic allocate bh for exfat_entry_set_cache [+ + +]
Author: Yuezhang Mo <Yuezhang.Mo@sony.com>
Date:   Wed Nov 9 13:50:22 2022 +0800

    exfat: support dynamic allocate bh for exfat_entry_set_cache
    
    commit a3ff29a95fde16906304455aa8c0bd84eb770258 upstream.
    
    In special cases, a file or a directory may occupy more than 19
    directory entries, so pre-allocating 3 bh is not enough. For example:
      - Support for vendor secondary directory entries in the future.
      - The file directory entry is damaged and the SecondaryCount
        field is bigger than 18.
    
    So this commit supports dynamic allocation of bh.
    
    Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
    Reviewed-by: Andy Wu <Andy.Wu@sony.com>
    Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com>
    Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
ksmbd: fix racy issue from session lookup and expire [+ + +]
Author: Namjae Jeon <linkinjeon@kernel.org>
Date:   Thu Dec 5 21:38:47 2024 +0900

    ksmbd: fix racy issue from session lookup and expire
    
    commit b95629435b84b9ecc0c765995204a4d8a913ed52 upstream.
    
    Increment the session reference count within the lock for lookup to avoid
    a race with session expiry.
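
The pattern can be sketched in userspace with a mutex. This is an illustration only: `struct session`, `struct session_table` and `session_lookup_get()` are hypothetical names, and a pthread mutex stands in for whatever lock protects the session table.

```c
#include <pthread.h>
#include <stddef.h>

struct session {
	unsigned long long id;
	int refcnt;
	struct session *next;
};

struct session_table {
	pthread_mutex_t lock;
	struct session *head;
};

/* Look up a session and take a reference before dropping the lock, so
 * that an expiry path which also takes the lock cannot free the session
 * between the lookup and the reference grab. */
static struct session *session_lookup_get(struct session_table *t,
					  unsigned long long id)
{
	struct session *s;

	pthread_mutex_lock(&t->lock);
	for (s = t->head; s; s = s->next)
		if (s->id == id)
			break;
	if (s)
		s->refcnt++;
	pthread_mutex_unlock(&t->lock);

	return s;
}
```

If the increment happened after the unlock instead, the returned pointer could already be stale by the time the caller touched it.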
    
    Cc: stable@vger.kernel.org
    Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-25737
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Linux: Linux 6.1.121 [+ + +]
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Thu Dec 19 18:08:59 2024 +0100

    Linux 6.1.121
    
    Link: https://lore.kernel.org/r/20241217170526.232803729@linuxfoundation.org
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: Pavel Machek (CIP) <pavel@denx.de>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Peter Schneider <pschneider1968@googlemail.com>
    Tested-by: Mark Brown <broonie@kernel.org>
    Tested-by: Salvatore Bonaccorso <carnil@debian.org>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
net/mlx5: DR, prevent potential error pointer dereference [+ + +]
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date:   Wed Dec 4 15:06:41 2024 +0300

    net/mlx5: DR, prevent potential error pointer dereference
    
    [ Upstream commit 11776cff0b563c8b8a4fa76cab620bfb633a8cb8 ]
    
    The dr_domain_add_vport_cap() function generally returns NULL on error
    but sometimes we want it to return ERR_PTR(-EBUSY) so the caller can
    retry.  The problem here is that "ret" can be either -EBUSY or -ENOMEM
    and if it's -ENOMEM then the error pointer is propagated back and
    eventually dereferenced in dr_ste_v0_build_src_gvmi_qpn_tag().
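
A userspace sketch of why a NULL check alone misses an error pointer, and of returning an error pointer only for the retryable case. The `ERR_PTR()`/`IS_ERR()`/`PTR_ERR()` helpers below are simplified re-implementations for illustration, and `add_cap_result()` is a hypothetical helper, not the driver function.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace versions of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical helper: only -EBUSY is passed up as an error pointer so
 * the caller can retry; every other failure becomes NULL, which callers
 * that test for NULL handle correctly. */
static void *add_cap_result(int ret, void *cap)
{
	if (ret == -EBUSY)
		return ERR_PTR(-EBUSY);
	if (ret)
		return NULL;
	return cap;
}
```

An `ERR_PTR(-ENOMEM)` value is non-NULL, so a caller that only checks `if (!ptr)` would go on to dereference it.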
    
    Fixes: 11a45def2e19 ("net/mlx5: DR, Add support for SF vports")
    Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
    Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
    Link: https://patch.msgid.link/07477254-e179-43e2-b1b3-3b9db4674195@stanley.mountain
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net/sched: netem: account for backlog updates from child qdisc [+ + +]
Author: Martin Ottens <martin.ottens@fau.de>
Date:   Tue Dec 10 14:14:11 2024 +0100

    net/sched: netem: account for backlog updates from child qdisc
    
    [ Upstream commit f8d4bc455047cf3903cd6f85f49978987dbb3027 ]
    
    In general, 'qlen' of any classful qdisc should keep track of the
    number of packets that the qdisc itself and all of its children hold.
    In case of netem, 'qlen' only accounts for the packets in its internal
    tfifo. When netem is used with a child qdisc, the child qdisc can use
    'qdisc_tree_reduce_backlog' to inform its parent, netem, about created
    or dropped SKBs. This function updates 'qlen' and the backlog statistics
    of netem, but netem does not account for changes made by a child qdisc.
    'qlen' then indicates the wrong number of packets in the tfifo.
    If a child qdisc creates new SKBs during enqueue and informs its parent
    about this, netem's 'qlen' value is increased. When netem dequeues the
    newly created SKBs from the child, the 'qlen' in netem is not updated.
    If 'qlen' reaches the configured sch->limit, the enqueue function stops
    working, even though the tfifo is not full.
    
    Reproduce the bug:
    Ensure that the sender machine has GSO enabled. Configure netem as root
    qdisc and tbf as its child on the outgoing interface of the machine
    as follows:
    $ tc qdisc add dev <oif> root handle 1: netem delay 100ms limit 100
    $ tc qdisc add dev <oif> parent 1:0 tbf rate 50Mbit burst 1542 latency 50ms
    
    Send bulk TCP traffic out via this interface, e.g., by running an iPerf3
    client on the machine. Check the qdisc statistics:
    $ tc -s qdisc show dev <oif>
    
    Statistics after 10s of iPerf3 TCP test before the fix (note that
    netem's backlog > limit, netem stopped accepting packets):
    qdisc netem 1: root refcnt 2 limit 1000 delay 100ms
     Sent 2767766 bytes 1848 pkt (dropped 652, overlimits 0 requeues 0)
     backlog 4294528236b 1155p requeues 0
    qdisc tbf 10: parent 1:1 rate 50Mbit burst 1537b lat 50ms
     Sent 2767766 bytes 1848 pkt (dropped 327, overlimits 7601 requeues 0)
     backlog 0b 0p requeues 0
    
    Statistics after the fix:
    qdisc netem 1: root refcnt 2 limit 1000 delay 100ms
     Sent 37766372 bytes 24974 pkt (dropped 9, overlimits 0 requeues 0)
     backlog 0b 0p requeues 0
    qdisc tbf 10: parent 1:1 rate 50Mbit burst 1537b lat 50ms
     Sent 37766372 bytes 24974 pkt (dropped 327, overlimits 96017 requeues 0)
     backlog 0b 0p requeues 0
    
    tbf segments the GSO SKBs (tbf_segment) and updates the netem's 'qlen'.
    The interface fully stops transferring packets and "locks". In this case,
    the child qdisc and tfifo are empty, but 'qlen' indicates the tfifo is at
    its limit and no more packets are accepted.
    
    This patch adds a counter for the entries in the tfifo. Netem's 'qlen' is
    only decreased when a packet is returned by its dequeue function, and not
    during enqueuing into the child qdisc. External updates to 'qlen' are thus
    accounted for and only the behavior of the backlog statistics changes. As
    in other qdiscs, 'qlen' then keeps track of how many packets are held in
    netem and all of its children. As before, sch->limit remains as the
    maximum number of packets in the tfifo. The same applies to netem's
    backlog statistics.
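
The accounting described above can be modelled in a few lines. This is a simplified sketch, not the netem code: `struct netem_model`, the field name `t_len` and all function names are assumptions made for illustration.

```c
/* Simplified model of the two counters. */
struct netem_model {
	unsigned int limit; /* maximum number of packets in the tfifo */
	unsigned int qlen;  /* packets held by netem and all its children */
	unsigned int t_len; /* packets currently in the tfifo */
};

/* Enqueue into the tfifo. The limit is checked against the tfifo
 * counter, not against qlen. Returns 1 if accepted, 0 if dropped. */
static int model_enqueue(struct netem_model *q)
{
	if (q->t_len >= q->limit)
		return 0;
	q->t_len++;
	q->qlen++;
	return 1;
}

/* Move one packet from the tfifo into the child qdisc: the tfifo
 * shrinks, but qlen is unchanged because the packet is still held
 * below netem. */
static void model_move_to_child(struct netem_model *q)
{
	q->t_len--;
}

/* The child reports newly created packets, e.g. when a GSO packet is
 * segmented into several SKBs. */
static void model_child_created(struct netem_model *q, unsigned int extra)
{
	q->qlen += extra;
}

/* A packet is returned by netem's dequeue function. */
static void model_dequeue(struct netem_model *q)
{
	q->qlen--;
}
```

With a limit of 2, one packet moved to the child and segmented into four leaves `qlen` at 4 but `t_len` at 0, so the next enqueue is still accepted.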
    
    Fixes: 50612537e9ab ("netem: fix classful handling")
    Signed-off-by: Martin Ottens <martin.ottens@fau.de>
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Link: https://patch.msgid.link/20241210131412.1837202-1-martin.ottens@fau.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net: defer final 'struct net' free in netns dismantle [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Dec 4 12:54:55 2024 +0000

    net: defer final 'struct net' free in netns dismantle
    
    commit 0f6ede9fbc747e2553612271bce108f7517e7a45 upstream.
    
    Ilya reported a slab-use-after-free in dst_destroy [1]
    
    Issue is in xfrm6_net_init() and xfrm4_net_init() :
    
    They copy xfrm[46]_dst_ops_template into net->xfrm.xfrm[46]_dst_ops.
    
    But net structure might be freed before all the dst callbacks are
    called. So when dst_destroy() calls later :
    
    if (dst->ops->destroy)
        dst->ops->destroy(dst);
    
    dst->ops points to the old net->xfrm.xfrm[46]_dst_ops, which has been freed.
    
    See a relevant issue fixed in :
    
    ac888d58869b ("net: do not delay dst_entries_add() in dst_release()")
    
    A fix is to queue the 'struct net' to be freed after one more
    cleanup_net() round (and the existing rcu_barrier())
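
The idea of deferring the final free by one round can be sketched in userspace as below. This is a model under assumptions, not the kernel implementation: `struct fake_net`, `defer_free_list`, `complete_deferred_free()` and `cleanup_round()` are illustrative names, and the RCU barrier is not modelled.

```c
#include <stdlib.h>

struct fake_net {
	struct fake_net *next_deferred;
	int id;
};

static struct fake_net *defer_free_list; /* nets waiting one more round */
static int freed_count;                  /* made observable for the sketch */

/* Free everything that a previous round parked on the deferred list. */
static void complete_deferred_free(void)
{
	struct fake_net *n = defer_free_list;

	defer_free_list = NULL;
	while (n) {
		struct fake_net *next = n->next_deferred;

		free(n);
		freed_count++;
		n = next;
	}
}

/* One cleanup round: release the nets parked by the previous round,
 * then park the net being dismantled now instead of freeing it. */
static void cleanup_round(struct fake_net *dying)
{
	complete_deferred_free();
	if (dying) {
		dying->next_deferred = defer_free_list;
		defer_free_list = dying;
	}
}
```

A net handed to one round is only released by the following round, which gives outstanding callbacks that still reference it a full extra round to finish.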
    
    [1]
    
    BUG: KASAN: slab-use-after-free in dst_destroy (net/core/dst.c:112)
    Read of size 8 at addr ffff8882137ccab0 by task swapper/37/0
    Dec 03 05:46:18 kernel:
    CPU: 37 UID: 0 PID: 0 Comm: swapper/37 Kdump: loaded Not tainted 6.12.0 #67
    Hardware name: Red Hat KVM/RHEL, BIOS 1.16.1-1.el9 04/01/2014
    Call Trace:
     <IRQ>
    dump_stack_lvl (lib/dump_stack.c:124)
    print_address_description.constprop.0 (mm/kasan/report.c:378)
    ? dst_destroy (net/core/dst.c:112)
    print_report (mm/kasan/report.c:489)
    ? dst_destroy (net/core/dst.c:112)
    ? kasan_addr_to_slab (mm/kasan/common.c:37)
    kasan_report (mm/kasan/report.c:603)
    ? dst_destroy (net/core/dst.c:112)
    ? rcu_do_batch (kernel/rcu/tree.c:2567)
    dst_destroy (net/core/dst.c:112)
    rcu_do_batch (kernel/rcu/tree.c:2567)
    ? __pfx_rcu_do_batch (kernel/rcu/tree.c:2491)
    ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4339 kernel/locking/lockdep.c:4406)
    rcu_core (kernel/rcu/tree.c:2825)
    handle_softirqs (kernel/softirq.c:554)
    __irq_exit_rcu (kernel/softirq.c:589 kernel/softirq.c:428 kernel/softirq.c:637)
    irq_exit_rcu (kernel/softirq.c:651)
    sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1049 arch/x86/kernel/apic/apic.c:1049)
     </IRQ>
     <TASK>
    asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702)
    RIP: 0010:default_idle (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:92 arch/x86/kernel/process.c:743)
    Code: 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 6e ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 90 0f 00 2d c7 c9 27 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90
    RSP: 0018:ffff888100d2fe00 EFLAGS: 00000246
    RAX: 00000000001870ed RBX: 1ffff110201a5fc2 RCX: ffffffffb61a3e46
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffb3d4d123
    RBP: 0000000000000000 R08: 0000000000000001 R09: ffffed11c7e1835d
    R10: ffff888e3f0c1aeb R11: 0000000000000000 R12: 0000000000000000
    R13: ffff888100d20000 R14: dffffc0000000000 R15: 0000000000000000
    ? ct_kernel_exit.constprop.0 (kernel/context_tracking.c:148)
    ? cpuidle_idle_call (kernel/sched/idle.c:186)
    default_idle_call (./include/linux/cpuidle.h:143 kernel/sched/idle.c:118)
    cpuidle_idle_call (kernel/sched/idle.c:186)
    ? __pfx_cpuidle_idle_call (kernel/sched/idle.c:168)
    ? lock_release (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5848)
    ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4347 kernel/locking/lockdep.c:4406)
    ? tsc_verify_tsc_adjust (arch/x86/kernel/tsc_sync.c:59)
    do_idle (kernel/sched/idle.c:326)
    cpu_startup_entry (kernel/sched/idle.c:423 (discriminator 1))
    start_secondary (arch/x86/kernel/smpboot.c:202 arch/x86/kernel/smpboot.c:282)
    ? __pfx_start_secondary (arch/x86/kernel/smpboot.c:232)
    ? soft_restart_cpu (arch/x86/kernel/head_64.S:452)
    common_startup_64 (arch/x86/kernel/head_64.S:414)
     </TASK>
    Dec 03 05:46:18 kernel:
    Allocated by task 12184:
    kasan_save_stack (mm/kasan/common.c:48)
    kasan_save_track (./arch/x86/include/asm/current.h:49 mm/kasan/common.c:60 mm/kasan/common.c:69)
    __kasan_slab_alloc (mm/kasan/common.c:319 mm/kasan/common.c:345)
    kmem_cache_alloc_noprof (mm/slub.c:4085 mm/slub.c:4134 mm/slub.c:4141)
    copy_net_ns (net/core/net_namespace.c:421 net/core/net_namespace.c:480)
    create_new_namespaces (kernel/nsproxy.c:110)
    unshare_nsproxy_namespaces (kernel/nsproxy.c:228 (discriminator 4))
    ksys_unshare (kernel/fork.c:3313)
    __x64_sys_unshare (kernel/fork.c:3382)
    do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
    entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
    Dec 03 05:46:18 kernel:
    Freed by task 11:
    kasan_save_stack (mm/kasan/common.c:48)
    kasan_save_track (./arch/x86/include/asm/current.h:49 mm/kasan/common.c:60 mm/kasan/common.c:69)
    kasan_save_free_info (mm/kasan/generic.c:582)
    __kasan_slab_free (mm/kasan/common.c:271)
    kmem_cache_free (mm/slub.c:4579 mm/slub.c:4681)
    cleanup_net (net/core/net_namespace.c:456 net/core/net_namespace.c:446 net/core/net_namespace.c:647)
    process_one_work (kernel/workqueue.c:3229)
    worker_thread (kernel/workqueue.c:3304 kernel/workqueue.c:3391)
    kthread (kernel/kthread.c:389)
    ret_from_fork (arch/x86/kernel/process.c:147)
    ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
    Dec 03 05:46:18 kernel:
    Last potentially related work creation:
    kasan_save_stack (mm/kasan/common.c:48)
    __kasan_record_aux_stack (mm/kasan/generic.c:541)
    insert_work (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 kernel/workqueue.c:788 kernel/workqueue.c:795 kernel/workqueue.c:2186)
    __queue_work (kernel/workqueue.c:2340)
    queue_work_on (kernel/workqueue.c:2391)
    xfrm_policy_insert (net/xfrm/xfrm_policy.c:1610)
    xfrm_add_policy (net/xfrm/xfrm_user.c:2116)
    xfrm_user_rcv_msg (net/xfrm/xfrm_user.c:3321)
    netlink_rcv_skb (net/netlink/af_netlink.c:2536)
    xfrm_netlink_rcv (net/xfrm/xfrm_user.c:3344)
    netlink_unicast (net/netlink/af_netlink.c:1316 net/netlink/af_netlink.c:1342)
    netlink_sendmsg (net/netlink/af_netlink.c:1886)
    sock_write_iter (net/socket.c:729 net/socket.c:744 net/socket.c:1165)
    vfs_write (fs/read_write.c:590 fs/read_write.c:683)
    ksys_write (fs/read_write.c:736)
    do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
    entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
    Dec 03 05:46:18 kernel:
    Second to last potentially related work creation:
    kasan_save_stack (mm/kasan/common.c:48)
    __kasan_record_aux_stack (mm/kasan/generic.c:541)
    insert_work (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 kernel/workqueue.c:788 kernel/workqueue.c:795 kernel/workqueue.c:2186)
    __queue_work (kernel/workqueue.c:2340)
    queue_work_on (kernel/workqueue.c:2391)
    __xfrm_state_insert (./include/linux/workqueue.h:723 net/xfrm/xfrm_state.c:1150 net/xfrm/xfrm_state.c:1145 net/xfrm/xfrm_state.c:1513)
    xfrm_state_update (./include/linux/spinlock.h:396 net/xfrm/xfrm_state.c:1940)
    xfrm_add_sa (net/xfrm/xfrm_user.c:912)
    xfrm_user_rcv_msg (net/xfrm/xfrm_user.c:3321)
    netlink_rcv_skb (net/netlink/af_netlink.c:2536)
    xfrm_netlink_rcv (net/xfrm/xfrm_user.c:3344)
    netlink_unicast (net/netlink/af_netlink.c:1316 net/netlink/af_netlink.c:1342)
    netlink_sendmsg (net/netlink/af_netlink.c:1886)
    sock_write_iter (net/socket.c:729 net/socket.c:744 net/socket.c:1165)
    vfs_write (fs/read_write.c:590 fs/read_write.c:683)
    ksys_write (fs/read_write.c:736)
    do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
    entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
    
    Fixes: a8a572a6b5f2 ("xfrm: dst_entries_init() per-net dst_ops")
    Reported-by: Ilya Maximets <i.maximets@ovn.org>
    Closes: https://lore.kernel.org/netdev/CANn89iKKYDVpB=MtmfH7nyv2p=rJWSLedO5k7wSZgtY_tO8WQg@mail.gmail.com/T/#m02c98c3009fe66382b73cfb4db9cf1df6fab3fbf
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Link: https://patch.msgid.link/20241204125455.3871859-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: dsa: felix: fix stuck CPU-injected packets with short taprio windows [+ + +]
Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Tue Dec 10 15:26:40 2024 +0200

    net: dsa: felix: fix stuck CPU-injected packets with short taprio windows
    
    [ Upstream commit acfcdb78d5d4cdb78e975210c8825b9a112463f6 ]
    
    With this port schedule:
    
    tc qdisc replace dev $send_if parent root handle 100 taprio \
            num_tc 8 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
            map 0 1 2 3 4 5 6 7 \
            base-time 0 cycle-time 10000 \
            sched-entry S 01 1250 \
            sched-entry S 02 1250 \
            sched-entry S 04 1250 \
            sched-entry S 08 1250 \
            sched-entry S 10 1250 \
            sched-entry S 20 1250 \
            sched-entry S 40 1250 \
            sched-entry S 80 1250 \
            flags 2
    
    ptp4l would fail to take TX timestamps of Pdelay_Resp messages like this:
    
    increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
    ptp4l[4134.168]: port 2: send peer delay response failed
    
    It turns out that the driver can't take their TX timestamps because it
    can't transmit them in the first place. And there's nothing special
    about the Pdelay_Resp packets - they're just regular 68 byte packets.
    But with this taprio configuration, the switch would refuse to send even
    the ETH_ZLEN minimum packet size.
    
    This should have definitely not been the case. When applying the taprio
    config, the driver prints:
    
    mscc_felix 0000:00:00.5: port 0 tc 0 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 1 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 2 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 3 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 4 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 5 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 6 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS
    
    and thus, everything under 132 bytes - ETH_FCS_LEN should have been sent
    without problems. Yet it's not.
    
    For the forwarding path, the configuration is fine, yet packets injected
    from Linux get stuck with this schedule no matter what.
    
    The first hint that the static guard bands are the cause of the problem
    is that reverting Michael Walle's commit 297c4de6f780 ("net: dsa: felix:
    re-enable TAS guard band mode") made things work. It must be that the
    guard bands are calculated incorrectly.
    
    I remembered that there is a magic constant in the driver, set to 33 ns
    for no logical reason other than experimentation, which says "never let
    the static guard bands get so large as to leave less than this amount of
    remaining space in the time slot, because the queue system will refuse
    to schedule packets otherwise, and they will get stuck". I had a hunch
    that my previous experimentally-determined value was only good for
    packets coming from the forwarding path, and that the CPU injection path
    needed more.
    
    I came to the new value of 35 ns through binary search, after seeing
    that with 544 ns (the bit time required to send the Pdelay_Resp packet
    at gigabit) it works. Again, this is purely experimental, there's no
    logic and the manual doesn't say anything.
    
    The new driver prints for this schedule look like this:
    
    mscc_felix 0000:00:00.5: port 0 tc 0 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 1 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 2 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 3 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 4 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 5 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 6 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS
    mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS
    
    So yes, the maximum MTU is now even smaller by 1 byte than before.
    This is maybe counter-intuitive, but makes more sense with a diagram of
    one time slot.
    
    Before:
    
     Gate open                                   Gate close
     |                                                    |
     v           1250 ns total time slot duration         v
     <---------------------------------------------------->
     <----><---------------------------------------------->
      33 ns            1217 ns static guard band
      useful
    
    After:
    
     Gate open                                   Gate close
     |                                                    |
     v           1250 ns total time slot duration         v
     <---------------------------------------------------->
     <-----><--------------------------------------------->
      35 ns            1215 ns static guard band
      useful
    
    The static guard band implemented by this switch hardware directly
    determines the maximum allowable MTU for that traffic class. The larger
    it is, the earlier the switch will stop scheduling frames for
    transmission, because otherwise they might overrun the gate close time
    (and avoiding that is the entire purpose of Michael's patch).
    So, we now have guard bands smaller by 2 ns, thus, in this particular
    case, we lose a byte of the maximum MTU.
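
The numbers quoted above can be reproduced with a short calculation. This is a sketch under assumptions: 8000 ps per byte at 1000 Mbps, a 20-octet L1 overhead (preamble, SFD and inter-frame gap) subtracted from the result, and integer division. These were chosen because they reproduce the 132 and 131 octet values in the driver prints; the driver's exact computation may differ. The function and macro names are hypothetical.

```c
#include <stdint.h>

#define PS_PER_NS      1000ULL
#define PS_PER_BYTE_1G 8000ULL /* 8 bits per byte, 1 ns per bit at 1000 Mbps */
#define L1_OVERHEAD    20ULL   /* assumption: preamble + SFD + inter-frame gap */

/* Largest frame, in octets including FCS, that fits in the static guard
 * band left after reserving useful_ns at the start of a slot_ns slot. */
static uint64_t max_frame_octets(uint64_t slot_ns, uint64_t useful_ns)
{
	uint64_t guard_ps = (slot_ns - useful_ns) * PS_PER_NS;
	uint64_t octets = guard_ps / PS_PER_BYTE_1G;

	return octets > L1_OVERHEAD ? octets - L1_OVERHEAD : octets;
}

/* Time on the wire for a frame of the given size at 1000 Mbps. */
static uint64_t frame_time_ns(uint64_t octets)
{
	return octets * 8;
}
```

Under these assumptions, `max_frame_octets(1250, 33)` gives 132 and `max_frame_octets(1250, 35)` gives 131, while `frame_time_ns(68)` gives the 544 ns mentioned for the Pdelay_Resp packet.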
    
    Fixes: 11afdc6526de ("net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Reviewed-by: Michael Walle <mwalle@kernel.org>
    Link: https://patch.msgid.link/20241210132640.3426788-1-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: lapb: increase LAPB_HEADER_LEN [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Dec 4 14:10:31 2024 +0000

    net: lapb: increase LAPB_HEADER_LEN
    
    [ Upstream commit a6d75ecee2bf828ac6a1b52724aba0a977e4eaf4 ]
    
    It is unclear if net/lapb code is supposed to be ready for 8021q.
    
    We can at least avoid crashes like the following :
    
    skbuff: skb_under_panic: text:ffffffff8aabe1f6 len:24 put:20 head:ffff88802824a400 data:ffff88802824a3fe tail:0x16 end:0x140 dev:nr0.2
    ------------[ cut here ]------------
     kernel BUG at net/core/skbuff.c:206 !
    Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
    CPU: 1 UID: 0 PID: 5508 Comm: dhcpcd Not tainted 6.12.0-rc7-syzkaller-00144-g66418447d27b #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024
     RIP: 0010:skb_panic net/core/skbuff.c:206 [inline]
     RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216
    Code: 0d 8d 48 c7 c6 2e 9e 29 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 1a 6f 37 02 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3
    RSP: 0018:ffffc90002ddf638 EFLAGS: 00010282
    RAX: 0000000000000086 RBX: dffffc0000000000 RCX: 7a24750e538ff600
    RDX: 0000000000000000 RSI: 0000000000000201 RDI: 0000000000000000
    RBP: ffff888034a86650 R08: ffffffff8174b13c R09: 1ffff920005bbe60
    R10: dffffc0000000000 R11: fffff520005bbe61 R12: 0000000000000140
    R13: ffff88802824a400 R14: ffff88802824a3fe R15: 0000000000000016
    FS:  00007f2a5990d740(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000000110c2631fd CR3: 0000000029504000 CR4: 00000000003526f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
      skb_push+0xe5/0x100 net/core/skbuff.c:2636
      nr_header+0x36/0x320 net/netrom/nr_dev.c:69
      dev_hard_header include/linux/netdevice.h:3148 [inline]
      vlan_dev_hard_header+0x359/0x480 net/8021q/vlan_dev.c:83
      dev_hard_header include/linux/netdevice.h:3148 [inline]
      lapbeth_data_transmit+0x1f6/0x2a0 drivers/net/wan/lapbether.c:257
      lapb_data_transmit+0x91/0xb0 net/lapb/lapb_iface.c:447
      lapb_transmit_buffer+0x168/0x1f0 net/lapb/lapb_out.c:149
     lapb_establish_data_link+0x84/0xd0
     lapb_device_event+0x4e0/0x670
      notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
     __dev_notify_flags+0x207/0x400
      dev_change_flags+0xf0/0x1a0 net/core/dev.c:8922
      devinet_ioctl+0xa4e/0x1aa0 net/ipv4/devinet.c:1188
      inet_ioctl+0x3d7/0x4f0 net/ipv4/af_inet.c:1003
      sock_do_ioctl+0x158/0x460 net/socket.c:1227
      sock_ioctl+0x626/0x8e0 net/socket.c:1346
      vfs_ioctl fs/ioctl.c:51 [inline]
      __do_sys_ioctl fs/ioctl.c:907 [inline]
      __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Reported-by: syzbot+fb99d1b0c0f81d94a5e2@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/netdev/67506220.050a0220.17bd51.006c.GAE@google.com/T/#u
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20241204141031.4030267-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: mscc: ocelot: be resilient to loss of PTP packets during transmission [+ + +]
Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Thu Dec 5 16:55:18 2024 +0200

    net: mscc: ocelot: be resilient to loss of PTP packets during transmission
    
    [ Upstream commit b454abfab52543c44b581afc807b9f97fc1e7a3a ]
    
    The Felix DSA driver presents unique challenges that make the simplistic
    ocelot PTP TX timestamping procedure unreliable: any transmitted packet
    may be lost in hardware before it ever leaves our local system.
    
    This may happen because there is congestion on the DSA conduit, the
    switch CPU port or even user port (Qdiscs like taprio may delay packets
    indefinitely by design).
    
    The technical problem is that the kernel, i.e. ocelot_port_add_txtstamp_skb(),
    runs out of timestamp IDs eventually, because it never detects that
    packets are lost, and keeps the IDs of the lost packets on hold
    indefinitely. The manifestation of the issue once the entire timestamp
    ID range becomes busy looks like this in dmesg:
    
    mscc_felix 0000:00:00.5: port 0 delivering skb without TX timestamp
    mscc_felix 0000:00:00.5: port 1 delivering skb without TX timestamp
    
    At the surface level, we need a timeout timer so that the kernel knows a
    timestamp ID is available again. But there is a deeper problem with the
    implementation, which is the monotonically increasing ocelot_port->ts_id.
    In the presence of packet loss, it will be impossible to detect that and
    reuse one of the holes created in the range of free timestamp IDs.
    
    What we actually need is a bitmap of 63 timestamp IDs tracking which one
    is available. That is able to use up holes caused by packet loss, but
    also gives us a unique opportunity to not implement an actual timer_list
    for the timeout timer (very complicated in terms of locking).
    
    We could only declare a timestamp ID stale on demand (lazily), aka when
    there's no other timestamp ID available. There are pros and cons to this
    approach: the implementation is much simpler than per-packet timers
    would be, but most of the stale packets would be quasi-leaked - not
    really leaked, but blocked in driver memory, since this algorithm sees
    no reason to free them.
    
    An improved technique would be to check for stale timestamp IDs every
    time we allocate a new one. Assuming a constant flux of PTP packets,
    this avoids stale packets being blocked in memory, but of course,
    packets lost at the end of the flux are still blocked until the flux
    resumes (nobody left to kick them out).
    
    Since implementing per-packet timers is way too complicated, this should
    be good enough.
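    
    The bitmap scheme described above can be sketched in userspace C as
    follows. This is a minimal illustration, not the driver's code: the
    struct, the function names, the tick-based clock and the STALE_AFTER
    value are all assumptions made for the example. Only the size of the
    ID range (63) comes from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and values, chosen for illustration only. */
#define MAX_TS_IDS   63   /* size of the ID range named in the text */
#define STALE_AFTER  5    /* assumed timeout, in arbitrary ticks */

struct ts_id_pool {
	uint64_t busy;                    /* bit n set => ID n is in flight */
	uint64_t alloc_time[MAX_TS_IDS];  /* tick at which ID n was handed out */
};

/* Release an ID once its TX timestamp has been collected. */
static void ts_id_release(struct ts_id_pool *pool, int id)
{
	assert(id >= 0 && id < MAX_TS_IDS);
	pool->busy &= ~(UINT64_C(1) << id);
}

/*
 * Allocate the lowest free ID. Before searching, reclaim every ID that
 * has been busy for STALE_AFTER ticks or more, on the assumption that
 * its packet was lost. Returns -1 if all IDs are busy and none is stale.
 */
static int ts_id_alloc(struct ts_id_pool *pool, uint64_t now)
{
	int id;

	for (id = 0; id < MAX_TS_IDS; id++) {
		if ((pool->busy & (UINT64_C(1) << id)) &&
		    now - pool->alloc_time[id] >= STALE_AFTER)
			ts_id_release(pool, id);
	}

	for (id = 0; id < MAX_TS_IDS; id++) {
		if (!(pool->busy & (UINT64_C(1) << id))) {
			pool->busy |= UINT64_C(1) << id;
			pool->alloc_time[id] = now;
			return id;
		}
	}

	return -1;
}
```

    With a port whose timestamps are never collected, allocations at
    ticks 70 to 74 return IDs 0 to 4, and the allocation at tick 75
    reclaims ID 0 and returns it again. A port that releases each ID
    promptly keeps getting ID 0.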
    
    Testing procedure:
    
    Persistently block traffic class 5 and try to run PTP on it:
    $ tc qdisc replace dev swp3 parent root taprio num_tc 8 \
            map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
            base-time 0 sched-entry S 0xdf 100000 flags 0x2
    [  126.948141] mscc_felix 0000:00:00.5: port 3 tc 5 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS
    $ ptp4l -i swp3 -2 -P -m --socket_priority 5 --fault_reset_interval ASAP --logSyncInterval -3
    ptp4l[70.351]: port 1 (swp3): INITIALIZING to LISTENING on INIT_COMPLETE
    ptp4l[70.354]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
    ptp4l[70.358]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
    [   70.394583] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    ptp4l[70.406]: timed out while polling for tx timestamp
    ptp4l[70.406]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
    ptp4l[70.406]: port 1 (swp3): send peer delay response failed
    ptp4l[70.407]: port 1 (swp3): clearing fault immediately
    ptp4l[70.952]: port 1 (swp3): new foreign master d858d7.fffe.00ca6d-1
    [   71.394858] mscc_felix 0000:00:00.5: port 3 timestamp id 1
    ptp4l[71.400]: timed out while polling for tx timestamp
    ptp4l[71.400]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
    ptp4l[71.401]: port 1 (swp3): send peer delay response failed
    ptp4l[71.401]: port 1 (swp3): clearing fault immediately
    [   72.393616] mscc_felix 0000:00:00.5: port 3 timestamp id 2
    ptp4l[72.401]: timed out while polling for tx timestamp
    ptp4l[72.402]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
    ptp4l[72.402]: port 1 (swp3): send peer delay response failed
    ptp4l[72.402]: port 1 (swp3): clearing fault immediately
    ptp4l[72.952]: port 1 (swp3): new foreign master d858d7.fffe.00ca6d-1
    [   73.395291] mscc_felix 0000:00:00.5: port 3 timestamp id 3
    ptp4l[73.400]: timed out while polling for tx timestamp
    ptp4l[73.400]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
    ptp4l[73.400]: port 1 (swp3): send peer delay response failed
    ptp4l[73.400]: port 1 (swp3): clearing fault immediately
    [   74.394282] mscc_felix 0000:00:00.5: port 3 timestamp id 4
    ptp4l[74.400]: timed out while polling for tx timestamp
    ptp4l[74.401]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
    ptp4l[74.401]: port 1 (swp3): send peer delay response failed
    ptp4l[74.401]: port 1 (swp3): clearing fault immediately
    ptp4l[74.953]: port 1 (swp3): new foreign master d858d7.fffe.00ca6d-1
    [   75.396830] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 0 which seems lost
    [   75.405760] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    ptp4l[75.410]: timed out while polling for tx timestamp
    ptp4l[75.411]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
    ptp4l[75.411]: port 1 (swp3): send peer delay response failed
    ptp4l[75.411]: port 1 (swp3): clearing fault immediately
    (...)
    
    Remove the blocking condition and see that the port recovers:
    $ same tc command as above, but use "sched-entry S 0xff" instead
    $ same ptp4l command as above
    ptp4l[99.489]: port 1 (swp3): INITIALIZING to LISTENING on INIT_COMPLETE
    ptp4l[99.490]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
    ptp4l[99.492]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
    [  100.403768] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 0 which seems lost
    [  100.412545] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 1 which seems lost
    [  100.421283] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 2 which seems lost
    [  100.430015] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 3 which seems lost
    [  100.438744] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 4 which seems lost
    [  100.447470] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  100.505919] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    ptp4l[100.963]: port 1 (swp3): new foreign master d858d7.fffe.00ca6d-1
    [  101.405077] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  101.507953] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  102.405405] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  102.509391] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  103.406003] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  103.510011] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  104.405601] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  104.510624] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    ptp4l[104.965]: selected best master clock d858d7.fffe.00ca6d
    ptp4l[104.966]: port 1 (swp3): assuming the grand master role
    ptp4l[104.967]: port 1 (swp3): LISTENING to GRAND_MASTER on RS_GRAND_MASTER
    [  105.106201] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  105.232420] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  105.359001] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  105.405500] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  105.485356] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  105.511220] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  105.610938] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    [  105.737237] mscc_felix 0000:00:00.5: port 3 timestamp id 0
    (...)
    
    Notice that in this new usage pattern, a non-congested port should
    basically use timestamp ID 0 all the time, progressing to higher numbers
    only if there are unacknowledged timestamps in flight. Compare this to
    the old usage, where the timestamp ID used to monotonically increase
    modulo OCELOT_MAX_PTP_ID.
    
    In terms of implementation, this simplifies the bookkeeping of the
    ocelot_port :: ts_id and ptp_skbs_in_flight. Since we need to traverse
    the list of two-step timestampable skbs for each new packet anyway, the
    information can already be computed and does not need to be stored.
    Also, ocelot_port->tx_skbs is always accessed under the switch-wide
    ocelot->ts_id_lock IRQ-unsafe spinlock, so we don't need the skb queue's
    lock and can use the unlocked primitives safely.
    
    This problem was actually detected using the tc-taprio offload, and is
    causing trouble in TSN scenarios, which Felix (NXP LS1028A / VSC9959)
    supports but Ocelot (VSC7514) does not. Thus, I've selected the commit
    to blame as the one adding initial timestamping support for the Felix
    switch.
    
    Fixes: c0bcf537667c ("net: dsa: ocelot: add hardware timestamping support for Felix")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Link: https://patch.msgid.link/20241205145519.1236778-5-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: mscc: ocelot: fix memory leak on ocelot_port_add_txtstamp_skb() [+ + +]
Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Thu Dec 5 16:55:15 2024 +0200

    net: mscc: ocelot: fix memory leak on ocelot_port_add_txtstamp_skb()
    
    [ Upstream commit 4b01bec25bef62544228bce06db6a3afa5d3d6bb ]
    
    If ocelot_port_add_txtstamp_skb() fails, for example due to a full PTP
    timestamp FIFO, we must undo the skb_clone_sk() call with kfree_skb().
    Otherwise, the reference to the skb clone is lost.
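    
    The error path can be sketched in userspace C as below. The packet
    type, the bounded FIFO, its depth and every function name are
    hypothetical stand-ins for the skb clone and the timestamp FIFO; the
    point is only that a failed enqueue must free the clone it was
    handed.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for an skb clone and a bounded timestamp FIFO. */
struct pkt { int len; };

#define FIFO_DEPTH 2            /* assumed depth, for illustration */

static struct pkt *fifo[FIFO_DEPTH];
static int fifo_used;
static int live_clones;         /* clones allocated and not yet freed */

static struct pkt *pkt_clone(const struct pkt *orig)
{
	struct pkt *clone = malloc(sizeof(*clone));

	if (clone) {
		*clone = *orig;
		live_clones++;
	}
	return clone;
}

static void pkt_free(struct pkt *p)
{
	free(p);
	live_clones--;
}

/* Fails with -1 when the FIFO is full; owns the clone on success. */
static int fifo_add(struct pkt *clone)
{
	if (fifo_used >= FIFO_DEPTH)
		return -1;
	fifo[fifo_used++] = clone;
	return 0;
}

/* Free everything the FIFO still owns. */
static void fifo_drain(void)
{
	while (fifo_used > 0)
		pkt_free(fifo[--fifo_used]);
}

/* Request a timestamp: clone, queue, and undo the clone if queueing fails. */
static int request_timestamp(const struct pkt *orig)
{
	struct pkt *clone = pkt_clone(orig);
	int err;

	if (!clone)
		return -1;

	err = fifo_add(clone);
	if (err) {
		pkt_free(clone);  /* without this, the clone would leak */
		return err;
	}
	return 0;
}
```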
    
    Fixes: 52849bcf0029 ("net: mscc: ocelot: avoid overflowing the PTP timestamp FIFO")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Link: https://patch.msgid.link/20241205145519.1236778-2-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: mscc: ocelot: improve handling of TX timestamp for unknown skb [+ + +]
Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Thu Dec 5 16:55:16 2024 +0200

    net: mscc: ocelot: improve handling of TX timestamp for unknown skb
    
    [ Upstream commit b6fba4b3f0becb794e274430f3a0839d8ba31262 ]
    
    This condition, theoretically impossible to trigger, is not really
    handled well. By "continuing", we are skipping the write to SYS_PTP_NXT
    which advances the timestamp FIFO to the next entry. So we are reading
    the same FIFO entry all over again, printing stack traces and eventually
    killing the kernel.
    
    No real problem has been observed here. This is part of a larger rework
    of the timestamp IRQ procedure, with this logical change split out into
    a patch of its own. We will need to "goto next_ts" for other conditions
    as well.
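    
    A small userspace model of the control flow, assuming a FIFO whose
    read pointer only moves when the code explicitly advances it. The
    struct, the helpers and the "even IDs are known" rule are invented
    for the example; fifo_next() plays the role of the SYS_PTP_NXT write.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a hardware timestamp FIFO with a read pointer. */
struct ts_fifo {
	const int *ids;   /* timestamp IDs queued by "hardware" */
	size_t len;
	size_t rd;        /* advanced only by fifo_next() */
};

static bool fifo_empty(const struct ts_fifo *f) { return f->rd >= f->len; }
static int fifo_peek(const struct ts_fifo *f) { return f->ids[f->rd]; }
static void fifo_next(struct ts_fifo *f) { f->rd++; }  /* SYS_PTP_NXT analogue */

/* Assumed lookup: only even IDs have a matching packet in this sketch. */
static bool id_is_known(int id) { return id % 2 == 0; }

/*
 * Drain the FIFO and return how many entries matched a known packet.
 * budget bounds the number of iterations so a bug cannot loop forever.
 */
static int drain(struct ts_fifo *f, int budget)
{
	int handled = 0;

	while (budget-- > 0 && !fifo_empty(f)) {
		int id = fifo_peek(f);

		if (!id_is_known(id))
			goto next_ts;  /* a bare "continue" would re-read this entry */

		handled++;
next_ts:
		fifo_next(f);
	}
	return handled;
}
```

    Replacing the goto with "continue" in this model makes the loop peek
    at the same unknown entry until the budget runs out, which is the
    behaviour the commit message describes.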
    
    Fixes: 9fde506e0c53 ("net: mscc: ocelot: warn when a PTP IRQ is raised for an unknown skb")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Link: https://patch.msgid.link/20241205145519.1236778-3-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: mscc: ocelot: ocelot->ts_id_lock and ocelot_port->tx_skbs.lock are IRQ-safe [+ + +]
Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Thu Dec 5 16:55:17 2024 +0200

    net: mscc: ocelot: ocelot->ts_id_lock and ocelot_port->tx_skbs.lock are IRQ-safe
    
    [ Upstream commit 0c53cdb95eb4a604062e326636971d96dd9b1b26 ]
    
    ocelot_get_txtstamp() is a threaded IRQ handler, requested explicitly as
    such by both ocelot_ptp_rdy_irq_handler() and vsc9959_irq_handler().
    
    As such, it runs with IRQs enabled, and not in hardirq context. Thus,
    ocelot_port_add_txtstamp_skb() has no reason to turn off IRQs, it cannot
    be preempted by ocelot_get_txtstamp(). For the same reason,
    dev_kfree_skb_any_reason() will always evaluate as kfree_skb_reason() in
    this calling context, so just simplify the dev_kfree_skb_any() call to
    kfree_skb().
    
    Also, ocelot_port_txtstamp_request() runs from NET_TX softirq context,
    not with hardirqs enabled. Thus, ocelot_get_txtstamp() which shares the
    ocelot_port->tx_skbs.lock lock with it, has no reason to disable hardirqs.
    
    This is part of a larger rework of the TX timestamping procedure.
    A logical subportion of the rework has been split into a separate
    change.
    
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Link: https://patch.msgid.link/20241205145519.1236778-4-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: b454abfab525 ("net: mscc: ocelot: be resilient to loss of PTP packets during transmission")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: mscc: ocelot: perform error cleanup in ocelot_hwstamp_set() [+ + +]
Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Thu Dec 5 16:55:19 2024 +0200

    net: mscc: ocelot: perform error cleanup in ocelot_hwstamp_set()
    
    [ Upstream commit 43a4166349a254446e7a3db65f721c6a30daccf3 ]
    
    An unsupported RX filter will leave the port with TX timestamping still
    applied as per the new request, rather than the old setting. When
    parsing the tx_type, don't apply it just yet, but delay that until after
    we've parsed the rx_filter as well (and potentially returned -ERANGE for
    that).
    
    Similarly, copy_to_user() may fail, which is a rare occurrence, but
    should still be treated by unwinding what was done.
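    
    The "validate everything, then apply, then unwind on failure" order
    can be sketched as follows. The enums, structs and the copy_fn
    callback (a stand-in for copy_to_user()) are assumptions for this
    example and do not mirror the driver's data structures.

```c
#include <errno.h>

/* Hypothetical config and port state; the enum values are illustrative. */
enum { TX_OFF, TX_ON, TX_ONESTEP, TX_MAX };
enum { RX_NONE, RX_PTP_V2, RX_MAX };

struct hwstamp_cfg { int tx_type; int rx_filter; };
struct port_state  { int tx_type; int rx_filter; };

/* Stand-in for copy_to_user(); returns nonzero on failure. */
typedef int (*copy_fn)(const struct hwstamp_cfg *cfg);

static int copy_ok(const struct hwstamp_cfg *cfg) { (void)cfg; return 0; }
static int copy_fail(const struct hwstamp_cfg *cfg) { (void)cfg; return 1; }

static int hwstamp_set(struct port_state *port, const struct hwstamp_cfg *cfg,
		       copy_fn copy_back)
{
	struct port_state old = *port;

	/* Parse and validate everything before touching the port. */
	if (cfg->tx_type < 0 || cfg->tx_type >= TX_MAX)
		return -ERANGE;
	if (cfg->rx_filter < 0 || cfg->rx_filter >= RX_MAX)
		return -ERANGE;

	/* Both fields are valid: apply them together. */
	port->tx_type = cfg->tx_type;
	port->rx_filter = cfg->rx_filter;

	/* Reporting back can still fail; unwind to the old setting. */
	if (copy_back(cfg)) {
		*port = old;
		return -EFAULT;
	}
	return 0;
}
```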
    
    Fixes: 96ca08c05838 ("net: mscc: ocelot: set up traps for PTP packets")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Link: https://patch.msgid.link/20241205145519.1236778-6-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: sparx5: fix FDMA performance issue [+ + +]
Author: Daniel Machon <daniel.machon@microchip.com>
Date:   Thu Dec 5 14:54:26 2024 +0100

    net: sparx5: fix FDMA performance issue
    
    [ Upstream commit f004f2e535e2b66ccbf5ac35f8eaadeac70ad7b7 ]
    
    The FDMA handler is responsible for scheduling a NAPI poll, which will
    eventually fetch RX packets from the FDMA queue. Currently, the FDMA
    handler is run in a threaded context. For some reason, this kills
    performance. Admittedly, I did not do a thorough investigation to see
    exactly what causes the issue; however, I noticed that in the other
    driver utilizing the same FDMA engine, we run the FDMA handler in hard
    IRQ context.
    
    Fix this performance issue by running the FDMA handler in hard IRQ
    context, not deferring any work to a thread.
    
    Prior to this change, the RX UDP performance was:
    
    Interval           Transfer     Bitrate         Jitter
    0.00-10.20  sec    44.6 MBytes  36.7 Mbits/sec  0.027 ms
    
    After this change, the RX UDP performance is:
    
    Interval           Transfer     Bitrate         Jitter
    0.00-9.12   sec    1.01 GBytes  953 Mbits/sec   0.020 ms
    
    Fixes: 10615907e9b5 ("net: sparx5: switchdev: adding frame DMA functionality")
    Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: sparx5: fix the maximum frame length register [+ + +]
Author: Daniel Machon <daniel.machon@microchip.com>
Date:   Thu Dec 5 14:54:28 2024 +0100

    net: sparx5: fix the maximum frame length register
    
    [ Upstream commit ddd7ba006078a2bef5971b2dc5f8383d47f96207 ]
    
    On port initialization, we configure the maximum frame length accepted
    by the receive module associated with the port. This value is currently
    written to the MAX_LEN field of the DEV10G_MAC_ENA_CFG register, when in
    fact, it should be written to the DEV10G_MAC_MAXLEN_CFG register. Fix
    this.
    
    Fixes: 946e7fd5053a ("net: sparx5: add port module support")
    Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
objtool/x86: allow syscall instruction [+ + +]
Author: Juergen Gross <jgross@suse.com>
Date:   Fri Nov 29 15:47:49 2024 +0100

    objtool/x86: allow syscall instruction
    
    commit dda014ba59331dee4f3b773a020e109932f4bd24 upstream.
    
    The syscall instruction is used in Xen PV mode for doing hypercalls.
    Allow syscall to be used in the kernel in case it is tagged with an
    unwind hint for objtool.
    
    This is part of XSA-466 / CVE-2024-53241.
    
    Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Co-developed-by: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
ptp: kvm: Use decrypted memory in confidential guest on x86 [+ + +]
Author: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
Date:   Wed Mar 8 15:05:31 2023 +0000

    ptp: kvm: Use decrypted memory in confidential guest on x86
    
    [ Upstream commit 6365ba64b4dbe8b59ddaeaa724b281f3787715d5 ]
    
    KVM_HC_CLOCK_PAIRING currently fails inside SEV-SNP guests because the
    guest passes an address to static data to the host. In confidential
    computing the host can't access arbitrary guest memory so handling the
    hypercall runs into an "rmpfault". To make the hypercall work, the guest
    needs to explicitly mark the memory as decrypted. Do that in
    kvm_arch_ptp_init(), but retain the previous behavior for
    non-confidential guests to save us from having to allocate memory.
    
    Add a new arch-specific function (kvm_arch_ptp_exit()) to free the
    allocation and mark the memory as encrypted again.
    
    Signed-off-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
    Link: https://lore.kernel.org/r/20230308150531.477741-1-jpiotrowski@linux.microsoft.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 5e7aa97c7acf ("ptp: kvm: x86: Return EOPNOTSUPP instead of ENODEV from kvm_arch_ptp_init()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ptp: kvm: x86: Return EOPNOTSUPP instead of ENODEV from kvm_arch_ptp_init() [+ + +]
Author: Thomas Weißschuh <linux@weissschuh.net>
Date:   Tue Dec 3 18:09:55 2024 +0100

    ptp: kvm: x86: Return EOPNOTSUPP instead of ENODEV from kvm_arch_ptp_init()
    
    [ Upstream commit 5e7aa97c7acf171275ac02a8bb018c31b8918d13 ]
    
    The caller, ptp_kvm_init(), emits a warning if kvm_arch_ptp_init() exits
    with any error which is not EOPNOTSUPP:
    
            "fail to initialize ptp_kvm"
    
    Replace ENODEV with EOPNOTSUPP to avoid this spurious warning,
    aligning with the ARM implementation.
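    
    The caller-side check can be modelled in userspace C as below. The
    function names and the warnings counter are hypothetical; only the
    warning text and the two error codes come from the description above.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical arch hooks that report the feature as unavailable. */
static int arch_init_unsupported(void) { return -EOPNOTSUPP; }
static int arch_init_enodev(void) { return -ENODEV; }

static int warnings;  /* counts how often the warning was printed */

/* Warn on any failure except "not supported", then pass the error on. */
static int generic_init(int (*arch_init)(void))
{
	int ret = arch_init();

	if (ret && ret != -EOPNOTSUPP) {
		fprintf(stderr, "fail to initialize ptp_kvm\n");
		warnings++;
	}
	return ret;
}
```

    In this model, an arch hook returning -ENODEV triggers the warning,
    while one returning -EOPNOTSUPP fails quietly.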
    
    Fixes: a86ed2cfa13c ("ptp: Don't print an error if ptp_kvm is not supported")
    Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
    Link: https://patch.msgid.link/20241203-kvm_ptp-eopnotsuppp-v2-1-d1d060f27aa6@weissschuh.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
qca_spi: Fix clock speed for multiple QCA7000 [+ + +]
Author: Stefan Wahren <wahrenst@gmx.net>
Date:   Fri Dec 6 19:46:42 2024 +0100

    qca_spi: Fix clock speed for multiple QCA7000
    
    [ Upstream commit 4dba406fac06b009873fe7a28231b9b7e4288b09 ]
    
    Storing the maximum clock speed in module parameter qcaspi_clkspeed
    has the unintended side effect that the first probed instance
    defines the value for all other instances. Fix this issue by storing
    it in max_speed_hz of the relevant SPI device.
    
    This fix keeps the priority of the speed parameter (module parameter,
    device tree property, driver default). It also takes the opportunity
    to get rid of the unused member clkspeed.
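    
    The priority order can be sketched per device as follows. The struct,
    the function name, the default value and the convention that 0 means
    "not set" are assumptions of this sketch; only max_speed_hz and the
    ordering come from the text.

```c
/* Hypothetical per-device state; only max_speed_hz comes from the text. */
struct spi_dev { unsigned int max_speed_hz; };

#define DEFAULT_SPEED_HZ 8000000u  /* assumed driver default */

/*
 * Choose the clock for one device. module_param_hz and dt_hz use 0 to
 * mean "not set" (an assumption of this sketch). Priority: module
 * parameter, then device tree property, then driver default. The
 * result is stored in the device, never written back to the parameter.
 */
static void set_speed(struct spi_dev *dev, unsigned int module_param_hz,
		      unsigned int dt_hz)
{
	if (module_param_hz)
		dev->max_speed_hz = module_param_hz;
	else if (dt_hz)
		dev->max_speed_hz = dt_hz;
	else
		dev->max_speed_hz = DEFAULT_SPEED_HZ;
}
```

    Because nothing is written back to the shared parameter, two devices
    with different device tree values each keep their own speed.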
    
    Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
    Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
    Link: https://patch.msgid.link/20241206184643.123399-2-wahrenst@gmx.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

qca_spi: Make driver probing reliable [+ + +]
Author: Stefan Wahren <wahrenst@gmx.net>
Date:   Fri Dec 6 19:46:43 2024 +0100

    qca_spi: Make driver probing reliable
    
    [ Upstream commit becc6399ce3b724cffe9ccb7ef0bff440bb1b62b ]
    
    The module parameter qcaspi_pluggable controls whether the QCA7000
    signature should be checked at driver probe (current default) or not.
    Unfortunately this could fail in case the chip is temporarily in reset,
    which isn't under total control of the Linux host. So disable this
    check by default in order to avoid unexpected probe failures.
    
    Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
    Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
    Link: https://patch.msgid.link/20241206184643.123399-3-wahrenst@gmx.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
selftests: mlxsw: sharedbuffer: Ensure no extra packets are counted [+ + +]
Author: Danielle Ratson <danieller@nvidia.com>
Date:   Thu Dec 5 17:36:01 2024 +0100

    selftests: mlxsw: sharedbuffer: Ensure no extra packets are counted
    
    [ Upstream commit 5f2c7ab15fd806043db1a7d54b5ec36be0bd93b1 ]
    
    The test assumes that the packet it is sending is the only packet being
    passed to the device.
    
    However, this is not the case, so other packets are filling the buffers
    as well. Therefore, the test sometimes fails because it is reading a
    maximum occupancy that is larger than expected.
    
    Add egress filters on $h1 and $h2 that will guarantee the above.
    
    Fixes: a865ad999603 ("selftests: mlxsw: Add shared buffer traffic test")
    Signed-off-by: Danielle Ratson <danieller@nvidia.com>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Link: https://patch.msgid.link/64c28bc9b1cc1d78c4a73feda7cedbe9526ccf8b.1733414773.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: mlxsw: sharedbuffer: Remove duplicate test cases [+ + +]
Author: Danielle Ratson <danieller@nvidia.com>
Date:   Thu Dec 5 17:36:00 2024 +0100

    selftests: mlxsw: sharedbuffer: Remove duplicate test cases
    
    [ Upstream commit 6c46ad4d1bb2e8ec2265296e53765190f6e32f33 ]
    
    On both port_tc_ip_test() and port_tc_arp_test(), the max occupancy is
    checked on $h2 twice, when only the error message is different and does not
    match the check itself.
    
    Remove the two duplicated test cases from the test.
    
    Fixes: a865ad999603 ("selftests: mlxsw: Add shared buffer traffic test")
    Signed-off-by: Danielle Ratson <danieller@nvidia.com>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Link: https://patch.msgid.link/d9eb26f6fc16a06a30b5c2c16ad80caf502bc561.1733414773.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: mlxsw: sharedbuffer: Remove h1 ingress test case [+ + +]
Author: Danielle Ratson <danieller@nvidia.com>
Date:   Thu Dec 5 17:35:59 2024 +0100

    selftests: mlxsw: sharedbuffer: Remove h1 ingress test case
    
    [ Upstream commit cf3515c556907b4da290967a2a6cbbd9ee0ee723 ]
    
    The test is sending only one packet generated with mausezahn from $h1 to
    $h2. However, for some reason, it is testing for non-zero maximum occupancy
    in both the ingress pool of $h1 and $h2. The former only passes when $h2
    happens to send a packet.
    
    Avoid intermittent failures by removing the unintentional test case
    regarding the ingress pool of $h1.
    
    Fixes: a865ad999603 ("selftests: mlxsw: Add shared buffer traffic test")
    Signed-off-by: Danielle Ratson <danieller@nvidia.com>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Link: https://patch.msgid.link/5b7344608d5e06f38209e48d8af8c92fa11b6742.1733414773.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
smb: client: fix UAF in smb2_reconnect_server() [+ + +]
Author: Paulo Alcantara <pc@manguebit.com>
Date:   Mon Apr 1 14:13:10 2024 -0300

    smb: client: fix UAF in smb2_reconnect_server()
    
    commit 24a9799aa8efecd0eb55a75e35f9d8e6400063aa upstream.
    
    The UAF bug is due to smb2_reconnect_server() accessing a session that
    is already being torn down by another thread that is executing
    __cifs_put_smb_ses(). This can happen when (a) the client has a
    connection to the server but no session or (b) another thread ends up
    setting @ses->ses_status again to something different from
    SES_EXITING.
    
    To fix this, we need to make sure to unconditionally set
    @ses->ses_status to SES_EXITING and prevent any other threads from
    setting a new status while we're still tearing it down.
    
    The following can be reproduced by adding some delay right after
    the ipc is freed in __cifs_put_smb_ses(), which gives the
    smb2_reconnect_server() worker a chance to run and then access
    @ses->ipc:
    
    kinit ...
    mount.cifs //srv/share /mnt/1 -o sec=krb5,nohandlecache,echo_interval=10
    [disconnect srv]
    ls /mnt/1 &>/dev/null
    sleep 30
    kdestroy
    [reconnect srv]
    sleep 10
    umount /mnt/1
    ...
    CIFS: VFS: Verify user has a krb5 ticket and keyutils is installed
    CIFS: VFS: \\srv Send error in SessSetup = -126
    CIFS: VFS: Verify user has a krb5 ticket and keyutils is installed
    CIFS: VFS: \\srv Send error in SessSetup = -126
    general protection fault, probably for non-canonical address
    0x6b6b6b6b6b6b6b6b: 0000 [#1] PREEMPT SMP NOPTI
    CPU: 3 PID: 50 Comm: kworker/3:1 Not tainted 6.9.0-rc2 #1
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-1.fc39
    04/01/2014
    Workqueue: cifsiod smb2_reconnect_server [cifs]
    RIP: 0010:__list_del_entry_valid_or_report+0x33/0xf0
    Code: 4f 08 48 85 d2 74 42 48 85 c9 74 59 48 b8 00 01 00 00 00 00 ad
    de 48 39 c2 74 61 48 b8 22 01 00 00 00 00 74 69 <48> 8b 01 48 39 f8 75
    7b 48 8b 72 08 48 39 c6 0f 85 88 00 00 00 b8
    RSP: 0018:ffffc900001bfd70 EFLAGS: 00010a83
    RAX: dead000000000122 RBX: ffff88810da53838 RCX: 6b6b6b6b6b6b6b6b
    RDX: 6b6b6b6b6b6b6b6b RSI: ffffffffc02f6878 RDI: ffff88810da53800
    RBP: ffff88810da53800 R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000001 R12: ffff88810c064000
    R13: 0000000000000001 R14: ffff88810c064000 R15: ffff8881039cc000
    FS: 0000000000000000(0000) GS:ffff888157c00000(0000)
    knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fe3728b1000 CR3: 000000010caa4000 CR4: 0000000000750ef0
    PKRU: 55555554
    Call Trace:
     <TASK>
     ? die_addr+0x36/0x90
     ? exc_general_protection+0x1c1/0x3f0
     ? asm_exc_general_protection+0x26/0x30
     ? __list_del_entry_valid_or_report+0x33/0xf0
     __cifs_put_smb_ses+0x1ae/0x500 [cifs]
     smb2_reconnect_server+0x4ed/0x710 [cifs]
     process_one_work+0x205/0x6b0
     worker_thread+0x191/0x360
     ? __pfx_worker_thread+0x10/0x10
     kthread+0xe2/0x110
     ? __pfx_kthread+0x10/0x10
     ret_from_fork+0x34/0x50
     ? __pfx_kthread+0x10/0x10
     ret_from_fork_asm+0x1a/0x30
     </TASK>
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    [ Michael Krause: Naive, manual merge because the 3rd hunk would not apply. ]
    Signed-off-by: Michael Krause <mk-debian@galax.is>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
spi: aspeed: Fix an error handling path in aspeed_spi_[read|write]_user() [+ + +]
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Tue Nov 19 22:30:29 2024 +0100

    spi: aspeed: Fix an error handling path in aspeed_spi_[read|write]_user()
    
    [ Upstream commit c84dda3751e945a67d71cbe3af4474aad24a5794 ]
    
    A call to aspeed_spi_start_user() is not balanced by a corresponding
    call to aspeed_spi_stop_user() in the error handling path.
    Add the missing call.
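    
    The balancing pattern can be sketched in userspace C as below. The
    depth counter, the simplified function names and the fail flag are
    hypothetical; the sketch only shows that both the success and the
    error path leave through the same stop call.

```c
/* Hypothetical counter: nonzero means the controller is in user mode. */
static int user_mode_depth;

static void start_user(void) { user_mode_depth++; }
static void stop_user(void) { user_mode_depth--; }

/* Stand-in for sending the command and address; fails on request. */
static int send_cmd(int fail) { return fail ? -1 : 0; }

static int read_user(int fail)
{
	int ret;

	start_user();

	ret = send_cmd(fail);
	if (ret)
		goto stop;  /* returning here directly would skip stop_user() */

	/* ... the data transfer would happen here ... */
stop:
	stop_user();
	return ret;
}
```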
    
    Fixes: e3228ed92893 ("spi: spi-mem: Convert Aspeed SMC driver to spi-mem")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Link: https://patch.msgid.link/4052aa2f9a9ea342fa6af83fa991b55ce5d5819e.1732051814.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
tcp: check space before adding MPTCP SYN options [+ + +]
Author: MoYuanhao <moyuanhao3676@163.com>
Date:   Mon Dec 9 13:28:14 2024 +0100

    tcp: check space before adding MPTCP SYN options
    
    commit 06d64ab46f19ac12f59a1d2aa8cd196b2e4edb5b upstream.
    
    Ensure there is enough space before adding MPTCP options in
    tcp_syn_options().
    
    Without this check, 'remaining' could underflow and cause issues. If
    there is not enough space, MPTCP should not be used.
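    
    A minimal sketch of the space check on an unsigned counter. The
    helper name is hypothetical; the 40-byte limit is the maximum TCP
    option space allowed by the header format.

```c
#include <stdbool.h>

#define MAX_OPTION_SPACE 40u  /* TCP header allows at most 40 bytes of options */

/*
 * Try to reserve size bytes of option space. Returns true and reduces
 * *remaining only if the option fits; otherwise leaves it untouched.
 */
static bool reserve_option(unsigned int *remaining, unsigned int size)
{
	if (*remaining < size)
		return false;  /* skipping the option avoids unsigned wraparound */
	*remaining -= size;
	return true;
}
```

    Without the comparison, subtracting a larger size from a smaller
    unsigned 'remaining' wraps around to a huge value, so later options
    would appear to fit when they do not.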
    
    Signed-off-by: MoYuanhao <moyuanhao3676@163.com>
    Fixes: cec37a6e41aa ("mptcp: Handle MP_CAPABLE options for outgoing connections")
    Cc: stable@vger.kernel.org
    Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    [ Matt: Add Fixes, cc Stable, update Description ]
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/20241209-net-mptcp-check-space-syn-v1-1-2da992bb6f74@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
team: Fix feature propagation of NETIF_F_GSO_ENCAP_ALL [+ + +]
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Tue Dec 10 15:12:45 2024 +0100

    team: Fix feature propagation of NETIF_F_GSO_ENCAP_ALL
    
    [ Upstream commit 98712844589e06d9aa305b5077169942139fd75c ]
    
    Similar to the bonding driver, add NETIF_F_GSO_ENCAP_ALL to TEAM_VLAN_FEATURES
    in order to support slave devices which propagate NETIF_F_GSO_UDP_TUNNEL &
    NETIF_F_GSO_UDP_TUNNEL_CSUM as vlan_features.
    
    Fixes: 3625920b62c3 ("teaming: fix vlan_features computing")
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Cc: Nikolay Aleksandrov <razor@blackwall.org>
    Cc: Ido Schimmel <idosch@idosch.org>
    Cc: Jiri Pirko <jiri@nvidia.com>
    Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
    Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
    Link: https://patch.msgid.link/20241210141245.327886-5-daniel@iogearbox.net
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
tipc: fix NULL deref in cleanup_bearer() [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Dec 4 17:05:48 2024 +0000

    tipc: fix NULL deref in cleanup_bearer()
    
    [ Upstream commit b04d86fff66b15c07505d226431f808c15b1703c ]
    
    syzbot found [1] that after the blamed commit, ub->ubsock->sk
    was NULL when attempting the atomic_dec():
    
    atomic_dec(&tipc_net(sock_net(ub->ubsock->sk))->wq_count);
    
    Fix this by caching the tipc_net pointer.
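    
    The caching idea can be sketched in userspace C as follows. The three
    structs and both functions are heavily simplified, hypothetical
    stand-ins; the sketch only shows reading the pointer before the
    release and using the cached copy afterwards.

```c
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-ins for the objects in the text. */
struct net_state { int wq_count; };
struct sock_obj  { struct net_state *net; };
struct bearer    { struct sock_obj *sock; };

/* Releasing the socket clears the pointer the cleanup path used to follow. */
static void sock_release(struct bearer *b)
{
	free(b->sock);
	b->sock = NULL;
}

static void cleanup(struct bearer *b)
{
	/* Cache the pointer while the socket is still valid. */
	struct net_state *net = b->sock->net;

	sock_release(b);

	/* b->sock is gone here; only the cached pointer is safe to use. */
	net->wq_count--;
}
```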
    
    [1]
    
    Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] PREEMPT SMP KASAN PTI
    KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
    CPU: 0 UID: 0 PID: 5896 Comm: kworker/0:3 Not tainted 6.13.0-rc1-next-20241203-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
    Workqueue: events cleanup_bearer
     RIP: 0010:read_pnet include/net/net_namespace.h:387 [inline]
     RIP: 0010:sock_net include/net/sock.h:655 [inline]
     RIP: 0010:cleanup_bearer+0x1f7/0x280 net/tipc/udp_media.c:820
    Code: 18 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 3c f7 99 f6 48 8b 1b 48 83 c3 30 e8 f0 e4 60 00 48 89 d8 48 c1 e8 03 <42> 80 3c 28 00 74 08 48 89 df e8 1a f7 99 f6 49 83 c7 e8 48 8b 1b
    RSP: 0018:ffffc9000410fb70 EFLAGS: 00010206
    RAX: 0000000000000006 RBX: 0000000000000030 RCX: ffff88802fe45a00
    RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffc9000410f900
    RBP: ffff88807e1f0908 R08: ffffc9000410f907 R09: 1ffff92000821f20
    R10: dffffc0000000000 R11: fffff52000821f21 R12: ffff888031d19980
    R13: dffffc0000000000 R14: dffffc0000000000 R15: ffff88807e1f0918
    FS:  0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000556ca050b000 CR3: 0000000031c0c000 CR4: 00000000003526f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    
    Fixes: 6a2fa13312e5 ("tipc: Fix use-after-free of kernel socket in cleanup_bearer().")
    Reported-by: syzbot+46aa5474f179dacd1a3b@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/netdev/67508b5f.050a0220.17bd51.0070.GAE@google.com/T/#u
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Link: https://patch.msgid.link/20241204170548.4152658-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
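
The "cache the pointer" fix described above can be sketched in plain C. This is a minimal illustration, not the kernel code: every type and function ending in _sketch, and release_socket(), are hypothetical stand-ins, and the sketch assumes that the teardown step is what leaves sk set to NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct tipc_net_sketch { int wq_count; };
struct sock_sketch     { struct tipc_net_sketch *tn; };
struct socket_sketch   { struct sock_sketch *sk; };
struct bearer_sketch   { struct socket_sketch *ubsock; };

/* Stand-in for the teardown step; assumed to leave sock->sk NULL. */
static void release_socket(struct socket_sketch *sock)
{
    sock->sk = NULL;
}

static void cleanup_bearer_sketch(struct bearer_sketch *ub)
{
    struct tipc_net_sketch *tn;

    assert(ub && ub->ubsock && ub->ubsock->sk);

    /* Cache the per-namespace pointer while sk is still valid. */
    tn = ub->ubsock->sk->tn;

    release_socket(ub->ubsock);   /* ub->ubsock->sk is NULL from here on */

    /* Safe: the decrement no longer dereferences ub->ubsock->sk. */
    tn->wq_count--;
}
```

Reading tn through ub->ubsock->sk after release_socket() would dereference NULL in this sketch, which mirrors the crash in the report.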

 
tracing/kprobes: Skip symbol counting logic for module symbols in create_local_trace_kprobe() [+ + +]
Author: Nikolay Kuratov <kniv@yandex-team.ru>
Date:   Mon Dec 16 14:19:23 2024 +0300

    tracing/kprobes: Skip symbol counting logic for module symbols in create_local_trace_kprobe()
    
    commit b022f0c7e404 ("tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols")
    avoids checking number_of_same_symbols() for module symbols in
    __trace_kprobe_create(), but create_local_trace_kprobe() should avoid this
    check too. Doing this check leads to ENOENT for module_name:symbol_name
    constructions passed over perf_event_open.
    
    No bug in newer kernels as it was fixed more generally by
    commit 9d8616034f16 ("tracing/kprobes: Add symbol counting check when module loads")
    
    Link: https://lore.kernel.org/linux-trace-kernel/20240705161030.b3ddb33a8167013b9b1da202@kernel.org
    Fixes: b022f0c7e404 ("tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols")
    Signed-off-by: Nikolay Kuratov <kniv@yandex-team.ru>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
usb: dwc2: Fix HCD port connection race [+ + +]
Author: Stefan Wahren <wahrenst@gmx.net>
Date:   Mon Dec 2 01:16:31 2024 +0100

    usb: dwc2: Fix HCD port connection race
    
    commit 1cf1bd88f129f3bd647fead4dca270a5894274bb upstream.
    
    On Raspberry Pis without an onboard USB hub, frequent device reconnects
    can trigger an interrupt storm after DWC2 has entered host clock gating.
    This is caused by a race between _dwc2_hcd_suspend() and the port
    interrupt, which sets port_connect_status. The issue occurs if
    port_connect_status is still 1, but there is no connection anymore:
    
    usb 1-1: USB disconnect, device number 25
    dwc2 3f980000.usb: _dwc2_hcd_suspend: port_connect_status: 1
    dwc2 3f980000.usb: Entering host clock gating.
    Disabling IRQ #66
    irq 66: nobody cared (try booting with the "irqpoll" option)
    CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.0-gc1bb81b13202-dirty #322
    Hardware name: BCM2835
    Call trace:
     unwind_backtrace from show_stack+0x10/0x14
     show_stack from dump_stack_lvl+0x50/0x64
     dump_stack_lvl from __report_bad_irq+0x38/0xc0
     __report_bad_irq from note_interrupt+0x2ac/0x2f4
     note_interrupt from handle_irq_event+0x88/0x8c
     handle_irq_event from handle_level_irq+0xb4/0x1ac
     handle_level_irq from generic_handle_domain_irq+0x24/0x34
     generic_handle_domain_irq from bcm2836_chained_handle_irq+0x24/0x28
     bcm2836_chained_handle_irq from generic_handle_domain_irq+0x24/0x34
     generic_handle_domain_irq from generic_handle_arch_irq+0x34/0x44
     generic_handle_arch_irq from __irq_svc+0x88/0xb0
     Exception stack(0xc1d01f20 to 0xc1d01f68)
     1f20: 0004ef3c 00000001 00000000 00000000 c1d09780 c1f6bb5c c1d04e54 c1c60ca8
     1f40: c1d04e94 00000000 00000000 c1d092a8 c1f6af20 c1d01f70 c1211b98 c1212f40
     1f60: 60000013 ffffffff
     __irq_svc from default_idle_call+0x1c/0xb0
     default_idle_call from do_idle+0x21c/0x284
     do_idle from cpu_startup_entry+0x28/0x2c
     cpu_startup_entry from kernel_init+0x0/0x12c
    handlers:
     [<e3a25c00>] dwc2_handle_common_intr
     [<58bf98a3>] usb_hcd_irq
    Disabling IRQ #66
    
    So avoid this by reading the connection status directly.
    
    Fixes: 113f86d0c302 ("usb: dwc2: Update partial power down entering by system suspend")
    Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
    Link: https://lore.kernel.org/r/20241202001631.75473-4-wahrenst@gmx.net
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: dwc2: Fix HCD resume [+ + +]
Author: Stefan Wahren <wahrenst@gmx.net>
Date:   Mon Dec 2 01:16:29 2024 +0100

    usb: dwc2: Fix HCD resume
    
    commit 336f72d3cbf5cc17df2947bbbd2ba6e2509f17e8 upstream.
    
    The Raspberry Pi can suffer from interrupt storms on HCD resume. The dwc2
    driver sometimes fails to set HCD_FLAG_HW_ACCESSIBLE before re-enabling
    the interrupts. This causes a situation where both handlers ignore an
    incoming port interrupt and force the upper layers to disable the dwc2
    interrupt line. This leaves the USB interface in an unusable state:
    
    irq 66: nobody cared (try booting with the "irqpoll" option)
    CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W          6.10.0-rc3
    Hardware name: BCM2835
    Call trace:
    unwind_backtrace from show_stack+0x10/0x14
    show_stack from dump_stack_lvl+0x50/0x64
    dump_stack_lvl from __report_bad_irq+0x38/0xc0
    __report_bad_irq from note_interrupt+0x2ac/0x2f4
    note_interrupt from handle_irq_event+0x88/0x8c
    handle_irq_event from handle_level_irq+0xb4/0x1ac
    handle_level_irq from generic_handle_domain_irq+0x24/0x34
    generic_handle_domain_irq from bcm2836_chained_handle_irq+0x24/0x28
    bcm2836_chained_handle_irq from generic_handle_domain_irq+0x24/0x34
    generic_handle_domain_irq from generic_handle_arch_irq+0x34/0x44
    generic_handle_arch_irq from __irq_svc+0x88/0xb0
    Exception stack(0xc1b01f20 to 0xc1b01f68)
    1f20: 0005c0d4 00000001 00000000 00000000 c1b09780 c1d6b32c c1b04e54 c1a5eae8
    1f40: c1b04e90 00000000 00000000 00000000 c1d6a8a0 c1b01f70 c11d2da8 c11d4160
    1f60: 60000013 ffffffff
    __irq_svc from default_idle_call+0x1c/0xb0
    default_idle_call from do_idle+0x21c/0x284
    do_idle from cpu_startup_entry+0x28/0x2c
    cpu_startup_entry from kernel_init+0x0/0x12c
    handlers:
    [<f539e0f4>] dwc2_handle_common_intr
    [<75cd278b>] usb_hcd_irq
    Disabling IRQ #66
    
    So enable the HCD_FLAG_HW_ACCESSIBLE flag in case there is a port
    connection.
    
    Fixes: c74c26f6e398 ("usb: dwc2: Fix partial power down exiting by system resume")
    Closes: https://lore.kernel.org/linux-usb/3fd0c2fb-4752-45b3-94eb-42352703e1fd@gmx.net/T/
    Link: https://lore.kernel.org/all/5e8cbce0-3260-2971-484f-fc73a3b2bd28@synopsys.com/
    Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
    Link: https://lore.kernel.org/r/20241202001631.75473-2-wahrenst@gmx.net
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: dwc2: hcd: Fix GetPortStatus & SetPortFeature [+ + +]
Author: Stefan Wahren <wahrenst@gmx.net>
Date:   Mon Dec 2 01:16:30 2024 +0100

    usb: dwc2: hcd: Fix GetPortStatus & SetPortFeature
    
    commit a8d3e4a734599c7d0f6735f8db8a812e503395dd upstream.
    
    On Raspberry Pis without an onboard USB hub, the power cycle during
    power connect init only disables the port but never enables it again:
    
      usb usb1-port1: attempt power cycle
    
    The port relevant part in dwc2_hcd_hub_control() is skipped in case
    port_connect_status = 0 under the assumption the core is or will be soon
    in device mode. But this assumption is wrong, because after ClearPortFeature
    USB_PORT_FEAT_POWER the port_connect_status will also be 0 and
    SetPortFeature (incl. USB_PORT_FEAT_POWER) will be a no-op.
    
    Fix the behavior of dwc2_hcd_hub_control() by replacing the
    port_connect_status check with dwc2_is_device_mode().
    
    Link: https://github.com/raspberrypi/linux/issues/6247
    Fixes: 7359d482eb4d ("staging: HCD files for the DWC2 driver")
    Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
    Link: https://lore.kernel.org/r/20241202001631.75473-3-wahrenst@gmx.net
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: dwc3: xilinx: make sure pipe clock is deselected in usb2 only mode [+ + +]
Author: Neal Frager <neal.frager@amd.com>
Date:   Mon Dec 2 23:41:51 2024 +0530

    usb: dwc3: xilinx: make sure pipe clock is deselected in usb2 only mode
    
    commit a48f744bef9ee74814a9eccb030b02223e48c76c upstream.
    
    When the USB3 PHY is not defined in the Linux device tree, there could
    still be a case where there is a USB3 PHY active on the board and enabled
    by the first stage bootloader. If serdes clock is being used then the USB
    will fail to enumerate devices in 2.0 only mode.
    
    To solve this, make sure that the PIPE clock is deselected whenever the
    USB3 PHY is not defined. This guarantees that USB2-only mode will work
    in all cases.
    
    Fixes: 9678f3361afc ("usb: dwc3: xilinx: Skip resets and USB3 register settings for USB2.0 mode")
    Cc: stable@vger.kernel.org
    Signed-off-by: Neal Frager <neal.frager@amd.com>
    Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
    Acked-by: Peter Korsgaard <peter@korsgaard.com>
    Link: https://lore.kernel.org/r/1733163111-1414816-1-git-send-email-radhey.shyam.pandey@amd.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: ehci-hcd: fix call balance of clocks handling routines [+ + +]
Author: Vitalii Mordan <mordan@ispras.ru>
Date:   Thu Nov 21 14:47:00 2024 +0300

    usb: ehci-hcd: fix call balance of clocks handling routines
    
    commit 97264eaaba0122a5b7e8ddd7bf4ff3ac57c2b170 upstream.
    
    If the clocks priv->iclk and priv->fclk were not enabled in ehci_hcd_sh_probe,
    they should not be disabled in any path.
    
    Conversely, if they were enabled in ehci_hcd_sh_probe, they must be disabled
    in all error paths to ensure proper cleanup.
    
    Found by Linux Verification Center (linuxtesting.org) with Klever.
    
    Fixes: 63c845522263 ("usb: ehci-hcd: Add support for SuperH EHCI.")
    Cc: stable@vger.kernel.org # ff30bd6a6618: sh: clk: Fix clk_enable() to return 0 on NULL clk
    Signed-off-by: Vitalii Mordan <mordan@ispras.ru>
    Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
    Link: https://lore.kernel.org/r/20241121114700.2100520-1-mordan@ispras.ru
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
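
The enable/disable balance rule above can be illustrated with a small sketch. It assumes a goto-based unwind, which is a common kernel idiom but is not claimed to be the exact shape of the patch; clk_sketch, clk_enable_sketch(), clk_disable_sketch(), probe_sketch() and the fail_step parameter are all hypothetical.

```c
#include <assert.h>

/* Hypothetical clock with a counter so that balance can be checked. */
struct clk_sketch { int enable_count; };

static int clk_enable_sketch(struct clk_sketch *c)
{
    c->enable_count++;
    return 0;
}

static void clk_disable_sketch(struct clk_sketch *c)
{
    c->enable_count--;
}

/*
 * fail_step selects where the probe fails (illustration only):
 *   1 = before any clock is enabled, 2 = after both are enabled, 0 = never.
 */
static int probe_sketch(struct clk_sketch *fclk, struct clk_sketch *iclk,
                        int fail_step)
{
    int ret;

    if (fail_step == 1)
        return -1;              /* nothing enabled yet, nothing to undo */

    ret = clk_enable_sketch(fclk);
    if (ret)
        return ret;

    ret = clk_enable_sketch(iclk);
    if (ret)
        goto err_disable_fclk;

    if (fail_step == 2) {
        ret = -1;
        goto err_disable_iclk;  /* both enabled, so both must be undone */
    }

    return 0;

err_disable_iclk:
    clk_disable_sketch(iclk);
err_disable_fclk:
    clk_disable_sketch(fclk);
    return ret;
}
```

Each error label undoes only what was enabled before the jump, so the counters end at zero on every failing path and at one on success.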

usb: gadget: u_serial: Fix the issue that gs_start_io crashed due to accessing null pointer [+ + +]
Author: Lianqin Hu <hulianqin@vivo.com>
Date:   Tue Dec 3 12:14:16 2024 +0000

    usb: gadget: u_serial: Fix the issue that gs_start_io crashed due to accessing null pointer
    
    commit 4cfbca86f6a8b801f3254e0e3c8f2b1d2d64be2b upstream.
    
    In some extreme cases the u_serial driver is accessed by multiple
    threads at once: Thread A executes the open operation and calls
    gs_open(), while Thread B executes the disconnect operation and calls
    gserial_disconnect(), which sets the port->port_usb pointer to NULL.
    
    E.g.
        Thread A                                 Thread B
        gs_open()                                gadget_unbind_driver()
        gs_start_io()                            composite_disconnect()
        gs_start_rx()                            gserial_disconnect()
        ...                                      ...
        spin_unlock(&port->port_lock)
        status = usb_ep_queue()                  spin_lock(&port->port_lock)
        spin_lock(&port->port_lock)              port->port_usb = NULL
        gs_free_requests(port->port_usb->in)     spin_unlock(&port->port_lock)
        Crash
    
    This causes thread A to access a null pointer (port->port_usb is null)
    when calling the gs_free_requests function, causing a crash.
    
    If port_usb is NULL, freeing the requests is skipped, as that
    will be done by gserial_disconnect().
    
    So add a null pointer check to gs_start_io before attempting
    to access the value of the pointer port->port_usb.
    
    Call trace:
     gs_start_io+0x164/0x25c
     gs_open+0x108/0x13c
     tty_open+0x314/0x638
     chrdev_open+0x1b8/0x258
     do_dentry_open+0x2c4/0x700
     vfs_open+0x2c/0x3c
     path_openat+0xa64/0xc60
     do_filp_open+0xb8/0x164
     do_sys_openat2+0x84/0xf0
     __arm64_sys_openat+0x70/0x9c
     invoke_syscall+0x58/0x114
     el0_svc_common+0x80/0xe0
     do_el0_svc+0x1c/0x28
     el0_svc+0x38/0x68
    
    Fixes: c1dca562be8a ("usb gadget: split out serial core")
    Cc: stable@vger.kernel.org
    Suggested-by: Prashanth K <quic_prashk@quicinc.com>
    Signed-off-by: Lianqin Hu <hulianqin@vivo.com>
    Acked-by: Prashanth K <quic_prashk@quicinc.com>
    Link: https://lore.kernel.org/r/TYUPR06MB62178DC3473F9E1A537DCD02D2362@TYUPR06MB6217.apcprd06.prod.outlook.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
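
The race above comes down to re-checking a shared pointer after a lock has been dropped and re-taken. The following single-threaded sketch only simulates that window; gserial_sketch, port_sketch, start_io_sketch() and the disconnect_races flag are hypothetical, and no real locking is performed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the gadget serial structures. */
struct gserial_sketch { int in_requests; };
struct port_sketch    { struct gserial_sketch *port_usb; };

/*
 * Models the failure branch of the start-I/O path: queueing failed and
 * the pending requests should be freed. Returns how many were freed.
 */
static int start_io_sketch(struct port_sketch *port, int disconnect_races)
{
    int freed;

    assert(port);

    /* The lock would be dropped here while the request is queued ... */
    if (disconnect_races)
        port->port_usb = NULL;  /* simulates the disconnect on thread B */
    /* ... and re-taken here. */

    /* Free requests only if the port is still connected. */
    if (!port->port_usb)
        return 0;               /* the disconnect path releases them */

    freed = port->port_usb->in_requests;
    port->port_usb->in_requests = 0;
    return freed;
}
```

Without the NULL check, the racing case would dereference port->port_usb after it was cleared, which matches the crash in the call trace.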

usb: host: max3421-hcd: Correctly abort a USB request. [+ + +]
Author: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
Date:   Mon Nov 25 11:14:30 2024 +1300

    usb: host: max3421-hcd: Correctly abort a USB request.
    
    commit 0d2ada05227881f3d0722ca2364e3f7a860a301f upstream.
    
    If the current USB request was aborted, the spi thread would not respond
    to any further requests. This is because the "curr_urb" pointer would
    not become NULL, so no further requests would be taken off the queue.
    The solution here is to set the "urb_done" flag, as this will cause the
    correct handling of the URB. Also clear interrupts that should only be
    expected if a URB is in progress.
    
    Fixes: 2d53139f3162 ("Add support for using a MAX3421E chip as a host driver.")
    Cc: stable <stable@kernel.org>
    Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
    Link: https://lore.kernel.org/r/20241124221430.1106080-1-mark.tomlinson@alliedtelesis.co.nz
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: typec: anx7411: fix fwnode_handle reference leak [+ + +]
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date:   Thu Nov 21 11:34:29 2024 +0900

    usb: typec: anx7411: fix fwnode_handle reference leak
    
    commit 645d56e4cc74e953284809d096532c1955918a28 upstream.
    
    An fwnode_handle and usb_role_switch are obtained with an incremented
    refcount in anx7411_typec_port_probe(), however the refcounts are not
    decremented in the error path. The fwnode_handle is also not decremented
    in the .remove() function. Therefore, call fwnode_handle_put() and
    usb_role_switch_put() accordingly.
    
    Fixes: fe6d8a9c8e64 ("usb: typec: anx7411: Add Analogix PD ANX7411 support")
    Cc: stable@vger.kernel.org
    Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
    Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
    Link: https://lore.kernel.org/r/20241121023429.962848-1-joe@pf.is.s.u-tokyo.ac.jp
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: typec: anx7411: fix OF node reference leaks in anx7411_typec_switch_probe() [+ + +]
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date:   Tue Nov 26 10:49:09 2024 +0900

    usb: typec: anx7411: fix OF node reference leaks in anx7411_typec_switch_probe()
    
    commit ef42b906df5c57d0719b69419df9dfd25f25c161 upstream.
    
    The refcounts of the OF nodes obtained by of_get_child_by_name() calls
    in anx7411_typec_switch_probe() are not decremented. Replace them with
    device_get_named_child_node() calls and store the return values to the
    newly created fwnode_handle fields in anx7411_data, and call
    fwnode_handle_put() on them in the error path and in the unregister
    functions.
    
    Fixes: e45d7337dc0e ("usb: typec: anx7411: Use of_get_child_by_name() instead of of_find_node_by_name()")
    Cc: stable@vger.kernel.org
    Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
    Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
    Link: https://lore.kernel.org/r/20241126014909.3687917-1-joe@pf.is.s.u-tokyo.ac.jp
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
wifi: mac80211: clean up 'ret' in sta_link_apply_parameters() [+ + +]
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Wed Jun 5 13:57:19 2024 +0300

    wifi: mac80211: clean up 'ret' in sta_link_apply_parameters()
    
    [ Upstream commit 642508a42f74d7467aae7c56dff3016db64a25bd ]
    
    There's no need to have the always-zero ret variable in
    the function scope, move it into the inner scope only.
    
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
    Link: https://msgid.link/20240605135233.eb7a24632d98.I72d7fe1da89d4b89bcfd0f5fb9057e3e69355cfe@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Stable-dep-of: 819e0f1e58e0 ("wifi: mac80211: fix station NSS capability initialization order")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mac80211: fix station NSS capability initialization order [+ + +]
Author: Benjamin Lin <benjamin-jw.lin@mediatek.com>
Date:   Mon Nov 18 16:07:22 2024 +0800

    wifi: mac80211: fix station NSS capability initialization order
    
    [ Upstream commit 819e0f1e58e0ba3800cd9eb96b2a39e44e49df97 ]
    
    Station's spatial streaming capability should be initialized before
    handling VHT OMN, because the handling requires the capability information.
    
    Fixes: a8bca3e9371d ("wifi: mac80211: track capability/opmode NSS separately")
    Signed-off-by: Benjamin Lin <benjamin-jw.lin@mediatek.com>
    Link: https://patch.msgid.link/20241118080722.9603-1-benjamin-jw.lin@mediatek.com
    [rewrite subject]
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: nl80211: fix NL80211_ATTR_MLO_LINK_ID off-by-one [+ + +]
Author: Lin Ma <linma@zju.edu.cn>
Date:   Sun Dec 1 01:05:26 2024 +0800

    wifi: nl80211: fix NL80211_ATTR_MLO_LINK_ID off-by-one
    
    [ Upstream commit 2e3dbf938656986cce73ac4083500d0bcfbffe24 ]
    
    Since the netlink attribute range validation provides inclusive
    checking, the *max* of attribute NL80211_ATTR_MLO_LINK_ID should be
    IEEE80211_MLD_MAX_NUM_LINKS - 1; otherwise an off-by-one occurs.
    
    One crash stack for demonstration:
    ==================================================================
    BUG: KASAN: wild-memory-access in ieee80211_tx_control_port+0x3b6/0xca0 net/mac80211/tx.c:5939
    Read of size 6 at addr 001102080000000c by task fuzzer.386/9508
    
    CPU: 1 PID: 9508 Comm: syz.1.386 Not tainted 6.1.70 #2
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:88 [inline]
     dump_stack_lvl+0x177/0x231 lib/dump_stack.c:106
     print_report+0xe0/0x750 mm/kasan/report.c:398
     kasan_report+0x139/0x170 mm/kasan/report.c:495
     kasan_check_range+0x287/0x290 mm/kasan/generic.c:189
     memcpy+0x25/0x60 mm/kasan/shadow.c:65
     ieee80211_tx_control_port+0x3b6/0xca0 net/mac80211/tx.c:5939
     rdev_tx_control_port net/wireless/rdev-ops.h:761 [inline]
     nl80211_tx_control_port+0x7b3/0xc40 net/wireless/nl80211.c:15453
     genl_family_rcv_msg_doit+0x22e/0x320 net/netlink/genetlink.c:756
     genl_family_rcv_msg net/netlink/genetlink.c:833 [inline]
     genl_rcv_msg+0x539/0x740 net/netlink/genetlink.c:850
     netlink_rcv_skb+0x1de/0x420 net/netlink/af_netlink.c:2508
     genl_rcv+0x24/0x40 net/netlink/genetlink.c:861
     netlink_unicast_kernel net/netlink/af_netlink.c:1326 [inline]
     netlink_unicast+0x74b/0x8c0 net/netlink/af_netlink.c:1352
     netlink_sendmsg+0x882/0xb90 net/netlink/af_netlink.c:1874
     sock_sendmsg_nosec net/socket.c:716 [inline]
     __sock_sendmsg net/socket.c:728 [inline]
     ____sys_sendmsg+0x5cc/0x8f0 net/socket.c:2499
     ___sys_sendmsg+0x21c/0x290 net/socket.c:2553
     __sys_sendmsg net/socket.c:2582 [inline]
     __do_sys_sendmsg net/socket.c:2591 [inline]
     __se_sys_sendmsg+0x19e/0x270 net/socket.c:2589
     do_syscall_x64 arch/x86/entry/common.c:51 [inline]
     do_syscall_64+0x45/0x90 arch/x86/entry/common.c:81
     entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    Update the policy to ensure correct validation.
    
    Fixes: 7b0a0e3c3a88 ("wifi: cfg80211: do some rework towards MLO link APIs")
    Signed-off-by: Lin Ma <linma@zju.edu.cn>
    Suggested-by: Cengiz Can <cengiz.can@canonical.com>
    Link: https://patch.msgid.link/20241130170526.96698-1-linma@zju.edu.cn
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
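
The off-by-one described above follows directly from an inclusive upper bound. A minimal sketch, assuming a per-link array with MAX_NUM_LINKS_SKETCH entries; the macro, its value and the helper names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for IEEE80211_MLD_MAX_NUM_LINKS. */
#define MAX_NUM_LINKS_SKETCH 15

/* Inclusive range check, in the spirit of netlink range validation. */
static bool in_range_inclusive(unsigned int v, unsigned int min,
                               unsigned int max)
{
    return v >= min && v <= max;
}

/* Buggy policy: max equals the array size, so the size itself passes. */
static bool link_id_ok_buggy(unsigned int id)
{
    return in_range_inclusive(id, 0, MAX_NUM_LINKS_SKETCH);
}

/* Fixed policy: max is the last valid index. */
static bool link_id_ok_fixed(unsigned int id)
{
    return in_range_inclusive(id, 0, MAX_NUM_LINKS_SKETCH - 1);
}
```

With the buggy bound, a link ID equal to the array size is accepted and later indexes one element past the end of any array sized by that constant.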

 
x86/static-call: fix 32-bit build [+ + +]
Author: Juergen Gross <jgross@suse.com>
Date:   Wed Dec 18 09:02:28 2024 +0100

    x86/static-call: fix 32-bit build
    
    commit 349f0086ba8b2a169877d21ff15a4d9da3a60054 upstream.
    
    In 32-bit x86 builds CONFIG_STATIC_CALL_INLINE isn't set, leading to
    static_call_initialized not being available.
    
    Define it as "0" in that case.
    
    Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Fixes: 0ef8047b737d ("x86/static-call: provide a way to do very early static-call updates")
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/static-call: provide a way to do very early static-call updates [+ + +]
Author: Juergen Gross <jgross@suse.com>
Date:   Fri Nov 29 16:15:54 2024 +0100

    x86/static-call: provide a way to do very early static-call updates
    
    commit 0ef8047b737d7480a5d4c46d956e97c190f13050 upstream.
    
    Add static_call_update_early() for updating static-call targets in
    very early boot.
    
    This will be needed for support of Xen guest type specific hypercall
    functions.
    
    This is part of XSA-466 / CVE-2024-53241.
    
    Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Co-developed-by: Peter Zijlstra <peterz@infradead.org>
    Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
x86/xen: add central hypercall functions [+ + +]
Author: Juergen Gross <jgross@suse.com>
Date:   Thu Oct 17 11:00:52 2024 +0200

    x86/xen: add central hypercall functions
    
    commit b4845bb6383821a9516ce30af3a27dc873e37fd4 upstream.
    
    Add generic hypercall functions usable for all normal (i.e. not iret)
    hypercalls. Depending on the guest type and the processor vendor,
    different functions are needed because the instruction used to enter
    the hypervisor differs:
    
    - PV guests need to use syscall
    - HVM/PVH guests on Intel need to use vmcall
    - HVM/PVH guests on AMD and Hygon need to use vmmcall
    
    As PVH guests need to issue hypercalls very early during boot, there
    is a 4th hypercall function needed for HVM/PVH which can be used on
    Intel and AMD processors. It will check the vendor type and then set
    the Intel or AMD specific function to use via static_call().
    
    This is part of XSA-466 / CVE-2024-53241.
    
    Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Co-developed-by: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
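
The "4th hypercall function" idea, a probing entry point that picks the vendor-specific function on first use, can be sketched with an ordinary function pointer. This is not how static_call() is implemented; it only illustrates the rebinding. All names are hypothetical, and the returned numbers are fake markers standing in for the vmcall and vmmcall paths.

```c
#include <assert.h>

enum vendor_sketch { VENDOR_INTEL, VENDOR_AMD, VENDOR_HYGON };

/* Hypothetical result of an early CPU vendor check. */
static enum vendor_sketch cpu_vendor = VENDOR_INTEL;

/* Fake bodies: the return values only mark which path was taken. */
static int hypercall_intel(int nr) { return 1000 + nr; }  /* vmcall path  */
static int hypercall_amd(int nr)   { return 2000 + nr; }  /* vmmcall path */

static int hypercall_probe(int nr);

/* Starts out pointing at the probing function. */
static int (*hypercall_func)(int nr) = hypercall_probe;

/* First call: select the vendor-specific function, then forward to it. */
static int hypercall_probe(int nr)
{
    hypercall_func = (cpu_vendor == VENDOR_INTEL) ? hypercall_intel
                                                  : hypercall_amd;
    return hypercall_func(nr);
}
```

After the first call, hypercall_func points straight at the vendor-specific function, so the vendor check is paid only once.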

x86/xen: don't do PV iret hypercall through hypercall page [+ + +]
Author: Juergen Gross <jgross@suse.com>
Date:   Wed Oct 16 10:40:26 2024 +0200

    x86/xen: don't do PV iret hypercall through hypercall page
    
    commit a2796dff62d6c6bfc5fbebdf2bee0d5ac0438906 upstream.
    
    Instead of jumping to the Xen hypercall page for doing the iret
    hypercall, directly code the required sequence in xen-asm.S.
    
    This is done in preparation for no longer using the hypercall page at
    all, as it has been shown to cause problems with speculation mitigations.
    
    This is part of XSA-466 / CVE-2024-53241.
    
    Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Reviewed-by: Jan Beulich <jbeulich@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/xen: remove hypercall page [+ + +]
Author: Juergen Gross <jgross@suse.com>
Date:   Thu Oct 17 15:27:31 2024 +0200

    x86/xen: remove hypercall page
    
    commit 7fa0da5373685e7ed249af3fa317ab1e1ba8b0a6 upstream.
    
    The hypercall page is no longer needed. It can be removed, as from the
    Xen perspective it is optional.
    
    But, from Linux's perspective, it removes naked RET instructions that
    escape the speculative protections that Call Depth Tracking and/or
    Untrain Ret are trying to achieve.
    
    This is part of XSA-466 / CVE-2024-53241.
    
    Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Reviewed-by: Jan Beulich <jbeulich@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/xen: use new hypercall functions instead of hypercall page [+ + +]
Author: Juergen Gross <jgross@suse.com>
Date:   Thu Oct 17 14:47:13 2024 +0200

    x86/xen: use new hypercall functions instead of hypercall page
    
    commit b1c2cb86f4a7861480ad54bb9a58df3cbebf8e92 upstream.
    
    Call the Xen hypervisor via the new xen_hypercall_func static-call
    instead of the hypercall page.
    
    This is part of XSA-466 / CVE-2024-53241.
    
    Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Co-developed-by: Peter Zijlstra <peterz@infradead.org>
    Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
x86: make get_cpu_vendor() accessible from Xen code [+ + +]
Author: Juergen Gross <jgross@suse.com>
Date:   Thu Oct 17 08:29:48 2024 +0200

    x86: make get_cpu_vendor() accessible from Xen code
    
    commit efbcd61d9bebb771c836a3b8bfced8165633db7c upstream.
    
    In order to be able to differentiate between AMD and Intel based
    systems for very early hypercalls without having to rely on the Xen
    hypercall page, make get_cpu_vendor() non-static.
    
    Refactor early_cpu_init() for the same reason by splitting out the
    loop initializing cpu_devs() into an externally callable function.
    
    This is part of XSA-466 / CVE-2024-53241.
    
    Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
xen/netfront: fix crash when removing device [+ + +]
Author: Juergen Gross <jgross@suse.com>
Date:   Thu Nov 7 16:17:00 2024 +0100

    xen/netfront: fix crash when removing device
    
    commit f9244fb55f37356f75c739c57323d9422d7aa0f8 upstream.
    
    When removing a netfront device directly after a suspend/resume cycle
    it might happen that the queues have not been set up again, causing a
    crash during the attempt to stop the queues another time.
    
    Fix that by checking that the queues exist before trying to stop
    them.
    
    This is XSA-465 / CVE-2024-53240.
    
    Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
    Fixes: d50b7914fae0 ("xen-netfront: Fix NULL sring after live migration")
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
xfs: don't drop errno values when we fail to ficlone the entire range [+ + +]
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Mon Dec 2 10:57:27 2024 -0800

    xfs: don't drop errno values when we fail to ficlone the entire range
    
    commit 7ce31f20a0771d71779c3b0ec9cdf474cc3c8e9a upstream.
    
    Way back when we first implemented FICLONE for XFS, life was simple --
    either the entire remapping completed, or something happened and we
    had to return an errno explaining what happened.  Neither of those
    ioctls support returning partial results, so it's all or nothing.
    
    Then things got complicated when copy_file_range came along, because it
    actually can return the number of bytes copied, so commit 3f68c1f562f1e4
    tried to make it so that we could return a partial result if the
    REMAP_FILE_CAN_SHORTEN flag is set.  This is also how FIDEDUPERANGE can
    indicate that the kernel performed a partial deduplication.
    
    Unfortunately, the logic is wrong if an error stops the remapping and
    CAN_SHORTEN is not set.  Because those callers cannot return partial
    results, it is an error for ->remap_file_range to return a positive
    quantity that is less than the @len passed in.  Implementations really
    should be returning a negative errno in this case, because that's what
    btrfs (which introduced FICLONE{,RANGE}) did.
    
    Therefore, ->remap_range implementations cannot silently drop an errno
    that they might have when the number of bytes remapped is less than the
    number of bytes requested and CAN_SHORTEN is not set.
    
    Found by running generic/562 on a 64k fsblock filesystem and wondering
    why it reported corrupt files.
    
    Cc: <stable@vger.kernel.org> # v4.20
    Fixes: 3fc9f5e409319e ("xfs: remove xfs_reflink_remap_range")
    Really-Fixes: 3f68c1f562f1e4 ("xfs: support returning partial reflink results")
    Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
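
The return-value rule described above can be written down as a small decision function. This is a sketch of the rule as stated in the commit message, not the XFS code; remap_result_sketch() and its parameters are hypothetical, with error holding zero or a negative errno.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide what a remap operation reports to its caller.
 *   remapped    - bytes actually remapped
 *   len         - bytes requested
 *   error       - 0, or a negative errno that stopped the remapping
 *   can_shorten - whether the caller accepts a partial result
 */
static long long remap_result_sketch(long long remapped, long long len,
                                     int error, bool can_shorten)
{
    /* Callers that cannot take a short result must see the errno. */
    if (error && remapped < len && !can_shorten)
        return error;

    /* Otherwise report progress if any was made, else the errno. */
    return remapped > 0 ? remapped : error;
}
```

The first branch is the case the fix is about: a short count with an errno pending and no permission to shorten must surface the errno rather than a positive byte count.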

xfs: fix scrub tracepoints when inode-rooted btrees are involved [+ + +]
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Mon Dec 2 10:57:32 2024 -0800

    xfs: fix scrub tracepoints when inode-rooted btrees are involved
    
    commit ffc3ea4f3c1cc83a86b7497b0c4b0aee7de5480d upstream.
    
    Fix minor mistakes in the scrub tracepoints that can manifest when
    inode-rooted btrees are enabled.  The existing code worked fine for bmap
    btrees, but we should tighten the code up to be less sloppy.
    
    Cc: <stable@vger.kernel.org> # v5.7
    Fixes: 92219c292af8dd ("xfs: convert btree cursor inode-private member names")
    Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xfs: only run precommits once per transaction object [+ + +]
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Mon Dec 2 10:57:33 2024 -0800

    xfs: only run precommits once per transaction object
    
    commit 44d9b07e52db25035680713c3428016cadcd2ea1 upstream.
    
    Committing a transaction tx0 with a defer ops chain of (A, B, C)
    creates a chain of transactions that looks like this:
    
    tx0 -> txA -> txB -> txC
    
    Prior to commit cb042117488dbf, __xfs_trans_commit would run precommits
    on tx0, then call xfs_defer_finish_noroll to convert A-C to tx[A-C].
    Unfortunately, after the finish_noroll loop we forgot to run precommits
    on txC.  That was fixed by adding the second precommit call.
    
    Unfortunately, none of us remembered that xfs_defer_finish_noroll
    calls __xfs_trans_commit a second time to commit tx0 before finishing
    work A in txA and committing that.  In other words, we run precommits
    twice on tx0:
    
    xfs_trans_commit(tx0)
        __xfs_trans_commit(tx0, false)
            xfs_trans_run_precommits(tx0)
            xfs_defer_finish_noroll(tx0)
                xfs_trans_roll(tx0)
                    txA = xfs_trans_dup(tx0)
                    __xfs_trans_commit(tx0, true)
                        xfs_trans_run_precommits(tx0)
    
    This currently isn't an issue because the inode item precommit is
    idempotent; the iunlink item precommit deletes itself so it can't be
    called again; and the buffer/dquot item precommits only check the incore
    objects for corruption.  However, it doesn't make sense to run
    precommits twice.
    
    Fix this situation by only running precommits after finish_noroll.
    
    Cc: <stable@vger.kernel.org> # v6.4
    Fixes: cb042117488dbf ("xfs: defered work could create precommits")
    Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
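
The double precommit described above can be reproduced with a toy model of the call chain. Everything here is an assumption made for illustration: the toy_* names, the fixed-size array of transactions, and the "fixed" flag that switches between the old and new ordering. None of it is the real XFS code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_TX 8

/* Toy transaction: only counts how often precommits ran on it. */
struct toy_tx {
    int precommit_runs;
};

struct toy_chain {
    struct toy_tx tx[TOY_MAX_TX];
    int cur;        /* index of the newest transaction in the chain */
    int deferred;   /* deferred work items not yet finished */
};

static void toy_commit(struct toy_chain *c, struct toy_tx *tp,
                       bool regrant, bool fixed);

/* Models the roll step: duplicate, then commit the old transaction. */
static struct toy_tx *toy_roll(struct toy_chain *c, struct toy_tx *tp,
                               bool fixed)
{
    assert(c->cur + 1 < TOY_MAX_TX);
    struct toy_tx *next = &c->tx[++c->cur];

    toy_commit(c, tp, true, fixed);
    return next;
}

/* Models finish_noroll: one roll per deferred work item. */
static struct toy_tx *toy_finish_noroll(struct toy_chain *c,
                                        struct toy_tx *tp, bool fixed)
{
    while (c->deferred > 0) {
        tp = toy_roll(c, tp, fixed);
        c->deferred--;          /* work item finished in the new tx */
    }
    return tp;
}

static void toy_commit(struct toy_chain *c, struct toy_tx *tp,
                       bool regrant, bool fixed)
{
    if (!fixed)
        tp->precommit_runs++;   /* old: precommit before deferred work */

    if (!regrant && c->deferred > 0) {
        tp = toy_finish_noroll(c, tp, fixed);
        if (!fixed)
            tp->precommit_runs++;   /* old: second call for the last tx */
    }

    if (fixed)
        tp->precommit_runs++;   /* new: a single call, after finish_noroll */
}
```

Calling toy_commit(&chain, &chain.tx[0], false, false) on a chain with three deferred items leaves tx[0] with two precommit runs; passing fixed = true leaves every transaction in the chain with exactly one.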

xfs: return from xfs_symlink_verify early on V4 filesystems [+ + +]
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Mon Dec 2 10:57:43 2024 -0800

    xfs: return from xfs_symlink_verify early on V4 filesystems
    
    commit 7f8b718c58783f3ff0810b39e2f62f50ba2549f6 upstream.
    
    V4 symlink blocks didn't have headers, so return early if this is a V4
    filesystem.
    
    Cc: <stable@vger.kernel.org> # v5.1
    Fixes: 39708c20ab5133 ("xfs: miscellaneous verifier magic value fixups")
    Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
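
A minimal sketch of the early-return idea follows. The struct, the boolean return value, and the function name are hypothetical simplifications, and the magic constant is only meant to stand in for the V5 symlink header magic; the kernel verifier has a different signature and checks more than the magic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SYMLINK_MAGIC 0x58534c4dU   /* "XSLM", stand-in value */

/* Toy buffer: the block contents plus whether the fs uses V5 headers. */
struct toy_buf {
    bool fs_has_headers;    /* false models a V4 filesystem */
    const uint8_t *data;
    size_t len;
};

/* Returns true when the block passes verification. */
static bool toy_symlink_verify(const struct toy_buf *bp)
{
    assert(bp != NULL);

    /*
     * V4 symlink blocks hold only the raw target path, so there is no
     * header to inspect: return before touching any header field.
     */
    if (!bp->fs_has_headers)
        return true;

    if (bp->len < 4)
        return false;

    /* On-disk fields are big-endian; assemble the magic by hand. */
    uint32_t magic = ((uint32_t)bp->data[0] << 24) |
                     ((uint32_t)bp->data[1] << 16) |
                     ((uint32_t)bp->data[2] << 8) |
                     (uint32_t)bp->data[3];

    return magic == TOY_SYMLINK_MAGIC;
}
```

Without the early return, a V4 block whose first bytes are path characters would be compared against the header magic and rejected, even though the block is valid.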

xfs: update btree keys correctly when _insrec splits an inode root block [+ + +]
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Mon Dec 2 10:57:31 2024 -0800

    xfs: update btree keys correctly when _insrec splits an inode root block
    
    commit 6d7b4bc1c3e00b1a25b7a05141a64337b4629337 upstream.
    
    In commit 2c813ad66a72, I partially fixed a bug wherein xfs_btree_insrec
    would erroneously try to update the parent's key for a block that had
    been split if we decided to insert the new record into the new block.
    The solution was to detect this situation and update the in-core key
    value that we pass up to the caller so that the caller will (eventually)
    add the new block to the parent level of the tree with the correct key.
    
    However, I missed a subtlety about the way inode-rooted btrees work.  If
    the full block was a maximally sized inode root block, we'll solve that
    fullness by moving the root block's records to a new block, resizing the
    root block, and updating the root to point to the new block.  We don't
    pass a pointer to the new block to the caller because that work has
    already been done.  The new record will /always/ land in the new block,
    so in this case we need to use xfs_btree_update_keys to update the keys.
    
    This bug can theoretically manifest itself in the very rare case that we
    split a bmbt root block and the new record lands in the very first slot
    of the new block, though I've never managed to trigger it in practice.
    However, it is very easy to reproduce by running generic/522 with the
    realtime rmapbt patchset if rtinherit=1.
    
    Cc: <stable@vger.kernel.org> # v4.8
    Fixes: 2c813ad66a7218 ("xfs: support btrees with overlapping intervals for keys")
    Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>