Changelog in Linux kernel 6.1.116

 
ACPI: CPPC: Make rmw_lock a raw_spin_lock [+ + +]
Author: Pierre Gondois <pierre.gondois@arm.com>
Date:   Mon Oct 28 13:56:56 2024 +0100

    ACPI: CPPC: Make rmw_lock a raw_spin_lock
    
    [ Upstream commit 1c10941e34c5fdc0357e46a25bd130d9cf40b925 ]
    
    The following BUG was triggered:
    
    =============================
    [ BUG: Invalid wait context ]
    6.12.0-rc2-XXX #406 Not tainted
    -----------------------------
    kworker/1:1/62 is trying to lock:
    ffffff8801593030 (&cpc_ptr->rmw_lock){+.+.}-{3:3}, at: cpc_write+0xcc/0x370
    other info that might help us debug this:
    context-{5:5}
    2 locks held by kworker/1:1/62:
      #0: ffffff897ef5ec98 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2c/0x50
      #1: ffffff880154e238 (&sg_policy->update_lock){....}-{2:2}, at: sugov_update_shared+0x3c/0x280
    stack backtrace:
    CPU: 1 UID: 0 PID: 62 Comm: kworker/1:1 Not tainted 6.12.0-rc2-g9654bd3e8806 #406
    Workqueue:  0x0 (events)
    Call trace:
      dump_backtrace+0xa4/0x130
      show_stack+0x20/0x38
      dump_stack_lvl+0x90/0xd0
      dump_stack+0x18/0x28
      __lock_acquire+0x480/0x1ad8
      lock_acquire+0x114/0x310
      _raw_spin_lock+0x50/0x70
      cpc_write+0xcc/0x370
      cppc_set_perf+0xa0/0x3a8
      cppc_cpufreq_fast_switch+0x40/0xc0
      cpufreq_driver_fast_switch+0x4c/0x218
      sugov_update_shared+0x234/0x280
      update_load_avg+0x6ec/0x7b8
      dequeue_entities+0x108/0x830
      dequeue_task_fair+0x58/0x408
      __schedule+0x4f0/0x1070
      schedule+0x54/0x130
      worker_thread+0xc0/0x2e8
      kthread+0x130/0x148
      ret_from_fork+0x10/0x20
    
    sugov_update_shared() locks a raw_spinlock while cpc_write() locks a
    spinlock.
    
    To have a correct wait-type order, update rmw_lock to a raw spinlock and
    ensure that interrupts will be disabled on the CPU holding it.
    
    Fixes: 60949b7b8054 ("ACPI: CPPC: Fix MASK_VAL() usage")
    Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
    Link: https://patch.msgid.link/20241028125657.1271512-1-pierre.gondois@arm.com
    [ rjw: Changelog edits ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
afs: Automatically generate trace tag enums [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Thu Feb 23 15:24:24 2023 +0000

    afs: Automatically generate trace tag enums
    
    [ Upstream commit 2daa6404fd2f00985d5bfeb3c161f4630b46b6bf ]
    
    Automatically generate trace tag enums from the symbol -> string mapping
    tables rather than having the enums as well, thereby reducing duplicated
    data.
    
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Marc Dionne <marc.dionne@auristor.com>
    cc: Jeff Layton <jlayton@kernel.org>
    cc: linux-afs@lists.infradead.org
    cc: linux-fsdevel@vger.kernel.org
    Stable-dep-of: 247d65fb122a ("afs: Fix missing subdir edit when renamed between parent dirs")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

afs: Fix missing subdir edit when renamed between parent dirs [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Wed Oct 23 11:40:10 2024 +0100

    afs: Fix missing subdir edit when renamed between parent dirs
    
    [ Upstream commit 247d65fb122ad560be1c8c4d87d7374fb28b0770 ]
    
    When rename moves an AFS subdirectory between parent directories, the
    subdir also needs a bit of editing: the ".." entry needs updating to point
    to the new parent (though I don't make use of the info) and the DV needs
    incrementing by 1 to reflect the change of content.  The server also sends
    a callback break notification on the subdirectory if we have one, but we
    can take care of recovering the promise next time we access the subdir.
    
    This can be triggered by something like:
    
        mount -t afs %example.com:xfstest.test20 /xfstest.test/
        mkdir /xfstest.test/{aaa,bbb,aaa/ccc}
        touch /xfstest.test/aaa/ccc/d
        mv /xfstest.test/{aaa/ccc,bbb/ccc}
        touch /xfstest.test/bbb/ccc/e
    
    When the pathwalk for the second touch hits "ccc", kafs spots that the DV
    is incorrect and downloads it again (so the fix is not critical).
    
    Fix this, if the rename target is a directory and the old and new
    parents are different, by:
    
     (1) Incrementing the DV number of the target locally.
    
     (2) Editing the ".." entry in the target to refer to its new parent's
         vnode ID and uniquifier.
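
    The two steps can be modelled in user space. This is a rough sketch and
    not the kafs code: the Dir class, its fields and
    fix_up_renamed_subdir() are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Dir:
    """Toy stand-in for a cached AFS directory (hypothetical structure)."""
    vnode: int
    uniquifier: int
    data_version: int = 1
    dotdot: Optional[Tuple[int, int]] = None  # (vnode ID, uniquifier) of parent


def fix_up_renamed_subdir(target: Dir, old_parent: Dir, new_parent: Dir) -> None:
    """Apply the two local edits when a subdirectory changes parent."""
    if old_parent is new_parent:
        return  # same parent: ".." is unchanged, nothing to edit
    # (1) Bump the data version to reflect the changed content.
    target.data_version += 1
    # (2) Point ".." at the new parent's vnode ID and uniquifier.
    target.dotdot = (new_parent.vnode, new_parent.uniquifier)


aaa = Dir(vnode=2, uniquifier=10)
bbb = Dir(vnode=3, uniquifier=11)
ccc = Dir(vnode=4, uniquifier=12, data_version=5,
          dotdot=(aaa.vnode, aaa.uniquifier))

fix_up_renamed_subdir(ccc, old_parent=aaa, new_parent=bbb)
print(ccc.data_version, ccc.dotdot)
```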
    
    Link: https://lore.kernel.org/r/3340431.1729680010@warthog.procyon.org.uk
    Fixes: 63a4681ff39c ("afs: Locally edit directory data for mkdir/create/unlink/...")
    cc: David Howells <dhowells@redhat.com>
    cc: Marc Dionne <marc.dionne@auristor.com>
    cc: linux-afs@lists.infradead.org
    Signed-off-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1 [+ + +]
Author: Christoffer Sandberg <cs@tuxedo.de>
Date:   Tue Oct 29 16:16:53 2024 +0100

    ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1
    
    [ Upstream commit e49370d769e71456db3fbd982e95bab8c69f73e8 ]
    
    Quirk is needed to enable headset microphone on missing pin 0x19.
    
    Signed-off-by: Christoffer Sandberg <cs@tuxedo.de>
    Signed-off-by: Werner Sembach <wse@tuxedocomputers.com>
    Cc: <stable@vger.kernel.org>
    Link: https://patch.msgid.link/20241029151653.80726-2-wse@tuxedocomputers.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: hda/realtek: Limit internal Mic boost on Dell platform [+ + +]
Author: Kailang Yang <kailang@realtek.com>
Date:   Fri Oct 18 13:53:24 2024 +0800

    ALSA: hda/realtek: Limit internal Mic boost on Dell platform
    
    [ Upstream commit 78e7be018784934081afec77f96d49a2483f9188 ]
    
    Dell wants to limit the internal Mic boost on all Dell platforms.
    
    Signed-off-by: Kailang Yang <kailang@realtek.com>
    Cc: <stable@vger.kernel.org>
    Link: https://lore.kernel.org/561fc5f5eff04b6cbd79ed173cd1c1db@realtek.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: usb-audio: Add quirks for Dell WD19 dock [+ + +]
Author: Jan Schär <jan@jschaer.ch>
Date:   Tue Oct 29 23:12:49 2024 +0100

    ALSA: usb-audio: Add quirks for Dell WD19 dock
    
    commit 4413665dd6c528b31284119e3571c25f371e1c36 upstream.
    
    The WD19 family of docks has the same audio chipset as the WD15. This
    change enables jack detection on the WD19.
    
    We don't need the dell_dock_mixer_init quirk for the WD19. It is only
    needed because of the dell_alc4020_map quirk for the WD15 in
    mixer_maps.c, which disables the volume controls. Even for the WD15,
    this quirk was apparently only needed when the dock firmware was not
    updated.
    
    Signed-off-by: Jan Schär <jan@jschaer.ch>
    Cc: <stable@vger.kernel.org>
    Link: https://patch.msgid.link/20241029221249.15661-1-jan@jschaer.ch
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
ASoC: cs42l51: Fix some error handling paths in cs42l51_probe() [+ + +]
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Sat Oct 26 22:46:34 2024 +0200

    ASoC: cs42l51: Fix some error handling paths in cs42l51_probe()
    
    [ Upstream commit d221b844ee79823ffc29b7badc4010bdb0960224 ]
    
    If devm_gpiod_get_optional() fails, we need to disable previously enabled
    regulators, as done in the other error handling path of the function.
    
    Also, gpiod_set_value_cansleep(, 1) needs to be called to undo a
    potential gpiod_set_value_cansleep(, 0).
    If the "reset" gpio is not defined, this additional call is just a no-op.
    
    This behavior is the same as the one already in the .remove() function.
    
    Fixes: 11b9cd748e31 ("ASoC: cs42l51: add reset management")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
    Link: https://patch.msgid.link/a5e5f4b9fb03f46abd2c93ed94b5c395972ce0d1.1729975570.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
block: fix sanity checks in blk_rq_map_user_bvec [+ + +]
Author: Xinyu Zhang <xizhang@purestorage.com>
Date:   Wed Oct 23 15:15:19 2024 -0600

    block: fix sanity checks in blk_rq_map_user_bvec
    
    [ Upstream commit 2ff949441802a8d076d9013c7761f63e8ae5a9bd ]
    
    blk_rq_map_user_bvec contains a check bytes + bv->bv_len > nr_iter which
    causes unnecessary failures in NVMe passthrough I/O, reproducible as
    follows:
    
    - register a 2 page, page-aligned buffer against a ring
    - use that buffer to do a 1 page io_uring NVMe passthrough read
    
    The second (i = 1) iteration of the loop in blk_rq_map_user_bvec will
    then have nr_iter == 1 page, bytes == 1 page, bv->bv_len == 1 page, so
    the check bytes + bv->bv_len > nr_iter will succeed, causing the I/O to
    fail. This failure is unnecessary, as when the check succeeds, it means
    we've checked the entire buffer that will be used by the request - i.e.
    blk_rq_map_user_bvec should complete successfully. Therefore, terminate
    the loop early and return successfully when the check bytes + bv->bv_len
    > nr_iter succeeds.
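
    The difference between failing and terminating early can be shown with
    a small user-space model. It is a sketch only: map_check_old() and
    map_check_new() are hypothetical names, PAGE is an assumed 4096-byte
    page, and the other checks in the real function are left out.

```python
PAGE = 4096  # assumed page size, for illustration only


def map_check_old(bvec_lens, nr_iter):
    """Old behaviour as described: fail once a segment would pass nr_iter."""
    nbytes = 0
    for bv_len in bvec_lens:
        if nbytes + bv_len > nr_iter:
            return False  # the I/O fails
        nbytes += bv_len
    return True


def map_check_new(bvec_lens, nr_iter):
    """New behaviour as described: stop looking and report success."""
    nbytes = 0
    for bv_len in bvec_lens:
        if nbytes + bv_len > nr_iter:
            break  # everything the request uses has been checked
        nbytes += bv_len
    return True


# A registered 2-page buffer used for a 1-page read.
registered = [PAGE, PAGE]
print(map_check_old(registered, PAGE), map_check_new(registered, PAGE))
```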
    
    While we're at it, also remove the check that all segments in the bvec
    are single-page. While this seems to be true for all users of the
    function, it doesn't appear to be required anywhere downstream.
    
    CC: stable@vger.kernel.org
    Signed-off-by: Xinyu Zhang <xizhang@purestorage.com>
    Co-developed-by: Uday Shankar <ushankar@purestorage.com>
    Signed-off-by: Uday Shankar <ushankar@purestorage.com>
    Fixes: 37987547932c ("block: extend functionality to map bvec iterator")
    Link: https://lore.kernel.org/r/20241023211519.4177873-1-ushankar@purestorage.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
Bluetooth: hci: fix null-ptr-deref in hci_read_supported_codecs [+ + +]
Author: Sungwoo Kim <iam@sung-woo.kim>
Date:   Tue Oct 29 19:44:41 2024 +0000

    Bluetooth: hci: fix null-ptr-deref in hci_read_supported_codecs
    
    [ Upstream commit 1e67d8641813f1876a42eeb4f532487b8a7fb0a8 ]
    
    Fix __hci_cmd_sync_sk() to return non-NULL for unknown opcodes.
    
    __hci_cmd_sync_sk() returns NULL if a command returns a status event.
    However, it also returns NULL when an opcode doesn't exist in the
    hci_cc table, because hci_cmd_complete_evt() assumes status = skb->data[0]
    for unknown opcodes.
    This leads to a null-ptr-deref in cmd_sync for HCI_OP_READ_LOCAL_CODECS,
    as there is no hci_cc entry for HCI_OP_READ_LOCAL_CODECS, so the code
    always assumes status = skb->data[0].
    
    KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
    CPU: 1 PID: 2000 Comm: kworker/u9:5 Not tainted 6.9.0-ga6bcb805883c-dirty #10
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
    Workqueue: hci7 hci_power_on
    RIP: 0010:hci_read_supported_codecs+0xb9/0x870 net/bluetooth/hci_codec.c:138
    Code: 08 48 89 ef e8 b8 c1 8f fd 48 8b 75 00 e9 96 00 00 00 49 89 c6 48 ba 00 00 00 00 00 fc ff df 4c 8d 60 70 4c 89 e3 48 c1 eb 03 <0f> b6 04 13 84 c0 0f 85 82 06 00 00 41 83 3c 24 02 77 0a e8 bf 78
    RSP: 0018:ffff888120bafac8 EFLAGS: 00010212
    RAX: 0000000000000000 RBX: 000000000000000e RCX: ffff8881173f0040
    RDX: dffffc0000000000 RSI: ffffffffa58496c0 RDI: ffff88810b9ad1e4
    RBP: ffff88810b9ac000 R08: ffffffffa77882a7 R09: 1ffffffff4ef1054
    R10: dffffc0000000000 R11: fffffbfff4ef1055 R12: 0000000000000070
    R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810b9ac000
    FS:  0000000000000000(0000) GS:ffff8881f6c00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f6ddaa3439e CR3: 0000000139764003 CR4: 0000000000770ef0
    PKRU: 55555554
    Call Trace:
     <TASK>
     hci_read_local_codecs_sync net/bluetooth/hci_sync.c:4546 [inline]
     hci_init_stage_sync net/bluetooth/hci_sync.c:3441 [inline]
     hci_init4_sync net/bluetooth/hci_sync.c:4706 [inline]
     hci_init_sync net/bluetooth/hci_sync.c:4742 [inline]
     hci_dev_init_sync net/bluetooth/hci_sync.c:4912 [inline]
     hci_dev_open_sync+0x19a9/0x2d30 net/bluetooth/hci_sync.c:4994
     hci_dev_do_open net/bluetooth/hci_core.c:483 [inline]
     hci_power_on+0x11e/0x560 net/bluetooth/hci_core.c:1015
     process_one_work kernel/workqueue.c:3267 [inline]
     process_scheduled_works+0x8ef/0x14f0 kernel/workqueue.c:3348
     worker_thread+0x91f/0xe50 kernel/workqueue.c:3429
     kthread+0x2cb/0x360 kernel/kthread.c:388
     ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
     ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    
    Fixes: abfeea476c68 ("Bluetooth: hci_sync: Convert MGMT_OP_START_DISCOVERY")
    
    Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bpf: Fix out-of-bounds write in trie_get_next_key() [+ + +]
Author: Byeonguk Jeong <jungbu2855@gmail.com>
Date:   Sat Oct 26 14:02:43 2024 +0900

    bpf: Fix out-of-bounds write in trie_get_next_key()
    
    [ Upstream commit 13400ac8fb80c57c2bfb12ebd35ee121ce9b4d21 ]
    
    trie_get_next_key() allocates a node stack with size trie->max_prefixlen,
    while it writes (trie->max_prefixlen + 1) nodes to the stack when it has
    full paths from the root to leaves. For example, consider a trie whose
    max_prefixlen is 8, with the nodes for keys 0x00/0, 0x00/1, 0x00/2, ...
    0x00/8 inserted. Subsequent calls to trie_get_next_key() with a _key whose
    .prefixlen = 8 cause 9 nodes to be written to a node stack of size 8.
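
    The count in the example can be checked with a few lines of user-space
    arithmetic. This is only a model of the off-by-one, not the kernel
    code; nodes_on_full_path() is a hypothetical helper.

```python
def nodes_on_full_path(max_prefixlen: int) -> int:
    """Count the nodes on a path that holds one key per prefix length."""
    # Prefix lengths run from 0 to max_prefixlen inclusive.
    return len(range(0, max_prefixlen + 1))


max_prefixlen = 8
stack_size = max_prefixlen  # the stack size the commit message describes
needed = nodes_on_full_path(max_prefixlen)
print(needed, needed > stack_size)
```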
    
    Fixes: b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map")
    Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>
    Reviewed-by: Toke Høiland-Jørgensen <toke@kernel.org>
    Tested-by: Hou Tao <houtao1@huawei.com>
    Acked-by: Hou Tao <houtao1@huawei.com>
    Link: https://lore.kernel.org/r/Zxx384ZfdlFYnz6J@localhost.localdomain
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
cgroup/bpf: use a dedicated workqueue for cgroup bpf destruction [+ + +]
Author: Chen Ridong <chenridong@huawei.com>
Date:   Tue Oct 8 11:24:56 2024 +0000

    cgroup/bpf: use a dedicated workqueue for cgroup bpf destruction
    
    [ Upstream commit 117932eea99b729ee5d12783601a4f7f5fd58a23 ]
    
    A hung_task problem shown below was found:
    
    INFO: task kworker/0:0:8 blocked for more than 327 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    Workqueue: events cgroup_bpf_release
    Call Trace:
     <TASK>
     __schedule+0x5a2/0x2050
     ? find_held_lock+0x33/0x100
     ? wq_worker_sleeping+0x9e/0xe0
     schedule+0x9f/0x180
     schedule_preempt_disabled+0x25/0x50
     __mutex_lock+0x512/0x740
     ? cgroup_bpf_release+0x1e/0x4d0
     ? cgroup_bpf_release+0xcf/0x4d0
     ? process_scheduled_works+0x161/0x8a0
     ? cgroup_bpf_release+0x1e/0x4d0
     ? mutex_lock_nested+0x2b/0x40
     ? __pfx_delay_tsc+0x10/0x10
     mutex_lock_nested+0x2b/0x40
     cgroup_bpf_release+0xcf/0x4d0
     ? process_scheduled_works+0x161/0x8a0
     ? trace_event_raw_event_workqueue_execute_start+0x64/0xd0
     ? process_scheduled_works+0x161/0x8a0
     process_scheduled_works+0x23a/0x8a0
     worker_thread+0x231/0x5b0
     ? __pfx_worker_thread+0x10/0x10
     kthread+0x14d/0x1c0
     ? __pfx_kthread+0x10/0x10
     ret_from_fork+0x59/0x70
     ? __pfx_kthread+0x10/0x10
     ret_from_fork_asm+0x1b/0x30
     </TASK>
    
    This issue can be reproduced by the following pressure test:
    1. A large number of cpuset cgroups are deleted.
    2. Set cpu on and off repeatedly.
    3. Set watchdog_thresh repeatedly.
    The scripts can be obtained from the Link: URL above the signatures.
    
    The reason for this issue is that cgroup_mutex and cpu_hotplug_lock are
    acquired in different tasks, which may lead to deadlock.
    It can lead to a deadlock through the following steps:
    1. A large number of cpusets are deleted asynchronously, which puts a
       large number of cgroup_bpf_release works into system_wq. The max_active
       of system_wq is WQ_DFL_ACTIVE(256). Consequently, all active works are
       cgroup_bpf_release works, and many cgroup_bpf_release works will be put
       into the inactive queue. As illustrated in the diagram, there are 256
       (in the active queue) + n (in the inactive queue) works.
    2. Setting watchdog_thresh will hold cpu_hotplug_lock.read and put
       smp_call_on_cpu work into system_wq. However, step 1 has already filled
       system_wq, so 'sscs.work' is put into the inactive queue. 'sscs.work'
       has to wait until the works that were put into the inactive queue
       earlier have executed (n cgroup_bpf_release), so it will be blocked for
       a while.
    3. Cpu offline requires cpu_hotplug_lock.write, which is blocked by step 2.
    4. Cpusets that were deleted at step 1 put cgroup_release works into
       cgroup_destroy_wq. They are competing to get cgroup_mutex all the time.
       When cgroup_mutex is acquired by work at css_killed_work_fn, it will
       call cpuset_css_offline, which needs to acquire cpu_hotplug_lock.read.
       However, cpuset_css_offline will be blocked by step 3.
    5. At this moment, there are 256 works in the active queue that are
       cgroup_bpf_release; they are attempting to acquire cgroup_mutex, and as
       a result, all of them are blocked. Consequently, sscs.work can not be
       executed. Ultimately, this situation leads to four processes being
       blocked, forming a deadlock.
    
    system_wq(step1)                WatchDog(step2)                 cpu offline(step3)      cgroup_destroy_wq(step4)
    ...
    2000+ cgroups deleted asyn
    256 actives + n inactives
                                    __lockup_detector_reconfigure
                                    P(cpu_hotplug_lock.read)
                                    put sscs.work into system_wq
    256 + n + 1(sscs.work)
    sscs.work wait to be executed
                                    waiting for sscs.work to finish
                                                                    percpu_down_write
                                                                    P(cpu_hotplug_lock.write)
                                                                    ...blocking...
                                                                                            css_killed_work_fn
                                                                                            P(cgroup_mutex)
                                                                                            cpuset_css_offline
                                                                                            P(cpu_hotplug_lock.read)
                                                                                            ...blocking...
    256 cgroup_bpf_release
    mutex_lock(&cgroup_mutex);
    ..blocking...
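
    The loop can also be read as a wait-for graph. The following user-space
    sketch is an illustration under assumptions, not kernel code: the node
    names are informal labels for the parties in the steps, and has_cycle()
    is a hypothetical helper. Removing the one edge that makes sscs.work
    queue behind the cgroup_bpf_release works leaves no cycle.

```python
def has_cycle(waits_for):
    """Return True if the wait-for graph (node -> list of nodes) has a cycle."""
    in_progress, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in in_progress:
            return True  # reached a node already on the current path
        in_progress.add(node)
        found = any(visit(nxt) for nxt in waits_for.get(node, []))
        in_progress.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in waits_for)


# Edges read "X waits for Y", following steps 1-5 of the description.
shared_system_wq = {
    "cgroup_bpf_release works": ["css_killed_work_fn"],  # want cgroup_mutex
    "css_killed_work_fn": ["cpu offline"],      # wants cpu_hotplug_lock.read
    "cpu offline": ["watchdog reconfigure"],    # wants cpu_hotplug_lock.write
    "watchdog reconfigure": ["sscs.work"],      # waits for the work to finish
    "sscs.work": ["cgroup_bpf_release works"],  # waits for a system_wq slot
}

# Model of a dedicated workqueue: sscs.work no longer queues behind them.
dedicated_wq = {**shared_system_wq, "sscs.work": []}

print(has_cycle(shared_system_wq), has_cycle(dedicated_wq))
```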
    
    To fix the problem, place cgroup_bpf_release works on a dedicated
    workqueue which can break the loop and solve the problem. System wqs are
    for misc things which shouldn't create a large number of concurrent work
    items. If something is going to generate >WQ_DFL_ACTIVE(256) concurrent
    work items, it should use its own dedicated workqueue.
    
    Fixes: 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
    Cc: stable@vger.kernel.org # v5.3+
    Link: https://lore.kernel.org/cgroups/e90c32d2-2a85-4f28-9154-09c7d320cb60@huawei.com/T/#t
    Tested-by: Vishal Chourasia <vishalc@linux.ibm.com>
    Signed-off-by: Chen Ridong <chenridong@huawei.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
cgroup: Fix potential overflow issue when checking max_depth [+ + +]
Author: Xiu Jianfeng <xiujianfeng@huawei.com>
Date:   Sat Oct 12 07:22:46 2024 +0000

    cgroup: Fix potential overflow issue when checking max_depth
    
    [ Upstream commit 3cc4e13bb1617f6a13e5e6882465984148743cf4 ]
    
    cgroup.max.depth is the maximum allowed descent depth below the current
    cgroup. If the actual descent depth is equal or larger, an attempt to
    create a new child cgroup will fail. However, because cgroup->max_depth
    is of type int and has the default value INT_MAX, the condition
    'level > cgroup->max_depth' will never be satisfied, and level will
    overflow once it reaches INT_MAX.
    
    Fix it by starting the level from 0 and using '>=' instead.
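
    A user-space model of the two comparisons follows. It is a sketch under
    assumptions: old_check(), new_check() and to_int32() are hypothetical
    helpers, and the overflow is modelled as 32-bit two's-complement
    wraparound. The two checks agree for ordinary depths; they differ only
    in whether level has to pass INT_MAX.

```python
INT_MAX = 2**31 - 1


def to_int32(x):
    """Wrap x the way a 32-bit two's-complement int would (assumed model)."""
    return (x + 2**31) % 2**32 - 2**31


def old_check(level, max_depth):
    """Old form: level is counted from 1 and compared with '>'."""
    return level > max_depth


def new_check(level, max_depth):
    """New form: level is counted from 0 and compared with '>='."""
    return level >= max_depth


# With the default limit the old comparison can never be true, and the
# next increment of level wraps around.
print(old_check(INT_MAX, INT_MAX), to_int32(INT_MAX + 1))

# The new comparison triggers at INT_MAX, before any further increment.
print(new_check(INT_MAX, INT_MAX))
```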
    
    It's worth mentioning that this issue is unlikely to occur in reality,
    as it's impossible to have a hierarchy with a depth of INT_MAX, but it
    should be avoided logically.
    
    Fixes: 1a926e0bbab8 ("cgroup: implement hierarchy limits")
    Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
    Reviewed-by: Michal Koutný <mkoutny@suse.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
compiler-gcc: be consistent with underscores use for `no_sanitize` [+ + +]
Author: Miguel Ojeda <ojeda@kernel.org>
Date:   Fri Oct 21 13:59:52 2022 +0200

    compiler-gcc: be consistent with underscores use for `no_sanitize`
    
    [ Upstream commit 6e2be1f2ebcea42ed6044432f72f32434e60b34d ]
    
    Patch series "compiler-gcc: be consistent with underscores use for
    `no_sanitize`".
    
    This patch (of 5):
    
    Other macros that define shorthands for attributes in e.g.
    `compiler_attributes.h` and elsewhere use underscores.
    
    Link: https://lkml.kernel.org/r/20221021115956.9947-1-ojeda@kernel.org
    Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
    Reviewed-by: Nathan Chancellor <nathan@kernel.org>
    Cc: Marco Elver <elver@google.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Dan Li <ashimida@linux.alibaba.com>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Cc: Nick Desaulniers <ndesaulniers@google.com>
    Cc: Uros Bizjak <ubizjak@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 894b00a3350c ("kasan: Fix Software Tag-Based KASAN with GCC")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

compiler-gcc: remove attribute support check for `__no_sanitize_address__` [+ + +]
Author: Miguel Ojeda <ojeda@kernel.org>
Date:   Fri Oct 21 13:59:53 2022 +0200

    compiler-gcc: remove attribute support check for `__no_sanitize_address__`
    
    [ Upstream commit ae37a9a2c2d0960d643d782b426ea1aa9c05727a ]
    
    The attribute was added in GCC 4.8, while the minimum GCC version
    supported by the kernel is GCC 5.1.
    
    Therefore, remove the check.
    
    Link: https://godbolt.org/z/84v56vcn8
    Link: https://lkml.kernel.org/r/20221021115956.9947-2-ojeda@kernel.org
    Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
    Reviewed-by: Nathan Chancellor <nathan@kernel.org>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Dan Li <ashimida@linux.alibaba.com>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Cc: Marco Elver <elver@google.com>
    Cc: Nick Desaulniers <ndesaulniers@google.com>
    Cc: Uros Bizjak <ubizjak@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 894b00a3350c ("kasan: Fix Software Tag-Based KASAN with GCC")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
cpufreq: Avoid a bad reference count on CPU node [+ + +]
Author: Miquel Sabaté Solà <mikisabate@gmail.com>
Date:   Tue Sep 17 15:42:46 2024 +0200

    cpufreq: Avoid a bad reference count on CPU node
    
    [ Upstream commit c0f02536fffbbec71aced36d52a765f8c4493dc2 ]
    
    In the parse_perf_domain function, if the call to
    of_parse_phandle_with_args returns an error, then the reference to the
    CPU device node that was acquired at the start of the function would not
    be properly decremented.
    
    Address this by declaring the variable with the __free(device_node)
    cleanup attribute.
    
    Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
    Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
    Link: https://patch.msgid.link/20240917134246.584026-1-mikisabate@gmail.com
    Cc: All applicable <stable@vger.kernel.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cpufreq: Generalize of_perf_domain_get_sharing_cpumask phandle format [+ + +]
Author: Hector Martin <marcan@marcan.st>
Date:   Mon Oct 24 13:39:23 2022 +0900

    cpufreq: Generalize of_perf_domain_get_sharing_cpumask phandle format
    
    [ Upstream commit d182dc6de93225cd853de4db68a1a77501bedb6e ]
    
    of_perf_domain_get_sharing_cpumask currently assumes a 1-argument
    phandle format, and directly returns the argument. Generalize this to
    return the full of_phandle_args, so it can be used by drivers which use
    other phandle styles (e.g. separate nodes). This also requires changing
    the CPU sharing match to compare the full args structure.
    
    Also, make sure to of_node_put(args.np) (the original code was leaking a
    reference).
    
    Signed-off-by: Hector Martin <marcan@marcan.st>
    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Stable-dep-of: c0f02536fffb ("cpufreq: Avoid a bad reference count on CPU node")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
 
cxl/acpi: Move rescan to the workqueue [+ + +]
Author: Dan Williams <dan.j.williams@intel.com>
Date:   Thu Dec 1 13:33:48 2022 -0800

    cxl/acpi: Move rescan to the workqueue
    
    [ Upstream commit 4029c32fb601d505dfb92bdf0db9fdcc41fe1434 ]
    
    Now that the cxl_mem driver has a need to take the root device lock, the
    cxl_bus_rescan() needs to run outside of the root lock context. That
    need arises from RCH topologies and the locking that the cxl_mem driver
    does to attach a descendant to an upstream port. In the RCH case the
    lock needed is the CXL root device lock [1].
    
    Link: http://lore.kernel.org/r/166993045621.1882361.1730100141527044744.stgit@dwillia2-xfh.jf.intel.com [1]
    Tested-by: Robert Richter <rrichter@amd.com>
    Link: http://lore.kernel.org/r/166993042884.1882361.5633723613683058881.stgit@dwillia2-xfh.jf.intel.com
    Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    Stable-dep-of: 3d6ebf16438d ("cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices() [+ + +]
Author: Dan Williams <dan.j.williams@intel.com>
Date:   Tue Oct 22 18:43:32 2024 -0700

    cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices()
    
    [ Upstream commit 3d6ebf16438de5d712030fefbb4182b46373d677 ]
    
    It turns out that since its original introduction, pre-2.6.12,
    bus_rescan_devices() has skipped devices that might be in the process of
    attaching or detaching from their driver. For CXL this behavior is
    unwanted: CXL expects cxl_bus_rescan() to be a probe barrier.
    
    That behavior is simple enough to achieve with bus_for_each_dev() paired
    with a call to device_attach(), and it is unclear why bus_rescan_devices()
    took the position of lockless consumption of dev->driver, which is racy.
    
    The "Fixes:" but no "Cc: stable" on this patch reflects that the issue
    is merely by inspection since the bug that triggered the discovery of
    this potential problem [1] is fixed by other means.  However, a stable
    backport should do no harm.
    
    Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
    Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1]
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    Tested-by: Gregory Price <gourry@gourry.net>
    Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Link: https://patch.msgid.link/172964781104.81806.4277549800082443769.stgit@dwillia2-xfh.jf.intel.com
    Signed-off-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/amd/display: Add null checks for 'stream' and 'plane' before dereferencing [+ + +]
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Mon May 27 20:15:21 2024 +0530

    drm/amd/display: Add null checks for 'stream' and 'plane' before dereferencing
    
    commit 15c2990e0f0108b9c3752d7072a97d45d4283aea upstream.
    
    This commit adds null checks for the 'stream' and 'plane' variables in
    the dcn30_apply_idle_power_optimizations function. The check at line 922
    assumes these variables could be null, but they were used later in the
    code without checking whether they were null. This could potentially
    lead to a null pointer dereference, which would cause a crash.
    
    The null checks ensure that 'stream' and 'plane' are not null before
    they are used, preventing potential crashes.
    
    Fixes the below static smatch checker:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:938 dcn30_apply_idle_power_optimizations() error: we previously assumed 'stream' could be null (see line 922)
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:940 dcn30_apply_idle_power_optimizations() error: we previously assumed 'plane' could be null (see line 922)
    
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
    Cc: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Hersen Wu <hersenxs.wu@amd.com>
    Cc: Alex Hung <alex.hung@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    [Xiangyu: Modified file path to backport this commit]
    Signed-off-by: Xiangyu Chen <xiangyu.chen@windriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: Skip on writeback when it's not applicable [+ + +]
Author: Alex Hung <alex.hung@amd.com>
Date:   Fri Mar 15 21:25:25 2024 -0600

    drm/amd/display: Skip on writeback when it's not applicable
    
    commit ecedd99a9369fb5cde601ae9abd58bca2739f1ae upstream.
    
    [WHY]
    The dynamic memory safety error detector (KASAN) reports
    "BUG: KASAN: slab-out-of-bounds" errors because the writeback connector
    does not support certain features, which are therefore not initialized.
    
    [HOW]
    Skip them when connector type is DRM_MODE_CONNECTOR_WRITEBACK.
    
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3199
    Reviewed-by: Harry Wentland <harry.wentland@amd.com>
    Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
    Acked-by: Roman Li <roman.li@amd.com>
    Signed-off-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Xiangyu Chen <xiangyu.chen@windriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
firmware: arm_sdei: Fix the input parameter of cpuhp_remove_state() [+ + +]
Author: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Date:   Wed Oct 16 16:47:40 2024 +0800

    firmware: arm_sdei: Fix the input parameter of cpuhp_remove_state()
    
    [ Upstream commit c83212d79be2c9886d3e6039759ecd388fd5fed1 ]
    
    In sdei_device_freeze(), the input parameter of cpuhp_remove_state() is
    passed as 'sdei_entry_point' by mistake. Change it to 'sdei_hp_state'.
    
    Fixes: d2c48b2387eb ("firmware: arm_sdei: Fix sleep from invalid context BUG")
    Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
    Reviewed-by: James Morse <james.morse@arm.com>
    Link: https://lore.kernel.org/r/20241016084740.183353-1-wangxiongfeng2@huawei.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
fs/ntfs3: Additional check in ni_clear() [+ + +]
Author: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Date:   Mon Sep 9 15:39:10 2024 +0300

    fs/ntfs3: Additional check in ni_clear()
    
    [ Upstream commit d178944db36b3369b78a08ba520de109b89bf2a9 ]
    
    Add a check of NTFS_FLAGS_LOG_REPLAYING to prevent access to an
    uninitialized bitmap during the replay process.
    
    Reported-by: syzbot+3bfd2cc059ab93efcdb4@syzkaller.appspotmail.com
    Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs/ntfs3: Check if more than chunk-size bytes are written [+ + +]
Author: Andrew Ballance <andrewjballance@gmail.com>
Date:   Wed May 15 07:38:33 2024 -0500

    fs/ntfs3: Check if more than chunk-size bytes are written
    
    [ Upstream commit 9931122d04c6d431b2c11b5bb7b10f28584067f0 ]
    
    An incorrectly formatted chunk may decompress into
    more than LZNT_CHUNK_SIZE bytes and an index out of bounds
    will occur in s_max_off.
    
    Signed-off-by: Andrew Ballance <andrewjballance@gmail.com>
    Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs/ntfs3: Fix possible deadlock in mi_read [+ + +]
Author: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Date:   Wed Aug 28 11:55:53 2024 +0300

    fs/ntfs3: Fix possible deadlock in mi_read
    
    [ Upstream commit 03b097099eef255fbf85ea6a786ae3c91b11f041 ]
    
    Mutex lock with another subclass used in ni_lock_dir().
    
    Reported-by: syzbot+bc7ca0ae4591cb2550f9@syzkaller.appspotmail.com
    Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs/ntfs3: Fix warning possible deadlock in ntfs_set_state [+ + +]
Author: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Date:   Mon Aug 19 16:26:22 2024 +0300

    fs/ntfs3: Fix warning possible deadlock in ntfs_set_state
    
    [ Upstream commit 5b2db723455a89dc96743d34d8bdaa23a402db2f ]
    
    Use non-zero subkey to skip analyzer warnings.
    
    Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
    Reported-by: syzbot+c2ada45c23d98d646118@syzkaller.appspotmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs/ntfs3: Stale inode instead of bad [+ + +]
Author: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Date:   Thu Aug 22 14:43:32 2024 +0300

    fs/ntfs3: Stale inode instead of bad
    
    [ Upstream commit 1fd21919de6de245b63066b8ee3cfba92e36f0e9 ]
    
    Fixed the logic of processing inode with wrong sequence number.
    
    Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
fs/proc/kcore.c: allow translation of physical memory addresses [+ + +]
Author: Alexander Gordeev <agordeev@linux.ibm.com>
Date:   Mon Sep 30 14:21:19 2024 +0200

    fs/proc/kcore.c: allow translation of physical memory addresses
    
    [ Upstream commit 3d5854d75e3187147613130561b58f0b06166172 ]
    
    When /proc/kcore is read, an attempt to read the first two pages results in
    a HW-specific page swap on s390, and other (so-called prefix) pages are
    accessed instead.  That leads to a wrong read.
    
    Allow architecture-specific translation of memory addresses using
    kc_xlate_dev_mem_ptr() and kc_unxlate_dev_mem_ptr() callbacks similarly
    to /dev/mem xlate_dev_mem_ptr() and unxlate_dev_mem_ptr() callbacks.  That
    way an architecture can deal with specific physical memory ranges.
    
    Re-use the existing /dev/mem callback implementation on s390, which
    handles the described prefix pages swapping correctly.
    
    For other architectures the default callback is basically NOP.  It is
    expected the condition (vaddr == __va(__pa(vaddr))) always holds true for
    KCORE_RAM memory type.
    
    Link: https://lkml.kernel.org/r/20240930122119.1651546-1-agordeev@linux.ibm.com
    Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    Suggested-by: Heiko Carstens <hca@linux.ibm.com>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
fs/proc/kcore: avoid bounce buffer for ktext data [+ + +]
Author: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date:   Wed Mar 22 18:57:01 2023 +0000

    fs/proc/kcore: avoid bounce buffer for ktext data
    
    [ Upstream commit 2e1c0170771e6bf31bc785ea43a44e6e85e36268 ]
    
    Patch series "convert read_kcore(), vread() to use iterators", v8.
    
    While reviewing Baoquan's recent changes to permit vread() access to
    vm_map_ram regions of vmalloc allocations, Willy pointed out [1] that it
    would be nice to refactor vread() as a whole, since its only user is
    read_kcore() and the existing form of vread() necessitates the use of a
    bounce buffer.
    
    This patch series does exactly that, as well as adjusting how we read the
    kernel text section to avoid the use of a bounce buffer in this case as
    well.
    
    This has been tested against the test case which motivated Baoquan's
    changes in the first place [2] which continues to function correctly, as
    do the vmalloc self tests.
    
    This patch (of 4):
    
    Commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")
    introduced the use of a bounce buffer to retrieve kernel text data for
    /proc/kcore in order to avoid failures arising from hardened user copies
    enabled by CONFIG_HARDENED_USERCOPY in check_kernel_text_object().
    
    We can avoid doing this if instead of copy_to_user() we use
    _copy_to_user() which bypasses the hardening check.  This is more
    efficient than using a bounce buffer and simplifies the code.
    
    We do so as part of an overall effort to eliminate bounce buffer usage in
    the function with an eye to converting it to an iterator read.
    
    Link: https://lkml.kernel.org/r/cover.1679566220.git.lstoakes@gmail.com
    Link: https://lore.kernel.org/all/Y8WfDSRkc%2FOHP3oD@casper.infradead.org/ [1]
    Link: https://lore.kernel.org/all/87ilk6gos2.fsf@oracle.com/T/#u [2]
    Link: https://lkml.kernel.org/r/fd39b0bfa7edc76d360def7d034baaee71d90158.1679511146.git.lstoakes@gmail.com
    Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Liu Shixin <liushixin2@huawei.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 3d5854d75e31 ("fs/proc/kcore.c: allow translation of physical memory addresses")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs/proc/kcore: convert read_kcore() to read_kcore_iter() [+ + +]
Author: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date:   Wed Mar 22 18:57:02 2023 +0000

    fs/proc/kcore: convert read_kcore() to read_kcore_iter()
    
    [ Upstream commit 46c0d6d0904a10785faabee53fe53ee1aa718fea ]
    
    For the time being we still use a bounce buffer for vread(), however in
    the next patch we will convert this to interact directly with the iterator
    and eliminate the bounce buffer altogether.
    
    Link: https://lkml.kernel.org/r/ebe12c8d70eebd71f487d80095605f3ad0d1489c.1679511146.git.lstoakes@gmail.com
    Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Liu Shixin <liushixin2@huawei.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 3d5854d75e31 ("fs/proc/kcore.c: allow translation of physical memory addresses")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs/proc/kcore: reinstate bounce buffer for KCORE_TEXT regions [+ + +]
Author: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date:   Mon Jul 31 22:50:21 2023 +0100

    fs/proc/kcore: reinstate bounce buffer for KCORE_TEXT regions
    
    [ Upstream commit 17457784004c84178798432a029ab20e14f728b1 ]
    
    Some architectures do not populate the entire range categorised by
    KCORE_TEXT, so we must ensure that the kernel address we read from is
    valid.
    
    Unfortunately there is no solution currently available to do so with a
    purely iterator solution so reinstate the bounce buffer in this instance
    so we can use copy_from_kernel_nofault() in order to avoid page faults
    when regions are unmapped.
    
    This change partly reverts commit 2e1c0170771e ("fs/proc/kcore: avoid
    bounce buffer for ktext data"), reinstating the bounce buffer, but adapts
    the code to continue to use an iterator.
    
    [lstoakes@gmail.com: correct comment to be strictly correct about reasoning]
      Link: https://lkml.kernel.org/r/525a3f14-74fa-4c22-9fca-9dab4de8a0c3@lucifer.local
    Link: https://lkml.kernel.org/r/20230731215021.70911-1-lstoakes@gmail.com
    Fixes: 2e1c0170771e ("fs/proc/kcore: avoid bounce buffer for ktext data")
    Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reported-by: Jiri Olsa <olsajiri@gmail.com>
    Closes: https://lore.kernel.org/all/ZHc2fm+9daF6cgCE@krava
    Tested-by: Jiri Olsa <jolsa@kernel.org>
    Tested-by: Will Deacon <will@kernel.org>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: Ard Biesheuvel <ardb@kernel.org>
    Cc: Baoquan He <bhe@redhat.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Liu Shixin <liushixin2@huawei.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Mike Galbraith <efault@gmx.de>
    Cc: Thorsten Leemhuis <regressions@leemhuis.info>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 3d5854d75e31 ("fs/proc/kcore.c: allow translation of physical memory addresses")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
fs: create kiocb_{start,end}_write() helpers [+ + +]
Author: Amir Goldstein <amir73il@gmail.com>
Date:   Thu Aug 17 17:13:33 2023 +0300

    fs: create kiocb_{start,end}_write() helpers
    
    [ Upstream commit ed0360bbab72b829437b67ebb2f9cfac19f59dfe ]
    
    aio, io_uring, cachefiles and overlayfs, all open code an ugly variant
    of file_{start,end}_write() to silence lockdep warnings.
    
    Create helpers for this lockdep dance so we can use the helpers in all
    the callers.
    
    Suggested-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Jens Axboe <axboe@kernel.dk>
    Message-Id: <20230817141337.1025891-4-amir73il@gmail.com>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Stable-dep-of: 1d60d74e8526 ("io_uring/rw: fix missing NOWAIT check for O_DIRECT start write")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
fsdax: dax_unshare_iter needs to copy entire blocks [+ + +]
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Thu Oct 3 08:09:48 2024 -0700

    fsdax: dax_unshare_iter needs to copy entire blocks
    
    [ Upstream commit 50793801fc7f6d08def48754fb0f0706b0cfc394 ]
    
    The code that copies data from srcmap to iomap in dax_unshare_iter is
    very very broken, which bfoster's recent fsx changes have exposed.
    
    If the pos and len passed to dax_file_unshare are not aligned to an
    fsblock boundary, the iter pos and length in the _iter function will
    reflect this unalignment.
    
    dax_iomap_direct_access always returns a pointer to the start of the
    kmapped fsdax page, even if its pos argument is in the middle of that
    page.  This is catastrophic for data integrity when iter->pos is not
    aligned to a page, because daddr/saddr do not point to the same byte in
    the file as iter->pos.  Hence we corrupt user data by copying it to the
    wrong place.
    
    If iter->pos + iomap_length() in the _iter function is not aligned to a
    page, then we fail to copy a full block, and only partially populate the
    destination block.  This is catastrophic for data confidentiality
    because we expose stale pmem contents.
    
    Fix both of these issues by aligning copy_pos/copy_len to a page
    boundary (remember, this is fsdax so 1 fsblock == 1 base page) so that
    we always copy full blocks.
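
    As an illustration only, the page alignment described above can be
    sketched in plain user-space C.  The helper name align_copy_range()
    and the 4 KiB PAGE_SIZE_BYTES constant are assumptions made for this
    example and are not taken from the kernel patch:

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumption: 4 KiB base pages */

/*
 * Hypothetical helper: expand [pos, pos + len) outward so that the
 * resulting range starts and ends on a page boundary.
 */
void align_copy_range(uint64_t pos, uint64_t len,
                      uint64_t *copy_pos, uint64_t *copy_len)
{
    uint64_t head = pos % PAGE_SIZE_BYTES;  /* offset of pos inside its page */
    uint64_t end = pos + len;
    uint64_t tail = end % PAGE_SIZE_BYTES;  /* offset of the end inside its page */

    *copy_pos = pos - head;                 /* round the start down */
    if (tail)
        end += PAGE_SIZE_BYTES - tail;      /* round the end up */
    *copy_len = end - *copy_pos;
}
```

    With these assumptions, a 100-byte request at offset 5000 becomes a
    copy of the whole page starting at offset 4096.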
    
    We're not done yet -- there's no call to invalidate_inode_pages2_range,
    so programs that have the file range mmap'd will continue accessing the
    old memory mapping after the file metadata updates have completed.
    
    Be careful with the return value -- if the unshare succeeds, we still
    need to return the number of bytes that the iomap iter thinks we're
    operating on.
    
    Cc: ruansy.fnst@fujitsu.com
    Fixes: d984648e428b ("fsdax,xfs: port unshare to fsdax")
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Link: https://lore.kernel.org/r/172796813328.1131942.16777025316348797355.stgit@frogsfrogsfrogs
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsdax: remove zeroing code from dax_unshare_iter [+ + +]
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Thu Oct 3 08:09:32 2024 -0700

    fsdax: remove zeroing code from dax_unshare_iter
    
    [ Upstream commit 95472274b6fed8f2d30fbdda304e12174b3d4099 ]
    
    Remove the code in dax_unshare_iter that zeroes the destination memory
    because it's not necessary.
    
    If srcmap is unwritten, we don't have to do anything because that
    unwritten extent came from the regular file mapping, and unwritten
    extents cannot be shared.  The same applies to holes.
    
    Furthermore, zeroing to unshare a mapping is just plain wrong because
    unsharing means copy on write, and we should be copying data.
    
    This is effectively a revert of commit 13dd4e04625f ("fsdax: unshare:
    zero destination if srcmap is HOLE or UNWRITTEN")
    
    Cc: ruansy.fnst@fujitsu.com
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Link: https://lore.kernel.org/r/172796813311.1131942.16033376284752798632.stgit@frogsfrogsfrogs
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Stable-dep-of: 50793801fc7f ("fsdax: dax_unshare_iter needs to copy entire blocks")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
gtp: allow -1 to be specified as file description from userspace [+ + +]
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Tue Oct 22 16:48:25 2024 +0200

    gtp: allow -1 to be specified as file description from userspace
    
    [ Upstream commit 7515e37bce5c428a56a9b04ea7e96b3f53f17150 ]
    
    Existing user space applications maintained by the Osmocom project are
    breaking since a recent fix that addresses incorrect error checking.
    
    Restore operation for user space programs that specify -1 as file
    descriptor to skip GTPv0 or GTPv1 only sockets.
    
    Fixes: defd8b3c37b0 ("gtp: fix a potential NULL pointer dereference")
    Reported-by: Pau Espin Pedrol <pespin@sysmocom.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Tested-by: Oliver Smith <osmith@sysmocom.de>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20241022144825.66740-1-pablo@netfilter.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
iio: adc: ad7124: fix division by zero in ad7124_set_channel_odr() [+ + +]
Author: Zicheng Qu <quzicheng@huawei.com>
Date:   Tue Oct 22 13:43:30 2024 +0000

    iio: adc: ad7124: fix division by zero in ad7124_set_channel_odr()
    
    commit efa353ae1b0541981bc96dbf2e586387d0392baa upstream.
    
    In the ad7124_write_raw() function, parameter val can potentially
    be zero. This may lead to a division by zero when DIV_ROUND_CLOSEST()
    is called within ad7124_set_channel_odr(). The ad7124_write_raw()
    function is invoked through the sequence: iio_write_channel_raw() ->
    iio_write_channel_attribute() -> iio_channel_write(), with no checks
    in place to ensure val is non-zero.
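
    A minimal user-space sketch of the kind of guard this implies is
    shown here.  The helper name odr_to_divider(), the simplified
    rounding macro, and the factor of 32 are assumptions for
    illustration, not the driver's actual code:

```c
#include <errno.h>

/* Simplified stand-in for the kernel's DIV_ROUND_CLOSEST(), unsigned only. */
#define DIV_ROUND_CLOSEST_U(x, divisor) (((x) + ((divisor) / 2)) / (divisor))

/*
 * Hypothetical helper: derive a divider from a clock and a requested
 * output data rate, rejecting a zero rate before any division happens.
 */
int odr_to_divider(unsigned int fclk_hz, unsigned int odr_hz,
                   unsigned int *divider)
{
    if (odr_hz == 0)
        return -EINVAL;     /* would otherwise divide by zero */

    *divider = DIV_ROUND_CLOSEST_U(fclk_hz, odr_hz * 32);
    return 0;
}
```

    The point of the sketch is only the ordering: the zero check comes
    before the rounding division, so a zero val is rejected with an
    error instead of reaching the divisor.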
    
    Cc: stable@vger.kernel.org
    Fixes: 7b8d045e497a ("iio: adc: ad7124: allow more than 8 channels")
    Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
    Reviewed-by: Nuno Sa <nuno.sa@analog.com>
    Link: https://patch.msgid.link/20241022134330.574601-1-quzicheng@huawei.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: light: veml6030: fix microlux value calculation [+ + +]
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date:   Wed Oct 16 19:04:31 2024 +0200

    iio: light: veml6030: fix microlux value calculation
    
    commit 63dd163cd61dda6f38343776b42331cc6b7e56e0 upstream.
    
    The raw value conversion to obtain a measurement in lux as
    INT_PLUS_MICRO does not calculate the decimal part properly to display
    it as micro (in this case microlux). It only calculates the modulo to
    obtain the decimal part from a resolution that is 10000 times the one
    provided in the datasheet (0.5376 lux/cnt for the veml6030). The
    resulting value must still be multiplied by 100 to make it micro.
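
    The arithmetic can be checked with a small stand-alone sketch.  The
    function name raw_to_lux() and the RES_X10000 constant (0.5376
    lux/cnt stored as 5376) are assumptions for this example, not the
    driver's identifiers:

```c
#include <stdint.h>

/* Assumption: resolution stored as lux/count scaled by 10000 (0.5376 -> 5376). */
#define RES_X10000 5376u

/*
 * Hypothetical helper: split raw * resolution into an integer lux part
 * and a microlux part, in the spirit of INT_PLUS_MICRO.
 */
void raw_to_lux(uint32_t raw, int *val, int *val2)
{
    uint32_t scaled = raw * RES_X10000;     /* lux * 10000 */

    *val = (int)(scaled / 10000);           /* integer lux */
    *val2 = (int)(scaled % 10000) * 100;    /* 1e-4 lux units -> microlux */
}
```

    For a raw count of 3 this gives 1 lux plus 612800 microlux, that is
    1.6128 lux.  Without the final multiplication by 100 the fractional
    part would be 6128 and the value would read as 1.006128 lux.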
    
    This bug was introduced with the original implementation of the driver.
    
    Only the illuminance channel is fixed because the scale is nonsensical
    for the intensity channels anyway.
    
    Cc: stable@vger.kernel.org
    Fixes: 7b779f573c48 ("iio: light: add driver for veml6030 ambient light sensor")
    Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
    Link: https://patch.msgid.link/20241016-veml6030-fix-processed-micro-v1-1-4a5644796437@gmail.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
io_uring/rw: fix missing NOWAIT check for O_DIRECT start write [+ + +]
Author: Jens Axboe <axboe@kernel.dk>
Date:   Thu Oct 31 08:05:44 2024 -0600

    io_uring/rw: fix missing NOWAIT check for O_DIRECT start write
    
    [ Upstream commit 1d60d74e852647255bd8e76f5a22dc42531e4389 ]
    
    When io_uring starts a write, it'll call kiocb_start_write() to bump the
    super block rwsem, preventing any freezes from happening while that
    write is in-flight. The freeze side will grab that rwsem for writing,
    excluding any new writers from happening and waiting for existing writes
    to finish. But io_uring unconditionally uses kiocb_start_write(), which
    will block if someone is currently attempting to freeze the mount point.
    This causes a deadlock where freeze is waiting for previous writes to
    complete, but the previous writes cannot complete, as the task that is
    supposed to complete them is blocked waiting on starting a new write.
    This results in the following stuck trace showing that dependency with
    the write blocked starting a new write:
    
    task:fio             state:D stack:0     pid:886   tgid:886   ppid:876
    Call trace:
     __switch_to+0x1d8/0x348
     __schedule+0x8e8/0x2248
     schedule+0x110/0x3f0
     percpu_rwsem_wait+0x1e8/0x3f8
     __percpu_down_read+0xe8/0x500
     io_write+0xbb8/0xff8
     io_issue_sqe+0x10c/0x1020
     io_submit_sqes+0x614/0x2110
     __arm64_sys_io_uring_enter+0x524/0x1038
     invoke_syscall+0x74/0x268
     el0_svc_common.constprop.0+0x160/0x238
     do_el0_svc+0x44/0x60
     el0_svc+0x44/0xb0
     el0t_64_sync_handler+0x118/0x128
     el0t_64_sync+0x168/0x170
    INFO: task fsfreeze:7364 blocked for more than 15 seconds.
          Not tainted 6.12.0-rc5-00063-g76aaf945701c #7963
    
    with the attempting freezer stuck trying to grab the rwsem:
    
    task:fsfreeze        state:D stack:0     pid:7364  tgid:7364  ppid:995
    Call trace:
     __switch_to+0x1d8/0x348
     __schedule+0x8e8/0x2248
     schedule+0x110/0x3f0
     percpu_down_write+0x2b0/0x680
     freeze_super+0x248/0x8a8
     do_vfs_ioctl+0x149c/0x1b18
     __arm64_sys_ioctl+0xd0/0x1a0
     invoke_syscall+0x74/0x268
     el0_svc_common.constprop.0+0x160/0x238
     do_el0_svc+0x44/0x60
     el0_svc+0x44/0xb0
     el0t_64_sync_handler+0x118/0x128
     el0t_64_sync+0x168/0x170
    
    Fix this by having the io_uring side honor IOCB_NOWAIT, and only attempt a
    blocking grab of the super block rwsem if it isn't set. For normal issue
    where IOCB_NOWAIT would always be set, this returns -EAGAIN which will
    have io_uring core issue a blocking attempt of the write. That will in
    turn also get completions run, ensuring forward progress.
    
    Since freezing requires CAP_SYS_ADMIN in the first place, this isn't
    something that can be triggered by a regular user.
    
    Cc: stable@vger.kernel.org # 5.10+
    Reported-by: Peter Mann <peter.mann@sh.cz>
    Link: https://lore.kernel.org/io-uring/38c94aec-81c9-4f62-b44e-1d87f5597644@sh.cz
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
io_uring: always lock __io_cqring_overflow_flush [+ + +]
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Wed Apr 10 02:26:54 2024 +0100

    io_uring: always lock __io_cqring_overflow_flush
    
    Commit 8d09a88ef9d3cb7d21d45c39b7b7c31298d23998 upstream.
    
    Conditional locking is never great. In the case of
    __io_cqring_overflow_flush(), which is a slow path, it is not justified.
    Don't handle IOPOLL separately; always grab uring_lock for overflow
    flushing.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/162947df299aa12693ac4b305dacedab32ec7976.1712708261.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

io_uring: rename kiocb_end_write() local helper [+ + +]
Author: Amir Goldstein <amir73il@gmail.com>
Date:   Thu Aug 17 17:13:31 2023 +0300

    io_uring: rename kiocb_end_write() local helper
    
    [ Upstream commit a370167fe526123637965f60859a9f1f3e1a58b7 ]
    
    This helper does not take a kiocb as input and we want to create a
    common helper by that name that takes a kiocb as input.
    
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Jens Axboe <axboe@kernel.dk>
    Message-Id: <20230817141337.1025891-2-amir73il@gmail.com>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Stable-dep-of: 1d60d74e8526 ("io_uring/rw: fix missing NOWAIT check for O_DIRECT start write")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

io_uring: use kiocb_{start,end}_write() helpers [+ + +]
Author: Amir Goldstein <amir73il@gmail.com>
Date:   Thu Aug 17 17:13:34 2023 +0300

    io_uring: use kiocb_{start,end}_write() helpers
    
    [ Upstream commit e484fd73f4bdcb00c2188100c2d84e9f3f5c9f7d ]
    
    Use helpers instead of the open coded dance to silence lockdep warnings.
    
    Suggested-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Jens Axboe <axboe@kernel.dk>
    Message-Id: <20230817141337.1025891-5-amir73il@gmail.com>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Stable-dep-of: 1d60d74e8526 ("io_uring/rw: fix missing NOWAIT check for O_DIRECT start write")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
iomap: convert iomap_unshare_iter to use large folios [+ + +]
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Mon Sep 18 15:57:40 2023 -0700

    iomap: convert iomap_unshare_iter to use large folios
    
    [ Upstream commit a5f31a5028d1e88e97c3b6cdc3e3bf2da085e232 ]
    
    Convert iomap_unshare_iter to create large folios if possible, since the
    write and zeroing paths already do that.  I think this got missed in the
    conversion of the write paths that landed in 6.6-rc1.
    
    Cc: ritesh.list@gmail.com, willy@infradead.org
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
    Stable-dep-of: 50793801fc7f ("fsdax: dax_unshare_iter needs to copy entire blocks")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iomap: don't bother unsharing delalloc extents [+ + +]
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Wed Oct 2 08:00:40 2024 -0700

    iomap: don't bother unsharing delalloc extents
    
    [ Upstream commit f7a4874d977bf4202ad575031222e78809a36292 ]
    
    If unshare encounters a delalloc reservation in the srcmap, that means
    that the file range isn't shared because delalloc reservations cannot be
    reflinked.  Therefore, don't try to unshare them.
    
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Link: https://lore.kernel.org/r/20241002150040.GB21853@frogsfrogsfrogs
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Brian Foster <bfoster@redhat.com>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Stable-dep-of: 50793801fc7f ("fsdax: dax_unshare_iter needs to copy entire blocks")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iomap: improve shared block detection in iomap_unshare_iter [+ + +]
Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Sep 10 07:39:04 2024 +0300

    iomap: improve shared block detection in iomap_unshare_iter
    
    [ Upstream commit b53fdb215d13f8e9c29541434bf2d14dac8bcbdc ]
    
    Currently iomap_unshare_iter relies on the IOMAP_F_SHARED flag to detect
    blocks to unshare.  This is reasonable, but IOMAP_F_SHARED is also useful
    for the file system to do internal book keeping for out of place writes.
    XFS used to do that, until it got removed in commit 72a048c1056a
    ("xfs: only set IOMAP_F_SHARED when providing a srcmap to a write")
    because unshare would incorrectly unshare such blocks.
    
    Add an extra safeguard by checking the explicitly provided srcmap instead
    of the fallback to the iomap for valid data, as that catches the case
    where we'd just copy from the same place we'd write to easily, allowing
    to reinstate setting IOMAP_F_SHARED for all XFS writes that go to the
    COW fork.
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Link: https://lore.kernel.org/r/20240910043949.3481298-3-hch@lst.de
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Stable-dep-of: 50793801fc7f ("fsdax: dax_unshare_iter needs to copy entire blocks")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iomap: share iomap_unshare_iter predicate code with fsdax [+ + +]
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Thu Oct 3 08:09:16 2024 -0700

    iomap: share iomap_unshare_iter predicate code with fsdax
    
    [ Upstream commit 6ef6a0e821d3dad6bf8a5d5508762dba9042c84b ]
    
    The predicate code that iomap_unshare_iter uses to decide if it really
    needs to unshare a file range mapping should be shared with the fsdax
    version, because right now they're opencoded and inconsistent.
    
    Note that we simplify the predicate logic a bit -- we no longer allow
    unsharing of inline data mappings, but there aren't any filesystems that
    allow shared inline data currently.
    
    This is a fix in the sense that it should have been ported to fsdax.
    
    Fixes: b53fdb215d13 ("iomap: improve shared block detection in iomap_unshare_iter")
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Link: https://lore.kernel.org/r/172796813294.1131942.15762084021076932620.stgit@frogsfrogsfrogs
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Stable-dep-of: 50793801fc7f ("fsdax: dax_unshare_iter needs to copy entire blocks")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iomap: turn iomap_want_unshare_iter into an inline function [+ + +]
Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Oct 15 06:13:50 2024 +0200

    iomap: turn iomap_want_unshare_iter into an inline function
    
    [ Upstream commit 6db388585e486c0261aeef55f8bc63a9b45756c0 ]
    
    iomap_want_unshare_iter currently sits in fs/iomap/buffered-io.c, which
    depends on CONFIG_BLOCK.  It is also used in fs/dax.c, which has no
    such dependency.  Given that it is a trivial check, turn it into an inline
    in include/linux/iomap.h to fix the DAX && !BLOCK build.
    
    Fixes: 6ef6a0e821d3 ("iomap: share iomap_unshare_iter predicate code with fsdax")
    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Link: https://lore.kernel.org/r/20241015041350.118403-1-hch@lst.de
    Reviewed-by: Brian Foster <bfoster@redhat.com>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ipv4: ip_tunnel: Fix suspicious RCU usage warning in ip_tunnel_init_flow() [+ + +]
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Tue Oct 22 09:38:22 2024 +0300

    ipv4: ip_tunnel: Fix suspicious RCU usage warning in ip_tunnel_init_flow()
    
    [ Upstream commit ad4a3ca6a8e886f6491910a3ae5d53595e40597d ]
    
    There are code paths from which the function is called without holding
    the RCU read lock, resulting in a suspicious RCU usage warning [1].
    
    Fix by using l3mdev_master_upper_ifindex_by_index() which will acquire
    the RCU read lock before calling
    l3mdev_master_upper_ifindex_by_index_rcu().
    
    [1]
    WARNING: suspicious RCU usage
    6.12.0-rc3-custom-gac8f72681cf2 #141 Not tainted
    -----------------------------
    net/core/dev.c:876 RCU-list traversed in non-reader section!!
    
    other info that might help us debug this:
    
    rcu_scheduler_active = 2, debug_locks = 1
    1 lock held by ip/361:
     #0: ffffffff86fc7cb0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x377/0xf60
    
    stack backtrace:
    CPU: 3 UID: 0 PID: 361 Comm: ip Not tainted 6.12.0-rc3-custom-gac8f72681cf2 #141
    Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    Call Trace:
     <TASK>
     dump_stack_lvl+0xba/0x110
     lockdep_rcu_suspicious.cold+0x4f/0xd6
     dev_get_by_index_rcu+0x1d3/0x210
     l3mdev_master_upper_ifindex_by_index_rcu+0x2b/0xf0
     ip_tunnel_bind_dev+0x72f/0xa00
     ip_tunnel_newlink+0x368/0x7a0
     ipgre_newlink+0x14c/0x170
     __rtnl_newlink+0x1173/0x19c0
     rtnl_newlink+0x6c/0xa0
     rtnetlink_rcv_msg+0x3cc/0xf60
     netlink_rcv_skb+0x171/0x450
     netlink_unicast+0x539/0x7f0
     netlink_sendmsg+0x8c1/0xd80
     ____sys_sendmsg+0x8f9/0xc20
     ___sys_sendmsg+0x197/0x1e0
     __sys_sendmsg+0x122/0x1f0
     do_syscall_64+0xbb/0x1d0
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    Fixes: db53cd3d88dc ("net: Handle l3mdev in ip_tunnel_init_flow")
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Link: https://patch.msgid.link/20241022063822.462057-1-idosch@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
kasan: Fix Software Tag-Based KASAN with GCC [+ + +]
Author: Marco Elver <elver@google.com>
Date:   Mon Oct 21 14:00:10 2024 +0200

    kasan: Fix Software Tag-Based KASAN with GCC
    
    [ Upstream commit 894b00a3350c560990638bdf89bdf1f3d5491950 ]
    
    Per [1], -fsanitize=kernel-hwaddress with GCC currently does not disable
    instrumentation in functions with __attribute__((no_sanitize_address)).
    
    However, __attribute__((no_sanitize("hwaddress"))) does correctly
    disable instrumentation. Use it instead.
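
    As an illustration only, the attribute form named above can be applied
    to a function like this.  The macro name EXAMPLE_NO_SANITIZE_HWADDRESS
    and the function are hypothetical; the kernel's own definition lives in
    its compiler headers and is not reproduced here.

```c
#include <assert.h>

/* Hypothetical macro; falls back to nothing if the attribute is unknown. */
#if defined(__has_attribute)
# if __has_attribute(no_sanitize)
#  define EXAMPLE_NO_SANITIZE_HWADDRESS __attribute__((no_sanitize("hwaddress")))
# endif
#endif
#ifndef EXAMPLE_NO_SANITIZE_HWADDRESS
# define EXAMPLE_NO_SANITIZE_HWADDRESS
#endif

/* A function that asks the compiler not to add hwaddress instrumentation. */
EXAMPLE_NO_SANITIZE_HWADDRESS
static int example_read_first_byte(const unsigned char *buf)
{
    assert(buf);
    return buf[0];
}
```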
    
    Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117196 [1]
    Link: https://lore.kernel.org/r/000000000000f362e80620e27859@google.com
    Link: https://lore.kernel.org/r/ZvFGwKfoC4yVjN_X@J2N7QTR9R3
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=218854
    Reported-by: syzbot+908886656a02769af987@syzkaller.appspotmail.com
    Tested-by: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrew Pinski <pinskia@gmail.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Marco Elver <elver@google.com>
    Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
    Fixes: 7b861a53e46b ("kasan: Bump required compiler version")
    Link: https://lore.kernel.org/r/20241021120013.3209481-1-elver@google.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kasan: remove vmalloc_percpu test [+ + +]
Author: Andrey Konovalov <andreyknvl@gmail.com>
Date:   Tue Oct 22 18:07:06 2024 +0200

    kasan: remove vmalloc_percpu test
    
    [ Upstream commit 330d8df81f3673d6fb74550bbc9bb159d81b35f7 ]
    
    Commit 1a2473f0cbc0 ("kasan: improve vmalloc tests") added the
    vmalloc_percpu KASAN test with the assumption that __alloc_percpu always
    uses vmalloc internally, which is tagged by KASAN.
    
    However, __alloc_percpu might allocate memory from the first per-CPU
    chunk, which is not allocated via vmalloc().  As a result, the test might
    fail.
    
    Remove the test until proper KASAN annotations for per-CPU allocations
    are added; tracked in https://bugzilla.kernel.org/show_bug.cgi?id=215019.
    
    Link: https://lkml.kernel.org/r/20241022160706.38943-1-andrey.konovalov@linux.dev
    Fixes: 1a2473f0cbc0 ("kasan: improve vmalloc tests")
    Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com>
    Reported-by: Samuel Holland <samuel.holland@sifive.com>
    Link: https://lore.kernel.org/all/4a245fff-cc46-44d1-a5f9-fd2f1c3764ae@sifive.com/
    Reported-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
    Link: https://lore.kernel.org/all/CACzwLxiWzNqPBp4C1VkaXZ2wDwvY3yZeetCi1TLGFipKW77drA@mail.gmail.com/
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Marco Elver <elver@google.com>
    Cc: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
Linux: Linux 6.1.116 [+ + +]
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Fri Nov 8 16:26:48 2024 +0100

    Linux 6.1.116
    
    Link: https://lore.kernel.org/r/20241106120306.038154857@linuxfoundation.org
    Tested-by: SeongJae Park <sj@kernel.org>
    Tested-by: Pavel Machek (CIP) <pavel@denx.de>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: Peter Schneider <pschneider1968@googlemail.com>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Sven Joachim <svenjoac@gmx.de>
    Tested-by: Salvatore Bonaccorso <carnil@debian.org>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Hardik Garg <hargar@linux.microsoft.com>
    Tested-by: Yann Sionneau <ysionneau@kalrayinc.com>
    Link: https://lore.kernel.org/r/20241107064547.006019150@linuxfoundation.org
    Tested-by: Luna Jernberg <droidbittin@gmail.com>
    Tested-by: Pavel Machek (CIP) <pavel@denx.de>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Peter Schneider <pschneider1968@googlemail.com>
    Tested-by: Mark Brown <broonie@kernel.org>
    Tested-by: Ronald Warsow <rwarsow@gmx.de>
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: Salvatore Bonaccorso <carnil@debian.org>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Hardik Garg <hargar@linux.microsoft.com>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Tested-by: kernelci.org bot <bot@kernelci.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
LoongArch: Fix build errors due to backported TIMENS [+ + +]
Author: Huacai Chen <chenhuacai@kernel.org>
Date:   Sat Nov 2 11:36:16 2024 +0800

    LoongArch: Fix build errors due to backported TIMENS
    
    Commit eb3710efffce1dcff83761db4615f91d93aabfcb ("LoongArch: Add support
    to clone a time namespace") backports the TIMENS support for LoongArch
    (corresponding upstream commit aa5e65dc0818bbf676bf06927368ec46867778fd)
    but causes build errors:
    
      CC      arch/loongarch/kernel/vdso.o
    arch/loongarch/kernel/vdso.c: In function ‘vvar_fault’:
    arch/loongarch/kernel/vdso.c:54:36: error: implicit declaration of
    function ‘find_timens_vvar_page’ [-Werror=implicit-function-declaration]
       54 |         struct page *timens_page = find_timens_vvar_page(vma);
          |                                    ^~~~~~~~~~~~~~~~~~~~~
    arch/loongarch/kernel/vdso.c:54:36: warning: initialization of ‘struct
    page *’ from ‘int’ makes pointer from integer without a cast
    [-Wint-conversion]
    arch/loongarch/kernel/vdso.c: In function ‘vdso_join_timens’:
    arch/loongarch/kernel/vdso.c:143:25: error: implicit declaration of
    function ‘zap_vma_pages’; did you mean ‘zap_vma_ptes’?
    [-Werror=implicit-function-declaration]
      143 |                         zap_vma_pages(vma);
          |                         ^~~~~~~~~~~~~
          |                         zap_vma_ptes
    cc1: some warnings being treated as errors
    
    In 6.1.y, find_timens_vvar_page() must be defined locally and
    zap_page_range() used instead of zap_vma_pages(), so fix the build
    accordingly.
    
    Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mac80211: MAC80211_MESSAGE_TRACING should depend on TRACING [+ + +]
Author: Geert Uytterhoeven <geert@linux-m68k.org>
Date:   Tue Sep 24 14:08:57 2024 +0200

    mac80211: MAC80211_MESSAGE_TRACING should depend on TRACING
    
    [ Upstream commit b3e046c31441d182b954fc2f57b2dc38c71ad4bc ]
    
    When tracing is disabled, there is no point in asking the user about
    enabling tracing of all mac80211 debug messages.
    
    Fixes: 3fae0273168026ed ("mac80211: trace debug messages")
    Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Link: https://patch.msgid.link/85bbe38ce0df13350f45714e2dc288cc70947a19.1727179690.git.geert@linux-m68k.org
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
macsec: Fix use-after-free while sending the offloading packet [+ + +]
Author: Jianbo Liu <jianbol@nvidia.com>
Date:   Mon Oct 21 13:03:09 2024 +0300

    macsec: Fix use-after-free while sending the offloading packet
    
    [ Upstream commit f1e54d11b210b53d418ff1476c6b58a2f434dfc0 ]
    
    KASAN reports the following UAF. The metadata_dst, which is used to
    store the SCI value for macsec offload, is already freed by
    metadata_dst_free() in macsec_free_netdev(), while the driver still uses
    it for sending the packet.
    
    To fix this issue, dst_release() is used instead to release
    metadata_dst. So it is not freed instantly in macsec_free_netdev() if
    still referenced by skb.
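
    The difference between freeing outright and dropping a reference can
    be sketched with a small user-space model.  The model_dst_* names are
    hypothetical and only imitate the hold/release idea; they are not the
    kernel's dst API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for the metadata dst. */
struct model_dst {
    int refcnt;
    int sci;
};

static int model_live_objects;

static struct model_dst *model_dst_alloc(int sci)
{
    struct model_dst *d = malloc(sizeof(*d));

    if (!d)
        return NULL;
    d->refcnt = 1; /* reference owned by the device */
    d->sci = sci;
    model_live_objects++;
    return d;
}

/* Taken by an in-flight packet that still points at the object. */
static void model_dst_hold(struct model_dst *d)
{
    d->refcnt++;
}

/* Drops one reference; the memory is freed only when the last user is gone. */
static void model_dst_release(struct model_dst *d)
{
    assert(d->refcnt > 0);
    if (--d->refcnt == 0) {
        free(d);
        model_live_objects--;
    }
}
```

    If device teardown calls model_dst_release() while a packet still holds
    a reference, the object stays valid until the packet drops its own
    reference.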
    
     BUG: KASAN: slab-use-after-free in mlx5e_xmit+0x1e8f/0x4190 [mlx5_core]
     Read of size 2 at addr ffff88813e42e038 by task kworker/7:2/714
     [...]
     Workqueue: mld mld_ifc_work
     Call Trace:
      <TASK>
      dump_stack_lvl+0x51/0x60
      print_report+0xc1/0x600
      kasan_report+0xab/0xe0
      mlx5e_xmit+0x1e8f/0x4190 [mlx5_core]
      dev_hard_start_xmit+0x120/0x530
      sch_direct_xmit+0x149/0x11e0
      __qdisc_run+0x3ad/0x1730
      __dev_queue_xmit+0x1196/0x2ed0
      vlan_dev_hard_start_xmit+0x32e/0x510 [8021q]
      dev_hard_start_xmit+0x120/0x530
      __dev_queue_xmit+0x14a7/0x2ed0
      macsec_start_xmit+0x13e9/0x2340
      dev_hard_start_xmit+0x120/0x530
      __dev_queue_xmit+0x14a7/0x2ed0
      ip6_finish_output2+0x923/0x1a70
      ip6_finish_output+0x2d7/0x970
      ip6_output+0x1ce/0x3a0
      NF_HOOK.constprop.0+0x15f/0x190
      mld_sendpack+0x59a/0xbd0
      mld_ifc_work+0x48a/0xa80
      process_one_work+0x5aa/0xe50
      worker_thread+0x79c/0x1290
      kthread+0x28f/0x350
      ret_from_fork+0x2d/0x70
      ret_from_fork_asm+0x11/0x20
      </TASK>
    
     Allocated by task 3922:
      kasan_save_stack+0x20/0x40
      kasan_save_track+0x10/0x30
      __kasan_kmalloc+0x77/0x90
      __kmalloc_noprof+0x188/0x400
      metadata_dst_alloc+0x1f/0x4e0
      macsec_newlink+0x914/0x1410
      __rtnl_newlink+0xe08/0x15b0
      rtnl_newlink+0x5f/0x90
      rtnetlink_rcv_msg+0x667/0xa80
      netlink_rcv_skb+0x12c/0x360
      netlink_unicast+0x551/0x770
      netlink_sendmsg+0x72d/0xbd0
      __sock_sendmsg+0xc5/0x190
      ____sys_sendmsg+0x52e/0x6a0
      ___sys_sendmsg+0xeb/0x170
      __sys_sendmsg+0xb5/0x140
      do_syscall_64+0x4c/0x100
      entry_SYSCALL_64_after_hwframe+0x4b/0x53
    
     Freed by task 4011:
      kasan_save_stack+0x20/0x40
      kasan_save_track+0x10/0x30
      kasan_save_free_info+0x37/0x50
      poison_slab_object+0x10c/0x190
      __kasan_slab_free+0x11/0x30
      kfree+0xe0/0x290
      macsec_free_netdev+0x3f/0x140
      netdev_run_todo+0x450/0xc70
      rtnetlink_rcv_msg+0x66f/0xa80
      netlink_rcv_skb+0x12c/0x360
      netlink_unicast+0x551/0x770
      netlink_sendmsg+0x72d/0xbd0
      __sock_sendmsg+0xc5/0x190
      ____sys_sendmsg+0x52e/0x6a0
      ___sys_sendmsg+0xeb/0x170
      __sys_sendmsg+0xb5/0x140
      do_syscall_64+0x4c/0x100
      entry_SYSCALL_64_after_hwframe+0x4b/0x53
    
    Fixes: 0a28bfd4971f ("net/macsec: Add MACsec skb_metadata_dst Tx Data path support")
    Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
    Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
    Reviewed-by: Chris Mi <cmi@nvidia.com>
    Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
    Link: https://patch.msgid.link/20241021100309.234125-1-tariqt@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mctp i2c: handle NULL header address [+ + +]
Author: Matt Johnston <matt@codeconstruct.com.au>
Date:   Tue Oct 22 18:25:14 2024 +0800

    mctp i2c: handle NULL header address
    
    [ Upstream commit 01e215975fd80af81b5b79f009d49ddd35976c13 ]
    
    daddr can be NULL if there is no neighbour table entry present;
    in that case the tx packet should be dropped.
    
    saddr will usually be set by MCTP core, but check for NULL in case a
    packet is transmitted by a different protocol.
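
    A minimal sketch of the check, assuming a header builder that receives
    both addresses as pointers.  The function name, the two-byte header
    layout and the -EINVAL return value are assumptions made for
    illustration, not the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical header builder: refuses to build a header when either
 * address is missing, so the caller can drop the packet.
 * Returns the header length on success or a negative errno.
 */
static int model_header_create(unsigned char *hdr,
                               const unsigned char *daddr,
                               const unsigned char *saddr)
{
    assert(hdr);
    if (!daddr || !saddr)
        return -EINVAL;

    hdr[0] = *daddr; /* destination address byte */
    hdr[1] = *saddr; /* source address byte */
    return 2;
}
```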
    
    Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver")
    Cc: stable@vger.kernel.org
    Reported-by: Dung Cao <dung@os.amperecomputing.com>
    Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20241022-mctp-i2c-null-dest-v3-1-e929709956c5@codeconstruct.com.au
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
migrate: convert migrate_pages() to use folios [+ + +]
Author: Huang Ying <ying.huang@intel.com>
Date:   Wed Nov 9 09:23:48 2022 +0800

    migrate: convert migrate_pages() to use folios
    
    [ Upstream commit eaec4e639f11413ce75fbf38affd1aa5c40979e9 ]
    
    Quite straightforward, the page functions are converted to corresponding
    folio functions.  Same for comments.
    
    THP-specific code is converted to handle large folios.
    
    Link: https://lkml.kernel.org/r/20221109012348.93849-3-ying.huang@intel.com
    Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Cc: Yang Shi <shy828301@gmail.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 35e41024c4c2 ("vmscan,migrate: fix page count imbalance on node stats when demoting pages")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

migrate: convert unmap_and_move() to use folios [+ + +]
Author: Huang Ying <ying.huang@intel.com>
Date:   Wed Nov 9 09:23:47 2022 +0800

    migrate: convert unmap_and_move() to use folios
    
    [ Upstream commit 49f51859221a3dfee27488eaeaff800459cac6a9 ]
    
    Patch series "migrate: convert migrate_pages()/unmap_and_move() to use
    folios", v2.
    
    The conversion is quite straightforward, just replace the page API with the
    corresponding folio API.  migrate_pages() and unmap_and_move() mostly work
    with folios (head pages) only.
    
    This patch (of 2):
    
    Quite straightforward, the page functions are converted to corresponding
    folio functions.  Same for comments.
    
    Link: https://lkml.kernel.org/r/20221109012348.93849-1-ying.huang@intel.com
    Link: https://lkml.kernel.org/r/20221109012348.93849-2-ying.huang@intel.com
    Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
    Reviewed-by: Yang Shi <shy828301@gmail.com>
    Reviewed-by: Zi Yan <ziy@nvidia.com>
    Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 35e41024c4c2 ("vmscan,migrate: fix page count imbalance on node stats when demoting pages")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
migrate_pages: organize stats with struct migrate_pages_stats [+ + +]
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Feb 13 20:34:36 2023 +0800

    migrate_pages: organize stats with struct migrate_pages_stats
    
    [ Upstream commit 5b855937096aea7f81e73ad6d40d433c9dd49577 ]
    
    Patch series "migrate_pages(): batch TLB flushing", v5.
    
    Currently, migrate_pages() migrates folios one by one, as in the following
    pseudocode,
    
      for each folio
        unmap
        flush TLB
        copy
        restore map
    
    If multiple folios are passed to migrate_pages(), there are opportunities
    to batch the TLB flushing and copying.  That is, we can change the code to
    something as follows,
    
      for each folio
        unmap
      for each folio
        flush TLB
      for each folio
        copy
      for each folio
        restore map
    
    The total number of TLB flushing IPIs can be reduced considerably.  And we
    may use some hardware accelerator such as DSA to accelerate the folio
    copying.
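
    The two loop shapes above can be compared with a toy user-space model
    that only counts flush rounds.  It assumes one flush round can cover a
    whole batch of unmapped folios; all names are hypothetical and nothing
    here touches real page tables.

```c
#include <assert.h>
#include <stddef.h>

enum model_state { MODEL_MAPPED, MODEL_UNMAPPED, MODEL_COPIED, MODEL_RESTORED };

/* One-by-one: every folio pays for its own flush round. */
static unsigned model_migrate_one_by_one(enum model_state *folios, size_t n)
{
    unsigned flush_rounds = 0;

    for (size_t i = 0; i < n; i++) {
        folios[i] = MODEL_UNMAPPED;
        flush_rounds++;              /* flush TLB for this folio */
        folios[i] = MODEL_COPIED;
        folios[i] = MODEL_RESTORED;  /* restore map */
    }
    return flush_rounds;
}

/* Batched: unmap everything first, flush once, then copy and restore. */
static unsigned model_migrate_batched(enum model_state *folios, size_t n)
{
    unsigned flush_rounds = 0;
    size_t i;

    for (i = 0; i < n; i++)
        folios[i] = MODEL_UNMAPPED;
    if (n > 0)
        flush_rounds++;              /* assumed: one round covers the batch */
    for (i = 0; i < n; i++) {
        assert(folios[i] == MODEL_UNMAPPED);
        folios[i] = MODEL_COPIED;
    }
    for (i = 0; i < n; i++)
        folios[i] = MODEL_RESTORED;
    return flush_rounds;
}
```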
    
    So in this patch, we refactor the migrate_pages() implementation and
    implement the TLB flushing batching.  Based on this, hardware accelerated
    folio copying can be implemented.
    
    If too many folios are passed to migrate_pages(), in the naive batched
    implementation, we may unmap too many folios at the same time.  The
    possibility for a task to wait for the migrated folios to be mapped again
    increases.  So the latency may be hurt.  To deal with this issue, the max
    number of folios to be unmapped in a batch is restricted to no more than
    HPAGE_PMD_NR, counted in pages.  That is, the influence is at the same
    level as THP migration.
    
    We use the following test to measure the performance impact of the
    patchset,
    
    On a 2-socket Intel server,
    
     - Run pmbench memory accessing benchmark
    
     - Run `migratepages` to migrate pages of pmbench between node 0 and
       node 1 back and forth.
    
    With the patch, the number of TLB flushing IPIs is reduced by 99.1% during
    the test and the number of pages migrated successfully per second
    increases by 291.7%.
    
    Xin Hao helped to test the patchset on an ARM64 server with 128 cores,
    2 NUMA nodes.  Test results show that the page migration performance
    increases up to 78%.
    
    This patch (of 9):
    
    Define struct migrate_pages_stats to organize the various statistics in
    migrate_pages().  This makes it easier to collect and consume the
    statistics in multiple functions.  This will be needed in the following
    patches in the series.
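
    The idea of passing one statistics structure between helpers, rather
    than several separate counters, can be sketched as below.  The
    structure and field names here are illustrative assumptions and do not
    reproduce the kernel's struct migrate_pages_stats.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statistics bundle shared by several helpers. */
struct model_migrate_stats {
    int nr_succeeded;
    int nr_failed;
};

/* Helper 1: accounts for the outcome of a single migration attempt. */
static void model_account(struct model_migrate_stats *stats, int rc)
{
    if (rc == 0)
        stats->nr_succeeded++;
    else
        stats->nr_failed++;
}

/* Helper 2: walks a list of outcomes and reuses the same structure. */
static int model_migrate_list(const int *results, size_t n,
                              struct model_migrate_stats *stats)
{
    assert(stats);
    for (size_t i = 0; i < n; i++)
        model_account(stats, results[i]);
    return stats->nr_failed;
}
```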
    
    Link: https://lkml.kernel.org/r/20230213123444.155149-1-ying.huang@intel.com
    Link: https://lkml.kernel.org/r/20230213123444.155149-2-ying.huang@intel.com
    Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
    Reviewed-by: Alistair Popple <apopple@nvidia.com>
    Reviewed-by: Zi Yan <ziy@nvidia.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
    Cc: Yang Shi <shy828301@gmail.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Bharata B Rao <bharata@amd.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 35e41024c4c2 ("vmscan,migrate: fix page count imbalance on node stats when demoting pages")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

migrate_pages: restrict number of pages to migrate in batch [+ + +]
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Feb 13 20:34:38 2023 +0800

    migrate_pages: restrict number of pages to migrate in batch
    
    [ Upstream commit 42012e0436d44aeb2e68f11a28ddd0ad3f38b61f ]
    
    This is a preparation patch to batch the folio unmapping and moving for
    non-hugetlb folios.
    
    If we had batched the folio unmapping, all folios to be migrated would be
    unmapped before copying the contents and flags of the folios.  If the
    folios passed to migrate_pages() amounted to too many pages, the
    processes using them would be stalled for too long, hurting latency.
    For example, the migrate_pages() syscall will call migrate_pages() with
    all folios of a process.  To avoid this possible
    issue, in this patch, we restrict the number of pages to be migrated to be
    no more than HPAGE_PMD_NR.  That is, the influence is at the same level of
    THP migration.
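
    One way to picture the restriction is a splitter that cuts a list of
    folios into batches whose page total stays within a limit.  This is a
    simplified sketch under that assumption; the cut rule and names are
    hypothetical, and a single folio larger than the limit still gets a
    batch of its own.

```c
#include <assert.h>
#include <stddef.h>

/*
 * pages[i] is the size of folio i in pages.  Returns how many batches
 * are needed so that each batch holds at most `limit` pages, except for
 * a batch made of one oversized folio.
 */
static size_t model_count_batches(const unsigned *pages, size_t n,
                                  unsigned limit)
{
    size_t batches = 0;
    unsigned in_batch = 0;

    assert(limit > 0);
    for (size_t i = 0; i < n; i++) {
        /* Close the current batch if this folio would overflow it. */
        if (in_batch > 0 && in_batch + pages[i] > limit) {
            batches++;
            in_batch = 0;
        }
        in_batch += pages[i];
    }
    if (in_batch > 0)
        batches++;
    return batches;
}
```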
    
    Link: https://lkml.kernel.org/r/20230213123444.155149-4-ying.huang@intel.com
    Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Cc: Yang Shi <shy828301@gmail.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Bharata B Rao <bharata@amd.com>
    Cc: Alistair Popple <apopple@nvidia.com>
    Cc: Xin Hao <xhao@linux.alibaba.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 35e41024c4c2 ("vmscan,migrate: fix page count imbalance on node stats when demoting pages")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

migrate_pages: separate hugetlb folios migration [+ + +]
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Feb 13 20:34:37 2023 +0800

    migrate_pages: separate hugetlb folios migration
    
    [ Upstream commit e5bfff8b10e496378da4b7863479dd6fb907d4ea ]
    
    This is a preparation patch to batch the folio unmapping and moving for
    the non-hugetlb folios.  Based on that we can batch the TLB shootdown
    during the folio migration and make it possible to use some hardware
    accelerator for the folio copying.
    
    In this patch the hugetlb folios and non-hugetlb folios migration is
    separated in migrate_pages() to make it easy to change the non-hugetlb
    folios migration implementation.
    
    Link: https://lkml.kernel.org/r/20230213123444.155149-3-ying.huang@intel.com
    Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Cc: Yang Shi <shy828301@gmail.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Bharata B Rao <bharata@amd.com>
    Cc: Alistair Popple <apopple@nvidia.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 35e41024c4c2 ("vmscan,migrate: fix page count imbalance on node stats when demoting pages")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

migrate_pages: split unmap_and_move() to _unmap() and _move() [+ + +]
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Feb 13 20:34:39 2023 +0800

    migrate_pages: split unmap_and_move() to _unmap() and _move()
    
    [ Upstream commit 64c8902ed4418317cd416c566f896bd4a92b2efc ]
    
    This is a preparation patch to batch the folio unmapping and moving.
    
    In this patch, unmap_and_move() is split to migrate_folio_unmap() and
    migrate_folio_move().  So, we can batch _unmap() and _move() in different
    loops later.  To pass some information between unmap and move, the
    otherwise unused dst->mapping and dst->private are used.
    
    Link: https://lkml.kernel.org/r/20230213123444.155149-5-ying.huang@intel.com
    Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Cc: Yang Shi <shy828301@gmail.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Bharata B Rao <bharata@amd.com>
    Cc: Alistair Popple <apopple@nvidia.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 35e41024c4c2 ("vmscan,migrate: fix page count imbalance on node stats when demoting pages")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
migrate_pages_batch: fix statistics for longterm pin retry [+ + +]
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Apr 17 07:59:29 2023 +0800

    migrate_pages_batch: fix statistics for longterm pin retry
    
    commit 851ae6424697d1c4f085cb878c88168923ebcad1 upstream.
    
    In commit fd4a7ac32918 ("mm: migrate: try again if THP split is failed due
    to page refcnt"), if the THP splitting fails due to page reference count,
    we will retry to improve the migration success rate.  But the failed
    splitting is counted as migration failure and migration retry, which will
    cause duplicated failure counting.  So, in this patch, this is fixed via
    undoing the failure counting if we decide to retry.  The patch is tested
    via failure injection.
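
    The accounting rule can be modelled roughly as follows: a failure is
    counted first and then undone if a retry is chosen, so that only the
    final outcome is recorded.  The structure and function are
    hypothetical and much simpler than the real migration loop.

```c
#include <assert.h>

/* Hypothetical counters for the model. */
struct model_retry_stats {
    int nr_failed;
    int nr_retried;
};

/*
 * split_failed: nonzero if splitting the large folio failed.
 * will_retry:   nonzero if the caller decides to try again later.
 */
static void model_account_split(struct model_retry_stats *s,
                                int split_failed, int will_retry)
{
    assert(s);
    if (!split_failed)
        return;

    s->nr_failed++;          /* counted as a failure up front */
    if (will_retry) {
        s->nr_failed--;      /* undo: the retry decides the final outcome */
        s->nr_retried++;
    }
}
```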
    
    Link: https://lkml.kernel.org/r/20230416235929.1040194-1-ying.huang@intel.com
    Fixes: fd4a7ac32918 ("mm: migrate: try again if THP split is failed due to page refcnt")
    Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Alistair Popple <apopple@nvidia.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Yang Shi <shy828301@gmail.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
misc: sgi-gru: Don't disable preemption in GRU driver [+ + +]
Author: Dimitri Sivanich <sivanich@hpe.com>
Date:   Thu Sep 19 07:34:50 2024 -0500

    misc: sgi-gru: Don't disable preemption in GRU driver
    
    [ Upstream commit b983b271662bd6104d429b0fd97af3333ba760bf ]
    
    Disabling preemption in the GRU driver is unnecessary, and clashes with
    sleeping locks in several code paths.  Remove preempt_disable and
    preempt_enable from the GRU driver.
    
    Signed-off-by: Dimitri Sivanich <sivanich@hpe.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mlxsw: spectrum_ipip: Fix memory leak when changing remote IPv6 address [+ + +]
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Fri Oct 25 16:26:28 2024 +0200

    mlxsw: spectrum_ipip: Fix memory leak when changing remote IPv6 address
    
    [ Upstream commit 12ae97c531fcd3bfd774d4dfeaeac23eafe24280 ]
    
    The device stores IPv6 addresses that are used for encapsulation in
    linear memory that is managed by the driver.
    
    Changing the remote address of an ip6gre net device never worked
    properly, but since cited commit the following reproducer [1] would
    result in a warning [2] and a memory leak [3]. The problem is that the
    new remote address is never added by the driver to its hash table (and
    therefore the device) and the old address is never removed from it.
    
    Fix by programming the new address when the configuration of the ip6gre
    net device changes and removing the old one. If the address did not
    change, then the above would result in increasing the reference count of
    the address and then decreasing it.
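
    The "take the new address, then drop the old one" ordering can be
    sketched with a tiny refcounted table.  Everything here is a
    hypothetical user-space model (integer addresses, fixed-size table),
    not the driver's hash table.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_TABLE_SIZE 4

struct model_addr {
    int addr;
    int refcnt; /* 0 means the slot is free */
};

static struct model_addr model_table[MODEL_TABLE_SIZE];

static struct model_addr *model_find(int addr)
{
    for (int i = 0; i < MODEL_TABLE_SIZE; i++)
        if (model_table[i].refcnt > 0 && model_table[i].addr == addr)
            return &model_table[i];
    return NULL;
}

/* Take a reference, creating the entry on first use. */
static int model_addr_get(int addr)
{
    struct model_addr *e = model_find(addr);

    for (int i = 0; i < MODEL_TABLE_SIZE && !e; i++)
        if (model_table[i].refcnt == 0) {
            e = &model_table[i];
            e->addr = addr;
        }
    if (!e)
        return -1; /* table full */
    e->refcnt++;
    return 0;
}

/* Drop a reference; the slot becomes free when the count reaches zero. */
static void model_addr_put(int addr)
{
    struct model_addr *e = model_find(addr);

    assert(e);
    e->refcnt--;
}

static int model_refcnt(int addr)
{
    struct model_addr *e = model_find(addr);

    return e ? e->refcnt : 0;
}

/* Get the new address before putting the old one, so an unchanged
 * address never drops to zero in between. */
static int model_change_remote(int old_addr, int new_addr)
{
    if (model_addr_get(new_addr))
        return -1;
    model_addr_put(old_addr);
    return 0;
}
```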
    
    [1]
     # ip link add name bla up type ip6gre local 2001:db8:1::1 remote 2001:db8:2::1 tos inherit ttl inherit
     # ip link set dev bla type ip6gre remote 2001:db8:3::1
     # ip link del dev bla
     # devlink dev reload pci/0000:01:00.0
    
    [2]
    WARNING: CPU: 0 PID: 1682 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:3002 mlxsw_sp_ipv6_addr_put+0x140/0x1d0
    Modules linked in:
    CPU: 0 UID: 0 PID: 1682 Comm: ip Not tainted 6.12.0-rc3-custom-g86b5b55bc835 #151
    Hardware name: Nvidia SN5600/VMOD0013, BIOS 5.13 05/31/2023
    RIP: 0010:mlxsw_sp_ipv6_addr_put+0x140/0x1d0
    [...]
    Call Trace:
     <TASK>
     mlxsw_sp_router_netdevice_event+0x55f/0x1240
     notifier_call_chain+0x5a/0xd0
     call_netdevice_notifiers_info+0x39/0x90
     unregister_netdevice_many_notify+0x63e/0x9d0
     rtnl_dellink+0x16b/0x3a0
     rtnetlink_rcv_msg+0x142/0x3f0
     netlink_rcv_skb+0x50/0x100
     netlink_unicast+0x242/0x390
     netlink_sendmsg+0x1de/0x420
     ____sys_sendmsg+0x2bd/0x320
     ___sys_sendmsg+0x9a/0xe0
     __sys_sendmsg+0x7a/0xd0
     do_syscall_64+0x9e/0x1a0
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    [3]
    unreferenced object 0xffff898081f597a0 (size 32):
      comm "ip", pid 1626, jiffies 4294719324
      hex dump (first 32 bytes):
        20 01 0d b8 00 02 00 00 00 00 00 00 00 00 00 01   ...............
        21 49 61 83 80 89 ff ff 00 00 00 00 01 00 00 00  !Ia.............
      backtrace (crc fd9be911):
        [<00000000df89c55d>] __kmalloc_cache_noprof+0x1da/0x260
        [<00000000ff2a1ddb>] mlxsw_sp_ipv6_addr_kvdl_index_get+0x281/0x340
        [<000000009ddd445d>] mlxsw_sp_router_netdevice_event+0x47b/0x1240
        [<00000000743e7757>] notifier_call_chain+0x5a/0xd0
        [<000000007c7b9e13>] call_netdevice_notifiers_info+0x39/0x90
        [<000000002509645d>] register_netdevice+0x5f7/0x7a0
        [<00000000c2e7d2a9>] ip6gre_newlink_common.isra.0+0x65/0x130
        [<0000000087cd6d8d>] ip6gre_newlink+0x72/0x120
        [<000000004df7c7cc>] rtnl_newlink+0x471/0xa20
        [<0000000057ed632a>] rtnetlink_rcv_msg+0x142/0x3f0
        [<0000000032e0d5b5>] netlink_rcv_skb+0x50/0x100
        [<00000000908bca63>] netlink_unicast+0x242/0x390
        [<00000000cdbe1c87>] netlink_sendmsg+0x1de/0x420
        [<0000000011db153e>] ____sys_sendmsg+0x2bd/0x320
        [<000000003b6d53eb>] ___sys_sendmsg+0x9a/0xe0
        [<00000000cae27c62>] __sys_sendmsg+0x7a/0xd0
    
    Fixes: cf42911523e0 ("mlxsw: spectrum_ipip: Use common hash table for IPv6 address mapping")
    Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: Petr Machata <petrm@nvidia.com>
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Link: https://patch.msgid.link/e91012edc5a6cb9df37b78fd377f669381facfcb.1729866134.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mlxsw: spectrum_ipip: Rename Spectrum-2 ip6gre operations [+ + +]
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Wed Dec 7 13:36:45 2022 +0100

    mlxsw: spectrum_ipip: Rename Spectrum-2 ip6gre operations
    
    [ Upstream commit ab30e4d4b29ba530c65406e8a146630d0663c570 ]
    
    There are two main differences between Spectrum-1 and newer ASICs in
    terms of IP-in-IP support:
    
    1. In Spectrum-1, RIFs representing ip6gre tunnels require two entries
       in the RIF table.
    
    2. In Spectrum-2 and newer ASICs, packets ingress the underlay (during
       encapsulation) and egress the underlay (during decapsulation) via a
       special generic loopback RIF.
    
    The first difference was handled in previous patches by adding the
    'double_rif_entry' field to the Spectrum-1 operations structure of
    ip6gre RIFs. The second difference is handled during RIF creation, by
    only creating a generic loopback RIF in Spectrum-2 and newer ASICs.
    
    Therefore, the ip6gre operations can be shared between Spectrum-1 and
    newer ASICs in a similar fashion to how the ipgre operations are shared.
    
    Rename the operations to not be Spectrum-2 specific and move them
    earlier in the file so that they could later be used for Spectrum-1.
    
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: Amit Cohen <amcohen@nvidia.com>
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 12ae97c531fc ("mlxsw: spectrum_ipip: Fix memory leak when changing remote IPv6 address")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mlxsw: spectrum_ptp: Add missing verification before pushing Tx header [+ + +]
Author: Amit Cohen <amcohen@nvidia.com>
Date:   Fri Oct 25 16:26:25 2024 +0200

    mlxsw: spectrum_ptp: Add missing verification before pushing Tx header
    
    [ Upstream commit 0a66e5582b5102c4d7b866b977ff7c850c1174ce ]
    
    Tx header should be pushed for each packet which is transmitted via
    Spectrum ASICs. The cited commit moved the call to skb_cow_head() from
    mlxsw_sp_port_xmit() to functions which handle Tx header.
    
    When mlxsw_sp->ptp_ops->txhdr_construct() is used to handle the Tx
    header and txhdr_construct() is mlxsw_sp_ptp_txhdr_construct(), there is
    no call to skb_cow_head() before pushing the Tx header to the SKB. This
    flow is relevant for Spectrum-1 and Spectrum-4, for PTP packets.
    
    Add the missing call to skb_cow_head() to make sure that there is both
    enough room to push the Tx header and that the SKB header is not cloned and
    can be modified.
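
    The check-then-push order described above can be sketched in user
    space. This is only a toy model under stated assumptions: struct
    toy_buf and the toy_* helpers are hypothetical stand-ins, not the
    kernel's struct sk_buff, skb_cow_head() or skb_push().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy packet buffer (hypothetical): NOT the kernel's struct sk_buff. */
struct toy_buf {
    unsigned char *head;  /* start of the allocation */
    size_t headroom;      /* free bytes in front of the payload */
    size_t len;           /* payload bytes, stored at head + headroom */
    bool shared;          /* true if another owner also references head */
};

/* Ensure at least `need` bytes of private, writable headroom.
 * Returns 0 on success, -1 on allocation failure. */
static int toy_cow_head(struct toy_buf *b, size_t need)
{
    if (!b->shared && b->headroom >= need)
        return 0;  /* already private and large enough */

    size_t new_headroom = b->headroom > need ? b->headroom : need;
    unsigned char *copy = malloc(new_headroom + b->len);
    if (!copy)
        return -1;
    memcpy(copy + new_headroom, b->head + b->headroom, b->len);
    if (!b->shared)
        free(b->head);  /* we were the only owner of the old buffer */
    b->head = copy;
    b->headroom = new_headroom;
    b->shared = false;
    return 0;
}

/* Prepend hdr_len bytes; the caller must have reserved the room first. */
static unsigned char *toy_push(struct toy_buf *b, size_t hdr_len)
{
    assert(b->headroom >= hdr_len);
    b->headroom -= hdr_len;
    b->len += hdr_len;
    return b->head + b->headroom;
}

/* Verify first, then push a zeroed header. */
static int toy_add_tx_header(struct toy_buf *b, size_t hdr_len)
{
    if (toy_cow_head(b, hdr_len) != 0)
        return -1;
    memset(toy_push(b, hdr_len), 0, hdr_len);
    return 0;
}
```

    Calling toy_add_tx_header() on a shared buffer with no headroom
    leaves the original allocation untouched and prepends the header to
    a private copy, which is the property the missing check guarantees.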
    
    An additional set will be sent to net-next to centralize the handling of
    the Tx header by pushing it to every packet just before transmission.
    
    Cc: Richard Cochran <richardcochran@gmail.com>
    Fixes: 24157bc69f45 ("mlxsw: Send PTP packets as data packets to overcome a limitation")
    Signed-off-by: Amit Cohen <amcohen@nvidia.com>
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Link: https://patch.msgid.link/5145780b07ebbb5d3b3570f311254a3a2d554a44.1729866134.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mlxsw: spectrum_router: Add support for double entry RIFs [+ + +]
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Wed Dec 7 13:36:44 2022 +0100

    mlxsw: spectrum_router: Add support for double entry RIFs
    
    [ Upstream commit 5ca1b208c5d107fd4b9e7801200dea18ab1af8e7 ]
    
    In Spectrum-1, loopback router interfaces (RIFs) used for IP-in-IP
    encapsulation with an IPv6 underlay require two RIF entries and the RIF
    index must be even.
    
    Prepare for this change by extending the RIF parameters structure with a
    'double_entry' field that indicates if the RIF being created requires
    two RIF entries or not. Only set it for RIFs representing ip6gre tunnels
    in Spectrum-1.
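
    The index constraint can be illustrated with a small allocator
    sketch. The table size and the toy_rif_index_alloc() helper are
    assumptions made for illustration; they do not reflect how the
    driver actually tracks RIF indexes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_RIFS 8  /* hypothetical table size */

/* Find a free RIF index and mark it used. With double_entry set, the
 * index must be even and the following slot must be free as well.
 * Returns the index, or -1 if nothing fits. */
static int toy_rif_index_alloc(bool used[TOY_MAX_RIFS], bool double_entry)
{
    int step = double_entry ? 2 : 1;

    assert(used != NULL);
    for (int i = 0; i + step <= TOY_MAX_RIFS; i += step) {
        if (used[i])
            continue;
        if (double_entry && used[i + 1])
            continue;
        used[i] = true;
        if (double_entry)
            used[i + 1] = true;
        return i;
    }
    return -1;
}
```

    With slot 0 already taken, a double-entry request in this sketch
    skips the free odd slot 1 and takes slots 2 and 3.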
    
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: Amit Cohen <amcohen@nvidia.com>
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 12ae97c531fc ("mlxsw: spectrum_ipip: Fix memory leak when changing remote IPv6 address")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mm/migrate.c: stop using 0 as NULL pointer [+ + +]
Author: Yang Li <yang.lee@linux.alibaba.com>
Date:   Wed Nov 16 09:23:45 2022 +0800

    mm/migrate.c: stop using 0 as NULL pointer
    
    [ Upstream commit 4c74b65f478dc9353780a6be17fc82f1b06cea80 ]
    
    mm/migrate.c:1198:24: warning: Using plain integer as NULL pointer
    
    Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3080
    Link: https://lkml.kernel.org/r/20221116012345.84870-1-yang.lee@linux.alibaba.com
    Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
    Reported-by: Abaci Robot <abaci@linux.alibaba.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 35e41024c4c2 ("vmscan,migrate: fix page count imbalance on node stats when demoting pages")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mm/page_alloc: explicitly define how __GFP_HIGH non-blocking allocations accesses reserves [+ + +]
Author: Mel Gorman <mgorman@techsingularity.net>
Date:   Fri Jan 13 11:12:16 2023 +0000

    mm/page_alloc: explicitly define how __GFP_HIGH non-blocking allocations accesses reserves
    
    [ Upstream commit 1ebbb21811b76c3b932959787f37985af36f62fa ]
    
    GFP_ATOMIC allocations get flagged ALLOC_HARDER which is a vague
    description.  In preparation for the removal of GFP_ATOMIC redefine
    __GFP_ATOMIC to simply mean non-blocking and rename ALLOC_HARDER to
    ALLOC_NON_BLOCK accordingly.  __GFP_HIGH is required for access to
    reserves but non-blocking is granted more access.  For example, GFP_NOWAIT
    is non-blocking but has no special access to reserves.  A __GFP_NOFAIL
    blocking allocation is granted access similar to __GFP_HIGH if the only
    alternative is an OOM kill.
    
    Link: https://lkml.kernel.org/r/20230113111217.14134-6-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: NeilBrown <neilb@suse.de>
    Cc: Thierry Reding <thierry.reding@gmail.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 281dd25c1a01 ("mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/page_alloc: explicitly define what alloc flags deplete min reserves [+ + +]
Author: Mel Gorman <mgorman@techsingularity.net>
Date:   Fri Jan 13 11:12:15 2023 +0000

    mm/page_alloc: explicitly define what alloc flags deplete min reserves
    
    [ Upstream commit ab3508854353793cd35e348fde89a5c09b2fd8b5 ]
    
    As there are more ALLOC_ flags that affect reserves, define what flags
    affect reserves and clarify the effect of each flag.
    
    Link: https://lkml.kernel.org/r/20230113111217.14134-5-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: NeilBrown <neilb@suse.de>
    Cc: Thierry Reding <thierry.reding@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 281dd25c1a01 ("mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/page_alloc: explicitly record high-order atomic allocations in alloc_flags [+ + +]
Author: Mel Gorman <mgorman@techsingularity.net>
Date:   Fri Jan 13 11:12:14 2023 +0000

    mm/page_alloc: explicitly record high-order atomic allocations in alloc_flags
    
    [ Upstream commit eb2e2b425c6984ca8034448a3f2c680622bd3d4d ]
    
    A high-order ALLOC_HARDER allocation is assumed to be atomic.  While that
    is accurate, it changes later in the series.  In preparation, explicitly
    record high-order atomic allocations in gfp_to_alloc_flags().
    
    Link: https://lkml.kernel.org/r/20230113111217.14134-4-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: NeilBrown <neilb@suse.de>
    Cc: Thierry Reding <thierry.reding@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 281dd25c1a01 ("mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves [+ + +]
Author: Matt Fleming <mfleming@cloudflare.com>
Date:   Fri Oct 11 13:07:37 2024 +0100

    mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves
    
    [ Upstream commit 281dd25c1a018261a04d1b8bf41a0674000bfe38 ]
    
    Under memory pressure it's possible for GFP_ATOMIC order-0 allocations to
    fail even though free pages are available in the highatomic reserves.
    GFP_ATOMIC allocations cannot trigger unreserve_highatomic_pageblock()
    since it's only run from reclaim.
    
    Given that such allocations will pass the watermarks in
    __zone_watermark_unusable_free(), it makes sense to fallback to highatomic
    reserves the same way that ALLOC_OOM can.
    
    This fixes order-0 page allocation failures observed on Cloudflare's fleet
    when handling network packets:
    
      kswapd1: page allocation failure: order:0, mode:0x820(GFP_ATOMIC),
      nodemask=(null),cpuset=/,mems_allowed=0-7
      CPU: 10 PID: 696 Comm: kswapd1 Kdump: loaded Tainted: G           O 6.6.43-CUSTOM #1
      Hardware name: MACHINE
      Call Trace:
       <IRQ>
       dump_stack_lvl+0x3c/0x50
       warn_alloc+0x13a/0x1c0
       __alloc_pages_slowpath.constprop.0+0xc9d/0xd10
       __alloc_pages+0x327/0x340
       __napi_alloc_skb+0x16d/0x1f0
       bnxt_rx_page_skb+0x96/0x1b0 [bnxt_en]
       bnxt_rx_pkt+0x201/0x15e0 [bnxt_en]
       __bnxt_poll_work+0x156/0x2b0 [bnxt_en]
       bnxt_poll+0xd9/0x1c0 [bnxt_en]
       __napi_poll+0x2b/0x1b0
       bpf_trampoline_6442524138+0x7d/0x1000
       __napi_poll+0x5/0x1b0
       net_rx_action+0x342/0x740
       handle_softirqs+0xcf/0x2b0
       irq_exit_rcu+0x6c/0x90
       sysvec_apic_timer_interrupt+0x72/0x90
       </IRQ>
    
    [mfleming@cloudflare.com: update comment]
      Link: https://lkml.kernel.org/r/20241015125158.3597702-1-matt@readmodwrite.com
    Link: https://lkml.kernel.org/r/20241011120737.3300370-1-matt@readmodwrite.com
    Link: https://lore.kernel.org/all/CAGis_TWzSu=P7QJmjD58WWiu3zjMTVKSzdOwWE8ORaGytzWJwQ@mail.gmail.com/
    Fixes: 1d91df85f399 ("mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore} APIs")
    Signed-off-by: Matt Fleming <mfleming@cloudflare.com>
    Suggested-by: Vlastimil Babka <vbabka@suse.cz>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/page_alloc: rename ALLOC_HIGH to ALLOC_MIN_RESERVE [+ + +]
Author: Mel Gorman <mgorman@techsingularity.net>
Date:   Fri Jan 13 11:12:12 2023 +0000

    mm/page_alloc: rename ALLOC_HIGH to ALLOC_MIN_RESERVE
    
    [ Upstream commit 524c48072e5673f4511f1ad81493e2485863fd65 ]
    
    Patch series "Discard __GFP_ATOMIC", v3.
    
    Neil's patch has been residing in mm-unstable as commit 2fafb4fe8f7a ("mm:
    discard __GFP_ATOMIC") for a long time and recently brought up again.
    Most recently, I was worried that __GFP_HIGH allocations could use
    high-order atomic reserves which is unintentional but there was no
    response so let's revisit -- this series reworks how min reserves are used,
    protects highorder reserves and then finishes with Neil's patch with very
    minor modifications so it fits on top.
    
    There was a review discussion on renaming __GFP_DIRECT_RECLAIM to
    __GFP_ALLOW_BLOCKING but I didn't think it was that big an issue and is
    orthogonal to the removal of __GFP_ATOMIC.
    
    There were some concerns about how the gfp flags affect the min reserves
    but it never reached a solid conclusion so I made my own attempt.
    
    The series tries to iron out some of the details on how reserves are used.
    ALLOC_HIGH becomes ALLOC_MIN_RESERVE and ALLOC_HARDER becomes
    ALLOC_NON_BLOCK and documents how the reserves are affected.  For example,
    ALLOC_NON_BLOCK (no direct reclaim) on its own allows 25% of the min
    reserve.  ALLOC_MIN_RESERVE (__GFP_HIGH) allows 50% and both combined
    allows deeper access again.  ALLOC_OOM allows access to 75%.
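
    As a worked example of the percentages above, the sketch below
    computes how many pages of the min reserve a request may consume.
    The TOY_* flag values and the function are hypothetical. The text
    does not quantify the combined case, so the 62.5% used for it here
    is purely an assumption.

```c
#include <assert.h>

/* Hypothetical flag bits for this sketch; the kernel's values differ. */
#define TOY_ALLOC_NON_BLOCK   0x1u
#define TOY_ALLOC_MIN_RESERVE 0x2u
#define TOY_ALLOC_OOM         0x4u

/* Pages of the min reserve that a request with `flags` may consume. */
static unsigned long toy_reserve_access(unsigned long min_pages,
                                        unsigned int flags)
{
    unsigned long allowed = 0;

    if (flags & TOY_ALLOC_MIN_RESERVE) {
        allowed = min_pages / 2;                   /* 50% */
        if (flags & TOY_ALLOC_NON_BLOCK)
            allowed += (min_pages - allowed) / 4;  /* assumption: 62.5% */
    } else if (flags & TOY_ALLOC_NON_BLOCK) {
        allowed = min_pages / 4;                   /* 25% */
    }

    if (flags & TOY_ALLOC_OOM) {
        unsigned long oom = min_pages - min_pages / 4;  /* 75% */
        if (oom > allowed)
            allowed = oom;
    }
    return allowed;
}
```

    For a min reserve of 1000 pages this gives 250, 500 and 750 pages
    for the three quantified cases.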
    
    High-order atomic allocations are explicitly handled with the caveat that
    no __GFP_ATOMIC flag means that any high-order allocation that specifies
    __GFP_HIGH and cannot enter direct reclaim will be treated as if it was
    GFP_ATOMIC.
    
    This patch (of 6):
    
    __GFP_HIGH aliases to ALLOC_HIGH but the name does not really hint what it
    means.  As ALLOC_HIGH is internal to the allocator, rename it to
    ALLOC_MIN_RESERVE to document that the min reserves can be depleted.
    
    Link: https://lkml.kernel.org/r/20230113111217.14134-1-mgorman@techsingularity.net
    Link: https://lkml.kernel.org/r/20230113111217.14134-2-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: NeilBrown <neilb@suse.de>
    Cc: Thierry Reding <thierry.reding@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 281dd25c1a01 ("mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/page_alloc: treat RT tasks similar to __GFP_HIGH [+ + +]
Author: Mel Gorman <mgorman@techsingularity.net>
Date:   Fri Jan 13 11:12:13 2023 +0000

    mm/page_alloc: treat RT tasks similar to __GFP_HIGH
    
    [ Upstream commit c988dcbecf3fd5430921eaa3fe9054754f76d185 ]
    
    RT tasks are allowed to dip below the min reserve but ALLOC_HARDER is
    typically combined with ALLOC_MIN_RESERVE so RT tasks are a little
    unusual.  While there is some justification for allowing RT tasks access
    to memory reserves, there is a strong chance that a RT task that is also
    under memory pressure is at risk of missing deadlines anyway.  Relax how
    much reserves an RT task can access by treating it the same as __GFP_HIGH
    allocations.
    
    Note that in a future kernel release the RT special casing will be
    removed.  Hard realtime tasks should be locking down resources in advance
    and ensuring enough memory is available.  Even a soft-realtime task like
    audio or video live decoding which cannot jitter should be allocating both
    memory and any disk space required up-front before the recording starts
    instead of relying on reserves.  At best, reserve access will only delay
    the problem by a very short interval.
    
    Link: https://lkml.kernel.org/r/20230113111217.14134-3-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: NeilBrown <neilb@suse.de>
    Cc: Thierry Reding <thierry.reding@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 281dd25c1a01 ("mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mm: avoid gcc complaint about pointer casting [+ + +]
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sat Mar 4 14:03:27 2023 -0800

    mm: avoid gcc complaint about pointer casting
    
    commit e77d587a2c04e82c6a0dffa4a32c874a4029385d upstream.
    
    The migration code ends up temporarily stashing information of the wrong
    type in unused fields of the newly allocated destination folio.  That
    all works fine, but gcc does complain about the pointer type mis-use:
    
        mm/migrate.c: In function ‘__migrate_folio_extract’:
        mm/migrate.c:1050:20: note: randstruct: casting between randomized structure pointer types (ssa): ‘struct anon_vma’ and ‘struct address_space’
    
         1050 |         *anon_vmap = (void *)dst->mapping;
              |         ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
    
    and gcc is actually right to complain since it really doesn't understand
    that this is a very temporary special case where this is ok.
    
    This could be fixed in different ways by just obfuscating the assignment
    sufficiently that gcc doesn't see what is going on, but the truly
    "proper C" way to do this is by explicitly using a union.
    
    Using unions for type conversions like this is normally hugely ugly and
    syntactically nasty, but this really is one of the few cases where we
    want to make it clear that we're not doing type conversion, we're really
    re-using the value bit-for-bit just using another type.
    
    IOW, this should not become a common pattern, but in this one case using
    that odd union is probably the best way to document to the compiler what
    is conceptually going on here.
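
    The union approach can be sketched outside the kernel as follows.
    All toy_* types and functions are hypothetical stand-ins; the sketch
    only shows the general technique of reusing a pointer value
    bit-for-bit through a union instead of casting between unrelated
    structure pointer types.

```c
#include <assert.h>
#include <stddef.h>

struct toy_mapping { int id; };  /* stands in for the field's real type */
struct toy_anon    { int id; };  /* stands in for the unrelated type */

/* Both members share the same storage, so writing one and reading the
 * other reuses the pointer bits without a pointer cast. */
union toy_stash {
    struct toy_mapping *mapping;
    struct toy_anon *anon;
};

struct toy_dst {
    struct toy_mapping *mapping;  /* temporarily unused field */
};

/* Stash an unrelated pointer in the unused field. */
static void toy_stash_anon(struct toy_dst *dst, struct toy_anon *anon)
{
    union toy_stash u = { .anon = anon };

    dst->mapping = u.mapping;
}

/* Recover the stashed pointer and clear the field again. */
static struct toy_anon *toy_extract_anon(struct toy_dst *dst)
{
    union toy_stash u = { .mapping = dst->mapping };

    dst->mapping = NULL;
    return u.anon;
}
```

    The union makes the reuse explicit at both the store and the load,
    so neither site needs a cast that hides the type mismatch.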
    
    [ Side note: there are valid cases where we convert pointers to other
      pointer types, notably the whole "folio vs page" situation, where the
      types actually have fundamental commonalities.
    
      The fact that the gcc note is limited to just randomized structures
      means that we don't see equivalent warnings for those cases, but it
      might also mean that we miss other cases where we do play these kinds
      of dodgy games, and this kind of explicit conversion might be a good
      idea. ]
    
    I verified that at least for an allmodconfig build on x86-64, this
    generates the exact same code, apart from line numbers and assembler
    comment changes.
    
    Fixes: 64c8902ed441 ("migrate_pages: split unmap_and_move() to _unmap() and _move()")
    Cc: Huang, Ying <ying.huang@intel.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: migrate: try again if THP split is failed due to page refcnt [+ + +]
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Mon Oct 24 16:34:22 2022 +0800

    mm: migrate: try again if THP split is failed due to page refcnt
    
    [ Upstream commit fd4a7ac32918d3d7a2d17dc06c5520f45e36eb52 ]
    
    When creating a virtual machine, we use memfd_create() to get a file
    descriptor which can be used to create shared memory mappings with
    mmap(); the mmap() call sets the MAP_POPULATE flag to allocate physical
    pages for the virtual machine.
    
    When allocating physical pages for the guest, the host can fall back to
    allocating some CMA pages for the guest when over half of the zone's free
    memory is in the CMA area.
    
    In the guest OS, when the application wants to transfer data with DMA,
    our QEMU calls the VFIO_IOMMU_MAP_DMA ioctl to longterm-pin the DMA pages
    and create IOMMU mappings for them.  However, we found that the
    VFIO_IOMMU_MAP_DMA ioctl sometimes fails to longterm-pin the physical
    pages.
    
    After some investigation, we found that the pages used for DMA mapping
    can contain CMA pages, and these CMA pages can cause the longterm-pin to
    fail because they cannot be migrated.  The migration failure may be
    caused by a temporary reference count or a memory allocation failure.
    As a result, the VFIO_IOMMU_MAP_DMA ioctl returns an error, which
    prevents the application from starting.
    
    I observed one migration failure case (which is not easy to reproduce)
    in which the 'thp_migration_fail' count is 1 and the
    'thp_split_page_failed' count is also 1.
    
    That means a THP in the CMA area was being migrated, a new THP could not
    be allocated due to memory fragmentation, and so the THP had to be split.
    However, the THP split also failed, probably because of a temporary
    reference count on this THP.  The temporary reference count can be
    caused by dropping page caches (I observed the drop caches operation in
    the system), but the shmem page caches cannot be dropped because they are
    already dirty at that time.
    
    Especially for THP split failure, which is caused by temporary reference
    count, we can try again to mitigate the failure of migration in this case
    according to previous discussion [1].
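
    The idea of retrying after a transient failure can be sketched as a
    bounded loop. Everything here is hypothetical: the callback type,
    the retry bound and the use of -EAGAIN to signal a temporary
    reference are assumptions, and how the kernel schedules its retries
    is not shown.

```c
#include <assert.h>
#include <errno.h>

#define TOY_MAX_RETRIES 10  /* hypothetical bound */

/* Hypothetical split callback: 0 on success, -EAGAIN when a temporary
 * reference blocks the split, another negative errno on hard failure. */
typedef int (*toy_split_fn)(void *ctx);

/* Retry only while the failure is transient and the bound is not hit.
 * The number of calls made is stored in *attempts. */
static int toy_split_with_retry(toy_split_fn split, void *ctx, int *attempts)
{
    int rc = -EAGAIN;

    *attempts = 0;
    while (*attempts < TOY_MAX_RETRIES && rc == -EAGAIN) {
        (*attempts)++;
        rc = split(ctx);
    }
    return rc;
}

/* Example callback: fails transiently until the counter hits zero. */
static int toy_split_after_n(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return -EAGAIN;
    }
    return 0;
}
```

    A hard failure ends the loop at once, while a transient one is
    retried until it clears or the bound is reached.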
    
    [1] https://lore.kernel.org/all/470dc638-a300-f261-94b4-e27250e42f96@redhat.com/
    Link: https://lkml.kernel.org/r/6784730480a1df82e8f4cba1ed088e4ac767994b.1666599848.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
    Cc: Alistair Popple <apopple@nvidia.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Yang Shi <shy828301@gmail.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 35e41024c4c2 ("vmscan,migrate: fix page count imbalance on node stats when demoting pages")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm: remove kern_addr_valid() completely [+ + +]
Author: Kefeng Wang <wangkefeng.wang@huawei.com>
Date:   Tue Oct 18 15:40:14 2022 +0800

    mm: remove kern_addr_valid() completely
    
    [ Upstream commit e025ab842ec35225b1a8e163d1f311beb9e38ce9 ]
    
    Most architectures (except arm64/x86/sparc) simply return 1 for
    kern_addr_valid(), which is only used in read_kcore(), and it calls
    copy_from_kernel_nofault() which could check whether the address is a
    valid kernel address.  So as there is no need for kern_addr_valid(), let's
    remove it.
    
    Link: https://lkml.kernel.org/r/20221018074014.185687-1-wangkefeng.wang@huawei.com
    Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
    Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>     [m68k]
    Acked-by: Heiko Carstens <hca@linux.ibm.com>            [s390]
    Acked-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Helge Deller <deller@gmx.de>                  [parisc]
    Acked-by: Michael Ellerman <mpe@ellerman.id.au>         [powerpc]
    Acked-by: Guo Ren <guoren@kernel.org>                   [csky]
    Acked-by: Catalin Marinas <catalin.marinas@arm.com>     [arm64]
    Cc: Alexander Gordeev <agordeev@linux.ibm.com>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
    Cc: <aou@eecs.berkeley.edu>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
    Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
    Cc: Chris Zankel <chris@zankel.net>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: David S. Miller <davem@davemloft.net>
    Cc: Dinh Nguyen <dinguyen@kernel.org>
    Cc: Greg Ungerer <gerg@linux-m68k.org>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Huacai Chen <chenhuacai@kernel.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
    Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
    Cc: Johannes Berg <johannes@sipsolutions.net>
    Cc: Jonas Bonn <jonas@southpole.se>
    Cc: Matt Turner <mattst88@gmail.com>
    Cc: Max Filippov <jcmvbkbc@gmail.com>
    Cc: Michal Simek <monstr@monstr.eu>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Palmer Dabbelt <palmer@rivosinc.com>
    Cc: Paul Walmsley <paul.walmsley@sifive.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Richard Henderson <richard.henderson@linaro.org>
    Cc: Richard Weinberger <richard@nod.at>
    Cc: Rich Felker <dalias@libc.org>
    Cc: Russell King <linux@armlinux.org.uk>
    Cc: Stafford Horne <shorne@gmail.com>
    Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
    Cc: Sven Schnelle <svens@linux.ibm.com>
    Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: Vineet Gupta <vgupta@kernel.org>
    Cc: Will Deacon <will@kernel.org>
    Cc: Xuerui Wang <kernel@xen0n.name>
    Cc: Yoshinori Sato <ysato@users.osdn.me>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 3d5854d75e31 ("fs/proc/kcore.c: allow translation of physical memory addresses")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm: shmem: fix data-race in shmem_getattr() [+ + +]
Author: Jeongjun Park <aha310510@gmail.com>
Date:   Mon Sep 9 21:35:58 2024 +0900

    mm: shmem: fix data-race in shmem_getattr()
    
    commit d949d1d14fa281ace388b1de978e8f2cd52875cf upstream.
    
    I got the following KCSAN report during syzbot testing:
    
    ==================================================================
    BUG: KCSAN: data-race in generic_fillattr / inode_set_ctime_current
    
    write to 0xffff888102eb3260 of 4 bytes by task 6565 on cpu 1:
     inode_set_ctime_to_ts include/linux/fs.h:1638 [inline]
     inode_set_ctime_current+0x169/0x1d0 fs/inode.c:2626
     shmem_mknod+0x117/0x180 mm/shmem.c:3443
     shmem_create+0x34/0x40 mm/shmem.c:3497
     lookup_open fs/namei.c:3578 [inline]
     open_last_lookups fs/namei.c:3647 [inline]
     path_openat+0xdbc/0x1f00 fs/namei.c:3883
     do_filp_open+0xf7/0x200 fs/namei.c:3913
     do_sys_openat2+0xab/0x120 fs/open.c:1416
     do_sys_open fs/open.c:1431 [inline]
     __do_sys_openat fs/open.c:1447 [inline]
     __se_sys_openat fs/open.c:1442 [inline]
     __x64_sys_openat+0xf3/0x120 fs/open.c:1442
     x64_sys_call+0x1025/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:258
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
    read to 0xffff888102eb3260 of 4 bytes by task 3498 on cpu 0:
     inode_get_ctime_nsec include/linux/fs.h:1623 [inline]
     inode_get_ctime include/linux/fs.h:1629 [inline]
     generic_fillattr+0x1dd/0x2f0 fs/stat.c:62
     shmem_getattr+0x17b/0x200 mm/shmem.c:1157
     vfs_getattr_nosec fs/stat.c:166 [inline]
     vfs_getattr+0x19b/0x1e0 fs/stat.c:207
     vfs_statx_path fs/stat.c:251 [inline]
     vfs_statx+0x134/0x2f0 fs/stat.c:315
     vfs_fstatat+0xec/0x110 fs/stat.c:341
     __do_sys_newfstatat fs/stat.c:505 [inline]
     __se_sys_newfstatat+0x58/0x260 fs/stat.c:499
     __x64_sys_newfstatat+0x55/0x70 fs/stat.c:499
     x64_sys_call+0x141f/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:263
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
    value changed: 0x2755ae53 -> 0x27ee44d3
    
    Reported by Kernel Concurrency Sanitizer on:
    CPU: 0 UID: 0 PID: 3498 Comm: udevd Not tainted 6.11.0-rc6-syzkaller-00326-gd1f2d51b711a-dirty #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
    ==================================================================
    
    If generic_fillattr() is called without holding the inode lock for
    reading, a data race can occur on the inode's member variables, which can
    cause unexpected behavior.
    
    Since there is no special protection when shmem_getattr() calls
    generic_fillattr(), a data race occurs with functions such as
    shmem_unlink() or shmem_mknod(). This can cause unexpected results, so
    commenting it out is not enough.
    
    Therefore, when calling generic_fillattr() from shmem_getattr(), it is
    appropriate to protect the inode using inode_lock_shared() and
    inode_unlock_shared() to prevent data-race.
    
    Link: https://lkml.kernel.org/r/20240909123558.70229-1-aha310510@gmail.com
    Fixes: 44a30220bc0a ("shmem: recalculate file inode when fstat")
    Signed-off-by: Jeongjun Park <aha310510@gmail.com>
    Reported-by: syzbot <syzkaller@googlegroup.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Yu Zhao <yuzhao@google.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mtd: spi-nor: winbond: fix w25q128 regression [+ + +]
Author: Michael Walle <mwalle@kernel.org>
Date:   Fri Jun 21 14:09:29 2024 +0200

    mtd: spi-nor: winbond: fix w25q128 regression
    
    commit d35df77707bf5ae1221b5ba1c8a88cf4fcdd4901 upstream.
    
    Commit 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond w25q128")
    removed the flags for non-SFDP devices. It was assumed that it wasn't in
    use anymore. This wasn't true. Add the no_sfdp_flags as well as the size
    again.
    
    We add the additional flags for dual and quad read because they have
    been reported to work properly by Hartmut using both older and newer
    versions of this flash, the similar flashes with 64Mbit and 256Mbit
    already have these flags and because it will (luckily) trigger our
    legacy SFDP parsing, so newer versions with SFDP support will still get
    the parameters from the SFDP tables.
    
    Reported-by: Hartmut Birr <e9hack@gmail.com>
    Closes: https://lore.kernel.org/r/CALxbwRo_-9CaJmt7r7ELgu+vOcgk=xZcGHobnKf=oT2=u4d4aA@mail.gmail.com/
    Fixes: 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond w25q128")
    Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Michael Walle <mwalle@kernel.org>
    Acked-by: Tudor Ambarus <tudor.ambarus@linaro.org>
    Reviewed-by: Esben Haabendal <esben@geanix.com>
    Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
    Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
    Link: https://lore.kernel.org/r/20240621120929.2670185-1-mwalle@kernel.org
    [Backported to v6.6 - vastly different due to upstream changes]
    Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org>
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
net/sched: stop qdisc_tree_reduce_backlog on TC_H_ROOT [+ + +]
Author: Pedro Tammela <pctammela@mojatatu.com>
Date:   Thu Oct 24 12:55:47 2024 -0400

    net/sched: stop qdisc_tree_reduce_backlog on TC_H_ROOT
    
    [ Upstream commit 2e95c4384438adeaa772caa560244b1a2efef816 ]
    
    In qdisc_tree_reduce_backlog, Qdiscs with major handle ffff: are assumed
    to be either root or ingress. This assumption is bogus since a qdisc
    with major handle ffff: can also be a valid egress qdisc.
    
    Budimir Markovic found that for qdiscs like DRR that maintain an active
    class list, it will cause a UAF with a dangling class pointer.
    
    In 066a3b5b2346, the concern was to avoid iterating over the ingress
    qdisc since its parent is itself. The proper fix is to stop when parent
    TC_H_ROOT is reached because the only way to retrieve ingress is when a
    hierarchy which does not contain a ffff: major handle calls into
    qdisc_lookup with TC_H_MAJ(TC_H_ROOT).
    
    In the scenario where major ffff: is an egress qdisc in any of the tree
    levels, the updates will also propagate to TC_H_ROOT, at which point the
    iteration must stop.
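
    The stop condition can be modelled with a small parent walk. The
    toy_* structure, lookup and table layout are hypothetical; only the
    numeric value used for the root parent and the major-handle mask
    follow the usual tc handle layout. The walk stops when the parent is
    the root, not when an ancestor happens to have major handle ffff:.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_H_ROOT 0xFFFFFFFFu  /* parent value marking the root */
#define TOY_H_MAJ(h) ((h) & 0xFFFF0000u)

/* Toy qdisc (hypothetical): only what the walk needs. */
struct toy_qdisc {
    uint32_t handle;    /* e.g. 0xffff0000 for "ffff:" */
    uint32_t parent;    /* class id of the parent, or TOY_H_ROOT */
    unsigned int qlen;  /* queued packets accounted to this qdisc */
};

static struct toy_qdisc *toy_lookup(struct toy_qdisc *tbl, size_t n,
                                    uint32_t handle)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].handle == handle)
            return &tbl[i];
    return NULL;
}

/* Walk from `sch` towards the root and subtract `dropped` from every
 * ancestor's qlen. Returns the number of ancestors updated. */
static int toy_reduce_backlog(struct toy_qdisc *tbl, size_t n,
                              struct toy_qdisc *sch, unsigned int dropped)
{
    int updated = 0;

    while (sch->parent != TOY_H_ROOT) {
        sch = toy_lookup(tbl, n, TOY_H_MAJ(sch->parent));
        if (!sch)
            break;
        sch->qlen -= dropped;
        updated++;
    }
    return updated;
}
```

    With an egress qdisc ffff: at the top of the tree, the child's
    update still reaches it in this model, and the walk ends because
    that qdisc's parent is the root.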
    
    Fixes: 066a3b5b2346 ("[NET_SCHED] sch_api: fix qdisc_tree_decrease_qlen() loop")
    Reported-by: Budimir Markovic <markovicbudimir@gmail.com>
    Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Tested-by: Victor Nogueira <victor@mojatatu.com>
    Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
    Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
    
    Reviewed-by: Simon Horman <horms@kernel.org>
    
    Link: https://patch.msgid.link/20241024165547.418570-1-jhs@mojatatu.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net: amd: mvme147: Fix probe banner message [+ + +]
Author: Daniel Palmer <daniel@0x0f.com>
Date:   Mon Oct 7 19:43:17 2024 +0900

    net: amd: mvme147: Fix probe banner message
    
    [ Upstream commit 82c5b53140faf89c31ea2b3a0985a2f291694169 ]
    
    Currently this driver prints this line with what looks like
    a rogue format specifier when the device is probed:
    [    2.840000] eth%d: MVME147 at 0xfffe1800, irq 12, Hardware Address xx:xx:xx:xx:xx:xx
    
    Replace the printk() with netdev_info() and move it after the
    registration has completed so it prints out the name of the
    interface properly.
    
    Signed-off-by: Daniel Palmer <daniel@0x0f.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension [+ + +]
Author: Benoît Monin <benoit.monin@gmx.fr>
Date:   Thu Oct 24 16:01:54 2024 +0200

    net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension
    
    [ Upstream commit 04c20a9356f283da623903e81e7c6d5df7e4dc3c ]
    
    As documented in skbuff.h, devices with NETIF_F_IPV6_CSUM capability
    can only checksum TCP and UDP over IPv6 if the IP header does not
    contain extension headers.
    
    This is enforced for UDP packets emitted from user-space to an IPv6
    address as they go through ip6_make_skb(), which calls
    __ip6_append_data() where a check is done on the header size before
    setting CHECKSUM_PARTIAL.
    
    But the introduction of UDP encapsulation with fou6 added a code-path
    where it is possible to get an skb with a partial UDP checksum and an
    IPv6 header with an extension header:
    * fou6 adds a UDP header with a partial checksum if the inner packet
    does not contain a valid checksum.
    * ip6_tunnel adds an IPv6 header with a destination option extension
    header if encap_limit is non-zero (the default value is 4).
    
    The thread linked below describes in more detail how to reproduce the
    problem with a GRE-in-UDP tunnel.
    
    Add a check on the network header size in skb_csum_hwoffload_help() to
    make sure no IPv6 packet with an extension header is handed to a
    network device with NETIF_F_IPV6_CSUM capability.
    
    Link: https://lore.kernel.org/netdev/26548921.1r3eYUQgxm@benoit.monin/T/#u
    Fixes: aa3463d65e7b ("fou: Add encap ops for IPv6 tunnels")
    Signed-off-by: Benoît Monin <benoit.monin@gmx.fr>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Link: https://patch.msgid.link/5fbeecfc311ea182aa1d1c771725ab8b4cac515e.1729778144.git.benoit.monin@gmx.fr
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
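
    The header-size check described above can be sketched in user space
    as follows. This is only an illustration: demo_can_offload_ipv6_csum()
    and DEMO_IPV6_FIXED_HDR_LEN are hypothetical names, not kernel
    symbols. It relies on the fixed IPv6 header being 40 bytes, so any
    longer network header implies at least one extension header.

```c
#include <stdbool.h>
#include <stddef.h>

/* The fixed IPv6 header is 40 bytes; extension headers make it longer. */
#define DEMO_IPV6_FIXED_HDR_LEN 40

/*
 * Hypothetical helper: hardware checksum offload is only considered safe
 * when the network header is exactly the fixed IPv6 header.
 */
bool demo_can_offload_ipv6_csum(size_t network_header_len)
{
    return network_header_len == DEMO_IPV6_FIXED_HDR_LEN;
}
```

    A caller would fall back to a software checksum whenever the helper
    returns false, for example for a 48-byte network header that carries
    one 8-byte extension header.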

net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data [+ + +]
Author: Furong Xu <0x1207@gmail.com>
Date:   Mon Oct 21 14:10:23 2024 +0800

    net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data
    
    [ Upstream commit 66600fac7a984dea4ae095411f644770b2561ede ]
    
    A SKB requires at least two DMA transmit descriptors when either of
    the following holds:
    * its non-paged data carries both protocol header and protocol
    payload on a platform whose DMA AXI address width is configured to
    40-bit/48-bit, or
    * its non-paged data is bigger than TSO_MAX_BUFF_SIZE on a platform
    whose DMA AXI address width is configured to 32-bit.
    
    For example, three descriptors are allocated to split one DMA buffer
    mapped from one piece of non-paged data:
        dma_desc[N + 0],
        dma_desc[N + 1],
        dma_desc[N + 2].
    Then three elements of tx_q->tx_skbuff_dma[] will be allocated to hold
    extra information to be reused in stmmac_tx_clean():
        tx_q->tx_skbuff_dma[N + 0],
        tx_q->tx_skbuff_dma[N + 1],
        tx_q->tx_skbuff_dma[N + 2].
    Now we focus on tx_q->tx_skbuff_dma[entry].buf, which is the DMA buffer
    address returned by DMA mapping call. stmmac_tx_clean() will try to
    unmap the DMA buffer _ONLY_IF_ tx_q->tx_skbuff_dma[entry].buf
    is a valid buffer address.
    
    The expected behavior that saves DMA buffer address of this non-paged
    data to tx_q->tx_skbuff_dma[entry].buf is:
        tx_q->tx_skbuff_dma[N + 0].buf = NULL;
        tx_q->tx_skbuff_dma[N + 1].buf = NULL;
        tx_q->tx_skbuff_dma[N + 2].buf = dma_map_single();
    Unfortunately, the current code misbehaves like this:
        tx_q->tx_skbuff_dma[N + 0].buf = dma_map_single();
        tx_q->tx_skbuff_dma[N + 1].buf = NULL;
        tx_q->tx_skbuff_dma[N + 2].buf = NULL;
    
    On the stmmac_tx_clean() side, when dma_desc[N + 0] is closed by the
    DMA engine, tx_q->tx_skbuff_dma[N + 0].buf is a valid buffer address,
    so the DMA buffer is unmapped immediately.
    In rare cases the DMA engine has not yet finished the pending
    dma_desc[N + 1] and dma_desc[N + 2]. Then things go horribly wrong:
    DMA accesses an unmapped/unreferenced memory region, and either
    corrupted data is transmitted or an IOMMU fault is triggered :(
    
    In contrast, the for-loop that maps SKB fragments behaves exactly
    as expected, and that is how the driver should behave for both
    non-paged data and paged frags.
    
    This patch corrects DMA map/unmap sequences by fixing the array index
    for tx_q->tx_skbuff_dma[entry].buf when assigning DMA buffer address.
    
    Tested and verified on DWXGMAC CORE 3.20a
    
    Reported-by: Suraj Jaiswal <quic_jsuraj@quicinc.com>
    Fixes: f748be531d70 ("stmmac: support new GMAC4")
    Signed-off-by: Furong Xu <0x1207@gmail.com>
    Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20241021061023.2162701-1-0x1207@gmail.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
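
    The indexing rule described above can be modelled in plain C. This is
    a simplified sketch, not driver code: the demo_* names and the ring
    size are hypothetical, and "unmapping" is reduced to clearing a field.
    The point it shows is that storing the buffer address at the last
    descriptor of a split buffer delays the unmap until all descriptors
    have completed.

```c
#include <stddef.h>

#define DEMO_RING_SIZE 8

struct demo_tx_info {
    unsigned long buf;   /* 0 stands for "nothing to unmap" */
};

/*
 * Record one DMA mapping that is split across ndesc descriptors
 * (ndesc >= 1) starting at ring index first. Only the LAST entry keeps
 * the address.
 */
void demo_record_mapping(struct demo_tx_info *ring, size_t first,
                         size_t ndesc, unsigned long dma_addr)
{
    for (size_t i = 0; i < ndesc; i++)
        ring[(first + i) % DEMO_RING_SIZE].buf = 0;
    ring[(first + ndesc - 1) % DEMO_RING_SIZE].buf = dma_addr;
}

/*
 * Walk the descriptors the hardware has completed so far, in order.
 * Returns the ring index at which the buffer was unmapped, or -1 if no
 * completed entry carried a buffer address.
 */
long demo_clean(struct demo_tx_info *ring, size_t first, size_t completed)
{
    for (size_t i = 0; i < completed; i++) {
        size_t entry = (first + i) % DEMO_RING_SIZE;

        if (ring[entry].buf) {
            ring[entry].buf = 0;   /* stands in for the DMA unmap */
            return (long)entry;
        }
    }
    return -1;
}
```

    With a buffer split over entries 2, 3 and 4, demo_clean() does
    nothing while only two descriptors have completed and releases the
    buffer at entry 4 once all three are done.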

 
netdevsim: Add trailing zero to terminate the string in nsim_nexthop_bucket_activity_write() [+ + +]
Author: Zichen Xie <zichenxie0106@gmail.com>
Date:   Tue Oct 22 12:19:08 2024 -0500

    netdevsim: Add trailing zero to terminate the string in nsim_nexthop_bucket_activity_write()
    
    [ Upstream commit 4ce1f56a1eaced2523329bef800d004e30f2f76c ]
    
    This was found by a static analyzer.
    We should not forget the trailing zero after copy_from_user()
    if we will further do some string operations, sscanf() in this
    case. Adding a trailing zero will ensure that the function
    performs properly.
    
    Fixes: c6385c0b67c5 ("netdevsim: Allow reporting activity on nexthop buckets")
    Signed-off-by: Zichen Xie <zichenxie0106@gmail.com>
    Reviewed-by: Petr Machata <petrm@nvidia.com>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Link: https://patch.msgid.link/20241022171907.8606-1-zichenxie0106@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
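
    The pattern the fix describes can be shown with a user-space
    analogue. In this sketch memcpy() stands in for copy_from_user(), and
    demo_parse_pair(), its buffer size and its "%u %u" input format are
    assumptions made for illustration, not the netdevsim code.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a write handler: data/size describe raw
 * bytes that are NOT guaranteed to be NUL-terminated.
 * Returns 0 on success, -1 on bad input.
 */
int demo_parse_pair(const char *data, size_t size,
                    unsigned int *first, unsigned int *second)
{
    char buf[128];

    if (size >= sizeof(buf))      /* keep one byte for the terminator */
        return -1;

    memcpy(buf, data, size);      /* raw copy: adds no trailing zero */
    buf[size] = '\0';             /* terminate before string parsing */

    if (sscanf(buf, "%u %u", first, second) != 2)
        return -1;
    return 0;
}
```

    Without the explicit terminator, sscanf() would keep reading whatever
    bytes follow the copied data in buf.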

 
netfilter: Fix use-after-free in get_info() [+ + +]
Author: Dong Chenchen <dongchenchen2@huawei.com>
Date:   Thu Oct 24 09:47:01 2024 +0800

    netfilter: Fix use-after-free in get_info()
    
    [ Upstream commit f48d258f0ac540f00fa617dac496c4c18b5dc2fa ]
    
    Unloading the ip6table_nat module triggers a refcnt warning caused by
    a use-after-free (UAF). The call trace is:
    
    WARNING: CPU: 1 PID: 379 at kernel/module/main.c:853 module_put+0x6f/0x80
    Modules linked in: ip6table_nat(-)
    CPU: 1 UID: 0 PID: 379 Comm: ip6tables Not tainted 6.12.0-rc4-00047-gc2ee9f594da8-dirty #205
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
    BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
    RIP: 0010:module_put+0x6f/0x80
    Call Trace:
     <TASK>
     get_info+0x128/0x180
     do_ip6t_get_ctl+0x6a/0x430
     nf_getsockopt+0x46/0x80
     ipv6_getsockopt+0xb9/0x100
     rawv6_getsockopt+0x42/0x190
     do_sock_getsockopt+0xaa/0x180
     __sys_getsockopt+0x70/0xc0
     __x64_sys_getsockopt+0x20/0x30
     do_syscall_64+0xa2/0x1a0
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    Concurrent execution of module unload and get_info() triggered the warning.
    The root cause is as follows:
    
    cpu0                                  cpu1
    module_exit
    //mod->state = MODULE_STATE_GOING
      ip6table_nat_exit
        xt_unregister_template
            kfree(t)
            //removed from templ_list
                                          getinfo()
                                              t = xt_find_table_lock
                                                    list_for_each_entry(tmpl, &xt_templates[af]...)
                                                            if (strcmp(tmpl->name, name))
                                                                    continue;  //table not found
                                                            try_module_get
                                                    list_for_each_entry(t, &xt_net->tables[af]...)
                                                            return t;  //not get refcnt
                                              module_put(t->me) //uaf
        unregister_pernet_subsys
        //remove table from xt_net list
    
    While the xt_table module is going away and has already been removed
    from the xt_templates list, we cannot get a refcnt on xt_table->me.
    Check the module during the re-traversal of the xt_net->tables list
    to fix it.
    
    Fixes: fdacd57c79b7 ("netfilter: x_tables: never register tables by default")
    Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
    Reviewed-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_reject_ipv6: fix potential crash in nf_send_reset6() [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Oct 25 08:02:29 2024 +0000

    netfilter: nf_reject_ipv6: fix potential crash in nf_send_reset6()
    
    [ Upstream commit 4ed234fe793f27a3b151c43d2106df2ff0d81aac ]
    
    I got a syzbot report without a repro [1] crashing in nf_send_reset6()
    
    I think the issue is that dev->hard_header_len is zero, and we attempt
    later to push an Ethernet header.
    
    Use LL_MAX_HEADER, as other functions in
    net/ipv6/netfilter/nf_reject_ipv6.c do.
    
    [1]
    
    skbuff: skb_under_panic: text:ffffffff89b1d008 len:74 put:14 head:ffff88803123aa00 data:ffff88803123a9f2 tail:0x3c end:0x140 dev:syz_tun
     kernel BUG at net/core/skbuff.c:206 !
    Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
    CPU: 0 UID: 0 PID: 7373 Comm: syz.1.568 Not tainted 6.12.0-rc2-syzkaller-00631-g6d858708d465 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
     RIP: 0010:skb_panic net/core/skbuff.c:206 [inline]
     RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216
    Code: 0d 8d 48 c7 c6 60 a6 29 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 ba 30 38 02 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3
    RSP: 0018:ffffc900045269b0 EFLAGS: 00010282
    RAX: 0000000000000088 RBX: dffffc0000000000 RCX: cd66dacdc5d8e800
    RDX: 0000000000000000 RSI: 0000000000000200 RDI: 0000000000000000
    RBP: ffff88802d39a3d0 R08: ffffffff8174afec R09: 1ffff920008a4ccc
    R10: dffffc0000000000 R11: fffff520008a4ccd R12: 0000000000000140
    R13: ffff88803123aa00 R14: ffff88803123a9f2 R15: 000000000000003c
    FS:  00007fdbee5ff6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 000000005d322000 CR4: 00000000003526f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
      skb_push+0xe5/0x100 net/core/skbuff.c:2636
      eth_header+0x38/0x1f0 net/ethernet/eth.c:83
      dev_hard_header include/linux/netdevice.h:3208 [inline]
      nf_send_reset6+0xce6/0x1270 net/ipv6/netfilter/nf_reject_ipv6.c:358
      nft_reject_inet_eval+0x3b9/0x690 net/netfilter/nft_reject_inet.c:48
      expr_call_ops_eval net/netfilter/nf_tables_core.c:240 [inline]
      nft_do_chain+0x4ad/0x1da0 net/netfilter/nf_tables_core.c:288
      nft_do_chain_inet+0x418/0x6b0 net/netfilter/nft_chain_filter.c:161
      nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
      nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626
      nf_hook include/linux/netfilter.h:269 [inline]
      NF_HOOK include/linux/netfilter.h:312 [inline]
      br_nf_pre_routing_ipv6+0x63e/0x770 net/bridge/br_netfilter_ipv6.c:184
      nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
      nf_hook_bridge_pre net/bridge/br_input.c:277 [inline]
      br_handle_frame+0x9fd/0x1530 net/bridge/br_input.c:424
      __netif_receive_skb_core+0x13e8/0x4570 net/core/dev.c:5562
      __netif_receive_skb_one_core net/core/dev.c:5666 [inline]
      __netif_receive_skb+0x12f/0x650 net/core/dev.c:5781
      netif_receive_skb_internal net/core/dev.c:5867 [inline]
      netif_receive_skb+0x1e8/0x890 net/core/dev.c:5926
      tun_rx_batched+0x1b7/0x8f0 drivers/net/tun.c:1550
      tun_get_user+0x3056/0x47e0 drivers/net/tun.c:2007
      tun_chr_write_iter+0x10d/0x1f0 drivers/net/tun.c:2053
      new_sync_write fs/read_write.c:590 [inline]
      vfs_write+0xa6d/0xc90 fs/read_write.c:683
      ksys_write+0x183/0x2b0 fs/read_write.c:736
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0033:0x7fdbeeb7d1ff
    Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 c9 8d 02 00 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 1c 8e 02 00 48
    RSP: 002b:00007fdbee5ff000 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
    RAX: ffffffffffffffda RBX: 00007fdbeed36058 RCX: 00007fdbeeb7d1ff
    RDX: 000000000000008e RSI: 0000000020000040 RDI: 00000000000000c8
    RBP: 00007fdbeebf12be R08: 0000000000000000 R09: 0000000000000000
    R10: 000000000000008e R11: 0000000000000293 R12: 0000000000000000
    R13: 0000000000000000 R14: 00007fdbeed36058 R15: 00007ffc38de06e8
     </TASK>
    
    Fixes: c8d7b98bec43 ("netfilter: move nf_send_resetX() code to nf_reject_ipvX modules")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
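
    The headroom problem suspected above can be modelled with a counter.
    This is a deliberately tiny sketch: struct demo_skb and the demo_*
    functions are hypothetical, and DEMO_MAX_HEADER is an assumed
    stand-in for LL_MAX_HEADER, whose real value depends on the kernel
    configuration. The 14-byte push matches the "put:14" in the report.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ETH_HLEN   14    /* size of an Ethernet header */
#define DEMO_MAX_HEADER 128   /* assumed stand-in for LL_MAX_HEADER */

struct demo_skb {
    size_t headroom;          /* bytes available in front of the data */
};

/* Reserve space in front of the data, as done before building a packet. */
void demo_reserve(struct demo_skb *skb, size_t len)
{
    skb->headroom += len;
}

/*
 * Prepend len bytes. Returns false where the kernel would instead hit
 * skb_under_panic() because the data pointer would move below the head.
 */
bool demo_push(struct demo_skb *skb, size_t len)
{
    if (len > skb->headroom)
        return false;
    skb->headroom -= len;
    return true;
}
```

    Reserving a hard_header_len of zero leaves no room for the later
    Ethernet header push, while reserving the larger constant does.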

netfilter: nft_payload: sanitize offset and length before calling skb_checksum() [+ + +]
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Wed Oct 30 23:13:48 2024 +0100

    netfilter: nft_payload: sanitize offset and length before calling skb_checksum()
    
    [ Upstream commit d5953d680f7e96208c29ce4139a0e38de87a57fe ]
    
    If offset + length exceeds the skbuff length, then skb_checksum()
    triggers BUG_ON().
    
    skb_checksum() internally subtracts from the length parameter while
    iterating over the skbuff; the BUG_ON(len) at its end checks that the
    expected length to be included in the checksum calculation is fully
    consumed.
    
    Fixes: 7ec3f7b47b8d ("netfilter: nft_payload: add packet mangling support")
    Reported-by: Slavin Liu <slavin-ayu@qq.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
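
    A range check of the kind described above might look like the sketch
    below. demo_csum_range_ok() is a hypothetical helper, not the
    nft_payload code; it is written so that the comparison cannot
    overflow even for very large offset or length values.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns true only if the byte range
 * [offset, offset + len) lies entirely inside a packet of pkt_len bytes.
 * The subtraction form avoids overflow in offset + len.
 */
bool demo_csum_range_ok(size_t pkt_len, size_t offset, size_t len)
{
    if (offset > pkt_len)
        return false;
    return len <= pkt_len - offset;
}
```

    A caller would skip the checksum update, or fail the operation, when
    the helper returns false instead of passing the range on.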

 
NFS: remove revoked delegation from server's delegation list [+ + +]
Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Tue Oct 8 15:58:07 2024 -0700

    NFS: remove revoked delegation from server's delegation list
    
    [ Upstream commit 7ef60108069b7e3cc66432304e1dd197d5c0a9b5 ]
    
    After the delegation is returned to the NFS server, remove it
    from the server's delegations list to reduce the time it takes
    to scan this list.
    
    A network trace captured while running the script below shows that
    the time taken to service the CB_RECALL increases gradually due to
    the overhead of traversing the delegation list in
    nfs_delegation_find_inode_server().
    
    The NFS server in this test is a Solaris server which issues
    CB_RECALL when receiving the all-zero stateid in the SETATTR.
    
    mount=/mnt/data
    for i in $(seq 1 20)
    do
       echo $i
       mkdir $mount/testtarfile$i
       time  tar -C $mount/testtarfile$i -xf 5000_files.tar
    done
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
nilfs2: fix kernel bug due to missing clearing of checked flag [+ + +]
Author: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Date:   Fri Oct 18 04:33:10 2024 +0900

    nilfs2: fix kernel bug due to missing clearing of checked flag
    
    commit 41e192ad2779cae0102879612dfe46726e4396aa upstream.
    
    Syzbot reported that in directory operations after nilfs2 detects
    filesystem corruption and degrades to read-only,
    __block_write_begin_int(), which is called to prepare block writes, may
    fail the BUG_ON check for accesses exceeding the folio/page size,
    triggering a kernel bug.
    
    This was found to be because the "checked" flag of a page/folio was not
    cleared when it was discarded by nilfs2's own routine, which causes the
    sanity check of directory entries to be skipped when the directory
    page/folio is reloaded.  So, fix that.
    
    This became necessary when the use of nilfs2's own page discard
    routine was applied to more than just metadata files.
    
    Link: https://lkml.kernel.org/r/20241017193359.5051-1-konishi.ryusuke@gmail.com
    Fixes: 8c26c4e2694a ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption")
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
    Reported-by: syzbot+d6ca2daf692c7a82f959@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=d6ca2daf692c7a82f959
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nilfs2: fix potential deadlock with newly created symlinks [+ + +]
Author: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Date:   Sun Oct 20 13:51:28 2024 +0900

    nilfs2: fix potential deadlock with newly created symlinks
    
    commit b3a033e3ecd3471248d474ef263aadc0059e516a upstream.
    
    Syzbot reported that page_symlink(), called by nilfs_symlink(), triggers
    memory reclamation involving the filesystem layer, which can result in
    circular lock dependencies among the reader/writer semaphore
    nilfs->ns_segctor_sem, s_writers percpu_rwsem (intwrite) and the
    fs_reclaim pseudo lock.
    
    This is because after commit 21fc61c73c39 ("don't put symlink bodies in
    pagecache into highmem"), the gfp flags of the page cache for symbolic
    links are overwritten to GFP_KERNEL via inode_nohighmem().
    
    This is not a problem for symlinks read from the backing device, because
    the __GFP_FS flag is dropped after inode_nohighmem() is called.  However,
    when a new symlink is created with nilfs_symlink(), the gfp flags remain
    overwritten to GFP_KERNEL.  Then, memory allocation called from
    page_symlink() etc.  triggers memory reclamation including the FS layer,
    which may call nilfs_evict_inode() or nilfs_dirty_inode().  And these can
    cause a deadlock if they are called while nilfs->ns_segctor_sem is held:
    
    Fix this issue by dropping the __GFP_FS flag from the page cache GFP flags
    of newly created symlinks in the same way that nilfs_new_inode() and
    __nilfs_read_inode() do, as a workaround until we adopt nofs allocation
    scope consistently or improve the locking constraints.
    
    Link: https://lkml.kernel.org/r/20241020050003.4308-1-konishi.ryusuke@gmail.com
    Fixes: 21fc61c73c39 ("don't put symlink bodies in pagecache into highmem")
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
    Reported-by: syzbot+9ef37ac20608f4836256@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=9ef37ac20608f4836256
    Tested-by: syzbot+9ef37ac20608f4836256@syzkaller.appspotmail.com
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
nvmet-auth: assign dh_key to NULL after kfree_sensitive [+ + +]
Author: Vitaliy Shevtsov <v.shevtsov@maxima.ru>
Date:   Mon Sep 16 22:41:37 2024 +0500

    nvmet-auth: assign dh_key to NULL after kfree_sensitive
    
    [ Upstream commit d2f551b1f72b4c508ab9298419f6feadc3b5d791 ]
    
    ctrl->dh_key might be used across multiple calls to nvmet_setup_dhgroup()
    for the same controller. So it's better to set it to NULL after
    releasing it on the error path in order to avoid a double free later
    in nvmet_destroy_auth().
    
    Found by Linux Verification Center (linuxtesting.org) with Svace.
    
    Fixes: 7a277c37d352 ("nvmet-auth: Diffie-Hellman key exchange support")
    Cc: stable@vger.kernel.org
    Signed-off-by: Vitaliy Shevtsov <v.shevtsov@maxima.ru>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Signed-off-by: Keith Busch <kbusch@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
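
    The free-then-NULL idiom behind this fix can be shown in user space.
    In this sketch free() stands in for kfree_sensitive(), and struct
    demo_ctrl, demo_setup_key() and demo_destroy() are hypothetical names
    rather than the nvmet functions.

```c
#include <stdlib.h>

struct demo_ctrl {
    unsigned char *dh_key;    /* owned by the controller, may be NULL */
};

/*
 * Hypothetical setup routine that can be called several times for the
 * same controller. Pass fail != 0 to simulate an error after the
 * allocation. Returns 0 on success, -1 on error.
 */
int demo_setup_key(struct demo_ctrl *ctrl, size_t len, int fail)
{
    free(ctrl->dh_key);       /* drop any key from a previous call */
    ctrl->dh_key = malloc(len);
    if (!ctrl->dh_key)
        return -1;

    if (fail) {
        free(ctrl->dh_key);
        ctrl->dh_key = NULL;  /* without this, teardown frees it again */
        return -1;
    }
    return 0;
}

/* Teardown: free(NULL) is a no-op, so a cleared pointer is harmless. */
void demo_destroy(struct demo_ctrl *ctrl)
{
    free(ctrl->dh_key);
    ctrl->dh_key = NULL;
}
```

    Because the error path leaves the pointer cleared, calling
    demo_destroy() after a failed demo_setup_key() is safe.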

 
ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow [+ + +]
Author: Edward Adam Davis <eadavis@qq.com>
Date:   Wed Oct 16 19:43:47 2024 +0800

    ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow
    
    [ Upstream commit bc0a2f3a73fcdac651fca64df39306d1e5ebe3b0 ]
    
    Syzbot reported a kernel BUG in ocfs2_truncate_inline.  There are two
    reasons for this: first, the parameter value passed is greater than
    ocfs2_max_inline_data_with_xattr; second, the start and end parameters
    of ocfs2_truncate_inline are "unsigned int".
    
    So, add a sanity check for byte_start and byte_len right before
    ocfs2_truncate_inline() in ocfs2_remove_inode_range(): if they are
    greater than ocfs2_max_inline_data_with_xattr, return -EINVAL.
    
    Link: https://lkml.kernel.org/r/tencent_D48DB5122ADDAEDDD11918CFB68D93258C07@qq.com
    Fixes: 1afc32b95233 ("ocfs2: Write support for inline data")
    Signed-off-by: Edward Adam Davis <eadavis@qq.com>
    Reported-by: syzbot+81092778aac03460d6b7@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=81092778aac03460d6b7
    Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
    Cc: Mark Fasheh <mark@fasheh.com>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Cc: Changwei Ge <gechangwei@live.cn>
    Cc: Gang He <ghe@suse.com>
    Cc: Jun Piao <piaojun@huawei.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
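
    Both halves of the problem can be illustrated with fixed-width
    integers. The sketch below is an assumption-laden model: the demo_*
    functions are hypothetical, max_inline stands in for
    ocfs2_max_inline_data_with_xattr, and -22 is used as the numeric
    value of -EINVAL.

```c
#include <stdint.h>

/* What happens when a 64-bit value is passed as a 32-bit parameter. */
uint32_t demo_truncate_to_u32(uint64_t value)
{
    return (uint32_t)value;   /* the upper 32 bits are silently dropped */
}

/*
 * Hypothetical sanity check done on the 64-bit values BEFORE they are
 * narrowed. Returns 0 if both fit within max_inline, -22 otherwise.
 */
int demo_check_inline_range(uint64_t byte_start, uint64_t byte_len,
                            uint64_t max_inline)
{
    if (byte_start > max_inline || byte_len > max_inline)
        return -22;
    return 0;
}
```

    A start offset of 0x100000010 looks like 0x10 after narrowing, which
    is why the check has to run on the original 64-bit values.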

 
RDMA/bnxt_re: synchronize the qp-handle table array [+ + +]
Author: Selvin Xavier <selvin.xavier@broadcom.com>
Date:   Mon Oct 14 06:36:15 2024 -0700

    RDMA/bnxt_re: synchronize the qp-handle table array
    
    [ Upstream commit 76d3ddff7153cc0bcc14a63798d19f5d0693ea71 ]
    
    There is a race between the CREQ tasklet and destroy qp when accessing the
    qp-handle table. There is a chance of reading a valid qp-handle in the
    CREQ tasklet handler while the QP is already moving ahead with the
    destruction.
    
    Fix this race by implementing a table-lock to synchronize the access.
    
    Fixes: f218d67ef004 ("RDMA/bnxt_re: Allow posting when QPs are in error")
    Fixes: 84cf229f4001 ("RDMA/bnxt_re: Fix the qp table indexing")
    Link: https://patch.msgid.link/r/1728912975-19346-3-git-send-email-selvin.xavier@broadcom.com
    Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
    Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
RDMA/cxgb4: Dump vendor specific QP details [+ + +]
Author: Leon Romanovsky <leon@kernel.org>
Date:   Mon Oct 7 20:55:17 2024 +0300

    RDMA/cxgb4: Dump vendor specific QP details
    
    [ Upstream commit 89f8c6f197f480fe05edf91eb9359d5425869d04 ]
    
    Restore the missing functionality to dump vendor specific QP details,
    which was mistakenly removed in the commit mentioned in Fixes line.
    
    Fixes: 5cc34116ccec ("RDMA: Add dedicated QP resource tracker function")
    Link: https://patch.msgid.link/r/ed9844829135cfdcac7d64285688195a5cd43f82.1728323026.git.leonro@nvidia.com
    Reported-by: Dr. David Alan Gilbert <linux@treblig.org>
    Closes: https://lore.kernel.org/all/Zv_4qAxuC0dLmgXP@gallifrey
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
RDMA/mlx5: Round max_rd_atomic/max_dest_rd_atomic up instead of down [+ + +]
Author: Patrisious Haddad <phaddad@nvidia.com>
Date:   Thu Oct 10 11:50:23 2024 +0300

    RDMA/mlx5: Round max_rd_atomic/max_dest_rd_atomic up instead of down
    
    [ Upstream commit 78ed28e08e74da6265e49e19206e1bcb8b9a7f0d ]
    
    After the commit cited below, max_dest_rd_atomic and max_rd_atomic
    values are rounded down to the nearest lower power of 2, as opposed to
    the old behavior and the mlx4 driver, where they are rounded up
    instead.
    
    In order to stay consistent with older code and other drivers, revert
    to using the fls() rounding, which rounds up to the next power of 2.
    
    Fixes: f18e26af6aba ("RDMA/mlx5: Convert modify QP to use MLX5_SET macros")
    Link: https://patch.msgid.link/r/d85515d6ef21a2fa8ef4c8293dce9b58df8a6297.1728550179.git.leon@kernel.org
    Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
    Reviewed-by: Maher Sanalla <msanalla@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
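
    The difference between the two roundings can be sketched as follows.
    demo_fls() is a portable re-implementation that follows the kernel's
    fls() convention (1-based index of the highest set bit, 0 for an
    input of 0). The two log2 helpers are hypothetical and only show the
    arithmetic, not the driver code.

```c
/* 1-based position of the most significant set bit; 0 when x is 0. */
int demo_fls(unsigned int x)
{
    int bits = 0;

    while (x) {
        bits++;
        x >>= 1;
    }
    return bits;
}

/* log2 rounded down; only meaningful for x > 0. */
int demo_log2_round_down(unsigned int x)
{
    return demo_fls(x) - 1;
}

/* log2 rounded up, using the fls(x - 1) idiom; yields 0 for x <= 1. */
int demo_log2_round_up(unsigned int x)
{
    return x ? demo_fls(x - 1) : 0;
}
```

    For a requested value of 5, rounding down yields log2 = 2 (4
    entries) and rounding up yields log2 = 3 (8 entries). Exact powers of
    2 give the same result either way.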

 
Revert "driver core: Fix uevent_show() vs driver detach race" [+ + +]
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Tue Oct 29 01:23:04 2024 +0100

    Revert "driver core: Fix uevent_show() vs driver detach race"
    
    commit 9a71892cbcdb9d1459c84f5a4c722b14354158a5 upstream.
    
    This reverts commit 15fffc6a5624b13b428bb1c6e9088e32a55eb82c.
    
    This commit causes a regression, so revert it for now until it can come
    back in a way that works for everyone.
    
    Link: https://lore.kernel.org/all/172790598832.1168608.4519484276671503678.stgit@dwillia2-xfh.jf.intel.com/
    Fixes: 15fffc6a5624 ("driver core: Fix uevent_show() vs driver detach race")
    Cc: stable <stable@kernel.org>
    Cc: Ashish Sangwan <a.sangwan@samsung.com>
    Cc: Namjae Jeon <namjae.jeon@samsung.com>
    Cc: Dirk Behme <dirk.behme@de.bosch.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Rafael J. Wysocki <rafael@kernel.org>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
riscv: efi: Set NX compat flag in PE/COFF header [+ + +]
Author: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Date:   Sun Sep 29 16:02:33 2024 +0200

    riscv: efi: Set NX compat flag in PE/COFF header
    
    [ Upstream commit d41373a4b910961df5a5e3527d7bde6ad45ca438 ]
    
    The IMAGE_DLLCHARACTERISTICS_NX_COMPAT informs the firmware that the
    EFI binary does not rely on pages that are both executable and
    writable.
    
    The flag is used by some distro versions of GRUB to decide if the EFI
    binary may be executed.
    
    As the Linux kernel neither has RWX sections nor needs RWX pages for
    relocation we should set the flag.
    
    Cc: Ard Biesheuvel <ardb@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
    Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
    Fixes: cb7d2dd5612a ("RISC-V: Add PE/COFF header for EFI stub")
    Acked-by: Ard Biesheuvel <ardb@kernel.org>
    Link: https://lore.kernel.org/r/20240929140233.211800-1-heinrich.schuchardt@canonical.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

riscv: Remove duplicated GET_RM [+ + +]
Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Date:   Tue Oct 8 17:41:39 2024 +0800

    riscv: Remove duplicated GET_RM
    
    [ Upstream commit 164f66de6bb6ef454893f193c898dc8f1da6d18b ]
    
    The macro GET_RM is defined twice in this file; one definition can be
    removed.
    
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
    Fixes: 956d705dd279 ("riscv: Unaligned load/store handling for M_MODE")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20241008094141.549248-3-zhangchunyan@iscas.ac.cn
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

riscv: Remove unused GENERATING_ASM_OFFSETS [+ + +]
Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Date:   Tue Oct 8 17:41:38 2024 +0800

    riscv: Remove unused GENERATING_ASM_OFFSETS
    
    [ Upstream commit 46d4e5ac6f2f801f97bcd0ec82365969197dc9b1 ]
    
    The macro is not used in the current version of the kernel, so it
    looks like it can be removed to avoid a build warning:
    
    ../arch/riscv/kernel/asm-offsets.c: At top level:
    ../arch/riscv/kernel/asm-offsets.c:7: warning: macro "GENERATING_ASM_OFFSETS" is not used [-Wunused-macros]
        7 | #define GENERATING_ASM_OFFSETS
    
    Fixes: 9639a44394b9 ("RISC-V: Provide a cleaner raw_smp_processor_id()")
    Cc: stable@vger.kernel.org
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
    Link: https://lore.kernel.org/r/20241008094141.549248-2-zhangchunyan@iscas.ac.cn
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

riscv: Use '%u' to format the output of 'cpu' [+ + +]
Author: WangYuli <wangyuli@uniontech.com>
Date:   Thu Oct 17 11:20:10 2024 +0800

    riscv: Use '%u' to format the output of 'cpu'
    
    [ Upstream commit e0872ab72630dada3ae055bfa410bf463ff1d1e0 ]
    
    'cpu' is an unsigned integer, so its conversion specifier should
    be %u, not %d.
    
    Suggested-by: Wentao Guan <guanwentao@uniontech.com>
    Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk>
    Link: https://lore.kernel.org/all/alpine.DEB.2.21.2409122309090.40372@angie.orcam.me.uk/
    Signed-off-by: WangYuli <wangyuli@uniontech.com>
    Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
    Tested-by: Charlie Jenkins <charlie@rivosinc.com>
    Fixes: f1e58583b9c7 ("RISC-V: Support cpu hotplug")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/4C127DEECDA287C8+20241017032010.96772-1-wangyuli@uniontech.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
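
    A small user-space example of the matching conversion specifier is
    given below. demo_format_cpu() and its message text are hypothetical;
    the example assumes a 32-bit unsigned int so that the sample value
    exceeds INT_MAX.

```c
#include <stdio.h>

/*
 * Format an unsigned CPU number. %u matches the argument type, so
 * values above INT_MAX are printed as given rather than relying on a
 * mismatched %d conversion.
 */
int demo_format_cpu(char *buf, size_t len, unsigned int cpu)
{
    return snprintf(buf, len, "cpu %u", cpu);
}
```

    Real CPU numbers are small, so the visible output rarely differs, but
    the specifier then agrees with the type and avoids format warnings.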

riscv: vdso: Prevent the compiler from inserting calls to memset() [+ + +]
Author: Alexandre Ghiti <alexghiti@rivosinc.com>
Date:   Wed Oct 16 10:36:24 2024 +0200

    riscv: vdso: Prevent the compiler from inserting calls to memset()
    
    [ Upstream commit bf40167d54d55d4b54d0103713d86a8638fb9290 ]
    
    The compiler is smart enough to insert a call to memset() in
    riscv_vdso_get_cpus(), which generates a dynamic relocation.
    
    So prevent this by using the -fno-builtin option.
    
    Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API")
    Cc: stable@vger.kernel.org
    Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Reviewed-by: Guo Ren <guoren@kernel.org>
    Link: https://lore.kernel.org/r/20241016083625.136311-2-alexghiti@rivosinc.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
scsi: scsi_transport_fc: Allow setting rport state to current state [+ + +]
Author: Benjamin Marzinski <bmarzins@redhat.com>
Date:   Tue Sep 17 19:06:43 2024 -0400

    scsi: scsi_transport_fc: Allow setting rport state to current state
    
    [ Upstream commit d539a871ae47a1f27a609a62e06093fa69d7ce99 ]
    
    The only input fc_rport_set_marginal_state() currently accepts is
    "Marginal" when port_state is "Online", and "Online" when the port_state
    is "Marginal". It should also allow setting port_state to its current
    state, either "Marginal" or "Online".
    
    Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
    Link: https://lore.kernel.org/r/20240917230643.966768-1-bmarzins@redhat.com
    Reviewed-by: Ewan D. Milne <emilne@redhat.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
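
    The accepted transitions can be written down as a small state check.
    This is a sketch under stated assumptions: the enum, its third state
    and demo_set_marginal_state() are hypothetical and only mirror the
    rule in the text, namely that "Online" and "Marginal" may each be set
    from either of the two, including the current one.

```c
enum demo_port_state {
    DEMO_PORT_ONLINE,
    DEMO_PORT_MARGINAL,
    DEMO_PORT_BLOCKED      /* example of a state this knob must not touch */
};

static int demo_is_settable(enum demo_port_state s)
{
    return s == DEMO_PORT_ONLINE || s == DEMO_PORT_MARGINAL;
}

/*
 * Returns 0 and updates *cur when the request is allowed, -1 otherwise.
 * Requesting the state the port is already in counts as success.
 */
int demo_set_marginal_state(enum demo_port_state *cur,
                            enum demo_port_state req)
{
    if (!demo_is_settable(req) || !demo_is_settable(*cur))
        return -1;
    *cur = req;
    return 0;
}
```

    Writing "Online" to a port that is already online now succeeds
    instead of being rejected, which makes the operation idempotent for
    callers.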

 
selftests/mm: fix incorrect buffer->mirror size in hmm2 double_map test [+ + +]
Author: Donet Tom <donettom@linux.ibm.com>
Date:   Fri Sep 27 00:07:52 2024 -0500

    selftests/mm: fix incorrect buffer->mirror size in hmm2 double_map test
    
    [ Upstream commit 76503e1fa1a53ef041a120825d5ce81c7fe7bdd7 ]
    
    The hmm2 double_map test was failing due to an incorrect buffer->mirror
    size.  The buffer->mirror size was 6, while buffer->ptr size was 6 *
    PAGE_SIZE.  The test failed because the kernel's copy_to_user function was
    attempting to copy a 6 * PAGE_SIZE buffer to buffer->mirror.  Since the
    size of buffer->mirror was incorrect, copy_to_user failed.
    
    This patch corrects the buffer->mirror size to 6 * PAGE_SIZE.
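    
    As a rough illustration of the size mismatch, the sketch below uses a
    hypothetical bounds-aware copy helper in user space. It is not the
    kernel's copy_to_user(); it only models a copy that reports -EFAULT
    (the -14 seen in the failing test output) when the destination is too
    small:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy len bytes into dst, refusing to write past
 * dst_size. Returns 0 on success and -EFAULT when the destination is too
 * small for the requested length.
 */
int toy_checked_copy(void *dst, size_t dst_size, const void *src, size_t len)
{
	if (len > dst_size)
		return -EFAULT;
	memcpy(dst, src, len);
	return 0;
}
```

    With a destination of 6 bytes and a source of 6 pages the helper fails;
    sizing the destination as 6 pages lets the same copy succeed.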
    
    Test Result without this patch
    ==============================
     #  RUN           hmm2.hmm2_device_private.double_map ...
     # hmm-tests.c:1680:double_map:Expected ret (-14) == 0 (0)
     # double_map: Test terminated by assertion
     #          FAIL  hmm2.hmm2_device_private.double_map
     not ok 53 hmm2.hmm2_device_private.double_map
    
    Test Result with this patch
    ===========================
     #  RUN           hmm2.hmm2_device_private.double_map ...
     #            OK  hmm2.hmm2_device_private.double_map
     ok 53 hmm2.hmm2_device_private.double_map
    
    Link: https://lkml.kernel.org/r/20240927050752.51066-1-donettom@linux.ibm.com
    Fixes: fee9f6d1b8df ("mm/hmm/test: add selftests for HMM")
    Signed-off-by: Donet Tom <donettom@linux.ibm.com>
    Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
    Cc: Jérôme Glisse <jglisse@redhat.com>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Mark Brown <broonie@kernel.org>
    Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
    Cc: Shuah Khan <shuah@kernel.org>
    Cc: Ralph Campbell <rcampbell@nvidia.com>
    Cc: Jason Gunthorpe <jgg@mellanox.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
staging: iio: frequency: ad9832: fix division by zero in ad9832_calc_freqreg() [+ + +]
Author: Zicheng Qu <quzicheng@huawei.com>
Date:   Tue Oct 22 13:43:54 2024 +0000

    staging: iio: frequency: ad9832: fix division by zero in ad9832_calc_freqreg()
    
    commit 6bd301819f8f69331a55ae2336c8b111fc933f3d upstream.
    
    In the ad9832_write_frequency() function, clk_get_rate() might return 0.
    This can lead to a division by zero when calling ad9832_calc_freqreg().
    The check if (fout > (clk_get_rate(st->mclk) / 2)) does not protect
    against the case when fout is 0. The ad9832_write_frequency() function
    is called from ad9832_write(), and fout is derived from a text buffer,
    which can contain any value.
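    
    The failure mode can be sketched in user space as follows. The function
    names and the 32-bit register width are assumptions made for this
    illustration and are not taken from the driver source. Note that when
    the clock rate is 0, the check "fout > rate / 2" is false for fout == 0,
    so the division is reached unless the rate is tested explicitly:

```c
#include <errno.h>
#include <stdint.h>

#define TOY_FREQ_BITS 32	/* assumed register width for this sketch */

/* fout * 2^TOY_FREQ_BITS / mclk; the caller must guarantee mclk != 0. */
uint64_t toy_calc_freqreg(uint64_t mclk, uint64_t fout)
{
	return (fout << TOY_FREQ_BITS) / mclk;
}

/* Validate both inputs before dividing. Returns 0 or -EINVAL. */
int toy_write_frequency(uint64_t mclk, uint64_t fout, uint64_t *reg)
{
	if (mclk == 0 || fout > mclk / 2)
		return -EINVAL;
	*reg = toy_calc_freqreg(mclk, fout);
	return 0;
}
```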
    
    Link: https://lore.kernel.org/all/2024100904-CVE-2024-47663-9bdc@gregkh/
    Fixes: ea707584bac1 ("Staging: IIO: DDS: AD9832 / AD9835 driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
    Reviewed-by: Nuno Sa <nuno.sa@analog.com>
    Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
    Link: https://patch.msgid.link/20241022134354.574614-1-quzicheng@huawei.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
USB: gadget: dummy-hcd: Fix "task hung" problem [+ + +]
Author: Alan Stern <stern@rowland.harvard.edu>
Date:   Wed Oct 16 11:44:45 2024 -0400

    USB: gadget: dummy-hcd: Fix "task hung" problem
    
    [ Upstream commit 5189df7b8088268012882c220d6aca4e64981348 ]
    
    The syzbot fuzzer has been encountering "task hung" problems ever
    since the dummy-hcd driver was changed to use hrtimers instead of
    regular timers.  It turns out that the problems are caused by a subtle
    difference between the timer_pending() and hrtimer_active() APIs.
    
    The changeover blindly replaced the first by the second.  However,
    timer_pending() returns True when the timer is queued but not when its
    callback is running, whereas hrtimer_active() returns True when the
    hrtimer is queued _or_ its callback is running.  This difference
    occasionally caused dummy_urb_enqueue() to think that the callback
    routine had not yet started when in fact it was almost finished.  As a
    result the hrtimer was not restarted, which made it impossible for the
    driver to dequeue later the URB that was just enqueued.  This caused
    usb_kill_urb() to hang, and things got worse from there.
    
    Since hrtimers have no API for telling when they are queued and the
    callback isn't running, the driver must keep track of this for itself.
    That's what this patch does, adding a new "timer_pending" flag and
    setting or clearing it at the appropriate times.
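    
    The difference between the two checks can be modelled with a toy timer
    in user space. All names below are hypothetical; the sketch only encodes
    the semantics described in this message, not the kernel's timer
    implementation:

```c
#include <stdbool.h>

/* Toy timer: it is queued, running its callback, or idle. */
struct toy_timer {
	bool queued;
	bool callback_running;
};

/* Semantics described above for timer_pending(): queued only. */
bool toy_timer_pending(const struct toy_timer *t)
{
	return t->queued;
}

/* Semantics described above for hrtimer_active(): queued or running. */
bool toy_hrtimer_active(const struct toy_timer *t)
{
	return t->queued || t->callback_running;
}

/* Enqueue path: the timer must be restarted unless it is considered busy. */
bool toy_enqueue_needs_restart(const struct toy_timer *t, bool use_active_check)
{
	bool busy = use_active_check ? toy_hrtimer_active(t)
				     : toy_timer_pending(t);
	return !busy;
}
```

    When the callback is running but the timer is no longer queued, the
    "active" check reports busy and the restart is skipped, which is the
    situation the separate flag is meant to avoid.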
    
    Reported-by: syzbot+f342ea16c9d06d80b585@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/linux-usb/6709234e.050a0220.3e960.0011.GAE@google.com/
    Tested-by: syzbot+f342ea16c9d06d80b585@syzkaller.appspotmail.com
    Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
    Fixes: a7f3813e589f ("usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler")
    Cc: Marcello Sylvester Bauer <sylv@sylv.io>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/2dab644e-ef87-4de8-ac9a-26f100b2c609@rowland.harvard.edu
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
usb: gadget: dummy_hcd: execute hrtimer callback in softirq context [+ + +]
Author: Andrey Konovalov <andreyknvl@gmail.com>
Date:   Wed Sep 4 03:30:51 2024 +0200

    usb: gadget: dummy_hcd: execute hrtimer callback in softirq context
    
    [ Upstream commit 9313d139aa25e572d860f6f673b73a20f32d7f93 ]
    
    Commit a7f3813e589f ("usb: gadget: dummy_hcd: Switch to hrtimer transfer
    scheduler") switched dummy_hcd to use hrtimer and made the timer's
    callback be executed in the hardirq context.
    
    With that change, __usb_hcd_giveback_urb now gets executed in the hardirq
    context, which causes problems for KCOV and KMSAN.
    
    One problem is that KCOV now is unable to collect coverage from
    the USB code that gets executed from the dummy_hcd's timer callback,
    as KCOV cannot collect coverage in the hardirq context.
    
    Another problem is that the dummy_hcd hrtimer might get triggered in the
    middle of a softirq with KCOV remote coverage collection enabled, and that
    causes a WARNING in KCOV, as reported by syzbot. (I sent a separate patch
    to shut down this WARNING, but that doesn't fix the other two issues.)
    
    Finally, KMSAN appears to ignore tracking memory copying operations
    that happen in the hardirq context, which causes false positive
    kernel-infoleaks, as reported by syzbot.
    
    Change the hrtimer in dummy_hcd to execute the callback in the softirq
    context.
    
    Reported-by: syzbot+2388cdaeb6b10f0c13ac@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=2388cdaeb6b10f0c13ac
    Reported-by: syzbot+17ca2339e34a1d863aad@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=17ca2339e34a1d863aad
    Reported-by: syzbot+c793a7eca38803212c61@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=c793a7eca38803212c61
    Reported-by: syzbot+1e6e0b916b211bee1bd6@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=1e6e0b916b211bee1bd6
    Reported-by: kernel test robot <oliver.sang@intel.com>
    Closes: https://lore.kernel.org/oe-lkp/202406141323.413a90d2-lkp@intel.com
    Fixes: a7f3813e589f ("usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler")
    Cc: stable@vger.kernel.org
    Acked-by: Marcello Sylvester Bauer <sylv@sylv.io>
    Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com>
    Reported-by: syzbot+edd9fe0d3a65b14588d5@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=edd9fe0d3a65b14588d5
    Link: https://lore.kernel.org/r/20240904013051.4409-1-andrey.konovalov@linux.dev
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

usb: gadget: dummy_hcd: Set transfer interval to 1 microframe [+ + +]
Author: Marcello Sylvester Bauer <sylv@sylv.io>
Date:   Thu Apr 11 17:22:11 2024 +0200

    usb: gadget: dummy_hcd: Set transfer interval to 1 microframe
    
    [ Upstream commit 0a723ed3baa941ca4f51d87bab00661f41142835 ]
    
    Currently, the transfer polling interval is set to 1ms, which is the
    frame rate of full-speed and low-speed USB. The USB 2.0 specification
    introduces microframes (125 microseconds) to improve the timing
    precision of data transfers.
    
    Reducing the transfer interval to 1 microframe increases data throughput
    for high-speed and super-speed USB communication.
    
    Signed-off-by: Marcello Sylvester Bauer <marcello.bauer@9elements.com>
    Signed-off-by: Marcello Sylvester Bauer <sylv@sylv.io>
    Link: https://lore.kernel.org/r/6295dbb84ca76884551df9eb157cce569377a22c.1712843963.git.sylv@sylv.io
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler [+ + +]
Author: Marcello Sylvester Bauer <sylv@sylv.io>
Date:   Thu Apr 11 16:51:28 2024 +0200

    usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler
    
    [ Upstream commit a7f3813e589fd8e2834720829a47b5eb914a9afe ]
    
    The dummy_hcd transfer scheduler assumes that the internal kernel timer
    frequency is set to 1000Hz to give a polling interval of 1ms. Reducing
    the timer frequency will result in an anti-proportional reduction in
    transfer performance. Switch to a hrtimer to decouple this association.
    
    Signed-off-by: Marcello Sylvester Bauer <marcello.bauer@9elements.com>
    Signed-off-by: Marcello Sylvester Bauer <sylv@sylv.io>
    Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
    Link: https://lore.kernel.org/r/57a1c2180ff74661600e010c234d1dbaba1d0d46.1712843963.git.sylv@sylv.io
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

usb: phy: Fix API devm_usb_put_phy() can not release the phy [+ + +]
Author: Zijun Hu <quic_zijuhu@quicinc.com>
Date:   Sun Oct 20 17:33:42 2024 +0800

    usb: phy: Fix API devm_usb_put_phy() can not release the phy
    
    commit fdce49b5da6e0fb6d077986dec3e90ef2b094b50 upstream.
    
    The comment for devm_usb_put_phy() says that it invokes usb_put_phy() to
    release the phy, but it does not actually do so, and therefore cannot
    fully undo what devm_usb_get_phy() does. Fix this by using
    devres_release() instead of devres_destroy() within the API.
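    
    The distinction between the two devres calls can be sketched with a toy
    model in user space. The struct and function names are hypothetical; the
    model only captures that destroying an entry drops it without running
    its release callback, while releasing an entry runs the callback first:

```c
/* Toy managed resource: a release callback plus the data it acts on. */
struct toy_devres {
	void (*release)(void *data);
	void *data;
	int registered;
};

/* Models devres_destroy(): drop the entry, do NOT run the callback. */
void toy_destroy(struct toy_devres *dr)
{
	dr->registered = 0;
}

/* Models devres_release(): run the callback, then drop the entry. */
void toy_release(struct toy_devres *dr)
{
	if (dr->registered && dr->release)
		dr->release(dr->data);
	dr->registered = 0;
}

/* Hypothetical callback standing in for the phy put operation. */
void toy_put_phy(void *data)
{
	int *refcount = data;

	(*refcount)--;
}
```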
    
    Fixes: cedf8602373a ("usb: phy: move bulk of otg/otg.c to phy/phy.c")
    Cc: stable@vger.kernel.org
    Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
    Link: https://lore.kernel.org/r/20241020-usb_phy_fix-v1-1-7f79243b8e1e@quicinc.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: typec: fix unreleased fwnode_handle in typec_port_register_altmodes() [+ + +]
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date:   Mon Oct 21 22:45:29 2024 +0200

    usb: typec: fix unreleased fwnode_handle in typec_port_register_altmodes()
    
    commit 9581acb91eaf5bbe70086bbb6fca808220d358ba upstream.
    
    The 'altmodes_node' fwnode_handle is never released after it is no
    longer required, which leaks the resource.
    
    Add the required call to fwnode_handle_put() when 'altmodes_node' is no
    longer required.
    
    Cc: stable@vger.kernel.org
    Fixes: 7b458a4c5d73 ("usb: typec: Add typec_port_register_altmodes()")
    Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
    Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
    Link: https://lore.kernel.org/r/20241021-typec-class-fwnode_handle_put-v2-1-3281225d3d27@gmail.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
usbip: tools: Fix detach_port() invalid port error path [+ + +]
Author: Zongmin Zhou <zhouzongmin@kylinos.cn>
Date:   Thu Oct 24 10:27:00 2024 +0800

    usbip: tools: Fix detach_port() invalid port error path
    
    commit e7cd4b811c9e019f5acbce85699c622b30194c24 upstream.
    
    detach_port() doesn't return an error
    when detach is attempted on an invalid port.
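    
    A minimal sketch of the intended error path, assuming a hypothetical
    table of attached ports. It is not the usbip tool's code; it only shows
    the return value being set before leaving through the common exit:

```c
#include <stddef.h>

/* Hypothetical table of attached port numbers for this sketch. */
static const int toy_attached_ports[] = { 0, 2, 5 };

/* Returns 0 on success and -1 when the port is not attached. */
int toy_detach_port(int port)
{
	size_t n = sizeof(toy_attached_ports) / sizeof(toy_attached_ports[0]);
	size_t i;
	int found = 0;
	int ret = 0;

	for (i = 0; i < n; i++) {
		if (toy_attached_ports[i] == port) {
			found = 1;
			break;
		}
	}

	if (!found) {
		ret = -1;	/* the step an error path must not skip */
		goto out;
	}

	/* The actual detach work would go here. */
out:
	return ret;
}
```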
    
    Fixes: 40ecdeb1a187 ("usbip: usbip_detach: fix to check for invalid ports")
    Cc: stable@vger.kernel.org
    Reviewed-by: Hongren Zheng <i@zenithal.me>
    Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
    Signed-off-by: Zongmin Zhou <zhouzongmin@kylinos.cn>
    Link: https://lore.kernel.org/r/20241024022700.1236660-1-min_halo@163.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
vmscan,migrate: fix page count imbalance on node stats when demoting pages [+ + +]
Author: Gregory Price <gourry@gourry.net>
Date:   Fri Oct 25 10:17:24 2024 -0400

    vmscan,migrate: fix page count imbalance on node stats when demoting pages
    
    [ Upstream commit 35e41024c4c2b02ef8207f61b9004f6956cf037b ]
    
    When numa balancing is enabled with demotion, vmscan will call
    migrate_pages when shrinking LRUs.  migrate_pages will decrement the
    node's isolated page count, leading to an imbalanced count when
    invoked from (MG)LRU code.
    
    The result is dmesg output such as:
    
    $ cat /proc/sys/vm/stat_refresh
    
    [77383.088417] vmstat_refresh: nr_isolated_anon -103212
    [77383.088417] vmstat_refresh: nr_isolated_file -899642
    
    This negative value may impact compaction and reclaim throttling.
    
    The following path produces the decrement:
    
    shrink_folio_list
      demote_folio_list
        migrate_pages
          migrate_pages_batch
            migrate_folio_move
              migrate_folio_done
                mod_node_page_state(-ve) <- decrement
    
    This path happens for SUCCESSFUL migrations, not failures.  Typically
    callers to migrate_pages are required to handle putback/accounting for
    failures, but this is already handled in the shrink code.
    
    When accounting for migrations, instead do not decrement the count when
    the migration reason is MR_DEMOTION.  As of v6.11, this demotion logic
    is the only source of MR_DEMOTION.
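    
    The double accounting can be illustrated with a toy counter in user
    space. All names are hypothetical and the model is deliberately
    simplified: reclaim adds the isolated pages, a successful migration
    subtracts them, and reclaim then subtracts them again on its own:

```c
enum toy_reason { TOY_MR_COMPACTION, TOY_MR_DEMOTION };

/* Toy per-node counter standing in for nr_isolated_*. */
struct toy_node {
	long nr_isolated;
};

/* Accounting performed at the end of a successful migration. */
void toy_migrate_done(struct toy_node *n, long pages, enum toy_reason why,
		      int skip_demotion)
{
	if (skip_demotion && why == TOY_MR_DEMOTION)
		return;		/* reclaim accounts for these pages itself */
	n->nr_isolated -= pages;
}

/* Reclaim isolates pages, demotes them, then drops its own count. */
long toy_reclaim_with_demotion(struct toy_node *n, long pages,
			       int skip_demotion)
{
	n->nr_isolated += pages;
	toy_migrate_done(n, pages, TOY_MR_DEMOTION, skip_demotion);
	n->nr_isolated -= pages;
	return n->nr_isolated;
}
```

    Without the demotion check the counter ends up negative by the number of
    demoted pages; with it, the counter returns to its starting value.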
    
    Link: https://lkml.kernel.org/r/20241025141724.17927-1-gourry@gourry.net
    Fixes: 26aa2d199d6f ("mm/migrate: demote pages during reclaim")
    Signed-off-by: Gregory Price <gourry@gourry.net>
    Reviewed-by: Yang Shi <shy828301@gmail.com>
    Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
    Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
    Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
    Reviewed-by: Oscar Salvador <osalvador@suse.de>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Wei Xu <weixugc@google.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
vt: prevent kernel-infoleak in con_font_get() [+ + +]
Author: Jeongjun Park <aha310510@gmail.com>
Date:   Fri Oct 11 02:46:19 2024 +0900

    vt: prevent kernel-infoleak in con_font_get()
    
    commit f956052e00de211b5c9ebaa1958366c23f82ee9e upstream.
    
    Depending on the implementation of vc->vc_sw->con_font_get, not all of
    font.data may be initialized. This may cause an info-leak. To prevent
    this, it is safest to initialize the allocated memory to 0; doing so
    generally does not affect the overall performance of the system.
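    
    A user-space analogue of the idea, using calloc() in place of a zeroing
    kernel allocator. The getter below is a hypothetical stand-in for a
    callback that fills only part of the buffer:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical font getter that fills only the first `used` bytes. */
void toy_font_get(unsigned char *buf, size_t used)
{
	memset(buf, 0xff, used);
}

/*
 * Allocate zeroed storage so that bytes the getter never touches cannot
 * carry stale data back to the caller. Returns NULL on allocation failure.
 */
unsigned char *toy_alloc_font(size_t size, size_t used)
{
	unsigned char *buf = calloc(1, size);

	if (buf)
		toy_font_get(buf, used < size ? used : size);
	return buf;
}
```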
    
    Cc: stable@vger.kernel.org
    Reported-by: syzbot+955da2d57931604ee691@syzkaller.appspotmail.com
    Fixes: 05e2600cb0a4 ("VT: Bump font size limitation to 64x128 pixels")
    Signed-off-by: Jeongjun Park <aha310510@gmail.com>
    Link: https://lore.kernel.org/r/20241010174619.59662-1-aha310510@gmail.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
wifi: ath10k: Fix memory leak in management tx [+ + +]
Author: Manikanta Pubbisetty <quic_mpubbise@quicinc.com>
Date:   Tue Oct 15 12:11:03 2024 +0530

    wifi: ath10k: Fix memory leak in management tx
    
    commit e15d84b3bba187aa372dff7c58ce1fd5cb48a076 upstream.
    
    In the current logic, memory is allocated for storing the MSDU context
    during management packet TX but this memory is not being freed during
    management TX completion. Similar leaks are seen in the management TX
    cleanup logic.
    
    Kmemleak reports this problem as below,
    
    unreferenced object 0xffffff80b64ed250 (size 16):
      comm "kworker/u16:7", pid 148, jiffies 4294687130 (age 714.199s)
      hex dump (first 16 bytes):
        00 2b d8 d8 80 ff ff ff c4 74 e9 fd 07 00 00 00  .+.......t......
      backtrace:
        [<ffffffe6e7b245dc>] __kmem_cache_alloc_node+0x1e4/0x2d8
        [<ffffffe6e7adde88>] kmalloc_trace+0x48/0x110
        [<ffffffe6bbd765fc>] ath10k_wmi_tlv_op_gen_mgmt_tx_send+0xd4/0x1d8 [ath10k_core]
        [<ffffffe6bbd3eed4>] ath10k_mgmt_over_wmi_tx_work+0x134/0x298 [ath10k_core]
        [<ffffffe6e78d5974>] process_scheduled_works+0x1ac/0x400
        [<ffffffe6e78d60b8>] worker_thread+0x208/0x328
        [<ffffffe6e78dc890>] kthread+0x100/0x1c0
        [<ffffffe6e78166c0>] ret_from_fork+0x10/0x20
    
    Free the memory during completion and cleanup to fix the leak.
    
    Protect the mgmt_pending_tx idr_remove() operation in
    ath10k_wmi_tlv_op_cleanup_mgmt_tx_send() using ar->data_lock similar to
    other instances.
    
    Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.2.0-01387-QCAHLSWMTPLZ-1
    
    Fixes: dc405152bb64 ("ath10k: handle mgmt tx completion event")
    Fixes: c730c477176a ("ath10k: Remove msdu from idr when management pkt send fails")
    Cc: stable@vger.kernel.org
    Signed-off-by: Manikanta Pubbisetty <quic_mpubbise@quicinc.com>
    Link: https://patch.msgid.link/20241015064103.6060-1-quic_mpubbise@quicinc.com
    Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

wifi: ath11k: Fix invalid ring usage in full monitor mode [+ + +]
Author: Remi Pommarel <repk@triplefau.lt>
Date:   Tue Sep 24 21:41:19 2024 +0200

    wifi: ath11k: Fix invalid ring usage in full monitor mode
    
    [ Upstream commit befd716ed429b26eca7abde95da6195c548470de ]
    
    On full monitor HW the monitor destination rxdma ring does not have the
    same descriptor format as in the "classical" mode. The full monitor
    destination entries are of hal_sw_monitor_ring type and fetched using
    ath11k_dp_full_mon_process_rx while the classical ones are of type
    hal_reo_entrance_ring and fetched with ath11k_dp_rx_mon_dest_process.
    
    Although both hal_sw_monitor_ring and hal_reo_entrance_ring are of the
    same size, the offsets to useful info (such as sw_cookie, paddr, etc) are
    different. Thus if ath11k_dp_rx_mon_dest_process gets called on a full
    monitor destination ring, an invalid skb buffer id will be fetched from
    the DMA ring, causing issues such as the following rcu_sched stall:
    
     rcu: INFO: rcu_sched self-detected stall on CPU
     rcu:     0-....: (1 GPs behind) idle=c67/0/0x7 softirq=45768/45769 fqs=1012
      (t=2100 jiffies g=14817 q=8703)
     Task dump for CPU 0:
     task:swapper/0       state:R  running task     stack: 0 pid:    0 ppid:     0 flags:0x0000000a
     Call trace:
      dump_backtrace+0x0/0x160
      show_stack+0x14/0x20
      sched_show_task+0x158/0x184
      dump_cpu_task+0x40/0x4c
      rcu_dump_cpu_stacks+0xec/0x12c
      rcu_sched_clock_irq+0x6c8/0x8a0
      update_process_times+0x88/0xd0
      tick_sched_timer+0x74/0x1e0
      __hrtimer_run_queues+0x150/0x204
      hrtimer_interrupt+0xe4/0x240
      arch_timer_handler_phys+0x30/0x40
      handle_percpu_devid_irq+0x80/0x130
      handle_domain_irq+0x5c/0x90
      gic_handle_irq+0x8c/0xb4
      do_interrupt_handler+0x30/0x54
      el1_interrupt+0x2c/0x4c
      el1h_64_irq_handler+0x14/0x1c
      el1h_64_irq+0x74/0x78
      do_raw_spin_lock+0x60/0x100
      _raw_spin_lock_bh+0x1c/0x2c
      ath11k_dp_rx_mon_mpdu_pop.constprop.0+0x174/0x650
      ath11k_dp_rx_process_mon_status+0x8b4/0xa80
      ath11k_dp_rx_process_mon_rings+0x244/0x510
      ath11k_dp_service_srng+0x190/0x300
      ath11k_pcic_ext_grp_napi_poll+0x30/0xc0
      __napi_poll+0x34/0x174
      net_rx_action+0xf8/0x2a0
      _stext+0x12c/0x2ac
      irq_exit+0x94/0xc0
      handle_domain_irq+0x60/0x90
      gic_handle_irq+0x8c/0xb4
      call_on_irq_stack+0x28/0x44
      do_interrupt_handler+0x4c/0x54
      el1_interrupt+0x2c/0x4c
      el1h_64_irq_handler+0x14/0x1c
      el1h_64_irq+0x74/0x78
      arch_cpu_idle+0x14/0x20
      do_idle+0xf0/0x130
      cpu_startup_entry+0x24/0x50
      rest_init+0xf8/0x104
      arch_call_rest_init+0xc/0x14
      start_kernel+0x56c/0x58c
      __primary_switched+0xa0/0xa8
    
    Thus ath11k_dp_rx_mon_dest_process(), which uses the classical destination
    entry format, should not be called on full monitor capable HW.
    
    Fixes: 67a9d399fcb0 ("ath11k: enable RX PPDU stats in monitor co-exist mode")
    Signed-off-by: Remi Pommarel <repk@triplefau.lt>
    Reviewed-by: Praneesh P <quic_ppranees@quicinc.com>
    Link: https://patch.msgid.link/20240924194119.15942-1-repk@triplefau.lt
    Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: brcm80211: BRCM_TRACING should depend on TRACING [+ + +]
Author: Geert Uytterhoeven <geert@linux-m68k.org>
Date:   Tue Sep 24 14:09:32 2024 +0200

    wifi: brcm80211: BRCM_TRACING should depend on TRACING
    
    [ Upstream commit b73b2069528f90ec49d5fa1010a759baa2c2be05 ]
    
    When tracing is disabled, there is no point in asking the user about
    enabling Broadcom wireless device tracing.
    
    Fixes: f5c4f10852d42012 ("brcm80211: Allow trace support to be enabled separately from debug")
    Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://patch.msgid.link/81a29b15eaacc1ac1fb421bdace9ac0c3385f40f.1727179742.git.geert@linux-m68k.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: cfg80211: clear wdev->cqm_config pointer on free [+ + +]
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Tue Oct 22 16:17:42 2024 +0200

    wifi: cfg80211: clear wdev->cqm_config pointer on free
    
    commit d5fee261dfd9e17b08b1df8471ac5d5736070917 upstream.
    
    When we free wdev->cqm_config when unregistering, we also
    need to clear out the pointer since the same wdev/netdev
    may get re-registered in another network namespace, then
    destroyed later, running this code again, which results in
    a double-free.
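    
    The pattern can be sketched in user space with free(). The struct and
    function are hypothetical and ignore the locking and RCU handling the
    real code may need; the point is only that clearing the pointer makes a
    second pass through the same path harmless:

```c
#include <stdlib.h>

/* Toy wireless device holding an optional, separately allocated config. */
struct toy_wdev {
	int *cqm_config;
};

/* Free the config and clear the pointer so a repeated call is safe. */
void toy_unregister(struct toy_wdev *w)
{
	free(w->cqm_config);	/* free(NULL) is a no-op */
	w->cqm_config = NULL;	/* without this, a second call double-frees */
}
```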
    
    Reported-by: syzbot+36218cddfd84b5cc263e@syzkaller.appspotmail.com
    Fixes: 37c20b2effe9 ("wifi: cfg80211: fix cqm_config access race")
    Cc: stable@vger.kernel.org
    Link: https://patch.msgid.link/20241022161742.7c34b2037726.I121b9cdb7eb180802eafc90b493522950d57ee18@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

wifi: iwlegacy: Clear stale interrupts before resuming device [+ + +]
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Tue Oct 1 23:07:45 2024 +0300

    wifi: iwlegacy: Clear stale interrupts before resuming device
    
    commit 07c90acb071b9954e1fecb1e4f4f13d12c544b34 upstream.
    
    iwl4965 fails upon resume from hibernation on my laptop. The reason
    seems to be a stale interrupt which isn't being cleared out before
    interrupts are enabled. We end up with a race between the resume
    trying to bring things back up, and the restart work (queued from
    the interrupt handler) trying to bring things down. Eventually
    the whole thing blows up.
    
    Fix the problem by clearing out any stale interrupts before
    interrupts get enabled during resume.
    
    Here's a debug log of the incident:
    [   12.042589] ieee80211 phy0: il_isr ISR inta 0x00000080, enabled 0xaa00008b, fh 0x00000000
    [   12.042625] ieee80211 phy0: il4965_irq_tasklet inta 0x00000080, enabled 0x00000000, fh 0x00000000
    [   12.042651] iwl4965 0000:10:00.0: RF_KILL bit toggled to enable radio.
    [   12.042653] iwl4965 0000:10:00.0: On demand firmware reload
    [   12.042690] ieee80211 phy0: il4965_irq_tasklet End inta 0x00000000, enabled 0xaa00008b, fh 0x00000000, flags 0x00000282
    [   12.052207] ieee80211 phy0: il4965_mac_start enter
    [   12.052212] ieee80211 phy0: il_prep_station Add STA to driver ID 31: ff:ff:ff:ff:ff:ff
    [   12.052244] ieee80211 phy0: il4965_set_hw_ready hardware  ready
    [   12.052324] ieee80211 phy0: il_apm_init Init card's basic functions
    [   12.052348] ieee80211 phy0: il_apm_init L1 Enabled; Disabling L0S
    [   12.055727] ieee80211 phy0: il4965_load_bsm Begin load bsm
    [   12.056140] ieee80211 phy0: il4965_verify_bsm Begin verify bsm
    [   12.058642] ieee80211 phy0: il4965_verify_bsm BSM bootstrap uCode image OK
    [   12.058721] ieee80211 phy0: il4965_load_bsm BSM write complete, poll 1 iterations
    [   12.058734] ieee80211 phy0: __il4965_up iwl4965 is coming up
    [   12.058737] ieee80211 phy0: il4965_mac_start Start UP work done.
    [   12.058757] ieee80211 phy0: __il4965_down iwl4965 is going down
    [   12.058761] ieee80211 phy0: il_scan_cancel_timeout Scan cancel timeout
    [   12.058762] ieee80211 phy0: il_do_scan_abort Not performing scan to abort
    [   12.058765] ieee80211 phy0: il_clear_ucode_stations Clearing ucode stations in driver
    [   12.058767] ieee80211 phy0: il_clear_ucode_stations No active stations found to be cleared
    [   12.058819] ieee80211 phy0: _il_apm_stop Stop card, put in low power state
    [   12.058827] ieee80211 phy0: _il_apm_stop_master stop master
    [   12.058864] ieee80211 phy0: il4965_clear_free_frames 0 frames on pre-allocated heap on clear.
    [   12.058869] ieee80211 phy0: Hardware restart was requested
    [   16.132299] iwl4965 0000:10:00.0: START_ALIVE timeout after 4000ms.
    [   16.132303] ------------[ cut here ]------------
    [   16.132304] Hardware became unavailable upon resume. This could be a software issue prior to suspend or a hardware issue.
    [   16.132338] WARNING: CPU: 0 PID: 181 at net/mac80211/util.c:1826 ieee80211_reconfig+0x8f/0x14b0 [mac80211]
    [   16.132390] Modules linked in: ctr ccm sch_fq_codel xt_tcpudp xt_multiport xt_state iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv4 ip_tables x_tables binfmt_misc joydev mousedev btusb btrtl btintel btbcm bluetooth ecdh_generic ecc iTCO_wdt i2c_dev iwl4965 iwlegacy coretemp snd_hda_codec_analog pcspkr psmouse mac80211 snd_hda_codec_generic libarc4 sdhci_pci cqhci sha256_generic sdhci libsha256 firewire_ohci snd_hda_intel snd_intel_dspcfg mmc_core snd_hda_codec snd_hwdep firewire_core led_class iosf_mbi snd_hda_core uhci_hcd lpc_ich crc_itu_t cfg80211 ehci_pci ehci_hcd snd_pcm usbcore mfd_core rfkill snd_timer snd usb_common soundcore video parport_pc parport intel_agp wmi intel_gtt backlight e1000e agpgart evdev
    [   16.132456] CPU: 0 UID: 0 PID: 181 Comm: kworker/u8:6 Not tainted 6.11.0-cl+ #143
    [   16.132460] Hardware name: Hewlett-Packard HP Compaq 6910p/30BE, BIOS 68MCU Ver. F.19 07/06/2010
    [   16.132463] Workqueue: async async_run_entry_fn
    [   16.132469] RIP: 0010:ieee80211_reconfig+0x8f/0x14b0 [mac80211]
    [   16.132501] Code: da 02 00 00 c6 83 ad 05 00 00 00 48 89 df e8 98 1b fc ff 85 c0 41 89 c7 0f 84 e9 02 00 00 48 c7 c7 a0 e6 48 a0 e8 d1 77 c4 e0 <0f> 0b eb 2d 84 c0 0f 85 8b 01 00 00 c6 87 ad 05 00 00 00 e8 69 1b
    [   16.132504] RSP: 0018:ffffc9000029fcf0 EFLAGS: 00010282
    [   16.132507] RAX: 0000000000000000 RBX: ffff8880072008e0 RCX: 0000000000000001
    [   16.132509] RDX: ffffffff81f21a18 RSI: 0000000000000086 RDI: 0000000000000001
    [   16.132510] RBP: ffff8880072003c0 R08: 0000000000000000 R09: 0000000000000003
    [   16.132512] R10: 0000000000000000 R11: ffff88807e5b0000 R12: 0000000000000001
    [   16.132514] R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ffffff92
    [   16.132515] FS:  0000000000000000(0000) GS:ffff88807c200000(0000) knlGS:0000000000000000
    [   16.132517] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   16.132519] CR2: 000055dd43786c08 CR3: 000000000978f000 CR4: 00000000000006f0
    [   16.132521] Call Trace:
    [   16.132525]  <TASK>
    [   16.132526]  ? __warn+0x77/0x120
    [   16.132532]  ? ieee80211_reconfig+0x8f/0x14b0 [mac80211]
    [   16.132564]  ? report_bug+0x15c/0x190
    [   16.132568]  ? handle_bug+0x36/0x70
    [   16.132571]  ? exc_invalid_op+0x13/0x60
    [   16.132573]  ? asm_exc_invalid_op+0x16/0x20
    [   16.132579]  ? ieee80211_reconfig+0x8f/0x14b0 [mac80211]
    [   16.132611]  ? snd_hdac_bus_init_cmd_io+0x24/0x200 [snd_hda_core]
    [   16.132617]  ? pick_eevdf+0x133/0x1c0
    [   16.132622]  ? check_preempt_wakeup_fair+0x70/0x90
    [   16.132626]  ? wakeup_preempt+0x4a/0x60
    [   16.132628]  ? ttwu_do_activate.isra.0+0x5a/0x190
    [   16.132632]  wiphy_resume+0x79/0x1a0 [cfg80211]
    [   16.132675]  ? wiphy_suspend+0x2a0/0x2a0 [cfg80211]
    [   16.132697]  dpm_run_callback+0x75/0x1b0
    [   16.132703]  device_resume+0x97/0x200
    [   16.132707]  async_resume+0x14/0x20
    [   16.132711]  async_run_entry_fn+0x1b/0xa0
    [   16.132714]  process_one_work+0x13d/0x350
    [   16.132718]  worker_thread+0x2be/0x3d0
    [   16.132722]  ? cancel_delayed_work_sync+0x70/0x70
    [   16.132725]  kthread+0xc0/0xf0
    [   16.132729]  ? kthread_park+0x80/0x80
    [   16.132732]  ret_from_fork+0x28/0x40
    [   16.132735]  ? kthread_park+0x80/0x80
    [   16.132738]  ret_from_fork_asm+0x11/0x20
    [   16.132741]  </TASK>
    [   16.132742] ---[ end trace 0000000000000000 ]---
    [   16.132930] ------------[ cut here ]------------
    [   16.132932] WARNING: CPU: 0 PID: 181 at net/mac80211/driver-ops.c:41 drv_stop+0xe7/0xf0 [mac80211]
    [   16.132957] Modules linked in: ctr ccm sch_fq_codel xt_tcpudp xt_multiport xt_state iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv4 ip_tables x_tables binfmt_misc joydev mousedev btusb btrtl btintel btbcm bluetooth ecdh_generic ecc iTCO_wdt i2c_dev iwl4965 iwlegacy coretemp snd_hda_codec_analog pcspkr psmouse mac80211 snd_hda_codec_generic libarc4 sdhci_pci cqhci sha256_generic sdhci libsha256 firewire_ohci snd_hda_intel snd_intel_dspcfg mmc_core snd_hda_codec snd_hwdep firewire_core led_class iosf_mbi snd_hda_core uhci_hcd lpc_ich crc_itu_t cfg80211 ehci_pci ehci_hcd snd_pcm usbcore mfd_core rfkill snd_timer snd usb_common soundcore video parport_pc parport intel_agp wmi intel_gtt backlight e1000e agpgart evdev
    [   16.133014] CPU: 0 UID: 0 PID: 181 Comm: kworker/u8:6 Tainted: G        W          6.11.0-cl+ #143
    [   16.133018] Tainted: [W]=WARN
    [   16.133019] Hardware name: Hewlett-Packard HP Compaq 6910p/30BE, BIOS 68MCU Ver. F.19 07/06/2010
    [   16.133021] Workqueue: async async_run_entry_fn
    [   16.133025] RIP: 0010:drv_stop+0xe7/0xf0 [mac80211]
    [   16.133048] Code: 48 85 c0 74 0e 48 8b 78 08 89 ea 48 89 de e8 e0 87 04 00 65 ff 0d d1 de c4 5f 0f 85 42 ff ff ff e8 be 52 c2 e0 e9 38 ff ff ff <0f> 0b 5b 5d c3 0f 1f 40 00 41 54 49 89 fc 55 53 48 89 f3 2e 2e 2e
    [   16.133050] RSP: 0018:ffffc9000029fc50 EFLAGS: 00010246
    [   16.133053] RAX: 0000000000000000 RBX: ffff8880072008e0 RCX: ffff88800377f6c0
    [   16.133054] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8880072008e0
    [   16.133056] RBP: 0000000000000000 R08: ffffffff81f238d8 R09: 0000000000000000
    [   16.133058] R10: ffff8880080520f0 R11: 0000000000000000 R12: ffff888008051c60
    [   16.133060] R13: ffff8880072008e0 R14: 0000000000000000 R15: ffff8880072011d8
    [   16.133061] FS:  0000000000000000(0000) GS:ffff88807c200000(0000) knlGS:0000000000000000
    [   16.133063] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   16.133065] CR2: 000055dd43786c08 CR3: 000000000978f000 CR4: 00000000000006f0
    [   16.133067] Call Trace:
    [   16.133069]  <TASK>
    [   16.133070]  ? __warn+0x77/0x120
    [   16.133075]  ? drv_stop+0xe7/0xf0 [mac80211]
    [   16.133098]  ? report_bug+0x15c/0x190
    [   16.133100]  ? handle_bug+0x36/0x70
    [   16.133103]  ? exc_invalid_op+0x13/0x60
    [   16.133105]  ? asm_exc_invalid_op+0x16/0x20
    [   16.133109]  ? drv_stop+0xe7/0xf0 [mac80211]
    [   16.133132]  ieee80211_do_stop+0x55a/0x810 [mac80211]
    [   16.133161]  ? fq_codel_reset+0xa5/0xc0 [sch_fq_codel]
    [   16.133164]  ieee80211_stop+0x4f/0x180 [mac80211]
    [   16.133192]  __dev_close_many+0xa2/0x120
    [   16.133195]  dev_close_many+0x90/0x150
    [   16.133198]  dev_close+0x5d/0x80
    [   16.133200]  cfg80211_shutdown_all_interfaces+0x40/0xe0 [cfg80211]
    [   16.133223]  wiphy_resume+0xb2/0x1a0 [cfg80211]
    [   16.133247]  ? wiphy_suspend+0x2a0/0x2a0 [cfg80211]
    [   16.133269]  dpm_run_callback+0x75/0x1b0
    [   16.133273]  device_resume+0x97/0x200
    [   16.133277]  async_resume+0x14/0x20
    [   16.133280]  async_run_entry_fn+0x1b/0xa0
    [   16.133283]  process_one_work+0x13d/0x350
    [   16.133287]  worker_thread+0x2be/0x3d0
    [   16.133290]  ? cancel_delayed_work_sync+0x70/0x70
    [   16.133294]  kthread+0xc0/0xf0
    [   16.133296]  ? kthread_park+0x80/0x80
    [   16.133299]  ret_from_fork+0x28/0x40
    [   16.133302]  ? kthread_park+0x80/0x80
    [   16.133304]  ret_from_fork_asm+0x11/0x20
    [   16.133307]  </TASK>
    [   16.133308] ---[ end trace 0000000000000000 ]---
    [   16.133335] ieee80211 phy0: PM: dpm_run_callback(): wiphy_resume [cfg80211] returns -110
    [   16.133360] ieee80211 phy0: PM: failed to restore async: error -110
    
    Cc: stable@vger.kernel.org
    Cc: Stanislaw Gruszka <stf_xl@wp.pl>
    Cc: Kalle Valo <kvalo@kernel.org>
    Cc: linux-wireless@vger.kernel.org
    Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Acked-by: Stanislaw Gruszka <stf_xl@wp.pl>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://patch.msgid.link/20241001200745.8276-1-ville.syrjala@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

wifi: iwlegacy: Fix "field-spanning write" warning in il_enqueue_hcmd() [+ + +]
Author: Ben Hutchings <ben@decadent.org.uk>
Date:   Thu Sep 12 01:01:21 2024 +0200

    wifi: iwlegacy: Fix "field-spanning write" warning in il_enqueue_hcmd()
    
    [ Upstream commit d4cdc46ca16a5c78b36c5b9b6ad8cac09d6130a0 ]
    
    iwlegacy uses command buffers with a payload size of 320
    bytes (default) or 4092 bytes (huge).  The struct il_device_cmd type
    describes the default buffers and there is no separate type describing
    the huge buffers.
    
    The il_enqueue_hcmd() function works with both default and huge
    buffers, and has a memcpy() to the buffer payload.  The size of
    this copy may exceed 320 bytes when using a huge buffer, which
    now results in a run-time warning:
    
        memcpy: detected field-spanning write (size 1014) of single field "&out_cmd->cmd.payload" at drivers/net/wireless/intel/iwlegacy/common.c:3170 (size 320)
    
    To fix this:
    
    - Define a new struct type for huge buffers, with a correctly sized
      payload field
    - When using a huge buffer in il_enqueue_hcmd(), cast the command
      buffer pointer to that type when looking up the payload field
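    
    The two steps above can be sketched in isolation. This is a minimal
    user-space illustration, not the driver's code: the struct names, the
    header layout and fill_payload() are hypothetical, and only the 320 and
    4092 byte payload sizes come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Payload sizes quoted in the commit message. */
#define DEF_PAYLOAD_SIZE  320
#define HUGE_PAYLOAD_SIZE 4092

/* Hypothetical header; the real driver's header layout is not shown here. */
struct cmd_header {
	uint8_t  cmd;
	uint8_t  flags;
	uint16_t sequence;
};

/* Describes a default-sized command buffer. */
struct device_cmd {
	struct cmd_header hdr;
	uint8_t payload[DEF_PAYLOAD_SIZE];
};

/* Separate type for huge buffers, so sizeof(payload) matches reality. */
struct device_cmd_huge {
	struct cmd_header hdr;
	uint8_t payload[HUGE_PAYLOAD_SIZE];
};

/* Viewing one buffer through either type only works if the payload
 * starts at the same offset in both. */
static_assert(offsetof(struct device_cmd, payload) ==
	      offsetof(struct device_cmd_huge, payload),
	      "payload offset must match in both buffer types");

/* Copy len bytes into the payload of buf, selecting the struct type that
 * matches the buffer size. Returns 0 on success, -1 if len does not fit. */
static int fill_payload(void *buf, int is_huge, const void *data, size_t len)
{
	if (is_huge) {
		struct device_cmd_huge *cmd = buf;

		if (len > sizeof(cmd->payload))
			return -1;
		memcpy(cmd->payload, data, len);
	} else {
		struct device_cmd *cmd = buf;

		if (len > sizeof(cmd->payload))
			return -1;
		memcpy(cmd->payload, data, len);
	}
	return 0;
}
```

    With the destination typed this way, a 1014-byte copy into a huge
    buffer stays inside the declared payload field, which is the condition
    the fortified memcpy() checks.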
    
    Reported-by: Martin-Éric Racine <martin-eric.racine@iki.fi>
    References: https://bugs.debian.org/1062421
    References: https://bugzilla.kernel.org/show_bug.cgi?id=219124
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    Fixes: 54d9469bc515 ("fortify: Add run-time WARN for cross-field memcpy()")
    Tested-by: Martin-Éric Racine <martin-eric.racine@iki.fi>
    Tested-by: Brandon Nielsen <nielsenb@jetfuse.net>
    Acked-by: Stanislaw Gruszka <stf_xl@wp.pl>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://patch.msgid.link/ZuIhQRi/791vlUhE@decadent.org.uk
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: mvm: disconnect station vifs if recovery failed [+ + +]
Author: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Date:   Sun Jan 28 08:53:56 2024 +0200

    wifi: iwlwifi: mvm: disconnect station vifs if recovery failed
    
    [ Upstream commit e50a88e5cb8792cc416866496288c5f4d1eb4b1f ]
    
    This allows the station to reconnect immediately instead of leaving
    the connection in a limbo state.
    
    Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
    Reviewed-by: Gregory Greenman <gregory.greenman@intel.com>
    Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
    Link: https://msgid.link/20240128084842.e90531cd3a36.Iebdc9483983c0d8497f9dcf9d79ec37332a5fdcc@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Stable-dep-of: 07a6e3b78a65 ("wifi: iwlwifi: mvm: Fix response handling in iwl_mvm_send_recovery_cmd()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: mvm: fix 6 GHz scan construction [+ + +]
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Wed Oct 23 09:17:44 2024 +0200

    wifi: iwlwifi: mvm: fix 6 GHz scan construction
    
    commit 7245012f0f496162dd95d888ed2ceb5a35170f1a upstream.
    
    If more than 255 colocated APs exist for the set of all
    APs found during 2.4/5 GHz scanning, then the 6 GHz scan
    construction will loop forever: the loop variable has
    type u8, so it can never reach the number of APs found,
    which is stored in a u32 variable, when that number is
    bigger than 255. Also move the loop variable into the
    loops to give it a smaller scope.
    
    Using a u32 there is fine: we limit the number of APs in
    the scan list, and each has a limit on the number of RNR
    entries due to the frame size. With a limit of 1000 scan
    results, a frame size upper bound of 4096 (really it's
    more like ~2300) and a TBTT entry size of at least 11,
    we get an upper bound of ~372k entries, well within
    the bounds of a u32.
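    
    A small user-space sketch of the counter problem and of the bound
    arithmetic. The function names are hypothetical and the cap parameter
    exists only so that the demonstration returns instead of spinning
    forever; it is not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Count iterations of `for (i = 0; i < n; i++)` with an 8-bit counter.
 * An 8-bit counter wraps from 255 back to 0, so for n > 255 the condition
 * i < n never becomes false; cap stops the demonstration. */
static uint32_t steps_with_u8(uint32_t n, uint32_t cap)
{
	uint32_t steps = 0;

	assert(cap > 0);
	for (uint8_t i = 0; i < n; i++) {
		if (++steps == cap)
			break;
	}
	return steps;
}

/* Same loop with a 32-bit counter: it ends after n iterations. */
static uint32_t steps_with_u32(uint32_t n, uint32_t cap)
{
	uint32_t steps = 0;

	assert(cap > 0);
	for (uint32_t i = 0; i < n; i++) {
		if (++steps == cap)
			break;
	}
	return steps;
}

/* Upper bound from the text: 1000 scan results, frames of at most
 * 4096 bytes, TBTT entries of at least 11 bytes each. */
static uint32_t rnr_entries_upper_bound(void)
{
	return 1000u * 4096u / 11u;
}
```

    For n = 300 the u8 variant only stops because of the cap, while the
    u32 variant stops after 300 iterations. The bound evaluates to 372363,
    matching the ~372k figure.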
    
    Cc: stable@vger.kernel.org
    Fixes: eae94cf82d74 ("iwlwifi: mvm: add support for 6GHz")
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219375
    Link: https://patch.msgid.link/20241023091744.f4baed5c08a1.I8b417148bbc8c5d11c101e1b8f5bf372e17bf2a7@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

wifi: iwlwifi: mvm: Fix response handling in iwl_mvm_send_recovery_cmd() [+ + +]
Author: Daniel Gabay <daniel.gabay@intel.com>
Date:   Thu Oct 10 14:05:05 2024 +0300

    wifi: iwlwifi: mvm: Fix response handling in iwl_mvm_send_recovery_cmd()
    
    [ Upstream commit 07a6e3b78a65f4b2796a8d0d4adb1a15a81edead ]
    
    1. The size of the response packet is not validated.
    2. The response buffer is not freed.
    
    Resolve these issues by switching to iwl_mvm_send_cmd_status(),
    which handles both size validation and frees the buffer.
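    
    The pattern of validating the response size and always freeing the
    buffer can be sketched as follows. This is a generic user-space
    illustration under stated assumptions: struct resp_pkt, make_resp() and
    consume_status() are hypothetical and are not the iwlwifi API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical response packet: a length followed by the payload bytes. */
struct resp_pkt {
	size_t  len;	/* number of valid bytes in data[] */
	uint8_t data[];
};

/* Test helper: build a heap-allocated response holding len bytes. */
static struct resp_pkt *make_resp(const void *data, size_t len)
{
	struct resp_pkt *pkt = malloc(sizeof(*pkt) + len);

	if (!pkt)
		return NULL;
	pkt->len = len;
	if (len)
		memcpy(pkt->data, data, len);
	return pkt;
}

/* Read a 32-bit status from pkt. The status is copied out only if the
 * response has exactly the expected size, and the packet is freed on
 * every path. Returns 0 on success, -1 otherwise. */
static int consume_status(struct resp_pkt *pkt, uint32_t *status)
{
	int ret = -1;

	assert(status != NULL);
	if (pkt && pkt->len == sizeof(*status)) {
		memcpy(status, pkt->data, sizeof(*status));
		ret = 0;
	}
	free(pkt);	/* free(NULL) is a no-op */
	return ret;
}
```

    Folding both duties into one helper means a caller cannot forget
    either of them, which is the benefit the switch described above is
    after.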
    
    Fixes: f130bb75d881 ("iwlwifi: add FW recovery flow")
    Signed-off-by: Daniel Gabay <daniel.gabay@intel.com>
    Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
    Link: https://patch.msgid.link/20241010140328.76c73185951e.Id3b6ca82ced2081f5ee4f33c997491d0ebda83f7@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mac80211: do not pass a stopped vif to the driver in .get_txpower [+ + +]
Author: Felix Fietkau <nbd@nbd.name>
Date:   Wed Oct 2 11:56:30 2024 +0200

    wifi: mac80211: do not pass a stopped vif to the driver in .get_txpower
    
    commit 393b6bc174b0dd21bb2a36c13b36e62fc3474a23 upstream.
    
    Avoid potentially crashing in the driver because of uninitialized private data.
    
    Fixes: 5b3dc42b1b0d ("mac80211: add support for driver tx power reporting")
    Cc: stable@vger.kernel.org
    Signed-off-by: Felix Fietkau <nbd@nbd.name>
    Link: https://patch.msgid.link/20241002095630.22431-1-nbd@nbd.name
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

wifi: mac80211: fix NULL dereference at band check in starting tx ba session [+ + +]
Author: Zong-Zhe Yang <kevin_yang@realtek.com>
Date:   Mon Jun 17 19:52:17 2024 +0800

    wifi: mac80211: fix NULL dereference at band check in starting tx ba session
    
    commit 021d53a3d87eeb9dbba524ac515651242a2a7e3b upstream.
    
    In an MLD connection, link_data/link_conf are dynamically allocated. They
    don't point to vif->bss_conf. So, there will be no chanreq assigned to
    vif->bss_conf and then the chan will be NULL. Tweak the code to check
    ht_supported/vht_supported/has_he/has_eht on sta deflink.
    
    Crash log (with rtw89 version under MLO development):
    [ 9890.526087] BUG: kernel NULL pointer dereference, address: 0000000000000000
    [ 9890.526102] #PF: supervisor read access in kernel mode
    [ 9890.526105] #PF: error_code(0x0000) - not-present page
    [ 9890.526109] PGD 0 P4D 0
    [ 9890.526114] Oops: 0000 [#1] PREEMPT SMP PTI
    [ 9890.526119] CPU: 2 PID: 6367 Comm: kworker/u16:2 Kdump: loaded Tainted: G           OE      6.9.0 #1
    [ 9890.526123] Hardware name: LENOVO 2356AD1/2356AD1, BIOS G7ETB3WW (2.73 ) 11/28/2018
    [ 9890.526126] Workqueue: phy2 rtw89_core_ba_work [rtw89_core]
    [ 9890.526203] RIP: 0010:ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211
    [ 9890.526279] Code: f7 e8 d5 93 3e ea 48 83 c4 28 89 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 49 8b 84 24 e0 f1 ff ff 48 8b 80 90 1b 00 00 <83> 38 03 0f 84 37 fe ff ff bb ea ff ff ff eb cc 49 8b 84 24 10 f3
    All code
    ========
       0:   f7 e8                   imul   %eax
       2:   d5                      (bad)
       3:   93                      xchg   %eax,%ebx
       4:   3e ea                   ds (bad)
       6:   48 83 c4 28             add    $0x28,%rsp
       a:   89 d8                   mov    %ebx,%eax
       c:   5b                      pop    %rbx
       d:   41 5c                   pop    %r12
       f:   41 5d                   pop    %r13
      11:   41 5e                   pop    %r14
      13:   41 5f                   pop    %r15
      15:   5d                      pop    %rbp
      16:   c3                      retq
      17:   cc                      int3
      18:   cc                      int3
      19:   cc                      int3
      1a:   cc                      int3
      1b:   49 8b 84 24 e0 f1 ff    mov    -0xe20(%r12),%rax
      22:   ff
      23:   48 8b 80 90 1b 00 00    mov    0x1b90(%rax),%rax
      2a:*  83 38 03                cmpl   $0x3,(%rax)              <-- trapping instruction
      2d:   0f 84 37 fe ff ff       je     0xfffffffffffffe6a
      33:   bb ea ff ff ff          mov    $0xffffffea,%ebx
      38:   eb cc                   jmp    0x6
      3a:   49                      rex.WB
      3b:   8b                      .byte 0x8b
      3c:   84 24 10                test   %ah,(%rax,%rdx,1)
      3f:   f3                      repz
    
    Code starting with the faulting instruction
    ===========================================
       0:   83 38 03                cmpl   $0x3,(%rax)
       3:   0f 84 37 fe ff ff       je     0xfffffffffffffe40
       9:   bb ea ff ff ff          mov    $0xffffffea,%ebx
       e:   eb cc                   jmp    0xffffffffffffffdc
      10:   49                      rex.WB
      11:   8b                      .byte 0x8b
      12:   84 24 10                test   %ah,(%rax,%rdx,1)
      15:   f3                      repz
    [ 9890.526285] RSP: 0018:ffffb8db09013d68 EFLAGS: 00010246
    [ 9890.526291] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9308e0d656c8
    [ 9890.526295] RDX: 0000000000000000 RSI: ffffffffab99460b RDI: ffffffffab9a7685
    [ 9890.526300] RBP: ffffb8db09013db8 R08: 0000000000000000 R09: 0000000000000873
    [ 9890.526304] R10: ffff9308e0d64800 R11: 0000000000000002 R12: ffff9308e5ff6e70
    [ 9890.526308] R13: ffff930952500e20 R14: ffff9309192a8c00 R15: 0000000000000000
    [ 9890.526313] FS:  0000000000000000(0000) GS:ffff930b4e700000(0000) knlGS:0000000000000000
    [ 9890.526316] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 9890.526318] CR2: 0000000000000000 CR3: 0000000391c58005 CR4: 00000000001706f0
    [ 9890.526321] Call Trace:
    [ 9890.526324]  <TASK>
    [ 9890.526327] ? show_regs (arch/x86/kernel/dumpstack.c:479)
    [ 9890.526335] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
    [ 9890.526340] ? page_fault_oops (arch/x86/mm/fault.c:713)
    [ 9890.526347] ? search_module_extables (kernel/module/main.c:3256 (discriminator 3))
    [ 9890.526353] ? ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211
    
    Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
    Link: https://patch.msgid.link/20240617115217.22344-1-kevin_yang@realtek.com
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Xiangyu Chen <xiangyu.chen@windriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

wifi: mac80211: skip non-uploaded keys in ieee80211_iter_keys [+ + +]
Author: Felix Fietkau <nbd@nbd.name>
Date:   Sun Oct 6 17:36:30 2024 +0200

    wifi: mac80211: skip non-uploaded keys in ieee80211_iter_keys
    
    [ Upstream commit 52009b419355195912a628d0a9847922e90c348c ]
    
    Sync iterator conditions with ieee80211_iter_keys_rcu.
    
    Fixes: 830af02f24fb ("mac80211: allow driver to iterate keys")
    Signed-off-by: Felix Fietkau <nbd@nbd.name>
    Link: https://patch.msgid.link/20241006153630.87885-1-nbd@nbd.name
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
x86/bugs: Use code segment selector for VERW operand [+ + +]
Author: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Date:   Thu Sep 26 09:10:31 2024 -0700

    x86/bugs: Use code segment selector for VERW operand
    
    commit e4d2102018542e3ae5e297bc6e229303abff8a0f upstream.
    
    Robert Gill reported the #GP below in 32-bit mode when the dosemu software
    was executing the vm86() system call:
    
      general protection fault: 0000 [#1] PREEMPT SMP
      CPU: 4 PID: 4610 Comm: dosemu.bin Not tainted 6.6.21-gentoo-x86 #1
      Hardware name: Dell Inc. PowerEdge 1950/0H723K, BIOS 2.7.0 10/30/2010
      EIP: restore_all_switch_stack+0xbe/0xcf
      EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
      ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: ff8affdc
      DS: 0000 ES: 0000 FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010046
      CR0: 80050033 CR2: 00c2101c CR3: 04b6d000 CR4: 000406d0
      Call Trace:
       show_regs+0x70/0x78
       die_addr+0x29/0x70
       exc_general_protection+0x13c/0x348
       exc_bounds+0x98/0x98
       handle_exception+0x14d/0x14d
       exc_bounds+0x98/0x98
       restore_all_switch_stack+0xbe/0xcf
       exc_bounds+0x98/0x98
       restore_all_switch_stack+0xbe/0xcf
    
    This only happens in 32-bit mode when VERW based mitigations like MDS/RFDS
    are enabled. This is because segment registers with an arbitrary user value
    can result in #GP when executing VERW. Intel SDM vol. 2C documents the
    following behavior for VERW instruction:
    
      #GP(0) - If a memory operand effective address is outside the CS, DS, ES,
               FS, or GS segment limit.
    
    The CLEAR_CPU_BUFFERS macro executes the VERW instruction before returning
    to user space. Use the %cs selector to reference the VERW operand. This
    ensures VERW will not #GP for an arbitrary user %ds.
    
    [ mingo: Fixed the SOB chain. ]
    
    Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
    Reported-by: Robert Gill <rtgill82@gmail.com>
    Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Cc: stable@vger.kernel.org # 5.10+
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
    Closes: https://lore.kernel.org/all/8c77ccfd-d561-45a1-8ed5-6b75212c7a58@leemhuis.info/
    Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
    Suggested-by: Brian Gerst <brgerst@gmail.com>
    Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
xhci: Fix Link TRB DMA in command ring stopped completion event [+ + +]
Author: Faisal Hassan <quic_faisalh@quicinc.com>
Date:   Tue Oct 22 21:26:31 2024 +0530

    xhci: Fix Link TRB DMA in command ring stopped completion event
    
    commit 075919f6df5dd82ad0b1894898b315fbb3c29b84 upstream.
    
    During the aborting of a command, the software receives a command
    completion event for the command ring stopped, with the TRB pointing
    to the next TRB after the aborted command.
    
    If the command we abort is located just before the Link TRB in the
    command ring, then during the 'command ring stopped' completion event,
    the xHC gives the Link TRB in the event's cmd DMA, which causes a
    mismatch in handling the command completion event.
    
    To address this situation, move the 'command ring stopped' completion
    event check slightly earlier, since the specific command it stopped
    on isn't of significant concern.
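    
    The effect of moving the check earlier can be sketched like this. It is
    a simplified model, not the xhci driver code: the function, the result
    enum and the DMA addresses are hypothetical, and the completion code
    values are included for illustration only.

```c
#include <stdint.h>

/* Completion codes used in this sketch (illustrative values). */
enum comp_code {
	COMP_SUCCESS = 1,
	COMP_COMMAND_RING_STOPPED = 24,
};

/* Outcome of handling one command completion event. */
enum evt_result {
	EVT_RING_STOPPED,	/* abort finished, no specific command matched */
	EVT_COMMAND_DONE,	/* event matched the expected command TRB */
	EVT_DMA_MISMATCH,	/* event points at an unexpected TRB */
};

/* Handle a command completion event. The 'ring stopped' code is checked
 * before the DMA comparison, so an event whose DMA points at a Link TRB
 * is not reported as a mismatch. */
static enum evt_result handle_cmd_completion(uint64_t event_dma,
					     enum comp_code code,
					     uint64_t expected_cmd_dma)
{
	if (code == COMP_COMMAND_RING_STOPPED)
		return EVT_RING_STOPPED;

	if (event_dma != expected_cmd_dma)
		return EVT_DMA_MISMATCH;

	return EVT_COMMAND_DONE;
}
```

    With the comparison done first instead, the Link TRB case would take
    the mismatch path even though the ring stopped as requested.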
    
    Fixes: 7f84eef0dafb ("USB: xhci: No-op command queueing and irq handler.")
    Cc: stable@vger.kernel.org
    Signed-off-by: Faisal Hassan <quic_faisalh@quicinc.com>
    Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
    Link: https://lore.kernel.org/r/20241022155631.1185-1-quic_faisalh@quicinc.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xhci: Use pm_runtime_get to prevent RPM on unsupported systems [+ + +]
Author: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Date:   Thu Oct 24 19:07:18 2024 +0530

    xhci: Use pm_runtime_get to prevent RPM on unsupported systems
    
    commit 31004740e42846a6f0bb255e6348281df3eb8032 upstream.
    
    Use pm_runtime_put in the remove function and pm_runtime_get to disable
    RPM on platforms that don't support runtime D3, as re-enabling it through
    sysfs auto power control may cause the controller to malfunction. This
    can lead to issues such as hotplug devices not being detected due to
    failed interrupt generation.
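    
    A toy model of why holding a usage reference keeps runtime PM off even
    if user space switches power control to "auto". All names below are
    hypothetical and the model is an assumption-laden simplification: a
    device may runtime-suspend only when auto control is enabled and its
    usage count is zero.

```c
#include <stdbool.h>

/* Simplified per-device runtime PM state. */
struct dev_pm {
	int  usage_count;	/* outstanding "keep active" references */
	bool sysfs_auto;	/* user space selected automatic control */
};

static void pm_get(struct dev_pm *pm) { pm->usage_count++; }
static void pm_put(struct dev_pm *pm) { pm->usage_count--; }

/* Runtime suspend is possible only with auto control and no references. */
static bool may_runtime_suspend(const struct dev_pm *pm)
{
	return pm->sysfs_auto && pm->usage_count == 0;
}

/* Probe: on platforms without runtime D3 support, pin the device active. */
static void probe_dev(struct dev_pm *pm, bool supports_d3)
{
	if (!supports_d3)
		pm_get(pm);
}

/* Remove: drop the reference taken in probe so the count stays balanced. */
static void remove_dev(struct dev_pm *pm, bool supports_d3)
{
	if (!supports_d3)
		pm_put(pm);
}
```

    In this model, setting sysfs_auto after probe_dev() on a platform
    without D3 support still leaves may_runtime_suspend() false, because
    the reference taken in probe is held until remove.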
    
    Fixes: a5d6264b638e ("xhci: Enable RPM on controllers that support low-power states")
    Cc: stable <stable@kernel.org>
    Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
    Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
    Link: https://lore.kernel.org/r/20241024133718.723846-1-Basavaraj.Natikar@amd.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>