-
Notifications
You must be signed in to change notification settings - Fork 12
build(deps): bump certifi from 2023.7.22 to 2024.7.4 in /drivers/gpu/drm/ci/xfails #9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
dependabot
wants to merge
37
commits into
master
Choose a base branch
from
dependabot/pip/drivers/gpu/drm/ci/xfails/certifi-2024.7.4
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
build(deps): bump certifi from 2023.7.22 to 2024.7.4 in /drivers/gpu/drm/ci/xfails #9
dependabot
wants to merge
37
commits into
master
from
dependabot/pip/drivers/gpu/drm/ci/xfails/certifi-2024.7.4
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Commit 343f4c4 ("kthread: Don't allocate kthread_struct for init and umh") introduces a new function user_mode_thread() for init and umh. init and umh are different from typical kernel threads since the don't need a "kthread" struct and they will finally become user processes by calling kernel_execve(), but on the other hand, they are also different from typical user mode threads (they have no "mm" structs at creation time, which is traditionally used to distinguish a user thread and a kernel thread). So I think it is reasonable to treat init and umh as "special kernel threads". Then let's unify the kernel_thread() and user_mode_thread() to kernel_thread() again, and add a new 'user' parameter for init and umh. This also makes code simpler. Signed-off-by: Huacai Chen <[email protected]>
…rqs() Adjust the return value semanteme of msi_domain_prepare_irqs(), which allows us to modify the input nvec by overriding the msi_domain_ops:: msi_prepare(). This is necessary for the later patch. Before: 0 on success, others on error. After: = 0: Success; > 0: The modified nvec; < 0: Error code. Callers are also updated. Signed-off-by: Huacai Chen <[email protected]>
Loongson machines can have as many as 256 logical cpus, but the maximum of msi vectors in one irqchip is also 256 (practically that is less than 256, because pch-pic consumes some of them). Even on a 64-core machine, 256 irqs can be easily exhausted if there are several NICs (NICs usually allocate msi irqs depending on the number of online cpus). So we want to limit the msi allocation. In this patch we add a machanism to limit msi allocation: 1, Modify input "nvec" by overriding the msi_domain_ops::msi_prepare(); 2, The default limit is 256, which is compatible with the old behavior; 3, Add a cmdline parameter "loongson_msi_limit=xxx" to control the limit. Signed-off-by: Juxin Gao <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
On multiple bridge platform, a MSIX vector is often affinitive with only one cpu, and the count of MSIX is determined as the count of cpus in the system. Unfortunately, the cpu group related to a brigde is only allowed to handle interrupts from devices behind the bridge, which breaks the normal affinity setting for multiple MSIX vectors, and causing following affinity setting: IRQ: 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 CPU: 0, 1, 2, 3, 4, 0, 0, 0, 0, 0 To balance the affinity, we improve the setting to assign cpu for IRQ as following: IRQ: 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 CPU: 0, 1, 2, 3, 4, 0, 1, 2, 3, 4 Signed-off-by: Jianmin Lv <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
When the best selected CPU is offline, work_on_cpu() will stuck forever. This can be happen if a node is online while all its CPUs are offline (we can use "maxcpus=1" without "nr_cpus=1" to reproduce it), Therefore, in this case, we should call local_pci_probe() instead of work_on_cpu(). Cc: <[email protected]> Signed-off-by: Huacai Chen <[email protected]> Signed-off-by: Hongchen Zhang <[email protected]>
LS7A chipset can be used as a downstream bridge which connected to a high-level host bridge. In this case DEV_LS7A_PCIE_PORT5 is used as the upward port. We should always enable MSI caps of this port, otherwise downstream devices cannot use MSI. Cc: <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
Don't limit MRRS during resume, so that saved value can be restored, otherwise the MRRS will become the minimum value after resume. Cc: <[email protected]> Fixes: 8b3517f ("PCI: loongson: Prevent LS7A MRRS increases") Signed-off-by: Jianmin Lv <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
Chromium sandbox apparently wants to deny statx [1] so it could properly inspect arguments after the sandboxed process later falls back to fstat. Because there's currently not a "fd-only" version of statx, so that the sandbox has no way to ensure the path argument is empty without being able to peek into the sandboxed process's memory. For architectures able to do newfstatat though, glibc falls back to newfstatat after getting -ENOSYS for statx, then the respective SIGSYS handler [2] takes care of inspecting the path argument, transforming allowed newfstatat's into fstat instead which is allowed and has the same type of return value. But, as LoongArch is the first architecture to not have fstat nor newfstatat, the LoongArch glibc does not attempt falling back at all when it gets -ENOSYS for statx -- and you see the problem there! Actually, back when the LoongArch port was under review, people were aware of the same problem with sandboxing clone3 [3], so clone was eventually kept. Unfortunately it seemed at that time no one had noticed statx, so besides restoring fstat/newfstatat to LoongArch uapi (and postponing the problem further), it seems inevitable that we would need to tackle seccomp deep argument inspection. However, this is obviously a decision that shouldn't be taken lightly, so we just restore fstat/newfstatat by defining __ARCH_WANT_NEW_STAT in unistd.h. This is the simplest solution for now, and so we hope the community will tackle the long-standing problem of seccomp deep argument inspection in the future [4][5]. More infomation please reading this thread [6]. [1] https://chromium-review.googlesource.com/c/chromium/src/+/2823150 [2] https://chromium.googlesource.com/chromium/src/sandbox/+/c085b51940bd/linux/seccomp-bpf-helpers/sigsys_handlers.cc#355 [3] https://lore.kernel.org/linux-arch/[email protected]/ [4] https://lwn.net/Articles/799557/ [5] https://lpc.events/event/4/contributions/560/attachments/397/640/deep-arg-inspection.pdf [6] https://lore.kernel.org/loongarch/20240226-granit-seilschaft-eccc2433014d@brauner/T/#t Cc: [email protected] Signed-off-by: Huacai Chen <[email protected]>
Some drivers want to use cpu_logical_map(), early_cpu_to_node() and some other CPU mapping APIs, even if we use "nr_cpus=1" to hard limit the CPU number. This is strongly required for the multi-bridges machines. Currently, we stop parsing the MADT if the nr_cpus limit is reached, but to achieve the above goal we should always enumerate the MADT table and setup logical-physical CPU mapping whether there is a nr_cpus limit. Rework the MADT enumeration: 1. Define a flag "cpu_enumerated" to distinguish the first enumeration (cpu_enumerated=0) and the physical hotplug case (cpu_enumerated=1) for set_processor_mask(). 2. If cpu_enumerated=0, stop parsing only when NR_CPUS limit is reached, so we can setup logical-physical CPU mapping; if cpu_enumerated=1, stop parsing when nr_cpu_ids limit is reached, so we can avoid some runtime bugs. Once logical-physical CPU mapping is setup, we will let cpu_enumerated=1. 3. Use find_first_zero_bit() instead of cpumask_next_zero() to find the next zero bit (free logical CPU id) in the cpu_present_mask, because cpumask_next_zero() will stop at nr_cpu_ids. 4. Only touch cpu_possible_mask if cpu_enumerated=0, this is in order to avoid some potential crashes, because cpu_possible_mask is marked as __ro_after_init. 5. In prefill_possible_map(), clear cpu_present_mask bits greater than nr_cpu_ids, in order to avoid a CPU be "present" but not "possible". Signed-off-by: Huacai Chen <[email protected]>
Add irq_work support for LoongArch via self IPIs. This make it possible to run works in hardware interrupt context, which is a prerequisite for NOHZ_FULL. Implement: - arch_irq_work_raise() - arch_irq_work_has_interrupt() Reviewed-by: Guo Ren <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
In order for things like get_user_pages() to work on ZONE_DEVICE memory, we need a software PTE bit to identify device-backed PFNs. Hook this up along with the relevant helpers to join in with ARCH_HAS_PTE_DEVMAP. Signed-off-by: Huacai Chen <[email protected]>
Add ARCH_HAS_DEBUG_VM_PGTABLE selection in Kconfig, in order to make corresponding vm debug features usable on LoongArch. Also update the corresponding arch-support.txt document. Signed-off-by: Huacai Chen <[email protected]>
Currently, only TLB-based ioremap() support writecombine, so add the counterpart for DMW-based ioremap() with help of DMW2. The base address (WRITECOMBINE_BASE) is configured as 0xa000000000000000. DMW3 is unused by kernel now, however firmware may leave garbage in them and interfere kernel's address mapping. So clear it as necessary. BTW, centralize the DMW configuration to macro SETUP_DMWINS. Signed-off-by: Jiaxun Yang <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
Most LoongArch 64 machines are using custom "SADR" ACPI extension to perform ACPI S3 sleep. However the standard ACPI way to perform sleep is to write a value to ACPI PM1/SLEEP_CTL register, and this is never supported properly in kernel. Add standard S3 sleep by providing a default DoSuspend function which calls ACPI's acpi_enter_sleep_state() routine when SADR is not provided by the firmware. Also fix suspend assembly code so that ra is set properly before go into sleep routine. (Previously linked address of jirl was set to a0, some firmware do require return address in a0 but it's already set with la.pcrel before). Signed-off-by: Jiaxun Yang <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
Hibernation assumes the memory layout after resume be the same as that before sleep, so it expects the kernel is loaded at the same position. To achieve this goal we automatically disable KASLR if user explicitly requests hibernation via the "resume=" command line. Since "nohibernate" and "noresume" have higher priorities than "resume=", we only disable KASLR if there is no "nohibernate" and "noresume". Signed-off-by: Huacai Chen <[email protected]>
fw_arg1 is in memory space rather than I/O space, so we should use early_memremap_ro() instead of early_ioremap() to map the cmdline. Moreover, we should unmap it after using. Suggested-by: Jiaxun Yang <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
-Zdirect-access-external-data is a new Rust compiler option added in Rust 1.78, which we use to optimize the access of external data in the Linux kernel's Rust code. This patch modifies the Rust code in vmlinux to directly access externa data, using PC-REL instead of GOT. However, Rust code whithin modules is constrained by the PC-REL addressing range and is explicitly set to use an indirect method. Signed-off-by: WANG Rui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
LoongArch defines UPROBE_SWBP_INSN as a function call and this breaks arch_uprobe_trampoline() which uses it to initialize a static variable. Add the new "__builtin_constant_p" helper, __emit_break(), and redefine the current users of larch_insn_gen_break() to use it. Fixes: ff474a7 ("uprobe: Add uretprobe syscall to speed up return probe") Reported-by: Nathan Chancellor <[email protected]> Closes: https://lore.kernel.org/all/20240614174822.GA1185149@thelio-3990X/ Suggested-by: Andrii Nakryiko <[email protected]> Tested-by: Tiezhu Yang <[email protected]> Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Binbin Zhou <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
This add CPU HWMon (temperature sensor) platform driver for Loongson-3. Tested-by: Xi Ruoyao <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
When CONFIG_CPUMASK_OFFSTACK and CONFIG_DEBUG_PER_CPU_MAPS is selected, cpu_max_bits_warn() generates a runtime warning similar as below while we show /proc/cpuinfo. Fix this by using nr_cpu_ids (the runtime limit) instead of NR_CPUS to iterate CPUs. [ 3.052463] ------------[ cut here ]------------ [ 3.059679] WARNING: CPU: 3 PID: 1 at include/linux/cpumask.h:108 show_cpuinfo+0x5e8/0x5f0 [ 3.070072] Modules linked in: efivarfs autofs4 [ 3.076257] CPU: 0 PID: 1 Comm: systemd Not tainted 5.19-rc5+ #1052 [ 3.099465] Stack : 9000000100157b08 9000000000f18530 9000000000cf846c 9000000100154000 [ 3.109127] 9000000100157a50 0000000000000000 9000000100157a58 9000000000ef7430 [ 3.118774] 90000001001578e8 0000000000000040 0000000000000020 ffffffffffffffff [ 3.128412] 0000000000aaaaaa 1ab25f00eec96a37 900000010021de80 900000000101c890 [ 3.138056] 0000000000000000 0000000000000000 0000000000000000 0000000000aaaaaa [ 3.147711] ffff8000339dc220 0000000000000001 0000000006ab4000 0000000000000000 [ 3.157364] 900000000101c998 0000000000000004 9000000000ef7430 0000000000000000 [ 3.167012] 0000000000000009 000000000000006c 0000000000000000 0000000000000000 [ 3.176641] 9000000000d3de08 9000000001639390 90000000002086d8 00007ffff0080286 [ 3.186260] 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1c [ 3.195868] ... [ 3.199917] Call Trace: [ 3.203941] [<90000000002086d8>] show_stack+0x38/0x14c [ 3.210666] [<9000000000cf846c>] dump_stack_lvl+0x60/0x88 [ 3.217625] [<900000000023d268>] __warn+0xd0/0x100 [ 3.223958] [<9000000000cf3c90>] warn_slowpath_fmt+0x7c/0xcc [ 3.231150] [<9000000000210220>] show_cpuinfo+0x5e8/0x5f0 [ 3.238080] [<90000000004f578c>] seq_read_iter+0x354/0x4b4 [ 3.245098] [<90000000004c2e90>] new_sync_read+0x17c/0x1c4 [ 3.252114] [<90000000004c5174>] vfs_read+0x138/0x1d0 [ 3.258694] [<90000000004c55f8>] ksys_read+0x70/0x100 [ 3.265265] [<9000000000cfde9c>] do_syscall+0x7c/0x94 [ 3.271820] [<9000000000202fe4>] handle_syscall+0xc4/0x160 [ 3.281824] ---[ end trace 8b484262b4b8c24c ]--- Cc: [email protected] Signed-off-by: Huacai Chen <[email protected]>
If an NTFS file system is mounted to another system with different
PAGE_SIZE from the original system, log->page_size will change in
log_replay(), but log->page_{mask,bits} don't change correspondingly.
This will cause a panic because "u32 bytes = log->page_size - page_off"
will get a negative value in the later read_log_page().
Cc: [email protected]
Fixes: b46acd6 ("fs/ntfs3: Add NTFS journal")
Signed-off-by: Huacai Chen <[email protected]>
The label end_reply is obviously a typo. It should be "replay" in this context. So rename end_reply to end_replay. Cc: [email protected] Fixes: b46acd6 ("fs/ntfs3: Add NTFS journal") Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
Debug machanism include: gdb, ftrace, kprobe, uprobe and jump_label Signed-off-by: Huacai Chen <[email protected]>
Add driver support for the IOMMU in LS7A. If you need to enable it, please add the loongson_iommu=on kernel parameter. Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
Consider a configuration like this: 1, efifb (or simpledrm) is built-in; 2, a native display driver (such as radeon) is also built-in. As Javier said, this is not a common configuration (the native display driver is usually built as a module), but it can happen and cause some trouble. In this case, since efifb, radeon and sysfb are all in device_initcall() level, the order in practise is like this: efifb registered at first, but no "efi-framebuffer" device yet. radeon registered later, and /dev/fb0 created. sysfb_init() comes at last, it registers "efi-framebuffer" and then causes an error message "efifb: a framebuffer is already registered". Make sysfb_init() to be subsys_ initcall_sync() can avoid this. And Javier Martinez Canillas is trying to make a more general solution in commit 873eb3b ("fbdev: Disable sysfb device registration when removing conflicting FBs"). However, this patch still makes sense because it can make the screen display as early as possible (We cannot move to subsys_initcall, since sysfb_init() should be executed after PCI enumeration). This is a better version of commit 60aebc9 ("drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync") since the previous commit leads to blank displays on some systems. The reason is that vgaarb initialization is also a subsys_initcall_sync function so sysfb_disable() is sometimes missed. So here we move sysfb_init() to an fs_initcall function which is ensured after vgaarb initialization. Signed-off-by: Huacai Chen <[email protected]>
After commit 60aebc9 ("drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync") some Lenovo laptops get a blank screen until the display manager starts. This regression occurs with such a Kconfig combination: CONFIG_SYSFB=y CONFIG_SYSFB_SIMPLEFB=y CONFIG_DRM_SIMPLEDRM=y CONFIG_DRM_I915=y # Or other native drivers such as radeon, amdgpu If replace CONFIG_DRM_SIMPLEDRM with CONFIG_FB_SIMPLE (they use the same device), there is no blank screen. The root cause is the initialization order, and this order depends on the Makefile. FB_SIMPLE is before native DRM drivers (e.g. i915, radeon, amdgpu, and so on), but DRM_SIMPLEDRM is after them. Thus, if we use FB_SIMPLE, I915 will takeover FB_SIMPLE, then no problem; and if we use DRM_SIMPLEDRM, DRM_SIMPLEDRM will try to takeover I915, but fails to work. So we can move the "tiny" directory before native DRM drivers to solve this problem. Fixes: 60aebc9 ("drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync") Closes: https://lore.kernel.org/dri-devel/[email protected]/T/#t Reported-by: Jaak Ristioja <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
Radeon driver can not handle the interrupt is faster than DMA data, so irq handler must update an old ih.rptr value in IH_RB_RPTR register to enable interrupt again when interrupt is faster than DMA data. Signed-off-by: Huacai Chen <[email protected]> Signed-off-by: Zhijie Zhang <[email protected]>
84c620d to
af6528b
Compare
chenhuacai
pushed a commit
that referenced
this pull request
Sep 8, 2025
BUG: kernel NULL pointer dereference, address: 00000000000002ec PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 28 UID: 0 PID: 343 Comm: kworker/28:1 Kdump: loaded Tainted: G OE 6.17.0-rc2+ #9 NONE Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 Workqueue: smc_hs_wq smc_listen_work [smc] RIP: 0010:smc_ib_is_sg_need_sync+0x9e/0xd0 [smc] ... Call Trace: <TASK> smcr_buf_map_link+0x211/0x2a0 [smc] __smc_buf_create+0x522/0x970 [smc] smc_buf_create+0x3a/0x110 [smc] smc_find_rdma_v2_device_serv+0x18f/0x240 [smc] ? smc_vlan_by_tcpsk+0x7e/0xe0 [smc] smc_listen_find_device+0x1dd/0x2b0 [smc] smc_listen_work+0x30f/0x580 [smc] process_one_work+0x18c/0x340 worker_thread+0x242/0x360 kthread+0xe7/0x220 ret_from_fork+0x13a/0x160 ret_from_fork_asm+0x1a/0x30 </TASK> If the software RoCE device is used, ibdev->dma_device is a null pointer. As a result, the problem occurs. Null pointer detection is added to prevent problems. Fixes: 0ef69e7 ("net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu") Signed-off-by: Liu Jian <[email protected]> Reviewed-by: Guangguan Wang <[email protected]> Reviewed-by: Zhu Yanjun <[email protected]> Reviewed-by: D. Wythe <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
chenhuacai
pushed a commit
that referenced
this pull request
Sep 18, 2025
Steven Rostedt reported a crash with "ftrace=function" kernel command line: [ 0.159269] BUG: kernel NULL pointer dereference, address: 000000000000001c [ 0.160254] #PF: supervisor read access in kernel mode [ 0.160975] #PF: error_code(0x0000) - not-present page [ 0.161697] PGD 0 P4D 0 [ 0.162055] Oops: Oops: 0000 [#1] SMP PTI [ 0.162619] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.17.0-rc2-test-00006-g48d06e78b7cb-dirty #9 PREEMPT(undef) [ 0.164141] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 0.165439] RIP: 0010:kmem_cache_alloc_noprof (mm/slub.c:4237) [ 0.166186] Code: 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 e4 f0 48 83 ec 20 8b 05 c9 b6 7e 01 <44> 8b 77 1c 65 4c 8b 2d b5 ea 20 02 4c 89 6c 24 18 41 89 f5 21 f0 [ 0.168811] RSP: 0000:ffffffffb2e03b30 EFLAGS: 00010086 [ 0.169545] RAX: 0000000001fff33f RBX: 0000000000000000 RCX: 0000000000000000 [ 0.170544] RDX: 0000000000002800 RSI: 0000000000002800 RDI: 0000000000000000 [ 0.171554] RBP: ffffffffb2e03b80 R08: 0000000000000004 R09: ffffffffb2e03c90 [ 0.172549] R10: ffffffffb2e03c90 R11: 0000000000000000 R12: 0000000000000000 [ 0.173544] R13: ffffffffb2e03c90 R14: ffffffffb2e03c90 R15: 0000000000000001 [ 0.174542] FS: 0000000000000000(0000) GS:ffff9d2808114000(0000) knlGS:0000000000000000 [ 0.175684] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.176486] CR2: 000000000000001c CR3: 000000007264c001 CR4: 00000000000200b0 [ 0.177483] Call Trace: [ 0.177828] <TASK> [ 0.178123] mas_alloc_nodes (lib/maple_tree.c:176 (discriminator 2) lib/maple_tree.c:1255 (discriminator 2)) [ 0.178692] mas_store_gfp (lib/maple_tree.c:5468) [ 0.179223] execmem_cache_add_locked (mm/execmem.c:207) [ 0.179870] execmem_alloc (mm/execmem.c:213 mm/execmem.c:313 mm/execmem.c:335 mm/execmem.c:475) [ 0.180397] ? ftrace_caller (arch/x86/kernel/ftrace_64.S:169) [ 0.180922] ? __pfx_ftrace_caller (arch/x86/kernel/ftrace_64.S:158) [ 0.181517] execmem_alloc_rw (mm/execmem.c:487) [ 0.182052] arch_ftrace_update_trampoline (arch/x86/kernel/ftrace.c:266 arch/x86/kernel/ftrace.c:344 arch/x86/kernel/ftrace.c:474) [ 0.182778] ? ftrace_caller_op_ptr (arch/x86/kernel/ftrace_64.S:182) [ 0.183388] ftrace_update_trampoline (kernel/trace/ftrace.c:7947) [ 0.184024] __register_ftrace_function (kernel/trace/ftrace.c:368) [ 0.184682] ftrace_startup (kernel/trace/ftrace.c:3048) [ 0.185205] ? __pfx_function_trace_call (kernel/trace/trace_functions.c:210) [ 0.185877] register_ftrace_function_nolock (kernel/trace/ftrace.c:8717) [ 0.186595] register_ftrace_function (kernel/trace/ftrace.c:8745) [ 0.187254] ? __pfx_function_trace_call (kernel/trace/trace_functions.c:210) [ 0.187924] function_trace_init (kernel/trace/trace_functions.c:170) [ 0.188499] tracing_set_tracer (kernel/trace/trace.c:5916 kernel/trace/trace.c:6349) [ 0.189088] register_tracer (kernel/trace/trace.c:2391) [ 0.189642] early_trace_init (kernel/trace/trace.c:11075 kernel/trace/trace.c:11149) [ 0.190204] start_kernel (init/main.c:970) [ 0.190732] x86_64_start_reservations (arch/x86/kernel/head64.c:307) [ 0.191381] x86_64_start_kernel (??:?) [ 0.191955] common_startup_64 (arch/x86/kernel/head_64.S:419) [ 0.192534] </TASK> [ 0.192839] Modules linked in: [ 0.193267] CR2: 000000000000001c [ 0.193730] ---[ end trace 0000000000000000 ]--- The crash happens because on x86 ftrace allocations from execmem require maple tree to be initialized. Move maple tree initialization that depends only on slab availability earlier in boot so that it will happen right after mm_core_init(). Link: https://lkml.kernel.org/r/[email protected] Fixes: 5d79c2b ("x86/ftrace: enable EXECMEM_ROX_CACHE for ftrace allocations") Signed-off-by: Mike Rapoport (Microsoft) <[email protected]> Reported-by: Steven Rostedt (Google) <[email protected]> Tested-by: Steven Rostedt (Google) <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Borislav Betkov <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleinxer <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
d890056 to
3eb6775
Compare
4fe06dd to
f6ea04b
Compare
chenhuacai
pushed a commit
that referenced
this pull request
Oct 7, 2025
Before disabling SR-IOV via config space accesses to the parent PF, sriov_disable() first removes the PCI devices representing the VFs. Since commit 9d16947 ("PCI: Add global pci_lock_rescan_remove()") such removal operations are serialized against concurrent remove and rescan using the pci_rescan_remove_lock. No such locking was ever added in sriov_disable() however. In particular when commit 18f9e9d ("PCI/IOV: Factor out sriov_add_vfs()") factored out the PCI device removal into sriov_del_vfs() there was still no locking around the pci_iov_remove_virtfn() calls. On s390 the lack of serialization in sriov_disable() may cause double remove and list corruption with the below (amended) trace being observed: PSW: 0704c00180000000 0000000c914e4b38 (klist_put+56) GPRS: 000003800313fb48 0000000000000000 0000000100000001 0000000000000001 00000000f9b520a8 0000000000000000 0000000000002fbd 00000000f4cc9480 0000000000000001 0000000000000000 0000000000000000 0000000180692828 00000000818e8000 000003800313fe2c 000003800313fb20 000003800313fad8 #0 [3800313fb20] device_del at c9158ad5c #1 [3800313fb88] pci_remove_bus_device at c915105ba #2 [3800313fbd0] pci_iov_remove_virtfn at c9152f198 #3 [3800313fc28] zpci_iov_remove_virtfn at c90fb67c0 #4 [3800313fc60] zpci_bus_remove_device at c90fb6104 #5 [3800313fca0] __zpci_event_availability at c90fb3dca #6 [3800313fd08] chsc_process_sei_nt0 at c918fe4a2 #7 [3800313fd60] crw_collect_info at c91905822 #8 [3800313fe10] kthread at c90feb390 #9 [3800313fe68] __ret_from_fork at c90f6aa64 #10 [3800313fe98] ret_from_fork at c9194f3f2. This is because in addition to sriov_disable() removing the VFs, the platform also generates hot-unplug events for the VFs. This being the reverse operation to the hotplug events generated by sriov_enable() and handled via pdev->no_vf_scan. And while the event processing takes pci_rescan_remove_lock and checks whether the struct pci_dev still exists, the lack of synchronization makes this checking racy. Other races may also be possible of course though given that this lack of locking persisted so long observable races seem very rare. Even on s390 the list corruption was only observed with certain devices since the platform events are only triggered by config accesses after the removal, so as long as the removal finished synchronously they would not race. Either way the locking is missing so fix this by adding it to the sriov_del_vfs() helper. Just like PCI rescan-remove, locking is also missing in sriov_add_vfs() including for the error case where pci_stop_and_remove_bus_device() is called without the PCI rescan-remove lock being held. Even in the non-error case, adding new PCI devices and buses should be serialized via the PCI rescan-remove lock. Add the necessary locking. Fixes: 18f9e9d ("PCI/IOV: Factor out sriov_add_vfs()") Signed-off-by: Niklas Schnelle <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Benjamin Block <[email protected]> Reviewed-by: Farhan Ali <[email protected]> Reviewed-by: Julian Ruess <[email protected]> Cc: [email protected] Link: https://patch.msgid.link/[email protected]
chenhuacai
pushed a commit
that referenced
this pull request
Oct 8, 2025
Revert commit 1afa706 ("serial: qcom-geni: Enable PM runtime for serial driver") and its dependent commit 86fa39d ("serial: qcom-geni: Enable Serial on SA8255p Qualcomm platforms") because the first one causes regression - hang task on Qualcomm RB1 board (QRB2210) and unable to use serial at all during normal boot: INFO: task kworker/u16:0:12 blocked for more than 42 seconds. Not tainted 6.17.0-rc1-00004-g53e760d89498 #9 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u16:0 state:D stack:0 pid:12 tgid:12 ppid:2 task_flags:0x4208060 flags:0x00000010 Workqueue: async async_run_entry_fn Call trace: __switch_to+0xe8/0x1a0 (T) __schedule+0x290/0x7c0 schedule+0x34/0x118 rpm_resume+0x14c/0x66c rpm_resume+0x2a4/0x66c rpm_resume+0x2a4/0x66c rpm_resume+0x2a4/0x66c __pm_runtime_resume+0x50/0x9c __driver_probe_device+0x58/0x120 driver_probe_device+0x3c/0x154 __driver_attach_async_helper+0x4c/0xc0 async_run_entry_fn+0x34/0xe0 process_one_work+0x148/0x290 worker_thread+0x2c4/0x3e0 kthread+0x118/0x1c0 ret_from_fork+0x10/0x20 The issue was reported on 12th of August and was ignored by author of commits introducing issue for two weeks. Only after complaining author produced a fix which did not work, so if original commits cannot be reliably fixed for 5 weeks, they obviously are buggy and need to be dropped. Fixes: 1afa706 ("serial: qcom-geni: Enable PM runtime for serial driver") Reported-by: Alexey Klimov <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Krzysztof Kozlowski <[email protected]> Tested-by: Alexey Klimov <[email protected]> Reviewed-by: Alexey Klimov <[email protected]> Reviewed-by: Bryan O'Donoghue <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
chenhuacai
pushed a commit
that referenced
this pull request
Oct 11, 2025
The test starts a workload and then opens events. If the events fail
to open, for example because of perf_event_paranoid, the gopipe of the
workload is leaked and the file descriptor leak check fails when the
test exits. To avoid this cancel the workload when opening the events
fails.
Before:
```
$ perf test -vv 7
7: PERF_RECORD_* events & perf_sample fields:
--- start ---
test child forked, pid 1189568
Using CPUID GenuineIntel-6-B7-1
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8
sys_perf_event_open failed, error -13
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
exclude_kernel 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8
sys_perf_event_open failed, error -13
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
exclude_kernel 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
Attempt to add: software/cpu-clock/
..after resolving event: software/config=0/
cpu-clock -> software/cpu-clock/
------------------------------------------------------------
perf_event_attr:
type 1 (PERF_TYPE_SOFTWARE)
size 136
config 0x9 (PERF_COUNT_SW_DUMMY)
sample_type IP|TID|TIME|CPU
read_format ID|LOST
disabled 1
inherit 1
mmap 1
comm 1
enable_on_exec 1
task 1
sample_id_all 1
mmap2 1
comm_exec 1
ksymbol 1
bpf_event 1
{ wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid 1189569 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open failed, error -13
perf_evlist__open: Permission denied
---- end(-2) ----
Leak of file descriptor 6 that opened: 'pipe:[14200347]'
---- unexpected signal (6) ----
iFailed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
#0 0x565358f6666e in child_test_sig_handler builtin-test.c:311
#1 0x7f29ce849df0 in __restore_rt libc_sigaction.c:0
#2 0x7f29ce89e95c in __pthread_kill_implementation pthread_kill.c:44
#3 0x7f29ce849cc2 in raise raise.c:27
#4 0x7f29ce8324ac in abort abort.c:81
#5 0x565358f662d4 in check_leaks builtin-test.c:226
#6 0x565358f6682e in run_test_child builtin-test.c:344
#7 0x565358ef7121 in start_command run-command.c:128
#8 0x565358f67273 in start_test builtin-test.c:545
#9 0x565358f6771d in __cmd_test builtin-test.c:647
#10 0x565358f682bd in cmd_test builtin-test.c:849
#11 0x565358ee5ded in run_builtin perf.c:349
#12 0x565358ee6085 in handle_internal_command perf.c:401
#13 0x565358ee61de in run_argv perf.c:448
#14 0x565358ee6527 in main perf.c:555
#15 0x7f29ce833ca8 in __libc_start_call_main libc_start_call_main.h:74
#16 0x7f29ce833d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
#17 0x565358e391c1 in _start perf[851c1]
7: PERF_RECORD_* events & perf_sample fields : FAILED!
```
After:
```
$ perf test 7
7: PERF_RECORD_* events & perf_sample fields : Skip (permissions)
```
Fixes: 16d00fe ("perf tests: Move test__PERF_RECORD into separate object")
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Chun-Tse Shao <[email protected]>
Cc: Howard Chu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
cbf82cb to
f955b42
Compare
71efa6e to
fb4d5b4
Compare
1bae970 to
cbbe8fd
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Bumps certifi from 2023.7.22 to 2024.7.4.
Commits
bd815382024.07.04 (#295)06a2cbfBump peter-evans/create-pull-request from 6.0.5 to 6.1.0 (#294)13bba02Bump actions/checkout from 4.1.6 to 4.1.7 (#293)e8abcd0Bump pypa/gh-action-pypi-publish from 1.8.14 to 1.9.0 (#292)124f4ad2024.06.02 (#291)c2196ce--- (#290)fefdeecBump actions/checkout from 4.1.4 to 4.1.5 (#289)3c5fb15Bump actions/download-artifact from 4.1.6 to 4.1.7 (#286)4a9569aBump actions/checkout from 4.1.2 to 4.1.4 (#287)1fc8086Bump peter-evans/create-pull-request from 6.0.4 to 6.0.5 (#288)Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting
@dependabot rebase.Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
@dependabot rebasewill rebase this PR@dependabot recreatewill recreate this PR, overwriting any edits that have been made to it@dependabot mergewill merge this PR after your CI passes on it@dependabot squash and mergewill squash and merge this PR after your CI passes on it@dependabot cancel mergewill cancel a previously requested merge and block automerging@dependabot reopenwill reopen this PR if it is closed@dependabot closewill close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually@dependabot show <dependency name> ignore conditionswill show all of the ignore conditions of the specified dependency@dependabot ignore this major versionwill close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)@dependabot ignore this minor versionwill close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)@dependabot ignore this dependencywill close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)You can disable automated security fix PRs for this repo from the Security Alerts page.