[pull] main from llvm:main #5651

pull · 2025-10-31T01:14:26Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)


…168168) This probably should have turned into a regular integer constant earlier. This is to defend against future regressions.

The main improvement is to the mfma tests. There are some mild regressions scattered around, and a few major ones. The worst regressions are in some of the bitcast tests; these are cases where the SGPR argument list runs out and uses VGPRs, and the copies-from-VGPR are misidentified as divergent. Most of the shufflevector tests are also regressions. These end up with cleaner MIR, but then get poor regalloc decisions.

Implement support for the OffsetOfExpr

Upstream ExtVectorElementExpr with result Vector type

) Per [LWG554](https://cplusplus.github.io/LWG/issue554), the rationale is that even if `true / false` traps, the values causing the trap are the converted `int` values produced by the usual arithmetic conversions, not the original `bool` values. This is also true for all other non-promoted integer types. As a result, `std::numeric_limits<I>::traps` should be `false` if `I` is a non-promoted integer type. Fixes #166053.
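The promotion argument can be checked at compile time. The following is a minimal sketch, not code from the patch: the helper name `divides_in_promoted_type` is hypothetical, and the sketch deliberately does not assert any `std::numeric_limits` value, since that depends on the library version in use.

```cpp
#include <type_traits>
#include <utility>

// The usual arithmetic conversions promote bool and short to int before the
// division happens, so the operation itself is carried out on int operands.
static_assert(std::is_same_v<decltype(true / true), int>);
static_assert(std::is_same_v<decltype(short{1} / short{1}), int>);

// Hypothetical helper: true when dividing two values of type I is actually
// performed in a different (promoted) type rather than in I itself.
template <class I>
constexpr bool divides_in_promoted_type() {
  return !std::is_same_v<decltype(std::declval<I>() / std::declval<I>()), I>;
}
```

For `bool` and `short` the helper yields `true`, while for `int` it yields `false`, which matches the distinction the rationale draws.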

…#165779) (#168034) Refer to #158276 for previous hotfix. In Z3, boolean expressions are incompatible with bitvec operators. However, C expressions like `-(5 && a)` will generate such symbolic expressions, which will be further used as an integer. To be compatible with such usages, this fix converts such expressions to integer using the existing `fromCast`.
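As a small illustration of why such an expression ends up being used as an integer, here is a sketch in plain C++ (the function name `negated_logical_and` is hypothetical and unrelated to the analyzer's own code). In C++ the `&&` result is a `bool` that unary minus promotes to `int`; in C the `&&` result is already an `int`.

```cpp
#include <type_traits>

// Unary minus applied to a logical result produces an int.
static_assert(std::is_same_v<decltype(-(5 && 1)), int>);

// Returns -1 when a is non-zero and 0 otherwise.
int negated_logical_and(int a) {
  return -(5 && a);
}
```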

Update test to capture unnamed VPValues in variables, making it easier to update with future VPlan changes.

…n. (#167965) Extend willNotFreeBetween to perform simple checking across blocks to support the case where CtxI is in a successor of the block that contains the assume, but the assume's parent is the single predecessor of CtxI's block. This enables using `__builtin_assume_dereferenceable` to vectorize std::find_if and co in practice. End-to-end reproducer: https://godbolt.org/z/6jbsd4EjT PR: #167965
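A rough sketch of the kind of source this targets is shown below. It is not the linked reproducer: the function name `first_negative` and the fixed length `kCount` are hypothetical, and the builtin is guarded with `__has_builtin` because it is Clang-specific and may be missing on other compilers. Whether the loop is actually vectorized depends on the compiler version and flags.

```cpp
#include <algorithm>
#include <cstddef>

#ifndef __has_builtin
#define __has_builtin(x) 0  // compilers without __has_builtin: assume no builtin
#endif

constexpr std::size_t kCount = 64;  // hypothetical fixed buffer length

// Returns the index of the first negative element, or kCount if there is none.
// Precondition (an assumption of this sketch): data points to kCount readable ints.
std::size_t first_negative(const int* data) {
#if __has_builtin(__builtin_assume_dereferenceable)
  // Tell the optimizer the whole range may be read, even past an early exit.
  __builtin_assume_dereferenceable(data, kCount * sizeof(int));
#endif
  const int* end = data + kCount;
  const int* it = std::find_if(data, end, [](int v) { return v < 0; });
  return static_cast<std::size_t>(it - data);
}
```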

… __builtin_elementwise_sqrt (#168057) Followup to #165682

In #165748 constant expressions were allowed in `collectPossibleValues` because we are still using insertelement + shufflevector idioms to represent a scalable vector splat. However, it also accepts some unresolved constants like ptrtoint of globals or pointer difference between two globals. We could certainly ask the user to check this case with the constant folding API. However, since we don't observe any real-world usefulness in handling constant expressions, I decided to be more conservative and only handle immediate constants in the helper function. With this patch, we don't need to touch the SimplifyCFG part, as the values can only be either ConstantInt or undef/poison values (NB: switch on undef condition is UB). Fixes the miscompilation reported by #165748 (comment)

These tests were only checking the specialized prefix, leaving common code unchecked (and incorrect). Checked code was also not using patterns for SSA values.

Construct SCEVs for VPWidenIntOrFpInductionRecipe analogous to VPCanonicalInductionPHIRecipe: create an AddRec with start + step from the recipe. Currently the only impact should be computing more costs of replicating stores directly in VPlan.

…167918) Use clang linker wrapper to device-link and embed the HIP fat binary directly. Match the CUDA non-RDC flow in the new driver by producing .hipfb like .fatbin. Previously, llvm offload binary was used to package the device IRs and embed them in the host object file; clang linker wrapper was then run on each host object file to extract the device IRs, perform device linking, bundle the code objects into a fat binary, and wrap it in a host object file, which the host linker merged with the original host object using the '-r' option. However, the host linker in the MSVC toolchain does not support the '-r' option. The new approach still packages the device IRs with llvm offload binary, but instead of embedding the result in a host object, it passes it to clang linker wrapper directly, where the device IRs are extracted and linked and the fat binary is generated and then embedded in the host object directly. Compared with the old offload driver, this approach can parallelize device linking for different GPUs by using the parallelization feature of clang linker wrapper. Fixes: SWDEV-565994

Only check up to CtxI (CtxIter) when checking for calls that may free in CtxI's block. Missed update in #167965. This should be NFC, as all current callers pass a terminator that is guaranteed to not free as CtxI

As in title. AVX10.x doesn't distinguish between available vector lengths. -mattr=avx10.x-512 and defining of macros with _512 is kept for compatibility. Bit-positions of avx10.1/2 features in compiler-rt and X86TargetParser are synced to match those in the gcc.

…168128) When shrinking and/or to bitset* remove leftover implicit scc def. bitset* instructions do not set scc. Signed-off-by: John Lu <[email protected]>

The section headers present in the DBI stream got lost when using `pdb2yaml` and `yaml2pdb`. They are a list of COFF section headers. The `llvm::object::coff_section` didn't have a YAML mapping, so I added one in llvm-pdbutil. The mapping for COFF sections in ObjectYAML includes the section data itself, so we can't use it here. Creation of the section map and headers in yaml2pdb is done like in LLD: https://github.com/llvm/llvm-project/blob/438a18c1e105ca04e624239644195e48b28b5099/lld/COFF/PDB.cpp#L1695-L1703

This adds additional test coverage for folding FCMP uno (#166823)

Identified with bugprone-unused-local-non-trivial-variable.

Identified with llvm-use-ranges.

Identified with readability-delete-null-pointer.

NumElts is already of type int. Identified with readability-redundant-casting.

This patch is limited to single-word replacements to fix spelling and/or grammar to ease the review process. Punctuation and markdown fixes are specifically excluded.

Simplifies some tests which do not need to pass TC; future changes will require a trip count to always be available.

…167981) During the initialization sequence in our tests the first 'threads' response should only be kept if the process is actually stopped, otherwise we will have stale data. In VSCode, during the debug session startup sequence immediately after 'configurationDone' a 'threads' request is made. This initial request is to retrieve the main thread's name and id so the UI can be populated. However, in our tests we do not want to cache this value unless the process is actually stopped. We do need to make this initial request because lldb-dap is caching the initial thread list during configurationDone before the process is resumed. We need to make this call to ensure the cached initial threads are purged. I noticed this in a CI job for another review (https://github.com/llvm/llvm-project/actions/runs/19348261989/job/55353961798) where the tests incorrectly failed to fetch the threads prior to validating the thread names.

There is an extra underscore in build_type param in #167583 patch. Fixing it in this PR.

…168433) This change adds the ACCImplicitRoutine pass which implements the OpenACC specification for implicit routine directives (OpenACC 3.4 spec, section 2.15.1).

According to the specification: "If no explicit routine directive applies to a procedure whose definition appears in the program unit being compiled, then the implementation applies an implicit routine directive to that procedure if any of the following conditions holds: The procedure is called or its address is accessed in a compute region."

The pass automatically generates `acc.routine` operations for functions called within OpenACC compute constructs or within existing routine functions that do not already have explicit routine directives. It recursively applies implicit routine directives while avoiding infinite recursion when dependencies form cycles.

Key features:
- Walks through all OpenACC compute constructs (parallel, kernels, serial) to identify function calls
- Creates implicit `acc.routine` operations for functions without explicit routine declarations
- Recursively processes existing `acc.routine` operations to handle transitive dependencies
- Avoids infinite recursion through proper tracking of processed routines
- Respects device-type specific bind clauses to skip routines bound to different device types

Requirements:
- Function operations must implement `mlir::FunctionOpInterface` to be identified and associated with routine directives.
- Call operations must implement `mlir::CallOpInterface` to detect function calls and traverse the call graph.
- Optionally pre-register `acc::OpenACCSupport` if custom behavior is needed for determining if a symbol use is valid within GPU regions (such as functions which are already considerations for offloading even without `acc routine` markings)

Co-authored-by: delaram-talaashrafi<[email protected]>
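The traversal described above can be sketched without any MLIR dependency. This is only an illustration under stated assumptions: `CallGraph`, `collectImplicitRoutines`, and the string-keyed representation are hypothetical stand-ins, not the pass's actual data structures, and device-type bind clauses are ignored.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a module's call graph: caller -> callees.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns the functions that would receive an implicit routine marking: every
// function reachable from the roots (callees found in compute regions, plus
// existing routine functions) that lacks an explicit routine directive.
std::set<std::string> collectImplicitRoutines(
    const CallGraph& graph, const std::vector<std::string>& roots,
    const std::set<std::string>& explicitRoutines) {
  std::set<std::string> visited;
  std::set<std::string> implicit;
  std::vector<std::string> worklist(roots.begin(), roots.end());
  while (!worklist.empty()) {
    std::string fn = worklist.back();
    worklist.pop_back();
    // A function is processed once; this is what stops cyclic call chains.
    if (!visited.insert(fn).second) continue;
    if (explicitRoutines.count(fn) == 0) implicit.insert(fn);
    auto it = graph.find(fn);
    if (it == graph.end()) continue;
    for (const auto& callee : it->second) worklist.push_back(callee);
  }
  return implicit;
}
```

Tracking visited functions in a set, rather than recursing directly, keeps the sketch safe on mutually recursive callees.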

This allows SDNodes to be validated against their expected type profiles and reduces the number of changes required to add a new node. The validation functionality has detected several issues, see `PPCSelectionDAGInfo::verifyTargetNode()`. Most of the nodes have a description in `*.td` files and were successfully "imported". Those that don't have a description are listed in the enum in `PPCSelectionDAGInfo.td`. These nodes are not validated. Part of #119709. Pull Request: #168108

We build the callsite graph by first adding nodes and edges for all allocation contexts, then match the interior callsite nodes onto actual calls (IR or summary), which due to inlining may result in the generation of new nodes representing the inlined context sequence. We attempt to update edges correctly during this process, but in the case of recursion this becomes impossible to always get correct. Specifically, when creating new inlined sequence nodes for stack ids on recursive cycles we can't always update correctly, because we have lost the original ordering of the context. This PR introduces a mechanism, guarded by -memprof-top-n-important= flag, to keep track of extra information for the largest N cold contexts. Another flag -memprof-fixup-important (enabled by default) will perform more expensive fixup of the edges for those largest N cold contexts, by saving and walking the original ordered list of stack ids from the context.

Some Linux versions might not support the mlock call, so skip that part of the test if the mlock call fails.

…167956) This commit adds a new helper function that creates various mock objects that can be used in dwarf expression testing. The optional register value and memory contents are used to create MockProcessWithMemRead and MockRegisterContext that can return expected memory contents and register values. This simplifies some tests by removing redundant code that creates these objects in individual tests and consolidates the logic into one place.

…face) (#168440) This MR fixes a recent build breakage by this MR: #166648 (Post-merge build error here: https://lab.llvm.org/buildbot/#/builders/138/builds/21929) The `MLIRInferIntRangeInterface` library is now a public dependency of `MLIRLLVMDialect`.

These functions should be declared in `stdlib.h`, not `wchar.h`, as confusing as it is. Move them to the proper header file and matching directories in src/ and test/ trees. This was discovered while testing libc++ build against llvm-libc, which re-declares functions like mbtowc in std-namespace in `<cstdlib>` header, and then uses those functions in its locale implementation.
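For reference, a minimal sketch of a caller that relies on this header placement is shown below. The wrapper name `first_wide_char` is hypothetical; the only point is that `<cstdlib>` alone is enough to reach `std::mbtowc`.

```cpp
#include <cstdlib>  // std::mbtowc is declared here, not in <cwchar>

// Converts the first multibyte character of s (at most n bytes) to a wide
// character. Returns the number of bytes consumed, or -1 on an invalid sequence.
int first_wide_char(const char* s, std::size_t n, wchar_t& out) {
  std::mbtowc(nullptr, nullptr, 0);  // reset any conversion state
  return std::mbtowc(&out, s, n);
}
```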

The core LLVM library implements a specialization for `ilist_node_base<true, void>`, which is used by other components. This is needed to link properly when building LLVM as a library on Windows. This effort is tracked in #109483.

LDS block size should be 2048 bytes (512 dwords) based on current spec.

)" This reverts commit bde9062. This caused failures on Darwin that were not caught by upstream buildbots. Reverting for now to give myself some time to fix.

There seem to be cases where the workflow status is completed but the jobs have not completed. We need to gracefully handle these changes to avoid a crash loop in the metrics container.

Arm64EC indirect calls use a function __os_arm64x_check_icall... this has one obvious return value, x11, which is the function to call. However, it actually returns one other important value: x9, which is the final destination for the emulator after the call. If the call is calling x64 code, x9 is used by the thunk. Previously, we didn't model this, and it mostly worked because the compiler usually doesn't modify x9 in the narrow window between the check, and the call. That said, it can happen in some cases; one reliable way is to do an indirect tail-call with stack protectors enabled. (You can also just get unlucky with register allocation, but it's harder to write a testcase for that.) This patch uses the cfguardtarget bundle to simplify the calling convention handling, for similar reasons that x64 uses it: modifying arbitrary calls is difficult without a separate marking. Fixes #167430.

Add documentation about CMAKE_OSX_SYSROOT so that folks bringing up on OSX can have a clean test run.

These APIs are MachO specific, and the interfaces are about to be extended to support more MachO-specific behavior. For now it makes sense to group them with other MachO specific APIs in MachO.h.

This patch adds the POSIX function `inet_addr`. Since most of the parsing logic is delegated to `inet_aton`, I have only included some basic smoke tests for testing purposes.
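A short usage sketch of the POSIX interface follows, assuming a POSIX system that provides `<arpa/inet.h>`. The wrapper `parse_ipv4` is a hypothetical name and is not part of the patch or its tests.

```cpp
#include <arpa/inet.h>   // inet_addr, ntohl (POSIX)
#include <netinet/in.h>  // in_addr_t
#include <cstdint>

// Parses dotted-decimal text. On success stores the address in host byte order
// and returns true. inet_addr reports failure as (in_addr_t)(-1), so the valid
// address "255.255.255.255" cannot be told apart from an error.
bool parse_ipv4(const char* text, std::uint32_t& host_order) {
  in_addr_t addr = inet_addr(text);
  if (addr == static_cast<in_addr_t>(-1)) return false;
  host_order = ntohl(addr);  // inet_addr returns network byte order
  return true;
}
```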

When scanning an interface source (dylib or TBD file), consider "fallback" architectures (CPUType / CPUSubType pairs) in addition to the process's CPUType / CPUSubType. Background: When dyld loads a dylib into a process it may load dylib or slice whose CPU type / subtype isn't an exact match for the process's CPU type / subtype. E.g. arm64 processes can load arm64e dylibs / slices. When building an interface we need to follow the same logic, otherwise we risk generating a spurious "does not contain a compatible slice" error. E.g. If we're running an arm64 JIT'd program and loading an interface from a TBD file, and if no arm64 slice is present in that file, then we should fall back to looking for an arm64e slice. rdar://164510783
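The lookup order can be sketched as follows. This is an illustration only: architectures are modeled as plain strings instead of CPUType / CPUSubType pairs, `candidateArchs` and `pickSlice` are hypothetical names, and the only fallback encoded is the arm64 to arm64e pair mentioned above.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: the process architecture first, then its fallbacks in
// priority order.
std::vector<std::string> candidateArchs(const std::string& processArch) {
  std::vector<std::string> archs{processArch};
  if (processArch == "arm64") archs.push_back("arm64e");
  return archs;
}

// Returns the first candidate architecture that has a slice in the interface
// source, or nullopt if no compatible slice exists.
std::optional<std::string> pickSlice(const std::string& processArch,
                                     const std::vector<std::string>& slices) {
  for (const auto& arch : candidateArchs(processArch))
    for (const auto& slice : slices)
      if (slice == arch) return arch;
  return std::nullopt;
}
```

An exact match is preferred because the process architecture is always tried before any fallback.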

This patch makes objc-imageinfo.S work with the internal shell. The test uses a subshell to temporarily change the directory. The internal shell does not support subshells, so this construct was replaced with a pushd/popd sequence.

It's not reachable because the custom parser will accept or fail the whole instruction.

pull bot locked and limited conversation to collaborators Oct 31, 2025

pull bot added the ⤵️ pull label Oct 31, 2025

arsenm and others added 28 commits November 14, 2025 21:42

AMDGPU: Consider isVGPRImm when forming constant from build_vector (#…

9fecebf

…168168) This probably should have turned into a regular integer constant earlier. This is to defend against future regressions.

MCNopsFragment,MCBoundaryAlignFragment: Use parent MCSubtargetInfo

d9dfe75

MCAsmBackend: Remove unneeded MCAssembler parameter

29e3c2e

[CIR] Implement support for OffsetOfExpr (#167726)

30c8465

Implement support for the OffsetOfExpr

[CIR] ExtVectorElementExpr with result Vector type (#167925)

22f550b

Upstream ExtVectorElementExpr with result Vector type

[VPlan] Strip outdated comment in optimizeForVFAndUF (NFC) (#168068)

85db928

[LV] Use variables in CHECK lines for unnamed VPValues in test.

ca26cf8

Update test to capture unnamed VPValues in variables, making it easier to update with future VPlan changes.

[X86] Replace default _mm512_sqrt_pd/s/h implementations with generic…

4cd8c11

… __builtin_elementwise_sqrt (#168057) Followup to #165682

[mlir][emitc] Fix ineffective tests (#168197)

5613e4a

These tests were only checking the specialized prefix, leaving common code unchecked (and incorrect). Checked code was also not using patterns for SSA values.

[ValueTracking] Only check up to CtxIter in willNotFreeBetween.

20db716

Only check up to CtxI (CtxIter) when checking for calls that may free in CtxI's block. Missed update in #167965. This should be NFC, as all current callers pass a terminator that is guaranteed to not free as CtxI

[AMDGPU] When shrinking and/or to bitset*, remove implicit scc def (#…

9fa15ef

…168128) When shrinking and/or to bitset* remove leftover implicit scc def. bitset* instructions do not set scc. Signed-off-by: John Lu <[email protected]>

[LV] Add test with to check different interleave counts for fmaxnum.

59d2e93

This adds additional test coverage for folding FCMP uno (#166823)

[Utils] Remove an unused local variable (NFC) (#168181)

636e370

Identified with bugprone-unused-local-non-trivial-variable.

[llvm] Use llvm::copy (NFC) (#168182)

7a8237b

Identified with llvm-use-ranges.

[llvm] Delete pointers without null checks (NFC) (#168183)

3a7876d

Identified with readability-delete-null-pointer.

[Analysis] Remove a redundant cast (NFC) (#168184)

268ea1a

NumElts is already of type int. Identified with readability-redundant-casting.

[llvm] Proofread *.rst (#168185)

63e059d

This patch is limited to single-word replacements to fix spelling and/or grammar to ease the review process. Punctuation and markdown fixes are specifically excluded.

[VPlan] Always set trip count when creating plan for unit tests (NFC).

67f61df

Simplifies some tests which do not need to pass TC; future changes will require a trip count to always be available.

DAG: Use poison when legalizing scalar_to_vector results (#167751)

33a7bb1

ashgti and others added 30 commits November 17, 2025 14:19

[libc][Github] Fix typo on build_type param (#168453)

e89e359

There is an extra underscore in build_type param in #167583 patch. Fixing it in this PR.

[scudo] Skip test if mlock fails. (#168448)

7a14ef0

Some Linux versions might not support the mlock call, so skip that part of the test if the mlock call fails.

[gn] port 900c517 (amdgpu SDNodeInfo)

0f0cf84

[AMDGPU] update LDS block size for gfx1250 (#167614)

5f38ae4

LDS block size should be 2048 bytes (512 dwords) based on current spec.

[gn] port 43dacd0 (ppc SDNodeInfo)

26b15b7

[bazel] Fix #168108 (#168461)

1bf902e

[gn] port 320c18a (systemz SDNodeInfo)

2c4bce4

Revert "Reapply "[compiler-rt] Default to Lit's Internal Shell" (#168232

eb20b53

)" This reverts commit bde9062. This caused failures on Darwin that were not caught by upstream buildbots. Reverting for now to give myself some time to fix.

[CI] Gracefully Fail when Job Completion Timestamp is None (#168457)

efee326

There seem to be cases where the workflow status is completed but the jobs have not completed. We need to gracefully handle these changes to avoid a crash loop in the metrics container.

[gn build] Port 1425d75

be96137

[gn build] Port 472e4ab

ec3e5dc

[gn build] Port 49d5bb0

307d7ed

[lldb] Update Lua typemap for #167764 (#168464)

186b8ba

Add documentation about CMAKE_OSX_SYSROOT (#168024)

f6ebb35

Add documentation about CMAKE_OSX_SYSROOT so that folks bringing up on OSX can have a clean test run.

[ORC] Merge GetDylibInterface.h APIs into MachO.h. (#168462)

17f0afe

These APIs are MachO specific, and the interfaces are about to be extended to support more MachO-specific behavior. For now it makes sense to group them with other MachO specific APIs in MachO.h.

[RISCV] Remove unused function declaration. NFC (#168459)

5b1a4db

[gn build] Port 17f0afe

7c09f12

[libc] implement inet_addr (#167708)

a5590a2

This patch adds the POSIX function `inet_addr`. Since most of the parsing logic is delegated to `inet_aton`, I have only included some basic smoke tests for testing purposes.

[ORC] Make tests work with Internal Shell (#168471)

d464c99

This patch makes objc-imageinfo.S work with the internal shell. The test uses a subshell to temporarily change the directory. The internal shell does not support subshells, so this construct was replaced with a pushd/popd sequence.

[RISCV] Remove Match_InvalidXSfmmVType. NFC (#168465)

0e3fba8

It's not reachable because the custom parser will accept or fail the whole instruction.


124 participants
