Use ATTACH maps for array-sections/subscripts on pointers. #1
base: tgt-capture-mapped-ptrs-by-ref
The libomptarget code will disappear from this PR once llvm#149036 is merged.
const ValueDecl *BaseDecl = nullptr, const Expr *MapExpr = nullptr,
ArrayRef<OMPClauseMappableExprCommon::MappableExprComponentListRef>
OverlappedElements = {},
bool AreBothBasePtrAndPteeMapped = false) const {
`AreBothBasePtrAndPteeMapped` was used to decide whether to use PTR_AND_OBJ maps for something like `map(p, p[0])`. We don't do that now, since we map them independently and attach them separately.
…ribution. (llvm#163443) In some cases, the loop bounds (lower, upper and step) of `scf.for` can come locally from the parent warp op of the `scf.for`. The current logic does not yield the loop bounds in the new warp op generated during lowering, causing the sunk `scf.for` to have a non-dominating use. In this PR, we have added logic to yield loop bounds by default (treating them like the other operands of `scf.for`), which fixes this bug.
The canonicalize is also triggered when the `trunc` is `nuw`. Proof: https://alive2.llvm.org/ce/z/eWvWe3 Fixes: llvm#162451
…m#163938) Addresses issue llvm#145937 Without this patch SROA generates new dbg_assign for new stores. We can simply steal the existing dbg_assigns linked to the old store when the store is not being split.
This reverts commit 0b9a7b8. This is causing test failures under LLVM: 1. Other/pass-pipeline-parsing.ll This broke premerge. This was notably not caught by premerge testing on the original PR because the original PR only touches polly, and premerge does not test LLVM when only polly is touched.
Implement CXXDefaultArgExpr support for ComplexType Issue llvm#141365
… SSE41 phminposuw intrinsic to be used in constexpr (llvm#163041) Fix llvm#161336
Added support for ConditionalOperator, BinaryConditionalOperator and OpaqueValueExpr as lvalue. Implemented support for ternary operators with one branch being a throw expression. This required weakening the requirement that the true and false regions of the ternary operator must terminate with a `YieldOp`. Instead the true and false regions are now allowed to terminate with an `UnreachableOp` and no `YieldOp` gets emitted when the block throws.
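For context, the source-level construct this commit is about is a ternary whose one branch is a throw expression. The snippet below is only a plain C++ illustration; `checked_div` is a made-up name and nothing here reflects the CIR that is actually generated.

```cpp
#include <stdexcept>

// Hypothetical example: the false branch of the conditional operator is a
// throw expression, so that branch never produces a value.
int checked_div(int a, int b) {
  return b != 0 ? a / b : throw std::invalid_argument("division by zero");
}
```

The type of the whole conditional is `int`, taken from the non-throwing branch, which is why the throwing region has nothing to yield.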
These were all removed in llvm#160028, but I apparently missed this one instance in the documentation. Remove it given that it no longer works.
This patch adds a new script, premerge_advisor_explain.py that requests test failure explanations from the premerge advisor. For now it just prints them out to STDOUT. This allows for testing of the entire system by looking at failure explanations in failed jobs before we do the rest of the wiring to enable the premerge advisor to write out comments.
… AVX/AVX512 subvector extraction intrinsics to be used in constexpr llvm#157712 (llvm#162836) **This PR supersedes and replaces PR llvm#158853** The original branch diverged too far from the main branch, resulting in significant merge conflicts that were difficult to resolve cleanly. To provide a clean and reviewable history, this new PR was created by cherry-picking the necessary commits onto a fresh branch based on the latest `main`. --- *(Original Description)* This patch enables the use of AVX/AVX512 subvector extraction intrinsics within `constexpr` functions. This is achieved by implementing the evaluation logic for these intrinsics in `VectorExprEvaluator::VisitCallExpr` and `InterpretBuiltin`. The original discussion and review comments can be found in the previous pull request for context: llvm#158853 Fixes llvm#157712
The primary purpose of this commit is to enable marking loads to LDS (global.load.lds, buffer.*.load.lds) volatile (using bit 31 of the aux as with normal buffer loads) and to ensure that their !nontemporal annotations translate to appropriate settings of the cache control bits. However, in the process of implementing this feature, we also fixed:
* Incorrect handling of buffer loads to LDS in GlobalISel
* The handling of volatile on buffers in SIMemoryLegalizer: previously, the mapping of address spaces would cause volatile on buffer loads to be silently dropped on at least gfx10.
--------- Co-authored-by: Matt Arsenault <[email protected]>
These two are lowered as if they are the expression: LHS = (LHS < RHS ) ? RHS : LHS; and LHS = (LHS < RHS ) ? LHS : RHS; This patch generates these expressions and ensures they are properly emitted into IR. Note: this is dependent on llvm#163580 and cannot be merged until that one is (or the tests will fail).
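A small C++ sketch of the two expressions quoted above, written as in-place combiners. The helper names `combine_keep_larger` and `combine_keep_smaller` are hypothetical; the commit message only gives the expressions themselves, not how the patch names or emits them.

```cpp
// Hypothetical helpers mirroring the two expressions from the commit message.

// LHS = (LHS < RHS) ? RHS : LHS;  -> LHS ends up holding the larger value.
template <typename T>
void combine_keep_larger(T &lhs, const T &rhs) {
  lhs = (lhs < rhs) ? rhs : lhs;
}

// LHS = (LHS < RHS) ? LHS : RHS;  -> LHS ends up holding the smaller value.
template <typename T>
void combine_keep_smaller(T &lhs, const T &rhs) {
  lhs = (lhs < rhs) ? lhs : rhs;
}
```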
Replace with PatGprShiftMaskXLen/PatGprShiftMask32 or use the ShiftMaskXLen/ShiftMask32 ComplexPattern directly in patterns. This avoids various casts that were needed to make a ComplexPattern work inside of a PatFrag.
A variant part, represented by `DW_TAG_variant_part`, is a structure with a
discriminant and different variants, of which only one can be active
and valid at a time. The discriminant is the main difference
between variant parts and unions, which are represented by `DW_TAG_union_type`.
Variant parts are used by Rust enums, which look like:
```rust
pub enum MyEnum {
First { a: u32, b: i32 },
Second(u32),
}
```
This type's debug info is the following `DICompositeType` with
`DW_TAG_structure_type` tag:
```llvm
!4 = !DICompositeType(tag: DW_TAG_structure_type, name: "MyEnum",
scope: !2, file: !5, size: 96, align: 32, flags: DIFlagPublic,
elements: !6, templateParams: !16,
identifier: "faba668fd9f71e9b7cf3b9ac5e8b93cb")
```
With one element being also a `DICompositeType`, but with
`DW_TAG_variant_part` tag:
```llvm
!6 = !{!7}
!7 = !DICompositeType(tag: DW_TAG_variant_part, scope: !4, file: !5,
size: 96, align: 32, elements: !8, templateParams: !16,
identifier: "e4aee046fc86d111657622fdcb8c42f7", discriminator: !21)
```
Which has a discriminator:
```llvm
!21 = !DIDerivedType(tag: DW_TAG_member, scope: !4, file: !5,
baseType: !13, size: 32, align: 32, flags: DIFlagArtificial)
```
Which then holds different variants as `DIDerivedType` elements with
`DW_TAG_member` tag:
```llvm
!8 = !{!9, !17}
!9 = !DIDerivedType(tag: DW_TAG_member, name: "First", scope: !7,
file: !5, baseType: !10, size: 96, align: 32, extraData: i32 0)
!10 = !DICompositeType(tag: DW_TAG_structure_type, name: "First",
scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic,
elements: !11, templateParams: !16,
identifier: "cc7748c842e275452db4205b190c8ff7")
!11 = !{!12, !14}
!12 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !10,
file: !5, baseType: !13, size: 32, align: 32, offset: 32,
flags: DIFlagPublic)
!13 = !DIBasicType(name: "u32", size: 32, encoding: DW_ATE_unsigned)
!14 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !10,
file: !5, baseType: !15, size: 32, align: 32, offset: 64,
flags: DIFlagPublic)
!15 = !DIBasicType(name: "i32", size: 32, encoding: DW_ATE_signed)
!16 = !{}
!17 = !DIDerivedType(tag: DW_TAG_member, name: "Second", scope: !7,
file: !5, baseType: !18, size: 96, align: 32, extraData: i32 1)
!18 = !DICompositeType(tag: DW_TAG_structure_type, name: "Second",
scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic,
elements: !19, templateParams: !16,
identifier: "a2094b1381f3082d504fbd0903aa7c06")
!19 = !{!20}
!20 = !DIDerivedType(tag: DW_TAG_member, name: "__0", scope: !18,
file: !5, baseType: !13, size: 32, align: 32, offset: 32,
flags: DIFlagPublic)
```
The BPF backend was assuming that all the elements of any `DICompositeType`
have tag `DW_TAG_member` and are instances of `DIDerivedType`. However,
the single element of the outer composite type `!4` has tag
`DW_TAG_variant_part` and is an instance of `DICompositeType`. The
unconditional call of `cast<DIDerivedType>` on all elements was causing
an assertion failure when any Rust code with enums was compiled to the
BPF target.
Fix that by:
* Handling `DW_TAG_variant_part` in `visitStructType`.
* Replacing unconditional call of `cast<DIDerivedType>` over
`DICompositeType` elements with a `switch` statement, handling both
`DW_TAG_member` and `DW_TAG_variant_part` and casting the element to an
appropriate type (`DIDerivedType` or `DICompositeType`).
Fixes: llvm#155778
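The shape of the fix described above can be sketched with toy types. `Tag`, `Node`, `countMembers` and `makeMyEnum` below are stand-ins invented for illustration; they are not LLVM's `DINode` classes and this is not the code in the BPF backend. The point is only the dispatch on the element's tag instead of assuming every element is a member.

```cpp
#include <string>
#include <vector>

// Toy stand-ins for debug-info nodes (NOT the real LLVM classes).
enum class Tag { Member, VariantPart, StructureType };

struct Node {
  Tag tag;
  std::string name;
  std::vector<Node> elements; // only used by composite-like nodes
};

// Count member entries of a composite, descending into variant parts
// rather than treating every element as a plain member.
int countMembers(const Node &composite) {
  int count = 0;
  for (const Node &elem : composite.elements) {
    switch (elem.tag) {
    case Tag::Member:
      ++count; // would be handled as a DIDerivedType in the real code
      break;
    case Tag::VariantPart:
      count += countMembers(elem); // would be handled as a DICompositeType
      break;
    default:
      break; // other tags are ignored in this sketch
    }
  }
  return count;
}

// Builds the toy equivalent of `MyEnum`: one variant part with two variants.
Node makeMyEnum() {
  Node first{Tag::Member, "First", {}};
  Node second{Tag::Member, "Second", {}};
  Node variantPart{Tag::VariantPart, "", {first, second}};
  return Node{Tag::StructureType, "MyEnum", {variantPart}};
}
```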
Add `try_lock` to conform to Lockable, which is necessary to use it with `std::scoped_lock`.
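As a rough illustration of why `try_lock` matters: `std::scoped_lock` over more than one mutex acquires them via `std::lock`, which needs each type to be Lockable. `SpinLock` and `transfer` below are hypothetical names for this sketch, not the class touched by the commit.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical minimal lock type that satisfies Lockable.
class SpinLock {
  std::atomic_flag flag = ATOMIC_FLAG_INIT;

public:
  void lock() {
    while (flag.test_and_set(std::memory_order_acquire)) {
      // busy-wait until the flag is cleared
    }
  }
  // Returns true if the lock was acquired without waiting.
  bool try_lock() { return !flag.test_and_set(std::memory_order_acquire); }
  void unlock() { flag.clear(std::memory_order_release); }
};

// Locks both SpinLocks at once; this only compiles because SpinLock
// provides try_lock in addition to lock/unlock.
int transfer(SpinLock &a, SpinLock &b, int &from, int &to, int amount) {
  std::scoped_lock guard(a, b);
  from -= amount;
  to += amount;
  return to;
}
```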
Having taken on a maintainer role for these dialects, make it official with a CODEOWNERS entry. --------- Co-authored-by: Jakub Kuderski <[email protected]>
Suggest the `initializer_list` overload instead. 4+ args is an arbitrary number that allows for incremental deprecation without having to update too many call sites. For more context, see llvm#163117.
Adding llvm-ir2vec, llvm-ctxprof-util (and llvm-sim) in the Bazel configs. llvm-ctxprof-util and llvm-ir2vec are used in several LIT unit tests, and the missing binary is causing unit test failures. llvm-ctxprof-util: https://github.com/llvm/llvm-project/blob/15cde999d47c3edc7647faf5fd967f5d5d88416a/llvm/test/Analysis/CtxProfAnalysis/flatten-icp.ll#L2 llvm-ir2vec: https://github.com/llvm/llvm-project/blob/55d4e92c8821d5543469118a76fe38db866377b7/llvm/utils/mlgo-utils/IR2Vec/generateTriplets.py#L60 llvm-sim: https://github.com/llvm/llvm-project/blob/894eaf481542adefde861a7e39f769f21d3f4fa4/llvm/test/lit.cfg.py#L268
This patch implements llvm::countr_zero_constexpr, a constexpr version of llvm::countr_zero, in terms of llvm::popcount while making llvm::popcount a constexpr function at the same time. The new function is intended to serve as a marker. When we switch to C++20, we will most likely go through functions in llvm/ADT/bit.h and replace them with their counterparts from <bit>. With llvm::countr_zero_constexpr, we can easily replace its use with std::countr_zero. This patch reimplements ConstantLog2 in terms of the new function.
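One way to express a constexpr trailing-zero count in terms of a population count is shown below. This is a sketch under assumptions: `popcount_sketch` and `countr_zero_sketch` are hypothetical names, fixed to 64-bit inputs, and the actual implementation in llvm/ADT/bit.h may be written differently.

```cpp
#include <cstdint>

// Hypothetical constexpr population count (clears the lowest set bit per step).
constexpr int popcount_sketch(std::uint64_t v) {
  int n = 0;
  while (v != 0) {
    v &= v - 1;
    ++n;
  }
  return n;
}

// ~v & (v - 1) sets exactly the bits below the lowest set bit of v,
// so counting them gives the number of trailing zeros (64 when v == 0).
constexpr int countr_zero_sketch(std::uint64_t v) {
  return popcount_sketch(~v & (v - 1));
}

static_assert(countr_zero_sketch(1) == 0);
static_assert(countr_zero_sketch(8) == 3);
static_assert(countr_zero_sketch(0) == 64);
```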
This rewrite does not preserve numerics: for example, we'd expect the maximum fp value to yield Inf instead of identity. `GL.Length` does not allow for fast math flags, so we need to remove this. Special cases (constants) can be handled via a folder if someone wants to implement one.
Add builders on the Python side that match the builders on the C++ side, add tests for launching GPU kernels and regions, and correct some small documentation mistakes. This reflects the API decisions already made in the func dialect's Python bindings and makes using the GPU dialect's bindings more similar to the C++ interface.
…ns (llvm#163863) Before the patch, the added test case would indent the function, moving its second line beyond the column limit. Fixes llvm#68122.
As the Cygwin platform requires $PATH to be set in order to run unittests, do the same as for the regular Windows target.
…lvm#164039) Two of the tests are currently asserting, and two are emitting unexpected results. The asserting tests will be fixed using the ATTACH-style codegen from llvm#153683. The other two involve `use_device_addr` on byrefs, and need more follow-up codegen changes, that have been noted in a FIXME comment.
…ion variable (llvm#164147) `@SHLIBDIR@` is replaced by CMake's configuration function, so it must be in `lit.site.cfg.py.in` but not `lit.cfg.py`. `lit.cfg.py` must reference variables in generated `lit.site.cfg.py`. We didn't notice this problem because it only affects Windows (including MinGW and Cygwin) that are configured with either LLVM_LINK_LLVM_DYLIB=ON or BUILD_SHARED=ON.
Add OnDiskGraphDB and OnDiskKeyValueDB that can be used to implement ObjectStore and ActionCache respectively. Those are on-disk persistent storage that build upon OnDiskTrieHashMap and implement key functions that are required by LLVMCAS interfaces. This abstraction layer defines how the objects are hashed and stored on disk. OnDiskKeyValueDB is a basic OnDiskTrieHashMap while OnDiskGraphDB also defines:
* How objects of various sizes are stored on disk and are referenced by the trie nodes.
* How to store the references from one stored object to another object that is referenced.
In addition to basic APIs for ObjectStore and ActionCache, other advanced database configuration features can be implemented in this layer without exposing them to the users of the LLVMCAS interface. For example, OnDiskGraphDB has a fault-in function to fetch data from an upstream OnDiskGraphDB if the data is missing.
Move the parse tree utility function semantics::getDesignatorNameIfDataRef to Parser/tools.h and rename it to comply with the local style.
This fixes a build error when building tensorflow on riscv64 linux.
…-using-attach-maptype
…anslations happen at the end.
This is the initial clang change to support using the `ATTACH` map-type for pointer-attachment.
This builds upon the following:
* … `target` by reference. llvm/llvm-project#145454
For example, for the following:
The following maps are now emitted by clang:
Previously, the two possible maps emitted by clang were:
(B) does not perform any pointer attachment, while (C) also maps the
pointer p, both of which are incorrect.
With this change, we are using ATTACH-style maps, like (A), for cases where the expression has a base-pointer. For example:
We also group mapping of clauses with the same base decl in the order of the increasing complexity of their base-pointers, e.g. for something like:
We first map `spp`, then `spp[0]`, then `spp[0][0]` and `spp[0][0].a`.
This allows us to also group "struct" allocation based on their attach pointers.
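The ordering described above can be pictured with a toy model. Everything in this sketch is an assumption made for illustration: `MapItem`, `attachPtrDepth` and `mapOrder` are hypothetical names, the attach pointers are written out by hand as strings, and "complexity" is approximated by counting subscripts; clang's actual codegen works on component lists, not strings.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy model of one map-clause operand (hypothetical, not a clang type).
struct MapItem {
  std::string expr;      // the mapped expression, e.g. "spp[0][0]"
  std::string attachPtr; // its base-pointer, "" when there is none
};

// Assumed complexity measure: 0 without an attach pointer, otherwise
// 1 plus the number of subscripts in the attach-pointer expression.
int attachPtrDepth(const MapItem &item) {
  if (item.attachPtr.empty())
    return 0;
  return 1 + static_cast<int>(std::count(item.attachPtr.begin(),
                                         item.attachPtr.end(), '['));
}

// Returns the expressions ordered by increasing attach-pointer complexity;
// items with the same attach pointer depth keep their relative order.
std::vector<std::string> mapOrder(std::vector<MapItem> items) {
  std::stable_sort(items.begin(), items.end(),
                   [](const MapItem &a, const MapItem &b) {
                     return attachPtrDepth(a) < attachPtrDepth(b);
                   });
  std::vector<std::string> order;
  for (const MapItem &item : items)
    order.push_back(item.expr);
  return order;
}
```

In this model `spp[0][0]` and `spp[0][0].a` share the attach pointer `spp[0]`, so they end up adjacent, which is the grouping the description refers to.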
Cases that need handling:
* `p` is a base-pointer in a map from a member function within the same class: `p` is not being privatized; instead, we still try to create an implicit map of `this[0:1]`, and access `p` through that, which is incorrect.
* The `use_device_addr` clause does not work properly, because we don't have a proper component-list set-up for it, just one component, so we cannot find the proper attach-ptr. For `use_device_addr`, we should match existing maps whose attach-ptr matches the attach-ptr of the `use_device_addr` operand.
* `use_device_ptr` handling has some issues too. Need debugging.
* Some tests still haven't been updated. These include: