-
Notifications
You must be signed in to change notification settings - Fork 300
Make __cpuid_count() a pure function
#1943
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
'__cpuid_count()' is implemented using inline assembly, because LLVM doesn't have an intrinsic for it. It's a pure operation, but this wasn't marked in the 'asm!' invocation; so calls to it couldn't be elided or deduplicated. This change makes it pure. CPUID does have _some_ less-than-pure effects -- e.g. it can be used as a serializing instruction (like a strong memory fence). Users who want to rely on that could use inline assembly themselves instead.
|
r? @folkertdev rustbot has assigned @folkertdev. Use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
Cpuid isn't always pure. For example it can be used to query the Local APIC ID, which changes when a thread gets rescheduled to another core. Not sure if you can query that from ring 3 though. |
|
If CPUID is not always pure, should we look for a way to accommodate pure use cases? Or should users looking for pure information (which is most of them AFAIK) write inline assembly to inform LLVM of that property? This is at least possible with |
|
I said this here but for availability in this issue:
|
|
@thomcc I completely understand that this might be seen as a breaking change, which is why I enumerated the other options. I noticed this issue when looking at assembly outputs for small code examples, where CPUID could have been elided but wasn't; I agree with your advice, I'm aware of the performance impact and am not planning to call it very often. |
|
This conflicts with #1935: we can't have this be both safe and pure because if the same cpuid call (with the same inputs) ever returns different values then it would result in undefined behavior. |
|
If this was marked as pure and unsafe, what's the conditions that callers can/must uphold to avoid running into UB from it not being actually pure? I can't think of anything other than "don't call this with inputs for which it's not a pure function" but that seems like a breaking change if there are any such inputs. |
|
I feel like having a safe Couldn't the application discussed in the top post cache the values? Rather than relying on optimization to possibly elide calls. |
That's a good argument. And you're right that this isn't a major optimization in any way; I had put up this PR because this seemed like a small oversight in the intrinsics set. I was wrong -- the intrinsics have all been pretty ironclad :) I'll probably close the PR once the discussion dries up. |
Refiled from rust-lang/rust#147961.
Crates like
raw_cpuidusecore::arch::x86_64::__cpuid_count()to determine x86 CPU information. It's great that core provides such a function, instead of having to write inline assembly everywhere; but core's implementation does not use theasm!attributespureandnomem. This means that calls to__cpuid_count()can't be elided or deduplicated. I'm writing some target-feature enhancement code (akin tomultiversion), and I'd like to rely on CPUID getting optimized away appropriately.While CPUID is a serializing instruction, that's not the primary use case for it. There are several possible approaches to separating the primary use case (where it can be treated as a pure function) from secondary use cases (where it needs to be impure):
Make
__cpuid_count()pure and require inline assembly for secondary use cases. (implemented in this PR)Secondary use cases are IMO quite rare and their users probably don't mind using inline assembly manually, in order to control LLVM thoroughly. But this is might be considered a breaking change.
Make
__cpuid_count()pure and introduce__impure_cpuid_count()for secondary use cases.This would simplify the updating of secondary use cases, but might still be considered a breaking change. It would also require replicating
__cpuid()and __get_cpuid_max()`.Leave
__cpuid_count()as-is and introduce__pure_cpuid_count().This would not be a breaking change; however, I find it unfortunate that the primary use case for this function would be relegated to a more inconvenient function name. Once an approach is stabilized, it would be harder to transition to an (IMO) ideal world where
__cpuid_count()is pure.I think approach 1 is ideal, but it's a (minor?) breaking change, and I'll leave that judgement to the reviewer.