@bal-e bal-e commented Oct 22, 2025

Refiled from rust-lang/rust#147961.

Crates like raw_cpuid use core::arch::x86_64::__cpuid_count() to determine x86 CPU information. It's great that core provides such a function, instead of having to write inline assembly everywhere; but core's implementation does not use the asm! options pure and nomem. This means that calls to __cpuid_count() can't be elided or deduplicated. I'm writing some target-feature enhancement code (akin to multiversion), and I'd like to rely on CPUID getting optimized away appropriately.

While CPUID is a serializing instruction, that's not the primary use case for it. There are several possible approaches to separating the primary use case (where it can be treated as a pure function) from secondary use cases (where it needs to be impure):

  1. Make __cpuid_count() pure and require inline assembly for secondary use cases. (implemented in this PR)

    Secondary use cases are IMO quite rare and their users probably don't mind using inline assembly manually, in order to control LLVM thoroughly. But this might be considered a breaking change.

  2. Make __cpuid_count() pure and introduce __impure_cpuid_count() for secondary use cases.

    This would simplify the updating of secondary use cases, but might still be considered a breaking change. It would also require replicating __cpuid() and __get_cpuid_max().

  3. Leave __cpuid_count() as-is and introduce __pure_cpuid_count().

    This would not be a breaking change; however, I find it unfortunate that the primary use case for this function would be relegated to a more inconvenient function name. Once an approach is stabilized, it would be harder to transition to an (IMO) ideal world where __cpuid_count() is pure.

I think approach 1 is ideal, but it's a (minor?) breaking change, and I'll leave that judgement to the reviewer.

'__cpuid_count()' is implemented using inline assembly, because LLVM
doesn't have an intrinsic for it.  It's a pure operation, but this
wasn't marked in the 'asm!' invocation; so calls to it couldn't be
elided or deduplicated.  This change makes it pure.

CPUID does have _some_ less-than-pure effects -- e.g. it can be used as
a serializing instruction (like a strong memory fence).  Users who want
to rely on that could use inline assembly themselves instead.
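
For illustration, here is a rough sketch of what a pure variant could look like if written by hand (roughly approach 3). The name `pure_cpuid_count` is hypothetical and not part of core; the register shuffling around `rbx` is there because LLVM reserves that register on x86_64. The function is marked `unsafe` on the assumption that the caller only passes leaves whose results never change, since `pure` lets the compiler merge or drop calls with identical inputs.

```rust
/// Hypothetical pure CPUID wrapper (illustrative name, not part of `core`).
///
/// # Safety
/// The caller must only query leaves whose results are constant for the
/// lifetime of the process; otherwise the `pure` option is a lie to the
/// compiler.
#[cfg(target_arch = "x86_64")]
pub unsafe fn pure_cpuid_count(leaf: u32, sub_leaf: u32) -> [u32; 4] {
    use core::arch::asm;

    let (eax, ebx, ecx, edx): (u32, u32, u32, u32);
    unsafe {
        asm!(
            // `rbx` cannot be named as an operand, so save it in a scratch
            // register, run CPUID, then swap the result out and restore it.
            "mov {tmp:r}, rbx",
            "cpuid",
            "xchg {tmp:r}, rbx",
            tmp = out(reg) ebx,
            inout("eax") leaf => eax,
            inout("ecx") sub_leaf => ecx,
            out("edx") edx,
            // `pure` + `nomem`: outputs depend only on the inputs, and no
            // memory is read or written, so calls may be deduplicated.
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    [eax, ebx, ecx, edx]
}
```

With these options, two calls with the same `leaf` and `sub_leaf` may be folded into one, and a call whose result is unused may be removed entirely. Code that relies on CPUID as a serializing instruction would have to omit `pure` and `nomem`.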
@rustbot
Collaborator

rustbot commented Oct 22, 2025

r? @folkertdev

rustbot has assigned @folkertdev.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Contributor

@folkertdev folkertdev left a comment

cc @sayantn @Amanieu, I think you just know more about this.

I was also briefly confused why this function is unsafe, but #1935 already attempts to fix that.

@bjorn3
Member

bjorn3 commented Oct 22, 2025

Cpuid isn't always pure. For example it can be used to query the Local APIC ID, which changes when a thread gets rescheduled to another core. Not sure if you can query that from ring 3 though.
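
To make the Local APIC ID example concrete, a minimal sketch: leaf 1 reports the initial APIC ID of the executing logical processor in EBX bits 31..24. The helper names below are illustrative only. Two calls to `current_initial_apic_id()` can return different values if the thread migrates between cores in the meantime, which is exactly the case a `pure` annotation would get wrong.

```rust
/// Extracts the initial APIC ID (bits 31..24) from the EBX value of CPUID leaf 1.
pub fn initial_apic_id(leaf1_ebx: u32) -> u8 {
    (leaf1_ebx >> 24) as u8
}

/// Reads the initial APIC ID of whichever core is executing right now.
/// The result is only meaningful for as long as the thread stays on that core.
#[cfg(target_arch = "x86_64")]
#[allow(unused_unsafe)] // `__cpuid_count` is safe on newer toolchains
pub fn current_initial_apic_id() -> u8 {
    let r = unsafe { core::arch::x86_64::__cpuid_count(1, 0) };
    initial_apic_id(r.ebx)
}
```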

@bal-e
Author

bal-e commented Oct 22, 2025

If CPUID is not always pure, should we look for a way to accommodate pure use cases? Or should users looking for pure information (which is most of them AFAIK) write inline assembly to inform LLVM of that property? This is at least possible with raw_cpuid because it provides an abstraction over the raw CPUID function.

@thomcc
Member

thomcc commented Oct 24, 2025

I said this here but for availability in this issue:

It's possible people wrote code assuming this was a serializing instruction, so changing it could be considered a breaking change. Or at least a change that could break code. Also, it's worth noting, CPUID is very expensive (partially because it is serializing). You probably should not be using it directly in your macro expansion, regardless of if the compiler can elide repeated calls.

@bal-e
Author

bal-e commented Oct 24, 2025

@thomcc I completely understand that this might be seen as a breaking change, which is why I enumerated the other options. I noticed this issue when looking at assembly outputs for small code examples, where CPUID could have been elided but wasn't; I agree with your advice, I'm aware of the performance impact and am not planning to call it very often.

@Amanieu
Member

Amanieu commented Oct 24, 2025

This conflicts with #1935: we can't have this be both safe and pure because if the same cpuid call (with the same inputs) ever returns different values then it would result in undefined behavior.

@hanna-kruppe

If this was marked as pure and unsafe, what are the conditions that callers can/must uphold to avoid running into UB from it not being actually pure? I can't think of anything other than "don't call this with inputs for which it's not a pure function" but that seems like a breaking change if there are any such inputs.

@tgross35
Contributor

I feel like having a safe cpuid that doesn't change its current behavior is easier to reason about for more users than unsafe and pure. Most cases call cpuid sparingly anyway; making users think about whether the values they read are static or OS-controlled doesn't really seem worth the optimization.

Couldn't the application discussed in the top post cache the values? Rather than relying on optimization to possibly elide calls.
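
A minimal sketch of what such caching might look like, assuming `std` is available and using leaf 7 as the example. The function names are made up for illustration, and the non-x86 fallback is only a placeholder so the snippet compiles everywhere.

```rust
use std::sync::OnceLock;

/// Queries CPUID leaf 7, sub-leaf 0, or returns zeros if the CPU does not
/// report that leaf as supported.
#[cfg(target_arch = "x86_64")]
#[allow(unused_unsafe)] // `__cpuid_count` is safe on newer toolchains
fn query_leaf7() -> [u32; 4] {
    use core::arch::x86_64::__cpuid_count;
    unsafe {
        // Leaf 0 reports the highest supported basic leaf in EAX.
        if __cpuid_count(0, 0).eax < 7 {
            return [0; 4];
        }
        let r = __cpuid_count(7, 0);
        [r.eax, r.ebx, r.ecx, r.edx]
    }
}

/// Placeholder for other architectures (assumption: report no features).
#[cfg(not(target_arch = "x86_64"))]
fn query_leaf7() -> [u32; 4] {
    [0; 4]
}

/// Runs CPUID at most once per process and hands out the cached registers.
pub fn cached_leaf7() -> [u32; 4] {
    static CACHE: OnceLock<[u32; 4]> = OnceLock::new();
    *CACHE.get_or_init(query_leaf7)
}

/// Whether the CPU sets the AVX2 bit (leaf 7, EBX bit 5). This does not
/// check OS support for the extended register state.
pub fn cpu_reports_avx2() -> bool {
    cached_leaf7()[1] & (1 << 5) != 0
}
```

After the first call, `cached_leaf7()` is a plain load from a static, so the cost no longer depends on whether the compiler manages to elide repeated CPUID instructions.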

@bal-e
Author

bal-e commented Oct 25, 2025

> I feel like having a safe cpuid that doesn't change its current behavior is easier to reason about for more users than unsafe and pure.

That's a good argument. And you're right that this isn't a major optimization in any way; I had put up this PR because this seemed like a small oversight in the intrinsics set. I was wrong -- the intrinsics have all been pretty ironclad :)

I'll probably close the PR once the discussion dries up.

8 participants