[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 00/32] x86/msr: Drop 32-bit MSR interfaces



On 2026-06-29 01:38, Arnd Bergmann wrote:
>>
>> There is no RDMSRQ instruction on any x86 CPU. Are you mixing this up with
>> WRMSRNS/RDMSR using an immediate for addressing the MSR?
> 
> Yes, I was just confused about the exact definition here and assumed
> the single-register output version was actually called rdmsrq.
> 
So just to be clear:

There are three instructions(*):

        wrmsr           - implicit form only
        wrmsrns         - implicit or immediate
        rdmsr           - implicit or immediate

The implicit form are the same on 32 and 64 bits (and, in fact, 16 bits): they
take a MSR register address in %ecx and the data as two 32-bit words in
%edx:%eax. This interface predates x86-64 by about a decade, and the Linux MSR
interfaces were designed when Linux was 32-bit only, so it made sense at the
time to treat them as two halves, especially since MSRs often are various
kinds of bitfields. It didn't help that gcc at the time was extremely
inefficient in its handling of multiword arithmetic (it is much better now),
so using a u64 would have made for much worse code.

The immediate forms are 64-bit only and use a single arbitrary 64-bit
register; the MSR address is kept in an immediate in the instruction, just
like they are for most other register types. The only thing that is "special"
there is that the possible register address space is very large (2^32)
although in practice a very small fraction of that is (currently) used.

The immediate forms are expected to be faster, and provide for further
performance improvements in future microarchitectures. This is important,
because it provides a fine-grain uniform architecture for supervisor-only
state, instead of having to give a bulk ISA (XSAVES/XRSTORS) that is different
from the fine-grained architecture, and still get good performance. This gives
the kernel very fine level control over the context switch flows, for one thing.

WRMSRNS is a non-serializing form of WRMSR, which is defined as an
architecturally hard-serializing instruction, although some MSRs have been
retconned as non-serializing (and the set is different between vendors.) We
want to switch that over to the model where the kernel explicitly opts in to
nonserialization, but that means using alternatives since not all CPUs have
the WRMSRNS instruction.

Furthermore, we want to use alternatives so we can make use of the
immediate-format instructions when the MSR address is known at compile time,
which it is in *nearly* all cases. If we are smart about it we can also use
this to let the tracing framework be specific about what MSRs to trace, since
some MSRs are frequently accessed, but many are set at startup and then
rarely, if ever, touched.


(*) There are actually two more instructions:

        RDMSRLIST
        WRMSRLIST

... which are bulk versions of RDMSR and WRMSRNS respectively. They can be
useful to save and restore entire groups of MSRs in one shot, such as
performance counter configurations. By architecturally allowing the memory
operations and MSR operations to operate asynchronously, they give some of the
pipeline benefits of the immediate MSR operations without requiring the MSR
set to have been set at compile time or code to be dynamically generated.

However, they expose an entirely different programming model, whereas the
immediate- and -NS instruction choices can be entirely hidden at the C level.




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.