[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [PATCH] x86: prefer RDTSCP in rdtsc_ordered()
If available, its use is supposed to be cheaper than LFENCE+RDTSC, and is virtually guaranteed to be cheaper than MFENCE+RDTSC. Unlike in rdtsc() use 64-bit local variables, eliminating the need for the compiler to emit a zero-extension insn for %eax (that's a cheap MOV, yet still pointless to have). Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> --- a/xen/arch/x86/include/asm/msr.h +++ b/xen/arch/x86/include/asm/msr.h @@ -108,18 +108,30 @@ static inline uint64_t rdtsc(void) static inline uint64_t rdtsc_ordered(void) { - /* - * The RDTSC instruction is not ordered relative to memory access. - * The Intel SDM and the AMD APM are both vague on this point, but - * empirically an RDTSC instruction can be speculatively executed - * before prior loads. An RDTSC immediately after an appropriate - * barrier appears to be ordered as a normal load, that is, it - * provides the same ordering guarantees as reading from a global - * memory location that some other imaginary CPU is updating - * continuously with a time stamp. - */ - alternative("lfence", "mfence", X86_FEATURE_MFENCE_RDTSC); - return rdtsc(); + uint64_t low, high, aux; + + /* + * The RDTSC instruction is not ordered relative to memory access. + * The Intel SDM and the AMD APM are both vague on this point, but + * empirically an RDTSC instruction can be speculatively executed + * before prior loads. An RDTSC immediately after an appropriate + * barrier appears to be ordered as a normal load, that is, it + * provides the same ordering guarantees as reading from a global + * memory location that some other imaginary CPU is updating + * continuously with a time stamp. + * + * RDTSCP, otoh, "does wait until all previous instructions have + * executed and all previous loads are globally visible" (SDM) / + * "forces all older instructions to retire before reading the + * timestamp counter" (APM) + */ + alternative_io_2("lfence; rdtsc", + "mfence; rdtsc", X86_FEATURE_MFENCE_RDTSC, + "rdtscp", X86_FEATURE_RDTSCP, + ASM_OUTPUT2("=a" (low), "=d" (high), "=c" (aux)), + /* no inputs */); + + return (high << 32) | low; } #define __write_tsc(val) wrmsrl(MSR_IA32_TSC, val)
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |