Re: [PATCH 4/6] x86/idle: Implement a new MWAIT IPI-elision algorithm
On 03/07/2025 10:01 am, Jan Beulich wrote:
> On 02.07.2025 16:41, Andrew Cooper wrote:
>> In order to elide IPIs, we must be able to identify whether a target CPU is in
>> MWAIT at the point it is woken up. i.e. the store to wake it up must also
>> identify the state.
>>
>> Create a new in_mwait variable beside __softirq_pending, so we can use a
>> CMPXCHG to set the softirq while also observing the status safely. Implement
>> an x86 version of arch_pend_softirq() which does this.
>>
>> In mwait_idle_with_hints(), advertise in_mwait, with an explanation of
>> precisely what it means. X86_BUG_MONITOR can be accounted for simply by not
>> advertising in_mwait.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> ---
>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> CC: Anthony PERARD <anthony.perard@xxxxxxxxxx>
>> CC: Michal Orzel <michal.orzel@xxxxxxx>
>> CC: Julien Grall <julien@xxxxxxx>
>> CC: Stefano Stabellini <sstabellini@xxxxxxxxxx>
>>
>> This is modelled after Linux's TIF_NEED_RESCHED (single bit equivalent of all
>> of __softirq_pending), and TIF_POLLING_NRFLAG (arch-neutral "in_mwait").
>>
>> In Linux, they're both in the same flags field, which adds complexity. In
>> Xen, __softirq_pending is already unsigned long for everything other than
>> x86, so adding an arch-neutral "in_mwait" would need wider changes.
> Why would the flag need to be arch-neutral?
Because it's not about mwait; it's about the ability to notice the
rising edge of TIF_NEED_RESCHED, and is implemented by more than just
x86 in Linux.
>> +     */
>> +    alternative_io("movb $1, %[in_mwait]",
>> +                   "", X86_BUG_MONITOR,
>> +                   [in_mwait] "=m" (stat->in_mwait));
>>
>>      monitor(this_softirq_pending, 0, 0);
>>      smp_mb();
> Unlike alternative(), alternative_io() has no memory clobber. To the compiler
> the two resulting asm() therefore have no dependency operand-wise[1]. The
> sole ordering criteria then is that they're both volatile asm()-s. It not
> being explicitly said (anywhere that I could find) that volatile guarantees
> such ordering, I wonder if we wouldn't better have an operand-based
> dependency between the two. Otoh it may well be that we rely on such ordering
> elsewhere already, in which case having one more instance probably would be
> okay(ish).
>
> [1] It's actually worse than that, I think: monitor() also doesn't specify
> the location as a (memory) input.
The GCC bugzilla has plenty of statements that volatile asm()s (which have
survived optimisation passes) can't be reordered relative to one another.
>
>> @@ -452,6 +466,10 @@ void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
>>          mwait(eax, ecx);
>>          spec_ctrl_exit_idle(info);
>>      }
>> +
>> +    alternative_io("movb $0, %[in_mwait]",
>> +                   "", X86_BUG_MONITOR,
>> +                   [in_mwait] "=m" (stat->in_mwait));
> This doesn't strictly need to be an alternative, does it? We can save a
> memory write in the buggy case, but that likely makes at most a marginal
> difference.
It doesn't *need* to be an alternative.  It could be:

    if ( !cpu_bug_monitor )
        ACCESS_ONCE(stat->in_mwait) = true;

but getting rid of the branch is an advantage too.
>> --- a/xen/arch/x86/include/asm/hardirq.h
>> +++ b/xen/arch/x86/include/asm/hardirq.h
>> @@ -5,7 +5,19 @@
>> #include <xen/types.h>
>>
>> typedef struct {
>> -    unsigned int __softirq_pending;
>> +    /*
>> +     * The layout is important.  Any CPU can set bits in __softirq_pending,
>> +     * but in_mwait is a status bit owned by the CPU.  softirq_mwait_raw must
>> +     * cover both, and must be in a single cacheline.
>> +     */
>> +    union {
>> +        struct {
>> +            unsigned int __softirq_pending;
>> +            bool in_mwait;
>> +        };
>> +        uint64_t softirq_mwait_raw;
>> +    };
> To guard against someone changing e.g. __softirq_pending to unsigned long
> while ignoring the comment, how about adding a BUILD_BUG_ON() checking both
> parts of the union are the same size (which of course would require naming
> either the union field within the containing struct or its struct member)?
That turns out to be very hard because of the definition of
softirq_pending() being common. More below.
>
>> @@ -9,4 +11,36 @@
>> #define HVM_DPCI_SOFTIRQ (NR_COMMON_SOFTIRQS + 4)
>> #define NR_ARCH_SOFTIRQS 5
>>
>> +/*
>> + * Ensure softirq @nr is pending on @cpu.  Return true if an IPI can be
>> + * skipped, false if the IPI cannot be skipped.
>> + *
>> + * We use a CMPXCHG covering both __softirq_pending and in_mwait, in order to
>> + * set softirq @nr while also observing in_mwait in a race-free way.
>> + */
>> +static always_inline bool arch_pend_softirq(unsigned int nr, unsigned int cpu)
>> +{
>> +    uint64_t *ptr = &irq_stat[cpu].softirq_mwait_raw;
>> +    uint64_t old, new;
>> +    unsigned int softirq = 1U << nr;
>> +
>> +    old = ACCESS_ONCE(*ptr);
>> +
>> +    do {
>> +        if ( old & softirq )
>> +            /* Softirq already pending, nothing to do. */
>> +            return true;
>> +
>> +        new = old | softirq;
>> +
>> +    } while ( (old = cmpxchg(ptr, old, new)) != new );
> Don't you mean
>
> } while ( (new = cmpxchg(ptr, old, new)) != old );
>
> here
No. I'm pretty sure I don't.
> (with new latched ahead of the loop and old set from new inside it),
> like we have it elsewhere when we avoid the use of yet another variable,
> e.g. in inc_linear_{entries,uses}()?
Eww, no.  Having new and old backwards like that is borderline
obfuscation, and is unnecessary cognitive complexity for the reader.
>
>> +    /*
>> +     * We have caused the softirq to become pending.  If in_mwait was set, the
>> +     * target CPU will notice the modification and act on it.
>> +     */
>> +    return new & (1UL << 32);
> Hmm, this requires the layout to allow for even less re-arrangement than the
> comment ahead of the union says: You strictly require in_mwait to follow
> __softirq_pending immediately, and the (assembly) write to strictly write 1
> into the field. Could I talk you into at least
>
> return new & (1UL << (offsetof(..., in_mwait) * 8));
>
> ? This way the field being inspected would also be mentioned in the access
> itself (i.e. become grep-able beyond the mentioning in the comment). And I
> sincerely dislike hard-coded literal numbers when they (relatively) easily
> can be expressed another way.
The nice way to do this would be a named union and a proper field, but
that doesn't work because of how softirq_pending() is declared.
I suppose I can use an offsetof() expression.
~Andrew