|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86/hvm: Improve hvm_set_guest_pat() code generation again
On 16.12.2022 21:53, Andrew Cooper wrote:
> On 10/08/2022 3:06 pm, Jan Beulich wrote:
>> On 10.08.2022 15:36, Andrew Cooper wrote:
>>> From: Edwin Török <edvin.torok@xxxxxxxxxx>
>>>
>>> Following on from cset 9ce0a5e207f3 ("x86/hvm: Improve hvm_set_guest_pat()
>>> code generation"), and the discovery that Clang/LLVM makes some especially
>>> disastrous code generation for the loop at -O2
>>>
>>> https://github.com/llvm/llvm-project/issues/54644
>>>
>>> Edvin decided to remove the loop entirely by fully vectorising it. This is
>>> substantially more efficient than the loop, and rather harder for a typical
>>> compiler to mess up.
>>>
>>> Signed-off-by: Edwin Török <edvin.torok@xxxxxxxxxx>
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> The main downside being that changing the code to fit in a new PAT
>> type will now be harder.
>
> When was the last PAT type change?
>
> Trick question. Never, because PAT hasn't changed since it was
> introduced 24 years ago in the Pentium III.
>
> I really don't think we're in danger of needing to change this logic.
One way to look at things, sure.
>> I wonder in particular whether with that
>> in mind it wouldn't be better to express the check not in terms of
>> relations, but in terms of set / clear bits ("bits 3-7 clear AND
>> (bit 2 set OR bit 1 clear)"). The code kind of does so already, but
>> the variable names don't reflect that (and would hence need to
>> change in such an event).
>
> That would reduced clarity.
>
> The bits being set or cleared are trivial for any developer, given the
> particularly basic RHS expressions.
>
> The constant names are what relate the bit patterns to the description
> of the problem.
Again - one way to look at things. Plus, with Demi's series now also in
mind, what's done here is moving us in exactly the opposite direction.
Is this hot enough a function to warrant that?
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -302,24 +302,43 @@ void hvm_get_guest_pat(struct vcpu *v, u64 *guest_pat)
>>> *guest_pat = v->arch.hvm.pat_cr;
>>> }
>>>
>>> -int hvm_set_guest_pat(struct vcpu *v, uint64_t guest_pat)
>>> +/*
>>> + * MSR_PAT takes 8 uniform fields, each of which must be a valid
>>> architectural
>>> + * memory type (0, 1, 4-7). This is a fully vectorised form of the
>>> + * 8-iteration loop over bytes looking for PAT_TYPE_* constants.
>> While grep-ing for PAT_TYPE_ will hit this line, I think we want
>> every individual type to also be found here when grep-ing for one.
>> The actual values aren't going to change, but perhaps the beast
>> way to do so would still be by way of BUILD_BUG_ON()s.
>
> Why? What does that solve or improve?
>
> "pat" is the thing people are going to be looking for if they're
> actually trying to find this logic.
>
> (And I bring this patch up specifically after reviewing Demi's series,
> where PAT_TYPE_* changes to X86_MT_* but "pat" is still the useful
> search term IMO.)
I don't think "PAT" is a good thing to grep for when trying to find uses
of particular memory types. Go grep for "_WP", "_WC", or "_WT" - you'll
find not overly many hits, and with the false positives filtered out
you'll have a good overview of which of these types is used in how many
places.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |