
Re: [Xen-devel] [PATCH 2/3] x86: use CLFLUSHOPT when available



>>> On 10.02.16 at 16:03, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 10/02/16 12:57, Jan Beulich wrote:
>> Also drop an unnecessary va adjustment in the code being touched.
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>>
>> --- a/xen/arch/x86/flushtlb.c
>> +++ b/xen/arch/x86/flushtlb.c
>> @@ -139,10 +139,12 @@ unsigned int flush_area_local(const void
>>               c->x86_clflush_size && c->x86_cache_size && sz &&
>>               ((sz >> 10) < c->x86_cache_size) )
>>          {
>> -            va = (const void *)((unsigned long)va & ~(sz - 1));
>> +            alternative(ASM_NOP3, "sfence", X86_FEATURE_CLFLUSHOPT);
> 
> Why separate?  This would be better in the lower alternative(), with one
> single nop making up the difference in length.  That way, processors
> without CLFLUSHOPT don't suffer the 1 cycle instruction decode stall
> from the redundant rex prefix.

Why would we want the fence inside the loop? A single fence is
sufficient for the entire flush.
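
To illustrate the placement, here is a minimal, portable sketch of the
loop shape. It is not the actual flushtlb.c code: fence_stub(),
flush_line_stub(), flush_range_sketch() and the has_clflushopt
parameter are hypothetical stand-ins (the real code patches the
instructions in via alternative() on X86_FEATURE_CLFLUSHOPT), and the
stubs merely count calls so the structure can be checked anywhere.

```c
#include <stddef.h>

/* Hypothetical counters; they only exist to make the sketch observable. */
static unsigned int nr_fences, nr_line_flushes;

/* Stand-in for "sfence" (assumption: no real instruction is issued). */
static void fence_stub(void)
{
    nr_fences++;
}

/* Stand-in for the per-line clflush / clflushopt (assumption, as above). */
static void flush_line_stub(const void *p)
{
    (void)p;
    nr_line_flushes++;
}

/*
 * Flush sz bytes starting at va, one cache line at a time.
 * has_clflushopt models the X86_FEATURE_CLFLUSHOPT check.
 */
static void flush_range_sketch(const void *va, size_t sz,
                               size_t line_size, int has_clflushopt)
{
    size_t i;

    if ( has_clflushopt )
        fence_stub();   /* one fence for the entire flush, outside the loop */

    for ( i = 0; i < sz; i += line_size )
        flush_line_stub((const char *)va + i);
}
```

With a 4096-byte range and 64-byte lines this issues 64 line flushes
but only one fence; moving the fence into the loop body would issue 64
fences for the same range.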

Also, if we're worried about the REX decode, this could easily be a
NOP instead; I'm just not certain which of the two ends up with less
decode overhead.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel