Re: [Xen-devel] fpu_taskswitch adjustment proposal
>>> On 15.06.12 at 19:06, Keir Fraser <keir@xxxxxxx> wrote:
> On 15/06/2012 17:03, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
>> While pv-ops so far doesn't care to eliminate the two trap-and-
>> emulate CR0 accesses from the asm/xor.h save/restore
>> operations, the legacy x86-64 kernel uses conditional clts()/stts()
>> for this purpose. While looking into whether to extend this to the
>> newly added (in 3.5) AVX operations there I realized that this isn't
>> fully correct: It doesn't properly nest inside a kernel_fpu_begin()/
>> kernel_fpu_end() pair (as it will stts() at the end no matter what
>> the original state of CR0.TS was).
>> In order to not introduce completely new hypercalls to overcome
>> this (fpu_taskswitch isn't really extensible on its own), I'm
>> considering adding a new VM assist, altering the fpu_taskswitch
>> behavior so that it would return an indicator whether any change
>> to the virtual CR0.TS was done. That way, the kernel can
>> implement a true save/restore cycle here.
> It should be possible for the guest kernel to track its CR0.TS setting,
> shouldn't it? It gets modified only via a few paravirt hooks, and implicitly
> cleared on #NM.
While tracking CR0.TS in the guest works fine and is fairly
non-intrusive, it's not really buying us much: the non-SSE variants of
the xor code will still
outperform the SSE one on both 32- and 64-bit x86 (and the
MMX ones on 32-bit). So I now instead wonder why linux-2.6.18's
include/asm-x86_64/mach-xen/asm/xor.h doesn't simply
forward to asm-generic/xor.h, or at least doesn't override the
template selection logic. Jun, do you recall whether this was
perhaps done without any actual measurements when the port
to x86-64 was first done?
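
For reference, the guest-side tracking discussed above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the actual kernel code: fake_fpu_taskswitch() is a hypothetical stand-in for the real hypercall, shadow_ts is a hypothetical per-CPU shadow that is assumed to be updated on every path that changes CR0.TS (the paravirt hooks and the #NM handler), and the xor_fpu_save()/xor_fpu_restore() names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated hypervisor-side virtual CR0.TS and a hypercall counter. */
static bool virt_cr0_ts;
static unsigned int hypercalls;

/* Hypothetical stand-in for the fpu_taskswitch hypercall:
 * nonzero sets the virtual CR0.TS, zero clears it. */
static void fake_fpu_taskswitch(int set)
{
    virt_cr0_ts = (set != 0);
    hypercalls++;
}

/* Guest-side shadow of CR0.TS. Assumption: every path that changes TS
 * (paravirt hooks, #NM handler) keeps this in sync. */
static bool shadow_ts;

/* Clear TS only if it is currently set; report the previous state. */
static bool xor_fpu_save(void)
{
    bool was_set = shadow_ts;

    if (was_set) {
        fake_fpu_taskswitch(0);
        shadow_ts = false;
    }
    return was_set;
}

/* Set TS again only if it was set when the matching save ran. */
static void xor_fpu_restore(bool was_set)
{
    if (was_set) {
        fake_fpu_taskswitch(1);
        shadow_ts = true;
    }
}

/* Nested use: the inner pair must leave TS clear for the outer section. */
static unsigned int demo_nested(void)
{
    virt_cr0_ts = shadow_ts = true;
    hypercalls = 0;

    bool outer = xor_fpu_save();   /* TS was set: one hypercall */
    bool inner = xor_fpu_save();   /* TS already clear: no hypercall */

    xor_fpu_restore(inner);        /* must not set TS here */
    assert(!virt_cr0_ts);
    xor_fpu_restore(outer);        /* restores the original state */
    assert(virt_cr0_ts);

    return hypercalls;
}
```

With the shadow in place, the inner pair issues no hypercall at all and the outer pair issues exactly two, which is the nesting behavior the unconditional stts() lacks.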