Re: [Xen-devel] fpu_taskswitch adjustment proposal
>>> On 15.06.12 at 19:06, Keir Fraser <keir@xxxxxxx> wrote:
> On 15/06/2012 17:03, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
>
>> While pv-ops so far doesn't care to eliminate the two trap-and-
>> emulate CR0 accesses from the asm/xor.h save/restore
>> operations, the legacy x86-64 kernel uses conditional clts()/stts()
>> for this purpose. While looking into whether to extend this to the
>> newly added (in 3.5) AVX operations there I realized that this isn't
>> fully correct: It doesn't properly nest inside a kernel_fpu_begin()/
>> kernel_fpu_end() pair (as it will stts() at the end no matter what
>> the original state of CR0.TS was).
>>
>> In order to not introduce completely new hypercalls to overcome
>> this (fpu_taskswitch isn't really extensible on its own), I'm
>> considering to add a new VM assist, altering the fpu_taskswitch
>> behavior so that it would return an indicator whether any change
>> to the virtual CR0.TS was done. That way, the kernel can
>> implement a true save/restore cycle here.
>
> It should be possible for the guest kernel to track its CR0.TS setting
> shouldn't it? It gets modified only via a few paravirt hooks, and implicitly
> cleared on #NM.

While this works fine and is fairly non-intrusive, it's not really
buying us much: The non-SSE variants of the xor code will still
outperform the SSE one on both 32- and 64-bit x86 (and the MMX
ones on 32-bit).

So I now instead wonder why linux-2.6.18's
include/asm-x86_64/mach-xen/asm/xor.h doesn't simply forward to
asm-generic/xor.h, or at least doesn't override the template
selection logic. Jun, do you recall whether this was perhaps done
without any actual measurements when the port to x86-64 was
first done?

Jan
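
For illustration only, the nesting problem from the quoted proposal and the guest-side tracking Keir suggests can be modeled in plain C roughly as below. This is a simplified sketch, not the actual kernel code: virt_cr0_ts, model_fpu_taskswitch() and the xor_* helpers are hypothetical names, and the real fpu_taskswitch hypercall is only simulated by a function that updates a shadow flag and counts calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical guest-side shadow of the virtual CR0.TS bit. */
static bool virt_cr0_ts = true;

/* Counts simulated trips into the hypervisor. */
static unsigned int hypercalls;

/* Stand-in for the fpu_taskswitch hypercall (assumption: it only
 * sets or clears the virtual CR0.TS). */
static void model_fpu_taskswitch(bool set)
{
    virt_cr0_ts = set;
    hypercalls++;
}

/* Pattern described as not nesting properly: TS is set at the end
 * no matter what its state was on entry. */
static void xor_block_unconditional(void)
{
    model_fpu_taskswitch(false);   /* clts() */
    /* ... SSE xor work would go here ... */
    model_fpu_taskswitch(true);    /* stts(), ignoring the prior state */
}

/* Save half of a true save/restore cycle: clear TS only if it was
 * set, and report the previous state to the caller. */
static bool xor_save(void)
{
    bool was_set = virt_cr0_ts;

    if (was_set)
        model_fpu_taskswitch(false);
    return was_set;
}

/* Restore half: set TS again only if it was set on entry. */
static void xor_restore(bool was_set)
{
    assert(!virt_cr0_ts);          /* TS must be clear while the FPU is in use */
    if (was_set)
        model_fpu_taskswitch(true);
}

/* Nesting-safe variant built from the two helpers above. */
static void xor_block_nested_safe(void)
{
    bool was_set = xor_save();

    /* ... SSE xor work would go here ... */
    xor_restore(was_set);
}
```

If virt_cr0_ts is already clear on entry (as it would be inside a kernel_fpu_begin()/kernel_fpu_end() pair), the unconditional variant leaves it set on exit, while the save/restore variant leaves it clear and makes no simulated hypercall at all.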