
Re: [Xen-devel] [PATCHv1] x86: don't schedule when handling #NM exception



On 03/10/2014 09:17 AM, David Vrabel wrote:
> math_state_restore() is called from the #NM exception handler.  It may
> do a GFP_KERNEL allocation (in init_fpu()) which may schedule.
> 
> Change this allocation to GFP_ATOMIC, but leave all the other callers
> of init_fpu() or fpu_alloc() using GFP_KERNEL.

And what the [Finnish] do you do if GFP_ATOMIC fails?

> do_group_exit() will also call schedule() so replace the call with
> force_sig(SIGKILL, tsk) instead.
> 
> Scheduling in math_state_restore() is particularly bad in Xen PV
> guests since Xen clears CR0.TS before raising the #NM exception (in
> the expectation that the #NM handler always clears TS).  If task A is
> descheduled and task B is scheduled, task B may end up with CR0.TS
> unexpectedly clear, and any FPU instructions will not raise #NM and
> will corrupt task A's FPU state instead.

Yes, we know Xen is completely broken in this respect.
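
For readers following along, the race described in the quoted text can be
sketched as a small user-space toy model.  This is only an illustration
under stated assumptions, not kernel code: struct lazy_cpu, NO_OWNER,
lazy_context_switch(), xen_deliver_nm() and task_b_traps() are all
hypothetical names, and the model assumes a lazy context switch only sets
TS when the outgoing task owns the FPU.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Toy model of one CPU: a CR0.TS flag and the id of the task that
 * currently owns the live FPU registers.  All names are hypothetical. */
struct lazy_cpu {
    bool ts;     /* models CR0.TS: set means the next FPU insn raises #NM */
    int  owner;  /* task id owning the FPU registers, or NO_OWNER */
};

/* Lazy switch (assumption): TS is only set when the outgoing task owns
 * the FPU; otherwise the code trusts that TS is already set. */
static void lazy_context_switch(struct lazy_cpu *c, int prev, int next)
{
    (void)next;
    if (c->owner == prev) {
        /* saving prev's registers is elided in this model */
        c->owner = NO_OWNER;
        c->ts = true;
    }
}

/* Xen PV, as described in the thread: TS is cleared before the guest's
 * #NM handler runs. */
static void xen_deliver_nm(struct lazy_cpu *c)
{
    c->ts = false;
}

/* Returns true if task B's first FPU instruction would raise #NM. */
bool task_b_traps(bool handler_sleeps)
{
    enum { TASK_A = 1, TASK_B = 2 };
    struct lazy_cpu c = { .ts = true, .owner = NO_OWNER };

    /* Task A executes an FPU instruction with TS set, so #NM is raised. */
    xen_deliver_nm(&c);

    if (!handler_sleeps) {
        /* Handler runs to completion: A now owns the FPU, TS stays clear. */
        c.owner = TASK_A;
    }
    /* Either the handler slept before claiming the FPU, or A was
     * preempted later; in both cases the scheduler picks task B. */
    lazy_context_switch(&c, TASK_A, TASK_B);

    return c.ts;
}
```

In this model task_b_traps(false) reports that B traps as expected, while
task_b_traps(true) reports that B runs with TS clear, which is the
situation the patch description is worried about.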

Anyway, I have a patchset from Sarah Newman which I have been reviewing
privately so far.  It looks good and should be posted publicly.  The
holdup has not been Sarah's code but a combination of my bandwidth and
trying to get some preexisting bugs in the eagerfpu code dealt with,
which Suresh Siddha fortunately stepped up to do and for which we now
have a solution.

Sarah's patchset switches Xen PV to use eagerfpu unconditionally, which
removes the dependency on #NM and is the right thing to do.
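
As a rough illustration of why eager switching sidesteps the problem,
here is a user-space toy model, not the actual eagerfpu code.  The names
struct eager_task, struct eager_cpu, eager_switch() and
eager_a_state_after_b_runs() are hypothetical, and the FPU state is
reduced to a single int.  The assumption is simply that every context
switch saves the outgoing task's registers and restores the incoming
task's, so nothing depends on TS or #NM.

```c
#include <assert.h>

/* Hypothetical toy types: FPU state is modelled as one integer. */
struct eager_task {
    int fpu_state;   /* state saved in memory while the task is off-CPU */
};

struct eager_cpu {
    int fpu_regs;    /* live FPU register contents */
};

/* Eager switch (assumption): always save prev and restore next,
 * with no reliance on CR0.TS or the #NM handler. */
static void eager_switch(struct eager_cpu *c,
                         struct eager_task *prev,
                         struct eager_task *next)
{
    prev->fpu_state = c->fpu_regs;
    c->fpu_regs = next->fpu_state;
}

/* Runs A, then B, then A again, and returns what A sees in the
 * registers once it is back on the CPU. */
int eager_a_state_after_b_runs(void)
{
    struct eager_task a = { .fpu_state = 111 };
    struct eager_task b = { .fpu_state = 222 };
    struct eager_cpu c = { .fpu_regs = a.fpu_state };

    c.fpu_regs += 1;            /* task A does some FPU work: 112 */
    eager_switch(&c, &a, &b);   /* A is descheduled, B's state is loaded */
    c.fpu_regs = 999;           /* task B overwrites the registers */
    eager_switch(&c, &b, &a);   /* B is descheduled, A's state is reloaded */

    return c.fpu_regs;
}
```

In this model task A gets back exactly the value it left in the
registers, regardless of what B did in between.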

Sarah, could you post the latest patchset to LKML so it can be publicly
reviewed?  I'm sorry for the slow response time on my end.

        -hpa

