
[Xen-devel] schedule() vs softirqs



PowerPC's timer interrupt (called the decrementer) is a one-shot timer,
not periodic. When it goes off, entering the hypervisor, we first set it
very high so it won't interrupt hypervisor code, then
raise_softirq(TIMER_SOFTIRQ). We know that timer_softirq_action() will
then call reprogram_timer(), which will reprogram the decrementer to the
appropriate value.
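
For concreteness, here is a minimal user-space sketch of that path. It is
a model, not the actual Xen code: the decrementer is just a variable,
DEC_FAR_FUTURE is a made-up "very high" value, and the argument lists of
timer_softirq_action() and reprogram_timer() are simplified assumptions.

```c
#include <stdint.h>

#define DEC_FAR_FUTURE 0x7fffffffu  /* hypothetical "very high" value */

enum { TIMER_SOFTIRQ = 0, SCHEDULE_SOFTIRQ = 1 };

static uint32_t decrementer;      /* stands in for the hardware register */
static uint32_t softirq_pending;  /* one bit per softirq */

static void raise_softirq(unsigned int nr)
{
    softirq_pending |= 1u << nr;
}

/* Simplified: takes the number of ticks until the next deadline. */
static void reprogram_timer(uint32_t ticks_until_deadline)
{
    decrementer = ticks_until_deadline;
}

/* Exception entry: the one-shot timer has fired. */
static void decrementer_interrupt(void)
{
    decrementer = DEC_FAR_FUTURE;   /* keep it quiet while in the hypervisor */
    raise_softirq(TIMER_SOFTIRQ);   /* defer the real work */
}

/* Softirq handler: the only place the real deadline is restored. */
static void timer_softirq_action(uint32_t next_deadline)
{
    softirq_pending &= ~(1u << TIMER_SOFTIRQ);
    reprogram_timer(next_deadline);
}
```

The point of the sketch is that the decrementer only gets a sensible
value again if timer_softirq_action() actually runs.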

We recently uncovered a bug on PowerPC where if a timer tick arrives
just inside schedule() while interrupts are still enabled, the
decrementer is never reprogrammed to that appropriate value. This is
because once inside schedule(), we never handle any subsequent softirqs:
we call context_switch() and resume the guest.
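
The sequence can be reproduced in a small user-space model. This is only
a sketch under stated assumptions: longjmp() stands in for "resume the
guest", the tick is injected by hand inside schedule(), and
run_hypervisor_once(), guest_entry, and the deadline of 1000 are invented
for illustration.

```c
#include <setjmp.h>
#include <stdint.h>

#define DEC_FAR_FUTURE 0x7fffffffu  /* hypothetical "very high" value */

enum { TIMER_SOFTIRQ = 0, SCHEDULE_SOFTIRQ = 1 };

static uint32_t decrementer;
static uint32_t softirq_pending;
static jmp_buf guest_entry;         /* models "resume the guest" */

static void raise_softirq(unsigned int nr)
{
    softirq_pending |= 1u << nr;
}

static void decrementer_interrupt(void)
{
    decrementer = DEC_FAR_FUTURE;
    raise_softirq(TIMER_SOFTIRQ);
}

static void timer_softirq_action(void)
{
    decrementer = 1000;             /* arbitrary next deadline */
}

static void context_switch(void)
{
    longjmp(guest_entry, 1);        /* never returns to do_softirq() */
}

static void schedule(void)
{
    decrementer_interrupt();        /* tick arrives, interrupts still enabled */
    context_switch();
}

static void do_softirq(void)
{
    while (softirq_pending) {
        if (softirq_pending & (1u << TIMER_SOFTIRQ)) {
            softirq_pending &= ~(1u << TIMER_SOFTIRQ);
            timer_softirq_action();
        } else if (softirq_pending & (1u << SCHEDULE_SOFTIRQ)) {
            softirq_pending &= ~(1u << SCHEDULE_SOFTIRQ);
            schedule();
        }
    }
}

/* Returns 1 if the guest is resumed with TIMER_SOFTIRQ still pending. */
static int run_hypervisor_once(void)
{
    decrementer = 1000;
    softirq_pending = 0;
    raise_softirq(SCHEDULE_SOFTIRQ);
    if (setjmp(guest_entry) == 0)
        do_softirq();
    return (softirq_pending >> TIMER_SOFTIRQ) & 1;
}
```

In this model the guest resumes with TIMER_SOFTIRQ pending and the
decrementer still parked at DEC_FAR_FUTURE, which is the failure
described above.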

I believe the tick problem affects periodic timers (i.e. x86) as well,
though less drastically. With a CPU-bound guest, it would result in
dropped ticks: TIMER_SOFTIRQ is set but not handled, and when the timer
expires again the already-pending bit is simply set again. In other
cases, it would result in some timer ticks being delivered very late. I
don't know what effect this might have on guests, particularly those
with sensitive time-slewing code.
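
The dropped-tick case follows from the pending state being a single bit
rather than a counter. A minimal sketch, assuming a simplified one-entry
do_softirq() and a made-up ticks_handled counter:

```c
#include <stdint.h>

enum { TIMER_SOFTIRQ = 0 };

static uint32_t softirq_pending;    /* one bit per softirq, not a counter */
static unsigned int ticks_handled;  /* hypothetical bookkeeping */

static void raise_softirq(unsigned int nr)
{
    softirq_pending |= 1u << nr;
}

static void timer_softirq_action(void)
{
    ticks_handled++;
}

static void do_softirq(void)
{
    if (softirq_pending & (1u << TIMER_SOFTIRQ)) {
        softirq_pending &= ~(1u << TIMER_SOFTIRQ);
        timer_softirq_action();
    }
}

/* Two periodic ticks arrive before softirqs are next processed. */
static unsigned int two_ticks_one_pass(void)
{
    ticks_handled = 0;
    raise_softirq(TIMER_SOFTIRQ);   /* first tick: bit set, not handled */
    raise_softirq(TIMER_SOFTIRQ);   /* second tick: bit already set */
    do_softirq();
    return ticks_handled;
}
```

Two timer expirations result in one handler invocation, so one tick is
lost from the handler's point of view.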

In addition, when SCHEDULE_SOFTIRQ is set, all "greater" softirqs, i.e.
those with higher numbers (including NMI), will not be handled until the
next hypervisor invocation.

This is pretty anti-social behavior for a softirq handler. One solution
would be to have schedule() *not* call context_switch() directly, but
rather set a flag (or a "next vcpu" pointer) and return. That would
allow other softirqs to be processed normally. Once do_softirq() returns
to assembly, we can check the "next vcpu" pointer and call
context_switch().
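
Roughly, the proposal looks like the following user-space sketch. The
names next_vcpu, current_vcpu, vcpus[], and exception_return() are
hypothetical, the late tick is injected by hand inside schedule(), and
exception_return() stands in for the assembly that regains control after
do_softirq().

```c
#include <stddef.h>
#include <stdint.h>

#define DEC_FAR_FUTURE 0x7fffffffu  /* hypothetical "very high" value */

enum { TIMER_SOFTIRQ = 0, SCHEDULE_SOFTIRQ = 1 };

struct vcpu { int id; };

static struct vcpu vcpus[2] = { { 0 }, { 1 } };
static struct vcpu *current_vcpu = &vcpus[0];
static struct vcpu *next_vcpu;      /* hypothetical "next vcpu" pointer */

static uint32_t decrementer;
static uint32_t softirq_pending;

static void raise_softirq(unsigned int nr)
{
    softirq_pending |= 1u << nr;
}

static void decrementer_interrupt(void)
{
    decrementer = DEC_FAR_FUTURE;
    raise_softirq(TIMER_SOFTIRQ);
}

static void timer_softirq_action(void)
{
    decrementer = 1000;             /* arbitrary next deadline */
}

static void context_switch(struct vcpu *next)
{
    current_vcpu = next;
}

/* Only records the decision; does not switch. */
static void schedule(void)
{
    decrementer_interrupt();        /* same late tick as in the bug */
    next_vcpu = (current_vcpu == &vcpus[0]) ? &vcpus[1] : &vcpus[0];
}

static void do_softirq(void)
{
    while (softirq_pending) {
        if (softirq_pending & (1u << TIMER_SOFTIRQ)) {
            softirq_pending &= ~(1u << TIMER_SOFTIRQ);
            timer_softirq_action();
        } else if (softirq_pending & (1u << SCHEDULE_SOFTIRQ)) {
            softirq_pending &= ~(1u << SCHEDULE_SOFTIRQ);
            schedule();
        }
    }
}

/* Stand-in for the exception exit path written in assembly. */
static void exception_return(void)
{
    do_softirq();                   /* all softirqs drain first */
    if (next_vcpu != NULL) {
        context_switch(next_vcpu);
        next_vcpu = NULL;
    }
}
```

Because schedule() returns, the loop in do_softirq() gets another pass,
handles the late TIMER_SOFTIRQ, and the decrementer is reprogrammed
before the switch happens.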

(This solution would enable a PowerPC optimization as well: we would
like to lazily save non-volatile registers. We can't do this unless the
exception handler regains control from do_softirq() before
context_switch() is called.)

Thoughts?

-- 
Hollis Blanchard
IBM Linux Technology Center




 

