
Re: [Xen-devel] [PATCH v8 03/10] xen/arm: inflight irqs during migration

On Wed, 2014-07-23 at 15:45 +0100, Stefano Stabellini wrote:
> On Thu, 17 Jul 2014, Ian Campbell wrote:
> > On Thu, 2014-07-10 at 19:13 +0100, Stefano Stabellini wrote:
> > > We need to take special care when migrating irqs that are already
> > > inflight from one vcpu to another. See "The effect of changes to a
> > > GICD_ITARGETSR", part of chapter 4.3.12 of the ARM Generic Interrupt
> > > Controller Architecture Specification.
> > > 
> > > The main issue from the Xen point of view is that the lr_pending and
> > > inflight lists are per-vcpu. The lock we take to protect them is also
> > > per-vcpu.
> > > 
> > > In order to avoid issues, if the irq is still lr_pending, we can
> > > immediately move it to the new vcpu for injection.
> > > 
> > > Otherwise if it is in a GICH_LR register, set a new flag
> > > GIC_IRQ_GUEST_MIGRATING, so that we can recognize when we receive an irq
> > > while the previous one is still inflight (given that we are only dealing
> > > with hardware interrupts here, it just means that its LR hasn't been
> > > cleared yet on the old vcpu).  If GIC_IRQ_GUEST_MIGRATING is set, we
> > > only set GIC_IRQ_GUEST_QUEUED and interrupt the old vcpu. To know which
> > > one is the old vcpu, we introduce a new field to pending_irq, called
> > > vcpu_migrate_from.
> > > When clearing the LR on the old vcpu, we take special care of injecting
> > > the interrupt into the new vcpu. To do that we need to release the old
> > > vcpu lock before taking the new vcpu lock.
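
[The mechanism the patch description above outlines could be sketched
very roughly as below. All structures and helpers here are simplified
illustrative stubs, not Xen's actual code; only the names pending_irq,
GIC_IRQ_GUEST_QUEUED, GIC_IRQ_GUEST_MIGRATING and vcpu_migrate_from
come from the patch itself.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
    GIC_IRQ_GUEST_QUEUED    = 1u << 0,  /* pending, waiting for an LR */
    GIC_IRQ_GUEST_MIGRATING = 1u << 1,  /* still in an LR on the old vcpu */
};

struct vcpu { int id; };                /* stand-in for Xen's struct vcpu */

struct pending_irq {
    unsigned int status;                /* bitmask of the flags above */
    bool in_lr;                         /* currently loaded in a GICH_LR */
    struct vcpu *vcpu_migrate_from;     /* old vcpu, set while migrating */
};

/* Guest rewrites GICD_ITARGETSR while the irq is inflight. */
static void migrate_irq(struct pending_irq *p, struct vcpu *old,
                        struct vcpu *new)
{
    if ( !p->in_lr )
    {
        /* Still lr_pending: safe to move to the new vcpu immediately
         * (list_del + re-inject on the new vcpu in the real code). */
        return;
    }
    /* In a GICH_LR on the old vcpu: remember where it came from and
     * defer the move until the old vcpu clears the LR. */
    p->status |= GIC_IRQ_GUEST_MIGRATING;
    p->vcpu_migrate_from = old;
}

/* A new physical interrupt arrives; returns the vcpu to kick. */
static struct vcpu *inject_irq(struct pending_irq *p, struct vcpu *target)
{
    p->status |= GIC_IRQ_GUEST_QUEUED;
    if ( p->status & GIC_IRQ_GUEST_MIGRATING )
        return p->vcpu_migrate_from;    /* interrupt the old vcpu instead */
    return target;
}
```
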
> > 
> > I still think this is an awful lot of complexity and scaffolding for
> > something which is rare on the scale of things and which could be almost
> > trivially handled by requesting a maintenance interrupt for one EOI and
> > completing the move at that point.
> Requesting a maintenance interrupt is not as simple as it looks:
> - ATM we don't know how to edit a live GICH_LR register; we would have
> to add a function for that;

That doesn't sound like a great hardship. Perhaps you can reuse the
setter function anyhow.

> - if we request a maintenance interrupt then we also need to EOI the
> physical IRQ, that is something that we don't do anymore (unless
PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI is set, but that is another matter). We would
> need to understand that some physical irqs need to be EOI'ed by Xen and
> some don't.

I was thinking the maintenance interrupt handler would take care of the
EOI.

> Also requesting a maintenance interrupt would only guarantee that the
> vcpu is interrupted as soon as possible, but it won't save us from
> having to introduce GIC_IRQ_GUEST_MIGRATING.

I didn't expect GIC_IRQ_GUEST_MIGRATING to go away. If nothing else you
would need it to flag to the maintenance IRQ handler that it needs to
EOI and complete the migration.

>  It would only let us skip
> adding vcpu_migrate_from and the 5 lines of code in
> vgic_vcpu_inject_irq.

And the code in gic_update_one_lr I think, and most of
vgic_vcpu_inject_cpu. And beyond the raw line count, the *complexity*
would be much lower.

I think it could work like this:

On a write to ITARGETSR, if the interrupt is active then you set
MIGRATING and update the LR to request a maintenance IRQ.

If another interrupt occurs while this one is active then you mark it
pending just like normal (you don't care if it is migrating or not).

In the maintenance irq handler you check the migrating bit; if it is
clear then there is nothing to do. If it is set then you know the old
cpu (it's the current one): clear the LR, EOI the interrupt and write
the physical ITARGETSR (in some order, perhaps not that one). Then SGI
the new processor if the interrupt is pending.

If there have been multiple migrations then you don't care, you only
care about the current target right now as you finish it off.
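
That scheme could be sketched, very roughly, as below. Every function
and field name here is an illustrative stub standing in for whatever
the real Xen helpers would be (the stubs just record what was done);
only the flag names come from the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GIC_IRQ_GUEST_QUEUED    (1u << 0)
#define GIC_IRQ_GUEST_MIGRATING (1u << 1)

struct vcpu { int id; };

struct pending_irq {
    unsigned int status;
    bool lr_cleared, eoi_done;      /* record what the handler did */
    struct vcpu *hw_target;         /* last value "written" to ITARGETSR */
    struct vcpu *sgi_sent_to;       /* SGI target, if any */
};

/* Recording stubs for the hardware operations. */
static void clear_lr(struct pending_irq *p)         { p->lr_cleared = true; }
static void gic_eoi_irq(struct pending_irq *p)      { p->eoi_done = true; }
static void set_hw_itargetsr(struct pending_irq *p,
                             struct vcpu *v)        { p->hw_target = v; }
static void send_sgi(struct pending_irq *p,
                     struct vcpu *v)                { p->sgi_sent_to = v; }

/* Runs on the old vcpu when the requested maintenance interrupt fires. */
static void maint_irq_handler(struct pending_irq *p, struct vcpu *new_target)
{
    if ( !(p->status & GIC_IRQ_GUEST_MIGRATING) )
        return;                       /* migrating bit clear: nothing to do */

    clear_lr(p);                      /* we are on the old cpu: drop the LR */
    gic_eoi_irq(p);                   /* EOI the physical interrupt */
    set_hw_itargetsr(p, new_target);  /* repoint the hardware ITARGETSR */
    p->status &= ~GIC_IRQ_GUEST_MIGRATING;

    if ( p->status & GIC_IRQ_GUEST_QUEUED )
        send_sgi(p, new_target);      /* re-raise on the new processor */
}
```

Intermediate migrations don't matter here: only the current target is
consulted when the handler finishes the move off.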

> Overall I thought that this approach would be easier.
> > In order to avoid a simple maint interrupt you are adding code to the
> > normal interrupt path and a potential SGI back to another processor (and
> > I hope I'm misreading this but it looks like an SGI back again to finish
> > off?). That's got to be way more costly to the first interrupt on the
> > new VCPU than the cost of a maintenance IRQ on the old one.
> > 
> > I think avoiding maintenance interrupts in general is a worthy goal, but
> > there are times when they are the most appropriate mechanism.
> To be clear the case we are talking about is when the guest kernel wants
> to migrate an interrupt that is currently inflight in a GICH_LR register.
> Requesting a maintenance interrupt for it would only make sure that the
> old vcpu is interrupted soon after the EOI.

Yes, that's the point though. When you get this notification then you
can finish off the migration pretty much trivially with no worrying
about other inflight interrupt, pending stuff due to lazy handling of LR
cleanup etc, you just update the h/w state and you are done.

>  Without it, we need to
> identify which one is the old vcpu (in case of 2 consecutive migrations),
> I introduced vcpu_migrate_from for that, and kick it when receiving the
> second interrupt if the first is still inflight. Exactly and only the
> few lines of code you quoted below.
> It is one SGI more in the uncommon case when we receive a second

Two more I think?

> physical interrupt without the old vcpu being interrupted yet.  In the
> vast majority of cases the old vcpu has already been interrupted by
> something else or by the second irq itself

> (we haven't changed affinity yet)

In patch #5 you will start doing so though, meaning that the extra
SGI(s) will become more frequent, if not the common case for high
frequency IRQs.

But it's not really so much about the SGIs as about all the juggling of state
(in the code as well as in our heads) about all the corner cases which
arise due to the lazy clearing of LRs done in the normal case (which I'm
fine with BTW) and ordering of the various bit tests etc. I just think
it could be done in the simplest way possible with no real overhead
because these migration events are in reality rare.

>  and there is no need for the additional SGI.


Xen-devel mailing list