
Re: [Xen-devel] [PATCH v8 03/10] xen/arm: inflight irqs during migration

On Wed, 23 Jul 2014, Ian Campbell wrote:
> On Wed, 2014-07-23 at 15:45 +0100, Stefano Stabellini wrote:
> > On Thu, 17 Jul 2014, Ian Campbell wrote:
> > > On Thu, 2014-07-10 at 19:13 +0100, Stefano Stabellini wrote:
> > > > We need to take special care when migrating irqs that are already
> > > > inflight from one vcpu to another. See "The effect of changes to an
> > > > GICD_ITARGETSR", part of chapter 4.3.12 of the ARM Generic Interrupt
> > > > Controller Architecture Specification.
> > > > 
> > > > The main issue from the Xen point of view is that the lr_pending and
> > > > inflight lists are per-vcpu. The lock we take to protect them is also
> > > > per-vcpu.
> > > > 
> > > > In order to avoid issues, if the irq is still lr_pending, we can
> > > > immediately move it to the new vcpu for injection.
> > > > 
> > > > Otherwise if it is in a GICH_LR register, set a new flag
> > > > GIC_IRQ_GUEST_MIGRATING, so that we can recognize when we receive an irq
> > > > while the previous one is still inflight (given that we are only dealing
> > > > with hardware interrupts here, it just means that its LR hasn't been
> > > > cleared yet on the old vcpu).  If GIC_IRQ_GUEST_MIGRATING is set, we
> > > > only set GIC_IRQ_GUEST_QUEUED and interrupt the old vcpu. To know which
> > > > one is the old vcpu, we introduce a new field to pending_irq, called
> > > > vcpu_migrate_from.
> > > > When clearing the LR on the old vcpu, we take special care of injecting
> > > > the interrupt into the new vcpu. To do that we need to release the old
> > > > vcpu lock before taking the new vcpu lock.
> > > 
> > > I still think this is an awful lot of complexity and scaffolding for
> > > something which is rare on the scale of things and which could be almost
> > > trivially handled by requesting a maintenance interrupt for one EOI and
> > > completing the move at that point.
> > 
> > Requesting a maintenance interrupt is not as simple as it looks:
> > - ATM we don't know how to edit a living GICH_LR register, we would have
> > to add a function for that;
> That doesn't sound like a great hardship. Perhaps you can reuse the
> setter function anyhow.
> > - if we request a maintenance interrupt then we also need to EOI the
> > physical IRQ, that is something that we don't do anymore (unless
> > PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI but that is another matter). We would
> > need to understand that some physical irqs need to be EOI'ed by Xen and
> > some don't.
> I was thinking the maintenance interrupt handler would take care of
> this.

In that case we would have to resurrect the code to loop over the
GICH_EISR* registers from maintenance_interrupt.
Anything can be done, I am just pointing out that this alternative
approach is not as cheap as it might sound.
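
To give a rough idea of the kind of loop that would have to come back, here
is a minimal sketch. It is not the code that was removed from
maintenance_interrupt: eisr_pending_lrs and NR_LRS are hypothetical names, and
the two GICH_EISR* registers are assumed to have been read into a single
64-bit value, one bit per list register.

```c
#include <stdint.h>

/* Hypothetical upper bound: one status bit per list register. */
#define NR_LRS 64

/*
 * Collect the indices of the list registers whose EOI status bit is set.
 * 'eisr' is assumed to hold both GICH_EISR* words combined, bit n = LR n.
 * Returns the number of indices written to out[].
 */
static int eisr_pending_lrs(uint64_t eisr, int out[NR_LRS])
{
    int n = 0;

    for (int lr = 0; lr < NR_LRS; lr++) {
        if (eisr & ((uint64_t)1 << lr))
            out[n++] = lr;
    }
    return n;
}
```

Each returned index would then need its LR cleared and, in the scheme being
discussed, the physical irq EOI'ed and the migration completed.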

> > Also requesting a maintenance interrupt would only guarantee that the
> > vcpu is interrupted as soon as possible, but it won't save us from
> > having to introduce GIC_IRQ_GUEST_MIGRATING.
> I didn't expect GIC_IRQ_GUEST_MIGRATING to go away. If nothing else you
> would need it to flag to the maintenance IRQ that it needs to EOI and
> complete the migration.
> >  It would only let us skip
> > adding vcpu_migrate_from and the 5 lines of code in
> > vgic_vcpu_inject_irq.
> And the code in gic_update_one_lr I think, and most of
> vgic_vcpu_inject-cpu.
> And more than the raw lines of code the
> *complexity* would be much lower.

I don't know about the complexity. Getting rid of maintenance interrupts
completely is one thing. Getting rid of them in most cases but not all is
another. Having to deal both with not having them and with having them
increases complexity, at least in my view. It is simpler to think that you
have them all the time or never.

In any case replying to this email made me realize that there is indeed
a lot of unneeded code in this patch, especially given that writing to
the physical ITARGETSR is guaranteed to affect pending (non active)
irqs.  From the ARM ARM:

"Software can write to an GICD_ITARGETSR at any time. Any change to a CPU
targets field value:


Has an effect on any pending interrupts. This means:
 - adding a CPU interface to the target list of a pending interrupt makes
   that interrupt pending on that CPU interface
 - removing a CPU interface from the target list of a pending interrupt
   removes the pending state of that interrupt on that CPU interface."
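
As a toy illustration of the rule quoted above (a model only, not GIC or Xen
code; struct irq_model, pending_on and write_targets are made-up names, and at
most 8 CPU interfaces are assumed), the pending state seen by each CPU
interface simply follows the targets field:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one interrupt; not real GIC or Xen state. */
struct irq_model {
    uint8_t targets;  /* CPU targets field: bit n = CPU interface n */
    bool pending;     /* interrupt is pending and not active */
};

/* Is the interrupt pending on the given CPU interface (cpu < 8)? */
static bool pending_on(const struct irq_model *irq, unsigned int cpu)
{
    return irq->pending && (irq->targets & (1u << cpu)) != 0;
}

/* Model of a write to the CPU targets field of this interrupt. */
static void write_targets(struct irq_model *irq, uint8_t new_targets)
{
    irq->targets = new_targets;
}
```

Retargeting a pending interrupt from CPU 0 to CPU 1 with write_targets makes
pending_on false for CPU 0 and true for CPU 1, with no extra bookkeeping.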

I think we can rely on this behaviour. Thanks to patch #5 we know that
we'll be receiving the second physical irq on the old cpu and from then
on the next ones always on the new cpu. So we won't need
vcpu_migrate_from, the complex ordering of MIGRATING and QUEUED, or the