[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/4] xen/events: Clear cpu_evtchn_mask before resuming



On 28/04/15 16:52, Boris Ostrovsky wrote:
> When a guest is resumed, the hypervisor may change event channel
> assignments. If this happens and the guest uses 2-level events it
> is possible for the interrupt to be claimed by wrong VCPU since
> cpu_evtchn_mask bits may be stale. This can happen even though
> evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
> is passed in is not necessarily the original one (from pre-migration
> times) but instead is freshly allocated during resume and so any
> information about which CPU the channel was bound to is lost.
> 
> Thus we should clear the mask during resume.
> 
> We also need to make sure that bits for xenstore and console channels
> are set when these two subsystems are resumed. While rebind_evtchn_irq()
> (which is invoked for both of them on a resume) calls irq_set_affinity(),
> the latter will in fact postpone setting affinity until handling the
> interrupt. But because cpu_evtchn_mask will have bits for these two cleared
> we won't be able to take the interrupt.
> 
> Setting IRQ_MOVE_PCNTXT flag for the two irqs avoids this problem by
> allowing to set affinity immediately, which is safe for event-channel-based
> interrupts.
[...]
> --- a/drivers/tty/hvc/hvc_xen.c
> +++ b/drivers/tty/hvc/hvc_xen.c
> @@ -533,6 +533,7 @@ static int __init xen_hvc_init(void)
>  
>               info = vtermno_to_xencons(HVC_COOKIE);
>               info->irq = bind_evtchn_to_irq(info->evtchn);
> +             irq_set_status_flags(info->irq, IRQ_MOVE_PCNTXT);
>       }
>       if (info->irq < 0)
>               info->irq = 0; /* NO_IRQ */
> diff --git a/drivers/xen/events/events_2l.c b/drivers/xen/events/events_2l.c
> index 5db43fc..7dd4631 100644
> --- a/drivers/xen/events/events_2l.c
> +++ b/drivers/xen/events/events_2l.c
> @@ -345,6 +345,15 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id)
>       return IRQ_HANDLED;
>  }
>  
> +static void evtchn_2l_resume(void)
> +{
> +     int i;
> +
> +     for_each_online_cpu(i)
> +             memset(per_cpu(cpu_evtchn_mask, i), 0, sizeof(xen_ulong_t) *
> +                             EVTCHN_2L_NR_CHANNELS/BITS_PER_EVTCHN_WORD);
> +}
> +
>  static const struct evtchn_ops evtchn_ops_2l = {
>       .max_channels      = evtchn_2l_max_channels,
>       .nr_channels       = evtchn_2l_max_channels,
> @@ -356,6 +365,7 @@ static const struct evtchn_ops evtchn_ops_2l = {
>       .mask              = evtchn_2l_mask,
>       .unmask            = evtchn_2l_unmask,
>       .handle_events     = evtchn_2l_handle_events,
> +     .resume            = evtchn_2l_resume,
>  };
>  
>  void __init xen_evtchn_2l_init(void)
> diff --git a/drivers/xen/xenbus/xenbus_comms.c 
> b/drivers/xen/xenbus/xenbus_comms.c
> index fdb0f33..30203d1 100644
> --- a/drivers/xen/xenbus/xenbus_comms.c
> +++ b/drivers/xen/xenbus/xenbus_comms.c
> @@ -231,6 +231,7 @@ int xb_init_comms(void)
>               }
>  
>               xenbus_irq = err;
> +             irq_set_status_flags(xenbus_irq, IRQ_MOVE_PCNTXT);

IRQ_MOVE_PCNTXT means "Interrupt can be migrated from process context"
which doesn't really sound relevant to me here?

Thomas Glexnier is really not happy with mis-use of IRQ APIs.

From the commit log the evtchn_2l_resume() fucntion that's added sounds
like it fixes the problem on its own?

David

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.