
Re: [Xen-devel] xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online CPU

On 16/05/17 20:02, Boris Ostrovsky wrote:
On 05/16/2017 01:15 PM, Anoob Soman wrote:

In our XenServer testing, I have seen cases where the dom0 kernel soft
locks up when we boot 50 Windows VMs together.

The following is a brief description of what causes the softlockup
detection code to kick in. An HVM domain boot generates around
200K (evtchn:qemu-dm xen-dyn) interrupts in a short period of time.
All of these evtchn:qemu-dm interrupts are bound to VCPU 0 until
irqbalance sees them and moves them to a different VCPU. In the case of
XenServer, irqbalance runs every 10 seconds, which means it usually
doesn't see these bursts of interrupts and doesn't rebalance them,
so all evtchn:qemu-dm interrupts end up being processed by VCPU0.
This causes VCPU0 to spend most of its time processing hardirqs and very
little time on softirqs. Moreover, in XenServer dom0 PREEMPTION is
disabled, which means VCPU0 never runs the watchdog (process context),
triggering the softlockup detection code to panic.

One way to solve this problem would be to bind each evtchn:qemu-dm to
the next online VCPU, which would spread hardirq processing evenly
across VCPUs. Later, irqbalance can rebalance evtchn:qemu-dm if required.

I am not sure if this is a sensible thing to do. I have pseudo code
(not the final patch) which tries to bind only
IOCTL_EVTCHN_BIND_INTERDOMAIN (used by qemu-dm) to the next online VCPU.

I think it's reasonable. One thing to watch for though is VCPU offlining
while binding.

Let me see if I can manage to take care of this scenario.

  /* Rebind an evtchn so that it gets delivered to a specific cpu */
 -static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
 +int xen_rebind_evtchn_to_cpu(int evtchn, unsigned tcpu)
  {
      struct evtchn_bind_vcpu bind_vcpu;
 -    int evtchn = evtchn_from_irq(irq);
      int masked;

      if (!VALID_EVTCHN(evtchn))
 @@ -1339,6 +1338,12 @@ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)

      return 0;
 +}
 +
 +static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
 +{
 +    return xen_rebind_evtchn_to_cpu(evtchn_from_irq(irq), tcpu);
 +}

  static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest,
                  bool force)
diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c
index 7efd1cb..bd34d3d 100644
--- a/drivers/xen/evtchn.c
+++ b/drivers/xen/evtchn.c
@@ -471,6 +471,7 @@ static long evtchn_ioctl(struct file *file,
          struct ioctl_evtchn_bind_interdomain bind;
          struct evtchn_bind_interdomain bind_interdomain;
+        static unsigned int selected_cpu;

          rc = -EFAULT;
          if (copy_from_user(&bind, uarg, sizeof(bind)))
@@ -489,8 +490,13 @@ static long evtchn_ioctl(struct file *file,

          rc = evtchn_bind_to_user(u, bind_interdomain.local_port);
-        if (rc == 0)
+        if (rc == 0) {
              rc = bind_interdomain.local_port;
+            selected_cpu = cpumask_next(selected_cpu, cpu_online_mask);
+            if (selected_cpu >= nr_cpu_ids)
+                selected_cpu = cpumask_first(cpu_online_mask);
+            xen_rebind_evtchn_to_cpu(rc, selected_cpu);
Can you do proper assignment *instead of* binding to CPU0, as opposed to
rebinding the event channel later? Otherwise you are making an extra
hypercall.
Sure, makes sense.

You also probably want to look at the current IRQ affinity mask instead
of cpu_online_mask.
Sure, I will send a patch out with the changes, based on your comments.

+        }

diff --git a/include/xen/events.h b/include/xen/events.h
index 88da2ab..f442ca5 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -58,6 +58,7 @@ void evtchn_put(unsigned int evtchn);

  void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector);
  void rebind_evtchn_irq(int evtchn, int irq);
+int xen_rebind_evtchn_to_cpu(int evtchn, unsigned tcpu);

  static inline void notify_remote_via_evtchn(int port)

"selected_cpu" needs to be protected, but I would like to avoid taking
a lock. One way to avoid taking a lock (before
xen_rebind_evtchn_to_cpu()) would be to use
"local_port % num_present_cpus()" or "smp_processor_id()" as an index
into cpu_online_mask.
The latter sounds better to me --- there is a chance that the interrupts
will be processed locally.


Did you mean using smp_processor_id() as the index?

What do you guys suggest?


