
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt





From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
To: Justin Acker <ackerj67@xxxxxxxxx>
Cc: "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>; "boris.ostrovsky@xxxxxxxxxx" <boris.ostrovsky@xxxxxxxxxx>
Sent: Wednesday, September 2, 2015 8:53 AM
Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

On Tue, Sep 01, 2015 at 11:09:38PM +0000, Justin Acker wrote:
>
>      From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
>  To: Justin Acker <ackerj67@xxxxxxxxx>
> Cc: "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>; boris.ostrovsky@xxxxxxxxxx
>  Sent: Tuesday, September 1, 2015 4:56 PM
>  Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
>   
> On Tue, Sep 01, 2015 at 05:39:46PM +0000, Justin Acker wrote:
> > Taking this to the dev list from users.
> >
> > Is there a way to force or enable pirq delivery to a set of cpus, as opposed to a device being assigned a single pirq, so that its interrupt load can be distributed across multiple cpus? I believe the device drivers do support multiple queues when run natively, without the Dom0 loaded. The device in question is the xhci_hcd driver, whose I/O transfers seem to slow down when the Dom0 is loaded. The behavior seems to carry over to the DomU if passthrough is enabled. I found some similar threads, but most relate to Ethernet controllers. I tried some of the x2apic and x2apic_phys dom0 kernel arguments, but none distributed the pirqs. Based on my reading about IRQs under Xen, I think pinning the pirqs to cpu0 is done to avoid an interrupt storm. I tried irqbalance, and when configured/adjusted it will move individual pirqs between cpus, but it will not spread a single pirq's interrupts across several of them.
>
> Yes. You can do it with smp affinity:
>
> https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt
> Yes, this does allow assigning a specific interrupt to a single cpu, but it will not spread the interrupt load across a defined group of cpus or all of them. Is it possible to define a range of CPUs, or spread the interrupt load for a device across all cpus, as happens with a native kernel without the Dom0 loaded?

It should be. Did you try giving it a mask that puts the interrupts on all the CPUs
(0xff for all eight)?
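
For example, something along these lines (a sketch; IRQ 78 is the xhci_hcd
pirq from your output below, and on an 8-CPU box 0xff is the all-CPU mask):

    # allow delivery of IRQ 78 on all eight CPUs (one bit per CPU)
    echo ff > /proc/irq/78/smp_affinity
    # read the mask back to confirm it stuck
    cat /proc/irq/78/smp_affinity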
>
> I don't follow the "behavior seems to pass through to the DomU if pass through is enabled" part?
> The device interrupts are limited to a single pirq if the device is used directly in the Dom0. If the device is passed through to a DomU - i.e. the xhci_hcd controller - then the DomU cannot spread the interrupt load across the cpus in the VM.

Why? How are you seeing this? The method by which you use smp affinity should
be exactly the same.

And it looks to me like the device has a single interrupt when booting baremetal as well, right?

On baremetal, the interrupt is delivered across all 8 cpus as noted below (IRQ 27), compared to a single cpu (IRQ 78) in the Dom0:

            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
baremetal:
 27:   17977230     628258   44247270     120391 1597809883   14440991  152189328      73322  IR-PCI-MSI-edge      xhci_hcd
Dom0 (or DomU with passthrough):
 78:      82521          0          0          0          0          0          0          0  xen-pirq-msi       xhci_hcd
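
Those two rows are just the xhci_hcd lines from /proc/interrupts on each boot, e.g.:

    # per-CPU delivery counts for the xhci_hcd interrupt
    grep xhci_hcd /proc/interrupts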

So the issue here is that you want interrupt delivery to be spread across
all of the CPUs. The smp_affinity mask should do it. Did you try modifying it by hand (you may
want to kill irqbalance when you do this, just to make sure it does not overwrite your values)?
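
Roughly (a sketch, assuming a systemd host; the service name may differ on
your distro):

    # stop irqbalance so it cannot rewrite the mask behind your back
    systemctl stop irqbalance
    # set the affinity by hand, then verify it
    echo ff > /proc/irq/78/smp_affinity
    cat /proc/irq/78/smp_affinity_list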



Yes, this would be great if there were a way to spread the affinity across all cpus, or a specified set of CPUs, similar to the native kernel behavior. I guess it would be spread across a set of pirqs, or all of them? With irqbalance disabled, I did try adjusting the interrupt affinity manually (i.e. echo ff > /proc/irq/78/smp_affinity). The interrupt will move to whichever single CPU I specify (0 through 7). Without setting the affinity manually, it does look like it's mapped to all cpus by default: with the Dom0 loaded, cat /proc/irq/78/smp_affinity returns ff, but the interrupt never appears to be scheduled on more than one cpu.
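
I am checking this by watching the per-CPU columns while generating USB I/O, e.g.:

    # the counts should climb in more than one column if delivery is spread
    watch -n1 'grep xhci_hcd /proc/interrupts'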






>
> >
> >
> >
> > With irqbalance enabled in Dom0:
>
> What version? There was a bug in it where it would never distribute the IRQs properly
> across the CPUs.
> irqbalance version 1.0.7.
>
> Boris (CC-ed) might remember the upstream patch that made this work properly?
>
>
> >
> >            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7     
> >  76:      11304          0     149579          0          0          0          0          0  xen-pirq-msi       0000:00:1f.2
> >  77:       1243          0          0      35447          0          0          0          0  xen-pirq-msi       radeon
> >  78:      82521          0          0          0          0          0          0          0  xen-pirq-msi       xhci_hcd
> >  79:         23          0          0          0          0          0          0          0  xen-pirq-msi       mei_me
> >  80:         11          0          0          0          0        741          0          0  xen-pirq-msi       em1
> >  81:        350          0          0          0       1671          0          0          0  xen-pirq-msi       iwlwifi
> >  82:        275          0          0          0          0          0          0          0  xen-pirq-msi       snd_hda_intel
> >
> > With the native 3.19 kernel (i.e. without Dom0), same system as in the first message:
> >
> > # cat /proc/interrupts
> >            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7     
> >   0:         33          0          0          0          0          0          0          0  IR-IO-APIC-edge      timer
> >   8:          0          0          0          0          0          0          1          0  IR-IO-APIC-edge      rtc0
> >   9:         20          0          0          0          0          1          1          1  IR-IO-APIC-fasteoi   acpi
> >  16:         15          0          8          1          4          1          1          1  IR-IO-APIC  16-fasteoi   ehci_hcd:usb3
> >  18:     703940       5678    1426226       1303    3938243     111477     757871        510  IR-IO-APIC  18-fasteoi   ath9k
> >  23:         11          2          3          0          0         17          2          0  IR-IO-APIC  23-fasteoi   ehci_hcd:usb4
> >  24:          0          0          0          0          0          0          0          0  DMAR_MSI-edge      dmar0
> >  25:          0          0          0          0          0          0          0          0  DMAR_MSI-edge      dmar1
> >  26:      20419       1609      26822        567      62281       5426      14928        395  IR-PCI-MSI-edge      0000:00:1f.2
> >  27:   17977230     628258   44247270     120391 1597809883   14440991  152189328      73322  IR-PCI-MSI-edge      xhci_hcd
> >  28:        563          0          0          0          1          0          6          0  IR-PCI-MSI-edge      i915
> >  29:         14          0          0          4          2          4          0          0  IR-PCI-MSI-edge      mei_me
> >  30:      39514       1744      60339        157     129956      19702      72140         83  IR-PCI-MSI-edge      eth0
> >  31:          3          0          0          1         54          0          0          2  IR-PCI-MSI-edge      snd_hda_intel
> >  32:      28145        284      53316         63     139165       4410      25760         27  IR-PCI-MSI-edge      eth1-rx-0
> >  33:       1032         43       2392          5       1797        265       1507         20  IR-PCI-MSI-edge      eth1-tx-0
> >  34:          0          1          0          0          0          1          2          0  IR-PCI-MSI-edge      eth1
> >  35:          5          0          0         12        148          6          2          1  IR-PCI-MSI-edge      snd_hda_intel
> >
> >
> > The USB controller is an Intel C210:
> >
> > 00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04) (prog-if 30 [XHCI])
> >     Subsystem: Dell Device 053e
> >     Flags: bus master, medium devsel, latency 0, IRQ 78
> >     Memory at f7f20000 (64-bit, non-prefetchable) [size=64K]
> >     Capabilities: [70] Power Management version 2
> >     Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
> >     Kernel driver in use: xhci_hcd
> >     Kernel modules: xhci_pci
> >      On Tuesday, September 1, 2015 11:50 AM, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:
> >   
> >
> >  On Tue, 2015-09-01 at 13:56 +0000, Justin Acker wrote:
> > > Thanks Ian,
> > >
> > > I appreciate the explanation. I believe the device drivers do support
> > > multiple queues when run natively without the Dom0 loaded. The device in
> > > question is the xhci_hcd driver for which I/O transfers seem to be slowed
> > > when the Dom0 is loaded. The behavior seems to pass through to the DomU
> > > if pass through is enabled. I found some similar threads, but most relate
> > > to Ethernet controllers. I tried some of the x2apic and x2apic_phys dom0
> > > kernel arguments, but none distributed the pirqs. Based on the reading
> > > relating to IRQs for Xen, I think pinning the pirqs to cpu0 is done to
> > > avoid an I/O storm. I tried IRQ balance and when configured/adjusted it
> > > will balance individual pirqs, but not multiple interrupts.
> > >
> > > Is there a way to force or enable pirq delivery to a set of cpus as you
> > > mentioned above or omit a single device from being assigned a PIRQ so
> > > that its interrupt can be distributed across all cpus?
> >
> > A PIRQ is the way an interrupt is exposed to a PV guest, without it there
> > would be no interrupt at all.
> >
> > I'm afraid I'm out of my depth WRT how x86/MSIs and Xen x86/PV pirqs
> > interact, in particular WRT configuring which set of CPUs can have the IRQ
> > delivered.
> >
> > If no one else chimes in soon I'd suggest taking this to the dev list, at
> > the very least someone who knows what they are talking about (i.e. other
> > than me) might be able to help.
> >
> > Ian.
> >
> >
> >
> > 
>
>
>


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

