
Re: [Xen-devel] Issue with MSI in a HVM domU with several passed through PCI devices



On Tue, Jun 26, 2012 at 2:59 PM, Stefano Stabellini
<stefano.stabellini@xxxxxxxxxxxxx> wrote:
> On Tue, 26 Jun 2012, Rolu wrote:
>> On Mon, Jun 25, 2012 at 1:38 PM, Stefano Stabellini
>> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
>> > On Mon, 25 Jun 2012, Jan Beulich wrote:
>> >> >>> On 24.06.12 at 04:21, Rolu <rolu@xxxxxxxx> wrote:
>> >> > On Wed, Jun 20, 2012 at 6:03 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>> >> >> At the same time, adding logging to the guest kernel would
>> >> >> be nice, to see what value it actually writes (in a current
>> >> >> kernel this would be in __write_msi_msg()).
>> >> >>
>> >> >
>> >> > Turns out that msg->data here is also 0x4300, so it seems the guest
>> >> > kernel is producing these values. I caused it to make a stack trace
>> >> > and this pointed back to xen_hvm_setup_msi_irqs. This function uses
>> >> > the macro XEN_PIRQ_MSI_DATA, which evaluates to 0x4300. It checks the
>> >> > current data field and if it isn't equal to the macro it uses
>> >> > xen_msi_compose_msg to make a new message, but that function just sets
>> >> > the data field of the message to XEN_PIRQ_MSI_DATA - so, 0x4300. This
>> >> > then gets passed to __write_msi_msg and that's that. There are no
>> >> > other writes through __write_msi_msg (except for the same thing for
>> >> > other devices).
>> >> >
>> >> > The macro XEN_PIRQ_MSI_DATA contains a part (3 << 8) which ends up
>> >> > decoded as the delivery mode, so it seems the kernel is intentionally
>> >> > setting it to 3.
>> >>
>> >> So that can never have worked properly afaict. Stefano, the
>> >> code as it is currently - using literal (3 << 8) - is clearly bogus.
>> >> Your original commit at least had a comment saying that the
>> >> reserved delivery mode encoding is intentional here, but that
>> >> comment got lost with the later introduction of XEN_PIRQ_MSI_DATA.
>> >> In any case - the cooperation with qemu apparently doesn't
>> >> work, as the reserved encoding should never make it through
>> >> to the hypervisor. Could you explain what the intention here
>> >> was?
>> >>
>> >> And regardless of anything, can the literal numbers please be
>> >> replaced by proper manifest constants - the "8" here already
>> >> has MSI_DATA_DELIVERY_MODE_SHIFT, and giving the 3 a
>> >> proper symbolic would permit locating where this is being (or
>> >> really, as it doesn't appear to work supposed to be) consumed
>> >> in qemu, provided it uses the same definition (i.e. that one
>> >> should go into one of the public headers).
>> >
>> > The (3 << 8) is unimportant. The delivery mode chosen is "reserved"
>> > because notifications are not supposed to be delivered as MSI anymore.
>> >
>> > This is what should happen:
>> >
>> > 1) Linux configures the device with a 0 vector number and the pirq number
>> > in the address field;
>> >
>> > 2) QEMU notices a vector number of 0 and reads the pirq number from the
>> > address field, passing it to xc_domain_update_msi_irq;
>> >
>> > 3) Xen assigns the given pirq to the physical MSI;
>> >
>> > 4) The guest issues an EVTCHNOP_bind_pirq hypercall;
>> >
>> > 5) Xen sets the pirq as "IRQ_PT";
>> >
>> > 6) When Xen tries to inject the MSI into the guest, hvm_domain_use_pirq
>> > returns true so Xen calls send_guest_pirq instead.
>> >
>> >
>> > Obviously 6) is not happening. hvm_domain_use_pirq is:
>> >
>> > is_hvm_domain(d) && pirq && pirq->arch.hvm.emuirq != IRQ_UNBOUND
>> >
>> > My guess is that emuirq is IRQ_UNBOUND when it should be IRQ_PT (see
>> > above).
>>
>> This appears to be true. I added logging to hvm_pci_msi_assert in
>> xen/drivers/passthrough/io.c and it indicates that
>> pirq->arch.hvm.emuirq is -1 (while IRQ_PT is -2) every time right
>> before an unsupported delivery mode message.
>>
>> I also log pirq->pirq but I found that most of the time I can't find
>> this value anywhere else (I'm not sure how to interpret the value,
>> though). For example, in my last try:
>>
>> * I get an unsupported delivery mode error for pirq->pirq 55, 54 and
>> 53. The vast majority are for 54.
>> * I have logging in map_domain_emuirq_pirq in xen/arch/x86/irq.c. It
>> gets called with pirq 19, 20, 21, 22, 23, 52, 51, 50, 16, 17, 55.
>> Never for 54 or 53. It also gets called with pirq=49,emuirq=23 once
>> but complains it's already mapped.
>> * I have logging in evtchn_bind_pirq in xen/common/event_channel.c. It
>> gets called with bind->pirq 16, 17, 51, 55, 49, 29 (twice), 21, 19,
>> 22, 52, 48, 47. Also never 54 or 53.
>> * map_domain_emuirq_pirq is called from evtchn_bind_pirq for pirq 16, 17, 55.
>> * The qemu log mentions pirq 35, 36 and 37
>>
>> It seems pirq values don't always mean the same thing? Is it a
>> coincidence that 55 occurs almost everywhere, or is something going
>> wrong with the other two values (53 and 54 versus 16 and 17)?
>>
>> I have three MSI capable devices passed through to the domU, and I do
>> see groups of three distinct pirqs in the data above - just not the
>> same ones in every place I look.
>>
>> > So maybe the guest is not issuing an EVTCHNOP_bind_pirq hypercall
>> > (__startup_pirq doesn't get called), or Xen is erroring out in
>> > map_domain_emuirq_pirq.
>>
>> evtchn_bind_pirq gets called, though I'm not sure if it is with the
>> right data.
>>
>> map_domain_emuirq_pirq always gets past the checks in the top half
>> (i.e. up to the line /* do not store emuirq mappings for pt devices
>> */), except for one time with pirq=49,emuirq=23 where it finds they
>> are already mapped.
>> It is called three times with an emuirq of -2, for pirq 16, 17 and 55.
>> This implies their info->arch.hvm.emuirq is also set to -2 (haven't
>> directly logged that but it's the only assignment there).
>>
>> Interestingly, I get an unsupported delivery mode error for pirq 55
>> where my logging says pirq->arch.hvm.emuirq is -1, *after*
>> map_domain_emuirq_pirq was called for pirq 55 and emuirq -2.
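
To make the check under discussion concrete, here is a minimal sketch of
the hvm_domain_use_pirq condition quoted above. The struct and function
names are hypothetical stand-ins, not the real Xen types; only the
condition itself and the values -1 (IRQ_UNBOUND) and -2 (IRQ_PT) come
from this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Values as reported in this thread: emuirq logged as -1, IRQ_PT is -2. */
#define IRQ_UNBOUND (-1)
#define IRQ_PT      (-2)

/* Hypothetical stand-in for Xen's struct pirq; only the fields used here. */
struct pirq_model {
    int pirq;    /* pirq number, e.g. 55 */
    int emuirq;  /* models pirq->arch.hvm.emuirq */
};

/*
 * Models: is_hvm_domain(d) && pirq && pirq->arch.hvm.emuirq != IRQ_UNBOUND
 * Returns true when the interrupt should go through send_guest_pirq
 * instead of being injected as an MSI.
 */
static bool use_pirq_model(bool is_hvm, const struct pirq_model *p)
{
    return is_hvm && p != NULL && p->emuirq != IRQ_UNBOUND;
}
```

With emuirq still at -1 the predicate is false, so the MSI injection path
is taken and the reserved delivery mode is rejected; with emuirq at -2 it
would be true.
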
>
> Looking back at your QEMU logs, it seems that pt_msi_setup is not
> called (or it is not called at the right time), otherwise you should
> get:
>
> pt_msi_setup requested pirq = %d
>
> in your logs.
> Could you try disabling msitranslate? You can do that by adding
>
> pci_msitranslate=0
>
> to your VM config file.

I tried that, but it didn't work.

> If that works, probably this (untested) QEMU patch could fix your problem:
>

I appreciate the help.

I applied the patch anyway just to see what would happen (I had to
rename a few variables, dev versus d), but it didn't help. It also
breaks pt_msi_update; the qemu log now shows:

pt_msi_update: Update msi with pirq 2f gvec 0 gflags 302f
pt_msi_update: Error: Binding of MSI failed.
pt_msi_update: Error: Unmapping of MSI failed.
pt_msgctrl_reg_write: Warning: Can not bind MSI for dev 80

I added some logging to pt_msi_setup (without the patch). It does get
called, and rather early in the boot process, each time right before
lines such as these:

pci_intx: intx=1
register_real_device: Real physical device 00:1b.0 registered successfuly!
IRQ type = MSI-INTx

At this point dev->msi->data, addr_hi and addr_lo are all 0, which
doesn't seem right. Is it being called prematurely?
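
For comparison with the all-zero values above, here is a rough sketch of
how the 0x4300 data value from earlier in the thread splits into fields.
It assumes the usual x86 MSI data layout (vector in bits 0-7, delivery
mode in bits 8-10, level in bit 14). The helper names are made up for
illustration; only MSI_DATA_DELIVERY_MODE_SHIFT is a name mentioned in
the thread.

```c
#include <stdint.h>

/* Assumed x86 MSI data register layout (low 16 bits). */
#define MSI_DATA_VECTOR_MASK          0x00ffu  /* bits 0-7 */
#define MSI_DATA_DELIVERY_MODE_SHIFT  8        /* delivery mode starts at bit 8 */
#define MSI_DATA_DELIVERY_MODE_WIDTH  0x7u     /* 3-bit field, bits 8-10 */
#define MSI_DATA_LEVEL_SHIFT          14       /* level assert bit */

/* Hypothetical helper: interrupt vector encoded in the data word. */
static unsigned int msi_data_vector(uint32_t data)
{
    return data & MSI_DATA_VECTOR_MASK;
}

/* Hypothetical helper: delivery mode field (3 is a reserved encoding). */
static unsigned int msi_data_delivery_mode(uint32_t data)
{
    return (data >> MSI_DATA_DELIVERY_MODE_SHIFT) & MSI_DATA_DELIVERY_MODE_WIDTH;
}

/* Hypothetical helper: level bit (1 = assert). */
static unsigned int msi_data_level(uint32_t data)
{
    return (data >> MSI_DATA_LEVEL_SHIFT) & 1u;
}
```

Under that layout, 0x4300 gives vector 0, delivery mode 3 and level
assert, which matches what xen_msi_compose_msg writes, whereas a data
value of 0 has none of these set.
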



 

