[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] PCI passthrough problems after legacy update of xen 4.1
Hi again, I tried adding some debugging statements in physdev_map_pirq. I have attached the physdev.c with my debug statements, a hypervisor log from testing, and the output of lspci -vv for my three devices. First briefly my understanding of the function, including where I put the log messages: // - comments denote what I've added /**/ - comments denote my assumption on what the code does int physdev_map_pirq(physdev_map_pirq *map) { // added log AF1, dumping most of *map switch(map->type) /* MSI | GSI */ // added log AF2 for clarifying MSI/GSI case for myself case GSI: /* fetching real irq */ // add log AF2.5: real irq case MSI: /* invent irq */ /* end switch*/ /* some mutex locking here */ pirq = domain_irq_to_pirq(d, irq); /* lookup in host-specific table */ // added log AF3, map->pirq= ..., pirq = ... if ( map->pirq < 0 ) { if (pirq) /* "already mapped" error */ else // added log AF4 } else { if ( pirq && pirq != map->pirq ) { /* failing case follows */ // added log AF4.5: pirq = ... dprintk(XENLOG_G_ERR, "dom%d: pirq %d conflicts with irq %d\n", d->domain_id, map->index, map->pirq); ret = -EEXIST; goto done; } else { // added log AF5 "else case instead of EEXIST" pirq = map->pirq; } } map->pirq = pirq; /* when everything is ok */ // added log AF6: map->pirq=... /* mutex unlocking */ /* cleaning up */ } Output from xm dmesg with vm creation details is attached. Whereas I'm unsure about the meaning of map->index, I note that in the failing case with only device 04:00.0 passed we see (the GSI cases seem the relevant ones, so I omit the MSI cases here. Full log is attached.): [FAILING] (XEN) physdev.c:98: AF1: map->domid1, map->type1, map->index19, map->pirq-1, map->bus0, map->devfn0, map->entry_nr0 (XEN) physdev.c:131: AF2: map->type=GSI (XEN) physdev.c:141: AF2.5: found irq=19 (if <= 0, irq will be set to map->index instead or fail with error) (XEN) physdev.c:188: AF3: after getting pirq: map->pirq-1, pirq0 (XEN) physdev.c:205: AF4: got previously free pirq=16 (XEN) physdev.c:236: AF6: final map->pirq: 16 ... (XEN) physdev.c:98: AF1: map->domid1, map->type1, map->index19, map->pirq19, map->bus0, map->devfn0, map->entry_nr0 (XEN) physdev.c:131: AF2: map->type=GSI (XEN) physdev.c:141: AF2.5: found irq=19 (if <= 0, irq will be set to map->index instead or fail with error) (XEN) physdev.c:188: AF3: after getting pirq: map->pirq19, pirq16 (XEN) physdev.c:219: AF4.5: pirq right before EEXIST error: 16 (XEN) physdev.c:221: dom1: pirq 19 conflicts with irq 19 I find it confusing that the error statement on the last line reports "pirq 19", while the value being reported is really from "map->index", according to the code above. My AF4.5 is executed right before, so the value of the local variable pirq is then 16 (while map->pirq == map->index == 19). Can it be that the code assumes that local "pirq" and "map->index" at this point of the code should have the same value, and some other new code has invalidated the assumption? Relying on the same assumption somewhere else may then have caused the error. In the working case, when all three devices are passed, [41:00.0, 41:00.1, 04:00.0], I see what seems like multiple instances of setting up each device. The first seems to correspond to 41:00.0: [WORKING] (XEN) physdev.c:98: AF1: map->domid2, map->type1, map->index16, map->pirq-1, map->bus0, map->devfn0, map->entry_nr0 (XEN) physdev.c:131: AF2: map->type=GSI (XEN) physdev.c:141: AF2.5: found irq=16 (if <= 0, irq will be set to map->index instead or fail with error) (XEN) physdev.c:188: AF3: after getting pirq: map->pirq-1, pirq0 (XEN) physdev.c:205: AF4: got previously free pirq=16 (XEN) physdev.c:236: AF6: final map->pirq: 16 ... (XEN) physdev.c:98: AF1: map->domid2, map->type1, map->index16, map->pirq16, map->bus0, map->devfn0, map->entry_nr0 (XEN) physdev.c:131: AF2: map->type=GSI (XEN) physdev.c:141: AF2.5: found irq=16 (if <= 0, irq will be set to map->index instead or fail with error) (XEN) physdev.c:188: AF3: after getting pirq: map->pirq16, pirq16 (XEN) physdev.c:227: AF5: else case instead of EEXIST error (XEN) physdev.c:236: AF6: final map->pirq: 16 (p)IRQs match all the way. Then: (XEN) physdev.c:98: AF1: map->domid2, map->type1, map->index17, map->pirq17, map->bus0, map->devfn0, map->entry_nr0 (XEN) physdev.c:131: AF2: map->type=GSI (XEN) physdev.c:141: AF2.5: found irq=17 (if <= 0, irq will be set to map->index instead or fail with error) (XEN) physdev.c:188: AF3: after getting pirq: map->pirq17, pirq0 (XEN) physdev.c:227: AF5: else case instead of EEXIST error (XEN) physdev.c:236: AF6: final map->pirq: 17 (XEN) physdev.c:98: AF1: map->domid2, map->type1, map->index19, map->pirq19, map->bus0, map->devfn0, map->entry_nr0 (XEN) physdev.c:131: AF2: map->type=GSI (XEN) physdev.c:141: AF2.5: found irq=19 (if <= 0, irq will be set to map->index instead or fail with error) (XEN) physdev.c:188: AF3: after getting pirq: map->pirq19, pirq0 (XEN) physdev.c:227: AF5: else case instead of EEXIST error (XEN) physdev.c:236: AF6: final map->pirq: 19 41:00.1 and 04:00.0 seem to be correctly set up on "first try", finding the "right" irq. Then follows some repetitions of the function, sometimes with errors/warnings, sometimes without. See the attached log, I can't really decide if they are needed or if they are redundant. From the above it seems that the get_free_pirq function plays a role. I haven't looked further into it. Regards, Andreas 2013/5/3 Andreas Falck <falck.andreas.lists@xxxxxxxxx>
Attachment:
physdev.c Attachment:
xm_dmesg_per_deviceorder Attachment:
lspci_passed_through_devs _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |