[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [XEN PATCH v10 4/5] tools: Add new function to get gsi from dev
On 2024/6/24 16:13, Jan Beulich wrote: > On 21.06.2024 10:15, Chen, Jiqian wrote: >> On 2024/6/20 18:37, Jan Beulich wrote: >>> On 20.06.2024 12:23, Chen, Jiqian wrote: >>>> On 2024/6/20 15:43, Jan Beulich wrote: >>>>> On 20.06.2024 09:03, Chen, Jiqian wrote: >>>>>> On 2024/6/18 17:13, Jan Beulich wrote: >>>>>>> On 18.06.2024 10:10, Chen, Jiqian wrote: >>>>>>>> On 2024/6/17 23:10, Jan Beulich wrote: >>>>>>>>> On 17.06.2024 11:00, Jiqian Chen wrote: >>>>>>>>>> --- a/tools/libs/light/libxl_pci.c >>>>>>>>>> +++ b/tools/libs/light/libxl_pci.c >>>>>>>>>> @@ -1406,6 +1406,12 @@ static bool pci_supp_legacy_irq(void) >>>>>>>>>> #endif >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> +#define PCI_DEVID(bus, devfn)\ >>>>>>>>>> + ((((uint16_t)(bus)) << 8) | ((devfn) & 0xff)) >>>>>>>>>> + >>>>>>>>>> +#define PCI_SBDF(seg, bus, devfn) \ >>>>>>>>>> + ((((uint32_t)(seg)) << 16) | (PCI_DEVID(bus, devfn))) >>>>>>>>> >>>>>>>>> I'm not a maintainer of this file; if I were, I'd ask that for >>>>>>>>> readability's >>>>>>>>> sake all excess parentheses be dropped from these. >>>>>>>> Isn't it a coding requirement to enclose each element in parentheses >>>>>>>> in the macro definition? >>>>>>>> It seems other files also do this. See >>>>>>>> tools/libs/light/libxl_internal.h >>>>>>> >>>>>>> As said, I'm not a maintainer of this code. Yet while I'm aware that >>>>>>> libxl >>>>>>> has its own CODING_STYLE, I can't spot anything towards excessive use of >>>>>>> parentheses there. >>>>>> So, which parentheses do you think are excessive use? >>>>> >>>>> #define PCI_DEVID(bus, devfn)\ >>>>> (((uint16_t)(bus) << 8) | ((devfn) & 0xff)) >>>>> >>>>> #define PCI_SBDF(seg, bus, devfn) \ >>>>> (((uint32_t)(seg) << 16) | PCI_DEVID(bus, devfn)) >>>> Thanks, will change in next version. >>>> >>>>> >>>>>>>>>> @@ -1486,6 +1496,18 @@ static void pci_add_dm_done(libxl__egc *egc, >>>>>>>>>> goto out_no_irq; >>>>>>>>>> } >>>>>>>>>> if ((fscanf(f, "%u", &irq) == 1) && irq) { >>>>>>>>>> +#ifdef CONFIG_X86 >>>>>>>>>> + sbdf = PCI_SBDF(pci->domain, pci->bus, >>>>>>>>>> + (PCI_DEVFN(pci->dev, pci->func))); >>>>>>>>>> + gsi = xc_physdev_gsi_from_dev(ctx->xch, sbdf); >>>>>>>>>> + /* >>>>>>>>>> + * Old kernel version may not support this function, >>>>>>>>> >>>>>>>>> Just kernel? >>>>>>>> Yes, xc_physdev_gsi_from_dev depends on the function implemented on >>>>>>>> linux kernel side. >>>>>>> >>>>>>> Okay, and when the kernel supports it but the underlying hypervisor >>>>>>> doesn't >>>>>>> support what the kernel wants to use in order to fulfill the request, >>>>>>> all >>>>>> I don't know what things you mentioned hypervisor doesn't support are, >>>>>> because xc_physdev_gsi_from_dev is to get the gsi of pcidev through sbdf >>>>>> information, >>>>>> that relationship can be got only in dom0 instead of Xen hypervisor. >>>>>> >>>>>>> is fine? (See also below for what may be needed in the hypervisor, even >>>>>>> if >>>>>> You mean xc_physdev_map_pirq needs gsi? >>>>> >>>>> I'd put it slightly differently: You arrange for that function to now >>>>> take a >>>>> GSI when the caller is PVH. But yes, the function, when used with >>>>> MAP_PIRQ_TYPE_GSI, clearly expects a GSI as input (see also below). >>>>> >>>>>>> this IOCTL would be satisfied by the kernel without needing to interact >>>>>>> with >>>>>>> the hypervisor.) >>>>>>> >>>>>>>>>> + * so if fail, keep using irq; if success, use gsi >>>>>>>>>> + */ >>>>>>>>>> + if (gsi > 0) { >>>>>>>>>> + irq = gsi; >>>>>>>>> >>>>>>>>> I'm still puzzled by this, when by now I think we've sufficiently >>>>>>>>> clarified >>>>>>>>> that IRQs and GSIs use two distinct numbering spaces. >>>>>>>>> >>>>>>>>> Also, as previously indicated, you call this for PV Dom0 as well. >>>>>>>>> Aiui on >>>>>>>>> the assumption that it'll fail. What if we decide to make the >>>>>>>>> functionality >>>>>>>>> available there, too (if only for informational purposes, or for >>>>>>>>> consistency)? Suddenly you're fallback logic wouldn't work anymore, >>>>>>>>> and >>>>>>>>> you'd call ... >>>>>>>>> >>>>>>>>>> + } >>>>>>>>>> +#endif >>>>>>>>>> r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq); >>>>>>>>> >>>>>>>>> ... the function with a GSI when a pIRQ is meant. Imo, as suggested >>>>>>>>> before, >>>>>>>>> you strictly want to avoid the call on PV Dom0. >>>>>>>>> >>>>>>>>> Also for PVH Dom0: I don't think I've seen changes to the hypercall >>>>>>>>> handling, yet. How can that be when GSI and IRQ aren't the same, and >>>>>>>>> hence >>>>>>>>> incoming GSI would need translating to IRQ somewhere? I can once >>>>>>>>> again only >>>>>>>>> assume all your testing was done with IRQs whose numbers happened to >>>>>>>>> match >>>>>>>>> their GSI numbers. (The difference, imo, would also need calling out >>>>>>>>> in the >>>>>>>>> public header, where the respective interface struct(s) is/are >>>>>>>>> defined.) >>>>>>>> I feel like you missed out on many of the previous discussions. >>>>>>>> Without my changes, the original codes use irq (read from file >>>>>>>> /sys/bus/pci/devices/<sbdf>/irq) to do xc_physdev_map_pirq, >>>>>>>> but xc_physdev_map_pirq require passing into gsi instead of irq, so we >>>>>>>> need to use gsi whether dom0 is PV or PVH, so for the original codes, >>>>>>>> they are wrong. >>>>>>>> Just because by chance, the irq value in the Linux kernel of pv dom0 >>>>>>>> is equal to the gsi value, so there was no problem with the original >>>>>>>> pv passthrough. >>>>>>>> But not when using PVH, so passthrough failed. >>>>>>>> With my changes, I got gsi through function xc_physdev_gsi_from_dev >>>>>>>> firstly, and to be compatible with old kernel versions(if the ioctl >>>>>>>> IOCTL_PRIVCMD_GSI_FROM_DEV is not implemented), I still need to use >>>>>>>> the irq number, so I need to check the result >>>>>>>> of gsi, if gsi > 0 means IOCTL_PRIVCMD_GSI_FROM_DEV is implemented I >>>>>>>> can use the right one gsi, otherwise keep using wrong one irq. >>>>>>> >>>>>>> I understand all of this, to a (I think) sufficient degree at least. >>>>>>> Yet what >>>>>>> you're effectively proposing (without explicitly saying so) is that e.g. >>>>>>> struct physdev_map_pirq's pirq field suddenly may no longer hold a pIRQ >>>>>>> number, but (when the caller is PVH) a GSI. This may be a necessary >>>>>>> adjustment >>>>>>> to make (simply because the caller may have no way to express things in >>>>>>> pIRQ >>>>>>> terms), but then suitable adjustments to the handling of >>>>>>> PHYSDEVOP_map_pirq >>>>>>> would be necessary. In fact that field is presently marked as "IN or >>>>>>> OUT"; >>>>>>> when re-purposed to take a GSI on input, it may end up being necessary >>>>>>> to pass >>>>>>> back the pIRQ (in the subject domain's numbering space). Or >>>>>>> alternatively it >>>>>>> may be necessary to add yet another sub-function so the GSI can be >>>>>>> translated >>>>>>> to the corresponding pIRQ for the domain that's going to use the IRQ, >>>>>>> for that >>>>>>> then to be passed into PHYSDEVOP_map_pirq. >>>>>> If I understood correctly, your concerns about this patch are two: >>>>>> First, when dom0 is PV, I should not use xc_physdev_gsi_from_dev to get >>>>>> gsi to do xc_physdev_map_pirq, I should keep the original code that use >>>>>> irq. >>>>> >>>>> Yes. >>>> OK, I can change to do this. >>>> But I still have a concern: >>>> Although without my changes, passthrough can work now when dom0 is PV. >>>> And you also agree that: for xc_physdev_map_pirq, when use with >>>> MAP_PIRQ_TYPE_GSI, it expects a GSI as input. >>>> Isn't it a wrong for PV dom0 to pass irq in? Why don't we use gsi as it >>>> should be used, since we have a function to get gsi now? >>> >>> Indeed this and ... >>> >>>>>> Second, when dom0 is PVH, I get the gsi, but I should not pass gsi as >>>>>> the fourth parameter of xc_physdev_map_pirq, I should create a new local >>>>>> parameter pirq=-1, and pass it in. >>>>> >>>>> I think so, yes. You also may need to record the output value, so you can >>>>> later >>>>> use it for unmap. xc_physdev_map_pirq() may also need adjusting, as right >>>>> now >>>>> it wouldn't put a negative incoming *pirq into the .pirq field. >>>> xc_physdev_map_pirq's logic is if we pass a negative in, it sets *pirq to >>>> index(gsi). >>>> Is its logic right? If not how do we change it? >>> >>> ... this matches ... >>> >>>>> I actually wonder if that's even correct right now, i.e. independent of >>>>> your change. >>> >>> ... the remark here. >> So, what should I do next step? >> If assume the logic of function xc_physdev_map_pirq and PHYSDEVOP_map_pirq >> is right, >> I think what I did now is right, both PV and PVH dom0 should pass gsi into >> xc_physdev_map_pirq. > > It may sound unfriendly, and I'm willing to accept other maintainers > disagreeing with me, but I think before we think of any extensions of > what we have here, we want to address any issues with what we have. > Since you're working in the area, and since getting the additions right > requires properly understanding how things work (and where things may > not work correctly right now), I view you as being in the best position > to see about doing that (imo) prereq step. Make sense. OK, I think there may be two issues in the callstack of function xc_physdev_map_pirq (xc_physdev_map_pirq-> physdev_map_pirq-> allocate_and_map_gsi_pirq-> allocate_pirq). For function xc_physdev_map_pirq, I think it should have two use cases: First, when caller pass a pirq that ">=0", they want to map gsi to a specific pirq. In this case, when it reaches allocate_pirq, if the gsi already has a mapped pirq(current_pirq), and current_pirq is not equal with specific pirq, it fails due to EEXIST, if current_pirq is equal with specific pirq or the gsi doesn't have a current_pirq, allocate_pirq return the specific pirq. It successes. Second, when caller pass a pirq that "<0", they want to get a free pirq for gsi. In this case, when it reaches allocate_pirq, if the gsi already has a mapped pirq(current_pirq), we should return current_pirq, otherwise, it allocate a free pirq through get_free_pirqs and then return the free pirq, it successes. Roadmap is below: caller pass gsi and pirq into xc_physdev_map_pirq (xc_physdev_map_pirq) / \ *pirq>=0 *pirq<0 [issue 1] (map gsi to a specific pirq) (map gsi to a free pirq) _ / \ | gsi already has a mapped pirq gsi already has a mapped pirq | (current_pirq) (current_pirq) | / \ / \ | yes no yes no allocate_pirq-------------------------| / \ / \ | current_pirq == pirq return specific pirq return current_pirq [issue2] return free pirq | / \ | yes no | / \ |_ return specific pirq fail -EEXIST But for current code, [issue 1]: when *pirq<0, in xc_physdev_map_pirq, it re-sets *pirq to index by default. " map.pirq = *pirq < 0 ? index : *pirq; ", so that it can't allocate a free pirq for gsi(above second case) [issue 2]: when *pirq<0 and gsi already has mapped pirq and has no need to allocate a new free pirq, in allocate_pirq, it returns *pirq(<0) directly, it means allocate_pirq fail. Here should return the already mapped pirq for that gsi and mean successful. > >> By the way, I found xc_physdev_map_pirq didn't support negative pirq is >> since your commit 934a5253d932b6f67fe40fc48975a2b0117e4cce, do you remember >> why? > > Counter question: What is a negative pIRQ? I mean when caller pass a pirq that "<0" in. > > Jan -- Best regards, Jiqian Chen.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |