[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [XEN PATCH v10 4/5] tools: Add new function to get gsi from dev


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: "Chen, Jiqian" <Jiqian.Chen@xxxxxxx>
  • Date: Fri, 21 Jun 2024 08:15:46 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=amd.com; dmarc=pass action=none header.from=amd.com; dkim=pass header.d=amd.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=v9pA/4k1oDsOndFxwviSGAYlabNLEzeuyJzYt3aYE+Q=; b=CMSIXSj8obgaSAxx0TlgKLn7Nz8LBOX3LY+vuZTQt6t0N01ng0AHtf5AUbx7olqRJtJM4wL/68ELGg+0358G/pLcnYUT7sl0ixSeYUPXzKp02zjbvq1BBYrGHX1QwCJb2MMhXwTMfEhw6lCcK/t1z6Cy/Zs8fzyrfMOOMYJvpFue72xz1fDpqtLm4UTjmGDXjARbx/nyltyBR0ZotUplieKsb4uFAcsDgyXEQaTpDOuWb4dLmoFfB+SBzD2sgkGAdBxzClVawLS45euXnkYigvFzdQEefxsiF5UDvYS2XPQJr1Qe5hz0oiL4k6J6Tz7AvsoHEgYiW4E7i4LitP+zYQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=X65pj05hocguX3QXqvtKSOY2z/FWL3R282AiqKpq/7rBVjmUdAIrSOEpjmy175kSqaR/nAeq2rthQpBUA2wQ6gCEeh8LT+VHSKLGKXfVGraR021DsMhGbDbzIgvR4a8eNDnoBrr3lNvuMoToOKClVUE6cJTSGaYIc/rO1IIIpu4nrnZE1c/gIf/cI42IKuPKifBGOaSgUcAEfMPMDRx9VJenWJgy0+HnauVp6+QSEKxlFMwZ9TqTwDdPP0YGf1TOmgIha2FMeFXIzg1xJtbekeH2vHvxtVMS6eiM+D2U4n/cXZ2Jf6lJfInjVMDOsm/p/FMT+oIfmeqyINJo7LRJxA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=amd.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Anthony PERARD <anthony@xxxxxxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, "Daniel P . Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>, "Hildebrand, Stewart" <Stewart.Hildebrand@xxxxxxx>, "Huang, Ray" <Ray.Huang@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Chen, Jiqian" <Jiqian.Chen@xxxxxxx>
  • Delivery-date: Fri, 21 Jun 2024 08:16:01 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHawJToSxHkdXzldkCDZE5kihITT7HMD9QAgAGTOgD//5t7AIADeOiA//+SewCAAKx5AP//hDUAAD2nHAA=
  • Thread-topic: [XEN PATCH v10 4/5] tools: Add new function to get gsi from dev

On 2024/6/20 18:37, Jan Beulich wrote:
> On 20.06.2024 12:23, Chen, Jiqian wrote:
>> On 2024/6/20 15:43, Jan Beulich wrote:
>>> On 20.06.2024 09:03, Chen, Jiqian wrote:
>>>> On 2024/6/18 17:13, Jan Beulich wrote:
>>>>> On 18.06.2024 10:10, Chen, Jiqian wrote:
>>>>>> On 2024/6/17 23:10, Jan Beulich wrote:
>>>>>>> On 17.06.2024 11:00, Jiqian Chen wrote:
>>>>>>>> --- a/tools/libs/light/libxl_pci.c
>>>>>>>> +++ b/tools/libs/light/libxl_pci.c
>>>>>>>> @@ -1406,6 +1406,12 @@ static bool pci_supp_legacy_irq(void)
>>>>>>>>  #endif
>>>>>>>>  }
>>>>>>>>  
>>>>>>>> +#define PCI_DEVID(bus, devfn)\
>>>>>>>> +            ((((uint16_t)(bus)) << 8) | ((devfn) & 0xff))
>>>>>>>> +
>>>>>>>> +#define PCI_SBDF(seg, bus, devfn) \
>>>>>>>> +            ((((uint32_t)(seg)) << 16) | (PCI_DEVID(bus, devfn)))
>>>>>>>
>>>>>>> I'm not a maintainer of this file; if I were, I'd ask that for 
>>>>>>> readability's
>>>>>>> sake all excess parentheses be dropped from these.
>>>>>> Isn't it a coding requirement to enclose each element in parentheses in 
>>>>>> the macro definition?
>>>>>> It seems other files also do this. See tools/libs/light/libxl_internal.h
>>>>>
>>>>> As said, I'm not a maintainer of this code. Yet while I'm aware that libxl
>>>>> has its own CODING_STYLE, I can't spot anything towards excessive use of
>>>>> parentheses there.
>>>> So, which parentheses do you think are excessive use?
>>>
>>> #define PCI_DEVID(bus, devfn)\
>>>             (((uint16_t)(bus) << 8) | ((devfn) & 0xff))
>>>
>>> #define PCI_SBDF(seg, bus, devfn) \
>>>             (((uint32_t)(seg) << 16) | PCI_DEVID(bus, devfn))
>> Thanks, will change in next version.
>>
>>>
>>>>>>>> @@ -1486,6 +1496,18 @@ static void pci_add_dm_done(libxl__egc *egc,
>>>>>>>>          goto out_no_irq;
>>>>>>>>      }
>>>>>>>>      if ((fscanf(f, "%u", &irq) == 1) && irq) {
>>>>>>>> +#ifdef CONFIG_X86
>>>>>>>> +        sbdf = PCI_SBDF(pci->domain, pci->bus,
>>>>>>>> +                        (PCI_DEVFN(pci->dev, pci->func)));
>>>>>>>> +        gsi = xc_physdev_gsi_from_dev(ctx->xch, sbdf);
>>>>>>>> +        /*
>>>>>>>> +         * Old kernel version may not support this function,
>>>>>>>
>>>>>>> Just kernel?
>>>>>> Yes, xc_physdev_gsi_from_dev depends on the function implemented on 
>>>>>> linux kernel side.
>>>>>
>>>>> Okay, and when the kernel supports it but the underlying hypervisor 
>>>>> doesn't
>>>>> support what the kernel wants to use in order to fulfill the request, all
>>>> I don't know what things you mentioned hypervisor doesn't support are,
>>>> because xc_physdev_gsi_from_dev is to get the gsi of pcidev through sbdf 
>>>> information,
>>>> that relationship can be got only in dom0 instead of Xen hypervisor.
>>>>
>>>>> is fine? (See also below for what may be needed in the hypervisor, even if
>>>> You mean xc_physdev_map_pirq needs gsi?
>>>
>>> I'd put it slightly differently: You arrange for that function to now take a
>>> GSI when the caller is PVH. But yes, the function, when used with
>>> MAP_PIRQ_TYPE_GSI, clearly expects a GSI as input (see also below).
>>>
>>>>> this IOCTL would be satisfied by the kernel without needing to interact 
>>>>> with
>>>>> the hypervisor.)
>>>>>
>>>>>>>> +         * so if fail, keep using irq; if success, use gsi
>>>>>>>> +         */
>>>>>>>> +        if (gsi > 0) {
>>>>>>>> +            irq = gsi;
>>>>>>>
>>>>>>> I'm still puzzled by this, when by now I think we've sufficiently 
>>>>>>> clarified
>>>>>>> that IRQs and GSIs use two distinct numbering spaces.
>>>>>>>
>>>>>>> Also, as previously indicated, you call this for PV Dom0 as well. Aiui 
>>>>>>> on
>>>>>>> the assumption that it'll fail. What if we decide to make the 
>>>>>>> functionality
>>>>>>> available there, too (if only for informational purposes, or for
>>>>>>> consistency)? Suddenly you're fallback logic wouldn't work anymore, and
>>>>>>> you'd call ...
>>>>>>>
>>>>>>>> +        }
>>>>>>>> +#endif
>>>>>>>>          r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
>>>>>>>
>>>>>>> ... the function with a GSI when a pIRQ is meant. Imo, as suggested 
>>>>>>> before,
>>>>>>> you strictly want to avoid the call on PV Dom0.
>>>>>>>
>>>>>>> Also for PVH Dom0: I don't think I've seen changes to the hypercall
>>>>>>> handling, yet. How can that be when GSI and IRQ aren't the same, and 
>>>>>>> hence
>>>>>>> incoming GSI would need translating to IRQ somewhere? I can once again 
>>>>>>> only
>>>>>>> assume all your testing was done with IRQs whose numbers happened to 
>>>>>>> match
>>>>>>> their GSI numbers. (The difference, imo, would also need calling out in 
>>>>>>> the
>>>>>>> public header, where the respective interface struct(s) is/are defined.)
>>>>>> I feel like you missed out on many of the previous discussions.
>>>>>> Without my changes, the original codes use irq (read from file 
>>>>>> /sys/bus/pci/devices/<sbdf>/irq) to do xc_physdev_map_pirq,
>>>>>> but xc_physdev_map_pirq require passing into gsi instead of irq, so we 
>>>>>> need to use gsi whether dom0 is PV or PVH, so for the original codes, 
>>>>>> they are wrong.
>>>>>> Just because by chance, the irq value in the Linux kernel of pv dom0 is 
>>>>>> equal to the gsi value, so there was no problem with the original pv 
>>>>>> passthrough.
>>>>>> But not when using PVH, so passthrough failed.
>>>>>> With my changes, I got gsi through function xc_physdev_gsi_from_dev 
>>>>>> firstly, and to be compatible with old kernel versions(if the ioctl
>>>>>> IOCTL_PRIVCMD_GSI_FROM_DEV is not implemented), I still need to use the 
>>>>>> irq number, so I need to check the result
>>>>>> of gsi, if gsi > 0 means IOCTL_PRIVCMD_GSI_FROM_DEV is implemented I can 
>>>>>> use the right one gsi, otherwise keep using wrong one irq. 
>>>>>
>>>>> I understand all of this, to a (I think) sufficient degree at least. Yet 
>>>>> what
>>>>> you're effectively proposing (without explicitly saying so) is that e.g.
>>>>> struct physdev_map_pirq's pirq field suddenly may no longer hold a pIRQ
>>>>> number, but (when the caller is PVH) a GSI. This may be a necessary 
>>>>> adjustment
>>>>> to make (simply because the caller may have no way to express things in 
>>>>> pIRQ
>>>>> terms), but then suitable adjustments to the handling of 
>>>>> PHYSDEVOP_map_pirq
>>>>> would be necessary. In fact that field is presently marked as "IN or OUT";
>>>>> when re-purposed to take a GSI on input, it may end up being necessary to 
>>>>> pass
>>>>> back the pIRQ (in the subject domain's numbering space). Or alternatively 
>>>>> it
>>>>> may be necessary to add yet another sub-function so the GSI can be 
>>>>> translated
>>>>> to the corresponding pIRQ for the domain that's going to use the IRQ, for 
>>>>> that
>>>>> then to be passed into PHYSDEVOP_map_pirq.
>>>> If I understood correctly, your concerns about this patch are two:
>>>> First, when dom0 is PV, I should not use xc_physdev_gsi_from_dev to get 
>>>> gsi to do xc_physdev_map_pirq, I should keep the original code that use 
>>>> irq.
>>>
>>> Yes.
>> OK, I can change to do this.
>> But I still have a concern:
>> Although without my changes, passthrough can work now when dom0 is PV.
>> And you also agree that: for xc_physdev_map_pirq, when use with 
>> MAP_PIRQ_TYPE_GSI, it expects a GSI as input.
>> Isn't it a wrong for PV dom0 to pass irq in? Why don't we use gsi as it 
>> should be used, since we have a function to get gsi now?
> 
> Indeed this and ...
> 
>>>> Second, when dom0 is PVH, I get the gsi, but I should not pass gsi as the 
>>>> fourth parameter of xc_physdev_map_pirq, I should create a new local 
>>>> parameter pirq=-1, and pass it in.
>>>
>>> I think so, yes. You also may need to record the output value, so you can 
>>> later
>>> use it for unmap. xc_physdev_map_pirq() may also need adjusting, as right 
>>> now
>>> it wouldn't put a negative incoming *pirq into the .pirq field. 
>> xc_physdev_map_pirq's logic is if we pass a negative in, it sets *pirq to 
>> index(gsi).
>> Is its logic right? If not how do we change it?
> 
> ... this matches ...
> 
>>> I actually wonder if that's even correct right now, i.e. independent of 
>>> your change.
> 
> ... the remark here.
So, what should I do next step?
If assume the logic of function xc_physdev_map_pirq and PHYSDEVOP_map_pirq is 
right,
I think what I did now is right, both PV and PVH dom0 should pass gsi into 
xc_physdev_map_pirq.
By the way, I found xc_physdev_map_pirq didn't support negative pirq is since 
your commit 934a5253d932b6f67fe40fc48975a2b0117e4cce, do you remember why?

> 
>> Even without my changes, passthrough can work for PV dom0, not for PVH dom0.
> 
> In the common case. I fear no-one ever tried for a device with an IRQ that
> has a source override specified in ACPI.
> 
>> According to the logic of hypercall PHYSDEVOP_map_pirq,
>> if pirq is -1, it calls physdev_map_pirq-> allocate_and_map_gsi_pirq-> 
>> allocate_pirq -> get_free_pirq to get pirq.
>> If pirq is set to positive before calling hypercall, it set pirq to its own 
>> value in allocate_pirq.
> 
> Which is what looks wrong to me. Question is what it was done this way in the
> first place.
> 
> Jan

-- 
Best regards,
Jiqian Chen.

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.