[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [XEN PATCH v10 4/5] tools: Add new function to get gsi from dev


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: "Chen, Jiqian" <Jiqian.Chen@xxxxxxx>
  • Date: Thu, 20 Jun 2024 07:03:15 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=amd.com; dmarc=pass action=none header.from=amd.com; dkim=pass header.d=amd.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ynlm9TnUOxuZ8OJUp1x9VljZupy4oxRLKp3UZv/IYSs=; b=dI7D0Mrp5niAnemW+05hd6RGdTnLDJ6KZpT14nfGcySmox4RAQ0iBrf8VqdRmN7F72aYuymtunjc3X1UoGl4rsKkkFlk69luojlG6AaUth0jfjkzQxZCJzcQn5/Uxo4j6lchXuO4QBhTJEWz6uz6KiaSce3PgcJzEmPxC9yj5GJsiauzmIjxFoHUj77/8LXpMoAdClvL0wYkgBoMCtD0ED0H3FwQ/ihPGOAD2o48vbXp1q/62oEy7G1NxY++BmCwEKuY4JYENdt+1/gCYlVam8lA3RGOXP64SkwuGH81Bd+s7Bu90YkSkD1mObqgH6HhkRWJnKWkrMAKKGZZW0Uz6A==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=cby54vvKPo6Zh1+0kfsfKTYBl2wEl9EMRu4mny/ICXhyv4yXHPbkHhrzIlCGlAD0yNqnfq4MVj7Os/c56F2cUsq74aFJwRpVclMb7r2At81XiIa2rpVQSPvBzj0e/x22SAqYRr3fYUh+PEFrNKSJ5029eUqVrF+59F8ljlFdklqcJdZw7JPfQsMXrL2T3mv++7p3qFD6vzZCb7DHIT1QxmWu8D8By9bxyblqsSZSSml6Wzhxh1ffPFSIqbMFBmnpAV5qqjXXu1nW/CYCWLE23qLWQKa/XGxmTInw6L7D3pILygxu22nIjYIav7it5MVPCIxDeXBV9jm9g3xHrVtnBw==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=amd.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Anthony PERARD <anthony@xxxxxxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, "Daniel P . Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>, "Hildebrand, Stewart" <Stewart.Hildebrand@xxxxxxx>, "Huang, Ray" <Ray.Huang@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Chen, Jiqian" <Jiqian.Chen@xxxxxxx>
  • Delivery-date: Thu, 20 Jun 2024 07:03:24 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHawJToSxHkdXzldkCDZE5kihITT7HMD9QAgAGTOgD//5t7AIADeOiA
  • Thread-topic: [XEN PATCH v10 4/5] tools: Add new function to get gsi from dev

On 2024/6/18 17:13, Jan Beulich wrote:
> On 18.06.2024 10:10, Chen, Jiqian wrote:
>> On 2024/6/17 23:10, Jan Beulich wrote:
>>> On 17.06.2024 11:00, Jiqian Chen wrote:
>>>> --- a/tools/include/xen-sys/Linux/privcmd.h
>>>> +++ b/tools/include/xen-sys/Linux/privcmd.h
>>>> @@ -95,6 +95,11 @@ typedef struct privcmd_mmap_resource {
>>>>    __u64 addr;
>>>>  } privcmd_mmap_resource_t;
>>>>  
>>>> +typedef struct privcmd_gsi_from_dev {
>>>> +  __u32 sbdf;
>>>
>>> That's PCI-centric, without struct and IOCTL names reflecting this fact.
>> So, change to privcmd_gsi_from_pcidev ?
> 
> That's what I'd suggest, yes. But remember that it's the kernel maintainers
> who have the ultimate say here, as here you're only making a copy of what
> the canonical header (in the kernel tree) is going to have.
OK, then let's wait for the corresponding patch on kernel side to be accepted 
first.
> 
>>>> +  int gsi;
>>>
>>> Is "int" legitimate to use here? Doesn't this want to similarly be __u32?
>> I want to set gsi to negative if there is no record of this translation.
> 
> There are surely more explicit ways to signal that case?
Maybe, I will think about the implementation on kernel side again.
> 
>>>> --- a/tools/libs/light/libxl_pci.c
>>>> +++ b/tools/libs/light/libxl_pci.c
>>>> @@ -1406,6 +1406,12 @@ static bool pci_supp_legacy_irq(void)
>>>>  #endif
>>>>  }
>>>>  
>>>> +#define PCI_DEVID(bus, devfn)\
>>>> +            ((((uint16_t)(bus)) << 8) | ((devfn) & 0xff))
>>>> +
>>>> +#define PCI_SBDF(seg, bus, devfn) \
>>>> +            ((((uint32_t)(seg)) << 16) | (PCI_DEVID(bus, devfn)))
>>>
>>> I'm not a maintainer of this file; if I were, I'd ask that for readability's
>>> sake all excess parentheses be dropped from these.
>> Isn't it a coding requirement to enclose each element in parentheses in the 
>> macro definition?
>> It seems other files also do this. See tools/libs/light/libxl_internal.h
> 
> As said, I'm not a maintainer of this code. Yet while I'm aware that libxl
> has its own CODING_STYLE, I can't spot anything towards excessive use of
> parentheses there.
So, which parentheses do you think are excessive use?
> 
>>>> @@ -1486,6 +1496,18 @@ static void pci_add_dm_done(libxl__egc *egc,
>>>>          goto out_no_irq;
>>>>      }
>>>>      if ((fscanf(f, "%u", &irq) == 1) && irq) {
>>>> +#ifdef CONFIG_X86
>>>> +        sbdf = PCI_SBDF(pci->domain, pci->bus,
>>>> +                        (PCI_DEVFN(pci->dev, pci->func)));
>>>> +        gsi = xc_physdev_gsi_from_dev(ctx->xch, sbdf);
>>>> +        /*
>>>> +         * Old kernel version may not support this function,
>>>
>>> Just kernel?
>> Yes, xc_physdev_gsi_from_dev depends on the function implemented on linux 
>> kernel side.
> 
> Okay, and when the kernel supports it but the underlying hypervisor doesn't
> support what the kernel wants to use in order to fulfill the request, all
I don't know what things you mentioned hypervisor doesn't support are,
because xc_physdev_gsi_from_dev is to get the gsi of pcidev through sbdf 
information,
that relationship can be got only in dom0 instead of Xen hypervisor.

> is fine? (See also below for what may be needed in the hypervisor, even if
You mean xc_physdev_map_pirq needs gsi?

> this IOCTL would be satisfied by the kernel without needing to interact with
> the hypervisor.)
> 
>>>> +         * so if fail, keep using irq; if success, use gsi
>>>> +         */
>>>> +        if (gsi > 0) {
>>>> +            irq = gsi;
>>>
>>> I'm still puzzled by this, when by now I think we've sufficiently clarified
>>> that IRQs and GSIs use two distinct numbering spaces.
>>>
>>> Also, as previously indicated, you call this for PV Dom0 as well. Aiui on
>>> the assumption that it'll fail. What if we decide to make the functionality
>>> available there, too (if only for informational purposes, or for
>>> consistency)? Suddenly you're fallback logic wouldn't work anymore, and
>>> you'd call ...
>>>
>>>> +        }
>>>> +#endif
>>>>          r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
>>>
>>> ... the function with a GSI when a pIRQ is meant. Imo, as suggested before,
>>> you strictly want to avoid the call on PV Dom0.
>>>
>>> Also for PVH Dom0: I don't think I've seen changes to the hypercall
>>> handling, yet. How can that be when GSI and IRQ aren't the same, and hence
>>> incoming GSI would need translating to IRQ somewhere? I can once again only
>>> assume all your testing was done with IRQs whose numbers happened to match
>>> their GSI numbers. (The difference, imo, would also need calling out in the
>>> public header, where the respective interface struct(s) is/are defined.)
>> I feel like you missed out on many of the previous discussions.
>> Without my changes, the original codes use irq (read from file 
>> /sys/bus/pci/devices/<sbdf>/irq) to do xc_physdev_map_pirq,
>> but xc_physdev_map_pirq require passing into gsi instead of irq, so we need 
>> to use gsi whether dom0 is PV or PVH, so for the original codes, they are 
>> wrong.
>> Just because by chance, the irq value in the Linux kernel of pv dom0 is 
>> equal to the gsi value, so there was no problem with the original pv 
>> passthrough.
>> But not when using PVH, so passthrough failed.
>> With my changes, I got gsi through function xc_physdev_gsi_from_dev firstly, 
>> and to be compatible with old kernel versions(if the ioctl
>> IOCTL_PRIVCMD_GSI_FROM_DEV is not implemented), I still need to use the irq 
>> number, so I need to check the result
>> of gsi, if gsi > 0 means IOCTL_PRIVCMD_GSI_FROM_DEV is implemented I can use 
>> the right one gsi, otherwise keep using wrong one irq. 
> 
> I understand all of this, to a (I think) sufficient degree at least. Yet what
> you're effectively proposing (without explicitly saying so) is that e.g.
> struct physdev_map_pirq's pirq field suddenly may no longer hold a pIRQ
> number, but (when the caller is PVH) a GSI. This may be a necessary adjustment
> to make (simply because the caller may have no way to express things in pIRQ
> terms), but then suitable adjustments to the handling of PHYSDEVOP_map_pirq
> would be necessary. In fact that field is presently marked as "IN or OUT";
> when re-purposed to take a GSI on input, it may end up being necessary to pass
> back the pIRQ (in the subject domain's numbering space). Or alternatively it
> may be necessary to add yet another sub-function so the GSI can be translated
> to the corresponding pIRQ for the domain that's going to use the IRQ, for that
> then to be passed into PHYSDEVOP_map_pirq.
If I understood correctly, your concerns about this patch are two:
First, when dom0 is PV, I should not use xc_physdev_gsi_from_dev to get gsi to 
do xc_physdev_map_pirq, I should keep the original code that use irq.
Second, when dom0 is PVH, I get the gsi, but I should not pass gsi as the 
fourth parameter of xc_physdev_map_pirq, I should create a new local parameter 
pirq=-1, and pass it in.

> 
>> And regarding to the implementation of ioctl IOCTL_PRIVCMD_GSI_FROM_DEV, it 
>> doesn't have any xen heypercall handling changes, all of its processing 
>> logic is on the kernel side.
>> I know, so you might want to say, "Then you shouldn't put this in xen's 
>> code." But this concern was discussed in previous versions, and since the 
>> pci maintainer disallowed to add
>> gsi sysfs on linux kernel side, I had to do so.
> 
> Right, but this is a separate aspect (which we simply need to live with on
> the Xen side).
> 
> Jan

-- 
Best regards,
Jiqian Chen.

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.