[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [XEN PATCH v10 4/5] tools: Add new function to get gsi from dev


  • To: "Chen, Jiqian" <Jiqian.Chen@xxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 18 Jun 2024 11:13:58 +0200
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Anthony PERARD <anthony@xxxxxxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, "Daniel P . Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>, "Hildebrand, Stewart" <Stewart.Hildebrand@xxxxxxx>, "Huang, Ray" <Ray.Huang@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 18 Jun 2024 09:14:10 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 18.06.2024 10:10, Chen, Jiqian wrote:
> On 2024/6/17 23:10, Jan Beulich wrote:
>> On 17.06.2024 11:00, Jiqian Chen wrote:
>>> --- a/tools/include/xen-sys/Linux/privcmd.h
>>> +++ b/tools/include/xen-sys/Linux/privcmd.h
>>> @@ -95,6 +95,11 @@ typedef struct privcmd_mmap_resource {
>>>     __u64 addr;
>>>  } privcmd_mmap_resource_t;
>>>  
>>> +typedef struct privcmd_gsi_from_dev {
>>> +   __u32 sbdf;
>>
>> That's PCI-centric, without struct and IOCTL names reflecting this fact.
> So, change to privcmd_gsi_from_pcidev ?

That's what I'd suggest, yes. But remember that it's the kernel maintainers
who have the ultimate say here, as here you're only making a copy of what
the canonical header (in the kernel tree) is going to have.

>>> +   int gsi;
>>
>> Is "int" legitimate to use here? Doesn't this want to similarly be __u32?
> I want to set gsi to negative if there is no record of this translation.

There are surely more explicit ways to signal that case?

>>> --- a/tools/libs/light/libxl_pci.c
>>> +++ b/tools/libs/light/libxl_pci.c
>>> @@ -1406,6 +1406,12 @@ static bool pci_supp_legacy_irq(void)
>>>  #endif
>>>  }
>>>  
>>> +#define PCI_DEVID(bus, devfn)\
>>> +            ((((uint16_t)(bus)) << 8) | ((devfn) & 0xff))
>>> +
>>> +#define PCI_SBDF(seg, bus, devfn) \
>>> +            ((((uint32_t)(seg)) << 16) | (PCI_DEVID(bus, devfn)))
>>
>> I'm not a maintainer of this file; if I were, I'd ask that for readability's
>> sake all excess parentheses be dropped from these.
> Isn't it a coding requirement to enclose each element in parentheses in the 
> macro definition?
> It seems other files also do this. See tools/libs/light/libxl_internal.h

As said, I'm not a maintainer of this code. Yet while I'm aware that libxl
has its own CODING_STYLE, I can't spot anything towards excessive use of
parentheses there.

>>> @@ -1486,6 +1496,18 @@ static void pci_add_dm_done(libxl__egc *egc,
>>>          goto out_no_irq;
>>>      }
>>>      if ((fscanf(f, "%u", &irq) == 1) && irq) {
>>> +#ifdef CONFIG_X86
>>> +        sbdf = PCI_SBDF(pci->domain, pci->bus,
>>> +                        (PCI_DEVFN(pci->dev, pci->func)));
>>> +        gsi = xc_physdev_gsi_from_dev(ctx->xch, sbdf);
>>> +        /*
>>> +         * Old kernel version may not support this function,
>>
>> Just kernel?
> Yes, xc_physdev_gsi_from_dev depends on the function implemented on linux 
> kernel side.

Okay, and when the kernel supports it but the underlying hypervisor doesn't
support what the kernel wants to use in order to fulfill the request, all
is fine? (See also below for what may be needed in the hypervisor, even if
this IOCTL would be satisfied by the kernel without needing to interact with
the hypervisor.)

>>> +         * so if fail, keep using irq; if success, use gsi
>>> +         */
>>> +        if (gsi > 0) {
>>> +            irq = gsi;
>>
>> I'm still puzzled by this, when by now I think we've sufficiently clarified
>> that IRQs and GSIs use two distinct numbering spaces.
>>
>> Also, as previously indicated, you call this for PV Dom0 as well. Aiui on
>> the assumption that it'll fail. What if we decide to make the functionality
>> available there, too (if only for informational purposes, or for
>> consistency)? Suddenly you're fallback logic wouldn't work anymore, and
>> you'd call ...
>>
>>> +        }
>>> +#endif
>>>          r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
>>
>> ... the function with a GSI when a pIRQ is meant. Imo, as suggested before,
>> you strictly want to avoid the call on PV Dom0.
>>
>> Also for PVH Dom0: I don't think I've seen changes to the hypercall
>> handling, yet. How can that be when GSI and IRQ aren't the same, and hence
>> incoming GSI would need translating to IRQ somewhere? I can once again only
>> assume all your testing was done with IRQs whose numbers happened to match
>> their GSI numbers. (The difference, imo, would also need calling out in the
>> public header, where the respective interface struct(s) is/are defined.)
> I feel like you missed out on many of the previous discussions.
> Without my changes, the original codes use irq (read from file 
> /sys/bus/pci/devices/<sbdf>/irq) to do xc_physdev_map_pirq,
> but xc_physdev_map_pirq require passing into gsi instead of irq, so we need 
> to use gsi whether dom0 is PV or PVH, so for the original codes, they are 
> wrong.
> Just because by chance, the irq value in the Linux kernel of pv dom0 is equal 
> to the gsi value, so there was no problem with the original pv passthrough.
> But not when using PVH, so passthrough failed.
> With my changes, I got gsi through function xc_physdev_gsi_from_dev firstly, 
> and to be compatible with old kernel versions(if the ioctl
> IOCTL_PRIVCMD_GSI_FROM_DEV is not implemented), I still need to use the irq 
> number, so I need to check the result
> of gsi, if gsi > 0 means IOCTL_PRIVCMD_GSI_FROM_DEV is implemented I can use 
> the right one gsi, otherwise keep using wrong one irq. 

I understand all of this, to a (I think) sufficient degree at least. Yet what
you're effectively proposing (without explicitly saying so) is that e.g.
struct physdev_map_pirq's pirq field suddenly may no longer hold a pIRQ
number, but (when the caller is PVH) a GSI. This may be a necessary adjustment
to make (simply because the caller may have no way to express things in pIRQ
terms), but then suitable adjustments to the handling of PHYSDEVOP_map_pirq
would be necessary. In fact that field is presently marked as "IN or OUT";
when re-purposed to take a GSI on input, it may end up being necessary to pass
back the pIRQ (in the subject domain's numbering space). Or alternatively it
may be necessary to add yet another sub-function so the GSI can be translated
to the corresponding pIRQ for the domain that's going to use the IRQ, for that
then to be passed into PHYSDEVOP_map_pirq.

> And regarding to the implementation of ioctl IOCTL_PRIVCMD_GSI_FROM_DEV, it 
> doesn't have any xen heypercall handling changes, all of its processing logic 
> is on the kernel side.
> I know, so you might want to say, "Then you shouldn't put this in xen's 
> code." But this concern was discussed in previous versions, and since the pci 
> maintainer disallowed to add
> gsi sysfs on linux kernel side, I had to do so.

Right, but this is a separate aspect (which we simply need to live with on
the Xen side).

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.