|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [RFC XEN PATCH v9 5/5] domctl: Add XEN_DOMCTL_gsi_permission to grant gsi
Hi Jan,
On 2024/6/11 22:39, Jan Beulich wrote:
> On 07.06.2024 10:11, Jiqian Chen wrote:
>> Some type of domain don't have PIRQ, like PVH, it do not do
>> PHYSDEVOP_map_pirq for each gsi. When passthrough a device
>> to guest on PVH dom0, callstack
>> pci_add_dm_done->XEN_DOMCTL_irq_permission will failed at
>> domain_pirq_to_irq, because PVH has no mapping of gsi, pirq
>> and irq on Xen side.
>
> All of this is, to me at least, in pretty sharp contradiction to what
> patch 2 says and does. IOW: Do we want the concept of pIRQ in PVH, or
> do we want to keep that to PV?
It's not contradictory.
What I did is not to add the concept of PIRQs for PVH.
All previous passthrough code was implemented on the basis of pv dom0 + hvm
domU.
For pv dom0, it has PIRQs. For hvm domU, it has PIRQs too.
So the codes are not suitable for PVH dom0 + hvm domU, because PVH dom0 has no
PIRQs.
Patch 2 do PHYSDEVOP_map_pirq for hvm domU even when dom0 is PVH instead of PV.
It didn't add PIRQs for PVH.
This patch is to grant irq( that get from gsi ) to hvm domU, why
XEN_DOMCTL_irq_permission is not useful is because PVH has no PIRQs, we can't
get irq through pirq like PV does.
>
>> What's more, current hypercall XEN_DOMCTL_irq_permission require
>> passing in pirq and grant the access of irq, it is not suitable
>> for dom0 that has no PIRQ flag, because passthrough a device
>> needs gsi and grant the corresponding irq to guest. So, add a
>> new hypercall to grant gsi permission when dom0 is not PV or dom0
>> has not PIRQ flag.
>>
>> Signed-off-by: Huang Rui <ray.huang@xxxxxxx>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@xxxxxxx>
>
> A problem throughout the series as it seems: Who's the author of these
> patches? There's no From: saying it's not you, but your S-o-b also
> isn't first.
So I need to change to:
Signed-off-by: Jiqian Chen <Jiqian.Chen@xxxxxxx> means I am the author.
Signed-off-by: Huang Rui <ray.huang@xxxxxxx> means Rui sent them to upstream
firstly.
Signed-off-by: Jiqian Chen <Jiqian.Chen@xxxxxxx> means I take continue to
upstream.
>
>> --- a/tools/libs/light/libxl_pci.c
>> +++ b/tools/libs/light/libxl_pci.c
>> @@ -1412,6 +1412,37 @@ static bool pci_supp_legacy_irq(void)
>> #define PCI_SBDF(seg, bus, devfn) \
>> ((((uint32_t)(seg)) << 16) | (PCI_DEVID(bus, devfn)))
>>
>> +static int pci_device_set_gsi(libxl_ctx *ctx,
>> + libxl_domid domid,
>> + libxl_device_pci *pci,
>> + bool map,
>> + int *gsi_back)
>> +{
>> + int r, gsi, pirq;
>> + uint32_t sbdf;
>> +
>> + sbdf = PCI_SBDF(pci->domain, pci->bus, (PCI_DEVFN(pci->dev,
>> pci->func)));
>> + r = xc_physdev_gsi_from_dev(ctx->xch, sbdf);
>> + *gsi_back = r;
>> + if (r < 0)
>> + return r;
>> +
>> + gsi = r;
>> + pirq = r;
>
> r is a GSI as per above; why would you store such in a variable named pirq?
> And how can ...
>
>> + if (map)
>> + r = xc_physdev_map_pirq(ctx->xch, domid, gsi, &pirq);
>> + else
>> + r = xc_physdev_unmap_pirq(ctx->xch, domid, pirq);
>
> ... that value be the correct one to pass into here? In fact, the pIRQ number
> you obtain above in the "map" case isn't handed to the caller, i.e. it is
> effectively lost. Yet that's what would need passing into such an unmap call.
Yes r is GSI and I know pirq will be replaced by xc_physdev_map_pirq.
What I do "pirq = r" is for xc_physdev_unmap_pirq, unmap need passing in pirq,
and the number of pirq is always equal to gsi.
>
>> + if (r)
>> + return r;
>> +
>> + r = xc_domain_gsi_permission(ctx->xch, domid, gsi, map);
>
> Looking at the hypervisor side, this will fail for PV Dom0. In which case imo
> you better would avoid making the call in the first place.
Yes, for PV dom0, the errno is EOPNOTSUPP, then it will do below
xc_domain_irq_permission.
>
>> + if (r && errno == EOPNOTSUPP)
>
> Before here you don't really need the pIRQ number; if all it really is needed
> for is ...
>
>> + r = xc_domain_irq_permission(ctx->xch, domid, pirq, map);
>
> ... this, then it probably also should only be obtained when it's needed. Yet
> overall the intentions here aren't quite clear to me.
Adding the function pci_device_set_gsi is for PVH dom0, while also ensuring
compatibility with PV dom0.
When PVH dom0, it does xc_physdev_map_pirq and xc_domain_gsi_permission(new
hypercall for PVH dom0)
When PV dom0, it keeps the same actions as before codes, it does
xc_physdev_map_pirq and xc_domain_irq_permission.
>
>> @@ -1485,6 +1516,19 @@ static void pci_add_dm_done(libxl__egc *egc,
>> fclose(f);
>> if (!pci_supp_legacy_irq())
>> goto out_no_irq;
>> +
>> + r = pci_device_set_gsi(ctx, domid, pci, 1, &gsi);
>> + if (gsi >= 0) {
>> + if (r < 0) {
>
> This unusual way of error checking likely wants a comment.
Will add in next version.
>
>> + rc = ERROR_FAIL;
>> + LOGED(ERROR, domainid,
>> + "pci_device_set_gsi gsi=%d (error=%d)", gsi, errno);
>> + goto out;
>> + } else {
>> + goto process_permissive;
>> + }
>
> Btw, no need for "else" when the earlier if ends in "goto" or alike.
OK, I will change in next version.
>
>> @@ -1493,13 +1537,6 @@ static void pci_add_dm_done(libxl__egc *egc,
>> goto out_no_irq;
>> }
>> if ((fscanf(f, "%u", &irq) == 1) && irq) {
>> - sbdf = PCI_SBDF(pci->domain, pci->bus,
>> - (PCI_DEVFN(pci->dev, pci->func)));
>> - r = xc_physdev_gsi_from_dev(ctx->xch, sbdf);
>> - /* if fail, keep using irq; if success, r is gsi, use gsi */
>> - if (r != -1) {
>> - irq = r;
>> - }
>
> If I'm not mistaken, this and ...
>
>> @@ -2255,13 +2302,6 @@ skip_bar:
>> }
>>
>> if ((fscanf(f, "%u", &irq) == 1) && irq) {
>> - sbdf = PCI_SBDF(pci->domain, pci->bus,
>> - (PCI_DEVFN(pci->dev, pci->func)));
>> - r = xc_physdev_gsi_from_dev(ctx->xch, sbdf);
>> - /* if fail, keep using irq; if success, r is gsi, use gsi */
>> - if (r != -1) {
>> - irq = r;
>> - }
>
> ... this is code added by the immediately preceding patch. It's pretty odd
> for that to be deleted here again right away. Can the interaction of the
> two patches perhaps be re-arranged to avoid this anomaly?
OK, I will do in next version.
>
>> @@ -237,6 +238,43 @@ long arch_do_domctl(
>> break;
>> }
>>
>> + case XEN_DOMCTL_gsi_permission:
>> + {
>> + unsigned int gsi = domctl->u.gsi_permission.gsi;
>> + int irq = gsi_2_irq(gsi);
>
> I'm not sure it is a good idea to issue this call ahead of the basic error
> checks below.
I will move it below the checks.
>
>> + bool allow = domctl->u.gsi_permission.allow_access;
>
> This allows any non-zero values to mean "true". I think you want to bail
> on values larger than 1, much like you also want to check that the padding
> fields are all zero.
Will change in next version.
>
>> + /*
>> + * If current domain is PV or it has PIRQ flag, it has a mapping
>> + * of gsi, pirq and irq, so it should use XEN_DOMCTL_irq_permission
>> + * to grant irq permission.
>> + */
>> + if ( is_pv_domain(current->domain) || has_pirq(current->domain) )
>
> Please use currd here (and also again below).
Will change in next version.
>
>> + {
>> + ret = -EOPNOTSUPP;
>> + break;
>> + }
>> +
>> + if ( gsi >= nr_irqs_gsi || irq < 0 )
>> + {
>> + ret = -EINVAL;
>> + break;
>> + }
>> +
>> + if ( !irq_access_permitted(current->domain, irq) ||
>> + xsm_irq_permission(XSM_HOOK, d, irq, allow) )
>
> Daniel, is it okay to issue the XSM check using the translated value, not
> the one that was originally passed into the hypercall?
>
>> --- a/xen/arch/x86/io_apic.c
>> +++ b/xen/arch/x86/io_apic.c
>> @@ -955,6 +955,27 @@ static int pin_2_irq(int idx, int apic, int pin)
>> return irq;
>> }
>>
>> +int gsi_2_irq(int gsi)
>> +{
>> + int entry, ioapic, pin;
>> +
>> + ioapic = mp_find_ioapic(gsi);
>> + if ( ioapic < 0 )
>> + return -1;
>
> Can this be a proper errno value (likely -EINVAL), please?
Will change in next version.
>
>> + pin = gsi - io_apic_gsi_base(ioapic);
>
> Hmm, instead of the below: Once you have an (ioapic,pin) tuple, can't
> you simply call apic_pin_2_gsi_irq()?
Will change in next version.
>
>> + entry = find_irq_entry(ioapic, pin, mp_INT);
>> + /*
>> + * If there is no override mapping for irq and gsi in mp_irqs,
>> + * then the default identity mapping applies.
>> + */
>> + if ( entry < 0 )
>> + return gsi;
>> +
>> + return pin_2_irq(entry, ioapic, pin);
>
> Under certain conditions this may return 0. Yet you surely don't want
> to pass IRQ0 back as a result; you want to hand an error back instead.
Will add in next version.
>
>> --- a/xen/include/public/domctl.h
>> +++ b/xen/include/public/domctl.h
>> @@ -465,6 +465,14 @@ struct xen_domctl_irq_permission {
>> };
>>
>>
>> +/* XEN_DOMCTL_gsi_permission */
>> +struct xen_domctl_gsi_permission {
>> + uint32_t gsi;
>> + uint8_t allow_access; /* flag to specify enable/disable of x86 gsi
>> access */
>> + uint8_t pad[3];
>> +};
>> +
>> +
>
> Nit: No (new) double blank lines please. In fact (didn't I say this before
> already?) you could insert between the two above, such that the existing issue
> also disappears.
I remember I changed it here, maybe I made a mistake somewhere, and I will
modify it in the next version.
>
> Jan
--
Best regards,
Jiqian Chen.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |