Re: [PATCH] x86/hvm: Advertise and support extended destination IDs for MSI/IO-APIC
On Thu, Feb 19, 2026 at 12:44:47PM +0000, Julian Vetter wrote:
>
>
> On 2/9/26 14:16, Roger Pau Monné wrote:
> > On Mon, Feb 09, 2026 at 11:34:18AM +0000, Julian Vetter wrote:
> >> x2APIC guests with more than 128 vCPUs have APIC IDs above 255, but MSI
> >> addresses and IO-APIC RTEs only provide an 8-bit destination field.
> >> Without extended destination ID support, Linux limits the maximum usable
> >> APIC ID to 255, refusing to bring up vCPUs beyond that limit. So,
> >> advertise XEN_HVM_CPUID_EXT_DEST_ID in the HVM hypervisor CPUID leaf,
> >> signalling that guests may use MSI address bits 11:5 and IO-APIC RTE
> >> bits 55:49 as additional high destination ID bits. This expands the
> >> destination ID from 8 to 15 bits.
> >>
> >> Signed-off-by: Julian Vetter <julian.vetter@xxxxxxxxxx>
> >> ---
> >> xen/arch/x86/cpuid.c | 9 +++++++++
> >> xen/arch/x86/hvm/irq.c | 11 ++++++++++-
> >> xen/arch/x86/hvm/vioapic.c | 2 +-
> >> xen/arch/x86/hvm/vmsi.c | 4 ++--
> >> xen/arch/x86/include/asm/hvm/hvm.h | 4 ++--
> >> xen/arch/x86/include/asm/hvm/vioapic.h | 13 +++++++++++++
> >> xen/arch/x86/include/asm/msi.h | 3 +++
> >> 7 files changed, 40 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
> >> index d85be20d86..fb17c71d74 100644
> >> --- a/xen/arch/x86/cpuid.c
> >> +++ b/xen/arch/x86/cpuid.c
> >> @@ -148,6 +148,15 @@ static void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
> >>          res->a |= XEN_HVM_CPUID_DOMID_PRESENT;
> >>          res->c = d->domain_id;
> >>
> >> +        /*
> >> +         * Advertise extended destination ID support. This allows guests
> >> +         * to use bits 11:5 of the MSI address and bits 55:49 of the
> >> +         * IO-APIC RTE for additional destination ID bits, expanding the
> >> +         * addressable APIC ID range from 8 to 15 bits. This is required
> >> +         * for x2APIC guests with APIC IDs > 255.
> >> +         */
> >> +        res->a |= XEN_HVM_CPUID_EXT_DEST_ID;
> >
> > This cannot be unilaterally advertised: you need a QEMU (or in general
> > any device model that manages PCI passthrough) to understand the
> > extended destination mode. This requires the introduction of a new
> > XEN_DOMCTL_bind_pt_irq-equivalent hypercall that can take an extended
> > destination ID not limited to 256 values:
> >
> > struct xen_domctl_bind_pt_irq {
> > [...]
> > uint32_t gflags;
> > #define XEN_DOMCTL_VMSI_X86_DEST_ID_MASK 0x0000ff
> >
> > When doing PCI passthrough, QEMU is the entity that decodes the MSI
> > address and data fields, and hence it would need extending, plus
> > negotiation with Xen about whether the Extended ID feature can be
> > advertised.
> >
> > It would be good to introduce a new XEN_DMOP_* set of hypercalls that
> > support Extended ID to do the PCI passthrough interrupt binding.
>
> Thank you for your feedback. But wouldn't it be enough if QEMU
> extracted the additional bits from the gflags and passed them on to
> Xen?
Possibly; you would need to use the still-unused 7 bits at the top of
the flags field, AFAICT.
> In
> pt_irq_create_bind I already extract the additional bits. In QEMU the
> function msi_dest_id would just need to extract the additional bits
> before calling xc_domain_update_msi_irq. The gflags argument in
> xc_domain_update_msi_irq is 32 bits wide, so there is enough room to
> pass the additional bits. What do you think?
It's possible. However, there's still the question of how QEMU signals
Xen that it implements the extended destination logic. QEMU and Xen
are two separate components, and Xen cannot unilaterally advertise
support for Extended IDs if QEMU doesn't actually implement it. You
need some kind of negotiation between the device model and Xen.
It would IMO be way better if we could simply avoid having to parse
the MSI address and data fields in QEMU, and just forward them to Xen.
Then Xen could interpret them in whatever format it wants, and there
would be no negotiation needed between QEMU and Xen.
The XEN_DOMCTL_{un}bind_pt_irq hypercalls have no reason to be
domctls; it would be much better if we introduced equivalent DM ops,
as that would remove the usage of two unstable hypercalls from QEMU
and would bring us closer to QEMU not being tied to the running Xen
version. Hence my recommendation to take this opportunity to introduce
a new pair of DM ops to replace those domctls.
Thanks, Roger.