[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen/x86: On x2APIC mode, derive LDR from APIC_ID



Hi,

On Tue, Nov 14, 2023 at 03:11:28PM +0100, Roger Pau Monné wrote:
> On Tue, Nov 14, 2023 at 12:55:46PM +0000, Andrew Cooper wrote:
> > On 14/11/2023 12:32 pm, Jan Beulich wrote:
> > > On 14.11.2023 13:18, Alejandro Vallejo wrote:
> > >> On Tue, Nov 14, 2023 at 11:14:22AM +0100, Jan Beulich wrote:
> > >>> On 13.11.2023 18:53, Roger Pau Monné wrote:
> > >>>> On Mon, Nov 13, 2023 at 04:50:23PM +0000, Alejandro Vallejo wrote:
> > >>>>> Both Intel and AMD manuals agree that on x2APIC mode, the APIC LDR 
> > >>>>> and ID
> > >>>>> registers are derivable from each other through a fixed formula.
> > >>>>>
> > >>>>> Xen uses that formula, but applies it to vCPU IDs (which are 
> > >>>>> sequential)
> > >>>>> rather than x2APIC_IDs (which are not, at the moment). As I 
> > >>>>> understand it,
> > >>>>> this is an attempt to tightly pack vCPUs into clusters so each 
> > >>>>> cluster has
> > >>>>> 16 vCPUs rather than 8, but this is problematic for OSs that might 
> > >>>>> read the
> > >>>>> x2APIC_ID and internally derive LDR (or the other way around)
> > >>>> I would replace the underscore from x2APIC ID with a space instead.
> > >>>>
> > >>>> Seeing the commit that introduced the bogus LDR value, I'm not sure it
> > >>>> was intentional,
> > >>> Hard to reconstruct over 9 years later. It feels like Alejandro may be 
> > >>> right
> > >>> with his derivation.
> > >>>
> > >>>> as previous Xen code had:
> > >>>>
> > >>>> u32 id = vlapic_get_reg(vlapic, APIC_ID);
> > >>>> u32 ldr = ((id & ~0xf) << 16) | (1 << (id & 0xf));
> > >>>> vlapic_set_reg(vlapic, APIC_LDR, ldr);
> > >>>>
> > >>>> Which was correct, as the LDR was derived from the APIC ID and not the
> > >>>> vCPU ID.
> > >>> Well, it gave the appearance of deriving from the APIC ID. Just that it 
> > >>> was
> > >>> missing GET_xAPIC_ID() around the vlapic_get_reg() (hence why LDR was
> > >>> uniformly 1 on all CPUs).
> > >>>
> > >>>>> This patch fixes the implementation so we follow the rules in the 
> > >>>>> x2APIC
> > >>>>> spec(s).
> > >>>>>
> > >>>>> While in the neighborhood, replace the u32 type with the standard 
> > >>>>> uint32_t
> > >>>> Likely wants:
> > >>>>
> > >>>> Fixes: f9e0cccf7b35 ('x86/HVM: fix ID handling of x2APIC emulation')
> > >>> +1
> > >>>
> > >>>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@xxxxxxxxx>
> > >>>> I do wonder whether we need to take any precautions with guests being
> > >>>> able to trigger an APIC reset, and thus seeing a changed LDR register
> > >>>> if the guest happens to be migrated from an older hypervisor version
> > >>>> that doesn't have this fix.  IOW: I wonder whether Xen should keep the
> > >>>> previous bogus LDR value across APIC resets for guests that have
> > >>>> already seen it.
> > >>> That earlier change deliberately fixed up any bogus values. I wonder
> > >>> whether what you suggest will do more good or more harm than going
> > >>> even farther and once again fixing up bad values in lapic_load_fixup().
> > >>> After all LDR being wrong affects vlapic_match_logical_addr()'s outcome.
> > >>> I think one of the two wants adding to the change, though.
> > >>>
> > >> You mean changing the LDR of a vCPU to the correct value on migrate? That
> > >> feels like playing with fire. A migrated VM is presumably a VM that is
> > >> running without issues (or it would have been rebooted). Letting it run
> > >> as it did seems safer.
> > > See Andrew's reply.
> > >
> > >> I don't think vlapic_match_logical_addr() is affected. The LDR's are 
> > >> still
> > >> unique in the bogus case so the matching ought to work. Problem would 
> > >> arise
> > >> if the guest makes assumptions about APIC_ID and LDR relationships.
> > > The LDRs still being unique (or not) isn't what I'm concerned about. It is
> > > the function's return value which would be wrong, as the incoming "mda"
> > > presumably was set in its respective field on the assumption that the LDRs
> > > are set in a spec-compliant way. There not having been problem reports
> > > makes me wonder whether any guests actually use logical delivery mode in a
> > > wider fashion.
> > 
> > They likely don't.
> > 
> > Logical delivery for xAPIC only works in a tiny fraction of cases
> > (assuming correct topology information, which we don't give), and
> > persuading a VM to turn on x2APIC without a vIOMMU is not something
> > we've managed to do in Xen.
> 
> We do, in fact the pvshim (or nested Xen) will run in x2APIC mode if
> available.
> 
> Linux >= 5.17 will also use x2APIC mode if available when running as a
> Xen HVM guest:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c8980fcb210851138cb34c9a8cb0cf0c09f07bf9
> 
> If a guest has been booted with the bogus LDR we need to keep it on
> migrate, otherwise at least Xen will break (because it does read the
> LDR from the hardware instead of building it based on the APIC ID).
Ack. I'll just integrate this into my ongoing series, with...
> 
> Switching to the correct LDR on APIC reset can be sensible, any APIC
> device reset should be done together with updating whatever registers
> have been previously cached, and OSes don't do APIC resets anyway.
... this in mind.

> 
> > Also (as I learn talking to people just last night) it turns out that
> > Logical Destination Mode for external interrupts is entirely broken
> > anyway.  It always hits the lowest ID in the cluster, unless the LAPIC
> > in question is already servicing a same-or-higher priority interrupt at
> > which point the next ID in the cluster is tried.
> 
> Yeah, I've heard similar things for lowpri mode.  It's also valid to
> implement as a round-robin.
> 
> Thanks, Roger.

Cheers,
Alejandro



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.