[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3] xen/x86: On x2APIC mode, derive LDR from APIC ID



On Thu, Nov 23, 2023 at 12:21:36PM +0000, Alejandro Vallejo wrote:
> On Thu, Nov 23, 2023 at 10:03:12AM +0100, Roger Pau Monné wrote:
> > On Wed, Nov 22, 2023 at 04:08:17PM +0000, Alejandro Vallejo wrote:
> > > +static uint32_t x2apic_ldr_from_id(uint32_t id)
> > > +{
> > > +    return ((id & ~0xf) << 12) | (1 << (id & 0xf));
> > > +}
> > > +
> > >  static void set_x2apic_id(struct vlapic *vlapic)
> > >  {
> > > -    u32 id = vlapic_vcpu(vlapic)->vcpu_id;
> > > -    u32 ldr = ((id & ~0xf) << 12) | (1 << (id & 0xf));
> > > +    uint32_t vcpu_id = vlapic_vcpu(vlapic)->vcpu_id;
> > > +    uint32_t apic_id = vcpu_id * 2;
> > > +    uint32_t apic_ldr = x2apic_ldr_from_id(apic_id);
> > > +
> > > +    /* This is a migration bug workaround. See wall of text in 
> > > lapic_load_fixup() */
> > 
> > Line length > 80 cols.
> > 
> > I try to avoid referencing function names in comments, as they tend to
> > get out of sync without noticing.  It's much easier to use cscope to
> > grep for x2apic_ldr_bug_with_vcpu_id and find the comment itself.
> In my experience that's less of a problem than it's usually made out to be,
> and helps new readers know about the real context something is in the place
> it is.
> 
> But I do hold the atypical belief that an out of date pointer to context is
> preferrable to no context.

It's a question of taste TBH, I'm certainly not going to insist.

Since you have to wrap the line to fit in 80 cols anyway, I think I
would rather write: "This is a workaround for migrated domains. ...".
Current text reads to me as it's a migration bug, but that's not the
case, the bug is in the previous Xen versions.  I'm not a native
speaker anyway, so maybe it's just me reading it wrong.

> > 
> > > +    if ( vlapic_domain(vlapic)->arch.hvm.x2apic_ldr_bug_with_vcpu_id )
> > > +        apic_ldr = x2apic_ldr_from_id(vcpu_id);
> > >  
> > > -    vlapic_set_reg(vlapic, APIC_ID, id * 2);
> > > -    vlapic_set_reg(vlapic, APIC_LDR, ldr);
> > > +    vlapic_set_reg(vlapic, APIC_ID, apic_id);
> > > +    vlapic_set_reg(vlapic, APIC_LDR, apic_ldr);
> > >  }
> > >  
> > >  int guest_wrmsr_apic_base(struct vcpu *v, uint64_t val)
> > > @@ -1498,27 +1508,35 @@ static int cf_check lapic_save_regs(struct vcpu 
> > > *v, hvm_domain_context_t *h)
> > >   */
> > >  static void lapic_load_fixup(struct vlapic *vlapic)
> > >  {
> > > -    uint32_t id = vlapic->loaded.id;
> > > +    /* Skip fixups on xAPIC mode, or if the x2APIC LDR is already 
> > > correct */
> > > +    if ( !vlapic_x2apic_mode(vlapic) ||
> > > +         (vlapic->loaded.ldr == x2apic_ldr_from_id(vlapic->loaded.id)) )
> > > +        return;
> > >  
> > > -    if ( vlapic_x2apic_mode(vlapic) && id && vlapic->loaded.ldr == 1 )
> > > -    {
> > > +    if ( vlapic->loaded.ldr == 1 )
> > > +       /*
> > > +        * Xen <= 4.4 had a bug by which all the APICs configured in 
> > > x2APIC
> > > +        * mode got LDR = 1. We can't leave it as-is because it assigned 
> > > the
> > > +        * same LDR to every CPU.  We'll fix fix the bug now and assign an
> > > +        * LDR value consistent with the APIC ID.
> > > +        */
> > > +        set_x2apic_id(vlapic);
> > > +    else if ( vlapic->loaded.ldr ==
> > > +              x2apic_ldr_from_id(vlapic_vcpu(vlapic)->vcpu_id) )
> > >          /*
> > > -         * This is optional: ID != 0 contradicts LDR == 1. It's being 
> > > added
> > > -         * to aid in eventual debugging of issues arising from the fixup 
> > > done
> > > -         * here, but can be dropped as soon as it is found to conflict 
> > > with
> > > -         * other (future) changes.
> > > +         * This is a migration from a broken Xen between 4.4 and 4.18 and
> > > +         * we must _PRESERVE_ LDRs so new vCPUs use consistent 
> > > derivations.
> > 
> > Not sure if we should try to avoid mentioning specific versions in the
> > comments, as I this fix will be backported to stable branches (I hope),
> > and hence those will no longer be affected.
> Hence the "broken Xen" part of the paragraphs. Not every 4.18 will have the
> problem, but it shouldn't be seen in 4.19 onwards. I think there's value in
> stating the versions that "may" exhibit problems, but this is all
> subjective 

FE.

> > 
> > > +         * This is so existing running guests that may have already read
> > > +         * the LDR at the source host aren't surprised when IPIs stop
> > > +         * working as they did at the other end. To address this, we set
> > > +         * this domain boolean so future CPU hotplugs derive an LDR
> > > +         * consistent with the older Xen's broken idea of consistency.
> > 
> > I think this is possibly too verbose, I would be fine with just the
> > first sentence TBH.  If we want the full comment however, the wording
> > should be slightly addressed: it's not just IPIs that would possibly
> > fail to be delivered, but any interrupt attempting to target the APIC
> > using the previous LDR addressing (either an IPI or an external
> > interrupt).
> I can s/IPIs/targetted interrupts/ and remove the second sentence.

I would just use interrupts FWIW.

> > 
> > >           */
> > > -        if ( GET_xAPIC_ID(id) != vlapic_vcpu(vlapic)->vcpu_id * 2 ||
> > > -             id != SET_xAPIC_ID(GET_xAPIC_ID(id)) )
> > > -            printk(XENLOG_G_WARNING "%pv: bogus APIC ID %#x loaded\n",
> > > -                   vlapic_vcpu(vlapic), id);
> > > -        set_x2apic_id(vlapic);
> > > -    }
> > > -    else /* Undo an eventual earlier fixup. */
> > > -    {
> > > -        vlapic_set_reg(vlapic, APIC_ID, id);
> > > -        vlapic_set_reg(vlapic, APIC_LDR, vlapic->loaded.ldr);
> > > -    }
> > > +        vlapic_domain(vlapic)->arch.hvm.x2apic_ldr_bug_with_vcpu_id = 
> > > true;
> > > +    else
> > > +        printk(XENLOG_G_WARNING
> > > +               "%pv: bogus x2APIC loaded id=%#x ldr=%#x\n",
> > > +               vlapic_vcpu(vlapic), vlapic->loaded.id, 
> > > vlapic->loaded.ldr);
> > 
> > Could you write the expected values while at it:
> > 
> > "%pv: bogus x2APIC loaded id=%#x ldr=%#x (expected id=%#x ldr=%#x)\n"
> x2APIC ID is current strictly related to the vcpu ID, but it won't be after
> I'm done with topology. I can print the expected LDR though.

Oh, I see.  Just printing the expected LDR then would be fine.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.