[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 09/19] x86/vmce: fill MSR_IA32_MCG_STATUS on all vcpus in broadcast case
On 02/17/17 03:21 -0700, Jan Beulich wrote: > >>> On 17.02.17 at 07:39, <haozhong.zhang@xxxxxxxxx> wrote: > > --- a/xen/arch/x86/cpu/mcheck/mcaction.c > > +++ b/xen/arch/x86/cpu/mcheck/mcaction.c > > @@ -88,21 +88,21 @@ mc_memerr_dhandler(struct mca_binfo *binfo, > > goto vmce_failed; > > } > > > > + if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL ) > > + vmce_vcpuid = VMCE_INJECT_BROADCAST; > > + else > > + vmce_vcpuid = global->mc_vcpuid; > > + > > bank->mc_addr = gfn << PAGE_SHIFT | > > (bank->mc_addr & (PAGE_SIZE -1 )); > > - if ( fill_vmsr_data(bank, d, > > - global->mc_gstatus) == -1 ) > > + if ( fill_vmsr_data(bank, d, global->mc_gstatus, > > + vmce_vcpuid == VMCE_INJECT_BROADCAST) > > == -1 ) > > You're changing the return values of function to be proper -E... > values, so you can't validly compare against -1 here anymore. > will fix > > int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d, > > - uint64_t gstatus) > > + uint64_t gstatus, bool broadcast) > > { > > struct vcpu *v = d->vcpu[0]; > > + int ret; > > > > - if ( mc_bank->mc_domid != (uint16_t)~0 ) > > - { > > - if ( v->arch.vmce.mcg_status & MCG_STATUS_MCIP ) > > - { > > - mce_printk(MCE_QUIET, "MCE: guest has not handled previous" > > - " vMCE yet!\n"); > > - return -1; > > - } > > - > > - spin_lock(&v->arch.vmce.lock); > > + if ( mc_bank->mc_domid == (uint16_t)~0 ) > > + return -EINVAL; > > Returning -EINVAL here is a behavioral change which, if intended, > needs justification in the commit message. > It's intended. "ASSERT(d)" in its only caller mc_memerr_dhandler() implies an invalid domain id should be treated as an error. I'll mention this in the commit message. > > - v->arch.vmce.mcg_status = gstatus; > > - /* > > - * 1. Skip bank 0 to avoid 'bank 0 quirk' of old processors > > - * 2. Filter MCi_STATUS MSCOD model specific error code to guest > > - */ > > - v->arch.vmce.bank[1].mci_status = mc_bank->mc_status & > > - MCi_STATUS_MSCOD_MASK; > > - v->arch.vmce.bank[1].mci_addr = mc_bank->mc_addr; > > - v->arch.vmce.bank[1].mci_misc = mc_bank->mc_misc; > > + ret = vcpu_fill_mc_msrs(v, gstatus, mc_bank->mc_status, > > + mc_bank->mc_addr, mc_bank->mc_misc); > > + if ( ret || !broadcast ) > > + goto out; > > > > - spin_unlock(&v->arch.vmce.lock); > > + for_each_vcpu ( d, v ) > > + { > > + if ( v == d->vcpu[0] ) > > Comparing v->vcpu_id with zero would be a cheaper check as it > seems. > will change > > + continue; > > + ret = vcpu_fill_mc_msrs(v, MCG_STATUS_MCIP | MCG_STATUS_RIPV, > > + 0, 0, 0); > > What guarantees RIP to be valid, and why all the zeros? Perhaps > I'm lacking some understanding of how real hardware handles the > broadcasting, but it would help if you left an explaining comment > here. > I just inject the less severity error on vcpus other than vcpu0. As long as vMCE with the actual error information is injected to vcpu0, the guest MCE handler should be able to decide whether it can recover or not. If it can, vMCE injected on other vcpus should not offer conflict information. If it cannot, MC MSRs on other vcpus will not matter in fact as they represent a less severe error. MSR_IA32_MCG_STATUS = MCG_STATUS_MCIP | MCG_STATUS_RIPV and all banks being zero (invalid) is one of the combinations satisfying above requirement. > > + if ( ret ) > > + break; > > Wouldn't you better, on a best effort basis, try to update the > remaining vCPU-s anyway (making sure you don't lose the error > indicator to be handed back to your caller). > sure, will change to update all vcpus > > } > > > > - return 0; > > + out: > > + return ret; > > If at the destination of a goto all you do is return, please avoid > using goto. > will change Thanks, Haozhong _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |