Xen project Mailing List

Re: [Xen-devel] [PATCH V2] x86, amd_ucode: Safeguard against #GP

To: Aravind Gopalakrishnan <aravind.gopalakrishnan@xxxxxxx>, JBeulich@xxxxxxxx, xen-devel@xxxxxxxxxxxxx

From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Wed, 28 May 2014 00:47:17 +0100

Cc: boris.ostrovsky@xxxxxxxxxx, keir@xxxxxxx

Delivery-date: Tue, 27 May 2014 23:47:35 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 27/05/2014 19:24, Aravind Gopalakrishnan wrote: > When HW tries to load a corrupted patch, it generates #GP > and hangs the system. Use wrmsr_safe instead so that we > fail to load microcode gracefully. > > Also, massage error handling around apply_microcode to keep > in tune with error handling style of other parts of the code. > > Example on a Fam15h system- > (XEN) microcode: CPU0 collect_cpu_info: patch_id=0x6000626 > (XEN) microcode: CPU0 size 7870, block size 2586 offset 76 equivID > 0x6012 rev 0x6000637 > (XEN) microcode: CPU0 found a matching microcode update with version > 0x6000637 (current=0x6000626) > (XEN) traps.c:3073: GPF (0000): ffff82d08016f682 -> ffff82d08022d9f8 > (XEN) microcode: CPU0 update from revision 0x6000637 to 0x6000626 failed > ^^^^^^^^^^^^^^^^^^^^^^ > As shown, the log message above has the two revisions reversed. Fix this > > Changes in V2: > - Do not ignore return value from wrmsr_safe > - Flip revision numbers as shown above > > Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@xxxxxxx> > Reviewed-by: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx> I thought we had identified that the hangs were to do with your use of 'noreboot' on the Xen command line. ~Andrew > --- > xen/arch/x86/microcode_amd.c | 25 +++++++++++++++++++------ > 1 file changed, 19 insertions(+), 6 deletions(-) > > diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c > index e83f4b6..1db8a0d 100644 > --- a/xen/arch/x86/microcode_amd.c > +++ b/xen/arch/x86/microcode_amd.c > @@ -178,32 +178,39 @@ static int apply_microcode(int cpu) > uint32_t rev; > struct microcode_amd *mc_amd = uci->mc.mc_amd; > struct microcode_header_amd *hdr; > + int error = -EINVAL; > > /* We should bind the task to the CPU */ > BUG_ON(raw_smp_processor_id() != cpu); > > if ( mc_amd == NULL ) > - return -EINVAL; > + goto apply_err1; > > hdr = mc_amd->mpb; > if ( hdr == NULL ) > - return -EINVAL; > + goto apply_err1; > > spin_lock_irqsave(&microcode_update_lock, flags); > > - wrmsrl(MSR_AMD_PATCHLOADER, (unsigned long)hdr); > + error = wrmsr_safe(MSR_AMD_PATCHLOADER, (unsigned long)hdr); > > /* get patch id after patching */ > rdmsrl(MSR_AMD_PATCHLEVEL, rev); > > spin_unlock_irqrestore(&microcode_update_lock, flags); > > + /* Catch HW patch application failure */ > + if ( error ) { > + printk(KERN_ERR "microcode: CPU%d ucode patch application failed HW > tests. " > + "HW returned #GP\n", cpu); > + goto apply_err2; This... > + } > + > /* check current patch id and patch's id for match */ > if ( rev != hdr->patch_id ) > { > - printk(KERN_ERR "microcode: CPU%d update from revision " > - "%#x to %#x failed\n", cpu, hdr->patch_id, rev); > - return -EIO; > + error = -EIO; > + goto apply_err2; > } > > printk(KERN_WARNING "microcode: CPU%d updated from revision %#x to > %#x\n", > @@ -212,6 +219,12 @@ static int apply_microcode(int cpu) > uci->cpu_sig.rev = rev; > > return 0; > + > +apply_err2: > + printk(KERN_ERR "microcode: CPU%d update from revision " > + "%#x to %#x failed\n", cpu, rev, hdr->patch_id); ... combined with this will result in two error messages being printed. This seems over overkill for the circumstance. ~Andrew. > +apply_err1: > + return error; > } > > static int get_ucode_from_buffer_amd( _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.