
Re: [Xen-devel] [PATCH] misc/xenmicrocode: Upload /lib/firmware/<some blob> to the hypervisor

On Wed, Jan 28, 2015 at 09:39:24AM +0100, Borislav Petkov wrote:
> On Wed, Jan 28, 2015 at 12:10:43AM +0000, Andrew Cooper wrote:
> > There was a thread on xen-devel but I can't currently find it in the
> > archives.
> > 
> > To the best of my memory,  it was a 4 core APU system where the BIOS had
> > updated the microcode on cpu 0 but left 1-3 at a lower patch level. 
> > Every time the reporter tried creating an HVM guest (i.e. entering SVM
> > non-root mode), the system reset.
> > 
> > The instability was sorted by ensuring each core was at the same
> > microcode level.
> That sounds like a BIOS bug to me, frankly.
> > As Xen updates microcode one cpu at a time from 0, it could easily
> > create a similar situation if microcode is updated after VMs have been
> > started.  Come to think of it, this is also an impending problem for PVH
> > dom0 systems.
> The common way for doing microcode updates is to update all cores at
> the same time, possibly. Or at least as close to one another in time as
> possible.

How close?

> Now, we do two methods:
> * the early update which should be done as early as possible during
> boot. I don't think that should be a problem wrt to guests if you do it
> early enough.
> * the late update is an addition to the early one to cover the cases of
> long running systems where a reboot is prohibitively painful. With that,
> as with the early method, you would want to update all hardware cores in
> one go.
> Now, this is where it becomes tricky for virt: you need to stop guests,
> do the update and then resume them. Even worse, if all of a sudden you
> want to hide hardware features and/or instructions like HSW TSX for
> example, you most likely want to even avoid the late update and warn the
> admin that she has to reboot that machine and apply microcode with the
> early method.
> So this should be the gist of it...

I've reviewed the implementation a bit more on the Xen side. For early boot
things look similar to what is done upstream in the kernel. For the runtime
update here's what Xen does in detail, elaborating a bit more on Andrew's
summary of how it works.

The XENPF_microcode_update hypercall calls the general Xen microcode_update(),
which does microcode_ops->start_update() (only AMD has this op, and it does
svm_host_osvw_reset()) and finally continues the hypercall by calling
do_microcode_update() on cpumask_first(&cpu_online_map), *always*. The
mechanism that Xen uses to continue the hypercall is
continue_hypercall_on_cpu(): if this returns 0 then the function is guaranteed
to run *at some point in the future* on the given CPU. If preemption is enabled
this could also mean the hypercall was preempted, and it can be preempted again
later on the other CPU. do_microcode_update() will in turn make the same call
for the next CPU, using a continuation, until it reaches the end of the CPU mask.
On each iteration do_microcode_update() calls ops->cpu_request_microcode(),
which in turn should also do ops->apply_microcode() once a microcode buffer
that fits is found in the file. The buffers are kept in case of
suspend / resume.
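
To make the walk over the CPU mask concrete, here is a minimal toy model of
it in plain C. This is a sketch under assumptions, not Xen source: every
toy_* name is hypothetical, the online mask is a plain bitmask, and the
deferred continuation is replaced by an ordinary loop, so the "runs at some
point in the future" aspect is deliberately not modelled.

```c
/* Toy model of the one-CPU-at-a-time walk described above. NOT Xen code:
 * all toy_* names are made up for illustration. */
#include <assert.h>
#include <stddef.h>

#define TOY_NR_CPUS 8

struct toy_state {
    unsigned int online_mask;       /* bit n set => CPU n is online */
    unsigned int rev[TOY_NR_CPUS];  /* microcode revision per CPU */
    int order[TOY_NR_CPUS];         /* order in which CPUs were updated */
    size_t nr_updated;              /* number of CPUs updated so far */
};

/* Return the lowest online CPU >= start, or TOY_NR_CPUS if there is none. */
static int toy_next_online(unsigned int mask, int start)
{
    for (int cpu = start; cpu < TOY_NR_CPUS; cpu++)
        if (mask & (1u << cpu))
            return cpu;
    return TOY_NR_CPUS;
}

/* One step of the chain: "run on" cpu, apply the update there, and return
 * the CPU the next step should run on. */
static int toy_update_one(struct toy_state *s, int cpu, unsigned int new_rev)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    s->rev[cpu] = new_rev;
    s->order[s->nr_updated++] = cpu;
    return toy_next_online(s->online_mask, cpu + 1);
}

/* Drive the chain from the first online CPU to the end of the mask.
 * In this model each step runs immediately; nothing is deferred.
 * Returns the number of CPUs updated. */
static size_t toy_update_all(struct toy_state *s, unsigned int new_rev)
{
    int cpu = toy_next_online(s->online_mask, 0);

    while (cpu < TOY_NR_CPUS)
        cpu = toy_update_one(s, cpu, new_rev);
    return s->nr_updated;
}
```

With an online mask of 0x0B (CPUs 0, 1 and 3), toy_update_all() visits the
CPUs in the order 0, 1, 3 and leaves the offline CPU 2 untouched.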

There is no tight loop here or locking of what other CPUs do while one is
working to update microcode. Tons of things can happen in between, so some
form of synchronization seems desirable, and this implementation likely does
differ quite significantly from the Linux kernel's legacy 'rescan' interface.
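
As a rough illustration of the window this leaves open, the toy model below
updates CPUs strictly one after another and counts the intermediate states in
which the CPUs do not all report the same revision. It is only a sketch: the
toy_* names are hypothetical, the revisions are made-up numbers, and it says
nothing about how long each such state lasts on real hardware.

```c
/* Toy model: count the mixed-revision states seen during a sequential
 * update. NOT Xen code: all toy_* names are made up for illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if the first n entries of rev[] are not all equal. */
static bool toy_revs_mixed(const unsigned int *rev, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (rev[i] != rev[0])
            return true;
    return false;
}

/* Update CPUs 0..n-1 in order and return how many of the states observed
 * right after each single-CPU update had mismatched revisions. */
static size_t toy_count_mixed_states(unsigned int *rev, size_t n,
                                     unsigned int new_rev)
{
    size_t mixed = 0;

    assert(rev != NULL);
    for (size_t cpu = 0; cpu < n; cpu++) {
        rev[cpu] = new_rev;
        /* Assumption: anything else may run at this point, since nothing
         * holds the other CPUs still. */
        if (toy_revs_mixed(rev, n))
            mixed++;
    }
    return mixed;
}
```

For four CPUs that all start at the same old revision, three of the four
states observed during the walk are mixed; only the last one is uniform again.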

Given this review, it seems folks should use xenmicrocode keeping in mind the
above algorithm, and support-wise folks should be ready to consider microcode
upgrades and possible issues / caveats from vendors on a case-by-case basis.

From what I gather some folks have even considered tainting kernels when the
sysfs rescan interface is used. I do wonder if this is worth doing on Xen for
this tool given the possible issues here... or am I just paranoid about this?
It seems like this might be a more severe issue for Xen as-is.

