[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops



* Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:

> On Thu, Sep 17, 2015 at 12:19 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >
> > * Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> >
> >> Setting CONFIG_PARAVIRT=y has an unintended side effect: it silently
> >> turns all rdmsr and wrmsr operations into the safe variants without
> >> any checks that the operations actually succeed.
> >>
> >> This is IMO awful: it papers over bugs.  In particular, KVM gueests
> >> might be unwittingly depending on this behavior because
> >> CONFIG_KVM_GUEST currently depends on CONFIG_PARAVIRT.  I'm not
> >> aware of any such problems, but applying this series would be a good
> >> way to shake them out.
> >>
> >> Fix it so that the MSR operations work the same on CONFIG_PARAVIRT=n
> >> and CONFIG_PARAVIRT=y as long as Xen isn't being used.  The Xen
> >> maintainers are welcome to make a similar change on top of this.
> >>
> >> Since there's plenty of time before the next merge window, I think
> >> we should apply and fix anything that breaks.
> >
> > No, I think we should at most generate a warning instead, and not crash the 
> > kernel
> > via rdmsr()!
> >
> > Most big distro kernels on bare metal have CONFIG_PARAVIRT=y (I checked 
> > Ubuntu and
> > Fedora), so we are potentially exposing a lot of users to problems.
> >
> > Crashing the bootup on an unknown MSR is bad. Many MSR reads and writes are
> > non-critical and returning the 'safe' result is much better than crashing or
> > hanging the bootup.
> >
> 
> Should we do that for CONFIG_PARAVIRT=n, too?

Absolutely. PARAVIRT=n should not behave differently from PARAVIRT=y on bare 
metal.

> It would be straightforward to rig this up (temporarily?) on top of these 
> patches.  To keep bloat down, we might want to implement it in 
> do_general_protection rather than sticking it in native_read_msr.

Fair enough.

> wrmsr is a different beast, since we can fail due to writing the wrong value 
> to 
> an otherwise valid MSR.  Given that MSR screwups can very easily be security 
> holes, I'm not sure that warning and blindly continuing on an unchecked 
> failed 
> wrmsr is a good idea.

So the fact is that right now we are silently ignoring failures there, and have 
been doing that for some time. The right first step is to live with that and 
start 
generating low-key, once-per-bootup warnings at most, and see how frequent (and 
how serious) they are.

We could add a (default disabled) CONFIG_PANIC_ON_UNKNOWN_MSR=y option if 
that's 
really a serious concern.

> In any event, I think it's nuts that CONFIG_PARAVIRT changes this
> behavior.  We should pick something sane and stick with it.

Absolutely - and as it happens, the 'does not crash the kernel' PARAVIRT=y 
accidental behavior is actually quite close to what we wanted for a long time, 
so 
let's make it official - and add a warning to make sure we are aware of 
problems. 

But don't turn 'potential problems' into showstopper bugs such as a hard to 
debug 
early boot crash, which to most Linux users means a black screen on bootup!

> > ( We should double check that rdmsr()/wrmsr() results are never left
> >   uninitialized, but are set to zero or so, for cases where the return code 
> > is not
> >   checked. )
> 
> It sure looks like native_read_msr_safe doesn't clear the output if the rdmsr 
> fails.

Yeah, that should be fixed, to make such soft failures deterministic.

Thanks,

        Ingo

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.