Re: [Xen-devel] [PATCH v2 1/4] x86/MSI-X: be more careful during teardown



>>> On 13.04.15 at 12:50, <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> On Mon, 13 Apr 2015, Jan Beulich wrote:
>> >>> On 02.04.15 at 18:49, <stefano.stabellini@xxxxxxxxxxxxx> wrote:
>> > On Wed, 25 Mar 2015, Jan Beulich wrote:
>> >> When a device gets detached from a guest, pciback will clear its
>> >> command register, thus disabling both memory and I/O decoding. The
>> >> disabled memory decoding, however, has an effect on the MSI-X table
>> >> accesses the hypervisor does: These won't have the intended effect
>> >> anymore. Even worse, for PCIe devices (but not SR-IOV virtual
>> >> functions) such accesses may (will?) be treated as Unsupported
>> >> Requests, causing respective errors to be surfaced, potentially in the
>> >> form of NMIs that may be fatal to the hypervisor or Dom0 in different
>> >> ways. Hence rather than carrying out these accesses, we should avoid
>> >> them where we can, and use alternative (e.g. PCI config space based)
>> >> mechanisms to achieve at least the same effect.
>> > 
>> > I don't think that it is a good idea for both Xen and Linux to access
>> > the command register simultaneously. Working around Linux in Xen
>> > doesn't sound like an optimal solution. Maybe we could just fix
>> > pciback and that would be enough.
>> 
>> I'm afraid that would just eliminate the specific case, but not the
>> general issue.
> 
> If we trust Dom0 to do the right thing, then I don't think there is a
> general issue to be solved. Dom0 can break the system at any time; I
> don't see any difference here, unless we have a plan to actually be
> able to handle a misbehaving Dom0, in which case I am all for it.

No, that takes us in the wrong direction. Dom0 can have legitimate
reasons to have to clear memory or I/O decoding on a device at
run time (even if current Linux doesn't do so). The more general
problem we may need to solve is that of racing config space
accesses (one by Dom0, the other by the hypervisor). But that's
beyond this series' scope.
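
To illustrate the approach from the patch description quoted above (skip
the MSI-X table access while memory decoding is off, and fall back to a
config space based mechanism), here is a minimal sketch in C. It is not
the code from this series: `struct fake_dev`, `cfg_read16()`,
`cfg_write16()` and `msix_mask_entry()` are hypothetical names, and
config space and the MSI-X table are simulated with plain arrays. Only
the register offsets and bit values are taken from the PCI
specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Register offsets and bits as defined by the PCI specification. */
#define PCI_COMMAND                0x04   /* command register */
#define PCI_COMMAND_MEMORY         0x0002 /* memory decoding enable */
#define PCI_MSIX_FLAGS             0x02   /* message control, relative to the capability */
#define PCI_MSIX_FLAGS_MASKALL     0x4000 /* function mask: masks all vectors */
#define PCI_MSIX_ENTRY_VECTOR_CTRL 12     /* byte offset inside a table entry */
#define PCI_MSIX_VECTOR_BITMASK    0x1    /* per-vector mask bit */

#define NR_ENTRIES 4

/* Hypothetical simulated device; a real one needs actual config/MMIO accessors. */
struct fake_dev {
    uint16_t cfg[128];                /* 256 bytes of config space */
    unsigned int msix_cap;            /* byte offset of the MSI-X capability */
    uint32_t table[NR_ENTRIES][4];    /* MSI-X table, 16 bytes per entry */
};

static uint16_t cfg_read16(const struct fake_dev *d, unsigned int off)
{
    return d->cfg[off / 2];
}

static void cfg_write16(struct fake_dev *d, unsigned int off, uint16_t val)
{
    d->cfg[off / 2] = val;
}

/*
 * Mask one MSI-X vector. Returns true if the table entry was written,
 * false if memory decoding was off and the function-wide MASKALL bit in
 * config space was set instead (which masks at least the same vector).
 */
static bool msix_mask_entry(struct fake_dev *d, unsigned int entry)
{
    assert(entry < NR_ENTRIES);

    if ((cfg_read16(d, PCI_COMMAND) & PCI_COMMAND_MEMORY) != 0) {
        /* Decoding is on: the table access reaches the device. */
        d->table[entry][PCI_MSIX_ENTRY_VECTOR_CTRL / 4] |= PCI_MSIX_VECTOR_BITMASK;
        return true;
    }

    /* Decoding is off: avoid the table and use config space only. */
    uint16_t ctrl = cfg_read16(d, d->msix_cap + PCI_MSIX_FLAGS);
    cfg_write16(d, d->msix_cap + PCI_MSIX_FLAGS, ctrl | PCI_MSIX_FLAGS_MASKALL);
    return false;
}
```

Note that the check of the command register and the subsequent access
are not atomic with respect to Dom0 in this sketch, which is exactly the
config space race mentioned above.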

>> While we trust Dom0 to not do outright bad things,
>> the hypervisor should still avoid doing things that can go wrong
>> due to the state a device is put (or left) in by Dom0.
> 
> Xen should also avoid doing things that can go wrong because of the
> state a device is put in by QEMU or other components in the system.
> There isn't much room for Xen to play with.

QEMU is either part of Dom0, or doesn't play with devices directly.

> And how are we going to deal with older "unfixed" QEMUs?
> So far we have been using the same policy for QEMU and the Dom0 kernel:
> Xen doesn't break them -- old Linux kernels and QEMUs are supposed to
> just work.

I'm not sure that's really true for QEMU, or if it is, then only by pure
luck: The toolstack interface of the hypervisor as well as the libxc
interfaces are subject to change between any two releases. I view
it as unavoidable to break older QEMU here.

Jan

