[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Is: pci=assign-busses blows up Xen 4.4 Was:Re: [PATCH] x86/msi: Validate the guest-identified PCI devices in pci_prepare_msix()

>>> On 05.02.14 at 21:07, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
> Anyhow, what I discovered was that the patch attached does allow me to
> boot with Xen. It is not pretty.

There were two patches attached - an ugly kernel one, and a
bogus hypervisor one. The hypervisor one is bogus because it
attempts to paper over Dom0 not fulfilling its role (here: failure
to propagate bus topology changes). That said ...

> But I was thinking to fix in the hypervisor and realized there are three
> ways of fixing it:
>  1). Do the hypercall to delete/add devices and let initial domain figure
>      this out. (which the Linux attached patch does).
>  2). Be more aware of the bus2bridge topology when removing a PCI bridge or
>      device. I had one bug where we ended up with this bus2bridge structure:
>       6 -> 06:00.0
>       7 -> 06:00.0
>       8 -> 07:01.0
> Which meant that for devices on bus 8, 7 and 6 we would never find the 
> upstream bridge. The reason is that 6 -> 06 points to itself so
> find_upstream_bridge ends up looping 255 times around and returns -1.

... I agree that removal of bridges could likely do with some
improvements: In particular, all devices behind a bridge should
be removed at the same time.

> I am not entirely sure I undertand why we do that. In 'free_pdev' we do
> this:
>       for ( ; sec_bus <= sub_bus; sec_bus++ )
>               pseg->bus2bridge[sec_bus] = pseg->bus2bridge[pdev->bus];
> and then:
>  list_del(&pdev->alldevs_list);
>  xfree(pdev->msix);
>  xfree(pdev);
> so if the device that is being deleted is the bridge - we point the secondary
> and subordinate to the deleted device.

No - we point it to the upstream bridge of the deleted one. Which
ought to properly reflect reality (in that this is what now becomes
responsible for all the buses previously covered by the bridge being

> But if the deleted device is the
> upstream bridge we end up leaving a stale bus2bridge context.

I don't think we do, but I also don't have too much faith in the
correctness of that code, so by way of an example I may be
convinced that there is a problem here.

> That is OK normally as 'alloc_pdev' would over-write it (if the secondary
> and subordinate did not change). But in 'assign-busses' case they change so
> we are left with an 'stale' one.
> This means when the same device is added (but with a new bus value) we
> end fixing up the secondary to subordinate busses to point to us (06).
> But '06' which used to be a secondary bus, still retains the old value.

Once again - we shouldn't talk about overwriting of previously
wrong values. The values should be kept correct, by means of
the Dom0 kernel propagating all updates it does.

> 3). Trap on PCI_SECONDARY_BUS and PCI_SUBORDINATE_BUS writes and
>     fixup the structures.
>     I hadn't attempted that but that could also be done. That way Xen
>     is aware of those changes and can update its PCI structures.

That would be horrible - for the MMCONFIG case we'd need to
mark all respective bridges' config spaces read-only (and emulate
writes). We avoided that previously, and we should avoid that
here. It still all boils down to Dom0 needing to propagate correct/
complete information.

And please be clear about one point: The bus scan the hypervisor
does is unavoidable in order to set up the IOMMU for Dom0 such
that it can access certain devices (namely ones needed for console
access) _before_ it actually does its own bus scan (including the
reporting of the devices to the hypervisor). Hence Dom0 has to
take into account that the hypervisor already may have knowledge
about device -> bus assignments.


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.