 
	
Re: [PATCH v10 13/17] vpci: add initial support for virtual PCI bus topology
 On Wed, 29 Nov 2023, Roger Pau Monné wrote:
> On Tue, Nov 28, 2023 at 11:45:34PM +0000, Volodymyr Babchuk wrote:
> > Hi Roger,
> > 
> > Roger Pau Monné <roger.pau@xxxxxxxxxx> writes:
> > 
> > > On Wed, Nov 22, 2023 at 01:18:32PM -0800, Stefano Stabellini wrote:
> > >> On Wed, 22 Nov 2023, Roger Pau Monné wrote:
> > >> > On Tue, Nov 21, 2023 at 05:12:15PM -0800, Stefano Stabellini wrote:
> > >> > > Let me expand on this. Like I wrote above, I think it is
> > >> > > important that Xen vPCI is the only in-use PCI Root Complex
> > >> > > emulator. If it makes the QEMU implementation easier, it is OK
> > >> > > if QEMU emulates an unneeded and unused PCI Root Complex. From
> > >> > > Xen's point of view, it doesn't exist.
> > >> > > 
> > >> > > In terms of ioreq registration, QEMU calls
> > >> > > xendevicemodel_map_pcidev_to_ioreq_server for each PCI BDF it
> > >> > > wants to emulate. That way, Xen vPCI knows exactly which PCI
> > >> > > config space reads/writes to forward to QEMU.
> > >> > > 
> > >> > > Let's say that:
> > >> > > - 00:02.0 is a PCI passthrough device
> > >> > > - 00:03.0 is an emulated PCI device
> > >> > > 
> > >> > > QEMU would register 00:03.0 and vPCI would know to forward anything
> > >> > > related to 00:03.0 to QEMU, but not 00:02.0.
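To make the example above concrete, here is a sketch (hypothetical names, not actual Xen code) of the per-BDF dispatch such a setup implies: vPCI keeps a table of which BDFs it handles itself and which were registered by an external ioreq server, and routes each config-space access accordingly.

```c
#include <stdint.h>

/* Hypothetical dispatch table, illustrating the routing described above:
 * a config access for 00:02.0 stays inside Xen vPCI, one for 00:03.0 is
 * forwarded to the ioreq server (QEMU) that registered that BDF. */
#define BDF(b, d, f) (((b) << 8) | ((d) << 3) | (f))

enum handler { HANDLER_NONE, HANDLER_VPCI, HANDLER_IOREQ };

struct pci_route { uint16_t bdf; enum handler h; };

static const struct pci_route routes[] = {
    { BDF(0, 2, 0), HANDLER_VPCI },   /* passthrough: handled by Xen vPCI */
    { BDF(0, 3, 0), HANDLER_IOREQ },  /* emulated: forwarded to QEMU */
};

static enum handler route_config_access(uint16_t bdf)
{
    for (unsigned int i = 0; i < sizeof(routes) / sizeof(routes[0]); i++)
        if (routes[i].bdf == bdf)
            return routes[i].h;
    return HANDLER_NONE; /* unimplemented: reads return all ones */
}
```

In a real implementation the table would be populated by the xendevicemodel_map_pcidev_to_ioreq_server calls mentioned above rather than being static.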
> > >> > 
> > >> > I think there's some work here so that we have a proper hierarchy
> > >> > inside of Xen.  Right now both ioreq and vpci expect to decode the
> > >> > accesses to the PCI config space, and setup (MM)IO handlers to trap
> > >> > ECAM, see vpci_ecam_{read,write}().
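For reference, the decoding an ECAM trap handler has to do is fixed by the PCIe spec: each function owns a 4 KiB window, so the MMIO offset into the ECAM region encodes bus/device/function/register. A minimal illustrative decoder (not the actual vpci_ecam_{read,write} code):

```c
#include <stdint.h>

/* PCIe ECAM layout: offset = bus << 20 | device << 15 | function << 12 | reg.
 * Any ECAM handler must perform this decode before dispatching the access. */
struct bdf {
    uint8_t bus;
    uint8_t dev;   /* 0-31 */
    uint8_t fn;    /* 0-7 */
    uint16_t reg;  /* 0-4095, byte offset into config space */
};

static struct bdf ecam_decode(uint64_t offset)
{
    struct bdf b = {
        .bus = (offset >> 20) & 0xff,
        .dev = (offset >> 15) & 0x1f,
        .fn  = (offset >> 12) & 0x7,
        .reg = offset & 0xfff,
    };
    return b;
}
```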
> > >> > 
> > >> > I think we want to move to a model where vPCI doesn't setup MMIO traps
> > >> > itself, and instead relies on ioreq to do the decoding and forwarding
> > >> > of accesses.  We need some work in order to represent an internal
> > >> > ioreq handler, but that shouldn't be too complicated.  IOW: vpci
> > >> > should register devices it's handling with ioreq, much like QEMU does.
> > >> 
> > >> I think this could be a good idea.
> > >> 
> > >> This would be the very first IOREQ handler implemented in Xen itself,
> > >> rather than outside of Xen. Some code refactoring might be required,
> > >> which worries me given that vPCI is at v10 and has been pending for
> > >> years. I think it could make sense as a follow-up series, not v11.
> > >
> > > That's perfectly fine for me, most of the series here just deal with
> > > the logic to intercept guest access to the config space and is
> > > completely agnostic as to how the accesses are intercepted.
> > >
> > >> I think this idea would be beneficial if, in the example above, vPCI
> > >> doesn't really need to know about device 00:03.0. vPCI registers via
> > >> IOREQ the PCI Root Complex and device 00:02.0 only, QEMU registers
> > >> 00:03.0, and everything works. vPCI is not involved at all in PCI config
> > >> space reads and writes for 00:03.0. If this is the case, then moving
> > >> vPCI to IOREQ could be good.
> > >
> > > Given your description above, with the root complex implemented in
> > > vPCI, we would need to mandate vPCI together with ioreqs even if no
> > > passthrough devices are using vPCI itself (just for the emulation of
> > > the root complex).  Which is fine, just wanted to mention the
> > > dependency.
> > >
> > >> On the other hand if vPCI actually needs to know that 00:03.0 exists,
> > >> perhaps because it changes something in the PCI Root Complex emulation
> > >> or vPCI needs to take some action when PCI config space registers of
> > >> 00:03.0 are written to, then I think this model doesn't work well. If
> > >> this is the case, then I think it would be best to keep vPCI as MMIO
> > >> handler and let vPCI forward to IOREQ when appropriate.
> > >
> > > At first approximation I don't think we would have such interactions,
> > > otherwise the whole premise of ioreq being able to register individual
> > > PCI devices would be broken.
> > >
> > > XenServer already has scenarios with two different user-space emulators
> > > (ie: two different ioreq servers) handling accesses to different
> > > devices in the same PCI bus, and there's no interaction with the root
> > > complex required.
Good to hear.
 
> > Out of curiosity: how are legacy PCI interrupts handled in this case? In
> > my understanding, it is the Root Complex's responsibility to propagate
> > correct IRQ levels to the interrupt controller?
> 
> I'm unsure whether my understanding of the question is correct, so my
> reply might not be what you are asking for, sorry.
> 
> Legacy IRQs (GSI on x86) are setup directly by the toolstack when the
> device is assigned to the guest, using PHYSDEVOP_map_pirq +
> XEN_DOMCTL_bind_pt_irq.  Those hypercalls bind together a host IO-APIC
> pin to a guest IO-APIC pin, so that interrupts originating from that
> host IO-APIC pin are always forwarded to the guest an injected as
> originating from the guest IO-APIC pin.
> 
> Note that the device will always use the same IO-APIC pin, this is not
> configured by the OS.
QEMU calls xen_set_pci_intx_level which is implemented by
xendevicemodel_set_pci_intx_level, which is XEN_DMOP_set_pci_intx_level,
which does set_pci_intx_level. Eventually it calls hvm_pci_intx_assert
and hvm_pci_intx_deassert.
I don't think any of this goes via the Root Complex; otherwise, as
Roger pointed out, it wouldn't be possible to emulate individual PCI
devices in separate IOREQ servers.
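For the x86 HVM case, the hvm_pci_intx_assert path ends up raising a guest GSI derived from the (device, INTx) pair; if I read the code right, the mapping is the "barber pole" below, sketched here as a standalone helper simplified from Xen's hvm_pci_intx_gsi macro:

```c
/* Simplified restatement of Xen's hvm_pci_intx_gsi() mapping: INTx lines
 * of HVM guest PCI devices are spread over GSIs 16-47, staggered so that
 * neighbouring slots don't all share the same interrupt line. */
static unsigned int pci_intx_to_gsi(unsigned int device, unsigned int intx)
{
    return (((device << 2) + (device >> 3) + intx) & 31) + 16;
}
```

Note there is no per-device Root Complex state involved: the GSI is a pure function of device and INTx pin, which is consistent with independent ioreq servers asserting INTx for their own devices.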
 