
Re: [PATCH v10 13/17] vpci: add initial support for virtual PCI bus topology


  • To: Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • From: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Date: Fri, 17 Nov 2023 22:22:29 +0000
  • Accept-language: en-US
  • Cc: Julien Grall <julien@xxxxxxx>, Stewart Hildebrand <stewart.hildebrand@xxxxxxx>, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 17 Nov 2023 22:22:52 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [PATCH v10 13/17] vpci: add initial support for virtual PCI bus topology

Hi Stefano,

Stefano Stabellini <sstabellini@xxxxxxxxxx> writes:

> On Fri, 17 Nov 2023, Volodymyr Babchuk wrote:
>> Hi Julien,
>> 
>> Julien Grall <julien@xxxxxxx> writes:
>> 
>> > Hi Volodymyr,
>> >
>> > On 17/11/2023 14:09, Volodymyr Babchuk wrote:
>> >> Hi Stefano,
>> >> Stefano Stabellini <sstabellini@xxxxxxxxxx> writes:
>> >> 
>> >>> On Fri, 17 Nov 2023, Volodymyr Babchuk wrote:
>> >>>>> I still think, no matter the BDF allocation scheme, that we should try
>> >>>>> to avoid as much as possible having two different PCI Root Complex
>> >>>>> emulators. Ideally we would have only one PCI Root Complex emulated by
>> >>>>> Xen. Having 2 PCI Root Complexes, both of them emulated by Xen, would
>> >>>>> be tolerable but not ideal.
>> >>>>
>> >>>> But what exactly is wrong with this setup?
>> >>>
>> >>> [...]
>> >>>
>> >>>>> The worst case I would like to avoid is to have
>> >>>>> two PCI Root Complexes, one emulated by Xen and one emulated by QEMU.
>> >>>>
>> >>>> This is how our setup works right now.
>> >>>
>> >>> If we have:
>> >>> - a single PCI Root Complex emulated in Xen
>> >>> - Xen is safety certified
>> >>> - individual Virtio devices emulated by QEMU with grants for memory
>> >>>
>> >>> We can go very far in terms of being able to use Virtio in safety
>> >>> use-cases. We might even be able to use Virtio (frontends) in a SafeOS.
>> >>>
>> >>> On the other hand if we put an additional Root Complex in QEMU:
>> >>> - we pay a price in terms of complexity of the codebase
>> >>> - we pay a price in terms of resource utilization
>> >>> - we have one additional problem in terms of using this setup with a
>> >>>    SafeOS (one more device emulated by a non-safe component)
>> >>>
>> >>> Having 2 PCI Root Complexes both emulated in Xen is a middle ground
>> >>> solution because:
>> >>> - we still pay a price in terms of resource utilization
>> >>> - the code complexity goes up a bit but hopefully not by much
>> >>> - there is no impact on safety compared to the ideal scenario
>> >>>
>> >>> This is why I wrote that it is tolerable.
>> >> Ah, I see now. Yes, I agree with this. I also want to add some more
>> >> points:
>> >> - There is ongoing work on implementing virtio backends as separate
>> >>    applications, written in Rust; Linaro is doing this part. Right now
>> >>    they are implementing only virtio-mmio, but if they want to provide
>> >>    virtio-pci as well, they will need a mechanism to plug in only
>> >>    virtio-pci, without a Root Complex. This is an argument for a single
>> >>    Root Complex emulated in Xen.
>> >> - As far as I know (actually, Oleksandr told me this), QEMU has no
>> >>    mechanism for exposing virtio-pci backends without exposing a PCI root
>> >>    complex as well. Architecturally, there should be a PCI bus to which
>> >>    virtio-pci devices are connected, or we would need to change QEMU
>> >>    internals to be able to create virtio-pci backends that are not
>> >>    connected to any bus. An added benefit is that the PCI Root Complex
>> >>    emulator in QEMU handles legacy PCI interrupts for us. This is an
>> >>    argument for a separate Root Complex for QEMU.
>> >> As right now the only virtio-pci backends available are provided by QEMU
>> >> and this setup is already working, I propose to stick with this
>> >> solution, especially since it does not require any changes to the
>> >> hypervisor code.
>> >
>> > I am not against two hostbridges as a temporary solution as long as
>> > this is not a one-way-door decision. I am not concerned about the
>> > hypervisor itself; I am more concerned about the interface exposed by
>> > the toolstack and QEMU.
>
> I agree with this...
>
>
>> > To clarify, I don't particularly want to have to maintain the two
>> > hostbridges solution once we can use a single hostbridge. So we need
>> > to be able to get rid of it without impacting the interface too much.
>
> ...and this
>
>
>> This depends on the availability of virtio-pci backends. AFAIK, right now
>> the only option is to use QEMU, and QEMU provides its own host bridge. So
>> if we want to get rid of the second host bridge we need either another
>> virtio-pci backend, or we need to alter the QEMU code so it can live
>> without a host bridge.
>> 
>> As for interfaces, it appears that the QEMU case does not require any
>> changes to the hypervisor itself; it just boils down to writing a couple of
>> xenstore entries and spawning QEMU with the correct command-line arguments.
>
> One thing that Stewart wrote in his reply is important: it doesn't
> matter if QEMU thinks it is emulating a PCI Root Complex, because that's
> required from QEMU's point of view to emulate an individual PCI device.
>
> If we can arrange it so the QEMU PCI Root Complex is not registered
> against Xen as part of the ioreq interface, then QEMU's emulated PCI
> Root Complex is going to be left unused. I think that would be great
> because we still have a clean QEMU-Xen-tools interface and the only
> downside is some extra unused emulation in QEMU. It would be a
> fantastic starting point.

I believe that in this case we need to set up manual ioreq handlers, as
was done in the patch "xen/arm: Intercept vPCI config accesses and
forward them to emulator", because we need to route ECAM accesses either
to a virtio-pci backend or to a real PCI device. We also need to tell
QEMU not to install its own ioreq handlers for the ECAM space.
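
To illustrate what I have in mind (purely a sketch, not the actual
patch): on Arm this could be built around the existing
register_mmio_handler() interface, with every other name below --
ecam_to_sbdf(), bdf_is_backend_owned(), forward_to_ioreq_server(),
vpci_handle_read(), vecam_write() -- being hypothetical, for
illustration only:

/*
 * Sketch: trap all guest ECAM accesses in Xen and decide, per vBDF,
 * whether to emulate the access locally (vPCI, for passed-through
 * devices) or to forward it to an external emulator such as QEMU or a
 * standalone virtio-pci backend. Only register_mmio_handler() exists
 * in Xen today; the helpers are made-up names.
 */
static int vecam_read(struct vcpu *v, mmio_info_t *info,
                      register_t *r, void *priv)
{
    /* Decode segment/bus/device/function from the ECAM offset. */
    pci_sbdf_t sbdf = ecam_to_sbdf(v->domain, info->gpa);

    if ( bdf_is_backend_owned(v->domain, sbdf) )
        /* Route to the ioreq server that registered this vBDF. */
        return forward_to_ioreq_server(v, info, r);

    /* Otherwise let vPCI emulate the config access in Xen. */
    return vpci_handle_read(v->domain, sbdf, info, r);
}

static const struct mmio_handler_ops vecam_ops = {
    .read  = vecam_read,
    .write = vecam_write,   /* symmetric to vecam_read() */
};

void vecam_register(struct domain *d, paddr_t ecam_base, paddr_t ecam_size)
{
    register_mmio_handler(d, &vecam_ops, ecam_base, ecam_size, NULL);
}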

Another point is PCI legacy interrupts, which should be emulated on the
Xen side. Unless I am missing something, we will need a new mechanism to
signal those interrupts from QEMU or another backend. I am not sure we
can use the existing IRQ signaling mechanism, because PCI legacy
interrupts are level-sensitive and the lines of all devices behind a
bridge are wired-ORed together.
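
Roughly what I mean (again only a sketch: vgic_inject_irq() is the
existing Arm vGIC entry point, everything else is made up for this
example):

/*
 * Sketch of the wired-OR semantics: the guest-visible INTx line is the
 * logical OR of the assertion state of every device sharing it, and
 * must stay asserted while any of them holds it.
 */
struct vintx_line {
    uint32_t sources;    /* one bit per emulated device on this line */
    unsigned int virq;   /* guest SPI this INTx line is mapped to */
};

static void vintx_set_level(struct domain *d, struct vintx_line *line,
                            unsigned int src_bit, bool asserted)
{
    bool old_level = line->sources != 0;
    bool new_level;

    if ( asserted )
        line->sources |= 1u << src_bit;
    else
        line->sources &= ~(1u << src_bit);

    new_level = line->sources != 0;

    /* Only drive the vGIC when the ORed level actually changes. */
    if ( new_level != old_level )
        vgic_inject_irq(d, NULL, line->virq, new_level);
}

So a backend deasserting its own source must not lower the
guest-visible line while another device still drives it, which is why a
one-shot, edge-style notification does not look like a good fit.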

Of course, we will need all of this anyway if we want to support
standalone virtio-pci backends, but to me that sounds more like the
finish line :)

-- 
WBR, Volodymyr
