Re: [PATCH v10 13/17] vpci: add initial support for virtual PCI bus topology


  • To: Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • From: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Date: Tue, 21 Nov 2023 00:42:51 +0000
  • Accept-language: en-US
  • Cc: Julien Grall <julien@xxxxxxx>, Stewart Hildebrand <stewart.hildebrand@xxxxxxx>, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 21 Nov 2023 00:43:13 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [PATCH v10 13/17] vpci: add initial support for virtual PCI bus topology

Hi Stefano,

Stefano Stabellini <sstabellini@xxxxxxxxxx> writes:

> On Fri, 17 Nov 2023, Volodymyr Babchuk wrote:
>> > On Fri, 17 Nov 2023, Volodymyr Babchuk wrote:
>> >> Hi Julien,
>> >> 
>> >> Julien Grall <julien@xxxxxxx> writes:
>> >> 
>> >> > Hi Volodymyr,
>> >> >
>> >> > On 17/11/2023 14:09, Volodymyr Babchuk wrote:
>> >> >> Hi Stefano,
>> >> >> Stefano Stabellini <sstabellini@xxxxxxxxxx> writes:
>> >> >> 
>> >> >>> On Fri, 17 Nov 2023, Volodymyr Babchuk wrote:
>> >> >>>>> I still think, no matter the BDF allocation scheme, that we should
>> >> >>>>> try to avoid as much as possible having two different PCI Root
>> >> >>>>> Complex emulators. Ideally we would have only one PCI Root Complex
>> >> >>>>> emulated by Xen. Having 2 PCI Root Complexes, both of them emulated
>> >> >>>>> by Xen, would be tolerable but not ideal.
>> >> >>>>
>> >> >>>> But what exactly is wrong with this setup?
>> >> >>>
>> >> >>> [...]
>> >> >>>
>> >> >>>>> The worst case I would like to avoid is to have
>> >> >>>>> two PCI Root Complexes, one emulated by Xen and one emulated by QEMU.
>> >> >>>>
>> >> >>>> This is how our setup works right now.
>> >> >>>
>> >> >>> If we have:
>> >> >>> - a single PCI Root Complex emulated in Xen
>> >> >>> - Xen is safety certified
>> >> >>> - individual Virtio devices emulated by QEMU with grants for memory
>> >> >>>
>> >> >>> We can go very far in terms of being able to use Virtio in safety
>> >> >>> use-cases. We might even be able to use Virtio (frontends) in a SafeOS.
>> >> >>>
>> >> >>> On the other hand if we put an additional Root Complex in QEMU:
>> >> >>> - we pay a price in terms of complexity of the codebase
>> >> >>> - we pay a price in terms of resource utilization
>> >> >>> - we have one additional problem in terms of using this setup with a
>> >> >>>    SafeOS (one more device emulated by a non-safe component)
>> >> >>>
>> >> >>> Having 2 PCI Root Complexes both emulated in Xen is a middle ground
>> >> >>> solution because:
>> >> >>> - we still pay a price in terms of resource utilization
>> >> >>> - the code complexity goes up a bit but hopefully not by much
>> >> >>> - there is no impact on safety compared to the ideal scenario
>> >> >>>
>> >> >>> This is why I wrote that it is tolerable.
>> >> >> Ah, I see now. Yes, I agree with this. Also I want to add some more
>> >> >> points:
>> >> >> - There is ongoing work on implementing virtio backends as separate
>> >> >>    applications, written in Rust. Linaro are doing this part. Right now
>> >> >>    they are implementing only virtio-mmio, but if they want to provide
>> >> >>    virtio-pci as well, they will need a mechanism to plug in only
>> >> >>    virtio-pci, without a Root Complex. This is an argument for using a
>> >> >>    single Root Complex emulated in Xen.
>> >> >> - As far as I know (actually, Oleksandr told this to me), QEMU has no
>> >> >>    mechanism for exposing virtio-pci backends without exposing a PCI
>> >> >>    root complex as well. Architecturally, there should be a PCI bus to
>> >> >>    which virtio-pci devices are connected, or we need to make some
>> >> >>    changes to QEMU internals to be able to create virtio-pci backends
>> >> >>    that are not connected to any bus. An added benefit is that the PCI
>> >> >>    Root Complex emulator in QEMU handles legacy PCI interrupts for us.
>> >> >>    This is an argument for a separate Root Complex for QEMU.
>> >> >> As right now the only virtio-pci backends we have are provided by QEMU
>> >> >> and this setup is already working, I propose to stick to this solution,
>> >> >> especially taking into account that it does not require any changes to
>> >> >> hypervisor code.
>> >> >
>> >> > I am not against two hostbridges as a temporary solution as long as
>> >> > this is not a one-way-door decision. I am not concerned about the
>> >> > hypervisor itself; I am more concerned about the interface exposed by
>> >> > the toolstack and QEMU.
>> >
>> > I agree with this...
>> >
>> >
>> >> > To clarify, I don't particularly want to have to maintain the two
>> >> > hostbridges solution once we can use a single hostbridge. So we need
>> >> > to be able to get rid of it without impacting the interface too much.
>> >
>> > ...and this
>> >
>> >
>> >> This depends on virtio-pci backend availability. AFAIK, right now the
>> >> only option is to use QEMU, and QEMU provides its own host bridge. So
>> >> if we want to get rid of the second host bridge, we need either another
>> >> virtio-pci backend, or we need to alter the QEMU code so it can live
>> >> without a host bridge.
>> >> 
>> >> As for interfaces, it appears that the QEMU case does not require any
>> >> changes to the hypervisor itself; it just boils down to writing a couple
>> >> of xenstore entries and spawning QEMU with the correct command line
>> >> arguments.
>> >
>> > One thing that Stewart wrote in his reply that is important: it doesn't
>> > matter if QEMU thinks it is emulating a PCI Root Complex because that's
>> > required from QEMU's point of view to emulate an individual PCI device.
>> >
>> > If we can arrange it so the QEMU PCI Root Complex is not registered
>> > against Xen as part of the ioreq interface, then QEMU's emulated PCI
>> > Root Complex is going to be left unused. I think that would be great
>> > because we still have a clean QEMU-Xen-tools interface and the only
>> > downside is some extra unused emulation in QEMU. It would be a
>> > fantastic starting point.
>> 
>> I believe that in this case we need to set up manual ioreq handlers,
>> like what was done in the patch "xen/arm: Intercept vPCI config accesses
>> and forward them to emulator", because we need to route ECAM accesses
>> either to a virtio-pci backend or to a real PCI device. Also, we need to
>> tell QEMU not to install its own ioreq handlers for the ECAM space.
>
> I was imagining that the interface would look like this: QEMU registers
> a PCI BDF and Xen automatically starts forwarding to QEMU the ECAM
> read/write requests for the PCI config space of that BDF only. It would
> not be the entire ECAM space but only individual PCI config reads/writes
> for that BDF.
>

Okay, I see that there is the xendevicemodel_map_pcidev_to_ioreq_server()
function and the corresponding IOREQ_TYPE_PCI_CONFIG call. Is this what
you propose to use to register a PCI BDF?
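
If so, then just for the sake of discussion, here is a minimal sketch of
what the registration could look like from the device model side. The
libxendevicemodel calls are real, but the overall flow and the helper
name are my assumptions, and error handling is trimmed:

/*
 * Minimal sketch: create an ioreq server for the domain, map one
 * virtual PCI device (SBDF) to it and enable it.  A real device model
 * would keep the handle and the server alive for its whole lifetime.
 */
#include <xendevicemodel.h>
#include <xen/hvm/dm_op.h>

static int register_vpci_bdf(domid_t domid, uint16_t seg,
                             uint8_t bus, uint8_t dev, uint8_t fn)
{
    xendevicemodel_handle *dmod = xendevicemodel_open(NULL, 0);
    ioservid_t id;
    int rc;

    if ( !dmod )
        return -1;

    /* No buffered ioreq page: config space accesses are synchronous. */
    rc = xendevicemodel_create_ioreq_server(dmod, domid,
                                            HVM_IOREQSRV_BUFIOREQ_OFF, &id);
    if ( rc )
        goto out;

    /*
     * Ask Xen to forward config space accesses for exactly this SBDF to
     * this server; they arrive as IOREQ_TYPE_PCI_CONFIG requests once
     * the server is enabled below.
     */
    rc = xendevicemodel_map_pcidev_to_ioreq_server(dmod, domid, id,
                                                   seg, bus, dev, fn);
    if ( rc )
        goto out;

    rc = xendevicemodel_set_ioreq_server_state(dmod, domid, id, 1);
out:
    if ( rc )
        xendevicemodel_close(dmod);
    return rc;
}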

I see that xen-hvm-common.c in QEMU is able to handle only the standard
256-byte configuration space, but I hope that will be an easy fix.
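
And for reference, a rough sketch of the decoding I would expect on the
receiving side, assuming Xen keeps packing the SBDF into the upper 32
bits of the ioreq address and the config space offset into the lower 32
bits; the struct and helper names are made up for illustration:

#include <stdint.h>

struct pci_cfg_access {
    uint16_t segment;
    uint8_t  bus;
    uint8_t  devfn;
    uint16_t offset;        /* 0x000..0xfff for extended config space */
};

static struct pci_cfg_access decode_pci_config_addr(uint64_t addr)
{
    struct pci_cfg_access a;
    uint32_t sbdf = addr >> 32;   /* segment:bus:dev.fn packed by Xen */

    a.segment = sbdf >> 16;
    a.bus     = (sbdf >> 8) & 0xff;
    a.devfn   = sbdf & 0xff;
    /*
     * Mask with 0xfff, not 0xff: an ECAM access can target the extended
     * config space, and a 0xff mask would silently truncate it.
     */
    a.offset  = addr & 0xfff;

    return a;
}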

-- 
WBR, Volodymyr