Xen project Mailing List

Re: RFC: PCI devices passthrough on Arm design proposal

To: Oleksandr Andrushchenko <andr2000@xxxxxxxxx>

From: Rahul Singh <Rahul.Singh@xxxxxxx>

Date: Fri, 17 Jul 2020 13:28:12 +0000

Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien.grall.oss@xxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, nd <nd@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>

Delivery-date: Fri, 17 Jul 2020 13:28:32 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

> On 17 Jul 2020, at 9:47 am, Oleksandr Andrushchenko <andr2000@xxxxxxxxx> wrote:
>
> On 7/17/20 11:10 AM, Jan Beulich wrote:
>> On 16.07.2020 19:10, Rahul Singh wrote:
>>> # Discovering PCI devices:
>>>
>>> PCI-PCIe enumeration is a process of detecting devices connected to its
>>> host. It is the responsibility of the hardware domain or boot firmware to
>>> do the PCI enumeration and configure the BAR, PCI capabilities, and
>>> MSI/MSI-X configuration.
>>>
>>> PCI-PCIe enumeration in XEN is not feasible for the configuration part as
>>> it would require a lot of code inside Xen which would require a lot of
>>> maintenance. Added to this many platforms require some quirks in that part
>>> of the PCI code which would greatly improve Xen complexity. Once hardware
>>> domain enumerates the device then it will communicate to XEN via the below
>>> hypercall.
>>>
>>> #define PHYSDEVOP_pci_device_add 25
>>> struct physdev_pci_device_add {
>>>     uint16_t seg;
>>>     uint8_t bus;
>>>     uint8_t devfn;
>>>     uint32_t flags;
>>>     struct {
>>>         uint8_t bus;
>>>         uint8_t devfn;
>>>     } physfn;
>>>     /*
>>>      * Optional parameters array.
>>>      * First element ([0]) is PXM domain associated with the device (if
>>>      * XEN_PCI_DEV_PXM is set)
>>>      */
>>>     uint32_t optarr[XEN_FLEX_ARRAY_DIM];
>>> };
>>>
>>> As the hypercall argument has the PCI segment number, XEN will access the
>>> PCI config space based on this segment number and find the host-bridge
>>> corresponding to this segment number. At this stage host bridge is fully
>>> initialized so there will be no issue to access the config space.
>>>
>>> XEN will add the PCI devices in the linked list maintain in XEN using the
>>> function pci_add_device(). XEN will be aware of all the PCI devices on the
>>> system and all the device will be added to the hardware domain.
>> Have you had any thoughts about Dom0 re-arranging the bus numbering?
>> This is, afaict, a still open issue on x86 as well.
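
For readers following the thread, here is a minimal sketch of how a hardware domain might fill the hypercall argument quoted above for an ordinary (non-virtual-function) device. The struct is a local stand-in that copies the quoted layout but leaves out the trailing optarr[] array, and both pci_devfn() and make_pci_device_add() are hypothetical helper names, not Xen or Linux functions. The devfn packing (device number in bits 7:3, function number in bits 2:0) is the standard PCI encoding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Local stand-in for the structure quoted in the proposal. The trailing
 * optarr[] flexible array is omitted because this sketch does not use it.
 */
struct physdev_pci_device_add_sketch {
    uint16_t seg;
    uint8_t  bus;
    uint8_t  devfn;
    uint32_t flags;
    struct {
        uint8_t bus;
        uint8_t devfn;
    } physfn;
};

/* Standard PCI encoding: device in bits 7:3, function in bits 2:0. */
static uint8_t pci_devfn(uint8_t dev, uint8_t fn)
{
    assert(dev < 32 && fn < 8);
    return (uint8_t)((dev << 3) | fn);
}

/*
 * Hypothetical helper: build the argument for a plain device. flags and
 * physfn stay zero, meaning no optional features are requested here.
 */
static struct physdev_pci_device_add_sketch
make_pci_device_add(uint16_t seg, uint8_t bus, uint8_t dev, uint8_t fn)
{
    struct physdev_pci_device_add_sketch add;

    memset(&add, 0, sizeof(add));
    add.seg = seg;
    add.bus = bus;
    add.devfn = pci_devfn(dev, fn);
    return add;
}
```

Calling make_pci_device_add(0, 1, 2, 3) describes device 0000:01:02.3. Because the bus number is part of the argument, renumbering by Dom0 after the call would leave Xen with a stale entry, which is the concern raised above.
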
>
> This can get even trickier as we may have PCI enumerated at boot time
> by the firmware and then Dom0 may perform the enumeration differently.
> So, Xen needs to be aware of what is going to be used as the source of the
> enumeration data and be ready to re-build its internal structures in order
> to be aligned with that entity: e.g. compare Dom0 and Dom0less use-cases

The idea is that as soon as Xen has done its enumeration (whether at boot or
after a signal from Dom0), no domain will be able to modify the physical PCI
bus anymore.

- Rahul

>>
>>> Limitations:
>>> * When PCI devices are added to XEN, MSI capability is not initialized
>>>   inside XEN and not supported as of now.
>> I think this is a pretty severe limitation, as modern devices tend to
>> not support pin based interrupts anymore.
>>
>>> # Emulated PCI device tree node in libxl:
>>>
>>> Libxl is creating a virtual PCI device tree node in the device tree to
>>> enable the guest OS to discover the virtual PCI during guest boot. We
>>> introduced the new config option [vpci="pci_ecam"] for guests. When this
>>> config option is enabled in a guest configuration, a PCI device tree node
>>> will be created in the guest device tree.
>> I support Stefano's suggestion for this to be an optional thing, i.e.
>> there to be no need for it when there are PCI devices assigned to the
>> guest anyway. I also wonder about the pci_ prefix here - isn't
>> vpci="ecam" as unambiguous?
>>
>>> A new area has been reserved in the arm guest physical map at which the
>>> VPCI bus is declared in the device tree (reg and ranges parameters of the
>>> node). A trap handler for the PCI ECAM access from guest has been
>>> registered at the defined address and redirects requests to the VPCI driver
>>> in Xen.
>>>
>>> Limitation:
>>> * Only one PCI device tree node is supported as of now.
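
As background for the ECAM trap handler mentioned above, here is a minimal sketch of how a trapped guest address can be decoded into bus, device, function, and register. It assumes the standard PCIe ECAM layout (bus in offset bits 27:20, device in 19:15, function in 14:12, register in 11:0). The struct and function names are hypothetical and do not come from the Xen VPCI code, and the base address is whatever the guest device tree node declares.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for a decoded config-space access. */
struct ecam_addr {
    uint8_t  bus;
    uint8_t  dev;
    uint8_t  fn;
    uint16_t reg;
};

/*
 * Decode a guest physical address inside the ECAM window.
 * ecam_base is the start of the window (an assumption of this sketch:
 * the caller has already checked that gpa lies inside the window).
 */
static struct ecam_addr ecam_decode(uint64_t gpa, uint64_t ecam_base)
{
    struct ecam_addr a;
    uint64_t off;

    assert(gpa >= ecam_base);
    off = gpa - ecam_base;

    a.bus = (uint8_t)((off >> 20) & 0xff);  /* bits 27:20 */
    a.dev = (uint8_t)((off >> 15) & 0x1f);  /* bits 19:15 */
    a.fn  = (uint8_t)((off >> 12) & 0x7);   /* bits 14:12 */
    a.reg = (uint16_t)(off & 0xfff);        /* bits 11:0  */
    return a;
}
```

With one 4 KiB page per function, a handler built this way can forward the decoded bus/device/function and register offset to whatever emulates the config space.
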
>>>
>>> BAR value and IOMEM mapping:
>>>
>>> Linux guest will do the PCI enumeration based on the area reserved for ECAM
>>> and IOMEM ranges in the VPCI device tree node. Once PCI device is assigned
>>> to the guest, XEN will map the guest PCI IOMEM region to the real physical
>>> IOMEM region only for the assigned devices.
>>>
>>> As of now we have not modified the existing VPCI code to map the guest PCI
>>> IOMEM region to the real physical IOMEM region. We used the existing guest
>>> “iomem” config option to map the region.
>>> For example:
>>> Guest reserved IOMEM region: 0x04020000
>>> Real physical IOMEM region:0x50000000
>>> IOMEM size:128MB
>>> iomem config will be: iomem = ["0x50000,0x8000@0x4020"]
>> This surely is planned to go away before the code hits upstream? The
>> ranges really should be read out of the BARs, as I see the
>> "limitations" section further down suggests, but it's not clear
>> whether "limitations" are items that you plan to take care of before
>> submitting your code for review.
>>
>> Jan
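
The iomem value quoted above follows from converting each address to a page frame number. Here is a minimal sketch of that arithmetic, assuming 4 KiB pages and the "start,count@gfn" form used in the example; addr_to_pfn() and format_iomem() are hypothetical helper names, not part of libxl.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Convert a page-aligned address or size to a page frame number / count. */
static uint64_t addr_to_pfn(uint64_t addr)
{
    assert((addr & ((UINT64_C(1) << PAGE_SHIFT) - 1)) == 0);
    return addr >> PAGE_SHIFT;
}

/*
 * Write "host_pfn,num_pages@guest_pfn" in hex into buf.
 * Returns the snprintf() result (characters needed, excluding the NUL).
 */
static int format_iomem(char *buf, size_t len, uint64_t host_addr,
                        uint64_t size, uint64_t guest_addr)
{
    return snprintf(buf, len, "0x%" PRIx64 ",0x%" PRIx64 "@0x%" PRIx64,
                    addr_to_pfn(host_addr), addr_to_pfn(size),
                    addr_to_pfn(guest_addr));
}
```

For the values in the example, 0x50000000 >> 12 is 0x50000, 128 MB is 0x8000 pages, and 0x04020000 >> 12 is 0x4020, so format_iomem(buf, sizeof(buf), 0x50000000, 128 << 20, 0x04020000) produces the string 0x50000,0x8000@0x4020.
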
