Xen project Mailing List

Re: [PATCH v5 06/14] vpci/header: implement guest BAR register handlers

To: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>

Date: Mon, 31 Jan 2022 14:36:12 +0100

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=none; dmarc=none; dkim=none; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=4G+f5m8hWIDUXAPivUIcHKl8jCbB4ejms75XP71uiJM=; b=GlDUSA95nt1FbLWuyRja9oyVlln/suU6S4AfWDWfIw6upRJSzRdalUhGVV2cUYN+sSdaxypzYhlzwCVmxMibRqQ/6eBh3UJj40KnwuQ+8dAbCURoXACCKockFkFDT/39I0jJfGXITt5zTEZGg9uyyIYKK1DT0KRdASRMyD4aL9EKhw8ujRj5XE5dVexeyL3hNw3auljlppjHF6+/xYlgobiwsbbi+niwEfYgltoCbEBR1ybb/CxEt+AZYQCqwCaIi3Y9avY2zNC+pk+RLNOMc66GeECg5aarm1uPRbKTahB5Vx/en5RIzj+MFNZi8ObACxxB1VfZC1p/AE/WM3N/rQ==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=aDl8Gonx8ITJix+TIFkvZVK1XZj7Ie3RaXoUXnhV+WWfWjqPZ6cbB7Shz3hwcqMUg63KBirz2/cLPEltDf3xzgiZxku91IxgOYh7JSr5EuKH8EiDbKWvUbYIVOOcdp5soQgAOnacpKhO+nBLGJuasGQPwhVIdE4JWSFT10yK6sw9RoyBKeiyOnc6pF/hi53mX7GXqjBXDdRn/UlPJ/bpIZg+2okgrlR/jkSi0xpLEWprL1/U4RzpaMMay/ypOVQYj/o26kl1iWz39Qk/9WuaFGH6tiXzYpS7mTJXPGjNv/2NHTZMqXR8TgXapTSOFwhE1kYTgM1YOfDCt6ObjbR/tQ==

Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "julien@xxxxxxx" <julien@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Artem Mygaiev <Artem_Mygaiev@xxxxxxxx>, "andrew.cooper3@xxxxxxxxxx" <andrew.cooper3@xxxxxxxxxx>, "george.dunlap@xxxxxxxxxx" <george.dunlap@xxxxxxxxxx>, "paul@xxxxxxx" <paul@xxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>

Delivery-date: Mon, 31 Jan 2022 13:36:23 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 31.01.2022 14:30, Oleksandr Andrushchenko wrote: > > > On 31.01.22 13:39, Jan Beulich wrote: >> On 31.01.2022 12:23, Oleksandr Andrushchenko wrote: >>> On 31.01.22 13:10, Roger Pau Monné wrote: >>>> Right (see my previous reply to this comment). I think it would be >>>> easier (and cleaner) if you switched the default behavior regarding >>>> unhandled register access for domUs at the start of the series (drop >>>> writes, reads returns ~0), and then you won't need to add all those >>>> dummy handler to drop writes and return ~0 for reads. >>>> >>>> It's going to be more work initially as you would need to support >>>> passthrough of more registers, but it's the right approach that we >>>> need implementation wise. >>> While I agree in general, this effectively means that I'll need to provide >>> handling for all PCIe registers and capabilities from the very start. >>> Otherwise no guest be able to properly initialize a PCI device without that. >>> Of course, we may want starting from stubs instead of proper emulation, >>> which will direct the access to real HW and later on we add proper >>> emulation. >>> But, again, this is going to be a rather big piece of code where we need >>> to explicitly handle every possible capability. >> Since the two sub-threads are now about exactly the same topic, I'm >> answering here instead of there. >> >> No, you are not going to need to emulate all possible capabilities. >> We (or really qemu) don't do this on x86 either. Certain capabilities >> may be a must, but not everything. There are also device specific >> registers not covered by any capability structures - what to do with >> those is even more of a question. >> >> Furthermore for some of the fields justification why access to the >> raw hardware value is fine is going to be easy: r/o fields like >> vendor and device ID, for example. But every bit you allow direct >> access to needs to come with justification. >> >>> At the moment we are not going to claim that vPCI provides all means to >>> pass through a PCI device safely with this respect and this is why the >>> feature >>> itself won't even be a tech preview yet. For that reason I think we can >>> still >>> have implemented only crucial set of handlers and still allow the rest to >>> be read/write directly without emulation. >> I think you need to separate what you need for development from what >> goes upstream: For dev purposes you can very well invert the policy >> from white- to black-listing. But if we accepted the latter into the >> main tree, the risk would be there that something gets missed at the >> time where the permission model gets changed around. >> >> You could even have a non-default mode operating the way you want it >> (along the lines of pciback's permissive mode), allowing you to get >> away without needing to carry private patches. Things may also >> initially only work in that mode. But the default should be a mode >> which is secure (and which perhaps initially offers only very limited >> functionality). > Ok, so to make it clear: > 1. We do not allow unhandled access for guests: for that I will create a > dedicated patch which will implement such restrictions. Something like > the below (for both vPCI read and write): > > diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c > index c5e67491c24f..9ef2a1b5af58 100644 > --- a/xen/drivers/vpci/vpci.c > +++ b/xen/drivers/vpci/vpci.c > @@ -347,6 +347,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, > unsigned int size) > const struct vpci_register *r; > unsigned int data_offset = 0; > uint32_t data = ~(uint32_t)0; > + bool handled = false; > > if ( !size ) > { > @@ -405,6 +406,8 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, > unsigned int size) > if ( cmp > 0 ) > continue; > > + handled = true; /* Found the handler for this access. */ > + > if ( emu.offset < r->offset ) > { > /* Heading gap, read partial content from hardware. */ > @@ -432,6 +435,10 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, > unsigned int size) > } > spin_unlock(&pdev->vpci_lock); > > + /* All unhandled guest requests return all 1's. */ > + if ( !is_hardware_domain(d) && !handled ) > + return ~(uint32_t)0; > + > if ( data_offset < size ) > { > /* Tailing gap, read the remaining. */ Except that like for the "tailing gap" you also need to avoid the "heading gap" ending up in a read of the underlying hardware register. Effectively you want to deal properly with all vpci_read_hw() invocations (including the one when no pdev was found, which for a DomU may simply mean domain_crash()). Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.