Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document
On Thu, Feb 02, 2017 at 03:12:52PM -0800, Stefano Stabellini wrote:
> On Thu, 2 Feb 2017, Edgar E. Iglesias wrote:
> > On Wed, Feb 01, 2017 at 07:04:43PM +0000, Julien Grall wrote:
> > > Hi Edgar,
> > >
> > > On 31/01/2017 19:06, Edgar E. Iglesias wrote:
> > > > On Tue, Jan 31, 2017 at 05:09:53PM +0000, Julien Grall wrote:
> > > > > On 31/01/17 16:53, Edgar E. Iglesias wrote:
> > > > > > On Wed, Jan 25, 2017 at 06:53:20PM +0000, Julien Grall wrote:
> > > > > > > On 24/01/17 20:07, Stefano Stabellini wrote:
> > > > > > > > On Tue, 24 Jan 2017, Julien Grall wrote:
> > > > > > > For a generic host bridge, there is no initialization to do.
> > > > > > > However, some host bridges (e.g. xgene, xilinx) may require
> > > > > > > specific setup, such as configuring clocks. Given that Xen only
> > > > > > > needs to access the configuration space, I was thinking of
> > > > > > > letting DOM0 initialize the host bridge. This would avoid
> > > > > > > importing a lot of code into Xen; however, it means we need to
> > > > > > > know when the host bridge has been initialized before accessing
> > > > > > > the configuration space.
> > > > > >
> > > > > > Yes, that's correct.
> > > > > > There's a sequence on the ZynqMP that involves assigning Gigabit
> > > > > > Transceivers to PCI (GTs are shared among PCIe, USB, SATA and the
> > > > > > Display Port), enabling clocks and configuring a few registers to
> > > > > > enable ECAM and MSI.
> > > > > >
> > > > > > I'm not sure if this could be done prior to starting Xen. Perhaps.
> > > > > > If so, bootloaders would have to know ahead of time what devices
> > > > > > the GTs are supposed to be configured for.
> > > > >
> > > > > I've got further questions regarding the Gigabit Transceivers. You
> > > > > mention they are shared; do you mean that multiple devices can use
> > > > > a GT at the same time, or that the software decides at startup
> > > > > which device will use a given GT? If so, how does the software make
> > > > > this decision?
> > > >
> > > > Software will decide at startup. AFAIK, the allocation is normally
> > > > done once, but I guess that in theory you could design boards that
> > > > switch at runtime. I'm not sure we need to worry about that use-case
> > > > though.
> > > >
> > > > The details can be found here:
> > > > https://www.xilinx.com/support/documentation/user_guides/ug1085-zynq-ultrascale-trm.pdf
> > > >
> > > > I suggest looking at pages 672 and 733.
> > >
> > > Thank you for the documentation. I am trying to understand if we could
> > > move the initialization into Xen as suggested by Stefano. I looked at
> > > the driver in Linux and the code looks simple, with few dependencies.
> > > However, I was not able to find where the Gigabit Transceivers are
> > > configured. Do you have a link to that code?
> >
> > Hi Julien,
> >
> > I suspect that this setup has previously been done by an initial
> > bootloader auto-generated by the design configuration tools.
> >
> > Now, this is moving into Linux.
> > There's a specific driver that does it but AFAICS, it has not been
> > upstreamed yet. You can see it here:
> > https://github.com/Xilinx/linux-xlnx/blob/master/drivers/phy/phy-zynqmp.c
> >
> > DTS nodes that need a PHY can then just refer to it; here's an example
> > from SATA:
> >
> > &sata {
> >         phy-names = "sata-phy";
> >         phys = <&lane3 PHY_TYPE_SATA 1 3 150000000>;
> > };
> >
> > I'll see if I can find working examples for PCIe on the ZCU102. Then
> > I'll share DTS, Kernel etc.
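A side note on the "let DOM0 initialize the host bridge" idea above: the Xen side could be as simple as a flag behind a notification. A minimal sketch follows; this is not real Xen code, and the notification hypercall and all names are invented for illustration.

/*
 * Sketch only -- not real Xen code. Assumes a hypothetical hypercall
 * that DOM0 issues once the host bridge (GTs, clocks, ECAM enable) has
 * been initialized; until then, Xen refuses config-space accesses.
 */
#include <stdbool.h>
#include <stdint.h>
#include <errno.h>

static bool host_bridge_ready;

/* Invented hypercall handler: DOM0 signals "bridge is up". */
int hb_ready_notify(uint16_t segment)
{
    /* A real implementation would check that the caller is the
     * hardware domain and validate 'segment'. */
    host_bridge_ready = true;
    return 0;
}

/* Guarded ECAM config-space read. */
int ecam_read(uint32_t sbdf, unsigned int reg, uint32_t *val)
{
    if (!host_bridge_ready)
        return -EAGAIN;        /* bridge not initialized yet */
    /* ... the real ECAM access would go here ... */
    *val = 0xffffffff;         /* placeholder */
    return 0;
}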
> > If you are looking for a platform to get started, an option could be
> > for me to get you a build of our QEMU that includes models of the PCIe
> > controller, MSI and SMMU connections. These models are forgiving wrt.
> > PHY configs and initialization sequences; they will accept pretty much
> > any sequence and still work. This would allow you to focus on the
> > architectural issues rather than the exact details of init sequences
> > (which we can deal with later).
> >
> > > This would also mean that the MSI interrupt controller will be moved
> > > into Xen, which I think is a more sensible design (see more below).
> > >
> > > > > > > - For all other host bridges => I don't know if there are
> > > > > > > host bridges falling under this category. I also don't have
> > > > > > > any idea how to handle this.
> > > > > > >
> > > > > > > > Otherwise, if Dom0 is the only one to drive the physical
> > > > > > > > host bridge, and Xen is the one to provide the emulated
> > > > > > > > host bridge, how are DomU PCI config reads and writes
> > > > > > > > supposed to work in detail?
> > > > > > >
> > > > > > > I think I have answered this question with my explanation
> > > > > > > above. Let me know if that is not the case.
> > > > > > >
> > > > > > > > How is MSI configuration supposed to work?
> > > > > > >
> > > > > > > For the GICv3 ITS, the MSI will be configured with the eventID
> > > > > > > (which is unique per device) and the address of the doorbell.
> > > > > > > The linkage between the LPI and the "MSI" will be done through
> > > > > > > the ITS.
> > > > > > >
> > > > > > > For GICv2m, the MSI will be configured with an SPI (or an
> > > > > > > offset on some GICv2m variants) and the address of the
> > > > > > > doorbell. Note that for DOM0, SPIs are mapped 1:1.
> > > > > > >
> > > > > > > So in both cases, I don't think it is necessary to trap MSI
> > > > > > > configuration for DOM0. This may not be true if we want to
> > > > > > > handle other MSI controllers.
> > > > > > >
> > > > > > > I have in mind the xilinx MSI controller (embedded in the
> > > > > > > host bridge? [4]) and the xgene MSI controller ([5]). But I
> > > > > > > have no idea how they work and whether we need to support
> > > > > > > them. Maybe Edgar could share details on the Xilinx one?
> > > > > >
> > > > > > The Xilinx controller has 2 dedicated SPIs and pages for MSIs.
> > > > > > AFAIK, there's no way to protect the MSI doorbells from
> > > > > > misconfigured endpoints raising malicious EventIDs. So perhaps
> > > > > > trapped config accesses from domUs can help by adding this
> > > > > > protection as drivers configure the device.
> > > > > >
> > > > > > On Linux, once MSIs hit, the kernel takes the SPI interrupt,
> > > > > > reads out the EventID from a FIFO in the controller and injects
> > > > > > a new IRQ into the kernel.
> > > > >
> > > > > It might be early to ask, but how do you expect MSI to work with
> > > > > DOMU on your hardware? Does your MSI controller support
> > > > > virtualization? Or are you looking at a different way to inject
> > > > > MSIs?
> > > >
> > > > HW MSI support for domUs is quite limited and will require SW
> > > > hacks :-(
> > > >
> > > > Anyway, something along the lines of this might work:
> > > >
> > > > * Trap domU CPU writes to MSI descriptors in config space.
> > > >   Force the real MSI descriptors to the address of the doorbell
> > > >   area.
> > > >   Force the real MSI descriptors to use a device-unique EventID
> > > >   allocated by Xen.
> > > >   Remember which EventID domU requested per device and descriptor.
> > > >
> > > > * Xen or Dom0 takes the real SPI generated when the device writes
> > > >   into the doorbell area. At this point, we can read out the
> > > >   EventID from the MSI FIFO and map it to the one requested by
> > > >   domU. Xen or Dom0 then injects the expected EventID into domU.
> > > >
> > > > Do you have any good ideas? :-)
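To make the trap + demux scheme above a bit more concrete, here is a rough sketch of the two halves (the config-space trap and the SPI demux). Everything here is invented for illustration; the register offsets, helpers and doorbell address are placeholders, not real Xen or Xilinx APIs.

/*
 * Sketch only: one possible shape for the trap + demux scheme above.
 */
#include <stdint.h>

#define MAX_EVENTS      32
#define MSI_DOORBELL_PA 0xfe440000u                /* made-up doorbell address */

enum { MSI_ADDR_REG = 0x54, MSI_DATA_REG = 0x5c }; /* made-up cfg offsets */

/* Assumed helpers (declarations only, all invented): */
uint32_t alloc_phys_event(void);                   /* Xen-chosen, device-unique */
void pci_cfg_write(uint32_t sbdf, int reg, uint32_t val);
int  msi_fifo_pop(uint32_t *event);                /* drain the HW MSI FIFO */
void inject_event(uint16_t domid, uint32_t event);

static struct {
    uint16_t domid;                                /* guest owning this EventID */
    uint32_t virt_event;                           /* EventID the guest wrote */
} remap[MAX_EVENTS];

/* Trapped domU write to an MSI descriptor's data field. */
void domu_msi_cfg_write(uint16_t domid, uint32_t sbdf, uint32_t data)
{
    uint32_t phys = alloc_phys_event();            /* assume phys < MAX_EVENTS */

    remap[phys].domid = domid;
    remap[phys].virt_event = data;

    /* Force the real descriptor to the doorbell and Xen's EventID. */
    pci_cfg_write(sbdf, MSI_ADDR_REG, MSI_DOORBELL_PA);
    pci_cfg_write(sbdf, MSI_DATA_REG, phys);
}

/* Doorbell SPI handler: translate EventIDs back, per guest. */
void msi_spi_handler(void)
{
    uint32_t phys;

    while (msi_fifo_pop(&phys)) {
        if (phys >= MAX_EVENTS)
            continue;                              /* drop malicious EventIDs */
        inject_event(remap[phys].domid, remap[phys].virt_event);
    }
}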
> > > From my understanding, your MSI controller is embedded in the host
> > > bridge, right? If so, the MSIs would need to be handled wherever the
> > > host bridge is initialized (i.e. either Xen or DOM0).
> >
> > Yes, it is.
> >
> > > From a design point of view, it would make more sense to have the
> > > MSI controller driver in Xen, as the host bridge emulation for
> > > guests will also live there.
> > >
> > > So if we receive MSIs in Xen, we need to figure out a way for DOM0
> > > and guests to receive them. Handling both the same way would be
> > > best, and I guess non-PV if possible. I know you are looking to boot
> > > unmodified OSes in a VM. This would mean we need to emulate the MSI
> > > controller and potentially the xilinx PCI controller. How much are
> > > you willing to modify the OS?
> >
> > Today, we have not yet implemented PCIe drivers for our baremetal SDK,
> > so things are very open and we could design with pretty much anything
> > in mind.
> >
> > Yes, we could perhaps include a very small model with most registers
> > dummied out. Implementing the MSI read FIFO would allow us to:
> >
> > 1. Inject the MSI doorbell SPI into guests. The guest will then see
> >    the same IRQ as on real HW.
> >
> > 2. Let the guest read the host-controller registers (MSI FIFO) to get
> >    the signaled MSI.
> >
> > > Regarding the MSI doorbell, I have seen that it is configured by the
> > > software using the physical address of a page allocated in RAM. When
> > > the PCI device writes into the doorbell, does the access go through
> > > the SMMU?
> >
> > That's a good question. On our QEMU model it does, but I'll have to
> > dig a little to see if that is the case on real HW as well.
> >
> > > Regardless of the answer, I think we would need to map the MSI
> > > doorbell page into the guest. This means that even if we trap MSI
> > > configuration accesses, a guest could DMA into the page. So if I am
> > > not mistaken, MSIs would be insecure in this case :/.
> > >
> > > Or maybe we could avoid mapping the doorbell into the guest and let
> > > Xen receive an SMMU abort. On receiving the SMMU abort, Xen could
> > > sanitize the value and write into the real MSI doorbell. Not sure
> > > whether that would work though.
> >
> > Yeah, this is a problem.
> > I'm not sure if SMMU aborts would work, because I don't think we know
> > the value of the data written when we take the abort. Without the
> > data, I'm not sure how we would distinguish between different MSIs
> > from the same device.
> >
> > Also, even if the MSI doorbell were protected by the SMMU, all PCI
> > devices are presented with the same AXI master ID.
>
> Does that mean that from the SMMU perspective you can only assign them
> all or none?

Unfortunately yes.

> > BTW, this master-ID SMMU limitation is a showstopper for domU guests,
> > isn't it? Or do you have ideas around that? Perhaps some PV way to
> > request mappings for DMA?
>
> No, we don't have anything like that. There are too many
> device-specific ways of requesting DMA to do that. For devices that
> cannot be effectively protected by the IOMMU, we support assignment
> (on x86), but only in an insecure fashion.

OK, I see.
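Going back to the emulated MSI read FIFO idea above, the model could be as small as something like this. Again a sketch only: the register layout, the "no event" value and the injection helper are all invented.

/*
 * Sketch only: a tiny emulated MSI read FIFO, so an unmodified guest
 * can use the same "take SPI, read FIFO" flow as on real HW.
 */
#include <stdint.h>

#define FIFO_DEPTH 16
#define NO_EVENT   0xffffffffu     /* invented "FIFO empty" value */

struct vmsi_fifo {
    uint32_t ev[FIFO_DEPTH];
    unsigned int rd, wr;           /* free-running indices */
};

/* Xen side: queue a translated EventID, then inject the doorbell SPI. */
void vmsi_deliver(struct vmsi_fifo *f, uint32_t event)
{
    if (f->wr - f->rd < FIFO_DEPTH) {
        f->ev[f->wr++ % FIFO_DEPTH] = event;
        /* inject_spi(guest, DOORBELL_SPI); -- assumed helper */
    }
    /* else: overflow; real HW presumably drops or latches an error */
}

/* Guest side: trapped MMIO read of the emulated FIFO register. */
uint32_t vmsi_fifo_read(struct vmsi_fifo *f)
{
    if (f->rd == f->wr)
        return NO_EVENT;
    return f->ev[f->rd++ % FIFO_DEPTH];
}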
A possible hack could be to allocate a chunk of DDR dedicated to PCI DMA.
PCI DMA devices could then be locked down so that they can only access
this memory plus the MSI doorbell. Guests could still screw each other
up, but at least it becomes harder to read/write directly from each
other's OS memory. It may not be worth the effort though....

Cheers,
Edgar
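P.S. To illustrate the dedicated-DMA-chunk idea: at device-assignment time, the SMMU would only get mappings for the shared pool and the doorbell, so everything else faults. The addresses, sizes and the mapping helper below are all made up.

/*
 * Sketch only: lock a device's SMMU view down to a dedicated DMA pool
 * plus the MSI doorbell page.
 */
#include <stdint.h>

#define DMA_POOL_BASE 0x60000000ull    /* made-up reserved DDR chunk */
#define DMA_POOL_SIZE (64ull << 20)    /* 64 MB, for example */
#define MSI_DOORBELL  0xfe440000ull    /* made-up doorbell page */
#define PAGE_SIZE     4096ull

/* Assumed helper: create an SMMU mapping for a PCI stream/master ID. */
int smmu_map(uint32_t streamid, uint64_t iova, uint64_t pa, uint64_t size);

int pci_dma_lockdown(uint32_t streamid)
{
    int rc;

    /* The device sees only the pool (identity-mapped for simplicity)... */
    rc = smmu_map(streamid, DMA_POOL_BASE, DMA_POOL_BASE, DMA_POOL_SIZE);
    if (rc)
        return rc;

    /* ...plus the doorbell page; everything else takes an SMMU fault. */
    return smmu_map(streamid, MSI_DOORBELL, MSI_DOORBELL, PAGE_SIZE);
}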