Re: Pinned, non-revocable mappings of VRAM: will bad things happen?
On 4/21/26 12:55, Val Packett wrote:
>
> On 4/20/26 4:12 PM, Demi Marie Obenour wrote:
>> On 4/20/26 14:53, Christian König wrote:
>>> On 4/20/26 20:46, Demi Marie Obenour wrote:
>>>> On 4/20/26 13:58, Christian König wrote:
>>>>> On 4/20/26 19:03, Demi Marie Obenour wrote:
>>>>>> On 4/20/26 04:49, Christian König wrote:
>>>>>>> On 4/17/26 21:35, Demi Marie Obenour wrote:
>>>>> ...
>>>>>>>> Are any of the following reasonable options?
>>>>>>>>
>>>>>>>> 1. Change the guest kernel to only map (and thus pin) a small subset
>>>>>>>>    of VRAM at any given time. If unmapped VRAM is accessed, the guest
>>>>>>>>    traps the page fault, evicts an old VRAM mapping, and creates a
>>>>>>>>    new one.
>>>>>>> Yeah, that could potentially work.
>>>>>>>
>>>>>>> This is basically what we do in the host kernel driver when we can't
>>>>>>> resize the BAR for some reason. In that use case, VRAM buffers are
>>>>>>> shuffled in and out of the CPU-accessible window of VRAM on demand.
>>>>>> How much is this going to hurt performance?
>>>>> Hard to say; resizing the BAR can easily give you 10-15% more
>>>>> performance in some use cases.
>>>>>
>>>>> But that involves physically transferring the data using a DMA. For
>>>>> this solution we basically only have to transfer a few messages
>>>>> between host and guest.
>>>>>
>>>>> No idea how performant that is.
>>>> In this use case, 20-30% performance penalties are likely to be
>>>> "business as usual".
>>> Well, that is quite a bit.
>>>
>>>> Close to native performance would be ideal, but to be useful it just
>>>> needs to beat software rendering by a wide margin, and not cause data
>>>> corruption or vulnerabilities.
>>> That should still easily be the case; even trivial use cases are
>>> multiple orders of magnitude faster on GPUs compared to software
>>> rendering.
>> Makes sense. If only GPUs supported easy and flexible virtualization the
>> way CPUs do :(.
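[The windowed-VRAM scheme in option 1 above amounts to an LRU cache of CPU mappings: a fixed number of slots, a "fault" on any access to an unmapped page, and eviction of the least-recently-used mapping. A toy Python model for illustration only; the class and all names are invented here, not part of any real driver:]

```python
from collections import OrderedDict

class VramWindow:
    """Toy model of a guest that keeps only a few VRAM pages mapped.

    Accessing an unmapped page "faults": the least-recently-used
    mapping is evicted and the new page is mapped in its place.
    """

    def __init__(self, max_mappings):
        self.max_mappings = max_mappings
        self.mapped = OrderedDict()  # page -> True, ordered by last use
        self.faults = 0
        self.evictions = 0

    def access(self, page):
        if page in self.mapped:
            self.mapped.move_to_end(page)  # hit: refresh LRU position
            return
        self.faults += 1                   # miss: would trap in the guest
        if len(self.mapped) >= self.max_mappings:
            self.mapped.popitem(last=False)  # evict the LRU mapping
            self.evictions += 1
        self.mapped[page] = True
```

[With a 2-slot window, the access pattern 0, 1, 0, 2, 1 produces four faults and two evictions; only the fault path costs host/guest messages, which is why the hit rate of the window determines the overhead.]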
>>
>>>>>>> But I have one question: when Xen has a problem handling faults from
>>>>>>> the guest on the host, then how does that work for system memory
>>>>>>> mappings?
>>>>>>>
>>>>>>> There is really no difference between VRAM and system memory in the
>>>>>>> handling for the GPU driver stack.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>> Generally, Xen makes the frontend (usually an unprivileged VM)
>>>>>> responsible for providing mappings to the backend (usually the host).
>>>>>> That is possible with system RAM but not with VRAM, because Xen has
>>>>>> no awareness of VRAM. To Xen, VRAM is just a PCI BAR.
>>>>> No, that doesn't work with system memory allocations of GPU drivers
>>>>> either.
>>>>>
>>>>> We have already had it happen multiple times that people tried to be
>>>>> clever, incremented the page reference counter on driver-allocated
>>>>> system memory, and were totally surprised that this can result in
>>>>> security issues and data corruption.
>>>>>
>>>>> I seriously hope that this isn't the case here again. As far as I
>>>>> know, Xen already has support for accessing VMAs with VM_PFN;
>>>>> otherwise I don't know how driver-allocated system memory access
>>>>> could possibly work.
>>>>>
>>>>> Accessing VRAM is pretty much the same use case as far as I can see.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>> The Xen-native approach would be for system memory allocations to be
>>>> made using the Xen driver and then imported into the virtio-GPU driver
>>>> via dmabuf. Is there any chance this could be made to happen?
>>> That could be. Adding Pierre-Eric to comment, since he knows that use
>>> case much better than I do.
>>>
>>>> If it's a lost cause, then how much is the memory overhead of pinning
>>>> everything ever used in a dmabuf? It should be possible to account
>>>> pinned host memory against a guest's quota, but if that leads to an
>>>> unusable system it isn't going to be good.
>>> That won't work at all.
>>>
>>> We have use cases where you *must* migrate a DMA-buf to VRAM, or
>>> otherwise the GPU can't use it.
>>>
>>> A simple scanout to a monitor is such a use case, for example; that is
>>> usually not possible from system memory.
>> Direct scanout isn't a concern here.
>>
>>>> Is supporting page faults in Xen the only solution that will be viable
>>>> long-term, considering the tolerance for very substantial performance
>>>> overheads compared to native? AAA gaming isn't the initial goal here.
>>>> Qubes OS already supports PCI passthrough for that.
>>> We have had AAA gaming working on Xen through native context for quite
>>> a while.
>>>
>>> Pierre-Eric can tell you more about that.
>>>
>>> Regards,
>>> Christian.
>> I've heard of that, but last I checked it required downstream patches to
>> Xen, Linux, and QEMU. I don't know if any of those have been upstreamed
>> since, but I believe that upstreaming the Xen and Linux patches (or
>> rewriting them and upstreaming the rewritten versions) would be
>> necessary. Qubes OS (which I don't work for anymore but still want to
>> help with this) almost certainly won't be using QEMU for GPU stuff.
>
> Yeah, our plan is to use xen-vhost-frontend[1] + vhost-device-gpu,
> ported/extended/modified as necessary. (I already have xen-vhost-frontend
> itself working on amd64 PVH with purely xenbus-based
> hotplug/configuration, and I am currently working on cleaning up and
> submitting the necessary patches.)
>
> I'm curious to hear more details about how AMD has it working, but last
> time I checked there weren't any missing pieces in Xen or Linux that
> we'd need. The AMD downstream changes were mostly related to QEMU.
>
> As for the memory management concerns, I would like to remind everyone
> once again that the pinning of GPU dmabufs in regular graphics workloads
> would be *very* short-term.
> In GPU paravirtualization (native contexts
> or venus or whatever else) the guest mostly operates on *opaque handles*
> that refer to buffers owned by the host GPU process. The typical
> rendering process (roughly) only involves submitting commands to the GPU
> that refer to memory using these handles. Only upon mmap() would a
> buffer be pinned/granted to the guest, and those mappings are typically
> only used for *uploads*, where the guest immediately does its memcpy()
> and unmaps the buffer.
>
> So I'm not worried about (unintentionally) pinning too much GPU driver
> memory.
>
> In terms of deliberate denial-of-service attacks from the guest on the
> host, the only reasonable response is:
>
> ¯\_(ツ)_/¯
>
> CPU-mapping lots of GPU memory is far from the only DoS vector; the GPU
> commands themselves can easily wedge the GPU core in a million ways (and
> last time I checked, amdgpu was noooot so good at recovering from hangs).
>
> [1]: https://github.com/vireshk/xen-vhost-frontend
>
> ~val

I think it is best to handle things like GPU crashes by giving the guest
some time to unmap its grants and, if that fails, crashing it. This
should be done from a revoke callback, as afterwards the VRAM might get
reused. Does amdgpu call revoke callbacks when the device is reset and
VRAM is lost? It seems like it at least ought to.

As an aside, Qubes needs to use the process isolation mode of the amdgpu
driver. This means that only one process will be on the GPU at a time,
so it _should_ be possible to blow away all GPU-resident state except
VRAM without affecting other processes. Unfortunately, I think AMD GPUs
might have HW or FW limitations that prevent that, at least on dGPUs.

It might make sense to recommend KDE with GPU acceleration. KWin can
recover from losing VRAM.

--
Sincerely,
Demi Marie Obenour (she/her/hers)
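[The recovery policy proposed above — on device reset, revoke every grant, give the guest a grace period to unmap, and crash it if it doesn't comply — can be modelled as a small state machine. A toy Python sketch; every name here is invented for illustration and is not amdgpu, Xen, or xen-vhost-frontend API. The grace period is simplified to a single synchronous revoke callback:]

```python
class Guest:
    """Toy guest holding grants on host VRAM buffers."""

    def __init__(self, name, cooperative=True):
        self.name = name
        self.cooperative = cooperative  # does it honor revoke requests?
        self.grants = set()
        self.crashed = False

    def on_revoke(self, buffer):
        # A well-behaved guest unmaps the grant when asked.
        if self.cooperative:
            self.grants.discard(buffer)


def handle_vram_loss(guests, lost_buffers):
    """On device reset, revoke every grant; crash guests that keep any.

    A real implementation would give each guest a timeout to respond
    before the VRAM is reused; here "missed the deadline" is modelled
    simply as still holding a grant after on_revoke() returns.
    """
    for guest in guests:
        for buf in lost_buffers:
            if buf in guest.grants:
                guest.on_revoke(buf)
        if guest.grants & set(lost_buffers):
            guest.crashed = True  # grant still held: crash the guest
    return [g.name for g in guests if g.crashed]
```

[The key property this models is that the revoke must complete before the lost VRAM is handed out again: a cooperative guest survives with its grants cleared, while one that ignores the callback is terminated rather than left with a mapping of reused memory.]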