
Re: Design session notes: GPU acceleration in Xen



On Fri, Jun 14, 2024 at 09:21:56AM +0200, Roger Pau Monné wrote:
> On Fri, Jun 14, 2024 at 08:38:51AM +0200, Jan Beulich wrote:
> > On 13.06.2024 20:43, Demi Marie Obenour wrote:
> > > GPU acceleration requires that pageable host memory be able to be mapped
> > > into a guest.
> > 
> > I'm sure it was explained in the session, which sadly I couldn't attend.
> > I've been asking Ray and Xenia the same before, but I'm afraid it still
> > hasn't become clear to me why this is a _requirement_. After all, it's
> > against what we do elsewhere (i.e. so far it has always been guest
> > memory that's mapped in the host). I can appreciate that it might be
> > more difficult to implement, but avoiding a violation of this
> > fundamental (kind of) rule might be worth the price (and would avoid
> > other complexities, of which more may be lurking than what you
> > enumerate below).
> 
> My limited understanding (please someone correct me if wrong) is that
> the GPU buffer (or context I think it's also called?) is always
> allocated from dom0 (the owner of the GPU).

A GPU context is a GPU address space.  It's the GPU equivalent of a CPU
process.  I don't believe that the same context can be used by more than
one userspace process (though I could be wrong), but the same userspace
process can create and use as many contexts as it wants.

> The underlying memory
> addresses of such a buffer need to be mapped into the guest.  The
> buffer's backing memory might be GPU MMIO from the device BAR(s) or
> system RAM, and the buffer can be paged by the dom0 kernel at any
> time (iow: changing the backing memory from MMIO to RAM or vice
> versa).  Also, the buffer must be contiguous in physical address
> space.
> 
> I'm not sure it's possible to ensure that, when using system RAM, such
> memory comes from the guest rather than the host, as it would likely
> require some very intrusive hooks into the kernel logic, and
> negotiation with the guest to allocate the requested amount of
> memory and hand it over to dom0.  If the maximum size of the buffer is
> known in advance, maybe dom0 can negotiate with the guest to allocate
> such a region and grant dom0 access to it at driver attachment time.

I don't think there is a useful maximum size known.  There may be a
limit, but it would be around 4GiB or more, which is far too high to
reserve physical memory for up front.

> One aspect I'm lacking clarity on is how the process of allocating
> and assigning a GPU buffer to a guest is performed (I think this is
> key to how GPU VirtIO native contexts work?).

The buffer is allocated by the GPU driver in response to an ioctl() made
by the userspace server process.  If the buffer needs to be accessed by
the guest CPU (not all do), it is mapped into part of an emulated PCI
BAR for access by the guest.  This mailing list thread is about making
that possible.

> Another question I have: are guests expected to have a single GPU
> buffer, or can they have multiple GPU buffers allocated
> simultaneously?

I believe there is only one emulated BAR, but this is very large (GiBs)
and sparsely populated.  There can be many GPU buffers mapped into the
BAR.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab
