Re: Proposal for physical address based hypercalls
Hi Jan,
On 28/09/2022 11:38, Jan Beulich wrote:
> For quite some time we've been talking about replacing the present virtual
> address based hypercall interface with one using physical addresses. This is in
> particular a prerequisite to being able to support guests with encrypted
> memory, as for such guests we cannot perform the page table walks necessary to
> translate virtual to (guest-)physical addresses. But using (guest) physical
> addresses is also expected to help performance of non-PV guests (i.e. all Arm
> ones plus HVM/PVH on x86), because of the no longer necessary address
> translation.

I am not sure this is going to be a gain in performance on Arm. In most
cases we use the HW to translate a guest virtual address to a host
physical address. But there is no instruction to translate a guest
physical address to a host physical address, so we would have to do the
translation in software.
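
To make the cost concrete, here is a deliberately simplified, self-contained
sketch (toy code with invented names, not Xen's actual p2m implementation) of
the kind of software walk we would need for every buffer page, instead of the
hardware-assisted lookup we rely on today for guest virtual addresses:

#include <stdio.h>
#include <stdlib.h>

#define L1_ENTRIES 512UL
#define L2_ENTRIES 512UL
#define INVALID_MFN (~0UL)

/* One second-level table per first-level slot; entries map gfn -> mfn. */
typedef struct {
    unsigned long *l2[L1_ENTRIES];
} toy_p2m;

/* Software walk: one memory access per level, for every buffer page. */
static unsigned long toy_p2m_lookup(const toy_p2m *p2m, unsigned long gfn)
{
    unsigned long i1 = gfn / L2_ENTRIES, i2 = gfn % L2_ENTRIES;

    if ( i1 >= L1_ENTRIES || !p2m->l2[i1] )
        return INVALID_MFN;
    return p2m->l2[i1][i2];
}

int main(void)
{
    toy_p2m p2m = { { NULL } };

    p2m.l2[0] = calloc(L2_ENTRIES, sizeof(unsigned long));
    p2m.l2[0][5] = 0x1234;       /* pretend gfn 5 maps to mfn 0x1234 */
    printf("mfn = %#lx\n", toy_p2m_lookup(&p2m, 5));
    free(p2m.l2[0]);
    return 0;
}
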
That said, there are other reasons on Arm (and possibly x86) to get rid
of the virtual address. At the moment, we require the VA to always be
valid. This is quite fragile, as we can't fully control how the kernel
is touching its page-tables (remember that on Arm we need to use
break-before-make to do any shattering).

I have actually seen some failures during the translation on Arm32 in
the past, but I never fully investigated them because they were hard to
reproduce as they rarely happened.

> Clearly to be able to run existing guests, we need to continue to support the
> present virtual address based interface. Previously it was suggested to change
> the model on a per-domain basis, perhaps by a domain creation control. This
> has two major shortcomings:
> - Entire guest OSes would need to switch over to the new model all in one go.
>   This could be particularly problematic for in-guest interfaces like Linux's
>   privcmd driver, which is passed hypercall arguments from user space. Such
>   necessarily use virtual addresses, and hence the kernel would need to learn
>   of all hypercalls legitimately coming in, in order to translate the buffer
>   addresses. Reaching sufficient coverage there might take some time.
> - All base components within an individual guest instance which might run in
>   succession (firmware, boot loader, kernel, kexec) would need to agree on the
>   hypercall ABI to use.
> As an alternative I'd like to propose the introduction of a bit (or multiple
> ones, see below) augmenting the hypercall number, to control the flavor of the
> buffers used for every individual hypercall. This would likely involve the
> introduction of a new hypercall page (or multiple ones if more than one bit is
> to be used), to retain the present abstraction where it is the hypervisor which
> actually fills these pages. For multicalls the wrapping multicall itself would
> be controlled independently of the constituent hypercalls.
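
To check I am reading the proposal correctly, here is a rough sketch of how
such a flag might be encoded; the names and the bit position are invented for
illustration only, not a proposed ABI:

#include <stdbool.h>
#include <stdio.h>

/* Invented encoding, purely for illustration: one otherwise-unused high
 * bit of the hypercall number tells the hypervisor that the guest's
 * buffer handles carry guest-physical rather than virtual addresses. */
#define HYPERCALL_NR_MASK       0x0000ffffUL
#define HYPERCALL_BUF_PHYSADDR  (1UL << 30)

static bool buffers_are_physical(unsigned long nr)
{
    return nr & HYPERCALL_BUF_PHYSADDR;
}

static unsigned long hypercall_index(unsigned long nr)
{
    return nr & HYPERCALL_NR_MASK;
}

int main(void)
{
    /* E.g. a guest issuing memory_op (hypercall 12) with physical buffers. */
    unsigned long nr = 12UL | HYPERCALL_BUF_PHYSADDR;

    printf("op %lu, physical buffers: %d\n",
           hypercall_index(nr), buffers_are_physical(nr));
    return 0;
}
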
> A model involving just a single bit to indicate "flat" buffers has limitations
> when it comes to large buffers passed to a hypercall. Since in many cases
> hypercalls (currently) allowing for rather large buffers wouldn't normally be
> used with buffers significantly larger than a single page (several of the
> mem-ops for example), special-casing the (presumably) few hypercalls which have
> an actual need for large buffers might be an option.
>
> Another approach would be to build in a scatter/gather model for buffers right
> away. Jürgen suggests that the low two address bits could be used as a
> "descriptor" here.

IIUC, with this approach we would still need to have a bit in the
hypercall number to indicate this is not a virtual address. Is that correct?
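
If so, this is roughly how I picture Jürgen's suggestion; the encoding and
names below are invented purely for illustration, and as said the low bits
alone would not tell us that the handle is not a virtual address:

#include <stdint.h>
#include <stdio.h>

/* Invented encoding, for illustration only: buffer addresses are at
 * least 4-byte aligned, so the low two bits of a handle are free to
 * describe what it points at. */
#define BUF_DESC_MASK    0x3UL
#define BUF_DESC_FLAT    0x0UL  /* handle points at the data itself */
#define BUF_DESC_SGLIST  0x1UL  /* handle points at a scatter/gather list */

/* What one entry of such a scatter/gather list might contain. */
struct sg_entry {
    uint64_t gpa;   /* guest-physical address of one chunk */
    uint64_t len;   /* length of the chunk in bytes */
};

static unsigned long buf_desc(unsigned long handle)
{
    return handle & BUF_DESC_MASK;
}

static unsigned long buf_addr(unsigned long handle)
{
    return handle & ~BUF_DESC_MASK;
}

int main(void)
{
    unsigned long handle = 0x80004000UL | BUF_DESC_SGLIST;

    printf("descriptor %lu, address %#lx\n",
           buf_desc(handle), buf_addr(handle));
    return 0;
}
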
Cheers,
--
Julien Grall