Re: Proposal for physical address based hypercalls
On 28.09.2022 15:03, Juergen Gross wrote:
> On 28.09.22 14:06, Jan Beulich wrote:
>> On 28.09.2022 12:58, Andrew Cooper wrote:
>>> On 28/09/2022 11:38, Jan Beulich wrote:
>>>> As an alternative I'd like to propose the introduction of a bit (or
>>>> multiple ones, see below) augmenting the hypercall number, to control
>>>> the flavor of the buffers used for every individual hypercall. This
>>>> would likely involve the introduction of a new hypercall page (or
>>>> multiple ones if more than one bit is to be used), to retain the
>>>> present abstraction where it is the hypervisor which actually fills
>>>> these pages.
>>>
>>> There are other concerns which need to be accounted for.
>>>
>>> Encrypted VMs cannot use a hypercall page; they don't trust the
>>> hypervisor in the first place, and the hypercall page is (specifically)
>>> code injection. So the sensible new ABI cannot depend on a hypercall
>>> table.
>>
>> I don't think there's a dependency, and I think there never really has
>> been. We've been advocating for its use, but we've not enforced that
>> anywhere, I don't think.
>>
>>> Also, rewriting the hypercall page on migrate turns out not to have
>>> been the most clever idea, and only works right now because the
>>> instructions are the same length in the variations for each mode.
>>>
>>> Also continuations need to change to avoid userspace liveness problems,
>>> and existing hypercalls that we do have need splitting between things
>>> which are actually privileged operations (within the guest context) and
>>> things which are logical control operations, so the kernel can expose
>>> the latter to userspace without retaining the gaping root hole which is
>>> /dev/xen/privcmd, and a blocker to doing UEFI Secureboot.
>>>
>>> So yes, starting some new clean(er) interface from hypercall 64 is the
>>> plan, but it very much does not want to be a simple mirror of the
>>> existing 0-63 with a differing calling convention.
>>
>> All of these look like orthogonal problems to me. That's likely all
>> relevant for, as I think you've been calling it, ABI v2, but shouldn't
>> hinder our switching to a physical address based hypercall model.
>> Otherwise I'm afraid we'll never make any progress in that direction.
>
> What about an alternative model allowing most of the current hypercalls
> to be used unmodified?
>
> We could add a new hypercall for registering hypercall buffers via
> virtual address, physical address, and size of the buffers (kind of a
> software TLB).

Why not?

> The buffer table would want to be physically addressed by the
> hypercall, of course.

I'm not convinced of this, as it would break uniformity of the hypercall
interfaces. IOW in the hypervisor we then wouldn't be able to use
copy_from_guest() to retrieve the contents. Perhaps this simply
shouldn't be a table, but a hypercall not involving any buffers (i.e.
every discontiguous piece would need registering separately). I expect
such a software TLB wouldn't have many entries, so needing to use a
couple of hypercalls shouldn't be a major issue.

> It might be interesting to have this table per vcpu (it should be
> allowed to use the same table for multiple vcpus) in order to speed
> up finding translation entries of percpu buffers.

Yes. Perhaps insertion and purging could simply be two new VCPUOP_*. As
a prereq I think we'd need to sort out the cross-vCPU accessing of guest
data, coincidentally pointed out in a post-commit-message remark in
https://lists.xen.org/archives/html/xen-devel/2022-09/msg01761.html.
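For concreteness, guest-side use of such VCPUOP_* insertion/purging ops
might be shaped like the sketch below (C, as in the hypervisor and guest
kernels). Every name in it - the op numbers, struct vcpu_hcall_buf, and
the virt_to_gfn() helper - is an assumption invented for illustration;
only HYPERVISOR_vcpu_op() is an existing guest-side construct:

/* Hypothetical registration of one physically contiguous buffer piece;
 * each discontiguous piece would get its own call.  Op numbers, struct
 * layout, and virt_to_gfn() are made up for this sketch. */
#include <stdint.h>

#define VCPUOP_hcall_buf_register   16  /* made-up op number */
#define VCPUOP_hcall_buf_unregister 17  /* made-up op number */

struct vcpu_hcall_buf {
    uint64_t va;    /* guest linear address of the piece */
    uint64_t gfn;   /* guest frame backing its start */
    uint64_t size;  /* bytes; must be physically contiguous */
};

static int register_hcall_buf(unsigned int vcpu, void *va, uint64_t size)
{
    struct vcpu_hcall_buf buf = {
        .va   = (uintptr_t)va,
        .gfn  = virt_to_gfn(va),  /* assumed guest helper */
        .size = size,
    };

    return HYPERVISOR_vcpu_op(VCPUOP_hcall_buf_register, vcpu, &buf);
}

Note that passing &buf is itself a virtually addressed buffer; honoring
the "hypercall not involving any buffers" point above would mean passing
the three values in registers instead. The struct form is used here only
for brevity.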
The subject vCPU isn't available in copy_to_user_hvm(), which is where
I'd expect the TLB lookup to occur (while assuming handles point at
globally mapped space _might_ be okay, using the wrong vCPU's TLB
surely isn't).

> Any hypercall buffer being addressed virtually could first be looked
> up via the SW-TLB. This wouldn't require any changes for most of the
> hypercall interfaces. Only special cases with very large buffers
> might need indirect variants (like Jan said: via GFN lists, which
> could be passed in registered buffers).
>
> Encrypted guests would probably want to use static percpu buffers in
> order to avoid switching the encryption state of the buffers all the
> time.
>
> An unencrypted PVH/HVM domain (e.g. PVH dom0) could just define one
> giant buffer with the domain's memory size via the physical memory
> mapping of the kernel. All kmalloc() addresses would be in that
> region.

That's Linux-centric. I'm not convinced all OSes maintain a directmap.
Without one, switching to this model might end up quite intrusive on
the OS side. Thinking of Linux, we'd need a 2nd range covering the
data part of the kernel image.

Further, this still wouldn't (afaics) pave a reasonable route towards
dealing with privcmd-invoked hypercalls.

Finally - to what extent are we concerned about PV guests using linear
addresses for hypercall buffers? I ask because I don't think the model
lends itself to use also for the PV guest interfaces.

Jan

> A buffer address not found would need to be translated like today
> (and fail for an encrypted guest).
>
> Thoughts?
>
>
> Juergen
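To make the lookup side of the discussion concrete as well: a copy
routine consulting the subject vCPU's SW-TLB before falling back to
today's page-table walk might look roughly as below. All types and
names here are invented for illustration:

#include <stddef.h>
#include <stdint.h>

/* One registered, physically contiguous range (invented layout). */
struct swtlb_ent {
    uintptr_t va;   /* registered guest linear address */
    uint64_t  gpa;  /* guest physical address it maps to */
    size_t    size; /* length of the contiguous piece */
};

struct swtlb {
    const struct swtlb_ent *ents;
    unsigned int nr;
};

/* Return the guest physical address covering [va, va+len), or 0 if the
 * range isn't registered and the caller must translate as today (which
 * would fail for an encrypted guest). */
static uint64_t swtlb_lookup(const struct swtlb *tlb, uintptr_t va,
                             size_t len)
{
    for ( unsigned int i = 0; i < tlb->nr; i++ )
    {
        const struct swtlb_ent *e = &tlb->ents[i];

        /* Overflow-safe containment check. */
        if ( va >= e->va && len <= e->size &&
             va - e->va <= e->size - len )
            return e->gpa + (va - e->va);
    }

    return 0;
}

The design point that matters is merely which table gets consulted: it
has to be the one of the hypercall's subject vCPU, not necessarily the
currently executing one.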