Re: [PATCH v3 1/2] IOMMU/x86: disallow device assignment to PoD guests
On Thu, Feb 03, 2022 at 09:31:03AM +0100, Jan Beulich wrote:
> On 02.02.2022 17:13, Roger Pau Monné wrote:
> > On Tue, Jan 04, 2022 at 10:41:32AM +0100, Jan Beulich wrote:
> >> @@ -359,7 +360,10 @@ p2m_pod_set_mem_target(struct domain *d,
> >>
> >> ASSERT( pod_target >= p2m->pod.count );
> >>
> >> - ret = p2m_pod_set_cache_target(p2m, pod_target, 1/*preemptible*/);
> >> + if ( has_arch_pdevs(d) || cache_flush_permitted(d) )
> >
> > Is it possible to have cache flush allowed without any PCI device
> > assigned? AFAICT the iomem/ioport_caps would only get setup when there
> > are device passed through?
>
> One can assign MMIO or ports to a guest the raw way. That's not
> secure, but functionally explicitly permitted.
>
> > TBH I would be fine if we just say that PoD cannot be used in
> > conjunction with an IOMMU, and just check for is_iommu_enabled(d) here.
> >
> > I understand it's technically possible for PoD to be used together
> > with a domain that will later get a device passed through once PoD is
> > no longer in use, but I doubt there's much value in supporting that
> > use case, and I fear we might be introducing corner cases that could
> > create issues in the future. Overall I think it would be safer to just
> > disable PoD in conjunction with an IOMMU.
>
> I consider it wrong to put in place such a restriction, but I could
> perhaps accept you and Andrew thinking this way if this was the only
> aspect playing into here. However, this would then want an equivalent
> tools side check, and while hunting down where to make the change as
> done here, I wasn't able to figure out where that alternative
> adjustment would need doing. Hence I would possibly(!) buy into this
> only if someone else took care of doing so properly in the tool stack
> (including the emission of a sensible error message).
What about the (completely untested) chunk below:
diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index d7a40d7550..e585ef4c5c 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -1160,17 +1160,16 @@ int libxl__domain_config_setdefault(libxl__gc *gc,
pod_enabled = (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV) &&
(d_config->b_info.target_memkb < d_config->b_info.max_memkb);
- /* We cannot have PoD and PCI device assignment at the same time
+ /* We cannot have PoD and an active IOMMU at the same time
* for HVM guest. It was reported that IOMMU cannot work with PoD
* enabled because it needs to populated entire page table for
- * guest. To stay on the safe side, we disable PCI device
- * assignment when PoD is enabled.
+ * guest.
*/
if (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV &&
- d_config->num_pcidevs && pod_enabled) {
+ d_config->c_info.passthrough != LIBXL_PASSTHROUGH_DISABLED &&
+ pod_enabled) {
ret = ERROR_INVAL;
- LOGD(ERROR, domid,
- "PCI device assignment for HVM guest failed due to PoD enabled");
+ LOGD(ERROR, domid, "IOMMU cannot be enabled together with PoD");
goto error_out;
}
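[Editorial note: for reference, a guest configuration that would trip the proposed libxl check might look like the illustrative xl cfg fragment below. The values are made up; the point is that maxmem > memory makes pod_enabled true, while any passthrough setting other than disabled now conflicts with it.]

```
# maxmem > memory enables populate-on-demand (PoD)
memory = 1024
maxmem = 2048
# any enabled passthrough mode would now be rejected together with PoD
passthrough = "enabled"
```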
> Finally this still leaves out the "raw MMIO / ports" case mentioned
> above.
But the raw MMIO 'mode' doesn't care much about PoD: if there's no PCI
device assigned there's no IOMMU setup, and thus such raw MMIO regions
(could?) belong to a device that isn't constrained by the guest p2m
anyway?
> >> --- a/xen/common/vm_event.c
> >> +++ b/xen/common/vm_event.c
> >> @@ -639,7 +639,7 @@ int vm_event_domctl(struct domain *d, st
> >>
> >> rc = -EXDEV;
> >> /* Disallow paging in a PoD guest */
> >> - if ( p2m_pod_entry_count(p2m_get_hostp2m(d)) )
> >> + if ( p2m_pod_active(d) )
> >
> > Isn't it fine to just check for entry_count like you suggest in the
> > change to libxl?
>
> I didn't think it would be, but I'm not entirely sure: If paging was
> enabled before a guest actually starts, it wouldn't have any entries
> but still be a PoD guest if it has a non-empty cache. The VM event
> folks may be able to clarify this either way. But ...
>
> > This is what p2m_pod_entry_count actually does (rather than entry_count |
> > count).
>
> ... you really mean "did" here, as I'm removing p2m_pod_entry_count()
> in this patch. Of course locking could be added to it instead (or
> p2m_pod_get_mem_target() be used in its place), but I'd prefer if we
> could go with just the check which precisely matches what the comment
> says (IOW otherwise I'd need to additionally know what exactly the
> comment is to say).
Could you briefly mention this in the commit message? I.e. the VM event
code is also adjusted to make sure PoD is not in use and cannot be
used during the guest's lifetime?
Thanks, Roger.