Re: [PATCH v3 1/2] IOMMU/x86: disallow device assignment to PoD guests
On Thu, Feb 03, 2022 at 10:21:54AM +0100, Jan Beulich wrote:
> On 03.02.2022 10:04, Roger Pau Monné wrote:
> > On Thu, Feb 03, 2022 at 09:31:03AM +0100, Jan Beulich wrote:
> >> On 02.02.2022 17:13, Roger Pau Monné wrote:
> >>> On Tue, Jan 04, 2022 at 10:41:32AM +0100, Jan Beulich wrote:
> >>>> @@ -359,7 +360,10 @@ p2m_pod_set_mem_target(struct domain *d,
> >>>>
> >>>>      ASSERT( pod_target >= p2m->pod.count );
> >>>>
> >>>> -    ret = p2m_pod_set_cache_target(p2m, pod_target, 1/*preemptible*/);
> >>>> +    if ( has_arch_pdevs(d) || cache_flush_permitted(d) )
> >>>
> >>> Is it possible to have cache flush allowed without any PCI device
> >>> assigned? AFAICT the iomem/ioport_caps would only get setup when there
> >>> are device passed through?
> >>
> >> One can assign MMIO or ports to a guest the raw way. That's not
> >> secure, but functionally explicitly permitted.
> >>
> >>> TBH I would be fine if we just say that PoD cannot be used in
> >>> conjunction with an IOMMU, and just check for is_iommu_enable(d) here.
> >>>
> >>> I understand it's technically possible for PoD to be used together
> >>> with a domain that will later get a device passed through once PoD is
> >>> no longer in use, but I doubt there's much value in supporting that
> >>> use case, and I fear we might be introducing corner cases that could
> >>> create issues in the future. Overall I think it would be safer to just
> >>> disable PoD in conjunction with an IOMMU.
> >>
> >> I consider it wrong to put in place such a restriction, but I could
> >> perhaps accept you and Andrew thinking this way if this was the only
> >> aspect playing into here. However, this would then want an equivalent
> >> tools side check, and while hunting down where to make the change as
> >> done here, I wasn't able to figure out where that alternative
> >> adjustment would need doing. Hence I would possibly(!) buy into this
> >> only if someone else took care of doing so properly in the tool stack
> >> (including the emission of a sensible error message).
> >
> > What about the (completely untested) chunk below:
> >
> > diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
> > index d7a40d7550..e585ef4c5c 100644
> > --- a/tools/libs/light/libxl_create.c
> > +++ b/tools/libs/light/libxl_create.c
> > @@ -1160,17 +1160,16 @@ int libxl__domain_config_setdefault(libxl__gc *gc,
> >      pod_enabled = (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV) &&
> >          (d_config->b_info.target_memkb < d_config->b_info.max_memkb);
> >
> > -    /* We cannot have PoD and PCI device assignment at the same time
> > +    /* We cannot have PoD and an active IOMMU at the same time
> >       * for HVM guest. It was reported that IOMMU cannot work with PoD
> >       * enabled because it needs to populated entire page table for
> > -     * guest. To stay on the safe side, we disable PCI device
> > -     * assignment when PoD is enabled.
> > +     * guest.
> >       */
> >      if (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV &&
> > -        d_config->num_pcidevs && pod_enabled) {
> > +        d_config->c_info.passthrough != LIBXL_PASSTHROUGH_DISABLED &&
> > +        pod_enabled) {
> >          ret = ERROR_INVAL;
> > -        LOGD(ERROR, domid,
> > -             "PCI device assignment for HVM guest failed due to PoD enabled");
> > +        LOGD(ERROR, domid, "IOMMU cannot be enabled together with PoD");
> >          goto error_out;
> >      }
>
> Perhaps. Seeing this I actually recall coming across this check during
> my investigation. Not changing it along the lines of what you do was
> then really more because of me not being convinced of the extra
> restriction; I clearly misremembered when writing the earlier reply.
> If we were to do what you suggest, I'd like to ask that the comment be
> changed differently, though: "We cannot ..." then isn't really true
> anymore. We choose not to permit this mode; "cannot" only applies to
> actual device assignment (and of course only as long as there aren't
> restartable IOMMU faults).

I'm fine with an adjusted wording here. This was mostly a placement
suggestion, but I didn't give much thought to the error message.

> >> Finally this still leaves out the "raw MMIO / ports" case mentioned
> >> above.
> >
> > But the raw MMIO 'mode' doesn't care much about PoD, because if
> > there's no PCI device assigned there's no IOMMU setup, and thus such
> > raw MMIO regions (could?) belong to a device that's not constrained by
> > the guest p2m anyway?
>
> Hmm, yes, true.
>
> >>>> --- a/xen/common/vm_event.c
> >>>> +++ b/xen/common/vm_event.c
> >>>> @@ -639,7 +639,7 @@ int vm_event_domctl(struct domain *d, st
> >>>>
> >>>>          rc = -EXDEV;
> >>>>          /* Disallow paging in a PoD guest */
> >>>> -        if ( p2m_pod_entry_count(p2m_get_hostp2m(d)) )
> >>>> +        if ( p2m_pod_active(d) )
> >>>
> >>> Isn't it fine to just check for entry_count like you suggest in the
> >>> change to libxl?
> >>
> >> I didn't think it would be, but I'm not entirely sure: If paging was
> >> enabled before a guest actually starts, it wouldn't have any entries
> >> but still be a PoD guest if it has a non-empty cache. The VM event
> >> folks may be able to clarify this either way. But ...
> >>
> >>> This is what p2m_pod_entry_count actually does (rather than entry_count |
> >>> count).
> >>
> >> ... you really mean "did" here, as I'm removing p2m_pod_entry_count()
> >> in this patch. Of course locking could be added to it instead (or
> >> p2m_pod_get_mem_target() be used in its place), but I'd prefer if we
> >> could go with just the check which precisely matches what the comment
> >> says (IOW otherwise I'd need to additionally know what exactly the
> >> comment is to say).
> >
> > Could you briefly mention this in the commit message? Ie: VM event
> > code is also adjusted to make sure PoD is not in use and cannot be
> > used during the guest lifetime?
>
> I've added
>
> "Nor was the use of that function in line with the immediately preceding
> comment: A PoD guest isn't just one with a non-zero entry count, but
> also one with a non-empty cache (e.g. prior to actually launching the
> guest)."
>
> to the already existing paragraph about the removal of that function.

Thanks, Roger.
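As an aside for readers following the p2m_pod_entry_count()/p2m_pod_active() part of the discussion: below is a minimal sketch of what such a helper could look like, assuming it folds both the outstanding PoD entry count and the PoD cache size into a single check taken under the PoD lock. The names used (pod.entry_count, pod.count, pod_lock()/pod_unlock(), p2m_get_hostp2m()) mirror Xen's existing PoD bookkeeping; the authoritative definition is the one introduced by the patch itself.

/*
 * Sketch, not the patch's exact code: a domain counts as a "PoD guest"
 * if it either has outstanding PoD entries in its p2m (entry_count) or
 * still holds pages in the PoD cache (count), e.g. prior to the guest
 * actually being launched.  The PoD lock is taken so both fields are
 * read consistently.
 */
static inline bool p2m_pod_active(struct domain *d)
{
    struct p2m_domain *p2m = p2m_get_hostp2m(d);
    bool res;

    pod_lock(p2m);
    res = p2m->pod.entry_count | p2m->pod.count;
    pod_unlock(p2m);

    return res;
}

With a helper of this shape, the vm_event hunk quoted above reduces to "if ( p2m_pod_active(d) )", which covers both the "entries outstanding" case and the "cache populated prior to launch" case Jan describes.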