Re: Semantics of XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION
On 27.09.2021 23:01, Andrew Cooper wrote:
> A recent ABI change in Xen caused a total breakage under the Xapi
> toolstack, and the investigation had led to this.

I'm curious which change this was; while it's likely one of mine, I
can't easily guess which.

> First of all, the memory pool really needs renaming, because (not
> naming names) multiple developers were fooled into thinking that the
> bug was caused by a VM being unexpectedly in shadow mode.
>
> Second, any MB value >= 0x1000000 will truncate to 0 between
> {hap,shadow}_domctl() and {hap,shadow}_set_allocation().

This wants fixing of course. I assume a patch is already in the works.
If not, let me know and I'll see about making one.

> But for the main issue, passing 0 in at the hypercall level is broken.
>
> hap_enable() forces a minimum of 256 pages.  A subsequent hypercall
> trying to set 0 frees {tot 245, free 244, p2m 11} all the way down to
> {tot 1, free 0, p2m 11} before failing with -ENOMEM because there are
> no more free pages to free.  Getting -ENOMEM from this kind of
> operation isn't really correct.

It's questionable, but I wouldn't call it outright "not correct": the
function was requested to obtain memory (from the pool), so one may
view this as allocation. The set-allocation functions really are both
allocations and frees at the same time (moving pages from one pool to
another).

> Passing 0 cannot possibly function.  There are non-zero p2m frames by
> the time createdomain completes, as we allocate the top of the p2m,
> and they cannot be freed without the teardown logic which releases
> memory in the correct order.
>
> In fact, passing anything lower than the current free size is
> guaranteed to fail.  Continuations also mean that you can't pick a
> value which is guaranteed not to fail, because even a well (poorly?)
> placed foreign map in dom0 could change the properties of the pool.

Well, I suppose outside of domain cleanup shrinking of the pool was
always meant as a kind of best-effort operation.

> The shadow side rejects hypercall attempts using 0

I haven't been able to spot this rejection logic. Instead I'm getting
the impression that the BUG() at the bottom of _shadow_prealloc()
would be hit when shrinking the pool beyond what can really be freed
(i.e. in particular if any pages are in use for the p2m).

> (but can be bypassed with the above truncation bug), and will try to
> drop shadows to get down to the limit.  This represents a difference
> vs HAP, and in practice 1M granularity is probably enough to ensure
> that you can't fail to set the shadow allocation that low.  There is
> also a reachable BUG() somewhere in this path as reported several
> times on xen-devel, but I still haven't figured out how to tickle it.

Any pointer to one such report? I don't recall any, and hence it's not
clear to me whether that's the one in _shadow_prealloc().

> There is no code or working use case for reducing the size of the
> shadow pool, except on domain destruction.  I think we should prohibit
> the ability to shrink the shadow pool, and defer most of this mess to
> anyone who turns up with a plausible use case.

No present use case for reducing is a fair argument for dropping
support for doing so. That'll still mean an error return, which -
according to what you have written near the top - may still upset the
Xapi tool stack.

Jan
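
The truncation described above can be reproduced outside of Xen. Below
is a minimal standalone sketch, not the Xen code itself: it merely
assumes 4 KiB pages and that the MB count gets converted to a page
count through a 32-bit variable, which is exactly what makes any value
>= 0x1000000 wrap to 0.

/* Standalone illustration (not the Xen source) of an MB-to-pages
 * conversion truncating through a 32-bit variable.  Assumes 4 KiB
 * pages, i.e. a shift by 20 - 12 = 8 bits. */
#include <inttypes.h>
#include <stdio.h>

#define PAGE_SHIFT 12

int main(void)
{
    uint64_t mb = 0x1000000;                      /* smallest failing value */
    unsigned int pages = mb << (20 - PAGE_SHIFT); /* 0x1000000 << 8 == 2^32 */

    printf("mb=%#" PRIx64 " -> pages=%#x\n", mb, pages); /* pages == 0 */
    return 0;
}

Built and run, this prints pages=0 for mb=0x1000000, while any smaller
value still converts correctly; 0x1000000 MB is 16 TiB, so the wrap
point is far above any sane pool size but still reachable from the
hypercall.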
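The "both allocation and free" nature of the operation, and the -ENOMEM
seen when shrinking toward 0, can be modelled with a simplified sketch
(again an illustration, not the real {hap,shadow}_set_allocation()
implementation): the pool size is moved toward a target one page at a
time, and shrinking can only return pages which are currently free.

/* Simplified model of a set-allocation style operation.  'struct pool'
 * and set_allocation() here are illustrative only. */
#include <errno.h>

struct pool {
    unsigned long total;  /* pages owned by the pool    */
    unsigned long free;   /* pages not currently in use */
};

static int set_allocation(struct pool *p, unsigned long target)
{
    while ( p->total != target )
    {
        if ( p->total < target )
        {
            /* Grow: take a page from the system allocator (elided here). */
            p->total++;
            p->free++;
        }
        else
        {
            /* Shrink: only pages which are free can be handed back. */
            if ( p->free == 0 )
                return -ENOMEM; /* in-use (e.g. p2m) pages block shrinking */
            p->free--;
            p->total--;
        }
    }
    return 0;
}

In this model the smallest reachable total is the number of in-use
pages, which is why a target of 0 (or anything below the in-use amount,
such as the p2m frames mentioned above) cannot succeed and surfaces as
an error instead.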