Xen project Mailing List

Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain

To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

From: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Sat, 30 Oct 2021 09:53:19 +0200

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=MVJ1k9E/HU64AYnlrTPywAaZ3jOuUc+JxseXIJ8jVeM=; b=N3ke0DeNSvH/HvWA2J7/08deNhg/TDOOur/ANgjn3cbpJxvIdltAST5tS7EcQNKC1SOVuvchkyMavziFmy2P+UjSnNIhYSwU5OWLVva0zKbuh/3WmL2CgGHUj+Ae4BXRhZhCnLmkv58K7aTvb15FOVMxJ7mv3i29Ze/2p5IJiOmLd0VkB8ZsJiRNqqiELdReaqqS4Jqzg4yWJqMllnC5MKKuUHPuX/qY5IMj3krwVDB1EichqSjXVckvIMSiBx1eftedPLN574Hogbu2dTAoW/uhqU2Up01yQQr+hG1wre2nm6MCE9gNjNSb9zkmStHSBzJINe1AAwur1vKOL7dDjw==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=fCyKVZvnlpeVtjNCUNoESM1s4trefVVu/tUPn/6sIi95hGxVHKLBJVGf0s2Ys8oZNVBY/ZnQoDRQ7kfY3CO59JIuaqQztldU6XUjTSZg+RP0+aNAPuIUgAiyDIxxu+0Fx+3HzDsvfeHs+oTqdbRIvJ4H4kVlMTNtIoAr+ijbsrhbT6GICa1RqIBH07o0k4szg1klEGfRj/uK5yazHPzzwbDIQBH1HfjUghDboJ8g9nQ8TON+lzOl4PsEjEIMMsa5w6KrWk53QHN2mK3tSj2pG/CovmFeXf6mj5JBap6ukX7UatX36hso/8NtRUrlwdXoNfUiPJNKDGWmW8o7mnNtkw==

Authentication-results: esa6.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com

Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Christian Lindig <christian.lindig@xxxxxxxxxx>, David Scott <dave@xxxxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>

Delivery-date: Sat, 30 Oct 2021 07:54:11 +0000

Ironport-data: A9a23:8s4hcqChlDCBDBVW/3jlw5YqxClBgxIJ4kV8jS/XYbTApGsqhTZVm 2UZWWnXbqnbNGKhKtBxb4jioR5X6Mfcx4VhQQY4rX1jcSlH+JHPbTi7wuYcHM8wwunrFh8PA xA2M4GYRCwMo/u1Si6FatANl1ElvU2zbue6WLGs1hxZH1c+EX5500M7wIbVv6Yz6TSHK1LV0 T/Ni5W31G+Ng1aY5UpNtspvADs21BjDkGtwUm4WPJinj3eH/5UhN7oNJLnZEpfNatI88thW5 Qr05OrREmvxp3/BAz4++1rxWhVirrX6ZWBihpfKMkQLb9crSiEai84G2PQghUh/sSW5lPEsz I9xqt+Rax8wIPzFtLgFTEwNe81+FfUuFL7vJHG+tYqYzlHccmuqyPJrZK00FdRGoKAtWzgIr KFGbmBWBvyAr7veLLaTUO5ji95lNMD2FIgepmth3XfSCvNOrZXrHvuUuoAEhGZYasZmTa7RS NggZzNTNBXFWy9FIH4HMMgytbL97pX4W2IB8w/EzUYt2EDfxRJ8+KLgO93UfpqNX8o9tkSXv GXd5EziHwoXcteYzFKt822urv/CmzvhX4AfH6H+8eRl6HWtwWgUBAwTREGMi/CzgU6jWPpSM 0URvCEpqMAa71e3R9PwWxm5pn+svRMGXddUVeog52ml1a788wufQG8eQVZpeNEg8cM7WzEu/ luIhM/yQyxitqWPTnCQ/avSqim9UQAXMGsDaCksXQYDpd75r+kbjBjCU9JiG66dlcDuFHf7x DXikcQlr+xN14hRjfz9pA2ZxWL3znTUcuIrzj/HbmmEswcjWJHmRq6w8EaK5NocNJnMGzFto 0M4s8SZ6ekPC7SEmyqMXPgBEdmV2hqVDNHPqQUxRsd8rlxB71bmJNkNu28meC+FJ+5dIWexC HI/rz+983O60JGCSaZsK7y8BM0xpUQLPYS0D6uEBjaij3UYSeNmwM2MTRLPt4wOuBJ1+U3aB Xt9WZz0ZZr9If8+pAdav89HjdcWKtkWnAs/v6zTwRW9yqa5b3WIU7oDO1bmRrlnt/7Z+1+Er I4FbpDiJ/BjvAvWOHG/HWk7dgliEJTGLcqu95w/mhCre1IO9J4d5w/5nup6Jt0Nc1V9nebU5 HCtMnK0O3Kk7UAr3T6iMyg5AJu2BM4XhStiYUQEYAb5s1B+MN3HxPpOKPMKkUwPqbULIQhcF KJeJa1tw51nF1z6xtjqRcCm8dE5K0jz3Wpj/UONOVACQnKpfCSQkvfMdQrz7igeSC2xsMo1u bq70Q3HB5EEQmxf4Az+MZpDFnu94ioQnvxcRUzNLoUBcUng6tEyeSfwkuU2M4cHLhCanmmW0 AOfABE5o+jRotBqrImV1P7c94r5QfFjGkd6HnXA6erkPyft4Wf+k5RLV/yFfG6BWTqsqrmif +hc09r1LOYDwARRq4N5HrsylfA+6tLjqqV01ANhGHmXPV2nBqk5eiuN3NVVt70Lzbhc4FPkV kWK89hcGLOIJMK6TwJBeFt7NryOjKhGlCPT4PI5JFTByBV2pLfXA19POxSsiTBGKOcnOo0S3 up86tUd7Bayi0R2P4/e3DxU7WmFMlcJT74j6sMBGIbuhwcmlgNCbJjbBnOk6Z2DcYwRYEwjI zvSj6venbVMgEHFdiNrR3TK2ONcg7UIuQxLkwBedwjYxIKdi69lxgBV/BQ2Uh9Rn0dO3O9EM 2R2M1F4ePeV9DByickfB22hFmmt3vFCFpAdH7fRqFDkcg==

Ironport-hdrordr: A9a23:lea7kam6sl/ilplQ5RrJ4F/J6kHpDfIS3DAbv31ZSRFFG/Fw8P re/sjztCWE6wr5PUtKpTnuAsi9qB/nmqKdgrN8AV7BZmTbUQKTRelfBOPZsljd8kbFltK1u5 0PT0DZYueAaWRSvILf2k2ZCNY7hP2K7aiEjfrXpk0GcT1X

Ironport-sdr: HzU8GZT1m77svVQDHoja5CToL11dWaDmNAV9SDJyMRm2ZxMUWSdH32rdwgYtkKWZC3nimkhluA dgJfFEHbcvjMuYKzdx0CFC8KjTmPKdu0AkdtPF2iKUhzEhgoAxXMJst/Nmg+c5IHw/Jh8hb2Gf a2L5kXpH/lBeJCHOxABYGj9zk8fh7rN+GTeMMzcC/Zs7yjcYxiJXoGfIZQymGnxy0+XrCVFhhz FXtHJ2c/Tz7oWcYadkUu/lRqe2Dm11JoEvXIAxvFdtI7Ae3CWnQmKb4VF6/uHGbmAPltvIShbk E2oRZugYecRTd0taa1zdq66l

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, Oct 29, 2021 at 05:39:52PM +0100, Andrew Cooper wrote: > On 29/10/2021 08:59, Roger Pau Monne wrote: > > diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c > > index e510395d08..f94f0f272c 100644 > > --- a/xen/common/grant_table.c > > +++ b/xen/common/grant_table.c > > @@ -1917,11 +1918,33 @@ active_alloc_failed: > > } > > > > int grant_table_init(struct domain *d, int max_grant_frames, > > - int max_maptrack_frames) > > + int max_maptrack_frames, unsigned int options) > > { > > struct grant_table *gt; > > + unsigned int max_grant_version = options & > > XEN_DOMCTL_GRANT_version_mask; > > int ret = -ENOMEM; > > > > + if ( max_grant_version == XEN_DOMCTL_GRANT_version_default ) > > + max_grant_version = opt_gnttab_max_version; > > + if ( !max_grant_version ) > > + { > > + dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 > > requested\n", > > + d); > > + return -EINVAL; > > + } > > + if ( max_grant_version > opt_gnttab_max_version ) > > + { > > + dprintk(XENLOG_INFO, > > + "%pd: requested grant version (%u) greater than supported > > (%u)\n", > > + d, max_grant_version, opt_gnttab_max_version); > > + return -EINVAL; > > + } > > I think this wants to live in sanitise_domain_config() along with all > the other auditing of flags and settings. The reason to place those there is that the sanity checks for the other grant table related parameters (max_grant_frames and max_maptrack_frames) are performed in this function also. I think it's better to keep the checks together. We should consider exporting the relevant values from grant table code and then moving all the checks to sanitise_domain_config, but likely a follow up work given the current point in the release. > Also, it can be simplified: > > if ( max_grant_version < 1 || > max_grant_version > opt_gnttab_max_version ) > { > dprintk(XENLOG_INFO, "Requested gnttab max version %u outside of > supported range [%u, %u]\n", ...); > } It was originally done this way so that the first check (!max_grant_version) could be adjusted when support for max_grant_version == 0 was introduced [0] in order to signal the disabling of grant tables altogether. > > > > + if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) && > > + max_grant_version < 2 ) > > + dprintk(XENLOG_INFO, > > + "%pd: host memory above 16Tb and grant table v2 > > disabled\n", > > + d); > > This is rather more complicated. > > For PV, this going wrong in the first place is conditional on CONFIG_BIGMEM. > For HVM, it the guest address size, not the host. > For ARM, I don't even know, because I've lost track of which bits of the > ABI are directmap in an otherwise translated domain. This was only aiming to cover the PV case, which I think it's the more likely one. It's possible there's people attempting to create PV guests on a 16TB machine, but I think it's more unlikely that the guest itself will have 16TB of RAM. > I think it is probably useful to do something about it, but probably not > in this patch. I'm fine with this, we had no warning at all before, so I don't think it should be a hard requirement to add one now. It would be nice if there's consensus, but otherwise let's just skip it. > Perhaps modify domain_set_alloc_bitsize() to impose an upper limit for > the "host memory size matters" cases? > > For the guest address size cases, this possibly wants to feed in to the > max policy calculations in the same way that shadow kinda does. So grant table version will basically clamp the amount of memory a guest can use? What about guests that doesn't use grant tables at all, do we expect those to set the future max_grant_version to 0 in order to avoid the clamping without having to expose grant v2? > > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h > > index 51017b47bc..0ec57614bd 100644 > > --- a/xen/include/public/domctl.h > > +++ b/xen/include/public/domctl.h > > @@ -89,14 +89,20 @@ struct xen_domctl_createdomain { > > /* > > * Various domain limits, which impact the quantity of resources > > * (global mapping space, xenheap, etc) a guest may consume. For > > - * max_grant_frames and max_maptrack_frames, < 0 means "use the > > - * default maximum value in the hypervisor". > > + * max_grant_frames, max_maptrack_frames and max_gnttab_version < 0 > > + * means "use the default maximum value in the hypervisor". > > */ > > uint32_t max_vcpus; > > uint32_t max_evtchn_port; > > int32_t max_grant_frames; > > int32_t max_maptrack_frames; > > > > +/* Grant version, use low 4 bits. */ > > +#define XEN_DOMCTL_GRANT_version_mask 0xf > > +#define XEN_DOMCTL_GRANT_version_default 0xf > > This needs to be a toolstack decision, not something in Xen. This > doesn't fix the case where VMs can't cope with change underfoot. > > It is fine for the user say "use the default", but this must be turned > into an explicit 1 or 2 by the toolstack, so that the version(s) visible > to the guest remains invariant while it is booted. Please bear with me, as I'm afraid I don't understand why this is relevant. Allowed max grant version can only change as a result of a migration, and A VM being booted cannot (usually) be migrated, as it requires guest cooperation (and a fully setup xenstore). Any guest allowing migration during boot (which is AFAICT the only way for a max grant table version change) should be capable of handling the max grant table version changing on resume, whereas this resume happens in the middle of the boot process is a guest decision, and it should be capable of handling such changes, or else refuse to suspend during boot. Any resume process will always involve a re-initialization of the grant table. If the intent is to make this compatible with transparent live migration I think there are also other grant table structures that will need to be migrated in that case, and thus the version would already be conveyed there. Thanks, Roger.

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.