[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/2] x86: Perform mem_sharing teardown before paging teardown


  • To: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 15 Feb 2023 11:27:19 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=SERSB7BNRhjB/1xGr5c4n0jNpMuSxfs9lzAEyQg8tEk=; b=aSdudF8wNA3nEqH1p6ka9Bgrx1LXh1w4SFKsYPxj6Gj4hvGS8vyvRkA5qy+CduNKLVxnJg1G75cuCvWIoDdpRN26wCpPmpn0ix5o0Fns/gzcqQHSXAn8Z1p1HRLZtJY+qt7VVcG3sUX1hF+GDSEix7fewSTZDf5OFqLmU29lzqIRj9DS7qemLr1EDvoLDVP8wZ90y8+pPtnb9K2WFY5jah+r79dIHP0lVI1DP1liXgg7ePQZCoBmPpSpScGXy6OnjP3qoq5d7/qxnEZ6q1tlU66uwlAdKDifBVhuWeizGmf0sNCdzWkDOpn2OTNnTP/ySg29uSBOWC030Ssp/lHAHQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=g8gR5uycbg06iBdVCrWEUcU/YmnU0nlEfOtqzd4yrjXuJvLiNhjlwleRRI51vmNZFOlzCskTsV14lhTUfgBZmpWqxhILjZiHPM8975NiFA0MLhrPaksYazeGovQ5RmmFA+q9kbBqD10oPaNsh6hh4TQNFybdQjV5Rh5XSrazL59+kBgj27V/5q3opcCeR4EDUeBDL5vKJ6T4+/xh8Cd2L2tCXXdfUYbuXprqce/vd22TwBv3ZJ/H/4XnG0ldF9VyS5iIKoRX2eXTxWSnwDuCJ3u42gcKfXUbKcTBFskb8Oy7oqJanSFo3V5NP1gEQkLrBAgDUeQXbep2Rt2QlEPnHg==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 15 Feb 2023 10:27:36 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 14.02.2023 06:07, Tamas K Lengyel wrote:
> An assert failure has been observed at p2m-basic.c:173 when performing vm

Please can you at least also name the function here? The line number is
going to change, and when coming back to this change later, it'll be
more troublesome to figure out which precise assertion was meant. Yes,
...

> forking and then destroying the forked VM. The assert checks whether the
> domain's shared pages counter is 0.

... you verbally describe it here, but still.

> According to the patch that originally
> added the assert (7bedbbb5c31) the p2m_teardown should only happen after
> mem_sharing already relinquished all shared pages.
> 
> In this patch we flip the order in which relinquish ops are called to avoid
> tripping the assert.
> 
> Signed-off-by: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
> ---
> Note: it is unclear why this assert hasn't tripped in the past. It hasn't
> been observed with 4.17-rc1 but it is in RELEASE-4.17.0

That's almost certainly a result of e7aa55c0aab3 ("x86/p2m: free the paging
memory pool preemptively"), which added calls to p2m_teardown() to
hap_teardown(). If you agree, this wants recording in a Fixes: tag, as
- being an XSA fix - that change has been backported to everywhere, and
hence the issue now also exists everywhere.

> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -2310,6 +2310,32 @@ int domain_relinquish_resources(struct domain *d)
>          if ( ret )
>              return ret;
>  
> +#ifdef CONFIG_MEM_SHARING
> +    PROGRESS(shared):

If we go with the re-ordering as you suggest, then please also move the
enumerator near the top of the switch() body.

Did you consider the alternative of adjusting the ASSERT() in question
instead? It could reasonably become

#ifdef CONFIG_MEM_SHARING
    ASSERT(!p2m_is_hostp2m(p2m) || !remove_root || !atomic_read(&d->shr_pages));
#endif

now, I think. That would be less intrusive a change (helpful for
backporting), but there may be other (so far unnamed) benefits of pulling
ahead the shared memory teardown.

> +        if ( is_hvm_domain(d) )
> +        {
> +            /* If the domain has shared pages, relinquish them allowing
> +             * for preemption. */

Similarly, if the code is to be moved, please correct style here at this
occasion.

> +            ret = relinquish_shared_pages(d);
> +            if ( ret )
> +                return ret;

While I can easily agree with the movement ahead of this being okay, ...

> +            /*
> +             * If the domain is forked, decrement the parent's pause count
> +             * and release the domain.
> +             */
> +            if ( mem_sharing_is_fork(d) )
> +            {
> +                struct domain *parent = d->parent;
> +
> +                d->parent = NULL;
> +                domain_unpause(parent);
> +                put_domain(parent);
> +            }

... I can only trust you on being sure that moving ahead of this is
okay, too.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.