[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH] x86/p2m-pt: do type recalculations with p2m read lock


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Mon, 3 Apr 2023 17:42:42 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=NFq82okmVOxlJdIrhlenhySNz73COmJ8btmugVasD6Q=; b=QiTbQtcQ8uILdiIkU7DgC71hPhl8skOam18jXFneve6EYA9+cbQeCbXfnW+GOJFcbi/Seb0CbGQezxFDJXDVCnTecECxbM1adyPg2bsgRb1aowQrExCDs7jonbPygeYK4ZbAD1mExvcS0DHiSMZQiH5o9yay49d7NALZ066AIOkbOrRChIKN9o12fSawRfqffi8j428bX8AHbsjf2hBRzwm6VeFCNYUf7ZN0hLBcZ5Xi62O2NtDg5alsImWO3dl5XrQCLA/BnFzFWvEk+ZuvF1C7cgaSXsJj+nAE4AvqyJQNudBxZDnYXyDeG68fhqBTNJSBlZdsmeQlTSjroVB2AQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=LJQQkdjP0C6CNvu1vXY5tc9Prfzh1TH2o33xtHDOrhsaaBajTxU5F82wsMJVu//yVJClo7njpoo1S2/UwFe8bqYxANvmIJAoOJkcWzPF4jU78YOs32Ew7pDGtCLiSj4GxyFFZJ56MpjbvLeBicPF6TL/pxz5HHKPLAOHLRdM0m7JD96lHrRiLQHpwruNLJJHx2jHp55+OY1ysHFA/94PFAzd5AjYFqlrq+bnKLHqNv2Mj3JITydirMstJp/kBydUZUQJLGnYtTYKuf0pPyt2K4qyC58t2RuHVEK6d0DNwxLBqEFvrlWqSQj/0ZPo3P4sa6OIPANl2jVrCw5z1xl5Sg==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Mon, 03 Apr 2023 15:43:05 +0000
  • Ironport-data: A9a23:9F6dDqOe06JvaVzvrR2DlsFynXyQoLVcMsEvi/4bfWQNrUoihTQGn DQcWDzVPv/fazTzKtFya962phxQ6J/czoMwGQto+SlhQUwRpJueD7x1DKtS0wC6dZSfER09v 63yTvGacajYm1eF/k/F3oDJ9CU6jufQAOKnUoYoAwgpLSd8UiAtlBl/rOAwh49skLCRDhiE/ Nj/uKUzAnf8s9JPGj9SuvLrRC9H5qyo42tE5gdmP5ingXeF/5UrJMNHTU2OByOQrrl8RoaSW +vFxbelyWLVlz9F5gSNy+uTnuUiG9Y+DCDW4pZkc/HKbitq/0Te5p0TJvsEAXq7vh3S9zxHJ HehgrTrIeshFvWkdO3wyHC0GQkmVUFN0OevzXRSLaV/ZqAJGpfh66wGMa04AWEX0rh5JTFr1 sY1ExJOXimBrfyY2pefa+Y506zPLOGzVG8ekldJ6GiASN0BGNXESaiM4sJE1jAtgMwIBezZe 8cSdTtoalLHfgFLPVAUTpk5mY9EhFGmK2Ee9A3T+PRxujeIpOBy+OGF3N79YNuFSN8Thk+Fj mnH4374ElcRM9n3JT+tqyrw27KSzHmlMG4UPIPh0fVP2kyW/Go4KE0Nf0qBn9v6r0HrDrqzL GRRoELCt5Ma9kamU938VB2Qu2Ofs1gXXN84O/I+wBGAzOzT+QnxLngJSHtNZcIrsOcyRCc2z RmZktXxHzttvbaJD3WH+d+pQSiaPCEUKSoJYnYCRA5cut37+ths1VTIU8ppF7OzgpvtAzbsz juWrS84wbIOkcoM0Kb99lfC696xmqX0oscOzl2/dgqYAslRPuZJu6TABYDn0Mt9
  • Ironport-hdrordr: A9a23:bxvKQqpWeu/ob/KOw+k0SfQaV5rveYIsimQD101hICG9Evb0qy nOpoV/6faQslwssR4b9uxoVJPvfZq+z+8W3WByB9eftWDd0QPFEGgL1+DfKlbbak7DH4BmtJ uJc8JFeafN5VoRt7eG3OFveexQvOVu88qT9JjjJ28Gd3APV0n5hT0JcjpyFCdNNW57LKt8Lr WwzOxdqQGtfHwGB/7LfUXsD4D41rv2fIuNW29+OyIa
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, Apr 03, 2023 at 05:30:37PM +0200, Jan Beulich wrote:
> On 03.04.2023 16:27, Roger Pau Monné wrote:
> > On Mon, Apr 03, 2023 at 02:39:08PM +0200, Jan Beulich wrote:
> >> On 03.04.2023 12:14, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/mm/p2m-pt.c
> >>> +++ b/xen/arch/x86/mm/p2m-pt.c
> >>> @@ -486,9 +486,6 @@ static int cf_check do_recalc(struct p2m_domain *p2m, 
> >>> unsigned long gfn)
> >>>          p2m_type_t ot, nt;
> >>>          unsigned long mask = ~0UL << (level * PAGETABLE_ORDER);
> >>>  
> >>> -        if ( !valid_recalc(l1, e) )
> >>> -            P2M_DEBUG("bogus recalc leaf at d%d:%lx:%u\n",
> >>> -                      p2m->domain->domain_id, gfn, level);
> >>>          ot = p2m_flags_to_type(l1e_get_flags(e));
> >>>          nt = p2m_recalc_type_range(true, ot, p2m, gfn & mask, gfn | 
> >>> ~mask);
> >>>          if ( nt != ot )
> >>
> >> I'm afraid I neither understand why you make this change, nor why you
> >> then leave the other use of valid_recalc() in place.
> > 
> > The message can be bogus if we allow concurrent do_recalc(), and I
> > did miss the previous one.
> > 
> > I missed the one at the top.  Originally I wanted to send the RFC with
> > just changing the lock to read mode, but then I though I might as
> > well fix that (now bogus) print message.
> > 
> >>> @@ -538,9 +535,9 @@ int p2m_pt_handle_deferred_changes(uint64_t gpa)
> >>>       */
> >>>      ASSERT(!altp2m_active(current->domain));
> >>>  
> >>> -    p2m_lock(p2m);
> >>> +    p2m_read_lock(p2m);
> >>>      rc = do_recalc(p2m, PFN_DOWN(gpa));
> >>> -    p2m_unlock(p2m);
> >>> +    p2m_read_unlock(p2m);
> >>>  
> >>>      return rc;
> >>>  }
> >>
> >> How can this be safe, when do_recalc() involves p2m_next_level(), which
> >> may install new (intermediate) page tables?
> > 
> > Oh, great, didn't realize it was capable of doing so, it's more hidden
> > than in the EPT case.  Seems like this will only happen if a superpage
> > needs to be split because a lower order frame is being used as an
> > ioreq server page.
> > 
> > Do you think it would be safe to try to attempt to perform the recalc
> > with the read lock only and fallback to the write lock if there's a
> > need to call p2m_next_level()?
> 
> Yes, that ought to be okay.
> 
> > Do you agree it might be possible to do the recalc with just the read
> > lock if it's updating of PTE type / recalc flags only?
> 
> Technically this looks to be possible, yes. Question is whether we do
> ourselves much good by introducing such a special case of permitting
> a certain kind of writes with the lock only held in read mode. The
> latest when we find a second (more or less similar) use case thing
> are likely to become difficult.

Yes, it's not very nice.  I'm open to suggestions of other ways to
remove some of the contention.  If we go down this route it would need
to be clearly documented.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.