[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] RE: [PATCH v2] x86/svm: do not try to handle recalc NPT faults immediately
> -----Original Message----- > From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> > Sent: 03 June 2020 12:45 > To: paul@xxxxxxx; 'Jan Beulich' <jbeulich@xxxxxxxx> > Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; andrew.cooper3@xxxxxxxxxx; wl@xxxxxxx; > roger.pau@xxxxxxxxxx; > george.dunlap@xxxxxxxxxx > Subject: Re: [PATCH v2] x86/svm: do not try to handle recalc NPT faults > immediately > > On 03/06/2020 12:28, Paul Durrant wrote: > >> -----Original Message----- > >> From: Jan Beulich <jbeulich@xxxxxxxx> > >> Sent: 03 June 2020 12:22 > >> To: paul@xxxxxxx > >> Cc: 'Igor Druzhinin' <igor.druzhinin@xxxxxxxxxx>; > >> xen-devel@xxxxxxxxxxxxxxxxxxxx; > >> andrew.cooper3@xxxxxxxxxx; wl@xxxxxxx; roger.pau@xxxxxxxxxx; > >> george.dunlap@xxxxxxxxxx > >> Subject: Re: [PATCH v2] x86/svm: do not try to handle recalc NPT faults > >> immediately > >> > >> On 03.06.2020 12:26, Paul Durrant wrote: > >>>> -----Original Message----- > >>>> From: Jan Beulich <jbeulich@xxxxxxxx> > >>>> Sent: 03 June 2020 11:03 > >>>> To: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> > >>>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; andrew.cooper3@xxxxxxxxxx; > >>>> wl@xxxxxxx; roger.pau@xxxxxxxxxx; > >>>> george.dunlap@xxxxxxxxxx; Paul Durrant <paul@xxxxxxx> > >>>> Subject: Re: [PATCH v2] x86/svm: do not try to handle recalc NPT faults > >>>> immediately > >>>> > >>>> On 02.06.2020 18:56, Igor Druzhinin wrote: > >>>>> A recalculation NPT fault doesn't always require additional handling > >>>>> in hvm_hap_nested_page_fault(), moreover in general case if there is no > >>>>> explicit handling done there - the fault is wrongly considered fatal. > >>>>> > >>>>> This covers a specific case of migration with vGPU assigned on AMD: > >>>>> at a moment log-dirty is enabled globally, recalculation is requested > >>>>> for the whole guest memory including directly mapped MMIO regions of > >>>>> vGPU > >>>>> which causes a page fault being raised at the first access to those; > >>>>> but due to MMIO P2M type not having any explicit handling in > >>>>> hvm_hap_nested_page_fault() a domain is erroneously crashed with > >>>>> unhandled > >>>>> SVM violation. > >>>>> > >>>>> Instead of trying to be opportunistic - use safer approach and handle > >>>>> P2M recalculation in a separate NPT fault by attempting to retry after > >>>>> making the necessary adjustments. This is aligned with Intel behavior > >>>>> where there are separate VMEXITs for recalculation and EPT violations > >>>>> (faults) and only faults are handled in hvm_hap_nested_page_fault(). > >>>>> Do it by also unifying do_recalc return code with Intel implementation > >>>>> where returning 1 means P2M was actually changed. > >>>>> > >>>>> Since there was no case previously where > >>>>> p2m_pt_handle_deferred_changes() > >>>>> could return a positive value - it's safe to replace ">= 0" with just > >>>>> "== 0" > >>>>> in VMEXIT_NPF handler. finish_type_change() is also not affected by the > >>>>> change as being able to deal with >0 return value of p2m->recalc from > >>>>> EPT implementation. > >>>>> > >>>>> Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > >>>>> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> > >>>> > >>>> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> > >>>> albeit preferably with ... > >>>> > >>>>> @@ -448,12 +451,14 @@ static int do_recalc(struct p2m_domain *p2m, > >>>>> unsigned long gfn) > >>>>> clear_recalc(l1, e); > >>>>> err = p2m->write_p2m_entry(p2m, gfn, pent, e, level + 1); > >>>>> ASSERT(!err); > >>>>> + > >>>>> + recalc_done = true; > >>>>> } > >>>>> > >>>>> out: > >>>>> unmap_domain_page(table); > >>>>> > >>>>> - return err; > >>>>> + return err ?: (recalc_done ? 1 : 0); > >>>> > >>>> ... this shrunk to > >>>> > >>>> return err ?: recalc_done; > >>>> > >>>> (easily doable while committing). > >>>> > >>>> Also Cc Paul. > >>>> > >>> > >>> paging_log_dirty_enable() still fails global enable if has_arch_pdevs() > >>> is true, so presumably there's no desperate need for this to go in 4.14? > >> > >> The MMIO case is just the particular situation here. Aiui the same issue > >> could potentially surface with other p2m types. Also given I'd consider > >> this a backporting candidate, while it may not be desperately needed for > >> the release, I think it deserves considering beyond the specific aspect > >> you mention. > >> > > > > In which case I think the commit message probably ought to be rephrased, > > since it appears to focus > on a case that cannot currently happen. > > This can happen without has_arch_pdevs() being true. It's enough to just > directly map some physical memory into a guest to get p2m_direct_mmio > type present in the page tables. OK, that's fine, but when will that happen without pass-through? If we can have a commit message justifying the change based on what can actually happen now, then I would not be opposed to it going in 4.14. Paul > > Igor
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |