Re: [PATCH v2] x86/svm: do not try to handle recalc NPT faults immediately
On 03/06/2020 12:48, Paul Durrant wrote:
>> -----Original Message-----
>> From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
>> Sent: 03 June 2020 12:45
>> To: paul@xxxxxxx; 'Jan Beulich' <jbeulich@xxxxxxxx>
>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; andrew.cooper3@xxxxxxxxxx; wl@xxxxxxx;
>> roger.pau@xxxxxxxxxx; george.dunlap@xxxxxxxxxx
>> Subject: Re: [PATCH v2] x86/svm: do not try to handle recalc NPT faults
>> immediately
>>
>> On 03/06/2020 12:28, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@xxxxxxxx>
>>>> Sent: 03 June 2020 12:22
>>>> To: paul@xxxxxxx
>>>> Cc: 'Igor Druzhinin' <igor.druzhinin@xxxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx;
>>>> andrew.cooper3@xxxxxxxxxx; wl@xxxxxxx; roger.pau@xxxxxxxxxx;
>>>> george.dunlap@xxxxxxxxxx
>>>> Subject: Re: [PATCH v2] x86/svm: do not try to handle recalc NPT faults
>>>> immediately
>>>>
>>>> On 03.06.2020 12:26, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>>> From: Jan Beulich <jbeulich@xxxxxxxx>
>>>>>> Sent: 03 June 2020 11:03
>>>>>> To: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
>>>>>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; andrew.cooper3@xxxxxxxxxx; wl@xxxxxxx;
>>>>>> roger.pau@xxxxxxxxxx; george.dunlap@xxxxxxxxxx; Paul Durrant <paul@xxxxxxx>
>>>>>> Subject: Re: [PATCH v2] x86/svm: do not try to handle recalc NPT faults
>>>>>> immediately
>>>>>>
>>>>>> On 02.06.2020 18:56, Igor Druzhinin wrote:
>>>>>>> A recalculation NPT fault doesn't always require additional handling
>>>>>>> in hvm_hap_nested_page_fault(); moreover, in the general case, if no
>>>>>>> explicit handling is done there, the fault is wrongly considered fatal.
>>>>>>>
>>>>>>> This covers a specific case of migration with a vGPU assigned on AMD:
>>>>>>> at the moment log-dirty is enabled globally, recalculation is requested
>>>>>>> for the whole of guest memory, including the directly mapped MMIO
>>>>>>> regions of the vGPU, which causes a page fault to be raised on the first
>>>>>>> access to those; but because the MMIO P2M type has no explicit handling
>>>>>>> in hvm_hap_nested_page_fault(), the domain is erroneously crashed with
>>>>>>> an unhandled SVM violation.
>>>>>>>
>>>>>>> Instead of trying to be opportunistic, use the safer approach and handle
>>>>>>> P2M recalculation in a separate NPT fault by attempting to retry after
>>>>>>> making the necessary adjustments. This is aligned with Intel behavior,
>>>>>>> where there are separate VMEXITs for recalculation and EPT violations
>>>>>>> (faults) and only faults are handled in hvm_hap_nested_page_fault().
>>>>>>> Do this by also unifying the do_recalc return code with the Intel
>>>>>>> implementation, where returning 1 means the P2M was actually changed.
>>>>>>>
>>>>>>> Since there was previously no case where p2m_pt_handle_deferred_changes()
>>>>>>> could return a positive value, it's safe to replace ">= 0" with "== 0"
>>>>>>> in the VMEXIT_NPF handler. finish_type_change() is also not affected by
>>>>>>> the change, as it is already able to deal with a >0 return value of
>>>>>>> p2m->recalc from the EPT implementation.
>>>>>>>
>>>>>>> Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>>>>>> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
>>>>>>
>>>>>> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
>>>>>> albeit preferably with ...
>>>>>>
>>>>>>> @@ -448,12 +451,14 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
>>>>>>>              clear_recalc(l1, e);
>>>>>>>              err = p2m->write_p2m_entry(p2m, gfn, pent, e, level + 1);
>>>>>>>              ASSERT(!err);
>>>>>>> +
>>>>>>> +            recalc_done = true;
>>>>>>>          }
>>>>>>>
>>>>>>>  out:
>>>>>>>      unmap_domain_page(table);
>>>>>>>
>>>>>>> -    return err;
>>>>>>> +    return err ?: (recalc_done ? 1 : 0);
>>>>>>
>>>>>> ... this shrunk to
>>>>>>
>>>>>>     return err ?: recalc_done;
>>>>>>
>>>>>> (easily doable while committing).
>>>>>>
>>>>>> Also Cc Paul.
>>>>>>
>>>>>
>>>>> paging_log_dirty_enable() still fails global enable if has_arch_pdevs()
>>>>> is true, so presumably there's no desperate need for this to go in 4.14?
>>>>
>>>> The MMIO case is just the particular situation here. Aiui the same issue
>>>> could potentially surface with other p2m types. Also, given that I'd
>>>> consider this a backporting candidate, while it may not be desperately
>>>> needed for the release, I think it deserves consideration beyond the
>>>> specific aspect you mention.
>>>>
>>>
>>> In which case I think the commit message probably ought to be rephrased,
>>> since it appears to focus on a case that cannot currently happen.
>>
>> This can happen without has_arch_pdevs() being true. It's enough to just
>> directly map some physical memory into a guest to get the p2m_mmio_direct
>> type present in the page tables.
>
> OK, that's fine, but when will that happen without pass-through? If we can
> have a commit message justifying the change based on what can actually
> happen now, then I would not be opposed to it going in 4.14.

Yes, it can happen now - we had regular (not SR-IOV) vGPU migration totally
broken because of this on AMD - it had never been tested before at all. You
don't need special patches on top of Xen or anything. To get p2m_mmio_direct
you just need to call XEN_DOMCTL_memory_mapping on a domain.

Igor