Xen project Mailing List

Re: [Xen-devel] [PATCH] x86/PV: fix unintended dependency of m2p-strict mode on migration-v2

From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Mon, 1 Feb 2016 14:07:48 +0000

Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir@xxxxxxx>

Delivery-date: Mon, 01 Feb 2016 14:08:31 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 01/02/16 13:20, Jan Beulich wrote:
> Ping? (I'd really like to get this resolved, so we don't need to
> indefinitely run with non-upstream behavior in our distros.)
>
> Thanks, Jan

My remaining issue is whether this loop gets executed by default.

I realise that there is a difference between legacy and v2 migration,
and that v2 migration by default worked. If that means we managed to
skip this loop in its entirety for v2, then I am far less concerned
about the overhead.

~Andrew

>
>>>> On 13.01.16 at 17:15, <JBeulich@xxxxxxxx> wrote:
>>>>> On 13.01.16 at 17:00, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 13/01/16 15:36, Jan Beulich wrote:
>>>>>>> On 13.01.16 at 16:25, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>> On 12/01/16 15:19, Jan Beulich wrote:
>>>>>>>>> On 12.01.16 at 12:55, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>>>> On 12/01/16 10:08, Jan Beulich wrote:
>>>>>>>> This went unnoticed until a backport of this to an older Xen got used,
>>>>>>>> causing migration of guests enabling this VM assist to fail, because
>>>>>>>> page table pinning there precedes vCPU context loading, and hence L4
>>>>>>>> tables get initialized for the wrong mode. Fix this by post-processing
>>>>>>>> L4 tables when setting the intended VM assist flags for the guest.
>>>>>>>>
>>>>>>>> Note that this leaves in place a dependency on vCPU 0 getting its guest
>>>>>>>> context restored first, but afaict the logic here is not the only thing
>>>>>>>> depending on that.
>>>>>>>>
>>>>>>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>>>>>>>>
>>>>>>>> --- a/xen/arch/x86/domain.c
>>>>>>>> +++ b/xen/arch/x86/domain.c
>>>>>>>> @@ -1067,8 +1067,48 @@ int arch_set_info_guest(
>>>>>>>>          goto out;
>>>>>>>>
>>>>>>>>      if ( v->vcpu_id == 0 )
>>>>>>>> +    {
>>>>>>>>          d->vm_assist = c(vm_assist);
>>>>>>>>
>>>>>>>> +        /*
>>>>>>>> +         * In the restore case we need to deal with L4 pages which got
>>>>>>>> +         * initialized with m2p_strict still clear (and which hence lack the
>>>>>>>> +         * correct initial RO_MPT_VIRT_{START,END} L4 entry).
>>>>>>>> +         */
>>>>>>>> +        if ( d != current->domain && VM_ASSIST(d, m2p_strict) &&
>>>>>>>> +             is_pv_domain(d) && !is_pv_32bit_domain(d) &&
>>>>>>>> +             atomic_read(&d->arch.pv_domain.nr_l4_pages) )
>>>>>>>> +        {
>>>>>>>> +            bool_t done = 0;
>>>>>>>> +
>>>>>>>> +            spin_lock_recursive(&d->page_alloc_lock);
>>>>>>>> +
>>>>>>>> +            for ( i = 0; ; )
>>>>>>>> +            {
>>>>>>>> +                struct page_info *page = page_list_remove_head(&d->page_list);
>>>>>>>> +
>>>>>>>> +                if ( page_lock(page) )
>>>>>>>> +                {
>>>>>>>> +                    if ( (page->u.inuse.type_info & PGT_type_mask) ==
>>>>>>>> +                         PGT_l4_page_table )
>>>>>>>> +                        done = !fill_ro_mpt(page_to_mfn(page));
>>>>>>>> +
>>>>>>>> +                    page_unlock(page);
>>>>>>>> +                }
>>>>>>>> +
>>>>>>>> +                page_list_add_tail(page, &d->page_list);
>>>>>>>> +
>>>>>>>> +                if ( done || (!(++i & 0xff) && hypercall_preempt_check()) )
>>>>>>>> +                    break;
>>>>>>>> +            }
>>>>>>>> +
>>>>>>>> +            spin_unlock_recursive(&d->page_alloc_lock);
>>>>>>>> +
>>>>>>>> +            if ( !done )
>>>>>>>> +                return -ERESTART;
>>>>>>> This is a long loop. It is preemptible, but will incur a time delay
>>>>>>> proportional to the size of the domain during the VM downtime.
>>>>>>>
>>>>>>> Could you defer the loop until after %cr3 has been set up, and only
>>>>>>> enter the loop if the kernel l4 table is missing the RO mappings? That
>>>>>>> way, domains migrated with migration v2 will skip the loop entirely.
>>>>>> Well, first of all this would be the result only as long as you or
>>>>>> someone else don't re-think and possibly move pinning ahead of
>>>>>> context load again.
>>>>> A second set_context() will unconditionally hit the loop though.
>>>> Right - another argument against making any change to what is
>>>> in the patch right now.
>>> If there are any L4 pages, the current code will unconditionally search
>>> the pagelist on every entry to the function, even when it has already
>>> fixed up the strictness.
>>>
>>> A toolstack can enter this function multiple times for the same vcpu,
>>> by resetting the vcpu state in between. How much do we care about this
>>> usage?
>> If we cared at all, we'd need to insert another similar piece of
>> code in the reset path (moving L4s back to m2p-relaxed mode).
>>
>> Jan
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxx
>> http://lists.xen.org/xen-devel
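For readers following the overhead discussion, here is a minimal user-space sketch of the rotate-and-check loop from the quoted patch. It is not Xen code: every `sketch_*` name, the `SKETCH_ERESTART` value, and the `preempt_pending` flag (standing in for hypercall_preempt_check()) are hypothetical, locking is omitted, and it assumes the fill helper reports whether it actually had to write the entry, which is what the patch's `done = !fill_ro_mpt(...)` appears to rely on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_ERESTART 85 /* placeholder errno-style value, not taken from Xen */

enum sketch_page_type { SKETCH_PAGE_OTHER, SKETCH_PAGE_L4 };

struct sketch_page {
    enum sketch_page_type type;
    bool has_ro_mpt;            /* models "RO_MPT L4 entry is present" */
    struct sketch_page *next;
};

struct sketch_domain {
    struct sketch_page *head, *tail;
    unsigned int nr_l4_pages;
    bool preempt_pending;       /* stand-in for hypercall_preempt_check() */
};

static struct sketch_page *sketch_list_remove_head(struct sketch_domain *d)
{
    struct sketch_page *pg = d->head;

    if (pg != NULL) {
        d->head = pg->next;
        if (d->head == NULL)
            d->tail = NULL;
        pg->next = NULL;
    }
    return pg;
}

static void sketch_list_add_tail(struct sketch_domain *d, struct sketch_page *pg)
{
    pg->next = NULL;
    if (d->tail != NULL)
        d->tail->next = pg;
    else
        d->head = pg;
    d->tail = pg;
}

/* Assumed semantics: returns true only if the entry had to be written. */
static bool sketch_fill_ro_mpt(struct sketch_page *pg)
{
    if (pg->has_ro_mpt)
        return false;
    pg->has_ro_mpt = true;
    return true;
}

/*
 * Rotate pages from head to tail, fixing up L4 pages on the way.
 * Returns 0 once an L4 page is met that already has the entry (the
 * rotation has wrapped around), or -SKETCH_ERESTART when the periodic
 * preemption check fires first and the caller has to re-enter.
 */
int sketch_fixup_l4(struct sketch_domain *d)
{
    bool done = false;
    unsigned int i;

    if (d->nr_l4_pages == 0)
        return 0;

    for (i = 0; ; ) {
        struct sketch_page *pg = sketch_list_remove_head(d);

        assert(pg != NULL);
        if (pg->type == SKETCH_PAGE_L4)
            done = !sketch_fill_ro_mpt(pg);
        sketch_list_add_tail(d, pg);

        /* Only consult the preemption check every 256 iterations. */
        if (done || (!(++i & 0xff) && d->preempt_pending))
            break;
    }
    return done ? 0 : -SKETCH_ERESTART;
}
```

In this model a repeated call on an already fixed-up list still walks pages until it reaches the first L4 page, which is the per-entry cost raised in the thread above.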
