Re: [Xen-devel] BUG: using smp_processor_id() in preemptible [00000000] code: blkback.1.xvdb/9138 caller is decrease_reservation
Wednesday, September 11, 2013, 2:39:01 PM, you wrote:

> On 11/09/13 13:08, Stefano Stabellini wrote:
>> On Wed, 11 Sep 2013, David Vrabel wrote:
>>> On 10/09/13 19:13, Sander Eikelenboom wrote:
>>>> Hi Wei,
>>>>
>>>> Just back from holiday I tried running a new xen-unstable and Linux
>>>> kernel (current Linus' tree + Konrad's last pull request merged on
>>>> top). I saw a thread and patch about a BUG_ON in
>>>> increase_reservation ... I'm seeing the splat below in dom0 when
>>>> guests get started.
>>>
>>> Yes, the use of __get_cpu_var() in decrease_reservation is not safe.
>>>
>>> Stefano, what testing did you give "xen/balloon: set a mapping for
>>> ballooned out pages" (cd9151e2)? The number of critical problems
>>> it's had suggests not a lot?
>>>
>>> I'm also becoming less happy with the inconsistency in the p2m
>>> updates between the different (non-)auto_translated_physmap guest
>>> types.
>>>
>>> I think it (and all the attempts to fix it) should be reverted at
>>> this stage and we should try again for 3.13, with some more thorough
>>> testing and a more careful look at what updates to the p2m are
>>> needed.
>>
>> Issues like this one are due to different kernel configurations /
>> usage patterns. To reproduce this issue one needs a preemptible
>> kernel and blkback. I use a non-preemptible kernel and QEMU as block
>> backend.
>>
>> Granted, in this case I should have tested blkback and both
>> preemptible and non-preemptible kernel configurations. But in
>> general it is nearly impossible to test all the possible
>> configurations beforehand; it is a classic case of combinatorial
>> explosion.
>>
>> These are exactly the kind of things that exposure to a wider range
>> of users/developers helps identify and fix.
>>
>> So I think that we did the right thing here, by sending a pull
>> request early in the release cycle, so that now we have many other
>> RCs to fix all the issues that come up.

> That sounds fair.

> Sander, does the following patch fix this issue?

Hi David,

This patch indeed seems to fix it.

Thx,

Sander

> 8<---------------------------------------------------
> xen/balloon: ensure preemption is disabled when using a scratch page
>
> In decrease_reservation(), if the kernel is preempted between updating
> the mapping and updating the p2m then they may end up using different
> scratch pages.
>
> Use get_balloon_scratch_page() and put_balloon_scratch_page(), which
> use get_cpu_var() and put_cpu_var() to correctly disable preemption.
> Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx>
> ---
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 3101cf6..a74647b 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -305,6 +305,18 @@ static enum bp_state reserve_additional_memory(long credit)
>  }
>  #endif /* CONFIG_XEN_BALLOON_MEMORY_HOTPLUG */
>
> +struct page *get_balloon_scratch_page(void)
> +{
> +	struct page *ret = get_cpu_var(balloon_scratch_page);
> +	BUG_ON(ret == NULL);
> +	return ret;
> +}
> +
> +void put_balloon_scratch_page(void)
> +{
> +	put_cpu_var(balloon_scratch_page);
> +}
> +
>  static enum bp_state increase_reservation(unsigned long nr_pages)
>  {
>  	int rc;
> @@ -380,6 +392,7 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>  	enum bp_state state = BP_DONE;
>  	unsigned long pfn, i;
>  	struct page *page;
> +	struct page *scratch_page;
>  	int ret;
>  	struct xen_memory_reservation reservation = {
>  		.address_bits = 0,
> @@ -399,6 +412,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>  	if (nr_pages > ARRAY_SIZE(frame_list))
>  		nr_pages = ARRAY_SIZE(frame_list);
>
> +	scratch_page = get_balloon_scratch_page();
> +
>  	for (i = 0; i < nr_pages; i++) {
>  		page = alloc_page(gfp);
>  		if (page == NULL) {
> @@ -416,7 +431,7 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>  		if (xen_pv_domain() && !PageHighMem(page)) {
>  			ret = HYPERVISOR_update_va_mapping(
>  				(unsigned long)__va(pfn << PAGE_SHIFT),
> -				pfn_pte(page_to_pfn(__get_cpu_var(balloon_scratch_page)),
> +				pfn_pte(page_to_pfn(scratch_page),
>  					PAGE_KERNEL_RO), 0);
>  			BUG_ON(ret);
>  		}
> @@ -432,14 +447,14 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>  		pfn = mfn_to_pfn(frame_list[i]);
>  		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
>  			unsigned long p;
> -			struct page *pg;
> -			pg = __get_cpu_var(balloon_scratch_page);
> -			p = page_to_pfn(pg);
> +			p = page_to_pfn(scratch_page);
>  			__set_phys_to_machine(pfn, pfn_to_mfn(p));
>  		}
>  		balloon_append(pfn_to_page(pfn));
>  	}
>
> +	put_balloon_scratch_page();
> +
>  	set_xen_guest_handle(reservation.extent_start, frame_list);
>  	reservation.nr_extents = nr_pages;
>  	ret = HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
> @@ -491,18 +506,6 @@ static void balloon_process(struct work_struct *work)
>  	mutex_unlock(&balloon_mutex);
>  }
>
> -struct page *get_balloon_scratch_page(void)
> -{
> -	struct page *ret = get_cpu_var(balloon_scratch_page);
> -	BUG_ON(ret == NULL);
> -	return ret;
> -}
> -
> -void put_balloon_scratch_page(void)
> -{
> -	put_cpu_var(balloon_scratch_page);
> -}
> -
>  /* Resets the Xen limit, sets new target, and kicks off processing. */
>  void balloon_set_new_target(unsigned long target)
>  {

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
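
For reference, a minimal sketch of the per-CPU access pattern the patch
relies on. It is not taken from balloon.c itself: example_scratch_page and
example_use() are hypothetical names for illustration only, while
DEFINE_PER_CPU, get_cpu_var() and put_cpu_var() are the real kernel
primitives.

    /*
     * Minimal sketch of the pattern used by get_balloon_scratch_page() /
     * put_balloon_scratch_page() in the patch above.  The per-CPU variable
     * and function names here are hypothetical.
     */
    #include <linux/percpu.h>
    #include <linux/mm.h>

    static DEFINE_PER_CPU(struct page *, example_scratch_page);

    static void example_use(void)
    {
            struct page *scratch;

            /*
             * get_cpu_var() disables preemption before returning this CPU's
             * slot, so the task cannot be migrated to another CPU (and hence
             * to a different scratch page) while the page is in use.
             */
            scratch = get_cpu_var(example_scratch_page);

            /* ... use 'scratch' for both the VA mapping and the p2m update ... */

            /* put_cpu_var() re-enables preemption. */
            put_cpu_var(example_scratch_page);
    }

By contrast, __get_cpu_var() reads the current CPU's slot without disabling
preemption, so with preemption debugging enabled it trips the "using
smp_processor_id() in preemptible code" check seen in the subject, and a
preempted task can be migrated and end up mixing two different CPUs' scratch
pages, which is exactly the failure described in the commit message above.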