[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] BUG: using smp_processor_id() in preemptible [00000000] code: blkback.1.xvdb/9138 caller is decrease_reservation



Wednesday, September 11, 2013, 2:39:01 PM, you wrote:

> On 11/09/13 13:08, Stefano Stabellini wrote:
>> On Wed, 11 Sep 2013, David Vrabel wrote:
>>> On 10/09/13 19:13, Sander Eikelenboom wrote:
>>>> Hi Wei,
>>>> 
>>>> Just back from holiday, I tried running a new xen-unstable and
>>>> Linux kernel (Linus' current tree + Konrad's last pull request
>>>> merged on top). I saw a thread and a patch about a BUG_ON in
>>>> increase_reservation ... I'm seeing the splat below in dom0 when
>>>> guests get started.
>>> 
>>> Yes, the use of __get_cpu_var() in decrease_reservation() is not safe.
>>> 
>>> Stefano, what testing did you give "xen/balloon: set a mapping for
>>> ballooned out pages" (cd9151e2)?  The number of critical problems
>>> it's had suggests not a lot.
>>> 
>>> I'm also becoming less happy with the inconsistency in the p2m
>>> updates between the different (non-)auto_translated_physmap guest
>>> types.
>>> 
>>> I think it (and all the attempts to fix it) should be reverted at
>>> this stage, and we should try again for 3.13 with some more thorough
>>> testing and a more careful look at what updates to the p2m are
>>> needed.
>> 
>> Issues like this one are due to different kernel configurations /
>> usage patterns. To reproduce this issue one needs a preemptible
>> kernel and blkback. I use a non-preemptible kernel and QEMU as block
>> backend.
>> 
>> Granted, in this case I should have tested blkback with both
>> preemptible and non-preemptible kernel configurations.  But in
>> general it is nearly impossible to test all the possible
>> configurations beforehand; it is a classic case of combinatorial
>> explosion.
>> 
>> These are exactly the kind of issues that exposure to a wider
>> range of users/developers helps identify and fix.
>> 
>> So I think that we did the right thing here, by sending a pull
>> request early in the release cycle, so that now we have many other
>> RCs to fix all the issues that come up.

> That sounds fair.

> Sander, does the following patch fix this issue?

Hi David,

This patch indeed seems to fix it.

Thx,

Sander


> 8<---------------------------------------------------
> xen/balloon: ensure preemption is disabled when using a scratch page

> In decrease_reservation(), if the kernel is preempted between updating
> the mapping and updating the p2m then they may end up using different
> scratch pages.

> Use get_balloon_scratch_page() and put_balloon_scratch_page() which use
> get_cpu_var() and put_cpu_var() to correctly disable preemption.

> Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx>
> ---
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 3101cf6..a74647b 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -305,6 +305,18 @@ static enum bp_state reserve_additional_memory(long credit)
>  }
>  #endif /* CONFIG_XEN_BALLOON_MEMORY_HOTPLUG */
>  
> +struct page *get_balloon_scratch_page(void)
> +{
> +       struct page *ret = get_cpu_var(balloon_scratch_page);
> +       BUG_ON(ret == NULL);
> +       return ret;
> +}
> +
> +void put_balloon_scratch_page(void)
> +{
> +       put_cpu_var(balloon_scratch_page);
> +}
> +
>  static enum bp_state increase_reservation(unsigned long nr_pages)
>  {
>         int rc;
> @@ -380,6 +392,7 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>         enum bp_state state = BP_DONE;
>         unsigned long  pfn, i;
>         struct page   *page;
> +       struct page   *scratch_page;
>         int ret;
>         struct xen_memory_reservation reservation = {
>                 .address_bits = 0,
> @@ -399,6 +412,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>         if (nr_pages > ARRAY_SIZE(frame_list))
>                 nr_pages = ARRAY_SIZE(frame_list);
>  
> +       scratch_page = get_balloon_scratch_page();
> +
>         for (i = 0; i < nr_pages; i++) {
>                 page = alloc_page(gfp);
>                 if (page == NULL) {
> @@ -416,7 +431,7 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>                 if (xen_pv_domain() && !PageHighMem(page)) {
>                         ret = HYPERVISOR_update_va_mapping(
>                                 (unsigned long)__va(pfn << PAGE_SHIFT),
> -                               pfn_pte(page_to_pfn(__get_cpu_var(balloon_scratch_page)),
> +                               pfn_pte(page_to_pfn(scratch_page),
>                                         PAGE_KERNEL_RO), 0);
>                         BUG_ON(ret);
>                 }
> @@ -432,14 +447,14 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>                 pfn = mfn_to_pfn(frame_list[i]);
>                 if (!xen_feature(XENFEAT_auto_translated_physmap)) {
>                         unsigned long p;
> -                       struct page *pg;
> -                       pg = __get_cpu_var(balloon_scratch_page);
> -                       p = page_to_pfn(pg);
> +                       p = page_to_pfn(scratch_page);
>                         __set_phys_to_machine(pfn, pfn_to_mfn(p));
>                 }
>                 balloon_append(pfn_to_page(pfn));
>         }
>  
> +       put_balloon_scratch_page();
> +
>         set_xen_guest_handle(reservation.extent_start, frame_list);
>         reservation.nr_extents   = nr_pages;
>         ret = HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
> @@ -491,18 +506,6 @@ static void balloon_process(struct work_struct *work)
>         mutex_unlock(&balloon_mutex);
>  }
>  
> -struct page *get_balloon_scratch_page(void)
> -{
> -       struct page *ret = get_cpu_var(balloon_scratch_page);
> -       BUG_ON(ret == NULL);
> -       return ret;
> -}
> -
> -void put_balloon_scratch_page(void)
> -{
> -       put_cpu_var(balloon_scratch_page);
> -}
> -
>  /* Resets the Xen limit, sets new target, and kicks off processing. */
>  void balloon_set_new_target(unsigned long target)
>  {



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

