[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] RFC: mem_event: use wait queue when ring is full


  • To: "Olaf Hering" <olaf@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: "Andres Lagar-Cavilla" <andres@xxxxxxxxxxxxxxxx>
  • Date: Tue, 22 Nov 2011 20:58:06 -0800
  • Cc: adin@xxxxxxxxxxxxxx
  • Delivery-date: Wed, 23 Nov 2011 04:59:12 +0000
  • Domainkey-signature: a=rsa-sha1; c=nofws; d=lagarcavilla.org; h=message-id :in-reply-to:references:date:subject:from:to:cc:reply-to :mime-version:content-type:content-transfer-encoding; q=dns; s= lagarcavilla.org; b=GD051V78W5Uql2sYQEq9RhW9d5oXbUp/OblndD8oSDLG zbVjJoV9R39QVvl4tzW3PWpHnmPoPSHqjcPvqwXEDK0+r5jEPnstl+6unWeaT75m dm3H0CDnkUXb0wjUMr6rxqKkXiHrUWOo3jfvocFCRmqPBwPM3tBSRLomyV69Gz0=
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Olaf, two questions here

- do you have any insight for events caused by foreign mappings? Those
will be lost with a full ring, with or without wait queues

- we have posted a patch (twice) previously, with changes to ring
management, most importantly sending guest vcpus to sleep when space in
the ring is < d->max_vcpus. I see these two patches as complementary. What
is your take?

Thanks,
Andres

> Date: Tue, 22 Nov 2011 22:13:24 +0100
> From: Olaf Hering <olaf@xxxxxxxxx>
> To: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: [Xen-devel] [PATCH 2 of 3] RFC: mem_event: use wait queue
>       when ring       is full
> Message-ID: <de6860cb9205b68d1287.1321996404@xxxxxxxxxxxx>
> Content-Type: text/plain; charset="us-ascii"
>
> # HG changeset patch
> # User Olaf Hering <olaf@xxxxxxxxx>
> # Date 1321996145 -3600
> # Node ID de6860cb9205b68d1287482288d1b7b9d0255609
> # Parent  d347a8a36d2e7951f98a3d22866dce004484d95f
> RFC: mem_event: use wait queue when ring is full
>
> CAUTION: while the patch by itself is supposed to be complete,
> the added usage of waitqueues will lead to sudden host reboots!
>
>
> This change is based on an idea/patch from Adin Scannell.
>
> If the ring is full, put the current vcpu to sleep if it belongs to the
> target domain. The wakeup happens in the p2m_*_resume functions.
> A request from another domain will use the existing ->req_producers
> functionality because sleeping is not possible if the vcpu does not
> belong to the target domain.
>
> This change fixes also a bug in p2m_mem_paging_drop_page(). Up to now a
> full ring will lead to harmless inconsistency in the pager,
>
> v2:
>  - p2m_mem_paging_populate: move vcpu_pause after put_request, otherwise
> the
>    vcpu will not wake_up after a wait_event because the pause_count was
>    increased twice. Fixes guest hangs.
>  - update free space check in _mem_event_put_request()
>  - simplify mem_event_put_request()
>
> Signed-off-by: Olaf Hering <olaf@xxxxxxxxx>
>
> diff -r d347a8a36d2e -r de6860cb9205 xen/arch/x86/hvm/hvm.c
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4099,7 +4099,7 @@ static int hvm_memory_event_traps(long p
>          return 1;
>
>      rc = mem_event_check_ring(d, &d->mem_event->mem_access);
> -    if ( rc )
> +    if ( rc < 0 )
>          return rc;
>
>      memset(&req, 0, sizeof(req));
> diff -r d347a8a36d2e -r de6860cb9205 xen/arch/x86/mm/mem_event.c
> --- a/xen/arch/x86/mm/mem_event.c
> +++ b/xen/arch/x86/mm/mem_event.c
> @@ -23,6 +23,7 @@
>
>  #include <asm/domain.h>
>  #include <xen/event.h>
> +#include <xen/wait.h>
>  #include <asm/p2m.h>
>  #include <asm/mem_event.h>
>  #include <asm/mem_paging.h>
> @@ -39,6 +40,7 @@
>
>  static int mem_event_enable(struct domain *d,
>                              xen_domctl_mem_event_op_t *mec,
> +                            int mem_event_bit,
>                              struct mem_event_domain *med)
>  {
>      int rc;
> @@ -107,8 +109,12 @@ static int mem_event_enable(struct domai
>
>      mem_event_ring_lock_init(med);
>
> +    med->mem_event_bit = mem_event_bit;
> +
> +    init_waitqueue_head(&med->wq);
> +
>      /* Wake any VCPUs paused for memory events */
> -    mem_event_unpause_vcpus(d);
> +    mem_event_unpause_vcpus(d, med);
>
>      return 0;
>
> @@ -124,6 +130,9 @@ static int mem_event_enable(struct domai
>
>  static int mem_event_disable(struct mem_event_domain *med)
>  {
> +    if (!list_empty(&med->wq.list))
> +        return -EBUSY;
> +
>      unmap_domain_page(med->ring_page);
>      med->ring_page = NULL;
>
> @@ -133,13 +142,21 @@ static int mem_event_disable(struct mem_
>      return 0;
>  }
>
> -void mem_event_put_request(struct domain *d, struct mem_event_domain
> *med, mem_event_request_t *req)
> +static int _mem_event_put_request(struct domain *d, struct
> mem_event_domain *med, mem_event_request_t *req)
>  {
>      mem_event_front_ring_t *front_ring;
> +    int free_requests;
>      RING_IDX req_prod;
>
>      mem_event_ring_lock(med);
>
> +    free_requests = RING_FREE_REQUESTS(&med->front_ring);
> +    /* A request was eventually claimed in mem_event_check_ring() */
> +    if ((current->domain == d && free_requests <= med->req_producers) ||
> !free_requests) {
> +        mem_event_ring_unlock(med);
> +        return 0;
> +    }
> +
>      front_ring = &med->front_ring;
>      req_prod = front_ring->req_prod_pvt;
>
> @@ -155,6 +172,20 @@ void mem_event_put_request(struct domain
>      mem_event_ring_unlock(med);
>
>      notify_via_xen_event_channel(d, med->xen_port);
> +
> +    return 1;
> +}
> +
> +void mem_event_put_request(struct domain *d, struct mem_event_domain
> *med, mem_event_request_t *req)
> +{
> +    /* Go to sleep if request came from guest */
> +    if (current->domain == d) {
> +        wait_event(med->wq, _mem_event_put_request(d, med, req));
> +        return;
> +    }
> +    /* Ring was full anyway, unable to sleep in non-guest context */
> +    if (!_mem_event_put_request(d, med, req))
> +        printk("Failed to put memreq: d %u t %x f %x gfn %lx\n",
> d->domain_id, req->type, req->flags, (unsigned long)req->gfn);
>  }
>
>  void mem_event_get_response(struct mem_event_domain *med,
> mem_event_response_t *rsp)
> @@ -178,21 +209,27 @@ void mem_event_get_response(struct mem_e
>      mem_event_ring_unlock(med);
>  }
>
> -void mem_event_unpause_vcpus(struct domain *d)
> +void mem_event_unpause_vcpus(struct domain *d, struct mem_event_domain
> *med)
>  {
>      struct vcpu *v;
>
>      for_each_vcpu ( d, v )
> -        if ( test_and_clear_bit(_VPF_mem_event, &v->pause_flags) )
> +        if ( test_and_clear_bit(med->mem_event_bit, &v->pause_flags) )
>              vcpu_wake(v);
>  }
>
> -void mem_event_mark_and_pause(struct vcpu *v)
> +void mem_event_mark_and_pause(struct vcpu *v, struct mem_event_domain
> *med)
>  {
> -    set_bit(_VPF_mem_event, &v->pause_flags);
> +    set_bit(med->mem_event_bit, &v->pause_flags);
>      vcpu_sleep_nosync(v);
>  }
>
> +/**
> + * mem_event_put_req_producers - Release a claimed slot
> + * @med: mem_event ring
> + *
> + * mem_event_put_req_producers() releases a claimed slot in the mem_event
> ring.
> + */
>  void mem_event_put_req_producers(struct mem_event_domain *med)
>  {
>      mem_event_ring_lock(med);
> @@ -200,9 +237,26 @@ void mem_event_put_req_producers(struct
>      mem_event_ring_unlock(med);
>  }
>
> +/**
> + * mem_event_check_ring - Check state of a mem_event ring
> + * @d: guest domain
> + * @med: mem_event ring
> + *
> + * Return codes: < 0: the ring is not yet configured
> + *                 0: the ring has some room
> + *               > 0: the ring is full
> + *
> + * mem_event_check_ring() checks the state of the given mem_event ring.
> + * If the current vcpu belongs to the guest domain, the function assumes
> that
> + * mem_event_put_request() will sleep until the ring has room again.
> + *
> + * If the current vcpu does not belong to the target domain the caller
> must try
> + * again until there is room. A slot is claimed and the caller can place
> a
> + * request. If the caller does not need to send a request, the claimed
> slot has
> + * to be released with mem_event_put_req_producers().
> + */
>  int mem_event_check_ring(struct domain *d, struct mem_event_domain *med)
>  {
> -    struct vcpu *curr = current;
>      int free_requests;
>      int ring_full = 1;
>
> @@ -218,8 +272,8 @@ int mem_event_check_ring(struct domain *
>          ring_full = 0;
>      }
>
> -    if ( ring_full && (curr->domain == d) )
> -        mem_event_mark_and_pause(curr);
> +    if ( current->domain == d )
> +        ring_full = 0;
>
>      mem_event_ring_unlock(med);
>
> @@ -287,7 +341,7 @@ int mem_event_domctl(struct domain *d, x
>              if ( p2m->pod.entry_count )
>                  break;
>
> -            rc = mem_event_enable(d, mec, med);
> +            rc = mem_event_enable(d, mec, _VPF_me_mem_paging, med);
>          }
>          break;
>
> @@ -326,7 +380,7 @@ int mem_event_domctl(struct domain *d, x
>              if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
>                  break;
>
> -            rc = mem_event_enable(d, mec, med);
> +            rc = mem_event_enable(d, mec, _VPF_me_mem_access, med);
>          }
>          break;
>
> diff -r d347a8a36d2e -r de6860cb9205 xen/arch/x86/mm/mem_sharing.c
> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -253,19 +253,11 @@ static void mem_sharing_audit(void)
>  #endif
>
>
> -static struct page_info* mem_sharing_alloc_page(struct domain *d,
> -                                                unsigned long gfn)
> +static void mem_sharing_notify_helper(struct domain *d, unsigned long
> gfn)
>  {
> -    struct page_info* page;
>      struct vcpu *v = current;
>      mem_event_request_t req;
>
> -    page = alloc_domheap_page(d, 0);
> -    if(page != NULL) return page;
> -
> -    memset(&req, 0, sizeof(req));
> -    req.type = MEM_EVENT_TYPE_SHARED;
> -
>      if ( v->domain != d )
>      {
>          /* XXX This path needs some attention.  For now, just fail
> foreign
> @@ -275,20 +267,20 @@ static struct page_info* mem_sharing_all
>          gdprintk(XENLOG_ERR,
>                   "Failed alloc on unshare path for foreign (%d)
> lookup\n",
>                   d->domain_id);
> -        return page;
> +        return;
>      }
>
> -    vcpu_pause_nosync(v);
> -    req.flags |= MEM_EVENT_FLAG_VCPU_PAUSED;
> +    if (mem_event_check_ring(d, &d->mem_event->mem_share) < 0)
> +        return;
>
> -    if(mem_event_check_ring(d, &d->mem_event->mem_share)) return page;
> -
> +    memset(&req, 0, sizeof(req));
> +    req.type = MEM_EVENT_TYPE_SHARED;
> +    req.flags = MEM_EVENT_FLAG_VCPU_PAUSED;
>      req.gfn = gfn;
>      req.p2mt = p2m_ram_shared;
>      req.vcpu_id = v->vcpu_id;
>      mem_event_put_request(d, &d->mem_event->mem_share, &req);
> -
> -    return page;
> +    vcpu_pause_nosync(v);
>  }
>
>  unsigned int mem_sharing_get_nr_saved_mfns(void)
> @@ -653,7 +645,7 @@ gfn_found:
>      if(ret == 0) goto private_page_found;
>
>      old_page = page;
> -    page = mem_sharing_alloc_page(d, gfn);
> +    page = alloc_domheap_page(d, 0);
>      if(!page)
>      {
>          /* We've failed to obtain memory for private page. Need to re-add
> the
> @@ -661,6 +653,7 @@ gfn_found:
>          list_add(&gfn_info->list, &hash_entry->gfns);
>          put_gfn(d, gfn);
>          shr_unlock();
> +        mem_sharing_notify_helper(d, gfn);
>          return -ENOMEM;
>      }
>
> diff -r d347a8a36d2e -r de6860cb9205 xen/arch/x86/mm/p2m.c
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -884,17 +884,13 @@ void p2m_mem_paging_drop_page(struct dom
>      struct vcpu *v = current;
>      mem_event_request_t req;
>
> -    /* Check that there's space on the ring for this request */
> -    if ( mem_event_check_ring(d, &d->mem_event->mem_paging) == 0)
> -    {
> -        /* Send release notification to pager */
> -        memset(&req, 0, sizeof(req));
> -        req.flags |= MEM_EVENT_FLAG_DROP_PAGE;
> -        req.gfn = gfn;
> -        req.vcpu_id = v->vcpu_id;
> +    /* Send release notification to pager */
> +    memset(&req, 0, sizeof(req));
> +    req.flags = MEM_EVENT_FLAG_DROP_PAGE;
> +    req.gfn = gfn;
> +    req.vcpu_id = v->vcpu_id;
>
> -        mem_event_put_request(d, &d->mem_event->mem_paging, &req);
> -    }
> +    mem_event_put_request(d, &d->mem_event->mem_paging, &req);
>  }
>
>  /**
> @@ -952,7 +948,6 @@ void p2m_mem_paging_populate(struct doma
>      /* Pause domain if request came from guest and gfn has paging type */
>      if (  p2m_is_paging(p2mt) && v->domain == d )
>      {
> -        vcpu_pause_nosync(v);
>          req.flags |= MEM_EVENT_FLAG_VCPU_PAUSED;
>      }
>      /* No need to inform pager if the gfn is not in the page-out path */
> @@ -969,6 +964,10 @@ void p2m_mem_paging_populate(struct doma
>      req.vcpu_id = v->vcpu_id;
>
>      mem_event_put_request(d, &d->mem_event->mem_paging, &req);
> +
> +    /* Pause guest after put_request to allow wake_up after wait_event */
> +    if ( req.flags & MEM_EVENT_FLAG_VCPU_PAUSED )
> +        vcpu_pause_nosync(v);
>  }
>
>  /**
> @@ -1071,8 +1070,8 @@ void p2m_mem_paging_resume(struct domain
>      if ( rsp.flags & MEM_EVENT_FLAG_VCPU_PAUSED )
>          vcpu_unpause(d->vcpu[rsp.vcpu_id]);
>
> -    /* Unpause any domains that were paused because the ring was full */
> -    mem_event_unpause_vcpus(d);
> +    /* Unpause all vcpus that were paused because the ring was full */
> +    wake_up(&d->mem_event->mem_paging.wq);
>  }
>
>  void p2m_mem_access_check(unsigned long gpa, bool_t gla_valid, unsigned
> long gla,
> @@ -1111,7 +1110,7 @@ void p2m_mem_access_check(unsigned long
>                     "Memory access permissions failure, no mem_event
> listener: pausing VCPU %d, dom %d\n",
>                     v->vcpu_id, d->domain_id);
>
> -            mem_event_mark_and_pause(v);
> +            mem_event_mark_and_pause(v, &d->mem_event->mem_access);
>          }
>          else
>          {
> @@ -1131,8 +1130,7 @@ void p2m_mem_access_check(unsigned long
>      req.reason = MEM_EVENT_REASON_VIOLATION;
>
>      /* Pause the current VCPU unconditionally */
> -    vcpu_pause_nosync(v);
> -    req.flags |= MEM_EVENT_FLAG_VCPU_PAUSED;
> +    req.flags = MEM_EVENT_FLAG_VCPU_PAUSED;
>
>      /* Send request to mem event */
>      req.gfn = gfn;
> @@ -1148,6 +1146,7 @@ void p2m_mem_access_check(unsigned long
>      mem_event_put_request(d, &d->mem_event->mem_access, &req);
>
>      /* VCPU paused, mem event request sent */
> +    vcpu_pause_nosync(v);
>  }
>
>  void p2m_mem_access_resume(struct p2m_domain *p2m)
> @@ -1161,9 +1160,11 @@ void p2m_mem_access_resume(struct p2m_do
>      if ( rsp.flags & MEM_EVENT_FLAG_VCPU_PAUSED )
>          vcpu_unpause(d->vcpu[rsp.vcpu_id]);
>
> -    /* Unpause any domains that were paused because the ring was full or
> no listener
> -     * was available */
> -    mem_event_unpause_vcpus(d);
> +    /* Wakeup all vcpus waiting because the ring was full */
> +    wake_up(&d->mem_event->mem_access.wq);
> +
> +    /* Unpause all vcpus that were paused because no listener was
> available */
> +    mem_event_unpause_vcpus(d, &d->mem_event->mem_access);
>  }
>
>
> diff -r d347a8a36d2e -r de6860cb9205 xen/include/asm-x86/mem_event.h
> --- a/xen/include/asm-x86/mem_event.h
> +++ b/xen/include/asm-x86/mem_event.h
> @@ -25,12 +25,12 @@
>  #define __MEM_EVENT_H__
>
>  /* Pauses VCPU while marking pause flag for mem event */
> -void mem_event_mark_and_pause(struct vcpu *v);
> +void mem_event_mark_and_pause(struct vcpu *v, struct mem_event_domain
> *med);
>  int mem_event_check_ring(struct domain *d, struct mem_event_domain *med);
>  void mem_event_put_req_producers(struct mem_event_domain *med);
>  void mem_event_put_request(struct domain *d, struct mem_event_domain
> *med, mem_event_request_t *req);
>  void mem_event_get_response(struct mem_event_domain *med,
> mem_event_response_t *rsp);
> -void mem_event_unpause_vcpus(struct domain *d);
> +void mem_event_unpause_vcpus(struct domain *d, struct mem_event_domain
> *med);
>
>  int mem_event_domctl(struct domain *d, xen_domctl_mem_event_op_t *mec,
>                       XEN_GUEST_HANDLE(void) u_domctl);
> diff -r d347a8a36d2e -r de6860cb9205 xen/include/xen/sched.h
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -14,6 +14,7 @@
>  #include <xen/nodemask.h>
>  #include <xen/radix-tree.h>
>  #include <xen/multicall.h>
> +#include <xen/wait.h>
>  #include <public/xen.h>
>  #include <public/domctl.h>
>  #include <public/sysctl.h>
> @@ -192,6 +193,10 @@ struct mem_event_domain
>      mem_event_front_ring_t front_ring;
>      /* event channel port (vcpu0 only) */
>      int xen_port;
> +    /* mem_event bit for vcpu->pause_flags */
> +    int mem_event_bit;
> +    /* list of vcpus waiting for room in the ring */
> +    struct waitqueue_head wq;
>  };
>
>  struct mem_event_per_domain
> @@ -615,9 +620,12 @@ static inline struct domain *next_domain
>   /* VCPU affinity has changed: migrating to a new CPU. */
>  #define _VPF_migrating       3
>  #define VPF_migrating        (1UL<<_VPF_migrating)
> - /* VCPU is blocked on memory-event ring. */
> -#define _VPF_mem_event       4
> -#define VPF_mem_event        (1UL<<_VPF_mem_event)
> + /* VCPU is blocked on mem_paging ring. */
> +#define _VPF_me_mem_paging   4
> +#define VPF_me_mem_paging    (1UL<<_VPF_me_mem_paging)
> + /* VCPU is blocked on mem_access ring. */
> +#define _VPF_me_mem_access   5
> +#define VPF_me_mem_access    (1UL<<_VPF_me_mem_access)
>
>  static inline int vcpu_runnable(struct vcpu *v)
>  {
>
>
>
> ------------------------------
>
> Message: 3
> Date: Tue, 22 Nov 2011 22:15:19 +0100
> From: Olaf Hering <olaf@xxxxxxxxx>
> To: Keir Fraser <keir.xen@xxxxxxxxx>
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-devel] Need help with fixing the Xen waitqueue
>       feature
> Message-ID: <20111122211519.GA1039@xxxxxxxxx>
> Content-Type: text/plain; charset=utf-8
>
> On Tue, Nov 22, Olaf Hering wrote:
>
>> On Tue, Nov 22, Keir Fraser wrote:
>>
>> > Could it have ended up on the waitqueue?
>>
>> Unlikely, but I will add checks for that as well.
>
> I posted three changes which make use of the wait queues.
> For some reason the code at the very end of p2m_mem_paging_populate()
> triggers when d is dom0, so its vcpu is put to sleep.
>
>
> Olaf
>
>
>
> ------------------------------
>
> Message: 4
> Date: Tue, 22 Nov 2011 21:53:35 +0000
> From: Keir Fraser <keir.xen@xxxxxxxxx>
> To: Olaf Hering <olaf@xxxxxxxxx>
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-devel] Need help with fixing the Xen waitqueue
>       feature
> Message-ID: <CAF1CA5F.2560C%keir.xen@xxxxxxxxx>
> Content-Type: text/plain;     charset="US-ASCII"
>
> On 22/11/2011 21:15, "Olaf Hering" <olaf@xxxxxxxxx> wrote:
>
>> On Tue, Nov 22, Olaf Hering wrote:
>>
>>> On Tue, Nov 22, Keir Fraser wrote:
>>>
>>>> Could it have ended up on the waitqueue?
>>>
>>> Unlikely, but I will add checks for that as well.
>>
>> I posted three changes which make use of the wait queues.
>> For some reason the code at the very end of p2m_mem_paging_populate()
>> triggers when d is dom0, so its vcpu is put to sleep.
>
> We obviously can't have dom0 going to sleep on paging work. This, at
> least,
> isn't a wait-queue bug.
>
>> Olaf
>
>
>
>
>
> ------------------------------
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>
>
> End of Xen-devel Digest, Vol 81, Issue 296
> ******************************************
>



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.