
Re: [Xen-devel] Need help with fixing the Xen waitqueue feature


  • To: Olaf Hering <olaf@xxxxxxxxx>
  • From: Keir Fraser <keir@xxxxxxx>
  • Date: Wed, 23 Nov 2011 18:18:42 +0000
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, Jan Beulich <JBeulich@xxxxxxxx>
  • Delivery-date: Wed, 23 Nov 2011 18:19:42 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>
  • Thread-index: AcyqA6u/YPIz4SJrjUuVJSulwM3zTgACKiyE
  • Thread-topic: [Xen-devel] Need help with fixing the Xen waitqueue feature

On 23/11/2011 17:16, "Keir Fraser" <keir.xen@xxxxxxxxx> wrote:

> On 23/11/2011 17:00, "Olaf Hering" <olaf@xxxxxxxxx> wrote:
> 
>> On Tue, Nov 22, Keir Fraser wrote:
>> 
>>> We obviously can't have dom0 going to sleep on paging work. This, at least,
>>> isn't a wait-queue bug.
>> 
>> I had to rearrange some code in p2m_mem_paging_populate for my debug
>> stuff. This led to an uninitialized req, and as a result req.flags
>> sometimes had MEM_EVENT_FLAG_VCPU_PAUSED set. For some reason gcc did
>> not catch that..
>> Now waitqueues appear to work ok for me. Thanks!
> 
> Great. However, while eyeballing wait.c I spotted at least two bugs. I'm
> pretty sure that the hypervisor will blow up pretty quickly when you resume
> testing with multiple physical CPUs, for example. I need to create a couple
> of fixup patches which I will then send to you for test.

We actually have quite a big waitqueue problem. The current scheme of
per-cpu stacks doesn't work, as the stack pointer will change if a vcpu
goes to sleep and then wakes up on a different cpu. That breaks preempted
C code, which may use frame pointers and/or arbitrarily take the address
of on-stack variables. The result will be hideous cross-stack corruptions,
as these frame pointers and cached addresses of automatic variables will
reference the wrong cpu's stack! Fixing or detecting this in general is
not possible afaics.

So, we'll have to switch to per-vcpu stacks, probably with separate per-cpu
irq stacks (as a later followup). That's quite a nuisance!
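
The cross-stack hazard described above can be illustrated with a small,
self-contained sketch. It is not Xen code: `struct frame`, `cpu_stacks` and
both helper functions are hypothetical names made up for this illustration,
and two static buffers stand in for the per-cpu stacks. It only shows that a
byte-wise save/restore of a frame onto a different stack leaves any cached
address of an on-stack variable pointing into the original stack.

```c
#include <assert.h>
#include <string.h>

#define STACK_SIZE 256

/* Hypothetical stand-in for a preempted C frame: a local variable plus a
 * cached absolute address of it (as with &local or a frame pointer). */
struct frame {
    int value;
    int *cached;
};

/* Two simulated per-cpu stacks (assumption: one frame each is enough). */
union cpu_stack {
    unsigned char bytes[STACK_SIZE];
    struct frame frame;
};

static union cpu_stack cpu_stacks[2];

/* Build a frame on cpu 0's stack, save its bytes ("sleep"), then restore
 * them onto cpu 1's stack ("wake up on a different cpu"). */
static struct frame *sleep_on_cpu0_wake_on_cpu1(void)
{
    unsigned char saved[sizeof(struct frame)];
    struct frame *f0 = &cpu_stacks[0].frame;

    f0->value = 1;
    f0->cached = &f0->value;            /* address taken on cpu 0's stack */

    memcpy(saved, f0, sizeof(saved));                /* save on sleep   */
    memcpy(cpu_stacks[1].bytes, saved, sizeof(saved)); /* restore on wake */

    return &cpu_stacks[1].frame;
}

/* Nonzero if the cached pointer refers to this frame's own variable. */
static int cached_pointer_is_local(const struct frame *f)
{
    return f->cached == &f->value;
}
```

After `sleep_on_cpu0_wake_on_cpu1()`, the restored frame's `cached` member
still holds an address inside `cpu_stacks[0]`, so a write through it modifies
the other cpu's stack rather than the frame's own `value`. With per-vcpu
stacks the frame never moves, so such addresses stay valid.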

 -- Keir

> By the way, did you test my patch to domain_crash when the stack-save area
> isn't large enough?
> 
>> What do you think about C99 initializers in p2m_mem_paging_populate,
>> just to avoid such mistakes?
>> 
>>    mem_event_request_t req = { .type = MEM_EVENT_TYPE_PAGING };
> 
> We like them.
> 
>  -- Keir
> 
> 
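
For reference, a minimal sketch of why the C99 initializer suggested in the
quoted text avoids the uninitialized-flags mistake: members not named in a
designated initializer are zero-initialized. The struct layout and the
constant values below are placeholders invented for this example
(`demo_request_t`, `DEMO_TYPE_PAGING`, `DEMO_FLAG_VCPU_PAUSED`); they are not
the real `mem_event_request_t` definition or the real Xen constants.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values, assumed for illustration only. */
#define DEMO_TYPE_PAGING       2u
#define DEMO_FLAG_VCPU_PAUSED  (1u << 0)

/* Hypothetical request layout, loosely modelled on a mem_event request. */
typedef struct {
    uint32_t type;
    uint32_t flags;
    uint64_t gfn;
} demo_request_t;

/* Only .type is named; C99 guarantees the remaining members are zeroed. */
static demo_request_t make_paging_request(void)
{
    demo_request_t req = { .type = DEMO_TYPE_PAGING };
    return req;
}

/* Nonzero if the (placeholder) VCPU_PAUSED flag is set in the request. */
static int vcpu_paused_flag_set(const demo_request_t *req)
{
    return (req->flags & DEMO_FLAG_VCPU_PAUSED) != 0;
}
```

With this form, `flags` starts at zero regardless of how the surrounding code
is rearranged, so a stray paused flag can only come from an explicit
assignment.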


