
Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility



Ian Jackson wrote:
> Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD 
> flexibility"):
>   
>> Looking at the libvirt code again, it seems a single thread services the
>> event loop. See virNetServerRun() in src/util/virnetserver.c. Indeed, I
>> see the same thread ID in all the timer and fd callbacks. One of the
>> libvirt core devs can correct me if I'm wrong.
>>     
>
> OK.  So just to recap where we stand:
>
>  * I think libxl needs the SIGCHLD flexibility series.  I'll repost
>    that (v3) but it's had hardly any changes.
>   

Ok, thanks.  I'm currently testing on your git branch referenced earlier
in this thread:

git://xenbits.xen.org/people/iwj/xen.git#wip.enumerate-pids-v2.1

>  * You need to fix the timer deregistration arrangements in the
>    libvirt/libxl driver to avoid the crash you identified the other day.
>   

Yes, I'm testing a fix now.

>  * Something needs to be done about the 20ms slop in the libvirt event
>    loop (as it could cause libxl to lock up).  If you can't get rid of
>    it in the libvirt core, then adding 20ms to every requested
>    callback time in the libvirt/libxl driver would work for now.
>   

The commit message that added the fuzz says:

    Fix event test timer checks on kernels with HZ=100
   
    On kernels with HZ=100, the resolution of sleeps in poll() is
    quite bad. Doing a precise check on the expiry time vs the
    current time will thus often think the timer has not expired
    even though we're within 10ms of the expected expiry time. This
    then causes another pointless sleep in poll() for <10ms. Timers
    do not need to have such precise expiration, so we treat a timer
    as expired if it is within 20ms of the expected expiry time. This
    also fixes the eventtest.c test suite on kernels with HZ=100
   
    * daemon/event.c: Add 20ms fuzz when checking for timer expiry
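
For illustration, here is a minimal sketch of the kind of check that
commit message describes.  It is not the actual daemon/event.c code:
TIMER_FUZZ_MS and timer_is_expired() are names made up for this example,
and times are assumed to be milliseconds on a single clock.

```c
#include <stdbool.h>

/* Hypothetical constant: the 20ms fuzz mentioned in the commit message. */
#define TIMER_FUZZ_MS 20

/*
 * Hypothetical helper: report a timer as expired if its expiry time is
 * no more than TIMER_FUZZ_MS in the future.  Both arguments are assumed
 * to be milliseconds on the same clock.
 */
static bool timer_is_expired(long long expires_at_ms, long long now_ms)
{
    return expires_at_ms <= now_ms + TIMER_FUZZ_MS;
}
```

With a check like this, a timer due at t=1000ms is already treated as
expired at t=980ms, which is the 20ms slop discussed above.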


I could handle this in the libxl driver as you say, but doing so makes
me a bit nervous.  Potentially locking up libxl makes me nervous too :).
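
Roughly, handling it in the libxl driver would mean padding each
deadline before handing it to the event loop.  A sketch, assuming the
event loop may fire a timer up to 20ms before its registered time; the
names below are hypothetical and are not libvirt or libxl API:

```c
/* Assumption: the event loop may fire a timer up to this many ms early. */
#define EVENT_LOOP_FUZZ_MS 20

/*
 * Hypothetical driver-side helper: the deadline to register with the
 * event loop for a callback that libxl requested at requested_ms.
 */
static long long padded_deadline_ms(long long requested_ms)
{
    return requested_ms + EVENT_LOOP_FUZZ_MS;
}

/*
 * Hypothetical model of the event loop: the earliest time at which a
 * timer registered for registered_ms could be reported as expired.
 */
static long long earliest_fire_ms(long long registered_ms)
{
    return registered_ms - EVENT_LOOP_FUZZ_MS;
}
```

Under that assumption the padded timer can never fire before the time
libxl asked for, at the cost of timeouts possibly firing 20ms or so
later than requested.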

>  * I think we can get away with not doing anything about the fd
>    deregistration race in libvirt because both Linux and FreeBSD have
>    behaviours that are tolerable.
>
>  * libxl should have the fd deregistration race fixed in Xen 4.5.
>
> Have you managed to fix the timer deregistration crash, and retest?
>   

Yes.  I've been running my tests for about 24 hours now with no problems
noted.  The tests include starting/stopping a persistent VM,
creating/stopping a transient VM, rebooting a persistent VM,
saving/restoring a transient VM, and getting info on all of these VMs.

I should probably add saving/restoring a persistent VM to the mix, since
the associated libxl_ctx stays allocated for as long as the VM is
defined.  Only when a persistent VM is undefined is the libxl_ctx freed.

Regards,
Jim
