Re: [Xen-devel] [PATCH 4 of 5 V3] tools/libxl: Control network buffering in remus callbacks [and 1 more messages] [and 1 more messages]
Shriram Rajagopalan writes ("Re: [PATCH 4 of 5 V3] tools/libxl: Control network
buffering in remus callbacks [and 1 more messages] [and 1 more messages]"):
> The nested-ao patch makes sense for Remus, even without fixing this
> timeout issue. I can modify my stuff accordingly. Probably create a
> nested-ao per iteration and drop it at the start of the next
> iteration.
Right. Great.
> However, the timeout part is not convincing enough. For example,
> libxl__domain_suspend_common_callback [the version before your patch]
> has two 6 second wait loops in the worst case..
...
> LOG(DEBUG, "wait for the guest to acknowledge suspend request");
> watchdog = 60;
> while (!strcmp(state, "suspend") && watchdog > 0) {
>     usleep(100000);
...
> and then once again
...
> usleep(100000);
Oh dear. That is very poor.
> Now I know where the 200ms overhead per checkpoint comes from.
>
> Shouldn't this also be made into an event loop? Irrespective of
> whether it is invoked in Remus' context or normal
> suspend/resume/save/restore/migrate context.
Yes, you are entirely correct.
Both of these loops should be replaced with timeout/event/callback
approaches.
Do you want to attempt this or would you like me to do it ?
> Currently there isn't any other reason to make the change in this
> patch, so I don't think it should be committed right away. But if for
> some reason it does get committed to staging, you or we can just drop
> it from the start of your series.
...
> The only reason it might get committed to staging without other remus patches
> would be to fix the issue I cited above.
Yes.
Ian.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel