Re: [Xen-devel] [PATCH] xend: do not polling vcpus info if guest state is not RUNNING or PAUSED
On Tue, Nov 19, 2013 at 06:41:37PM +0800, Joe Jin wrote:
> On 11/19/13 16:03, Roger Pau Monné wrote:
> > On 19/11/13 07:13, Joe Jin wrote:
> >> When creating a new guest on a NUMA server, xend tries to pick the best node
> >> by collecting the vcpu info of every other guest. This races with a guest
> >> that is rebooting: the guest is still in the list when find_relaxed_node()
> >> is entered, but has been terminated by the time getVCPUInfo() is called,
> >> so getVCPUInfo() fails with the error below:
> >>
> >> [2013-09-04 20:01:26 6254] ERROR (XendDomainInfo:496) VM start failed
> >> Traceback (most recent call last):
> >> File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py",
> >> line 482, in start
> >> XendTask.log_progress(31, 60, self._initDomain)
> >> File "/usr/lib64/python2.4/site-packages/xen/xend/XendTask.py", line
> >> 209, in log_progress
> >> retval = func(*args, **kwds)
> >> File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py",
> >> line 2918, in _initDomain
> >> node = self._setCPUAffinity()
> >> File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py",
> >> line 2835, in _setCPUAffinity
> >> best_node = find_relaxed_node(candidate_node_list)[0]
> >> File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py",
> >> line 2803, in find_relaxed_node
> >> cpuinfo = dom.getVCPUInfo()
> >> File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py",
> >> line 1600, in getVCPUInfo
> >> raise XendError(str(exn))
> >> XendError: (3, 'No such process')
> >>
> >> This patch makes find_relaxed_node() poll vcpu info only for guests in the
> >> RUNNING or PAUSED state, to avoid the race.
> >>
> >> Signed-off-by: Joe Jin <joe.jin@xxxxxxxxxx>
> >> ---
> >> tools/python/xen/xend/XendDomainInfo.py | 2 ++
> >> 1 files changed, 2 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/tools/python/xen/xend/XendDomainInfo.py
> >> b/tools/python/xen/xend/XendDomainInfo.py
> >> index e9d3e7e..66e4b9f 100644
> >> --- a/tools/python/xen/xend/XendDomainInfo.py
> >> +++ b/tools/python/xen/xend/XendDomainInfo.py
> >> @@ -2734,6 +2734,8 @@ class XendDomainInfo:
> >> from xen.xend import XendDomain
> >> doms = XendDomain.instance().list('all')
> >> for dom in filter (lambda d: d.domid != self.domid, doms):
> >> + if dom._stateGet() not in (DOM_STATE_RUNNING, DOM_STATE_PAUSED):
> >> + continue
> >
> > Isn't it possible that the domain has rebooted and is no longer there
> > between these two calls?
> >
> > IMHO it's very unlikely, but there's still a window where getVCPUInfo
> > could fail.
> >
>
> Yes, you're right, this patch just reduces the window.
> I created a new patch for this, please comment!
>
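The window Roger describes is a check-then-act race: the state check and the
getVCPUInfo() call are two separate steps, and the domain can go away between
them. A minimal sketch of that, in Python 3 rather than the Python 2.4 xend
runs on. FakeDomain, collect_checked() and the between_check_and_call hook
are hypothetical stand-ins made up for illustration, not xend code:

```python
import errno


class FakeDomain:
    """Hypothetical stand-in for a xend domain object (not XendDomainInfo)."""

    def __init__(self, domid, alive=True):
        self.domid = domid
        self.alive = alive

    def is_running_or_paused(self):
        return self.alive

    def get_vcpu_info(self):
        # Mimic the low-level call failing once the domain is gone.
        if not self.alive:
            raise RuntimeError(errno.ESRCH, "No such process")
        return ["domain", ["domid", self.domid]]


def collect_checked(doms, between_check_and_call=None):
    """Check the state first, then query: the check can be stale by then."""
    infos = []
    for dom in doms:
        if not dom.is_running_or_paused():
            continue
        if between_check_and_call is not None:
            # Simulates another guest rebooting right after the check.
            between_check_and_call(dom)
        infos.append(dom.get_vcpu_info())
    return infos
```

If the hook marks the domain as dead, collect_checked() still raises
RuntimeError with ESRCH even though the state check passed, so the check
narrows the window but does not close it.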
> [PATCH] xend: getVCPUInfo should handle died domain
>
> When creating a new guest on a NUMA server, xend tries to pick the best node
> by collecting the vcpu info of every other guest. This races with a guest
> that is rebooting: the guest is still in the list when find_relaxed_node()
> is entered, but has already been terminated by the time getVCPUInfo() is
> called, so getVCPUInfo() fails with the error below:
>
> [2013-09-04 20:01:26 6254] ERROR (XendDomainInfo:496) VM start failed
> Traceback (most recent call last):
> File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line
> 482, in start
> XendTask.log_progress(31, 60, self._initDomain)
> File "/usr/lib64/python2.4/site-packages/xen/xend/XendTask.py", line 209,
> in log_progress
> retval = func(*args, **kwds)
> File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line
> 2918, in _initDomain
> node = self._setCPUAffinity()
> File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line
> 2835, in _setCPUAffinity
> best_node = find_relaxed_node(candidate_node_list)[0]
> File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line
> 2803, in find_relaxed_node
> cpuinfo = dom.getVCPUInfo()
> File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line
> 1600, in getVCPUInfo
> raise XendError(str(exn))
> XendError: (3, 'No such process')
>
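> With this patch, getVCPUInfo() returns sxpr instead of raising XendError when
> the RuntimeError carries errno.ESRCH, i.e. when the domain has already died.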
>
> Signed-off-by: Joe Jin <joe.jin@xxxxxxxxxx>
> ---
> tools/python/xen/xend/XendDomainInfo.py | 4 ++++
> 1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/tools/python/xen/xend/XendDomainInfo.py
> b/tools/python/xen/xend/XendDomainInfo.py
> index e9d3e7e..c6414ed 100644
> --- a/tools/python/xen/xend/XendDomainInfo.py
> +++ b/tools/python/xen/xend/XendDomainInfo.py
> @@ -34,6 +34,7 @@ import os
> import stat
> import shutil
> import traceback
> +import errno
> from types import StringTypes
>
> import xen.lowlevel.xc
> @@ -1541,6 +1542,9 @@ class XendDomainInfo:
> return sxpr
>
> except RuntimeError, exn:
> + # Domain already died.
> + if exn.args[0] == errno.ESRCH:
> + return sxpr
> raise XendError(str(exn))
>
>
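For reference, the shape of the error handling in the hunk above can be
sketched standalone. This is Python 3 syntax, not the Python 2.4 form used in
the patch. XendErrorSketch, get_vcpu_info() and the vcpu_getinfo callable are
hypothetical names for illustration, and the sketch assumes the failing call
raises RuntimeError(errno, message), which is what the traceback suggests:

```python
import errno


class XendErrorSketch(Exception):
    """Hypothetical stand-in for xend's XendError."""


def get_vcpu_info(domid, vcpu_count, vcpu_getinfo):
    """Build a vcpu info list, tolerating a domain that dies midway.

    vcpu_getinfo is a hypothetical callable standing in for the
    low-level per-vcpu query.
    """
    # Built before the try block so it is always defined in the handler.
    sxpr = ["domain", ["domid", domid], ["vcpu_count", vcpu_count]]
    try:
        for i in range(vcpu_count):
            sxpr.append(["vcpu", ["number", i], vcpu_getinfo(domid, i)])
        return sxpr
    except RuntimeError as exn:
        # Domain already died: hand back whatever was collected so far.
        if exn.args and exn.args[0] == errno.ESRCH:
            return sxpr
        # Any other failure is still reported to the caller.
        raise XendErrorSketch(str(exn))
```

One consequence of returning early like this is that the result may hold
fewer vcpu entries than vcpu_count claims, so callers such as
find_relaxed_node() would need to cope with a partial list.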
Adding Matt as he has stepped up to be the bug-fix maintainer of Xend
(I think? Is that correct - should that be reflected in the MAINTAINERS file?)
> --
> 1.7.1
>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel