
Re: [Xen-devel] Xen Platform QoS design discussion

On 08/05/14 06:21, Xu, Dongxiao wrote:

<massive snip>

>>> We have two different hypercalls right now for getting "dominfo": a
>>> domctl and a sysctl.  You use the domctl if you want information about
>>> a single domain, you use sysctl if you want information about all
>>> domains.  The sysctl implementation calls the domctl implementation
>>> internally.
>> It is not a fair comparison, given the completely different nature of
>> the domctls in question.  XEN_DOMCTL_getdomaininfo is doing very little
>> more than reading specific bits of data out of the appropriate struct
>> domain and its struct vcpu's which can trivially be done by the cpu
>> handling the hypercall.
>>> Is there a problem with doing the same thing here?  Or, with starting
>>> with a domctl, and then creating a sysctl for iterating over all
>>> domains (and calling the domctl internally) if we measure the domctl
>>> to be too slow for many callers?
>>>  -George
>> My problem is not with the domctl per-se.
>> My problem is that this is not a QoS design discussion;  this is an
>> email thread about a specific QoS implementation which is not answering
>> the concerns raised against it to the satisfaction of people raising the
>> concerns.
>> The core argument here is that a statement of "OpenStack want to get a
>> piece of QoS data back from libvirt/xenapi when querying a specific
>> domain" is being used to justify implementing the hypercall in an
>> identical fashion.
>> This is not a libxl design; this is a single user story forming part of
>> the requirement "I as a cloud service provider would like QoS
>> information for each VM to be available to my
>> $CHOSEN_ORCHESTRATION_SOFTWARE so I can {differentially charge
>> customers, balance my load more evenly, etc}".
>> The only valid justification for implementing a brand new hypercall in a
>> certain way is "Because $THIS_CERTAIN_WAY is the $MOST_SENSIBLE way to
>> perform the actions I need to perform", for appropriate
>> substitutions.  Not "because it is the same way I want to hand this
>> information off at the higher level".
>> As part of this design discussion, I have raised a concern saying "I
>> believe the usecase of having a stats gathering daemon in dom0 has not
>> been appropriately considered", qualified with "If you were to use the
>> domctl as currently designed from a stats gathering daemon, you will
>> cripple Xen with the overhead".
>> Going back to the original use, xenapi has a stats daemon for these
>> things.  It has an rpc interface so a query given a specific domain can
>> return some or all data for that domain, but it very definitely does not
>> translate each request into a hypercall for the requested information.
>> I have no real experience with libvirt, so can't comment on stats
>> gathering in that context.
>> I have proposed an alternative Xen->libxc interface designed with a
>> stats daemon in mind, explaining why I believe it has lower overheads to
>> Xen and why it is more in line with what I expect ${VENDOR}Stack to
>> actually want.
>> I am now waiting for a reasoned rebuttal which has more content than
>> "because there are a set of patches which already implement it in this way".
> No, I don't have the patch for domctl implementation. 
> In the past half year, all previous v1-v10 patches are implemented in sysctl 
> way, however based on that, people raised a lot of comments (large size of 
> memory, runtime non-0 order of memory allocation, page sharing with user 
> space, CPU online/offline special logic, etc.), and these make the platform 
> QoS implementation more and more complex in Xen. That's why I am proposing 
> the domctl method that can make things easier.
> I don't have more things to argue or rebut, and if you prefer sysctl, I
> can continue to work out a v11, v12 or more, to present the big 2-dimension 
> array to end user and let them withdraw their real required data, still 
> includes the extra CPU online/offline logics to handle the QoS resource 
> runtime allocation.
> Thanks,
> Dongxiao

I am sorry - I was not trying to make an argument for one of the
proposed mechanisms over the other.  The point I was trying to make
(which on further consideration isn't as clear as I was hoping) is that
you cannot possibly design the hypercall interface before knowing the
library usecases, and there is a clear lack of understanding (or at
least communication) in this regard.

So, starting from the top. OpenStack want QoS information, and want to
get it from libvirt/XenAPI.  I think libvirt/XenAPI is the correct level
to do this at, and think exactly the same would apply to CloudStack as
well.  The relevant part of this is the question "how does
libvirt/XenAPI collect stats".

XenAPI collects stats with the RRD Daemon, running in dom0.  It has an
internal database of statistics, and hands data from this database out
upon RPC requests.  It also has threads whose purpose is to periodically
refresh the data in the database.  This provides a disconnect between
${FOO}Stack requesting stats for a domain and the logic to obtain stats
for that domain.
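
To illustrate that disconnect, here is a minimal Python sketch of the
pattern.  It is not the RRD Daemon's code.  The names StatsCache,
collect_all and fake_collect_all, and the l3_occupancy_kb field with its
values, are all invented for illustration; collect_all stands in for
whatever single bulk query the daemon would make to gather data for
every domain.

```python
import threading
from typing import Callable, Dict

# Hypothetical shape of the data: domid -> {stat name -> value}.
Snapshot = Dict[int, Dict[str, int]]


class StatsCache:
    """Sketch of a stats daemon core: refreshed in bulk, read per domain."""

    def __init__(self, collect_all: Callable[[], Snapshot]):
        # collect_all is an assumed stand-in for one bulk query that
        # returns stats for all domains at once.
        self._collect_all = collect_all
        self._lock = threading.Lock()
        self._db: Snapshot = {}
        self._stop = threading.Event()

    def refresh_once(self) -> None:
        """Replace the whole database with a fresh snapshot."""
        snapshot = self._collect_all()
        with self._lock:
            self._db = snapshot

    def run_refresher(self, interval_s: float) -> threading.Thread:
        """Start a background thread that refreshes periodically."""
        def loop() -> None:
            while not self._stop.is_set():
                self.refresh_once()
                self._stop.wait(interval_s)

        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread

    def stop(self) -> None:
        self._stop.set()

    def query(self, domid: int) -> Dict[str, int]:
        """Serve a per-domain request from the cache only.

        No collection call is made here, however many requests arrive.
        """
        with self._lock:
            return dict(self._db.get(domid, {}))


# Placeholder collector with made-up values, counting how often it runs.
calls = {"n": 0}


def fake_collect_all() -> Snapshot:
    calls["n"] += 1
    return {1: {"l3_occupancy_kb": 512}, 2: {"l3_occupancy_kb": 2048}}
```

With this shape, the number of collection calls depends only on the
refresh interval, not on how many per-domain requests the upper layers
send.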

I am however unfamiliar with libvirt in this regard.  Could you please
explain how the libvirt daemon deals with stats?

