
Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest



On Fri, 2015-07-24 at 13:11 -0400, Boris Ostrovsky wrote:
> On 07/24/2015 12:48 PM, Juergen Gross wrote:
> > On 07/24/2015 06:40 PM, Boris Ostrovsky wrote:
> >> On 07/24/2015 12:10 PM, Juergen Gross wrote:
> >>>
> >>> If we can fiddle with the masks on boot, we could do it in a running
> >>> system, too. Another advantage with not relying on cpuid. :-)
> >>
> >>
> >> I am trying to catch up with this thread so I may have missed it, but I
> >> still don't understand why we don't want to rely on CPUID.
> >>
> >> I think I saw Juergen say --- because it's HW-specific. But what's
> >> wrong with that? Hypervisor is building virtualized x86 (in this case)
> >> hardware and on such HW CPUID is the standard way of determining
> >> thread/core topology. Plus various ACPI tables and such.
> >>
> >> And having a solution that doesn't address userspace (when there *is* a
> >> solution that can do it) doesn't seem like the best approach. Yes, it
> >> still won't cover userspace for PV guests but neither will the kernel
> >> patch.
> >>
> >> As far as licensing is concerned --- are we sure this can't also be
> >> addressed by CPUID? BTW, if I was asked about who is most concerned
> >> about licensing my first answer would be --- databases. I.e. userspace.
> >
> > The problem is to construct cpuids which will enable the linux
> > scheduler to work correctly in spite of the hypervisor scheduler
> > moving vcpus between pcpus. The only way to do this is to emulate
> > single-threaded cores on the numa nodes without further grouping.
> > So either single-core sockets or one socket with many cores.
> 
> Right.
> 
So you see it now? If adhering to some guest virtual topology (for
whatever reason, e.g., performance) and licensing disagree, using CPUID
for both is always going to fail!

> > This might be problematic for licensing: the multi-socket solution
> > might require a higher license based on socket numbers. Or the
> > license is based on cores and will be more expensive as no hyperthreads
> > are detectable.
> 
> If we don't pin VCPUs appropriately (which I think is the scenario that 
> we are discussing) then CPUID can be used to find out package ID. And 
> so any licensed SW will easily discover that it is running on different 
> packages.
> 
That's why Juergen is suggesting to keep the two things separate,
effectively decoupling them, as far as I understand his proposal.

Note that I'm still a bit puzzled by the idea of presenting different
information to the guest OS and to the guest userspace, but that has
upsides, and this decoupling is one.

In fact, in the case you're describing, i.e., vcpus not pinned, etc.:
 - the Linux kernel is not relying on CPUID when building scheduling
   domains and the like, so at least all the user space applications
   that do not call CPUID and/or rely on it for
   scheduling/performance matters *will* *be* *fine*
 - you can pass down, via tools, a mangled CPUID, e.g., making it best
   fit your licensing needs.
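
To illustrate the first point, here is a rough sketch of an application
asking the kernel for the topology instead of executing CPUID itself. The
sysfs files used are the standard Linux ones; the function name and the
sysfs_cpu_root parameter are made up for this example. Whether these files
would reflect the topology the kernel builds under the proposal being
discussed is an assumption, not something established in this thread.

```python
import glob
import os


def kernel_topology(sysfs_cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_number: (package_id, core_id)} as reported by the kernel.

    Illustrative helper: it never executes CPUID itself, it only reads
    what the kernel exports through sysfs.
    """
    topo = {}
    for cpu_dir in glob.glob(os.path.join(sysfs_cpu_root, "cpu[0-9]*")):
        cpu = int(os.path.basename(cpu_dir)[3:])
        topo_dir = os.path.join(cpu_dir, "topology")
        try:
            with open(os.path.join(topo_dir, "physical_package_id")) as f:
                package = int(f.read())
            with open(os.path.join(topo_dir, "core_id")) as f:
                core = int(f.read())
        except OSError:
            # No topology files for this cpu (e.g. it is offline): skip it.
            continue
        topo[cpu] = (package, core)
    return topo
```

An application of kind b) below that takes its placement hints from
something like this, rather than from CPUID, is the one that stays out of
the conflict.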

Problems arise if you have both of the following kinds of applications
(or the same application doing both of the following operations):
 a) applications that poke at CPUID for licensing purposes
 b) applications that poke at CPUID for placement/performance purposes

In this case, it's quite possible that mangling CPUID to make a)
happy will make b) unhappy, and vice versa. And the fact that the
kernel does not rely on CPUID any longer may not help, as the
application is kind of bypassing it... Although, it won't harm either...
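
A toy model of that conflict, as a sketch only. The three layouts and all
the names below are invented for illustration; nothing here talks to Xen or
executes CPUID. Each layout is what an application would decode from CPUID
for 8 vcpus. One function plays the licensing check of a), the other the
placement logic of b).

```python
from collections import namedtuple

# What a (hypothetical) application would decode from CPUID for each vcpu.
Cpu = namedtuple("Cpu", ["package", "core", "thread"])


def single_core_sockets(nr_vcpus):
    """Every vcpu is its own single-core, single-thread socket."""
    return [Cpu(i, 0, 0) for i in range(nr_vcpus)]


def one_big_socket(nr_vcpus):
    """One socket with one single-threaded core per vcpu."""
    return [Cpu(0, i, 0) for i in range(nr_vcpus)]


def host_like(nr_vcpus, cores_per_package=2, threads_per_core=2):
    """A hyperthreaded multi-socket layout (sizes made up for the example)."""
    per_package = cores_per_package * threads_per_core
    return [
        Cpu(i // per_package,
            (i % per_package) // threads_per_core,
            i % threads_per_core)
        for i in range(nr_vcpus)
    ]


def licensed_sockets(cpus):
    """Application a), licensed per socket."""
    return len({c.package for c in cpus})


def licensed_cores(cpus):
    """Application a), licensed per core."""
    return len({(c.package, c.core) for c in cpus})


def assumed_smt_groups(cpus):
    """Application b): vcpu indexes it would treat as sharing a core."""
    groups = {}
    for idx, c in enumerate(cpus):
        groups.setdefault((c.package, c.core), []).append(idx)
    return [g for g in groups.values() if len(g) > 1]


for make in (single_core_sockets, one_big_socket, host_like):
    layout = make(8)
    print(make.__name__, licensed_sockets(layout),
          licensed_cores(layout), assumed_smt_groups(layout))
```

The two flat layouts give b) no sibling hints at all, which is the only
thing that stays true when vcpus are not pinned, but a) then counts 8
sockets or 8 cores. The host_like layout is the cheapest for a) (2 sockets,
4 cores), but it tells b) that vcpus 0 and 1 share a core, and nothing
guarantees that when vcpus are free to move between pcpus.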

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

