
Re: [Xen-devel] [PATCH RFC 1/2] linux/vnuma: vnuma support for pv guest



On Fri, Aug 30, 2013 at 12:29:09AM +0200, Dario Faggioli wrote:
> On gio, 2013-08-29 at 10:51 -0400, Konrad Rzeszutek Wilk wrote:
> > On Thu, Aug 29, 2013 at 03:32:13PM +0100, George Dunlap wrote:
> > > On 29/08/13 15:23, Konrad Rzeszutek Wilk wrote:
> > > >  - Why not re-use existing code? As in, if both PV and HVM can use
> > > >    SRAT, why not do it that way? This way you are exercising the same
> > > >    code path in both guests and it means fewer bugs to chase.
> > > >
> > > >    Incidentally, this also means that the mechanism to fetch the
> > > >    NUMA information from the hypervisor and construct the SRAT tables
> > > >    from it can be the same in both hvmloader and the Linux kernel.
> > > 
> > > If the SRAT tables are built by hvmloader for HVM guests, couldn't
> > > hvmloader make this hypercall and construct the tables, instead of
> > > having the domain builder do it?  That would mean that most of the
> > > codepath is the same; HVM just has the extra step of encoding and
> > > decoding the information in SRAT form.
> > 
> > Correct. 
> >
> Indeed. Also, Matt mentioned the HVM implementation does most of the
> work in libxc and hvmloader; is that the case? (Matt, I knew about the
> "old" HVM-NUMA series from Andre, but I don't have the details fresh
> enough right now... Will take another look ASAP.)
> 
> If yes, please, consider that, when talking about PV-vNUMA, although
> right now the series addresses DomU only, we plan to make it possible
> for Dom0 to have a virtual NUMA topology too. In that case, I don't
> think any code from libxc and hvmloader could be shared, could it?

But it could be stuck in the hypervisor at that point. So all of that
would reside within the Xen source base - in three places. Yuck.
> 
> So, to me, it sort of looks like we risk introducing more code
> duplication than we're trying to avoid! :-P

Would be nice if this could somehow be stuck in a library that all
three (hvmloader, libxc and the hypervisor) could be built with.
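
Something like the sketch below, say: a tiny header all three could
be built with. To be clear, none of these names exist today; the
struct layout and the emitter callbacks are purely illustrative and
would follow whatever hypercall interface the series settles on.

/* vnuma_lib.h - illustrative sketch of a shared vNUMA -> SRAT
 * helper that hvmloader, libxc and the hypervisor could all be
 * built with.  None of these names exist today. */
#ifndef VNUMA_LIB_H
#define VNUMA_LIB_H

#include <stdint.h>

struct vnuma_node {
        uint64_t mem_start;     /* node memory range start (bytes) */
        uint64_t mem_size;      /* node memory range size (bytes)  */
};

/*
 * Each consumer supplies its own emitters: hvmloader writes the
 * guest's ACPI tables, while libxc or the hypervisor could fill
 * in a fake table for dom0, and so on.
 */
struct srat_emit_ops {
        void (*processor)(void *ctx, uint32_t apic_id, uint32_t node);
        void (*memory)(void *ctx, const struct vnuma_node *n,
                       uint32_t node);
};

/* Walk the topology once, calling the emitters for every vcpu and
 * every node; all the format-specific work stays in the callbacks. */
static inline void
vnuma_build_srat(const struct vnuma_node *nodes, uint32_t nr_nodes,
                 const uint32_t *vcpu_to_node, uint32_t nr_vcpus,
                 const struct srat_emit_ops *ops, void *ctx)
{
        uint32_t i;

        for (i = 0; i < nr_vcpus; i++)
                /* 1:1 vcpu -> APIC ID mapping, for simplicity */
                ops->processor(ctx, i, vcpu_to_node[i]);
        for (i = 0; i < nr_nodes; i++)
                ops->memory(ctx, &nodes[i], i);
}

#endif /* VNUMA_LIB_H */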

> 
> > > >I think short term Elena's option is the better one, as it gets things
> > > >going and we can experiment with it. Long term I think stashing the data
> > > >in ACPI SRAT/SLIT is right. But Elena might not get to it in the next
> > > >couple of months - which means somebody else will have to sign up for that.
> > > 
> > > I definitely think that Elena needs to continue on the same path for
> > > now, so she can actually have closure on the project in a reasonable
> > > amount of time.
> > 
> > I concur. The caveat is that the x86 maintainers might object to the
> > Xen special case in the generic path, and that particular part of the
> > patch won't be upstreamed until that has been taken care of.
> > 
> > If Elena is OK with that possibility, then that is fine with me.
> >
> Sorry, I'm not sure I understand what you mean here. When talking about
> PV-NUMA, with Elena's approach, we have a Xen special case in
> numa_init(). With the fake table approach, we'd have a Xen special case
> in the ACPI parsing code (Matt was talking about something like
> acpi_os_get_root_pointer()), wouldn't we?
> 
> So, either way, we'll have a Xen special case, the only difference I see
> is that, in the latter, we'd probably have to deal with the ACPI
> maintainers instead of the x86 ones. :-)

Correct.
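
For the record, that hook could look something like this (the
CONFIG_XEN branch and xen_acpi_get_root_pointer() are made up
here; the surrounding entry point is the existing one in
drivers/acpi/osl.c):

acpi_physical_address __init acpi_os_get_root_pointer(void)
{
#ifdef CONFIG_XEN
        /*
         * Hypothetical: a PV guest returns the RSDP of fake
         * SRAT/SLIT tables built from the hypervisor's vNUMA
         * data, so the generic ACPI NUMA parsing runs unmodified.
         */
        if (xen_pv_domain()) {
                acpi_physical_address rsdp = xen_acpi_get_root_pointer();

                if (rsdp)
                        return rsdp;
        }
#endif
        /* ... the existing EFI/legacy BIOS scan continues here ... */
}
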
> 
> FWIW, x86_numa_init() already looks like this:
> 
> void __init x86_numa_init(void)
> {
>         if (!numa_off) {
> #ifdef CONFIG_X86_NUMAQ
>                 if (!numa_init(numaq_numa_init))
>                         return;
> #endif
> #ifdef CONFIG_ACPI_NUMA
>                 if (!numa_init(x86_acpi_numa_init))
>                         return;
> #endif
> #ifdef CONFIG_AMD_NUMA
>                 if (!numa_init(amd_numa_init))
>                         return;
> #endif
>         }
> 
>         numa_init(dummy_numa_init);
> }
> 
> I.e., quite a bit of architecture/implementation special casing already!
> So, perhaps, having some more `#ifdef CONFIG_XEN' and/or `if
> (xen_domain())' in there won't be that upsetting after all. :-)

Or perhaps fix that special casing and have something similar to the
iommu_detect API - and make 'iommu_detect' itself more generic while
we're at it.
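
I.e., a table of detect/init pairs walked in order, in the spirit of
the iommu_table_entry machinery. A rough sketch; every name below is
made up, including the linker section:

/* An iommu_table_entry-style registry for NUMA backends. */
struct numa_init_entry {
        int (*detect)(void);    /* cheap probe: is this backend usable? */
        int (*init)(void);      /* the function handed to numa_init()   */
};

/* Backends drop an entry into a dedicated section (the section and
 * its linker-script bookends are assumed), so x86_numa_init() loses
 * the #ifdef chain entirely. */
#define NUMA_INIT(_detect, _init)                                      \
        static const struct numa_init_entry __numa_entry_##_init       \
        __used __attribute__((section(".numa_table"))) = {             \
                .detect = _detect,                                     \
                .init   = _init,                                       \
        }

NUMA_INIT(xen_numa_detect, xen_numa_init);       /* hypothetical */
NUMA_INIT(acpi_numa_detect, x86_acpi_numa_init); /* hypothetical */

extern const struct numa_init_entry __numa_table[], __numa_table_end[];

void __init x86_numa_init(void)
{
        const struct numa_init_entry *e;

        if (!numa_off)
                for (e = __numa_table; e < __numa_table_end; e++)
                        if (e->detect() && !numa_init(e->init))
                                return;

        numa_init(dummy_numa_init);
}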

> 
> Thanks and Regards,
> Dario
> 
> -- 
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
> 


