
[Xen-devel] RE: Xen/ia64 - global or per VP VHPT



Since this discussion has gotten so ia64 (IPF) specific, I'm
going to move it entirely to xen-ia64-devel to avoid
excessive cross-posting.  For reasons of courtesy, we should
probably limit cross-posting to issues that are of general
interest to the broader Xen development community.

Let me take this opportunity to advertise xen-ia64-devel.
If you are interested in following this (or other Xen/ia64)
discussion, please sign up at

http://lists.xensource.com/xen-ia64-devel

or read the archives at

http://lists.xensource.com/archives/html/xen-ia64-devel/

> -----Original Message-----
> From: Munoz, Alberto J [mailto:alberto.j.munoz@xxxxxxxxx] 
> Sent: Friday, April 29, 2005 2:58 PM
> To: Magenheimer, Dan (HP Labs Fort Collins); Yang, Fred; Dong, Eddie
> Cc: ipf-xen; Xen-devel; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: Xen/ia64 - global or per VP VHPT
> 
> Hi Dan,
> 
> Magenheimer, Dan (HP Labs Fort Collins) 
> <mailto:dan.magenheimer@xxxxxx>
> wrote on Friday, April 29, 2005 1:05 PM:
> 
> >> In my opinion this is a moot point because in order to provide the
> >> appropriate semantics for physical mode emulation (PSR.dt, or PSR.it,
> >> or PSR.rt == 0) it is necessary to support a 4K page size as the
> >> minimum (unless you special case translations for physical mode
> >> emulation). Also in terms of machine memory utilization, it is better
> >> to have smaller pages (I know this functionality is not yet available
> >> in Xen, but I believe it will become important once people are done
> >> working on the basics).
> > 
> > In my opinion, performance when emulating physical mode is
> > a moot point.  
> 
> Linux IPF TLB miss handlers turn off PSR.dt. This is very performance
> sensitive.
> 
> > It might make sense to simply not insert
> > metaphysical addresses into the VHPT and just rely on the
> > TLB (though perhaps a one-entry virtual TLB might be required
> > to ensure forward progress).
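
To make the one-entry virtual TLB idea above concrete, here is a minimal
C sketch of how metaphysical (PSR.dt=0) translations might be resolved
without ever inserting them into the VHPT.  The structure, the
gpfn_to_mfn() helper and the 16KB page size are assumptions for
illustration, not actual Xen/ia64 code:

    /*
     * Hypothetical one-entry "virtual TLB" for metaphysical mode.
     * Misses are resolved from the domain's physical-to-machine map
     * and cached in this single entry, which is enough to guarantee
     * forward progress for the faulting instruction.
     */
    #define MPT_PAGE_SHIFT 14                   /* 16KB pages assumed */
    #define MPT_PAGE_SIZE  (1UL << MPT_PAGE_SHIFT)

    struct metaphys_vtlb {
        unsigned long gpfn;                     /* last guest frame seen */
        unsigned long mfn;                      /* machine frame it maps to */
        int valid;
    };

    /* One of these per vcpu in practice; a single static entry here. */
    static struct metaphys_vtlb vtlb;

    /* Assumed physical-to-machine lookup provided by the hypervisor. */
    extern unsigned long gpfn_to_mfn(unsigned long gpfn);

    unsigned long metaphys_translate(unsigned long gpaddr)
    {
        unsigned long gpfn = gpaddr >> MPT_PAGE_SHIFT;

        if (!vtlb.valid || vtlb.gpfn != gpfn) {
            vtlb.gpfn  = gpfn;
            vtlb.mfn   = gpfn_to_mfn(gpfn);     /* refill the single entry */
            vtlb.valid = 1;
        }
        /* In practice the result would be inserted into the hardware
         * TLB (itc) rather than returned; returned here for clarity. */
        return (vtlb.mfn << MPT_PAGE_SHIFT) | (gpaddr & (MPT_PAGE_SIZE - 1));
    }
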
> > 
> > Remember, one major difference between full virtualization (VT)
> > and paravirtualization is that you have to handle any case that
> > any crazy OS designer might try, while I just have to ensure that
> > I can tell the crazy OS designer what crazy things need to be
> > removed to make sure it works on Xen :-)  This guarantees that
> > our design choices will sometimes differ.
> 
> I have not forgotten that (just as I have not forgotten this same
> argument being used in other contexts in the past: "let's just do it
> this way because we know no reasonable software will ever do
> that...").
> 
> The way I see you applying this argument here is a bit different,
> though: there are things that Linux does today that will cause trouble
> with this particular design choice, and your position is that you just
> have to make sure these troublesome things get designed out of the
> paravirtualized OS.
> 
> In any case, I think it is critical to define exactly what an IPF
> paravirtualized guest is (maybe this has already been done and I
> missed it) before making assumptions as to what the guest will and
> will not do (especially when those things are done by native guests
> today). I don't think it is quite the same as an x86 XenoLinux, as a
> number of the hypercalls are very specific to addressing x86
> virtualization holes, which do not have equivalents in IPF.
> 
> I know that there have been attempts at paravirtualizing (actually
> more like dynamically patching) IPF Linux before (e.g., vBlades, you
> may be familiar with it :-), but I am not sure the Xen project for IPF
> has decided exactly what an IPF paravirtualized XenoLinux will look
> like. I am also not sure whether it has been decided that no native
> IPF guests (no binary patching) will be supported.
> 
> >> It is not just purging. Having a global VHPT is, in general,
> >> really bad for scalability....
> > 
> >> Another important thing is hashing into the VHPT. If you have ...
> > 
> >> As you point out this is ALWAYS the case, but what matters is
> >> what are your target workloads and target systems are...
> > 
> > All this just says that a global VHPT may not be good for a
> > big machine.  This may be true.  I'm not suggesting that
> > Xen/ia64 support ONLY a global VHPT or even necessarily that
> > it be the default, just that we preserve the capability to
> > configure either (or even both).
> 
> Let's define "big" in an environment where there are multiple 
> cores per
> die...
> 
> Another argument (independent of scalability) here is that 
> interference
> between guests/domains in a virtualization environment should 
> be minimized.
> This particular design of a single vhpt is fostering this 
> interference.
> 
> > 
> > I wasn't present in the early Itanium architecture discussions
> > but I'll bet there were advocates for both lVHPT and sVHPT who
> > each thought it a terrible waste that the architecture support
> > both.  That was silicon and both are supported; this is a small
> > matter of software :-)
> 
> I was present during those early discussions and the argument went
> this way: we need to support both Windows (a MAS, i.e.
> multiple-address-space, OS) and HP-UX (a SAS, i.e.
> single-address-space, OS) => we need to support both the short- and
> long-format VHPT.
> 
> > 
> >> Memory footprint is really not that big a deal for these large
> >> machines, but in any case, the size of the VHPT is typically
> >> proportional to the size of physical memory (some people suggest 4
> >> PTEs per physical page frame and some people suggest 2, but in any
> >> case, there is a linear relationship between the two). If you
> >> follow this guideline, then individual VHPTs for 5 guests should be
> >> 1/5 of the size of the combined VHPT for all 5 guests.
> > 
> > The point is that significant memory needs to be reserved in advance
> > or dynamically recovered whenever a domain launches.  Maybe this
> > isn't a big deal with a good flexible memory allocator and
> > "hidden ballooning" to steal physical memory from running domains.
> 
> Going back to the example of 5 VHPTs of size X vs. one VHPT of size
> 5X, I would say that this problem is worse with the single VHPT, as it
> either has to be able to grow dynamically as domains get created, or
> has to be pre-allocated to a size that supports the maximum number of
> domains.
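
To put rough numbers on the 2-4 entries-per-frame rule of thumb quoted
above, here is a small self-contained C calculation.  The 32-byte entry
is the ia64 long-format VHPT entry size; the 16KB page size and the
factor of 4 are illustrative assumptions:

    #include <stdio.h>

    int main(void)
    {
        unsigned long long mem_bytes   = 4ULL << 30;  /* one 4GB domain */
        unsigned long long page_size   = 16 * 1024;   /* 16KB pages assumed */
        unsigned long long entry_size  = 32;          /* long-format VHPT entry */
        unsigned long long ptes_per_pf = 4;           /* upper end of the 2-4 rule */

        unsigned long long frames = mem_bytes / page_size;
        unsigned long long vhpt   = frames * ptes_per_pf * entry_size;

        /* Prints 32 (MB).  Five such per-domain VHPTs total ~160MB,
         * the same as one combined VHPT sized for 20GB -- i.e. total
         * VHPT memory tracks total guest memory either way. */
        printf("VHPT size: %llu MB\n", vhpt >> 20);
        return 0;
    }
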
> 
> > 
> > E.g., assume an administrator automatically configures all domains
> > with a nominal 4GB but ability to dynamically grow up to 64GB.  The
> > per-guest VHPT would need to pre-allocate a shadow VHPT for the
> > largest of these (say 1% of 64GB) even if each of the domains never
> > grew beyond the 4GB, right?  (Either that or some kind of VHPT
> > resizing might be required whenever memory is "hot-plugged"?)
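
For scale, taking the numbers in this example at face value: a VHPT
pre-sized at ~1% of the 64GB ceiling is on the order of 650MB per
domain, versus roughly 40MB if it were sized for the 4GB the domain
actually uses.
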
> 
> I am not sure I understand your example. As I said in my previous
> posting, experience has shown that the optimal size of the VHPT (for
> performance) depends on the number of physical pages it supports (not
> how many domains, but how many total pages those domains will be
> using). In other words, the problem of having a VHPT support more
> memory is independent of whether it represents one domain or multiple
> domains. It depends on how many total memory pages are being
> supported.
> 
> I believe that you somehow think that having a single VHPT to support
> multiple domains would save you some memory, or rather avoid the need
> to grow a VHPT? Put another way, why do you think that the situation
> you describe above is unique to the multiple-VHPT design and not to
> the single-VHPT design?
> 
> > 
> > Again, there are a lot of interesting questions and discussion around
> > this... which means it's best to preserve our options if possible.
> 
> I see it a bit more black and white than you do.
>  
> > Dan
> 
> Bert
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

