Re: [Xen-devel] Request for input: Extended event channel support
On Wed, Mar 27, 2013 at 11:23:23AM +0000, George Dunlap wrote:
> * Executive summary
>
> The number of event channels available for dom0 is currently one of
> the biggest limitations on scaling up the number of VMs which can be
> created on a single system. There are two alternative implementations
> we could choose, one of which is ready now, the other of which is
> potentially technically superior, but will not be ready for the 4.3
> release.
>
> The core question we need to ask the community: How important is
> lifting the event channel scalability limit to 4.3? Will waiting
> until 4.4 cause a limit in the uptake of the Xen platform?
>
> * The issue
>
> The existing event channel implementation for PV guests is implemented
> as a 2-level bit array. This limits the total number of event channels
> to word_size ^ 2, which is 1024 for 32-bit guests and 4096 for 64-bit
> guests.
>
> This sounds like a lot, until you consider that in a typical system,
> each VM needs 4 or more event channels in domain 0. This means that
> for a 32-bit dom0, there is a theoretical maximum of 256 guests -- and
> in practice it's more like 180 or so, because of event channels
> required for other things. XenServer already has customers using VDI
> that require more VMs than this.
>
> * The dilemma
>
> When we began the 4.3 release cycle, this was one of the items we
> identified as a key feature we needed to get in for 4.3. Wei Liu
> started work on an extension of the existing implementation, allowing
> 3 levels of event channels. The draft of this is ready, and just needs
> the last bit of polishing and bug-chasing before it can be accepted.
>
> However, several months ago, David Vrabel came up with an alternate
> design which in theory was more scalable, based on queues of linked
> lists (which we have internally been calling "FIFO" for short). David
> has been working on the implementation since, and has a draft
> prototype; but it's in no shape to be included in 4.3.
>
> There are some things that are attractive about the second solution,
> including the flexible assignment of interrupt priorities, ease of
> scalability, and potentially even the FIFO nature of the interrupt
> delivery.
>
> The question at hand, then, is whether to take what we have in the
> 3-level implementation for 4.3, or wait to see how the FIFO
> implementation turns out (taking either it or the 3-level
> implementation in 4.4).
>
> * The solution in hand: 3-level event channels
>
> The basic idea behind 3-level event channels is to extend the existing
> 2-level implementation to 3 levels. Going to 3 levels would give us
> 32k event channels for 32-bit, and 256k for 64-bit.
>
> One of the advantages of this method is that, since it is similar to
> the existing method, the general concepts and race conditions are
> fairly well understood and tested.
>
> One of the disadvantages that this method inherits from the 2-level
> event channels is the lack of priority. In the initial implementation
> of event channels, priority was handled by event channel order: scans
> for events always started at 0 and went upwards. However, this was
> not very scalable, as lower-numbered events could easily completely
> lock out higher-numbered events; and frequently "lower-numbered"
> simply meant "created earlier". Event channels were forced into a
> priority ordering even if one was not wanted.
>
> So the implementation was tweaked so that scans don't start at 0, but
> continue where the last scan left off. This made it so that earlier
> events were not prioritized and removed the starvation issue, but at
> the cost of removing all event priorities. Certain events, like the
> timer event, are special-cased to be always checked, but this is a
> bit of a hack and not very scalable or flexible.
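To make the word_size ^ 2 arithmetic above concrete, here is a minimal
sketch of a 2-level scheme like the one described. The structure and
names are made up for illustration; this is not the actual Xen
shared-memory ABI. One top-level selector word has one bit per
second-level word of pending bits, so capacity is word_size * word_size:
32 * 32 = 1024 channels on 32-bit, 64 * 64 = 4096 on 64-bit.

/* Sketch only: hypothetical names, not the real Xen shared_info ABI. */
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MAX_EVTCHNS   (BITS_PER_WORD * BITS_PER_WORD)

struct evtchn_2level {
    unsigned long selector;               /* bit w set => pending[w] nonzero */
    unsigned long pending[BITS_PER_WORD]; /* one bit per event channel */
};

/* Mark channel 'port' pending and flag its word in the selector. */
static void evtchn_set_pending(struct evtchn_2level *e, unsigned int port)
{
    e->pending[port / BITS_PER_WORD] |= 1UL << (port % BITS_PER_WORD);
    e->selector |= 1UL << (port / BITS_PER_WORD);
}

The 3-level proposal adds one more selector layer on top of this, which
is where the 32k (32^3) and 256k (64^3) figures come from.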
Hm, I actually think that tweak is not in the upstream kernel at all. That would explain why, on a very heavily loaded guest, the "hrtimer: interrupt took XXxXXXXxx ns" message gets printed. Is this patch available somewhere?
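For reference, the "continue where the last scan left off" behaviour
being discussed would look roughly like this. This is a sketch under my
own naming (and a GCC builtin), not the actual Xen or Linux code:

#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

static unsigned int last_word; /* word where the previous scan stopped */

static void scan_pending(unsigned long pending[],
                         void (*handle)(unsigned int port))
{
    unsigned int i;

    /* Start one word past where the previous scan found its last event. */
    for (i = 1; i <= BITS_PER_WORD; i++) {
        unsigned int w = (last_word + i) % BITS_PER_WORD;

        while (pending[w]) {
            /* __builtin_ctzl: index of lowest set bit (GCC builtin). */
            unsigned int b = __builtin_ctzl(pending[w]);
            pending[w] &= ~(1UL << b);     /* clear before handling */
            handle(w * BITS_PER_WORD + b);
            last_word = w;                 /* next scan resumes after this */
        }
    }
}

Note that low-numbered bits still win within a word; only the word scan
rotates, which matches the "no global priority" trade-off described in
the quoted text.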