
[Xen-devel] [RFC] Extend the number of event channels available to guests



Hello,
below is a request for comments on the plan to extend the number of event channels the hypervisor can handle. I have informally discussed some parts of this with Ian Campbell, but I would like to formalize it somewhat, hear more opinions on it and possibly give the project more exposure and guidance, as this is supposed to be one of the major features for 4.3.

SYNOPSIS
Currently the number of event channels every guest can set up is 1k or 4k, depending on whether the guest is 32-bit or 64-bit. This limits the number of guests a host can actively run, because the host and its guests need to set up event channels between them, so at some point Dom0 will exhaust the event channels available for all guests. The scope of this work is to raise the number of event channels available to every guest (and therefore also to Dom0).

The 4k figure comes directly from the event channel organization. In order to address a single channel, every guest keeps a bitmap of corresponding bits in the page it shares with the hypervisor. However, to avoid searching through all 4k bits every time, a further per-VCPU upper-level mask is used to select individual words of the pending event channel bitmap (making the organization, to all effects, a two-level lookup table).
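For illustration, here is a condensed sketch of that two-level scan (the field names loosely follow the public shared_info/vcpu_info layout, but the code is simplified and is not the actual guest implementation). With BITS_PER_LONG words of BITS_PER_LONG bits each, the limit works out to 32*32 = 1024 ports on 32-bit guests and 64*64 = 4096 on 64-bit ones:

/* Condensed sketch of the current two-level pending-event scan
 * (illustrative only, not the real xen_evtchn_do_upcall()). */
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * 8)

struct sketch_vcpu_info {
    unsigned long evtchn_pending_sel;            /* 1st level: 1 bit per word below */
};

struct sketch_shared_info {
    unsigned long evtchn_pending[BITS_PER_LONG]; /* 2nd level: the leaf bitmap */
    unsigned long evtchn_mask[BITS_PER_LONG];
};

static void scan_pending(struct sketch_vcpu_info *v,
                         struct sketch_shared_info *s,
                         void (*handle)(unsigned int port))
{
    /* Grab and clear the per-VCPU selector word atomically. */
    unsigned long sel = __atomic_exchange_n(&v->evtchn_pending_sel, 0,
                                            __ATOMIC_ACQ_REL);
    while (sel) {
        unsigned int word = __builtin_ctzl(sel);     /* next selected word */
        sel &= ~(1UL << word);

        /* Leaf level: bits that are pending and not masked in that word. */
        unsigned long bits = s->evtchn_pending[word] & ~s->evtchn_mask[word];
        while (bits) {
            unsigned int bit = __builtin_ctzl(bits);
            bits &= ~(1UL << bit);
            handle(word * BITS_PER_LONG + bit);
        }
    }
}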

In order to expand the number of available event channels, one must take into account two important compatibility-related aspects: the ABI, and the ability to run both the old and the new method together. The former concerns the fact that all the controlling structures related to event channels live in the hypervisor's public ABI; a valid solution, then, must not require any ABI changes at all. The latter concerns letting the hypervisor work with both the old model and the new one, so that guests running a kernel older than the patched one remain supported.

PROPOSAL
The proposal is pretty simple: the event channel search will become a three-level lookup table, with the leaf level composed of shared pages registered at boot time by the guests. The bitmap that currently acts as the leaf (from here on called the "second level") will either keep acting as the leaf (for older kernels) or act as an intermediate level indexing into a new array of shared pages (for newer kernels). This leaves open the possibility of reusing the existing mechanism without modifying its internals.
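To make the extra indirection concrete, here is a minimal sketch of how the three-level scan could look; the names, and the assumption that every second-level bit covers exactly one word of the new leaf bitmap, are illustrative choices and not part of the proposal:

/* Hypothetical three-level pending-event scan.  Names and the
 * "one leaf word per second-level bit" layout are assumptions
 * made for this example only. */
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * 8)

struct sketch_evtchn_3level {
    unsigned long pending_sel;               /* 1st level (per VCPU)         */
    unsigned long pending_l2[BITS_PER_LONG]; /* 2nd level (shared_info)      */
    unsigned long *pending_l3;               /* 3rd level (new shared pages) */
};

static void scan_3level(struct sketch_evtchn_3level *e,
                        void (*handle)(unsigned int port))
{
    unsigned long sel = __atomic_exchange_n(&e->pending_sel, 0,
                                            __ATOMIC_ACQ_REL);
    while (sel) {
        unsigned int i1 = __builtin_ctzl(sel);       /* which 2nd-level word */
        sel &= ~(1UL << i1);

        unsigned long l2 = e->pending_l2[i1];
        while (l2) {
            unsigned int i2 = __builtin_ctzl(l2);    /* which 3rd-level word */
            l2 &= ~(1UL << i2);

            unsigned int w3 = i1 * BITS_PER_LONG + i2;
            unsigned long l3 = e->pending_l3[w3];    /* leaf word in shared pages */
            while (l3) {
                unsigned int i3 = __builtin_ctzl(l3);
                l3 &= ~(1UL << i3);
                handle(w3 * BITS_PER_LONG + i3);
            }
        }
    }
}

For an old guest the innermost loop simply disappears: the second-level word is the leaf, exactly as today.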

More specifically, what needs to happen:
- Add new members to struct domain to hold an array of pages (to contain the actual evtchn bitmaps), a further array of pages (to contain the evtchn masks) and a control bit saying whether the domain is subject to the new mode or not. Initially the arrays will be empty and the control bit will be OFF.
- At init_platform() time, the guest must allocate the pages composing the two arrays and invoke a new hypercall which, in broad terms, does the following:
  * Creates the pages to populate the new arrays in struct domain via alloc_xenheap_pages()
  * Recreates the mapping with the gpfn passed from the guest, basically using guest_physmap_add_page()
  * Sets the control bit to ON
- Places that need to access the actual leaf bit (like, for example, xen_evtchn_do_upcall()) will need to check the control bit: if it is OFF they treat the second level as the leaf, otherwise they do a further lookup to get the bit from the new array of pages. (A rough hypervisor-side sketch of these changes follows below.)
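The following is a rough, hypothetical sketch of the hypervisor-side pieces listed above; the structure, field names and hypercall skeleton are placeholders invented for illustration, and the existing Xen helpers alloc_xenheap_pages() and guest_physmap_add_page() are only referenced in comments so as not to pin down their exact signatures:

/* Hypothetical additions to struct domain -- all names are placeholders. */
struct sketch_evtchn_extension {
    void         **l3_pending_pages;  /* xenheap pages holding the leaf bitmaps */
    void         **l3_mask_pages;     /* xenheap pages holding the leaf masks   */
    unsigned int   nr_l3_pages;       /* how many pages are populated           */
    int            extended;          /* the "control bit": OFF until the guest
                                       * registers its pages                    */
};

/* Skeleton of the new hypercall invoked by the guest at init_platform()
 * time.  gpfns[] are the guest frame numbers where the guest wants the
 * new pages mapped. */
static int sketch_evtchn_extend(struct sketch_evtchn_extension *ext,
                                const unsigned long *gpfns, unsigned int nr)
{
    for (unsigned int i = 0; i < nr; i++) {
        /* 1. allocate a page for the new arrays in struct domain
         *    (real code would use alloc_xenheap_pages())             */
        /* 2. recreate the mapping at gpfns[i] so the guest can see it
         *    (real code would use guest_physmap_add_page())          */
    }
    ext->nr_l3_pages = nr;

    /* 3. flip the control bit: from now on the second level indexes
     *    into the new leaf pages instead of being the leaf itself.   */
    ext->extended = 1;
    return 0;
}

/* Consumers such as the upcall path then branch on the control bit:
 *
 *     if ( !ext->extended )
 *         ...treat the second-level word as the leaf, as today...
 *     else
 *         ...use the second-level bit to index l3_pending_pages...
 */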

Of course there are still some details to be decided, for example:
* How many pages should the new level have? We could start by populating just one, for example.
* Who should really decide how many pages to allocate? Likely the hypervisor should have a threshold, but in general we may want a negotiation mechanism in which the guest asks the hypervisor beforehand and gets its actual request satisfied.
* How many bits should be indirected in the third level by every single bit in the second level? (That is a really minor factor, but still; see the back-of-the-envelope numbers after this list.)
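To give an illustrative feel for the numbers (purely an example, not a proposed layout): if every second-level bit indirected one 64-bit word of leaf bits, a 64-bit guest would go from 64 * 64 = 4096 event channels today to 64 * 64 * 64 = 262144, whose leaf bitmap takes 262144 / 8 = 32 KiB, i.e. 8 pages of 4 KiB; even a single registered page would already provide 32768 channels.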

Please let me know what you think about this.

Thanks,
Attilio




 

