
Re: [Xen-devel] [RFC v3]Proposal to allow setting up shared memory areas between VMs from xl config file



Hi,

On 18/07/17 19:30, Zhongze Liu wrote:
====================================================
1. Motivation and Description
====================================================
Virtual machines use grant table hypercalls to set up a shared page for
inter-VM communication. These hypercalls are used by all PV
protocols today. However, very simple guests, such as baremetal
applications, might not have the infrastructure to handle the grant table.
This project is about setting up several shared memory areas for inter-VM
communication directly from the VM config file, so that the guest kernel
doesn't need grant table support (lacking it is not unusual in the
embedded space) to be able to communicate with other guests.

====================================================
2. Implementation Plan:
====================================================

======================================
2.1 Introduce a new VM config option in xl:
======================================

2.1.1 Design Goals
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The shared areas should be shareable among several (>=2) VMs, so every shared
physical memory area is assigned to a set of VMs. Therefore, a “token” or
“identifier” should be used here to uniquely identify a backing memory area.
A string no longer than 128 bytes is used here to serve the purpose.

The backing area would be taken from one domain, which we will regard
as the "master domain", and this domain should be created prior to any
other "slave domain"s. Again, we have to use some kind of tag to tell who
is the "master domain".

And the ability to specify the permissions and cacheability (and shareability
for arm HVM's) of the pages to be shared should be also given to the user.

s/arm/ARM/. Furthermore it is called ARM guest and not HVM.


2.2.2 Syntax and Behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The following example illustrates the syntax of the proposed config entry:

In xl config file of vm1:

   static_shm = [ 'id=ID1, begin=0x100000, end=0x200000, role=master,
                   arm_shareattr=inner, arm_inner_cacheattr=wb,
                   arm_outer_cacheattr=wb, x86_cacheattr=wb, prot=ro',

                   'id=ID2, begin=0x300000, end=0x400000, role=master,
                   arm_shareattr=inner, arm_inner_cacheattr=wb,
                   arm_outer_cacheattr=wb, x86_cacheattr=wb, prot=rw' ]

In xl config file of vm2:

    static_shm = [ 'id=ID1, begin=0x500000, end=0x600000, role=slave, prot=ro' ]

In xl config file of vm3:

    static_shm = [ 'id=ID2, begin=0x700000, end=0x800000, role=slave, prot=ro' ]

where:
  @id                   can be any string that matches the regexp "[^ \t\n,]+"
                        and no logner than 128 characters

s/logner/longer/

  @begin/end            can be decimals or hexidemicals of the form "0x20000".

s/hexidemicals/hexadecimals/

  @role                 can only be 'master' or 'slave'
  @prot                 can be 'n', 'r', 'ro', 'w', 'wo', 'x', 'xo', 'rw', 'rx',
                        'wx' or 'rwx'. Default is 'rw'.
  @arm_shareattr        can be 'inner' or 'outter', this will be ignored and

s/outter/outer/. If you really want to support shareability, you want to provide non-shareable too.

But I think, as suggested in the answer to Stefano, it would be easier if we provided a set of policies that configure the guest correctly. This would avoid having to sanity-check the options used by the user.


                        a warning will be printed out to the screen if it
                        is specified in an x86 HVM config file.
                        Default is 'inner'
  @arm_outer_cacheattr  can be 'uc', 'wt', 'wb', 'bufferable' or 'wa', this will
                        be ignored and a warning will be printed out to the
                        screen if it is specified in an x86 HVM config file.
                        Default is 'inner'

I guess you took the names from asm-arm/page.h? Those attributes are for stage-1 page tables and not stage-2 (i.e. used for translating an intermediate physical address to a physical address). Actually, nowhere do you explain that this will be used to configure the mapping in stage-2.

The possibilities for configuring the mappings are very different (see D4.5 in ARM DDI0487B.a). You can configure cacheability but not cache allocation hints. For instance, wa (write-allocate) is a hint.

You will also want to warn the user that this may not prevent memory attribute mismatches, depending on the cacheability policy.

  @arm_inner_cacheattr  can be 'uc', 'wt', 'wb', 'bufferable' or 'wa'. Default
                        is 'wb'.
  @x86_cacheattr        can be 'uc', 'wc', 'wt', 'wp', 'wb' or 'suc'. Default
                        is 'wb'.
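
The field rules above could be checked along these lines. This is only a minimal sketch, not xl or libxl code: the function name parse_static_shm is hypothetical, treating id, begin, end and role as mandatory is an assumption, and so is the begin < end check. The arm_* and x86_* attributes are passed through unvalidated.

```python
import re

# Allowed values are taken from the proposal text; every name in this
# sketch is hypothetical and not part of xl or libxl.
ID_RE = re.compile(r"[^ \t\n,]+")
ROLES = {"master", "slave"}
PROTS = {"n", "r", "ro", "w", "wo", "x", "xo", "rw", "rx", "wx", "rwx"}


def parse_static_shm(spec):
    """Parse one 'key=value, key=value' static_shm entry into a dict."""
    entry = {"prot": "rw"}  # default stated in the proposal
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        entry[key.strip()] = value.strip()

    # Assumption: these four keys are mandatory.
    for required in ("id", "begin", "end", "role"):
        if required not in entry:
            raise ValueError(f"missing {required!r}")
    if len(entry["id"]) > 128 or not ID_RE.fullmatch(entry["id"]):
        raise ValueError("invalid id")
    if entry["role"] not in ROLES:
        raise ValueError("role must be 'master' or 'slave'")
    if entry["prot"] not in PROTS:
        raise ValueError("invalid prot")

    # int(x, 0) accepts both decimal and 0x-prefixed hexadecimal.
    entry["begin"] = int(entry["begin"], 0)
    entry["end"] = int(entry["end"], 0)
    if entry["begin"] >= entry["end"]:  # assumption: empty ranges are errors
        raise ValueError("begin must be below end")
    return entry
```

For instance, parse_static_shm('id=ID1, begin=0x500000, end=0x600000, role=slave, prot=ro') yields a dict with integer begin and end values, while an unknown role raises ValueError.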


Besides, the sizes of the areas specified by @begin and @end in the slave
domain's config file should not be larger than the corresponding sizes specified
in its master domain's config file. Overlapping backing memory areas are allowed.

In the example above, a memory area ID1 will be shared between vm1 and vm2.
This area will be taken from vm1 and mapped into vm2's stage-2 page table.
The parameter "prot=ro" means that this memory area is offered with read-only
permission. vm1 can access this area using 0x100000~0x200000, and vm2 using
0x500000~0x600000.

Likewise, a memory area ID2 will be shared between vm1 and vm3 with read and
write permissions. vm1 is the master and vm3 the slave. vm1 can access the
area using 0x300000~0x400000 and vm3 using 0x700000~0x800000.
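
Because the same backing pages appear at different guest addresses, an address inside one domain's window corresponds to the same offset inside the other's. A small sketch of that arithmetic follows. The helper name translate is hypothetical, and it assumes that @end is exclusive and that both windows start at the same backing page.

```python
def translate(addr, src_begin, src_end, dst_begin):
    """Map an address in one domain's window to the peer's window.

    Assumes src_end is exclusive and that both windows cover the same
    backing pages starting at offset 0.
    """
    if not (src_begin <= addr < src_end):
        raise ValueError("address outside the shared window")
    return dst_begin + (addr - src_begin)
```

With the ID1 values above, translate(0x100010, 0x100000, 0x200000, 0x500000) gives the address vm2 would use for the byte that vm1 sees at 0x100010.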

For the arm_* and x86_* cache attributes and shareability attributes, the
behavior is briefly described below:

  + The permission flags (i.e. ro/wo/rw etc.):
    - If specified in the master domains' config, they describe the largest set
      of permissions that are granted to the shared memory area, which means if
      master says 'rw' in its own config file, then the slaves can only say 'r'
      or 'w' or 'rw', but not 'x'.
    - If specified in the slave domains' config, they describe the stage-2 page
      permissions that would be used when we map the shared pages into the slave.
      But this doesn't place any restrictions on how the slave domains are going
      to manipulate the related stage-1 page tables (and we can't).
  + The cacheability flags and shareability flags:
    These are valid only if they are specified in the master domain's config
    files. They also control the stage-2 page attributes of the shared memory.
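
The "largest set of permissions" rule can be read as a subset check. The sketch below is one possible interpretation, not the proposal's implementation: the helper names are hypothetical, and it assumes that 'n' grants nothing and that a trailing 'o' ("only") adds no extra right, so 'ro' and 'r' are equivalent.

```python
def prot_set(prot):
    """Reduce a prot string to the set of rights it grants.

    Assumption: 'n' means no access, and 'ro'/'wo'/'xo' grant the same
    rights as 'r'/'w'/'x'.
    """
    if prot == "n":
        return frozenset()
    return frozenset(prot.rstrip("o"))


def slave_prot_allowed(master_prot, slave_prot):
    """True if the slave asks for no more than the master offers."""
    return prot_set(slave_prot) <= prot_set(master_prot)
```

Under these assumptions a master offering 'rw' accepts 'r', 'w' or 'rw' from a slave and rejects 'x', which matches the example in the text.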

Note that the "master" role in vm1 for both ID1 and ID2 indicates that vm1
should be created prior to both vm2 and vm3, for they both rely on the pages
backed by vm1. If one tries to create vm2 or vm3 prior to vm1, she will get
an error. And in vm1's config file, the "prot=ro" parameter of ID1 indicates
that if one tries to share this page with vm1 with, say, "rw" permission,
she will get an error, too.

======================================
2.2 Store the mem-sharing information in xenstore
======================================
Since xl has no persistent storage for the information about the shared
memory areas, we have to find some way to keep it between xl launches,
and xenstore is a good place to do this. The information for one
shared area should include the ID, master's domid, address range,
memory attributes, information about the slaves, etc.
The current plan is to place the information under /local/shared_mem/ID.
Still taking the above config files as an example:

Suppose we are running under x86 (and thus the arm_* attributes will be ignored).
If we instantiate vm1, vm2 and vm3, one after another, the output of
“xenstore ls -f” should change as follows.

After VM1 was instantiated, the output of “xenstore ls -f”
will be something like this:

    /local/shared_mem/ID1/master = domid_of_vm1
    /local/shared_mem/ID1/begin = 0x100
    /local/shared_mem/ID1/end = 0x200
    /local/shared_mem/ID1/permissions = "r"
    /local/shared_mem/ID1/x86_cacheattr = "wb"
    /local/shared_mem/ID1/slaves = ""

    /local/shared_mem/ID2/master = domid_of_vm1
    /local/shared_mem/ID2/begin = 0x300
    /local/shared_mem/ID2/end = 0x400
    /local/shared_mem/ID2/permissions = "rw"
    /local/shared_mem/ID2/x86_cacheattr = "wb"
    /local/shared_mem/ID2/slaves = ""

After VM2 was instantiated, the following new lines will appear:

    /local/shared_mem/ID1/slaves/domid_of_vm2/begin = 0x500
    /local/shared_mem/ID1/slaves/domid_of_vm2/end = 0x600
    /local/shared_mem/ID1/slaves/domid_of_vm2/permissions = "r"

After VM3 was instantiated, the following new lines will appear:

    /local/shared_mem/ID2/slaves/domid_of_vm3/gmfn_begin = 0x700
    /local/shared_mem/ID2/slaves/domid_of_vm3/gmfn_end = 0x800

What is the granularity for gmfn_begin/gmfn_end?

    /local/shared_mem/ID2/slaves/domid_of_vm3/permissions = "rw"
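
The layout above could be generated roughly as follows. This is a sketch under assumptions rather than the planned libxl code: the function names are hypothetical, the values are taken to be frame numbers with 4 KiB frames (which is what 0x100000 becoming 0x100 suggests, though the proposal does not state the granularity), and the slave keys use begin/end rather than gmfn_begin/gmfn_end.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB frames, so 0x100000 -> 0x100


def master_entries(shm_id, domid, begin, end, prot, x86_cacheattr="wb"):
    """Return the xenstore key/value pairs written for a master entry."""
    base = f"/local/shared_mem/{shm_id}"
    return {
        f"{base}/master": str(domid),
        f"{base}/begin": hex(begin >> PAGE_SHIFT),
        f"{base}/end": hex(end >> PAGE_SHIFT),
        f"{base}/permissions": prot,
        f"{base}/x86_cacheattr": x86_cacheattr,
        f"{base}/slaves": "",
    }


def slave_entries(shm_id, domid, begin, end, prot):
    """Return the xenstore key/value pairs added for one slave."""
    base = f"/local/shared_mem/{shm_id}/slaves/{domid}"
    return {
        f"{base}/begin": hex(begin >> PAGE_SHIFT),
        f"{base}/end": hex(end >> PAGE_SHIFT),
        f"{base}/permissions": prot,
    }
```

Returning plain dicts keeps the sketch independent of any xenstore client library; a real implementation would write these pairs through the toolstack's xenstore interface.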


When we encounter an id IDx during "xl create":

  + If it’s not under /local/shared_mem:
    + If the corresponding entry has "role=master", create the
      corresponding entries for IDx in xenstore
    + If there isn't a "master" tag, say error.

  + If it’s found under /local/shared_mem:
    + If the corresponding entry has a "master" tag, say error
    + If there isn't a "master" tag, map the pages to the newly
      created domain, and add the current domain and necessary information
      under /local/shared_mem/IDx/slaves.

Locks should be used to make sure that the creation of these entries is
atomic.
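
The decision procedure above can be sketched with a plain dict standing in for /local/shared_mem. Everything here is illustrative: handle_shm_entry and the dict layout are hypothetical, the page mapping is left as a comment, and the locking that the text calls for is omitted.

```python
def handle_shm_entry(store, shm_id, role, domid):
    """Apply the "xl create" rules for one static_shm entry.

    store is a dict standing in for /local/shared_mem (hypothetical).
    A real implementation would hold a lock around this whole function.
    """
    if shm_id not in store:
        if role != "master":
            raise RuntimeError(f"{shm_id}: master domain must be created first")
        store[shm_id] = {"master": domid, "slaves": {}}
        return "created"

    if role == "master":
        raise RuntimeError(f"{shm_id}: area already has a master")

    # Real code would map the master's pages into the new domain here,
    # before recording the slave.
    store[shm_id]["slaves"][domid] = {}
    return "mapped"
```

Creating vm2 before vm1 hits the first error path, and declaring a second master for an existing ID hits the second.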

======================================
2.3 mapping the memory areas
======================================
Handle the newly added config option in tools/{xl, libxl} and utilize
tools/libxc to do the actual memory mapping. Specifically, we will use
xc_domain_add_to_physmap_batch with XENMAPSPACE_gmfn_foreign to
do the actual mapping.

Unfortunately, we don't have the suitable API to change the catcheability

s/catcheability/cacheability/

and shareability attributes of the shared memory pages in the stage-2
page table. So these attributes are currently marked as "not implemented",
and xl should print an error if any of these attributes are set to their
non-default values (See 2.2.2 Syntax and Behavior).

They will be implemented when a suitable API becomes available.

======================================
2.4 error handling
======================================
Add code to handle various errors: Invalid address, invalid permissions, wrong
order of vm creation, mismatched length of memory area etc.

====================================================
3. Expected Outcomes/Goals:
====================================================
A new VM config option in xl will be introduced, allowing users to set up
several shared memory areas for inter-VM communication.
This should work on both x86 and ARM.

====================================================
4. Future Directions:
====================================================
Implement the prot, x86_* and arm_* memory attribute options.

Set up a notification channel between domains that are communicating through
shared memory regions. This allows one VM to signal its peers when data is
available in the shared memory or when the data in the shared memory has been
consumed. The channel could be built upon PPI or SGI.


[See also:
https://wiki.xenproject.org/wiki/Outreach_Program_Projects#Share_a_page_in_memory_from_the_VM_config_file]

Cheers,

--
Julien Grall

