[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC][PATCH 04/13] tools/libxl: detect and avoid conflicts with RDM



Campbell, Jackson, Wei and Stefano,

Any consideration?

I can follow up Jan's idea but I need you guys make sure I'm going to do this properly.

Thanks
Tiejun

On 2015/5/7 10:22, Chen, Tiejun wrote:
On 2015/5/6 23:34, Jan Beulich wrote:
On 06.05.15 at 17:00, <tiejun.chen@xxxxxxxxx> wrote:
On 2015/4/20 19:13, Jan Beulich wrote:
On 10.04.15 at 11:21, <tiejun.chen@xxxxxxxxx> wrote:
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -1665,6 +1665,46 @@ int xc_assign_device(
       return do_domctl(xch, &domctl);
   }

+struct xen_reserved_device_memory
+*xc_device_get_rdm(xc_interface *xch,
+                   uint32_t flag,
+                   uint16_t seg,
+                   uint8_t bus,
+                   uint8_t devfn,
+                   unsigned int *nr_entries)
+{

So what's the point of having both this new function and
xc_reserved_device_memory_map()? Is the latter useful for
anything besides the purpose here?

I just hope xc_reserved_device_memory_map() is a standard interface to
call that XENMEM_reserved_device_memory_map, but xc_device_get_rdm() can
handle some errors in current case.

I think you are hinting we just need one, right?

Correct. But remember - I'm not a maintainer of this code, so

But this may be a little complex with one...

maintainers may be of different opinion.

Anyway, let me ask our tools maintainers.

Campbell, Jackson, Wei and Stefano,

What about your concern to this?


+    struct xen_reserved_device_memory *xrdm = NULL;
+    int rc = xc_reserved_device_memory_map(xch, flag, seg, bus,
devfn, xrdm,
+                                           nr_entries);
+
+    if ( rc < 0 )
+    {
+        if ( errno == ENOBUFS )
+        {
+            if ( (xrdm = malloc(*nr_entries *
+
sizeof(xen_reserved_device_memory_t))) == NULL )
+            {
+                PERROR("Could not allocate memory.");

Now that's exactly the kind of error message that makes no sense:
As errno will already cause PERROR() to print something along the
lines of the message you provide here, you're just creating
redundancy. Indicating the purpose of the allocation, otoh, would
add helpful context for the one inspecting the resulting log.

What about this?

PERROR("Could not allocate memory buffers to store reserved device
memory entries.");

You kind of go from one extreme to the other - the message
doesn't need to be overly long, but it should be distinct from
all other messages (so that when seen one can identify what
went wrong).

I originally refer to some existing examples like this,

int
xc_core_arch_memory_map_get(xc_interface *xch, struct
xc_core_arch_context *unused,
                             xc_dominfo_t *info, shared_info_any_t
*live_shinfo,
                             xc_core_memory_map_t **mapp,
                             unsigned int *nr_entries)
{
     ...
     map = malloc(sizeof(*map));
     if ( map == NULL )
     {
         PERROR("Could not allocate memory");
         return -1;
     }

Maybe this is wrong to my case. Could I change this?

PERROR("Could not allocate memory for XENMEM_reserved_device_memory_map
hypercall");

Or just give me your line.


@@ -302,8 +300,11 @@ static int setup_guest(xc_interface *xch,

       for ( i = 0; i < nr_pages; i++ )
           page_array[i] = i;
-    for ( i = mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
-        page_array[i] += mmio_size >> PAGE_SHIFT;
+    /*
+     * This condition 'lowmem_end <= mmio_start' is always true.
+     */

For one I think you mean "The", not "This", as there's no such
condition around here. And then - why? DYM "is supposed to
always be true"? In which case you may want to check...

I always do this inside libxl__build_hvm() but before setup_guest(),

+    if (args.lowmem_size > mmio_start)
+        args.lowmem_size = mmio_start;

And plus, we also another policy to rdm,

      #1. Above a predefined boundary (default 2G)
          - move lowmem_end below reserved region to solve conflict;

This means there's such a likelihood of args.lowmem_size < mmio_start)
as well.

So here I'm saying the condition is always true.

Okay, but again - if this is relevant to the following code, an
assertion or alike may still be warranted.

Yes I should add 'assert()' here.


and hence don't have the final say on stylistic issues, I don't see
why the above couldn't be expressed with a single return statement.

Are you saying something like this? Note this was showed by yourself
long time ago.

I know, and hence I was puzzled to still see you use the more
convoluted form.

static bool check_mmio_hole_conflict(uint64_t start, uint64_t memsize,
                                        uint64_t mmio_start, uint64_t
mmio_size)
{
       return start + memsize > mmio_start && start < mmio_start +
mmio_size;
}

But I don't think this really can't work out our case.

It's equivalent to the original you had, so I don't see what you
mean with "this really can't work out our case".


Let me make this point clear.

The original implementation,

+static int check_rdm_hole(uint64_t start, uint64_t memsize,
+                          uint64_t rdm_start, uint64_t rdm_size)
+{
+    if (start + memsize <= rdm_start || start >= rdm_start + rdm_size)
+        return 0;
+    else
+        return 1;
+}

means it returns 'false' in two cases:

#1. end = start + memsize; end <= rdm_start;

This region [start, end] is below of rdm entry.

#2. rdm_end = rdm_start + rdm_size; stat >= rdm_end;

This region [start, end] is above of rdm entry.

So others conditions should indicate that rdm entry is overlapping with
this region. Actually this has three cases:

#1. This region just conflicts with the first half of rdm entry;
#2. This region just conflicts with the second half of rdm entry;
#3. This whole region falls inside of rdm entry;

Then it should return 'true', right?

But with this single line,

return start + memsize > rdm_start && start < rdm_start + rdm_size;

=>

return end > rdm_start && start < rdm_end;

This just guarantee it return 'true' *only* if #3 above occurs.

+int libxl__domain_device_check_rdm(libxl__gc *gc,
+                                   libxl_domain_config *d_config,
+                                   uint64_t rdm_mem_guard,
+                                   struct xc_hvm_build_args *args)
+{
+    int i, j, conflict;
+    libxl_ctx *ctx = libxl__gc_owner(gc);
+    struct xen_reserved_device_memory *xrdm = NULL;
+    unsigned int nr_all_rdms = 0;
+    uint64_t rdm_start, rdm_size, highmem_end = (1ULL << 32);
+    uint32_t type = d_config->b_info.rdm.type;
+    uint16_t seg;
+    uint8_t bus, devfn;
+
+    /* Might not to expose rdm. */
+    if ((type == LIBXL_RDM_RESERVE_TYPE_NONE) &&
!d_config->num_pcidevs)
+        return 0;
+
+    /* Collect all rdm info if exist. */
+    xrdm = xc_device_get_rdm(ctx->xch, LIBXL_RDM_RESERVE_TYPE_HOST,
+                             0, 0, 0, &nr_all_rdms);

What meaning has passing a libxl private value to a libxc function?

We intend to collect all rdm entries info in advance and then we can
construct d_config->rdms based on our policies as follows. Because we
need to first allocate d_config->rdms properly to store rdms, but in
some cases we don't know how many buffers are enough. For example, we
don't have that global flag but with multiple pci devices. And even a
shared entry worsen this situation.

So here, we set that flag as LIBXL_RDM_RESERVE_TYPE_HOST but without any
SBDF to grab all rdms.

I'm afraid you didn't get my point: Values passed to libxc should be

Sorry for this misunderstanding.

known to libxc. Values privately defined by libxl for its own purposes
aren't known to libxc, and hence shouldn't be passed to libxc
functions.

I think we should set this with 'PCI_DEV_RDM_ALL' since,

struct xen_reserved_device_memory_map {
     /* IN */
     /* Currently just one bit to indicate checkng all Reserved Device
Memory. */
#define PCI_DEV_RDM_ALL   0x1


+     * 'try' policy is specified, and we also mark this as INVALID
not to expose
+     * this entry to hvmloader.

What is "this" in "... also mark this as ..."? Certainly neither the
conflict
nor the warning.

Sorry, this is my fault.

       * If a conflict is detected on a given RMRR entry, an error
will be
       * returned if 'strict' policy is specified. Or conflict is
treated as a
       * warning if 'relaxed' policy is specified, and we also mark
this as
       * INVALID not to expose this entry to hvmloader.

The same "this" still doesn't have anything reasonable it references. I
think you mean "the entry" (in which case the subsequent "this entry"
could become just "it" afaict). But (not being a native speaker) the
grammar of the second half of the sentence looks odd (and hence
potentially confusing) to me anyway (i.e. even with the previous

Sure, we need to make this better and clear.

issue fixed).

      * If a conflict is detected on a given RMRR entry, an error will be
      * returned if 'strict' policy is specified. Instead, if 'relaxed'
policy
      * specified, this conflict is treated just as a warning, but we
mark this
      * RMRR entry as INVALID to indicate that this entry shouldn't be
exposed
      * to hvmloader.

I hope this can help us understand what we do.


+     *
+     * Firstly we should check the case of rdm < 4G because we may
need to
+     * expand highmem_end.
+     */
+    for (i = 0; i < d_config->num_rdms; i++) {
+        rdm_start = d_config->rdms[i].start;
+        rdm_size = d_config->rdms[i].size;
+        conflict = check_rdm_hole(0, args->lowmem_size, rdm_start,
rdm_size);
+
+        if (!conflict)
+            continue;
+
+        /*
+         * Just check if RDM > our memory boundary
+         */
+        if (d_config->rdms[i].start > rdm_mem_guard) {
+            /*
+             * We will move downwards lowmem_end so we have to expand
+             * highmem_end.
+             */
+            highmem_end += (args->lowmem_size - rdm_start);
+            /* Now move downwards lowmem_end. */
+            args->lowmem_size = rdm_start;

Considering that the action here doesn't depend on the specific
->rdms[] slot being looked at, I don't see why the loop needs to

I'm not sure if I understand what you mean.

All rdm entries are organized disorderly in d_config->rdms, so we should
traverse all entries to make sure args->lowmem_size is below all rdms'
start address.

I think I see what confused me: in the if() condition you reference
d_config->rdms[i].start, yet the body of the if() has no reference
to d_config->rdms[i] at all. If the if() used rdm_start it would
become obvious that this is being latched at the beginning of the

Indeed, I really should use rdm_start here.

body (which is what I overlooked, assuming the variable's value
to have got set prior to the loop), and hence the body is not loop
invariant.


So just replace d_config->rdms[i].start with rdm_start like this,

         /*
          * Just check if RDM > our memory boundary
          */
         if (rdm_start > rdm_mem_guard) {
             /*
              * We will move downwards lowmem_end so we have to expand
              * highmem_end.
              */
             highmem_end += (args->lowmem_size - rdm_start);
             /* Now move downwards lowmem_end. */
             args->lowmem_size = rdm_start;
         }
     }

Thanks
Tiejun

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.