[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [v7][RFC][PATCH 06/13] hvmloader/ram: check if guest memory is out of reserved device memory maps



>>> On 24.10.14 at 09:34, <tiejun.chen@xxxxxxxxx> wrote:
> We need to check to reserve all reserved device memory maps in e820
> to avoid any potential guest memory conflict.
> 
> Currently, if we can't insert RDM entries directly, we may need to handle
> several ranges as follows:
> a. Fixed Ranges --> BUG()
>  lowmem_reserved_base-0xA0000: reserved by BIOS implementation,
>  BIOS region,
>  RESERVED_MEMBASE ~ 0x100000000,

This seems conceptually wrong to me, and I said so before:
Depending on host characteristics this approach may mean you're
going to be unable to build any HVM guests. Minimally there needs
to be a way to avoid these checks (resulting in devices associated
with RMRRs not being assignable to such a guest). I'm therefore
only going to briefly look at the rest of this patch.

> +static unsigned int construct_rdm_e820_maps(unsigned int 
> next_e820_entry_index,
> +                                            uint32_t nr_map,
> +                                            struct 
> xen_mem_reserved_device_memory *map,
> +                                            struct e820entry *e820,
> +                                            unsigned int 
> lowmem_reserved_base,
> +                                            unsigned int bios_image_base)
> +{
> +    unsigned int i, j, sum_nr = next_e820_entry_index + nr_map;
> +    uint64_t start, end, next_start, rdm_start, rdm_end;
> +    uint32_t type;
> +    unsigned int insert = 0, do_insert = 0;
> +    int err = 0;
> +
> + do_real_construct:
> +    for ( i = 0; i < nr_map; i++ )
> +    {
> +        rdm_start = map[i].start_pfn << PAGE_SHIFT;
> +        rdm_end = rdm_start + (map[i].nr_pages << PAGE_SHIFT);
> +
> +        for ( j = 0; j < next_e820_entry_index - 1; j++ )
> +        {
> +            start = e820[j].addr;
> +            end = e820[j].addr + e820[j].size;
> +            type = e820[j].type;
> +            next_start = e820[j+1].addr;
> +
> +            /* lowmem_reserved_base-0xA0000: reserved by BIOS 
> implementation. */
> +            if ( lowmem_reserved_base < 0xA0000 &&
> +                 start == lowmem_reserved_base )
> +            {
> +                if ( rdm_start >= start && rdm_start <= end )
> +                {
> +                    err = -1;
> +                    break;
> +                }
> +            }
> +
> +            /*
> +             * BIOS region.
> +             */
> +            if ( start == bios_image_base )
> +            {
> +                if ( rdm_start >= start && rdm_start <= end )
> +                {
> +                    err = -1;
> +                    break;
> +                }
> +            }
> +
> +            /* The default memory map always occupy one fixed reserved
> +             * range: RESERVED_MEMBASE ~ 0x100000000
> +             */
> +            if ( rdm_start >= RESERVED_MEMBASE &&
> +                      rdm_start <= ((uint64_t)1 << 32) )
> +            {
> +                err = -1;
> +                break;
> +            }
> +
> +            /* Just amid those remaining e820 entries. */
> +            if ( (rdm_start > end) && (rdm_end < next_start) )
> +            {
> +                if ( do_insert )
> +                {
> +                    memmove(&e820[j+2], &e820[j+1],
> +                            (sum_nr - j - 1) * sizeof(struct e820entry));
> +
> +                    /* Then fill RMRR into that entry. */
> +                    e820[j+1].addr = rdm_start;
> +                    e820[j+1].size = rdm_end - rdm_start;
> +                    e820[j+1].type = E820_RESERVED;
> +                    next_e820_entry_index++;
> +                }
> +                insert++;
> +            }
> +            /* Already at the end. */
> +            else if ( (rdm_start > end) && !next_start )
> +            {
> +                if ( do_insert )
> +                {
> +                    e820[next_e820_entry_index].addr = rdm_start;
> +                    e820[next_e820_entry_index].size = rdm_end - rdm_start;
> +                    e820[next_e820_entry_index].type = E820_RESERVED;
> +                    next_e820_entry_index++;
> +                }
> +                insert++;
> +            }
> +            /* If completely overlap with one RAM range. */
> +            else if ( rdm_start == start && rdm_end == end && type == 
> E820_RAM )

Comment and expression disagree.

> +            {
> +                if ( do_insert )
> +                    e820[j].type = E820_RESERVED;
> +                insert++;
> +            }
> +            /* If we're just alligned with start of one RAM range. */
> +            else if ( rdm_start == start && rdm_end < end && type == 
> E820_RAM )
> +            {
> +                if ( do_insert )
> +                {
> +                    memmove(&e820[j+1], &e820[j],
> +                            (sum_nr - j) * sizeof(struct e820entry));
> +
> +                    e820[j+1].addr = rdm_end;
> +                    e820[j+1].size = e820[j].addr + e820[j].size - rdm_end;
> +                    e820[j+1].type = E820_RAM;
> +                    next_e820_entry_index++;
> +
> +                    e820[j].addr = rdm_start;
> +                    e820[j].size = rdm_end - rdm_start;
> +                    e820[j].type = E820_RESERVED;
> +                }
> +                insert++;
> +            }
> +            /* If we're just alligned with end of one RAM range. */
> +            else if ( rdm_start > start && rdm_end == end && type == 
> E820_RAM )
> +            {
> +                if ( do_insert )
> +                {
> +                    memmove(&e820[j+1], &e820[j],
> +                            (sum_nr - j) * sizeof(struct e820entry));
> +
> +                    e820[j].size = rdm_start - e820[j].addr;
> +                    e820[j].type = E820_RAM;
> +
> +                    e820[j+1].addr = rdm_start;
> +                    e820[j+1].size = rdm_end - rdm_start;
> +                    e820[j+1].type = E820_RESERVED;
> +                    next_e820_entry_index++;
> +                }
> +                insert++;
> +            }
> +            /* If we're just in of one RAM range */
> +            else if ( rdm_start > start && rdm_end < end && type == E820_RAM 
> )
> +            {
> +                if ( do_insert )
> +                {
> +                    memmove(&e820[j+2], &e820[j],
> +                            (sum_nr - j) * sizeof(struct e820entry));
> +
> +                    e820[j+2].addr = rdm_end;
> +                    e820[j+2].size = e820[j].addr + e820[j].size - rdm_end;
> +                    e820[j+2].type = E820_RAM;
> +                    next_e820_entry_index++;
> +
> +                    e820[j+1].addr = rdm_start;
> +                    e820[j+1].size = rdm_end - rdm_start;
> +                    e820[j+1].type = E820_RESERVED;
> +                    next_e820_entry_index++;
> +
> +                    e820[j].size = rdm_start - e820[j].addr;
> +                    e820[j].type = E820_RAM;
> +                }
> +                insert++;
> +            }
> +            /* If we're going last RAM:Hole range */
> +            else if ( end < next_start &&
> +                      rdm_start > start &&
> +                      rdm_end < next_start &&
> +                      type == E820_RAM )
> +            {
> +                if ( do_insert )
> +                {
> +                    memmove(&e820[j+1], &e820[j],
> +                            (sum_nr - j) * sizeof(struct e820entry));
> +
> +                    e820[j].size = rdm_start - e820[j].addr;
> +                    e820[j].type = E820_RAM;
> +
> +                    e820[j+1].addr = rdm_start;
> +                    e820[j+1].size = rdm_end - rdm_start;
> +                    e820[j+1].type = E820_RESERVED;
> +                    next_e820_entry_index++;
> +                }
> +                insert++;
> +            }

This if-else-if series looks horrible - is there really no way to consolidate
it? Also, other than punching holes in the E820 map you don't seem to
be doing anything here. And the earlier tools side patches didn't do
anything about this either. Consequently, at the time where it may
become necessary to establish the 1:1 mapping in the P2M, there'll
be the RAM mapping still there, causing the device assignment to fail.
Again - did you _test_ this scenario with a big enough guest and your
gfx card passed through?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.