[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/setup: do not relocate below the end of current Xen image placement



On Tue, Nov 28, 2017 at 07:28:25AM -0700, Jan Beulich wrote:
> >>> On 28.11.17 at 13:41, <daniel.kiper@xxxxxxxxxx> wrote:
> > On Mon, Nov 27, 2017 at 09:51:56AM -0700, Jan Beulich wrote:
> >> >>> On 27.11.17 at 16:41, <daniel.kiper@xxxxxxxxxx> wrote:
> >> > If it is possible we would like to have the Xen image higher than the
> >> > booloader put it and certainly do not overwrite the Xen code and data
> >> > during copy/relocation. Otherwise the Xen may crash silently at boot.
> >>
> >> Is this something that you've actually observed happening? I'd be
> >> curious about the particular numbers if so.
> >
> > We were hit by the issue in OVS Xen 4.4 with my earlier version of
> > EFI/Multiboot2 patches. Initially its implementation allowed relocation
> > of Xen even if it was relocated by the bootloader. This led to the
> > crashes on some new Oracle machines because copy destination partially
> > overlapped with the end of current/initial Xen image placement.
> >
> > After some discussion here we decided to disable Xen relocation in my
> > EFI/Multiboot2 upstream patches if the booloader did the work for us.
> > Though one case is still not covered. If Xen is not relocated by the
> > booloader then it tries to do that by itself. If all RAM regions above
> > currently occupied one are unsuitable for relocation then Xen tries move
> > itself higher in it. And if (end - reloc_size + XEN_IMG_OFFSET) goes
> > below _end then copy/relocation destination overlaps, at least partially,
> > with its source.
> >
> > I can agree that this should not happen on todays machines very often.
> > If at all. It is rather unusual to not have usable RAM regions above
> > ~5 MiB nowadays. Though I think that we should at least consider putting
> > such safety measure here. Otherwise Xen may crash mysteriously without
> > any stack trace. So, as you may imagine it is very confusing and impairs
> > further debugging. However, due to rarity of the issue I am not going to
> > insist very strongly here.
>
> I don't mind taking care of this case, as long as everything is
> being properly described.

OK, great!

[...]

> > OK, we have below:
> >
> > [...]
> >
> > e = end - reloc_size;
> >
> > [...]
> >
> > move_memory(e + XEN_IMG_OFFSET, XEN_IMG_OFFSET, _end - _start, 1);
> >
> > So, we have: e + XEN_IMG_OFFSET >= XEN_IMG_OFFSET + _end - _start
> >
> > We can drop XEN_IMG_OFFSET, so we have: e >= _end - _start
> >
> > However, we cannot use "e" in the condition, so we have:
> >   end - reloc_size >= _end - _start
>
> Okay, so this implies an original base address of zero. In that case,

Yep!

> however, we can't possibly move Xen downwards; looks like I've
> misunderstood the title/description: You're really worried about an
> overlap of the regions. From description and comment I was getting
> the impression that the goal would be to cover more cases (including
> ones where Xen wasn't loaded at the lowest possible address).
> Could you try to make this more clear?

Sure thing!

> Furthermore I'm not convinced the condition needs to be as strict
> as you make it now: As long as move_memory() can deal with
> overlapping areas, a partial overlap looks to be okay. That would
> allow for more cases where Xen does actually get moved even
> with very little memory.

This made me suspicious. I spent some time looking at the code and
realized that the problem lies a few lines below.

for ( j = 0; j < L3_PAGETABLE_ENTRIES; j++, pl3e++ )
{
    /* Not present, 1GB mapping, or already relocated? */
    if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) ||
         (l3e_get_flags(*pl3e) & _PAGE_PSE) ||
         (l3e_get_pfn(*pl3e) > PFN_DOWN(xen_phys_start)) )
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Similar conditions are also below. These things mean that if xen_phys_start 
falls
below at least one current Xen code/data 2 MiB mapping then everything which was
originally more or less between xen_phys_start - _end is not mapped later...

So, it looks that the condition should look like this right now:

xen_phys_start >= XEN_IMG_OFFSET + _end - _start

xen_phys_start == e
e = end - reloc_size

end - reloc_size >= XEN_IMG_OFFSET + _end - _start

So, finally it should be:

if ( ( end > s ) && (end - reloc_size >= XEN_IMG_OFFSET + _end - _start) )

> Another option is to consider not moving Xen based on other
> criteria: The main goal here is to free up memory below 16Mb. If
> the machine has no memory above 16Mb, moving Xen around
> won't do any good in this regard.

This is also an option. However, I am not sure why everything below
16 MiB is so precious. I have a feeling that it is related somehow
to ancient PC days but I do not remember details.

Daniel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.