|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 1/3] x86/boot: Drop pte_update_limit from physical relocation logic
On 10/12/2021 07:17, Jan Beulich wrote:
> On 09.12.2021 18:34, Andrew Cooper wrote:
>> On 07/12/2021 11:37, Jan Beulich wrote:
>>> On 07.12.2021 11:53, Andrew Cooper wrote:
>>>> --- a/xen/arch/x86/setup.c
>>>> +++ b/xen/arch/x86/setup.c
>>>> @@ -1230,7 +1230,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>>> l3_pgentry_t *pl3e;
>>>> l2_pgentry_t *pl2e;
>>>> int i, j, k;
>>>> - unsigned long pte_update_limit;
>>>>
>>>> /* Select relocation address. */
>>>> xen_phys_start = end - reloc_size;
>>>> @@ -1238,14 +1237,6 @@ void __init noreturn __start_xen(unsigned long
>>>> mbi_p)
>>>> bootsym(trampoline_xen_phys_start) = xen_phys_start;
>>>>
>>>> /*
>>>> - * No PTEs pointing above this address are candidates for
>>>> relocation.
>>>> - * Due to possibility of partial overlap of the end of source
>>>> image
>>>> - * and the beginning of region for destination image some
>>>> PTEs may
>>>> - * point to addresses in range [e, e + XEN_IMG_OFFSET).
>>>> - */
>>>> - pte_update_limit = PFN_DOWN(e);
>>> ... considering the comment here: Isn't it 0d31d1680868 ("x86/setup: do
>>> not relocate Xen over current Xen image placement") where need for this
>>> disappeared? Afaict the non-overlap of source and destination is the
>>> crucial factor here, yet your description doesn't mention this aspect at
>>> all. I'd therefore like to ask for an adjustment there.
>> I don't consider that commit relevant.
>>
>> There is no circumstance ever where you can relocate Xen with
>> actually-overlapping ranges, because one way or another, you'd end up
>> copying non-pagetable data over the live pagetables.
> That was fragile, yes. I think I (vaguely!) recall a discussion I had
> with Keir about this, with him pointing out that the logic builds upon
> all necessary mappings being held in the TLB. If you strictly think
> that's not worthwhile to consider or mention, then so be it.
Ok, I'll tweak the commit message. But having come back to this after
more than a year away, I think it's worth pointing out that the details
in 0d31d1680868 demonstrate that the "partial overlap" logic was indeed
buggy.
In a VM, a vcpu migration at just the wrong moment will empty the TLB
under your feet. So will a whole slew of micro-architectural
conditions, or a poorly timed SMI.
Whatever people have tried to call it in the past, it was broken plain
and simple.
~Andrew
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |