
Re: [PATCH] x86/hyperv: Adjust hypercall page placement



I worked on Xen-on-Azure last summer in my previous position. Allocating a heap page was how we solved this particular issue in our branch as well.

I see you say you're still 'working on bringing Xen up on Azure', so I'm not sure how far along your branch / patch set is, but for what it's worth, we had "everything" working, including use of VMBus and passthrough devices in dom0.

The Azure version of the product got put on ice, and the code wasn't in a very 'upstreamable' state due to how patches/branches were managed. The bottom line is that I know every bump on the Xen-on-Azure road, so I may be able to help get an upstream version of this 'the rest of the way' if you start pushing it to xen-devel.

Cheers,
Trolle

On Thu, Apr 24, 2025 at 4:47 PM Alejandro Vallejo <agarciav@xxxxxxx> wrote:
On Thu Apr 24, 2025 at 7:22 PM BST, Ariadne Conill wrote:
> Hi,
>
>> On Apr 24, 2025, at 6:48 AM, Alejandro Vallejo <agarciav@xxxxxxx> wrote:
>>
>> On Thu Apr 24, 2025 at 1:45 PM BST, Alejandro Vallejo wrote:
>>> Xen nowadays crashes under some Hyper-V configurations when
>>> paddr_bits > 36. At the 44-bit boundary we reach an edge case in which
>>> the end of the guest physical address space is not representable using
>>> 32-bit MFNs. Furthermore, it's an act of faith that the tail of the
>>> physical address space has no reserved regions already.
>>>
>>> This commit uses the first unused MFN rather than the last, thus
>>> ensuring the hypercall page placement is more resilient against such
>>> corner cases.
>>>
>>> While at it, add an extra BUG_ON() to explicitly test for the
>>> hypercall page being correctly set, and mark hcall_page_ready as
>>> __ro_after_init.
>>>
>>> Fixes: 620fc734f854 ("x86/hyperv: setup hypercall page")
>>> Signed-off-by: Alejandro Vallejo <agarciav@xxxxxxx>
>>
>> After a side discussion, this seems on the unsafe side of things due to
>> potential collision with MMIO. I'll resend (though not today) with the
>> page overlapping a RAM page instead. Possibly the last page of actual
>> RAM.
>
> We have been working on bringing Xen up on Azure over at Edera, and
> have encountered this problem. Our solution was to change Xen to
> handle the hypercall trampoline page in the same way as Linux:
> dynamically allocating a page from the heap and then marking it
> as executable.
>
> This approach should avoid the issues with MMIO and page overlaps.

Yes, that's what I meant by overlapping RAM. Overlaying the hypercall
page on top of existing RAM rather than trying to find a suitable hole.

> Would it be more interesting to start with our patch instead?

If you have it ready to go, for sure. My ability to test any of this is
fairly limited. I suspect the VM is just not getting 48 bits worth of
guest-physical address space, and so making any hypercall turns into an
EPT violation.

I couldn't run the tests that would definitively prove it, though.

From the little I saw of the dmesg going forward, I suspect there's more
required (at least in time handling) to enable support on Gen2
instances.

Cheers,
Alejandro


