Re: [Xen-devel] Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
On 10/01/17 15:02, Razvan Cojocaru wrote:
> On 01/10/2017 04:13 PM, Andrew Cooper wrote:
>> On 10/01/17 09:06, Razvan Cojocaru wrote:
>>> On 01/09/2017 02:54 PM, Andrew Cooper wrote:
>>>> On 09/01/17 11:36, Razvan Cojocaru wrote:
>>>>> Hello,
>>>>>
>>>>> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
>>>>> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
>>>>> to eat up all the RAM it can:
>>>>>
>>>>> (XEN) [ 394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544
>>>>>
>>>>> This leads to a problem with xen-access, specifically libxc which does
>>>>> this in xc_vm_event_enable() (this is Xen 4.6):
>>>>>
>>>>>     ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
>>>>>                                      &mmap_pfn, 1);
>>>>>
>>>>>     if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>>>>>     {
>>>>>         /* Map failed, populate ring page */
>>>>>         rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>>>>>                                                &ring_pfn);
>>>>>         if ( rc1 != 0 )
>>>>>         {
>>>>>             PERROR("Failed to populate ring pfn\n");
>>>>>             goto out;
>>>>>         }
>>>>>
>>>>> The first time everything works fine: xen-access can map the ring page.
>>>>> The second attempt, however, usually fails in the
>>>>> xc_domain_populate_physmap_exact() call, and this is again dumped in the
>>>>> Xen log (once for each failed attempt):
>>>>>
>>>>> (XEN) [ 395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544
>>>> Thinking further about this, what happens if you avoid removing the page
>>>> on exit?
>>>>
>>>> The first populate succeeds, and if you leave the page populated, the
>>>> second time you come around the loop, it should not be of type XTAB, and
>>>> the map should succeed.
>>> Sorry for the late reply, had to put out another fire yesterday.
>>>
>>> I've taken your recommendation to roughly mean this:
>>>
>>> diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
>>> index ba9690a..805564b 100644
>>> --- a/xen/common/vm_event.c
>>> +++ b/xen/common/vm_event.c
>>> @@ -100,8 +100,11 @@ static int vm_event_enable(
>>> return 0;
>>>
>>> err:
>>> + /*
>>> destroy_ring_for_helper(&ved->ring_page,
>>> ved->ring_pg_struct);
>>> + */
>>> + ved->ring_page = NULL;
>>> vm_event_ring_unlock(ved);
>>>
>>> return rc;
>>> @@ -229,9 +232,12 @@ static int vm_event_disable(struct domain *d, struct vm_event_domain *ved)
>>> }
>>> }
>>>
>>> + /*
>>> destroy_ring_for_helper(&ved->ring_page,
>>> ved->ring_pg_struct);
>>> + */
>>>
>>> + ved->ring_page = NULL;
>>> vm_event_cleanup_domain(d);
>>>
>>> vm_event_ring_unlock(ved);
>>>
>>> but this unfortunately still fails to map the page the second time. Do
>>> you mean to simply no longer munmap() the ring page from libxc / the
>>> client application?
>> Neither.
>>
>> First of all, I notice that this is probably buggy:
>>
>>     ring_pfn = pfn;
>>     mmap_pfn = pfn;
>>     rc1 = xc_get_pfn_type_batch(xch, domain_id, 1, &mmap_pfn);
>>     if ( rc1 || mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>>     {
>>         /* Page not in the physmap, try to populate it */
>>         rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>>                                                &ring_pfn);
>>         if ( rc1 != 0 )
>>         {
>>             PERROR("Failed to populate ring pfn\n");
>>             goto out;
>>         }
>>     }
>>
>> A failure of xc_get_pfn_type_batch() is not a suggestion that population
>> might work.
>>
>>
>> What I meant was taking out this call:
>>
>>     /* Remove the ring_pfn from the guest's physmap */
>>     rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0,
>>                                                &ring_pfn);
>>     if ( rc1 != 0 )
>>         PERROR("Failed to remove ring page from guest physmap");
>>
>> To leave the frame in the guest physmap. The issue is fundamentally
>> that after this frame has been taken out, something kicks the VM to
>> realise it has an extra frame of balloonable space, which it clearly
>> compensates for.
>>
>> You can work around the added attack surface by marking it RO in EPT;
>> neither Xen's nor dom0's mappings are translated via EPT, so they can
>> still make updates, but the guest won't be able to write to it.
>>
>> I should say that this is all a gross hack, and is in desperate need of
>> a proper API to make rings entirely outside of the gfn space, but this
>> hack should work for now.
> Thanks! So far, it seems to work like a charm like this:
Great.
>
> diff --git a/tools/libxc/xc_vm_event.c b/tools/libxc/xc_vm_event.c
> index 2fef96a..5dd00a6 100644
> --- a/tools/libxc/xc_vm_event.c
> +++ b/tools/libxc/xc_vm_event.c
> @@ -130,9 +130,17 @@ void *xc_vm_event_enable(xc_interface *xch, domid_t domain_id, int param,
> }
>
> /* Remove the ring_pfn from the guest's physmap */
> + /*
> rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0,
> &ring_pfn);
> if ( rc1 != 0 )
> PERROR("Failed to remove ring page from guest physmap");
> + */
> +
> + if ( xc_set_mem_access(xch, domain_id, XENMEM_access_r, mmap_pfn, 1) )
> + {
> + PERROR("Could not set ring page read-only\n");
> + goto out;
> + }
>
> out:
> saved_errno = errno;
>
> Should I send this as a patch for mainline as well?
Probably a good idea, although I would include a code comment explaining
what is going on, because this is subtle if you don't know the context.
~Andrew
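
Editorial sketch (not part of the original message): the accounting problem discussed in this thread can be illustrated with a small, self-contained C model. Nothing below is Xen or libxc code. The struct, the `model_*` function names, and the starting page counts are hypothetical, chosen only so that the failure reproduces the "524545 > 524544" figure from the log. The model assumes, as suggested in the thread, that the guest reclaims any frame of headroom that appears once the ring page has been removed from its physmap.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, heavily simplified model of per-domain page accounting. */
struct domain_model {
    unsigned long tot_pages;  /* frames currently allocated to the domain */
    unsigned long max_pages;  /* allocation limit for the domain */
    bool ring_populated;      /* is the ring frame present in the physmap? */
};

/* Models populating one frame: refused if it would exceed the limit. */
static bool model_populate(struct domain_model *d)
{
    if (d->tot_pages + 1 > d->max_pages) {
        printf("Over-allocation: %lu > %lu\n",
               d->tot_pages + 1, d->max_pages);
        return false;
    }
    d->tot_pages++;
    return true;
}

/* Models the guest noticing spare headroom and claiming all of it
 * (an assumption about guest behaviour, per the discussion above). */
static void model_guest_reclaim(struct domain_model *d)
{
    while (d->tot_pages < d->max_pages)
        d->tot_pages++;
}

/* Models one enable cycle. If remove_ring_page is true, the ring frame is
 * taken back out of the physmap afterwards, as the original code did. */
static bool model_enable(struct domain_model *d, bool remove_ring_page)
{
    if (!d->ring_populated) {
        if (!model_populate(d))
            return false;          /* corresponds to the populate failure */
        d->ring_populated = true;
    }
    if (remove_ring_page) {
        d->tot_pages--;            /* frame removed: one frame of headroom */
        d->ring_populated = false;
    }
    return true;
}
```

Starting from a domain one frame below its limit, calling `model_enable(&d, true)`, then `model_guest_reclaim(&d)`, then `model_enable(&d, true)` again makes the second call fail, because the guest has already consumed the freed frame. With `remove_ring_page` set to `false`, the ring frame stays populated and the second call succeeds without a new allocation, which is the behaviour the workaround relies on.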
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel