Xen project Mailing List

Re: [Xen-devel] vfree crash

To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Petre Ovidiu PIRCALABU <ppircalabu@xxxxxxxxxxxxxxx>, Paul Durrant <Paul.Durrant@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: Jan Beulich <JBeulich@xxxxxxxx>

Date: Wed, 3 Jul 2019 13:53:55 +0000

Accept-language: en-US

Authentication-results: spf=none (sender IP is ) smtp.mailfrom=JBeulich@xxxxxxxx;

Cc: Alexandru Stefan ISAILA <aisaila@xxxxxxxxxxxxxxx>, "rcojocaru@xxxxxxxxxxxxxxx" <rcojocaru@xxxxxxxxxxxxxxx>

Delivery-date: Wed, 03 Jul 2019 13:55:01 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Thread-index: AQHVLdtKR3tMnDvHzU6ntSjZrlUsUaa1cLWAgAAGYACAAANV/oAAHoaAgANRnjCAAAc1AA==

Thread-topic: [Xen-devel] vfree crash

On 03.07.2019 15:27, Andrew Cooper wrote: > On 01/07/2019 11:47, Jan Beulich wrote: >> On 01.07.2019 10:56, Andrew Cooper wrote: >>> On 01/07/2019 09:45, Petre Ovidiu PIRCALABU wrote: >>>> The problem lies with vfree because it creates a new list with the >>>> pages, unmaps the va pointer and then frees the pages. If I do these >>>> steps manually (without adding them to a new list) it works. >>> The problem here is that struct page_info only has a single linked list >>> pointer, and vfree() blindly assumes it is available for use, which >>> isn't true once you've called assign_pages() on the vmap area. >>> >>> At the moment, it doesn't look like it is possible to set v*alloc()'d >>> pages up suitably to be mapped by a guest. (Similar corruption will >>> occur via share_xen_page_with_guest() and the xenheap list). >> Well, whoever assigns pages to a domain behind vmalloc()'s back has got >> to make sure to de-assign those pages before freeing them. > > Why? Or perhaps more accurately, where is any of this written down. I take this to be a rhetorical question. > Allocation of memory seems logically unrelated to making it mappable by > guests, so when vmalloc() *is* the correct allocation function to use, > the fact that assign_pages() results in vfree() silently corrupting the > domains memory list is unexpected behaviour. At best this depends on the position you take. Assignment of pages _always_ transfers their freeing to the refcounting machinery. You won't find many free_domheap_pages() matching alloc_domheap_pages() with a non-NULL d (and no special flags). They'll be paired with put_page() instead. The ones you will find (like in memory_exchange() have special precautions taken up front). >> An alternative >> _might_ be to leave freeing to the normal cleanup processes (when the >> last page ref gets put), and just vunmap()-ing the range when the mapping >> isn't needed anymore. > > So this is what I suggested as an interim solution, but I'm not sure if > it is a sensible option longterm. > > The scenario here is for the "vm-event-ng" interface which was posted as > an RFC earlier. Several key purposes for the new interface is to be a > slot-per-vcpu, and to be usable via the acquire_resource infrastructure. > > struct vm_event_st is currently 384 bytes, which is only 10 full structs > per page. The size of the structure is liable to change over time, and > most likely won't evenly divide a page, so vmalloc() is the correct > allocation interface within Xen. > > The alloc and free in this case is being done as a side effect of the > vmi enable/disable calls. The lifetime of the VMI interface isn't the > same as the lifetime of the VM. > > With HVI specifically, the SVA VM can reboot, and it needs to re-attach > to the protected VMs. There are other load balancing scenarios where > the protection of a VM might logically move between two different SVAs. > > In either case, retaining the first vmalloc() will result in a failure > to remap the ring, as the domain assignment will be to the old domid. > > Therefore, I think it is important to be able to fully disable and clean > up the VMI interface at some point before the protected VM is destroyed. > > As a result, I think the proper fix here is to modify vfree() not to > clobber the pagelist. > > Thoughts? Beyond what I've said above already, from your description I imply that it's the monitoring domain which is to own the page. Yet that's in conflict with you also saying that this domain may want rebooting. In such a case, the pages need to either not be owned by that domain, or they need to be re-allocated during every attach operation. Furthermore - what use would not clobbering the list be? There would still be the call to free_domheap_page() there, which is legitimate only if the page (still having an owner) has just been transitioned to a zero general ref count. What might be possible here is for ( i = 0; i < pages; i++ ) { struct page_info *page = vmap_to_page(va + i * PAGE_SIZE); ASSERT(page); if ( <whatever> ) put_page(pg) else page_list_add(page, &pg_list); } Or maybe this is actually what you have been thinking of. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.