[Xen-devel] Re: [RFC][PATCH] Basic support for page offline
At 03:54 -0500 on 09 Feb (1234151686), Jiang, Yunhong wrote:
> Hi, Tim, this patchset tries to support page offline requests. I want to
> get some initial feedback before more testing.

I haven't had a chance to read the patches in detail yet, but my initial
impression is that:

 - The general approach so far seems good (I suspect that your 2.3
   stage below could also be done like 2.2 without a full live
   migration, but since that's not implemented yet that's fine).

 - It seems like a lot of code for what it does.  On the Xen side that's
   just a general impression since I'm not familiar with the bits of
   the heap allocators that you're changing.  In libxc you seem to have
   duplicated parts of the save/restore code -- better to make those
   routines externally visible to the rest of libxc and call them from
   your new function.

 - Like all systems code everywhere, it needs more comments. :)  You've
   introduced some generic-sounding functions (adjust_pte &c) without
   describing what they do.

I'll have more detailed comments later in the week, I hope.

Cheers,

Tim.

> Page offline can be used in multiple usage models; below are some examples:
> a) If too many correctable errors happen to one page, management tools may
>    try to offline the page to avoid more severe errors in future;
> b) When a page has an ECC error that can't be recovered by hardware, Xen's
>    MCA handler may try to offline the page, so that it will not be accessed
>    anymore.
> c) Offline some DIMM for power management etc. (Of course, this is far more
>    than simple page offline.)
>
> The basic idea to offline a page is:
> 1) If a page is free, it will be removed from the page allocator.
> 2) If a page is in use, the owner will be checked:
>   2.1) If it is owned by Xen/dom0, the offline will fail.
>   2.2) If it is owned by a PV guest with no device assigned, user space
>        tools will try to replace the page with a new one.
>   2.3) If it is owned by an HVM guest with no device assigned, user space
>        tools will try to live-migrate it.
>   2.4) If it is owned by a guest with a device assigned, user space tools
>        can do live migration if needed.
>
> This patchset includes support for type 2.1/2.2.
>
> page_offline_xen.patch gives basic support. The new hypercall
> (XEN_SYSCTL_page_offline) will mark a page as offlining if the page is
> in use; otherwise, it will remove the page from the page allocator. It also
> changes free_heap_pages(), so that if a page_offlining page is freed, that
> page will be marked as page_offlined and will not be allocated anymore. One
> tricky thing is that the offlined page may not be buddy-aligned (i.e., it
> may be in the middle of a block of 2^order pages), so we have to re-arrange
> the buddy system (i.e. &heap[][][]) carefully.
>
> page_offline_xen_memory.patch adds support for PV guests: a new hypercall
> (XENMEM_page_offline) tries to replace the old page with a new one. This
> will happen only when the guest has been suspended, to avoid complex page
> sharing situations. I'm still checking whether more situations need to be
> considered, like LDT pages and CR3 pages, so any suggestion is a great help.
>
> page_offline_tools.patch is an example user space tool based on
> libxc/xc_domain_save.c; it will first try to mark a page offline, and then
> check the result. If a page is owned by a PV guest, it will try to replace
> the page.
>
> I did some basic testing: I tried free pages and PV guest pages, and it is
> OK. Of course, I need to test it more, and more robust error handling is
> needed.
>
> Any suggestion is welcome.
>
> Thanks
> Yunhong Jiang

-- 
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, Citrix Systems (R&D) Ltd.
[Company #02300071, SL9 0DZ, UK.]

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
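
For readers following the buddy-alignment point in the quoted description of the Xen-side patch: the sketch below shows one way a free block of 2^order pages could be split so that a single page in the middle is withheld and the rest stays free. It is a simplified model, not the patch's code. The names split_around_page, struct chunk and SKETCH_MAX_ORDER are hypothetical, and pages are modelled as plain indices rather than Xen's page_info structures or its heap[][][] lists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound on block order, for this sketch only. */
#define SKETCH_MAX_ORDER 20

/* A chunk of 2^order contiguous pages starting at page index `start`. */
struct chunk {
    unsigned long start;
    unsigned int order;
};

/*
 * Split the buddy block [base, base + 2^order) around page `bad`.
 * The chunks that remain free are written to out[] (the caller must
 * provide room for `order` entries) and their count is returned.
 * Page `bad` is in none of the returned chunks.
 */
size_t split_around_page(unsigned long base, unsigned int order,
                         unsigned long bad, struct chunk *out)
{
    size_t n = 0;

    assert(order <= SKETCH_MAX_ORDER);
    assert(bad >= base && bad - base < (1UL << order));

    while (order > 0) {
        unsigned long half;

        order--;
        half = 1UL << order;

        if (bad < base + half) {
            /* Bad page is in the lower half: the upper half stays free. */
            out[n].start = base + half;
            out[n].order = order;
        } else {
            /* Bad page is in the upper half: the lower half stays free. */
            out[n].start = base;
            out[n].order = order;
            base += half;
        }
        n++;
    }

    /* Only the bad page itself is left; it is simply not returned. */
    assert(base == bad);
    return n;
}
```

Splitting an order-k block this way yields exactly k free chunks, one of each order from k-1 down to 0, covering every page except the one being offlined. A real allocator would additionally have to unlink the original block from its free list and re-insert each chunk on the list for its order.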