[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Xen-devel] [PATCH] x86/hap: fix race condition between ENABLE_LOGDIRTY and track_dirty_vram hypercall
There is a race condition between XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY and HVMOP_track_dirty_vram hypercall. Although HVMOP_track_dirty_vram is called many times from qemu-dm which is connected via VNC, XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY is called only once from a migration process (e.g. xc_save, libxl-save-helper). So the race seldom happens, but the following cases are possible. ======================================================================== [case-1] XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY hypercall -> paging_enable_logdirty() -> hap_logdirty_init() -> paging_log_dirty_disable() dirty_vram = NULL -> paging_log_dirty_init(d, hap_enable_log_dirty, ...) ---> (A) -> paging_log_dirty_enable() ************************************************************************** /* <--- (B) */ -> hap_enable_vram_tracking() // should be hap_enable_log_dirty() !!! return -EINVAL <- return -EINVAL // live-migration failure!!! ************************************************************************** HVMOP_track_dirty_vram -> hap_track_dirty_vram() /* (!paging_mode_log_dirty(d) && !dirty_vram) */ -> hap_vram_tracking_init() /* <--- (A) */ -> paging_log_dirty_init(d, hap_enable_vram_tracking, ...) ---> (B) -> paging_log_dirty_enable() ======================================================================== [case-2] XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY hypercall -> paging_enable_logdirty() -> hap_logdirty_init() -> paging_log_dirty_disable() /* d->arch.hvm_domain.dirty_vram != NULL */ d->arch.paging.mode &= ~PG_log_dirty; ---> (C) /* d->arch.hvm_domain.dirty_vram is freed */ dereference dirty_vram // access to freed memory !!! <--- (D) HVMOP_track_dirty_vram -> hap_track_dirty_vram() /* (!paging_mode_log_dirty(d) && dirty_vram) */ <---(C) rc = -EINVAL xfree(dirty_vram); ---> (D) dirty_vram = d->arch.hvm_domain.dirty_vram = NULL; return rc; ======================================================================== Actually I encountered the xen crash by null pointer exception in xen-3.4. FYI, in xen-4.x, c/s 19738:8dd5c3cae086 mitigated it. I'm not sure why paging_lock() is used partially in hap_XXX_vram_tracking functions. Thus, this patch introduces a new lock. It would be better to use paging_lock() instead of the new lock since shadow paging mode (not HAP mode) uses paging_lock to avoid this race condition. Thanks, Kouya Signed-off-by: Kouya Shimura <kouya@xxxxxxxxxxxxxx> Attachment:
hap_dirty_vram.patch _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |