Re: [Xen-devel] [PATCH 2/2] Xen/vMCE: bugfix to remove problematic is_vmce_ready check
>>> On 03.05.13 at 16:16, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
> Jan Beulich wrote:
>>>>> On 03.05.13 at 10:41, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
>>> Jan Beulich wrote:
>>>>>>> On 27.04.13 at 10:38, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
>>>>>>> wrote:
>>>>> From 9098666db640183f894b9aec09599dd32dddb7fa Mon Sep 17 00:00:00
>>>>> 2001 From: Liu Jinsong <jinsong.liu@xxxxxxxxx>
>>>>> Date: Sat, 27 Apr 2013 22:37:35 +0800
>>>>> Subject: [PATCH 2/2] Xen/vMCE: bugfix to remove problematic
>>>>> is_vmce_ready check
>>>>>
>>>>> is_vmce_ready() is problematic:
>>>>> * For dom0, it checks if virq bind to dom0 mcelog driver. If not,
>>>>> it results dom0 crash. However, it's problematic and overkilled
>>>>> since mcelog as a dom0 feature could be enabled/disabled per dom0
>>>>> option:
>>>>> (XEN) MCE: This error page is ownded by DOM 0
>>>>> (XEN) DOM0 not ready for vMCE
>>>>> (XEN) domain_crash called from mcaction.c:133
>>>>> (XEN) Domain 0 reported crashed by domain 32767 on cpu#31:
>>>>> (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
>>>>> (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
>>>>>
>>>>> * For dom0, if really need check, it should check whether vMCE
>>>>> injection for dom0 ready (say, exception trap bounce check, which
>>>>> has been done at inject_vmce()), not check dom0 mcelog ready (which
>>>>> has been done at mce_softirq() before send global virq to dom0).
>>>>
>>>> Following the argumentation above, I wonder which of the other
>>>> "goto vmce_failed" are really appropriate, i.e. whether the patch
>>>> shouldn't be extended (at least for the Dom0 case).
>>>
>>> You mean other 'goto vmce_failed' are also not appropriate (I'm not
>>> quite clear your point)?
>>
>> Yes.
>>
>>> Would you please point out which point you think not appropriate?
>>
>> I question whether it is correct/necessary to crash the domain in
>> any of those failure cases. Perhaps when we fail to unmap the
>> page it is, but failure of fill_vmsr_data() and inject_vmce() don't
>> appear to be valid reasons once the is_vmce_ready() path is being
>> dropped.
>
> For fill_vmsr_data(), it failed only when MCG_STATUS_MCIP bit still set when
> next vMCE# occur, means the 2nd vMCE# occur when the 1st vMCE# not handled
> yet. Per SDM it should shutdown.
>
> For inject_vmce(), it failed when
> 1). vcpu is still mce_pending, or
> 2). pv not register trap callback
> Maybe it's some overkilled for dom0 (for other guest, it's ok to kill them),
> but any graceful way to quit?

Just exit and do nothing (except perhaps log a rate limited message)?

> or, considering it rarely happens, how about keep current way (kill guest no
> matter dom0 or not)?

Possibly - I was merely asking why this one condition was found to be too
strict, while the others are being left as is.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
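
For readers following the thread, the failure conditions described above for fill_vmsr_data() and inject_vmce() can be sketched roughly as below. This is a standalone illustration only, not code from the Xen tree: the `toy_` types and functions are hypothetical stand-ins, and the only detail taken from outside the thread is that MCIP is bit 2 of the MCG_STATUS register. The sketch just reports which condition failed; whether the caller should then crash the domain or merely log a message is the open question in the discussion and is deliberately left out.

```c
#include <stdbool.h>

/* MCIP ("machine check in progress") is bit 2 of MCG_STATUS. */
#define MCG_STATUS_MCIP (1ULL << 2)

/* Hypothetical, simplified per-vCPU state; not Xen's real structures. */
struct toy_vcpu {
    unsigned long long mcg_status;   /* virtual MCG_STATUS value */
    bool mce_pending;                /* a vMCE is already queued */
    bool is_pv;                      /* PV guest (needs a trap callback) */
    bool trap_callback_registered;   /* PV guest registered its MCE handler */
};

enum toy_result {
    TOY_OK,
    TOY_FAIL_MCIP,         /* previous vMCE not handled yet */
    TOY_FAIL_PENDING,      /* vCPU still has a vMCE pending */
    TOY_FAIL_NO_CALLBACK   /* PV guest has no trap callback */
};

/* Stand-in for the fill step: fails only if MCIP is still set. */
static enum toy_result toy_fill_vmsr(struct toy_vcpu *v)
{
    if (v->mcg_status & MCG_STATUS_MCIP)
        return TOY_FAIL_MCIP;
    v->mcg_status |= MCG_STATUS_MCIP;
    return TOY_OK;
}

/* Stand-in for the inject step: the two failure cases listed in the thread. */
static enum toy_result toy_inject(struct toy_vcpu *v)
{
    if (v->mce_pending)
        return TOY_FAIL_PENDING;
    if (v->is_pv && !v->trap_callback_registered)
        return TOY_FAIL_NO_CALLBACK;
    v->mce_pending = true;
    return TOY_OK;
}

/* Fill first, then inject; returns the first failure encountered. */
static enum toy_result toy_deliver(struct toy_vcpu *v)
{
    enum toy_result r = toy_fill_vmsr(v);
    if (r != TOY_OK)
        return r;
    return toy_inject(v);
}
```

Returning a distinct result per condition, rather than a single failure code, is one way a caller could apply a different policy to each case, for example treating a missing trap callback differently from a second vMCE arriving while MCIP is still set.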