Re: [Xen-devel] vpmu=1 and running 'perf top' within a PVHVM guest eventually hangs dom0 and hypervisor has stuck vCPUS. Romley-EP (model=45, stepping=2)
----- maillists.shan@xxxxxxxxx wrote:

> We also met the issue as fixed by Dietmar's workaround. I remember we
> two had some email discussion at that time.
>
> The issue causing interrupt loop is:
> It seems that on NHM (at that time) when a PMI arrives at CPU, the
> counter has a value to zero (instead of some other small value, say 3
> or 5, seen on Core 2 Duo). In this case, unmasking the PMI via APIC
> will trigger immediately another PMI.
> This does not produce problem with native kernel, since it typically
> programs the counter with another value (as needed by making yet
> another sampling point) before unmasking.
> For Xen, PMI handler cannot handle the counter immediately since it
> should be handled by guests. It just records a virtual PMI to guests
> and unmasks the PMI before return.
>
> We don't know whether this is a desired HW behavior. But we hope we
> can get confirm from internal HW team quickly.

I will note that this workaround appeared not to be needed on Haswell.
I have run my tests there for a fairly long period of time without any
problems. Of course, this doesn't *prove* that the workaround is not
needed, but I'd usually trigger this hang within 20-30 minutes at the
most on other processors. On Haswell I ran for 6 or 7 hours.

-boris

>
> Shan Haitao
>
> 2013/3/13 Dietmar Hahn <dietmar.hahn@xxxxxxxxxxxxxx>:
> > On Tuesday, 12 March 2013, 16:54:11, Boris Ostrovsky wrote:
> >> On 03/12/2013 04:31 PM, Konrad Rzeszutek Wilk wrote:
> >> > On Tue, Mar 12, 2013 at 02:50:59PM -0400, Boris Ostrovsky wrote:
> >> >> On 03/12/2013 01:30 PM, Konrad Rzeszutek Wilk wrote:
> >> >>> This issue I am encountering seems to only happen on multi-socket
> >> >>> machines.
> >> >> I believe I was able to reproduce this (once) on my laptop.
> >> >>
> >> >>> It also does not help that the only multi-socket box I have is
> >> >>> an Romley-EP (so two socket SandyBridge CPUs). The other
> >> >>> SandyBridge boxes I've (one socket) are not showing this. Granted
> >> >>> they are also a different model (42).
> >> >>>
> >> >>> The problem is that when I run 'perf top' within an SMP PVHVM
> >> >>> guest, after a couple of seconds or minutes the guest hangs.
> >> >>> Hypervisor ends up stuck too looping, and then the dom0 ends
> >> >>> up hanging as well.
> >> >>>
> >> >>> Dumping the cpu registers (Ctrl-A x3, then 'd'
> >> >>> shows that the guest is pretty firmly stuck in vmx_vmexit_handler:
> >> >>>
> >> >>> (XEN) [<ffff82c4c01d386f>] vmx_vmexit_handler+0x22f/0x174
> >> >> And in my case this address is the second instruction after STI, i.e. we
> >> >> are right at the point where interrupts got enabled.
> >> >>
> >> >> So I am wondering whether this has something to do with the counter
> >> >> overflow interrupt (which I believe is an NMI).
> >> > Interestingly enough, if I run the PVHVM guest with 'nowatchdog'
> >> > it runs fine!
> >>
> >> I think by default perf top runs off timer interrupt so it does not use
> >> HW counters. But watchdog is implemented on top of the counters so
> >> perhaps it fires the interrupt at a bad time, messing something up.
> >
> > This looks like a strange behavior we had on nehalem cpus see
> > http://lists.xen.org/archives/html/xen-devel/2010-11/msg01157.html
> > For this I added a quirk, see check_pmc_quirk() in vpmu_core2.c
> > The model 42 is in the quirk list and it seems to work but Romley-EP
> > is model 43 I think which is not in the list.
> > Maybe you should add this model and give it a try.
> >
> > Dietmar.
> >
> > --
> > Company details: http://ts.fujitsu.com/imprint.html
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxx
> > http://lists.xen.org/xen-devel
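The interrupt loop Shan Haitao describes above can be sketched as a toy model in Python. This is only an illustration, not Xen or Linux code. It assumes, as the message suggests but does not confirm, that another PMI fires immediately whenever the PMI is unmasked while the counter still reads zero. The names ToyPMU, native_handler, xen_like_handler and count_back_to_back_pmis, and the reload value, are hypothetical.

```python
class ToyPMU:
    """Toy model: unmasking with the counter at zero raises a new PMI."""

    def __init__(self):
        # Assumed NHM-like behaviour from the thread: counter reads 0 at PMI time.
        self.counter = 0
        # Delivery of a PMI leaves further PMIs masked until the handler unmasks.
        self.masked = True

    def unmask(self):
        """Unmask the PMI; return True if another PMI fires immediately."""
        self.masked = False
        fires = self.counter == 0
        if fires:
            self.masked = True  # the new PMI masks delivery again
        return fires


def native_handler(pmu, reload_value=-100000):
    """Native-kernel style: reprogram the counter first, then unmask."""
    pmu.counter = reload_value  # hypothetical value for the next sampling point
    return pmu.unmask()


def xen_like_handler(pmu):
    """Hypervisor style from the message: record a virtual PMI, then unmask.

    The counter is left for the guest to handle, so it still reads zero here.
    """
    pending_virtual_pmi = True  # stands in for "records a virtual PMI to guests"
    return pmu.unmask() and pending_virtual_pmi


def count_back_to_back_pmis(handler, limit=1000):
    """Count PMIs that fire back to back, capped at `limit` to end the loop."""
    pmu = ToyPMU()
    fired = 0
    while fired < limit and handler(pmu):
        fired += 1
    return fired


if __name__ == "__main__":
    print("native:", count_back_to_back_pmis(native_handler))
    print("xen-like:", count_back_to_back_pmis(xen_like_handler))
```

In this model the native path sees no back-to-back PMI, while the Xen-like path only stops because of the cap, which matches the stuck-in-handler symptom reported in the thread. Whether real hardware behaves this way was still an open question at the time of the discussion.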