Re: [Xen-devel] vpmu=1 and running 'perf top' within a PVHVM guest eventually hangs dom0 and hypervisor has stuck vCPUS. Romley-EP (model=45, stepping=2)

On Tue, Mar 12, 2013 at 02:50:59PM -0400, Boris Ostrovsky wrote:
> On 03/12/2013 01:30 PM, Konrad Rzeszutek Wilk wrote:
> >This issue I am encountering seems to only happen on multi-socket
> >machines.
> I believe I was able to reproduce this (once) on my laptop.
> >It also does not help that the only multi-socket box I have is
> >a Romley-EP (so two-socket SandyBridge CPUs). The other
> >SandyBridge boxes I have (one socket) are not showing this. Granted
> >they are also a different model (42).
> >
> >The problem is that when I run 'perf top' within an SMP PVHVM
> >guest, after a couple of seconds or minutes the guest hangs.
> >The hypervisor ends up stuck looping too, and then dom0 ends
> >up hanging as well.
> >
> >Dumping the CPU registers (Ctrl-A x3, then 'd')
> >shows that the guest is pretty firmly stuck in vmx_vmexit_handler:
> >
> >(XEN)    [<ffff82c4c01d386f>] vmx_vmexit_handler+0x22f/0x174
> And in my case this address is the second instruction after STI, i.e. we
> are right at the point where interrupts got enabled.
> So I am wondering whether this has something to do with the counter
> overflow interrupt (which I believe is an NMI).

Interestingly enough, if I run the PVHVM guest with 'nowatchdog'
it runs fine!

