
Re: [Xen-devel] vpmu=1 and running 'perf top' within a PVHVM guest eventually hangs dom0 and hypervisor has stuck vCPUS. Romley-EP (model=45, stepping=2)

To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
From: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
Date: Tue, 12 Mar 2013 16:54:11 -0400
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, jun.nakajima@xxxxxxxxx, qing.he@xxxxxxxxx, eddie.dong@xxxxxxxxx, dietmar.hahn@xxxxxxxxxxxxxx, jbeulich@xxxxxxxx, suravee.suthikulpanit@xxxxxxx, jiongxi.li@xxxxxxxxx
Delivery-date: Tue, 12 Mar 2013 20:54:27 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 03/12/2013 04:31 PM, Konrad Rzeszutek Wilk wrote:

On Tue, Mar 12, 2013 at 02:50:59PM -0400, Boris Ostrovsky wrote:

On 03/12/2013 01:30 PM, Konrad Rzeszutek Wilk wrote:

This issue I am encountering seems to only happen on multi-socket
machines.

I believe I was able to reproduce this (once) on my laptop.

It also does not help that the only multi-socket box I have is
a Romley-EP (so two-socket SandyBridge CPUs). The other
SandyBridge boxes I have (one socket) are not showing this. Granted
they are also a different model (42).

The problem is that when I run 'perf top' within an SMP PVHVM
guest, after a couple of seconds or minutes the guest hangs.
The hypervisor ends up stuck looping too, and then dom0 ends
up hanging as well.

Dumping the cpu registers (Ctrl-A x3, then 'd')
shows that the guest is pretty firmly stuck in vmx_vmexit_handler:

(XEN)    [<ffff82c4c01d386f>] vmx_vmexit_handler+0x22f/0x174

And in my case this address is the second instruction after STI, i.e. we
are right at the point where interrupts got enabled.

So I am wondering whether this has something to do with the counter
overflow interrupt (which I believe is an NMI).

Interestingly enough, if I run the PVHVM guest with 'nowatchdog'
it runs fine!

I think by default perf top runs off the timer interrupt, so it does not
use HW counters. But the watchdog is implemented on top of the counters,
so perhaps it fires the interrupt at a bad time, messing something up.
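
To confirm whether the guest's NMI watchdog is actually armed before
running 'perf top', something like the following could be run inside the
guest. This is only a sketch: the helper names are made up for
illustration, and it assumes a Linux guest that exposes /proc/cmdline and
/proc/sys/kernel/nmi_watchdog.

```python
from pathlib import Path


def watchdog_disabled_on_cmdline(cmdline: str) -> bool:
    """Return True if the boot command line turns the NMI watchdog off.

    'nowatchdog' disables the lockup detectors entirely;
    'nmi_watchdog=0' disables only the NMI (hard lockup) watchdog.
    """
    params = cmdline.split()
    return "nowatchdog" in params or "nmi_watchdog=0" in params


def nmi_watchdog_state(path: str = "/proc/sys/kernel/nmi_watchdog") -> str:
    """Return the sysctl value as text, or 'unknown' if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return "unknown"


if __name__ == "__main__":
    try:
        cmdline = Path("/proc/cmdline").read_text()
    except OSError:
        cmdline = ""
    print("disabled on cmdline:", watchdog_disabled_on_cmdline(cmdline))
    print("nmi_watchdog sysctl:", nmi_watchdog_state())
```

If the sysctl reads 1 and the hang reproduces, while 0 (or 'nowatchdog')
makes it go away, that would point at the counter-based watchdog rather
than at perf top itself.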

-boris



