
Re: [Xen-devel] fxsave, fnsave, ltr hang for guest OS.


  • To: <alarson@xxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: Keir Fraser <keir@xxxxxxx>
  • Date: Thu, 04 Nov 2010 16:50:52 +0000
  • Cc:
  • Delivery-date: Thu, 04 Nov 2010 09:51:52 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>
  • Thread-topic: [Xen-devel] fxsave, fnsave, ltr hang for guest OS.

On 04/11/2010 16:32, "alarson@xxxxxxxx" <alarson@xxxxxxxx> wrote:

> It turns out that OpenSuse provides a debug xen kernel so I was able
> to use that yesterday, but I wasn't making much progress until it
> occurred to me that by pressing 'd', I'm sampling the stack of xen,
> but I don't know for sure which client the xen stack represents.  When
> I restricted the two clients (dom0 and my os) to exclusively their own
> CPU, then, if you exclude the first trace below, a pattern seems to
> emerge, and it would seem that I should start with sh_page_fault().

Hm, well maybe. sh_page_fault() is the entry point to one of the most
complex pieces of the hypervisor, so you're likely to find it more of a
sticky tar pit than a source of salvation. It may be involved, but you might
want to wade in armed with some more debug info first so that your search
has some greater focus.

I would dispense with sampling via the 'd' key -- clearly a bunch of stuff is
happening -- and instead go for printk() tracing in the vmexit handler,
vmx_vmexit_handler(). Perhaps enable this tracing only when the saved guest
EIP is the address at which you know your 'hanging' instruction resides, so
that the logging console only gets noisy when the bad situation occurs. From
that function you can log useful things such as the vmexit reason and, for a
page fault, the faulting linear address. See whether there is a (looping)
pattern to the vmexit reasons, and from that whether you can work out the
overall livelock loop you are experiencing (this assumes your problem is a
livelock loop of some kind, which seems quite possible from what you've seen
so far).
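
As a rough illustration of the idea above, here is a minimal standalone
sketch in C. It is not Xen code: HANG_EIP, struct vmexit_sample,
trace_vmexit() and loop_period() are all hypothetical names made up for this
example, and printf() stands in for printk(). Inside the hypervisor the same
values would come from the saved guest registers and the VMCS. The sketch
logs only when the guest EIP matches the suspect address, then looks for a
repeating pattern in the recorded exit reasons.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: address of the guest instruction that appears to hang. */
#define HANG_EIP 0x00101234UL
/* Hypothetical: how many matching vmexits to remember. */
#define HISTORY  64

/* Stand-in for the values a vmexit handler would have to hand. */
struct vmexit_sample {
    unsigned long guest_eip;    /* saved guest EIP at the time of the exit */
    unsigned int  exit_reason;  /* why the exit happened */
    unsigned long linear_addr;  /* faulting address, if a page fault */
};

struct vmexit_trace {
    struct vmexit_sample hist[HISTORY];
    size_t count;
};

/*
 * Log a vmexit only if it occurred at the suspect instruction, so the
 * console stays quiet otherwise. Returns 1 if logged, 0 if ignored.
 */
static int trace_vmexit(struct vmexit_trace *t, const struct vmexit_sample *s)
{
    assert(t != NULL && s != NULL);

    if (s->guest_eip != HANG_EIP)
        return 0;

    printf("vmexit: eip=%#lx reason=%u linear=%#lx\n",
           s->guest_eip, s->exit_reason, s->linear_addr);

    /* Remember the first HISTORY matching exits for later analysis. */
    if (t->count < HISTORY)
        t->hist[t->count++] = *s;
    return 1;
}

/*
 * Return the smallest p such that the last p recorded exit reasons repeat
 * the p reasons immediately before them, or 0 if no such repeat exists.
 */
static size_t loop_period(const struct vmexit_trace *t)
{
    assert(t != NULL);

    for (size_t p = 1; 2 * p <= t->count; p++) {
        size_t i;
        for (i = 0; i < p; i++) {
            size_t last = t->count - 1 - i;
            if (t->hist[last].exit_reason != t->hist[last - p].exit_reason)
                break;
        }
        if (i == p)
            return p;
    }
    return 0;
}
```

A nonzero result from loop_period() would suggest the guest keeps taking the
same sequence of exits at that instruction, which is what a livelock would
look like; comparing only exit reasons is a simplification, and a real trace
might also want to compare the faulting addresses.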

 -- Keir

> I think I can see a path forward, but I figured I'd post hoping to get
> lucky again...



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

