
RE: [PATCH] Allow xen to share device with domain (RE:[PATCH]RE:[Xen-ia64-devel] xencons interrupt problem)


  • To: "Tian, Kevin" <kevin.tian@xxxxxxxxx>, "Alex Williamson" <alex.williamson@xxxxxx>
  • From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
  • Date: Mon, 10 Jul 2006 17:20:30 +0800
  • Cc: xen-ia64-devel <xen-ia64-devel@xxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 10 Jul 2006 02:20:59 -0700
  • List-id: Discussion of the ia64 port of Xen <xen-ia64-devel.lists.xensource.com>
  • Thread-index: AcahSZExZt6TzBjhSya8+As8M2mL/wAHvzqAAKXpRSA=
  • Thread-topic: [PATCH] Allow xen to share device with domain (RE:[PATCH]RE:[Xen-ia64-devel] xencons interrupt problem)

Hi, Alex,
        It would be interesting to look at the resume point of xenlinux 
when you see the box halt there. For example, if you suspect an 
infinite wait for serial tx space, what exactly is xenlinux trying to 
write? Based on the content at that point, we can grep the kernel 
source to see whether that message is normal output or error 
output.

        It would be a troublesome case if printks keep firing in the 
interrupt handler path of the shared NIC device before that handler 
writes EOI to the IOSAPIC. Printks in this path usually indicate a 
warning or an error. So maybe you can first check whether your halt 
falls into this scenario. If it does, we can decide whether to address 
this issue directly or to fix whatever other culprit is causing it.

        As a robust solution even under such an error condition, we may 
have to allow old content in the serial tx buffer to be overwritten so 
that forward progress is guaranteed.
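
        To illustrate the idea of overwriting old tx content, here is a 
minimal sketch in C. It is not Xen's serial driver code; the names 
tx_ring, tx_ring_putc, tx_ring_getc and the TX_BUF_SIZE value are all 
hypothetical. It only shows a producer that never spins: when the 
buffer is full, the oldest unsent byte is dropped instead of waiting 
for space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer size for illustration; must be a power of two. */
#define TX_BUF_SIZE 8u

struct tx_ring {
    char data[TX_BUF_SIZE];
    unsigned int prod;    /* total bytes ever queued */
    unsigned int cons;    /* total bytes ever drained (or discarded) */
    unsigned int dropped; /* number of old bytes overwritten */
};

static void tx_ring_init(struct tx_ring *rb)
{
    assert(rb != NULL);
    rb->prod = 0;
    rb->cons = 0;
    rb->dropped = 0;
}

/*
 * Queue one byte without ever blocking. If the ring is full, discard
 * the oldest unsent byte rather than spinning until space appears.
 */
static void tx_ring_putc(struct tx_ring *rb, char c)
{
    assert(rb != NULL);
    if (rb->prod - rb->cons == TX_BUF_SIZE) {
        rb->cons++;     /* give up on the oldest byte */
        rb->dropped++;
    }
    rb->data[rb->prod++ & (TX_BUF_SIZE - 1)] = c;
}

/*
 * Take the next byte to hand to the hardware.
 * Returns 1 and stores the byte in *out, or 0 if the ring is empty.
 */
static int tx_ring_getc(struct tx_ring *rb, char *out)
{
    assert(rb != NULL && out != NULL);
    if (rb->prod == rb->cons)
        return 0;
    *out = rb->data[rb->cons++ & (TX_BUF_SIZE - 1)];
    return 1;
}
```

The trade-off is that console output can be lost under heavy printk 
load, but the writer always returns, so the guest is never kept from 
reaching end(). The dropped counter is there so that lost output can at 
least be reported later.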


Thanks,
Kevin

>From: Tian, Kevin
>Sent: July 7, 2006 10:09
>>From: Alex Williamson [mailto:alex.williamson@xxxxxx]
>>Sent: July 7, 2006 6:14
>>
>>On Tue, 2006-07-04 at 09:49 +0800, Tian, Kevin wrote:
>>> Hi, Alex,
>>>     Could you try the attached patch to see whether it gets you a
>>> step further? It's made on top of the last patch, to address a bug:
>>> VEC_XEN_ALIAS is only meaningful when the enable bit is on. This bug
>>> may lead the guest to think the shared irq line is edge-triggered, so
>>> no EOI request is issued, which may leave subsequent instances stuck. :-)
>>
>>Hi Kevin,
>>
>>   Good catch with this patch, but it still hangs.  Besides having xen
>>call end() in __do_IRQ(), I can also prevent the hang by booting with
>>sync_console.  If I INIT the system when it's hung, the only CPU that's
>>not in the idle loop is sitting in do_console_io(), maybe into
>>guest_console_write() (which appears to be getting inlined).  I'm
>>wondering if the problem is actually Xen spinning there waiting for tx
>>space and preventing the guest from calling end().  I added a loop
>>counter for debug, but I haven't been able to make it pop out yet.
>>Thanks,
>>
>>      Alex
>>
>
>That's a possible cause. Actually I seldom considered the serial driver
>itself before :-). Does it spin on the tx buffer in the irq handler, or
>somewhere else with irqs disabled? Which event could put xen into an
>infinite spin? If the spin can exit, xenlinux can resume and then end()
>should be triggered...
>
>Thanks,
>Kevin

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel


 

