
RE: [Xen-ia64-devel] RE: Latest status about multiple domains on XEN/IPF


  • To: "Tian, Kevin" <kevin.tian@xxxxxxxxx>, "Magenheimer, Dan (HP Labs Fort Collins)" <dan.magenheimer@xxxxxx>
  • From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
  • Date: Mon, 19 Sep 2005 22:18:22 +0800
  • Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Mon, 19 Sep 2005 14:16:02 +0000
  • List-id: Discussion of the ia64 port of Xen <xen-ia64-devel.lists.xensource.com>
  • Thread-index: AcW4ZfOzzchRRTCPTFuQnRhZtvFcDwADcXEgAAKwUYAACZFncAAB1poAABrDDSAABg+TcAAau19QAAO834AAEcfo8AAMivGQAA1I17AAsvIwEA==
  • Thread-topic: [Xen-ia64-devel] RE: Latest status about multiple domains on XEN/IPF

I have now traced the issue to the event injection mechanism. Under some
circumstances, evtchn_upcall_pending and some evtchn_pending bits are set
while the related irr bit is cleared. Once this condition occurs, later event
notifications always fail to set the irr bit. The cause is that the guest may
re-generate an event it ignored earlier, and that path does not touch irr at
all. With the rough patch below, I can see events being injected into the
guest; however, I then see nested event injection and a deadlock. So this
needs a bit more investigation. If this mechanism can be made to work
reliably again, we may see something interesting, since all previous progress
was driven by "xm console" rather than by events.
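
To make the stuck state concrete, here is a minimal sketch in C. It is a toy
model, not Xen code: struct toy_vcpu and all toy_* helpers are hypothetical
names, only one 64-bit word of irr/insvc is modelled, and the assumption that
the notification path raises irr only when evtchn_upcall_pending goes from 0
to 1 is mine. toy_recheck_pending() mirrors the check that the patch adds
ahead of the pending-interrupt scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified per-vcpu state; not the real Xen layout. */
struct toy_vcpu {
    uint8_t  evtchn_upcall_pending; /* 1 while an event is pending */
    unsigned evtchn_vector;         /* vector used for event delivery */
    uint64_t irr;                   /* pending-interrupt bits (word 0 only) */
    uint64_t insvc;                 /* in-service bits (word 0 only) */
};

static bool toy_test_bit(unsigned nr, uint64_t word)
{
    assert(nr < 64);
    return ((word >> nr) & 1u) != 0;
}

static void toy_set_bit(unsigned nr, uint64_t *word)
{
    assert(nr < 64);
    *word |= (uint64_t)1 << nr;
}

/* Assumed notification path: irr is raised only on the 0 -> 1 transition
 * of evtchn_upcall_pending. If the flag is already set, nothing happens,
 * which is how a cleared irr bit can stay cleared. */
static void toy_notify(struct toy_vcpu *v)
{
    if (!v->evtchn_upcall_pending) {
        v->evtchn_upcall_pending = 1;
        toy_set_bit(v->evtchn_vector, &v->irr);
    }
}

/* Same idea as the check added by the patch: if an event is pending and
 * the event vector is not in service, re-assert its irr bit. */
static void toy_recheck_pending(struct toy_vcpu *v)
{
    if (v->evtchn_upcall_pending &&
        !toy_test_bit(v->evtchn_vector, v->insvc))
        toy_set_bit(v->evtchn_vector, &v->irr);
}
```

In this model, a vcpu that starts with evtchn_upcall_pending set and irr
clear never gets its irr bit back from toy_notify() alone; calling
toy_recheck_pending() restores it unless the vector is already in service.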

[Xen]
diff -r 55bc6698c889 xen/arch/ia64/xen/domain.c
--- a/xen/arch/ia64/xen/domain.c        Thu Sep 15 00:00:23 2005
+++ b/xen/arch/ia64/xen/domain.c        Mon Sep 19 22:02:27 2005
@@ -916,7 +916,7 @@
 #endif
 
        /* Mask all upcalls... */
-       for ( i = 0; i < MAX_VIRT_CPUS; i++ )
+       for ( i = 1; i < MAX_VIRT_CPUS; i++ )
            d->shared_info->vcpu_data[i].evtchn_upcall_mask = 1;
 
 #ifdef CONFIG_VTI
diff -r 55bc6698c889 xen/arch/ia64/xen/vcpu.c
--- a/xen/arch/ia64/xen/vcpu.c  Thu Sep 15 00:00:23 2005
+++ b/xen/arch/ia64/xen/vcpu.c  Mon Sep 19 22:02:27 2005
@@ -21,6 +21,7 @@
 #include <asm/processor.h>
 #include <asm/delay.h>
 #include <asm/vmx_vcpu.h>
+#include <xen/event.h>
 
 typedef        union {
        struct ia64_psr ia64_psr;
@@ -631,6 +632,16 @@
 {
        UINT64 *p, *q, *r, bits, bitnum, mask, i, vector;
 
+       /* Always check pending event, since guest may just ack the
+        * event injection without handle. Later guest may throw out
+        * the event itself.
+        */
+       if (event_pending(vcpu) && 
+               !test_bit(vcpu->vcpu_info->arch.evtchn_vector,
+                       &PSCBX(vcpu, insvc[0])))
+               test_and_set_bit(vcpu->vcpu_info->arch.evtchn_vector,
+                       &PSCBX(vcpu, irr[0]));
+
        p = &PSCBX(vcpu,irr[3]);
        /* q = &PSCB(vcpu,delivery_mask[3]); */
        r = &PSCBX(vcpu,insvc[3]);

[XENO]
diff -r c9522a6d03a8 drivers/xen/core/evtchn_ia64.c
--- a/drivers/xen/core/evtchn_ia64.c    Wed Sep 14 23:35:51 2005
+++ b/drivers/xen/core/evtchn_ia64.c    Mon Sep 19 22:04:04 2005
@@ -81,6 +81,7 @@
     shared_info_t *s = HYPERVISOR_shared_info;
     vcpu_info_t   *vcpu_info = &s->vcpu_data[smp_processor_id()];
 
+    vcpu_info->evtchn_upcall_mask = 1;
     vcpu_info->evtchn_upcall_pending = 0;
 
     /* NB. No need for a barrier here -- XCHG is a barrier on x86. */
@@ -107,6 +108,7 @@
            }
         }
     }
+    vcpu_info->evtchn_upcall_mask = 0;
     return IRQ_HANDLED;
 }

Thanks,
Kevin
>-----Original Message-----
>From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
>[mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Tian, Kevin
>Sent: September 16, 2005 8:45
>To: Magenheimer, Dan (HP Labs Fort Collins)
>Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>Subject: [Xen-ia64-devel] RE: Latest status about multiple domains on XEN/IPF
>
>Yeah, seems we're on the same page now. I suspect the console issue may also be
>the reason for the blkfront connection failure, since unwanted delay may cause
>a timeout. Still need more investigation. ;-(
>
>Thanks,
>Kevin
>
>>-----Original Message-----
>>From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@xxxxxx]
>>Sent: September 16, 2005 3:24
>>To: Tian, Kevin
>>Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>Subject: RE: Latest status about multiple domains on XEN/IPF
>>
>>I got it all built with all the patches.  I am now
>>able to run xend.  But when I do "xm create"
>>I just get as far as:
>>
>>xen-event-channel using irq 233
>>store-evtchn = 1
>>
>>and then the 0+1+01 (etc) debug output.
>>
>>Wait... I tried launching another domain and got
>>further.  Or I guess this is just delayed console
>>output from the first "xm create"?
>>
>>It gets as far as:
>>Xen virtual console successfully installed as tty0
>>Event-channel device installed.
>>xen_blk: Initialising virtual block device driver
>>
>>and then nothing else.
>>
>>So I tried launching some more domains (with name=xxx).
>>Now I get as far as the kernel unable-to-mount-root
>>panic.
>>
>>It's hard to tell what is working because of the
>>console problems (that I see you have posted a question
>>about on xen-devel).
>>
>>Dan
>>
>>> -----Original Message-----
>>> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
>>> Sent: Thursday, September 15, 2005 6:32 AM
>>> To: Magenheimer, Dan (HP Labs Fort Collins)
>>> Cc: ipf-xen
>>> Subject: RE: Latest status about multiple domains on XEN/IPF
>>>
>>> Hi, Dan,
>>>
>>>     Attached is the updated xeno patch (the xen patch is unchanged),
>>> with no functional enhancement. Some Makefile changes are
>>> required to build the latest xenolinux.hg, though they are a bit ugly.
>>> ;-) Together with the other patch I sent to the mailing list to
>>> fix the domU crash (it took me most of the day), I hope you can
>>> reach the same point as I did:
>>>     Blkfront fails to connect to xenstore, and mounting the root fs panics.
>>>
>>> Thanks,
>>> Kevin
>>>
>>> >-----Original Message-----
>>> >From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@xxxxxx]
>>> >Sent: September 15, 2005 12:05
>>> >To: Tian, Kevin
>>> >Cc: ipf-xen
>>> >Subject: RE: Latest status about multiple domains on XEN/IPF
>>> >
>>> >>  Thanks for the comments. When I sent out the patch, I
>>> >> didn't mean it as the final one, just something for you to
>>> >> continue debugging with. So the style is a bit messy, and most of
>>> >> your comments regarding coding style are correct. I will be more
>>> >> careful next time, even when sending out a temporary patch.
>>> >
>>> >Oh, OK.  I didn't realize it was a "continue debug" patch.
>>> >
>>> >> >I haven't seen any machine crashes, but I am both
>>> >> >running on a different machine and exercising it
>>> >> >differently.  If you have any test to reproduce
>>> >> >it, please let me know.  I have noticed that
>>> >> >running "hg clone" seems to reproducibly cause
>>> >> >a segmentation fault... I haven't had any time
>>> >> >to try to track this down.  (I think Intel has better
>>> >> >hardware debugging capabilities... perhaps if you
>>> >> >can reproduce this, someone on the Intel team can
>>> >> >track it down?)
>>> >>
>>> >> I see the crash when domU was executing. Actually if only
>>> >> dom0 is up, it can run safely for several days.
>>> >
>>> >OK.  Yes, I have seen dom0 stay up for many days
>>> >too; that's why I was concerned if it was crashing.
>>> >
>>> >> >When I last tried, I wasn't able to get xend to
>>> >> >run (lots of python errors).  It looks like you
>>> >> >have gotten it to run?
>>> >>
>>> >> Is it possible due to the python version? The default python
>>> >> version on EL3 is 2.2, and with it we saw many python errors
>>> >> before. Now we're using 2.4.1.
>>> >
>>> >I am using 2.3.5 but that has always worked before.
>>> >
>>> >> One more question. Did you try xend with all my patches
>>> >> applied? Without change to do_memory_ops which is explained
>>> >> below, xend doesn't start since its memory reservation
>>> >> request will fail.
>>> >
>>> >I bet that is the problem.  I haven't tried it since
>>> >receiving your patch and will try it again tomorrow.
>>> >
>>> >> >3) In privcmd.c (other than the same comment about
>>> >> >   ifdef'ing every change), why did you change the
>>> >> >   direct_remap_... --> remap__... define back?
>>> >> >   Was it incorrect or just a style change?  Again,
>>> >> >   I am trying to change the patches to something that
>>> >> >   will likely be more acceptable upstream and
>>> >> >   I think we will be able to move this simple
>>> >> >   define into an asm header file.  If my change
>>> >> >   to your patch is broken, please let me know.
>>> >>
>>> >> But as you may have noticed, the two functions require different
>>> >> parameters: one takes an mm_struct and the other a vma. So your
>>> >> previous change is incorrect.
>>> >
>>> >No I missed that difference entirely!  Good catch!
>>> >
>>> >> >6) I will add your patch to hypercall.c (in the hypervisor).
>>> >> >   But the comment immediately preceding concerns me...
>>> >> >   are reservations implemented or not?  (I think not,
>>> >> >   unless maybe they are only in VTI?)
>>> >>
>>> >> No, neither handles the reservation. However, the issue is
>>> >> that nr_extents is no longer a top-level parameter that the
>>> >> previous code could simply retrieve from pt_regs. It is now a
>>> >> field in a new reservation structure, and that structure is the
>>> >> only parameter passed in. So I had to add the above logic to get
>>> >> nr_extents and return the result that the caller wants.
>>> >
>>> >OK.
>>> >
>>> >If you have an updated patch by the end of your day,
>>> >please send it and I will try it out tomorrow.
>>> >
>>> >Dan
>>>
>
>_______________________________________________
>Xen-ia64-devel mailing list
>Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>http://lists.xensource.com/xen-ia64-devel



 

