[Xen-ia64-devel] RE: Multiple domains up on a bit old Rev
Hi Kevin (and Fred please read) --

Congratulations on getting multiple domains running again! I know you have worked very hard on this!

<voice changing to a whisper>

Unfortunately, I have still not been able to reproduce your success :-( It is probably an environment difference of some kind since your environment failed and then later succeeded with no code change.

Since multiple domains is critical to Xen/ia64 (and to all of the companies involved), let me suggest the following plan.

Fred, is there someone on your team that can be asked to reproduce Kevin's results FROM SCRATCH? Ideally this would be:

- install a fresh RHEL 3.2 system (according to system configuration provided by Kevin)
- download and install python 2.3+ and mercurial
- download fresh xen-ia64-unstable and xenlinux-ia64 bits
- build, install and test Xen with just domain0
- (document any missing steps required in the above so others can reproduce)
- using ONLY WRITTEN INSTRUCTIONS from Kevin, build, install and test Xen for multiple domains and demonstrate a shell prompt in domU
- (document any changes to Kevin's recipe)

If someone other than Kevin, myself, and John can do this without assistance (other than written documentation), I think we are ready to move on to the next steps**. If nobody can (and I haven't yet), it is hard to say we have multiple domains working again.

One positive side effect of this plan is that we have documentation that others can use.

What do you think?

Thanks,
Dan

** Next steps include: merging back to latest xen-unstable, building possibly-patched drivers from -sparse (see separate message), ensuring domU is stable (can we build linux on it?), getting networking working, ..., ???

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> Sent: Wednesday, September 21, 2005 6:03 AM
> To: Magenheimer, Dan (HP Labs Fort Collins)
> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: Multiple domains up on a bit old Rev
>
> I also couldn't reproduce the same result on tip last night.
> However, today I made it work. No special steps compared to the
> previous flow.
>
> At the start, I found that the image size calculated for the xenU
> kernel was incorrect (by checking /var/log/xend-debug.log), which
> made parseelfimage in the control panel fail to retrieve the ELF
> header info and thus fail in "xm create".
>
> However, when I added some debug information to libxc,
> everything worked well and xenU could boot to a shell. Then
> even when I removed the debug information and rolled back,
> xenU still worked.
>
> So I suspect there was some environmental dirt between the first
> and second try. Of course it's also possible that some
> intermittent bug is hiding behind this. Maybe you can make a fresh
> try tomorrow and see whether it works for you then.
>
> BTW, please add the attached changeset to xen-ia64-unstable.hg,
> which was made by Keir last week to remove bad lines in
> xen-backend.agent. That's the reason why I always failed to
> see "physical-device" created in xenstore.
>
> Thanks,
> Kevin
>
> >-----Original Message-----
> >From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@xxxxxx]
> >Sent: September 21, 2005 3:14
> >To: Tian, Kevin
> >Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> >Subject: RE: Multiple domains up on a bit old Rev
> >
> >I checked in the changes but am unable to reproduce your
> >success on tip, even with the public structure changes
> >propagated. I think only arch-ia64.h changed, as part
> >of Anthony's merge_cpu_2 patch, and arch-ia64.h gets
> >auto-copied when xenlinux is rebuilt so I can't explain
> >why it would work on rev 6857 and not on tip.
> >
> >Can you reproduce your success on tip and, if so,
> >describe the steps?
> >
> >Thanks,
> >Dan
> >
> >> -----Original Message-----
> >> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> >> Sent: Tuesday, September 20, 2005 7:29 AM
> >> To: Tian, Kevin; Magenheimer, Dan (HP Labs Fort Collins)
> >> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> >> Subject: Multiple domains up on a bit old Rev
> >>
> >> OK, the attached patches can make multiple domains work again,
> >> however with a somewhat old Rev upon which I'm working consistently:
> >>
> >> Xen-ia64-unstable.hg: Rev 6857
> >> Xenlinux-ia64: Rev 37
> >>
> >> I tried the latest xen-ia64-unstable (6867) and failed at "xm create".
> >> This is reasonable since there are some public structure changes in
> >> a few new changesets. The lucky thing is that this problem is easy to
> >> debug since these changes are well controlled, unlike the past 1.5
> >> weeks of struggling with some unknown feature changes in the common
> >> part. ;-)
> >>
> >> Signed-off-by: Kevin Tian <kevin.tian@xxxxxxxxx>
> >>
> >> Some interesting issues found related to two small patches:
> >>
> >> 1. Only "xm console 1" and "Ctrl + ]" could make xenU make forward
> >> progress, and even then it still failed to connect to blkback later.
> >> [Reason] Previous event injection on XEN/IPF only set the vIRR bit
> >> in evtchn_set_pending. However with the latest xenlinux code, it's
> >> possible for xenlinux to set the pending indication and selector when
> >> unmasking some pending event channel. This path has nothing to do
> >> with the vIRR bit.
> >>
> >> [Solution] We should check for pending events every time we
> >> check pending interrupts before returning to the guest.
> >>
> >> 2. After fixing the first issue, a nested event is injected while the
> >> first event is still being handled with a lock held. Then a deadlock
> >> happens at the end of "xend start".
> >> [Reason] Due to the same logic as above, xenlinux may set the pending
> >> indication and re-trigger the pending event via force_evtchn_callback.
> >> On x86, this stub just does a dummy xen_version hypercall and the
> >> pending event is injected back when leaving the hypervisor. However
> >> on IA64, force_evtchn_callback was incautiously made to invoke
> >> evtchn_interrupt() directly, while the former may be called with a
> >> lock held.
> >>
> >> [Solution] Just leave force_evtchn_callback empty for simplicity
> >> for now.
> >>
> >> 3. There's one xenstore node named "physical-device", which contains
> >> the major/minor number of the device used as the disk for xenU.
> >> However that node is not created automatically, and thus later
> >> blkfront/blkback communication failed since no virtual disk was found.
> >> [Reason] Dunno yet. I sent a mail to xen-devel, and hope someone
> >> can answer the puzzle for me. ;-)
> >>
> >> [Solution] This one really took me much time, and below is a
> >> temp hack for you to try out. (Don't check-in)
> >>
> >> diff -r 7f9acc83ffcd tools/python/xen/xend/XendDomainInfo.py
> >> --- a/tools/python/xen/xend/XendDomainInfo.py Mon Sep 19 17:08:20 2005
> >> +++ b/tools/python/xen/xend/XendDomainInfo.py Tue Sep 20 21:16:13 2005
> >> @@ -419,7 +419,8 @@
> >>          back = { 'type' : type,
> >>                   'params' : params,
> >>                   'frontend' : frontpath,
> >> -                 'frontend-id' : "%i" % self.domid }
> >> +                 'frontend-id' : "%i" % self.domid,
> >> +                 'physical-device' : "%li" % blkdev_name_to_number(params) }
> >>          xstransact.Write(backpath, back)
> >>
> >>          return
> >>
> >> Thanks,
> >> Kevin
> >>
>
_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel
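
A note on issues 1 and 2 in Kevin's original mail: the reasoning may be easier to follow with a toy model. The sketch below is NOT Xen or xenlinux code. ToyGuest and all of its methods are hypothetical names invented for illustration, and a non-reentrant Python lock stands in for the guest lock Kevin mentions. It only shows why calling the event handler directly from the "force callback" path can deadlock when the caller already holds the lock, and why leaving the stub empty is safe as long as pending events are checked before returning to the guest.

```python
import threading


class ToyGuest:
    """Hypothetical model of the guest-side event path (not real Xen code)."""

    def __init__(self):
        self.lock = threading.Lock()   # non-reentrant, standing in for a guest lock
        self.pending = False           # stands in for the pending indication
        self.handled = 0
        self.would_deadlock = False

    def evtchn_interrupt(self):
        # The handler needs the lock. A real non-reentrant lock would block
        # forever here if the caller already held it, so the toy only records
        # that fact instead of hanging.
        if not self.lock.acquire(blocking=False):
            self.would_deadlock = True
            return
        try:
            if self.pending:
                self.pending = False
                self.handled += 1
        finally:
            self.lock.release()

    def force_callback_direct(self):
        # Models the problematic behaviour: call the handler immediately.
        self.evtchn_interrupt()

    def force_callback_empty(self):
        # Models the proposed temporary fix: do nothing, leave the event pending.
        pass

    def unmask_with_lock_held(self, force_callback):
        # Unmasking sets the pending indication and re-triggers the event
        # while the lock is still held.
        with self.lock:
            self.pending = True
            force_callback()

    def return_to_guest(self):
        # Models the fix for issue 1: always check for pending events
        # at this point, after the lock has been dropped.
        if self.pending:
            self.evtchn_interrupt()


direct = ToyGuest()
direct.unmask_with_lock_held(direct.force_callback_direct)

deferred = ToyGuest()
deferred.unmask_with_lock_held(deferred.force_callback_empty)
deferred.return_to_guest()
```

In this toy, the direct variant trips the would_deadlock flag, while the empty-stub variant delivers the event once at return_to_guest(). Whether the real hypervisor path behaves the same way depends on details the thread does not give.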
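
A note on issue 3 in Kevin's original mail: he describes the "physical-device" node as holding the major/minor number of the disk device, and the temp hack writes it with "%li". The sketch below is only an illustration of how a single decimal string can carry both numbers. It assumes the value is an ordinary OS device number of the kind os.makedev() builds, which the thread does not confirm. The helper names are hypothetical and are not part of xend.

```python
import os


def encode_physical_device(major, minor):
    """Hypothetical helper: pack major/minor into one decimal string."""
    device_number = os.makedev(major, minor)
    # "%li" is the format used in the patch; Python ignores the "l" modifier.
    return "%li" % device_number


def decode_physical_device(text):
    """Hypothetical helper: recover (major, minor) from the decimal string."""
    device_number = int(text)
    return os.major(device_number), os.minor(device_number)


# Example: major 8, minor 1 (chosen only as sample numbers).
node_value = encode_physical_device(8, 1)
round_trip = decode_physical_device(node_value)
```

The round trip returns the same (major, minor) pair that was encoded. A backend reading the node would need the same encoding on both sides, which is an assumption here rather than something stated in the thread.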