
RE: [Xen-users] GPLPV (9.11pre20) in Win2003 x64 on XenServer Enterprise 5.0 (CD drive missing)


  • To: xen-users@xxxxxxxxxxxxxxxxxxx
  • From: Roel Broersma <roel@xxxxxxxxxx>
  • Date: Sat, 15 Nov 2008 05:32:36 -0800 (PST)
  • Delivery-date: Sat, 15 Nov 2008 05:33:20 -0800
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

James, thanks for your reply!

OK, first things first: here is the xenstore-ls output for each of the 4 backend devices.


[root@xensvr2 ~]# xenstore-ls /local/domain/0/backend/vbd/28/832
frontend = "/local/domain/28/device/vbd/832"
online = "1"
params =
"/dev/VG_XenStorage-97538472-f24d-6a24-f880-7ba5e765a66f/LV-5d27e\..."
state = "4"
dev = "hdb"
physical-device = "fc:0"
removable = "1"
mode = "w"
sm-data = ""
 scsi = ""
  0x12 = ""
   0x83 =
"AIMAMQIBAC1YRU5TUkMgIDVkMjdlMzRjLTlkYzItNDdlYi05ZTM2LTc2NDNiMDM\..."
   0x80 = "AIAAJjVkMjdlMzRjLTlkYzItNDdlYi05ZTM2LTc2NDNiMDMxNGUzMCAg"
 vdi-uuid = "5d27e34c-9dc2-47eb-9e36-7643b0314e30"
frontend-id = "28"
type = "phy"
feature-barrier = "1"
sectors = "140509184"
info = "0"
sector-size = "512"
kthread-pid = "31456"
[root@xensvr2 ~]# 
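
A side note on the sm-data/scsi values: the "0x80" entry looks like a base64-encoded SCSI INQUIRY reply for VPD page 0x80 (unit serial number). The sketch below decodes it under that assumption (byte 1 = page code, byte 3 = payload length, then an ASCII serial) and compares the result with the vdi-uuid from the same listing. The helper name decode_serial_page is made up for this example; it is not part of any Xen tool.

```python
import base64

# Values copied from the listing for /local/domain/0/backend/vbd/28/832.
PAGE_0X80 = "AIAAJjVkMjdlMzRjLTlkYzItNDdlYi05ZTM2LTc2NDNiMDMxNGUzMCAg"
VDI_UUID = "5d27e34c-9dc2-47eb-9e36-7643b0314e30"


def decode_serial_page(b64_text):
    """Decode a blob assumed to be laid out like a VPD page 0x80 reply.

    Assumed layout: byte 1 is the page code, byte 3 is the payload
    length, and the payload is an ASCII serial number.
    """
    raw = base64.b64decode(b64_text)
    page_code = raw[1]
    length = raw[3]
    serial = raw[4:4 + length].decode("ascii")
    return page_code, serial


page_code, serial = decode_serial_page(PAGE_0X80)
print("page code:", hex(page_code))
print("serial:", repr(serial))
# The serial is padded with trailing spaces, so strip before comparing.
print("matches vdi-uuid:", serial.strip() == VDI_UUID)
```

If the assumption holds, the serial the guest would see for this disk is simply the VDI UUID padded with spaces. The 0x83 values are truncated in this message, so they are not decoded here.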


[root@xensvr2 ~]# xenstore-ls /local/domain/0/backend/vbd/28/768
frontend = "/local/domain/28/device/vbd/768"
online = "1"
params =
"/dev/VG_XenStorage-97538472-f24d-6a24-f880-7ba5e765a66f/LV-c07c7\..."
state = "4"
dev = "hda"
physical-device = "fc:1"
removable = "1"
mode = "w"
sm-data = ""
 scsi = ""
  0x12 = ""
   0x83 =
"AIMAMQIBAC1YRU5TUkMgIGMwN2M3NDY0LTUxNGQtNGUyNS05MWMwLTJhZDMwN2J\..."
   0x80 = "AIAAJmMwN2M3NDY0LTUxNGQtNGUyNS05MWMwLTJhZDMwN2JjMjIwYyAg"
 vdi-uuid = "c07c7464-514d-4e25-91c0-2ad307bc220c"
frontend-id = "28"
type = "phy"
feature-barrier = "1"
sectors = "62914560"
info = "0"
sector-size = "512"
kthread-pid = "31457"
[root@xensvr2 ~]# 
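
A side note on the numbers in these paths: 768, 832, 5632 and 5696 appear to follow the classic Linux device-number encoding (major * 256 + minor), with IDE majors 3 and 22 and 64 minors per disk. The sketch below decodes them under that assumption; decode_vbd is a hypothetical helper name, not a Xen command.

```python
# Classic Linux IDE majors: 3 covers hda/hdb, 22 covers hdc/hdd.
# The mapped value is the index of the first disk on that major.
IDE_MAJORS = {3: 0, 22: 2}


def decode_vbd(number):
    """Turn a virtual-device number into an IDE device name.

    Assumes number = major * 256 + minor, with 64 minors per disk
    (minor 0 is the whole disk, 1..63 are partitions).
    """
    major, minor = number >> 8, number & 0xFF
    if major not in IDE_MAJORS:
        raise ValueError("not an IDE major: %d" % major)
    disk_index = IDE_MAJORS[major] + (minor >> 6)
    partition = minor & 0x3F
    name = "hd" + chr(ord("a") + disk_index)
    return name + (str(partition) if partition else "")


for vbd in (768, 832, 5632, 5696):
    print(vbd, "->", decode_vbd(vbd))
```

With this decoding, 768 and 832 map to hda and hdb, and 5632 and 5696 map to hdc and hdd, which agrees with the dev values reported by the backends in this message.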


[root@xensvr2 ~]# xenstore-ls /local/domain/0/backend/vbd/28/5632
frontend = "/local/domain/28/device/vbd/5632"
online = "1"
params =
"/dev/VG_XenStorage-130038a8-e836-3449-2805-c11e0e05660b/LV-74641\..."
state = "4"
dev = "hdc"
physical-device = "fc:7"
removable = "1"
mode = "w"
sm-data = ""
 scsi = ""
  0x12 = ""
   0x83 =
"AIMAMQIBAC1YRU5TUkMgIDc0NjQxZjg0LTVkYmYtNDg0MS04NjAzLTg5ZDEyMTN\..."
   0x80 = "AIAAJjc0NjQxZjg0LTVkYmYtNDg0MS04NjAzLTg5ZDEyMTNmMTc3MSAg"
 vdi-uuid = "74641f84-5dbf-4841-8603-89d1213f1771"
frontend-id = "28"
type = "phy"
feature-barrier = "1"
sectors = "140509184"
info = "0"
sector-size = "512"
kthread-pid = "31458"
[root@xensvr2 ~]# 
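
For reference, the sectors and sector-size values give the size of each disk. A small sketch of the arithmetic, using the values from the three disk listings above (size_gib is just an illustrative helper):

```python
SECTOR_SIZE = 512  # "sector-size" from the listings

# "sectors" values keyed by the "dev" name from each listing.
SECTORS = {"hda": 62914560, "hdb": 140509184, "hdc": 140509184}


def size_gib(sector_count, sector_size=SECTOR_SIZE):
    """Return the disk size in GiB (2**30 bytes)."""
    return sector_count * sector_size / 2**30


for dev in sorted(SECTORS):
    print(dev, size_gib(SECTORS[dev]), "GiB")
```

That works out to a 30 GiB disk (hda) and two 67 GiB disks (hdb and hdc).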


[root@xensvr2 ~]# xenstore-ls /local/domain/0/backend/vbd/28/5696
frontend = "/local/domain/28/device/vbd/5696"
online = "1"
params =
"/var/run/sr-mount/7d66b2a3-3e7f-7718-db04-76ba3a57d0c5/en_win_sr\..."
state = "5"
dev = "hdd"
removable = "1"
mode = "r"
frontend-id = "28"
type = "file"
[root@xensvr2 ~]# 


(I think the last one is the CD drive; it returns less data.)
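
One visible difference between the listings: the three disk backends report state = "4", while the 5696 backend reports state = "5". In Xen's public xenbus.h, 4 is XenbusStateConnected and 5 is XenbusStateClosing. The sketch below names the states and flags any frontend/backend mismatch, using the backend values from this message and the frontend values from the listing quoted further down. Whether this mismatch explains the missing CD drive is not established here; state_mismatches is a hypothetical helper.

```python
# XenbusState values as defined in Xen's public io/xenbus.h.
XENBUS_STATES = {
    0: "Unknown",
    1: "Initialising",
    2: "InitWait",
    3: "Initialised",
    4: "Connected",
    5: "Closing",
    6: "Closed",
}

# "state" values copied from the xenstore-ls listings in this thread.
FRONTEND = {"832": 4, "768": 4, "5632": 4, "5696": 4}
BACKEND = {"832": 4, "768": 4, "5632": 4, "5696": 5}


def state_mismatches(front, back):
    """Return (device, frontend state, backend state) where they differ."""
    result = []
    for dev in sorted(front, key=int):
        f_state, b_state = front[dev], back.get(dev)
        if f_state != b_state:
            result.append((
                dev,
                XENBUS_STATES.get(f_state, "missing"),
                XENBUS_STATES.get(b_state, "missing"),
            ))
    return result


for dev, f_name, b_name in state_mismatches(FRONTEND, BACKEND):
    print("vbd %s: frontend %s, backend %s" % (dev, f_name, b_name))
```

With the values above, only device 5696 is reported: the frontend is Connected while the backend is Closing.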


And the answer to your other question:
"Can you tell me, during the time the DomU is 'hung' because the SAN has
disconnected, does the SAN come back online before the reboot?"
No, the SAN was completely down, and we first shut down all the XenServers
(and VMs) before starting the SAN up again.
BTW, another thought: the VM (with the GPLPV drivers and the BSOD I told you
about) was a mail server, and mail servers have a lot of small files (1 or
2 KB). I've heard or read somewhere that files under 1.5 KB are not written
to the disk separately but stored directly in the MFT, because keeping a
pointer in the MFT to the on-disk address of such a small file is too much
overhead. So maybe the fact that this server is a mail server with many
small files 'helped' in taking the MFT down.

Roel



James Harper wrote:
> 
>> James Harper wrote:
>> >
>> > . send me the output of 'xenstore-ls /local/domain/<id>/device'
>> > (substitute <id> for the domain id of the domain in question)
>> > . In device manager, you should see one 'Xen Block Device Driver'
>> > adapter per device (disk or cdrom). For each one, can you tell me the
>> > value of 'Device Instance Id' in the Properties -> Details tab?
>> > . send me a copy of your DomU config
>> > . if you know how to use the windows debugger, connect that to the DomU
>> > and send me the output. If you don't know, then just the above stuff
>> > might be sufficient to get started - it may be that the XenSource
>> > version does things a little differently for CDROM's or something which
>> > I might be able to tell immediately.
>> >
>> 
>> I did a "xe vm-list params=dom-id,name-label" to see a list of VMs and
>> their IDs.
>> Then I did "xenstore-ls /local/domain/28/device", which gave me:
>> 
>> [root@xensvr2 ~]# xenstore-ls /local/domain/28/device
>> vbd = ""
>>  832 = ""
>>   backend = "/local/domain/0/backend/vbd/28/832"
>>   state = "4"
>>   backend-id = "0"
>>   device-type = "disk"
>>   virtual-device = "832"
>>   event-channel = "6"
>>   ring-ref = "16383"
>>  768 = ""
>>   backend = "/local/domain/0/backend/vbd/28/768"
>>   state = "4"
>>   backend-id = "0"
>>   device-type = "disk"
>>   virtual-device = "768"
>>   event-channel = "7"
>>   ring-ref = "16238"
>>  5632 = ""
>>   backend = "/local/domain/0/backend/vbd/28/5632"
>>   state = "4"
>>   backend-id = "0"
>>   device-type = "disk"
>>   virtual-device = "5632"
>>   event-channel = "8"
>>   ring-ref = "16093"
>>  5696 = ""
>>   backend = "/local/domain/0/backend/vbd/28/5696"
>>   state = "4"
>>   backend-id = "0"
>>   device-type = "cdrom"
>>   virtual-device = "5696"
>>   event-channel = "9"
>>   ring-ref = "15948"
>> vif = ""
>>  0 = ""
>>   backend = "/local/domain/0/backend/vif/28/0"
>>   backend-id = "0"
>>   state = "4"
>>   handle = "0"
>>   mac = "1a:87:80:a6:b9:a2"
>>   tx-ring-ref = "15947"
>>   rx-ring-ref = "15946"
>>   event-channel = "10"
>>   feature-no-csum-offload = "0"
>>   feature-sg = "1"
>>   feature-gso-tcpv4 = "1"
>>   request-rx-copy = "1"
>>   feature-rx-notify = "1"
>> [root@xensvr2 ~]#
>> 
>> I think that is the Xen equivalent of XenServer-api: "xe vbd-list
>> params=all
>> vm-name-label=mailsvr1" which gives me this:  (see attached file
>> file1.txt)
>> http://www.nabble.com/file/p20515016/file1.txt file1.txt
>> 
>> Driver instance IDs:
>> - XEN\VBD\4&32FE5319&1&5632
>> - XEN\VBD\4&32FE5319&1&5696
>> - XEN\VBD\4&32FE5319&1&768
>> - XEN\VBD\4&32FE5319&1&832
> 
> Well there are 4 devices that the gplpv frontend is seeing, but
> obviously something is going wrong and the cdrom devices are never being
> reported to windows properly.
> 
> See the 4 'backend="/local/domain/0/backend/vbd/<id>/<dev>"' lines
> above? Can you do a xenstore-ls against each of those too. The frontend
> xenstore stuff looks okay, including 'state=4' which means that the
> frontend and backends are connected, but maybe the backend is giving
> some wrong information or something.
> 
>> (btw: i have now 3 drives connected and should have 1 cd-drive
>> connected,..
>> which i couldn't see)
>> 
>> Behavior
>> --------
>> When I hot-plug a device from the XenServer, I cannot see it in the
>> Windows 2003 VM (not even after a disk rescan or hardware detection).
>> When I reboot the VM, it detects a new device while starting Windows. I
>> click next, next, and it adds another "Xen Block Device Driver".
> 
> When I hot-add a network adapter it appears to work, but then all the
> network adapters go into 'acquiring dhcp address', but after that is
> done it all works again. Hot-removing a network adapter appears to work
> too, although after I do it from Xen, I have to 'safely remove' the
> device before it disappears from windows. Not sure exactly why that
> would be the case but I suppose it can be fixed.
> 
> Block devices though aren't going to work... I deliberately fail any
> attempt by Windows to recognise block devices added after system boot,
> just in case one of them is the same as the qemu devices (eg because
> you've just installed the drivers), with all the problems that that
> entails. I may be able to fix that too, but I'll have to be careful.
> 
>> Other BAD behavior
>> -------------------
>> Most of our VMs are on the SAN and connected to the XenServer via iSCSI.
>> When the shit hits the fan and the SAN goes down (broken switch, broken
>> cable, or something else), all our Windows VMs give BSODs, which is
>> quite normal behavior. 99% of the time we can reboot these VMs later
>> without any problems; very occasionally we need to run a chkdsk (luckily
>> NTFS is a journaling filesystem).
>> BUT: with the GPLPV drivers we do NOT get a BSOD; I waited 10
>> minutes for it. First I see some popups: "Can't write to <filename> or
>> <disk>", and then many, many more popups appear. Finally I did a
>> force-shutdown from the XenServer. When I then rebooted this VM, the
>> Master File Table (MFT) was corrupt and couldn't be repaired with chkdsk.
>> There were lots of errors on the drive and I had to recover some files
>> with "GetDataBack for NTFS".  ... a long night...   :(
>> I never had this with the XenServer PV tools. I think the GPLPV drivers
>> have a too-large disk cache (write cache?) or something? The best thing
>> to do is freeze the OS (I've seen that on Linux) or give a BSOD within a
>> short time (Windows); otherwise you're really screwing things up.
>> Just test it: start 5 VMs, 4 with the XenServer PV tools and 1 with the
>> GPLPV drivers, then pull the storage. The 4 VMs are the first to give a
>> BSOD, and the GPLPV one probably... never?
>> (The thing I don't understand is that when storage is completely broken,
>> it shouldn't matter whether the VM stays on for 10 seconds or 10 minutes:
>> it can't write through to the storage, so it can't corrupt things. This
>> made me think of a too-big storage buffer, maybe? So a too-big piece is
>> missing, or journaling is not in sync?)
> 
> Now that is interesting... yes, you are right in saying that once the
> 'plug' is pulled to the storage it doesn't really matter (from a data
> integrity point of view)  what happens thereafter... a BSoD may be the
> correct thing to do. I wonder what the backend will tell me... will it
> report a fail on the block request, or will it in turn wait for ages
> relying on me to fail the request instead? I'm also not sure if my
> drivers should be invoking the BSoD directly... I suspect that they
> should fail the request in such a way that Windows knows that all hope
> is lost and so Windows should instigate the BSoD.
> 
> Either way, it does sound like I'm doing something a bit strange that is
> causing problems. This may happen with requests that aren't aligned to a
> 512 byte boundary - requests larger than 4096 bytes may be written out
> of order (wrt other write requests), but those are seldom (never?) seen
> during normal use, just at boot time and during a few infrequent
> operations like formats.
> 
> I have definitely seen filesystem corruption after a crash (hanging the
> windows domu 'hard' should have the same effect as you were seeing -
> data not getting committed to the disk) that I didn't expect. I put it
> down to the circumstances of the crash but maybe there is more to it.
> 
> I am definitely not doing any write caching though - I don't tell
> Windows that the write is completed until the backend has finished with
> the write. The backend may, in turn, be doing write caching, but that
> should be the same as with the xensource drivers too.
> 
> Can you tell me, during the time the DomU is 'hung' because the SAN has
> disconnected, does the SAN come back online before the reboot? If I'm
> not managing read or write failures correctly, and suddenly the SAN
> comes back online again, then that could be causing problems. If you
> think that's the case I can look at the failure paths a bit closer.
> 
>> 
>> I'm still using 9.11pre20. I will try to figure out the Windows debugger
>> stuff.
> 
> Just give me the xenstore-ls of the backend for now. That may be enough
> to figure out what is going on.
> 
> James
> 
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users
> 
> 

-- 
View this message in context: 
http://www.nabble.com/GPLPV-%289.11pre20%29-in-Win2003-x64--on-XenServer-Enterprise-5.0-%28CD-drive-missing%29-tp20499705p20515504.html
Sent from the Xen - User mailing list archive at Nabble.com.




 

