RE: [Xen-users] GPLPV (9.11pre20) in Win2003 x64 on XenServerEnterprise 5.0 (CD drive missing)
> James Harper wrote:
> >
> > . send me the output of 'xenstore-ls /local/domain/<id>/device'
> >   (substitute <id> for the domain id of the domain in question)
> > . In device manager, you should see one 'Xen Block Device Driver'
> >   adapter per device (disk or cdrom). For each one, can you tell me the
> >   value of 'Device Instance Id' in the Properties -> Details tab?
> > . send me a copy of your DomU config
> > . if you know how to use the windows debugger, connect that to the DomU
> >   and send me the output. If you don't know, then just the above stuff
> >   might be sufficient to get started - it may be that the XenSource
> >   version does things a little differently for CDROM's or something which
> >   I might be able to tell immediately.
> >
>
> I did a "xe vm-list params=dom-id,name-label" to see a list of VMs and
> their IDs.
> Then I did "xenstore-ls /local/domain/28/device" which gave me:
>
> [root@xensvr2 ~]# xenstore-ls /local/domain/28/device
> vbd = ""
>  832 = ""
>   backend = "/local/domain/0/backend/vbd/28/832"
>   state = "4"
>   backend-id = "0"
>   device-type = "disk"
>   virtual-device = "832"
>   event-channel = "6"
>   ring-ref = "16383"
>  768 = ""
>   backend = "/local/domain/0/backend/vbd/28/768"
>   state = "4"
>   backend-id = "0"
>   device-type = "disk"
>   virtual-device = "768"
>   event-channel = "7"
>   ring-ref = "16238"
>  5632 = ""
>   backend = "/local/domain/0/backend/vbd/28/5632"
>   state = "4"
>   backend-id = "0"
>   device-type = "disk"
>   virtual-device = "5632"
>   event-channel = "8"
>   ring-ref = "16093"
>  5696 = ""
>   backend = "/local/domain/0/backend/vbd/28/5696"
>   state = "4"
>   backend-id = "0"
>   device-type = "cdrom"
>   virtual-device = "5696"
>   event-channel = "9"
>   ring-ref = "15948"
> vif = ""
>  0 = ""
>   backend = "/local/domain/0/backend/vif/28/0"
>   backend-id = "0"
>   state = "4"
>   handle = "0"
>   mac = "1a:87:80:a6:b9:a2"
>   tx-ring-ref = "15947"
>   rx-ring-ref = "15946"
>   event-channel = "10"
>   feature-no-csum-offload = "0"
>   feature-sg = "1"
>   feature-gso-tcpv4 = "1"
>   request-rx-copy = "1"
>   feature-rx-notify = "1"
> [root@xensvr2 ~]#
>
> I think that is the Xen equivalent of the XenServer API command
> "xe vbd-list params=all vm-name-label=mailsvr1", which gives me this
> (see attached file file1.txt):
> http://www.nabble.com/file/p20515016/file1.txt file1.txt
>
> Driver instance IDs:
> - XEN\VBD\4&32FE5319&1&5632
> - XEN\VBD\4&32FE5319&1&5696
> - XEN\VBD\4&32FE5319&1&768
> - XEN\VBD\4&32FE5319&1&832

Well, there are 4 devices that the gplpv frontend is seeing, but obviously
something is going wrong and the cdrom devices are never being reported
to Windows properly. See the 4
'backend="/local/domain/0/backend/vbd/<id>/<dev>"' lines above? Can you
do a xenstore-ls against each of those too? The frontend xenstore stuff
looks okay, including 'state=4' which means that the frontend and
backends are connected, but maybe the backend is giving some wrong
information or something.

> (btw: I now have 3 drives connected and should have 1 CD drive
> connected, which I couldn't see)
>
> Behavior
> --------
> When I hot-plug a device from the XenServer, I cannot see it in the
> Windows 2003 VM (not even after a rescan disk or hardware detect). When
> I reboot the VM, it will detect a new device when starting Windows. I
> click next..next.. and it adds another "Xen Block Device Driver".

When I hot-add a network adapter it appears to work, but then all the
network adapters go into 'acquiring dhcp address'; after that is done
it all works again. Hot-removing a network adapter appears to work too,
although after I do it from Xen, I have to 'safely remove' the device
before it disappears from Windows. Not sure exactly why that would be the
case but I suppose it can be fixed.

Block devices though aren't going to work...
I deliberately fail any attempt by Windows to recognise block devices
added after system boot, just in case one of them is the same as the qemu
devices (eg because you've just installed the drivers), with all the
problems that that entails. I may be able to fix that too, but I'll have
to be careful.

> Other BAD behavior
> -------------------
> Most of our VMs are on the SAN and connected with iSCSI to the XenServer.
> When the shit-hits-the-fan and the SAN is going down (broken switch..
> broken cable.. or just something else) all our Windows VMs give BSODs,
> which is quite normal behavior. 99% of the time we can reboot these VMs
> later without any problems; very occasionally we need to run a chkdsk
> (luckily NTFS is a journalling filesystem).
> BUT: With the GPLPV drivers.. we do NOT get a BSOD; I've waited 10
> minutes for it. First I see some popups: "Can't write to <filename> or
> <disk>" and it will raise many..many popups. Finally I did a
> force-shutdown from the XenServer. Then when rebooting this VM, the
> Master File Table (MFT) was corrupt and couldn't be repaired with chkdsk.
> There were lots of errors on the drive and I had to recover some files
> with "GetDataBack for NTFS". ... a long night... :(
> I never had this with the XenServer PV-tools. I think the GPLPV drivers
> have a too large disk-cache (write cache?) or something? The best is
> to: freeze the OS (I've seen that on Linux) or give a BSOD within a
> short time (Windows)... otherwise you're really screwing up things...
> Just test it: Put 5 VMs on, 4 with the XenServer PV-tools and 1 with
> the GPLPV drivers, then pull off the storage. The 4 VMs are the first
> to give a BSOD.. and the GPLPV one probably... never..?
> (the thing I don't understand is that when storage is completely broken,
> it wouldn't matter if the VM is on for 10secs. or 10mins.. it can't
> write through the storage so it can't corrupt things...
> This thought led me to think about a too-big storage buffer maybe? So
> a too-big piece is missing... or journalling is not in sync..?)

Now that is interesting... yes, you are right in saying that once the
'plug' is pulled to the storage it doesn't really matter (from a data
integrity point of view) what happens thereafter... a BSoD may be the
correct thing to do.

I wonder what the backend will tell me... will it report a fail on the
block request, or will it in turn wait for ages relying on me to fail the
request instead? I'm also not sure if my drivers should be invoking the
BSoD directly... I suspect that they should fail the request in such a
way that Windows knows that all hope is lost and so Windows should
instigate the BSoD.

Either way, it does sound like I'm doing something a bit strange that is
causing problems. This may happen with requests that aren't aligned to a
512 byte boundary - requests larger than 4096 bytes may be written out of
order (wrt other write requests), but those are seldom (never?) seen
during normal use, just at boot time and during a few infrequent
operations like formats.

I have definitely seen filesystem corruption after a crash that I didn't
expect (hanging the Windows DomU 'hard' should have the same effect as
you were seeing: data not getting committed to the disk). I put it down
to the circumstances of the crash but maybe there is more to it.

I am definitely not doing any write caching though - I don't tell Windows
that the write is completed until the backend has finished with the
write. The backend may, in turn, be doing write caching, but that should
be the same as with the xensource drivers too.

Can you tell me, during the time the DomU is 'hung' because the SAN has
disconnected, does the SAN come back online before the reboot? If I'm not
managing read or write failures correctly, and suddenly the SAN comes
back online again, then that could be causing problems.
If you think that's the case I can look at the failure paths a bit closer.

> I'm still using 9.11pre20. I will try to find out the Windows debugger
> stuff..

Just give me the xenstore-ls of the backend for now. That may be enough
to figure out what is going on.

James

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
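
A rough sketch of how the backend listings requested above might be collected (this is not part of the original message). The helper reads xenstore-ls output on stdin and prints the value of each backend entry. The function name backend_paths is made up for illustration, and the domain id 28 in the usage note is taken from the output quoted earlier in the thread.

```shell
# Hypothetical helper: read `xenstore-ls` output on stdin and print the
# value of every `backend = "..."` entry, one path per line.
# `backend-id = ...` lines are skipped because the pattern requires
# ` = ` directly after the word `backend`.
backend_paths() {
    sed -n 's/^[[:space:]]*backend = "\(.*\)"[[:space:]]*$/\1/p'
}
```

In dom0 this could be used, for example, as `xenstore-ls /local/domain/28/device/vbd | backend_paths | while read -r p; do echo "== $p"; xenstore-ls "$p"; done`, which should print one backend listing per virtual block device, assuming xenstore-ls is on the PATH and the domain id is still 28.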