Hi,
This morning I ran into a problem with XCP 1.6 and an NFS storage repository.
We are running a pool of three identical servers, all of which access a shared NFS VHD storage repository on an external NFS server. This morning the NFS server crashed, so all VMs lost their hard drives and were more or less hanging.
What is the official recovery method in this case? XenCenter still showed the SR as “connected”, but a rescan of the SR failed. On the console of the XCP host, I could not cd into the mount point because of a “Stale NFS handle” error.
I wasn’t able to unmount or remount the SR because the still-running VMs had files open on it. Shutting down or migrating the VMs of course wasn’t possible either.
The only solution I found was a hard reboot of all servers in the pool. Is there a better way to recover from this kind of failure?
Thanks in advance,
Michael