
Re: [Xen-API] XCP 1.0 beta - Locked VDI issue


  • To: Jonathon Royle <jonathon@xxxxxxxxxxxxxxxx>, xen-api@xxxxxxxxxxxxxxxxxxx
  • From: Chris Percol <chris.percol@xxxxxxxxx>
  • Date: Mon, 10 Jan 2011 16:55:08 +0000
  • Cc:
  • Delivery-date: Mon, 10 Jan 2011 09:06:02 -0800
  • List-id: Discussion of API issues surrounding Xen <xen-api.lists.xensource.com>

Thanks, Jonathon, for confirming that the VDI issue of not being able to recover from a bad shutdown does not occur in XenServer 5.6 FP1. We are excited about our new QSAN, but not so excited about the prospect of a power outage leaving every guest unusable.

I'm also hoping someone cracks this problem, as it is also beyond my expertise.

Chris

On Sat, Jan 8, 2011 at 10:07 PM, Jonathon Royle <jonathon@xxxxxxxxxxxxxxxx> wrote:
Update following some more testing.

This bug/feature appears to be specific to XCP 1.0 - I pulled the power on an identical XenServer 5.6 FP1 system and it came back up with no issue. I also see Chris had the same issue with an iSCSI disk.

Also, when the VDI is not available (in XCP), I checked with fuser and there was no process using the underlying VHD.
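The sort of check I mean, for illustration (the SR mount point and filename below are placeholders; on a file-based SR the VHDs normally sit under /var/run/sr-mount/<sr-uuid>):

# No output from fuser means nothing has the VHD open
fuser -v /var/run/sr-mount/<sr-uuid>/<vdi-uuid>.vhd
# Cross-check with lsof
lsof | grep <vdi-uuid>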

This is beyond my technical expertise to fix, but attached are extracts from the logs.

/var/log/messages

Jan  8 11:57:15 durham xenguest: Determined the following parameters from xenstore:
Jan  8 11:57:15 durham xenguest: vcpu/number:1 vcpu/affinity:0 vcpu/weight:0 vcpu/cap:0 nx: 0 viridian: 1 apic: 1 acpi: 1 pae: 1 acpi_s4: 0 acpi_s3: 0
Jan  8 11:57:15 durham fe: 8251 (/opt/xensource/libexec/xenguest -controloutfd 6 -controlinfd 7 -debuglog /tmp...) exitted with code 2
Jan  8 11:57:16 durham xapi: [error|durham|642|Async.VM.start R:b2baff469ec4|xapi] Memory F 5011684 KiB S 0 KiB T 6141 MiB
Jan  8 11:57:16 durham xapi: [error|durham|305 xal_listen||event] event could not be processed because VM record not in database
Jan  8 11:57:16 durham xapi: [error|durham|305 xal_listen|VM (domid: 3) device_event = ChangeUncooperative false D:7c80cc6a38b5|event] device_event could not be processed because VM record not in database

/var/log/xensource.log


[20110108T11:57:16.349Z|debug|durham|642|Async.VM.start R:b2baff469ec4|sm] SM ext vdi_detach sr=OpaqueRef:da1ceffc-453a-9bf3-108c-1255793bc4a0 vdi=OpaqueRef:fafa5ede-b57a-648d-2f88-0a7aa9ea9b30
[20110108T11:57:16.514Z|debug|durham|642|Async.VM.start R:b2baff469ec4|storage_access] Executed detach succesfully on VDI '96ccff27-332b-4fb4-b01b-0ee6e70d3a43'; attach refcount now: 0
[20110108T11:57:16.514Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi] Vmops.start_paused caught: SR_BACKEND_FAILURE_46: [ ; The VDI is not available [opterr=VDI 96ccff27-332b-4fb4-b01b-0ee6e70d3a43 already attached RW];  ]: calling domain_destroy
[20110108T11:57:16.515Z|error|durham|642|Async.VM.start R:b2baff469ec4|xapi] Memory F 5011684 KiB S 0 KiB T 6141 MiB
[20110108T11:57:16.515Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xenops] Domain.destroy: all known devices = [  ]
[20110108T11:57:16.515Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xenops] Domain.destroy calling Xc.domain_destroy (domid 3)
[20110108T11:57:16.755Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xenops] No qemu-dm pid in xenstore; assuming this domain was PV
[20110108T11:57:16.756Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xenops] Domain.destroy: rm /local/domain/3
[20110108T11:57:16.762Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xenops] Domain.destroy: deleting backend paths
[20110108T11:57:16.768Z|debug|durham|642|Async.VM.start R:b2baff469ec4|locking_helpers] Released lock on VM OpaqueRef:478974cb-fed4-f2ac-8192-868c9e9cfe41 with token 4
[20110108T11:57:16.775Z|debug|durham|305 xal_listen|VM (domid 3) @releaseDomain D:021cb80f6c7a|dispatcher] Server_helpers.exec exception_handler: Got exception INTERNAL_ERROR: [ Vmopshelpers.Vm_corresponding_to_domid_not_in_db(3) ]
[20110108T11:57:16.776Z|error|durham|305 xal_listen||event] event could not be processed because VM record not in database
[20110108T11:57:16.776Z|debug|durham|305 xal_listen||event] VM (domid: 3) device_event = ChangeUncooperative false
[20110108T11:57:16.778Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi] Raised at pervasiveext.ml:26.22-25 -> pervasiveext.ml:22.2-9
[20110108T11:57:16.780Z|error|durham|305 xal_listen|VM (domid: 3) device_event = ChangeUncooperative false D:7c80cc6a38b5|event] device_event could not be processed because VM record not in database
[20110108T11:57:16.782Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi] Check operation error: op=snapshot
[20110108T11:57:16.782Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi] vdis_reset_and_caching: [(false,false);(false,false)]
[20110108T11:57:16.782Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi] Checking for vdis_reset_and_caching...
[20110108T11:57:16.782Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi] Op allowed!
[20110108T11:57:16.782Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi] Check operation error: op=copy
[20110108T11:57:16.782Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi] vdis_reset_and_caching: [(false,false);(false,false)]
[20110108T11:57:16.783Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi] Check operation error: op=clone
[20110108T11:57:16.786Z|debug|durham|642|Async.VM.start R:b2baff469ec4|dispatcher] Server_helpers.exec exception_handler: Got exception SR_BACKEND_FAILURE_46: [ ; The VDI is not available [opterr=VDI 96ccff27-332b-4fb4-b01b-0ee6e70d3a43 already attached RW];  ]

Hopefully somebody can diagnose and fix this bug.

Please note the VDIs had to be recreated after my last post, so the UUIDs are not the same; everything else is.

Regards,

Jon



-----Original Message-----
From: George Shuklin [mailto:george.shuklin@xxxxxxxxx]
Sent: 31 December 2010 14:11
To: Jonathon Royle
Cc: xen-api@xxxxxxxxxxxxxxxxxxx
Subject: RE: [Xen-API] XCP 1.0 beta - Locked VDI issue

Well... I'm not familiar with file-based VDI provisioning, but I think the
problem is not in XCP itself (well, XCP has the bug, but it only triggers the
state), but in some forgotten mount.

Check the fuser/lsof output, and try to restart xapi (xe-toolstack-restart).

Also look at /var/log/messages and /var/log/xensource.log; every domain
start fills them with a huge amount of debug info (I sometimes think this
info slows domain start/shutdown by at least half).
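
For illustration, the two steps together:

# Restart the xapi toolstack (this does not touch running guests)
xe-toolstack-restart
# Then watch both logs while retrying the VM start
tail -f /var/log/messages /var/log/xensource.log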

On Fri, 31/12/2010 at 13:56 +0000, Jonathon Royle wrote:
> George,
>
> Thanks - now my oops, I thought I had included SR details
>
> SR - /dev/cciss/C0d0p3 ext3, thin provisioned
>
> Server HP ML370 G5 - running Raid1
>
> NB: the same thing happens on RAID 5 (/dev/cciss/C0d1p1, also ext3). Not tested with local LVM, but I can do so.
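>
> For completeness, the SR type and backing device can be confirmed with something like:
>
> # Show SR type/content-type and the PBD's device-config
> xe sr-list params=uuid,name-label,type,content-type
> xe pbd-list params=sr-uuid,device-config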
>
> Jon
>
>
>
> -----Original Message-----
> From: George Shuklin [mailto:george.shuklin@xxxxxxxxx]
> Sent: 31 December 2010 13:13
> To: Jonathon Royle
> Cc: xen-api@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-API] XCP 1.0 beta - Locked VDI issue
>
> Oops, sorry, missed it.
>
> Next: is the SR iSCSI-based? Check whether the corresponding volume is still
> mounted/active and try to deactivate it with lvchange (the name of the LV will contain the VDI UUID).
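>
> For illustration (names are placeholders; on LVM-based SRs the LV is usually called VHD-<vdi-uuid> inside VG_XenStorage-<sr-uuid>):
>
> # Find an LV whose name contains the VDI UUID
> lvs | grep <vdi-uuid>
> # Deactivate it if it is still active
> lvchange -an /dev/VG_XenStorage-<sr-uuid>/VHD-<vdi-uuid>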
>
>
> On Fri, 31/12/2010 at 12:13 +0000, Jonathon Royle wrote:
> > George,
> >
> > Thanks for quick response.
> >
> > The list of created VBDs is as per my original post, i.e. only attached to the intended VM. As part of the initial troubleshooting I did remove all VBDs, as there was (from memory) an errant one.
> >
> > Regards,
> >
> > Jon
> >
> >
> > -----Original Message-----
> > From: George Shuklin [mailto:george.shuklin@xxxxxxxxx]
> > Sent: 31 December 2010 12:06
> > To: Jonathon Royle
> > Cc: xen-api@xxxxxxxxxxxxxxxxxxx
> > Subject: Re: [Xen-API] XCP 1.0 beta - Locked VDI issue
> >
> > Try looking at the VBDs created for this VDI (xe vbd-list vdi-uuid=UUID); some of
> > them may be attached to the control domain where the VM was stopped.
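> >
> > For illustration (UUIDs are placeholders):
> >
> > # List every VBD referencing the VDI
> > xe vbd-list vdi-uuid=<vdi-uuid>
> > # If one is attached to the control domain (dom0), unplug and remove it
> > xe vbd-unplug uuid=<vbd-uuid>
> > xe vbd-destroy uuid=<vbd-uuid>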
> >
> >
> > On Fri, 31/12/2010 at 11:42 +0000, Jonathon Royle wrote:
> > > First time post so hope I am using the correct list.
> > >
> > >
> > >
> > > I have been trialling the XCP 1.0 beta for a few weeks now and have had no
> > > issues until now. If the host is shut down ungracefully (a power failure in
> > > my case), then the VDI of the running VM becomes unusable when the host
> > > restarts.
> > >
> > >
> > >
> > >
> > >
> > > [root@----]# xe vm-start uuid=03ed2489-49f6-eb48-0819-549c74a96269
> > >
> > > Error code: SR_BACKEND_FAILURE_46
> > >
> > > Error parameters: , The VDI is not available [opterr=VDI
> > > fc77b366-950b-49be-90ce-2a466cf73502 already attached RW],
> > >
> > >
> > >
> > > I have been able to repeat this on several occasions.
> > >
> > >
> > >
> > > I have tried a toolstack restart and a host reboot, as well as vbd-unplug
> > > etc. The only solution I have found is to use sr-forget (a bit
> > > drastic) and then reintroduce the SR.
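> > >
> > > Roughly the workflow, for illustration (the SR type and the device-config values are placeholders and depend on the setup):
> > >
> > > xe sr-forget uuid=<sr-uuid>
> > > xe sr-introduce uuid=<sr-uuid> type=ext name-label=<sr-name> content-type=user
> > > # Reconnect the SR to the host
> > > xe pbd-create sr-uuid=<sr-uuid> host-uuid=<host-uuid> device-config:device=<device>
> > > xe pbd-plug uuid=<pbd-uuid>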
> > >
> > >
> > >
> > > Some config output.
> > >
> > >
> > >
> > > [root@----]# xe vdi-list uuid=fc77b366-950b-49be-90ce-2a466cf73502
> > >
> > > uuid ( RO)                : fc77b366-950b-49be-90ce-2a466cf73502
> > >
> > >           name-label ( RW): Cacti - /
> > >
> > >     name-description ( RW): System
> > >
> > >              sr-uuid ( RO): 0fe9e89c-e244-5cf2-d35d-1cdca89f798e
> > >
> > >         virtual-size ( RO): 8589934592
> > >
> > >             sharable ( RO): false
> > >
> > >            read-only ( RO): false
> > >
> > >
> > >
> > > [root@----]# xe vbd-list vdi-uuid=fc77b366-950b-49be-90ce-2a466cf73502
> > >
> > > uuid ( RO)             : 350d819b-ec36-faf4-5457-0a81668407f0
> > >
> > >           vm-uuid ( RO): 03ed2489-49f6-eb48-0819-549c74a96269
> > >
> > >     vm-name-label ( RO): Cacti
> > >
> > >          vdi-uuid ( RO): fc77b366-950b-49be-90ce-2a466cf73502
> > >
> > >             empty ( RO): false
> > >
> > >            device ( RO):
> > >
> > >
> > >
> > >
> > >
> > > Is this a known bug, or is there a better solution? I'm happy to test further.
> > >
> > >
> > >
> > > Regards,
> > >
> > >
> > >
> > > Jon
> > >
> > > _______________________________________________
> > > xen-api mailing list
> > > xen-api@xxxxxxxxxxxxxxxxxxx
> > > http://lists.xensource.com/mailman/listinfo/xen-api
> >
> >
>
>


_______________________________________________
xen-api mailing list
xen-api@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/mailman/listinfo/xen-api