
Re: [Xen-users] how to start VMs in a particular order


  • To: xen-users@xxxxxxxxxxxxx
  • From: Joost Roeleveld <joost@xxxxxxxxxxxx>
  • Date: Mon, 30 Jun 2014 09:11:19 +0200
  • Delivery-date: Mon, 30 Jun 2014 07:11:53 +0000
  • List-id: Xen user discussion <xen-users.lists.xen.org>

On Sunday 29 June 2014 17:35:17 lee wrote:
> "J. Roeleveld" <joost@xxxxxxxxxxxx> writes:
> > On Sunday, June 29, 2014 10:09:32 AM lee wrote:
> >> > Quite possibly. Am I correct in assuming you are using old hardware
> >> > with
> >> > closed-source software?
> >> 
> >> It's an IBM x3650 7979 L2G with a ServeRaid 8k.  Arcconf seems to be
> >> closed source --- I don't really need arcconf, though.
> >> 
> >> Unfortunately, disabling the status checking hasn't solved the problem.
> >> The server goes down with messages about the SCSI bus hanging and trying
> >> to reset it.  I suspect that the controller doesn't like the --- rather
> >> unsuited --- WD20EARS I plugged in.  They have been working fine with a
> >> HP smart array P800, though.  I might have to take them out to see if
> >> the problem persists.
> > 
> > SCSI bus hanging, sounds like an I/O issue.
> > Try to read the SMART-values of the disk.
> 
> I'm not sure how to do that, and what would they tell me?

Connect the disks directly to a SATA port on a mainboard (a normal
desktop would suffice). Alternatively, disabling the RAID functionality of
the card might also suffice.
Then use (assuming the disk is /dev/sda):
# smartctl --all /dev/sda
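
As for what the values tell you, here is a rough sketch that filters the
attribute table printed by "smartctl -A" down to the entries I would look at
first. The helper name smart_summary is made up for illustration, and it
assumes the usual table layout with the raw value in the tenth column.

```shell
# Hypothetical helper: reads `smartctl -A` output on stdin and prints
# "ATTRIBUTE_NAME RAW_VALUE" for a few attributes of interest.
# Assumes the standard attribute table, where column 2 is the attribute
# name and column 10 is the raw value.
smart_summary() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Load_Cycle_Count|UDMA_CRC_Error_Count)$/ { print $2, $10 }'
}

# Usage (as root, assuming the disk is /dev/sda):
#   smartctl -A /dev/sda | smart_summary
```

Non-zero reallocated or pending sector counts point at the disk surface,
CRC errors usually point at cabling or the backplane, and a rapidly growing
load cycle count suggests the heads are being parked very often.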

> > Also, try a different disk...
> 
> Unfortunately, I don't have one I could try --- and I'd need three.
> 
> > The WD20EARS is a "green" desktop disk. I had numerous issues when using a
> > couple of those in my old server when using software raid (mdadm).
> > Some hardware raid cards do not like disks that do not properly return
> > error- states. And especially the green disks that have a tendency to go
> > into powersave mode when not used for a short period of time.
> 
> I know, they aren't suited for this purpose.  Yet they have been working
> fine on the P800, and that three disks should decide to go bad in a way
> that blocks the controller (or whatever happens) every now and then
> seems unlikely.

No, it doesn't.
Does the error occur after the server has been idle for a while? Or when the 
disks are being stressed?

If the former, then you need to figure out how to prevent the disks from
entering power-saving mode. It takes time for the disks to spin up again
afterwards, and the RAID controller times out while waiting for them.

If the latter, then the drives themselves might have problems which they
are trying to recover from internally.

My guess is that it is the former (i.e. it happens when the server has been
idle for a while).

> So I think it's more likely an incompatibility of these disks with the
> ServeRaid controller than the disks being bad, and I'd have to replace
> all of them.  Or this controller just sucks.

Yep, incompatibility. Not necessarily with these disks, but with the
power-saving settings in the disks' firmware. I believe there are tools
available that you could use to adjust those settings, but I have no
experience with them, and you would need to connect the disks directly to a
standard SATA port and use MS Windows (as I think those are MS Windows tools).

> IBM has supposedly fixed such issues with firmware updates, and
> I updated everything I could even before installing the disks.

Check the settings on the RAID card for power-saving, spin-down and spin-up
timeouts.

> > The raid-card can easily end up trying to throw that disk out of the
> > raid- array. If that is the only disk, that will mean the disk
> > suddenly disappeared, causing kernel panics.
> 
> It's three in a RAID-5, data only.  There are two small SAS disks in a
> RAID-1 for the system.
> 
> > I currently use WD Red drives with hardware raid cards.
> 
> Yes, I have two of those, 3TB each --- in the desktop on SATA ports now
> in software RAID-1 because I need them for backups.  I don't like
> backups on hardware raid, and both RAID controllers are limited to
> max. 2TB per disk.  The WD reds work fine on the P800, though it only
> sees them as 2TB.  I can't put them into the server because I need more
> than 2TB.

You could try changing the raid controller?

> So there I'm stuck :(  The plan was to have my data on the server.
> Perhaps I'll have to declare the experiment as failed and sell the
> server.

Not necessarily, but I would advise against using green drives in a server
when using hardware RAID cards.

> >> > I use Xen on servers where stability is more important then a fast
> >> > boottime. (especially as the BIOS takes longer then booting the OS)
> >> 
> >> Well, I wish the server was running stable!
> > 
> > See my comment about your disk above. Replace it or connect it directly to
> > the mainboard, bypassing the raid controller.
> 
> Afaik, the board doesn't have SATA connectors.  The disks are neatly
> contained in an enclosure, through which they are connected to the
> available SAS/SATA ports, which are provided via the ServeRaid 8k.
> Even if the board had additional SATA ports, I'd have the disks lying
> around on top of the case and would need an external power supply for
> them, which I don't have.
> 
> I could probably run the disks as JBOD.  If they are incompatible with
> the controller, that won't help.

Try passing the disks through to the OS individually. Then use Linux software
RAID (mdadm) to build the array. That should work better, as the RAID firmware
on the card won't run into timeouts after power saving kicks in.
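
A minimal sketch of the mdadm side, assuming the three disks show up as
/dev/sdb, /dev/sdc and /dev/sdd and that /dev/md0 is free. Those names and
the helper make_raid5 are my assumptions, so check them against your system.
Creating the array destroys whatever is on the disks, so the helper only
prints the command unless DRY_RUN=0 is set.

```shell
# Hypothetical helper: build the mdadm command for a RAID-5 array.
# First argument is the md device, the remaining arguments are the member
# disks. By default the command is only printed (dry run).
make_raid5() {
    md_dev=$1
    shift
    # After the shift, $# is the number of member disks and $* lists them.
    cmd="mdadm --create $md_dev --level=5 --raid-devices=$# $*"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$cmd"
    else
        # Really creates the array; existing data on the disks is lost.
        $cmd
    fi
}

# Usage (device names are assumptions):
#   make_raid5 /dev/md0 /dev/sdb /dev/sdc /dev/sdd
#   DRY_RUN=0 make_raid5 /dev/md0 /dev/sdb /dev/sdc /dev/sdd
```

Afterwards, "cat /proc/mdstat" shows the state of the array while it builds.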

> Perhaps the controller is broken.  Or it's something that xen does.

Xen has nothing to do with this.
Most likely: raid-controller <-> disks incompatibility.

> >> > All the domUs have their console inside a screen-session. These also
> >> > log
> >> > the output to:
> >> > /var/log/xen-consoles/<domu-name>.log
> >> > 
> >> > By checking if these have the login prompt, you can also ensure the
> >> > domU
> >> > has started correctly. At least the scripts I get with Gentoo cycle
> >> > when
> >> > the screen-session is created.
> >> 
> >> Hmmm ... That is really going to lengths.
> > 
> > Many roads lead to Rome :)
> > Likewise, many ways exist to achieve what you (and I) want. I do not know
> > of an existing tool that does this simply. On a different list, people
> > talk about using puppet or adding additional scripts as dependencies.
> 
> I wish it was a feature of xen --- that would make sense, but how would
> xen know when a VM is fully up ...

It can, actually.

If you have client utilities running inside the VM, those can easily check
when the VM is fully booted (set them to start last, for instance).

Those utilities can then use the xen-api to inform the host.
Read up on xenfs and xenstore; they can be used to communicate between the
guest and the host.
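
A rough sketch of how that could look with the xenstore command-line tools,
assuming the xl toolstack in dom0 and the xenstore utilities installed in the
guest. The key name data/boot-complete and the helper wait_for_domu are made
up for illustration; this is not an existing Xen feature.

```shell
# Guest side (run from whatever starts last during boot):
#   xenstore-write data/boot-complete 1
# The relative path lands under /local/domain/<domid>/ of that guest.

# dom0 side, hypothetical helper: wait until the named domU has written the
# key, or give up after a timeout in seconds (default 300).
# XL and XS_READ can be overridden, which is mainly useful for testing.
wait_for_domu() {
    name=$1
    timeout=${2:-300}
    domid=$(${XL:-xl} domid "$name") || return 1
    while [ "$timeout" -gt 0 ]; do
        val=$(${XS_READ:-xenstore-read} \
            "/local/domain/$domid/data/boot-complete" 2>/dev/null) || val=
        [ "$val" = 1 ] && return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    return 1
}

# Usage in dom0 (domU names and config paths are assumptions):
#   for vm in database web; do
#       xl create "/etc/xen/$vm.cfg" && wait_for_domu "$vm" 600 || break
#   done
```

The loop in the usage comment starts each domU only after the previous one
has reported in, which gives the start order asked about in this thread.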

--
Joost

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users


 

