Let's see... the SAN has two controllers with 4GB of cache in each controller. Each controller has a single 4 x 2Gb FC controller. Two of those ports go to the switch; the other two create redundant loops with the disk arrays (going from the controller to one disk array, then to the next disk array, then to the second controller). The disks are FCATA disks; there are 30 active disks (with 2 hot spares). The SAN does RAID across the disks on a per-volume basis, and my e-mail volume is using a RAID10 configuration.
I've done most of the filesystem tuning I can without completely rebuilding the filesystem: atime is turned off. I've also adjusted the elevator per previous suggestions and played with some of the elevator tuning parameters. I haven't gotten around to trying something other than XFS yet; it's going to take a while to sync everything over from the existing filesystem to ext3 or something similar. I'm also contacting the SAN vendor to get their help with the situation.
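
For anyone who wants to double-check the same two settings on their own box, here is a rough Python sketch of how they could be verified. It only parses text in the format the kernel exposes in /sys/block/<dev>/queue/scheduler (active elevator shown in brackets) and /proc/mounts; the sample strings, the device name and the mount point below are made up for illustration and are not from my system.

```python
def active_scheduler(sysfs_text):
    """Return the elevator shown in [brackets], or None if none is marked."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def has_noatime(mounts_text, mount_point):
    """Return True if mount_point is listed with the noatime option."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            return "noatime" in fields[3].split(",")
    return False


# Hypothetical sample data; on a real system read the files instead, e.g.
# open("/sys/block/sdb/queue/scheduler").read() and open("/proc/mounts").read()
sample_scheduler = "noop anticipatory [deadline] cfq"
sample_mounts = (
    "/dev/sda1 / ext3 rw,relatime 0 0\n"
    "/dev/sdb1 /srv/mail xfs rw,noatime 0 0\n"
)

print("elevator:", active_scheduler(sample_scheduler))
print("noatime on /srv/mail:", has_noatime(sample_mounts, "/srv/mail"))
```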
-Nick
>>> On 2009/08/27 at 08:15, John Madden <jmadden@xxxxxxxxxxx> wrote:
> I'm not really sure that bandwidth is an issue - perhaps latency more than that. I don't think the amount of data is what's causing the problem; rather the number of transactions that the e-mail system is trying to do on the volume. The file sizes are actually pretty small - 1 to 4 Kb on average, so I think it's the large number of these files that it has to try to read rather than streaming a large amount of data. Both the SAN and the iostat output on both dom0 and domU indicate somewhere between 5000 and 20000 kB/s read rates - that's somewhere around 40Mb/s to 160Mb/s, which is well within the capability of the FC connection. The SAN is indicating I/O operations between 500 and 1500 I/O requests per second, which I assume is what's causing the problem.
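
The arithmetic in the quoted paragraph can be checked with a few lines of Python. This is only a sketch: it assumes decimal units (1 kB = 1000 bytes, 1 Mb = 1,000,000 bits), treats a 2Gb FC port as a nominal 2000 Mb/s, and pairs the low throughput figure with the low request rate and the high with the high, which the numbers above do not actually guarantee.

```python
def kBps_to_Mbps(kb_per_s):
    """Convert kilobytes per second to megabits per second (decimal units)."""
    return kb_per_s * 8 / 1000


def avg_request_kB(kb_per_s, iops):
    """Average size of one I/O request in kB, given throughput and request rate."""
    return kb_per_s / iops


FC_PORT_MBPS = 2000  # assumption: nominal rate of a single 2Gb FC port

# (read rate in kB/s, I/O requests per second); the pairing is an assumption
for kb_per_s, iops in [(5000, 500), (20000, 1500)]:
    mbps = kBps_to_Mbps(kb_per_s)
    share = 100 * mbps / FC_PORT_MBPS
    size = avg_request_kB(kb_per_s, iops)
    print(f"{kb_per_s} kB/s = {mbps:.0f} Mb/s ({share:.0f}% of one port), "
          f"about {size:.1f} kB per request at {iops} req/s")
```

Either way the link is nowhere near saturated, which fits the idea that the request rate rather than the bandwidth is the limiting factor.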
What's the backend inside the SAN look like? Look into amount of cache, number of spindles, RAID used, what else is using those spindles, etc.
500-1500 iops isn't a lot for a "SAN" in general, but given that your FC disks are going to get around 200 worst-case iops, you'd still need quite a few of them to push 1500 continuously (with your cache picking up some of the spikes). And that depends on workload (read/write, random or not, block size) and RAID type.
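
To put rough numbers on that, here's a back-of-the-envelope sketch in Python. It uses the usual rule of thumb that each front-end write on RAID10 costs two back-end writes (one per mirror side) and my 200 iops per disk figure from above; the read/write mix is a pure guess, cache hits are ignored, and FCATA drives may well deliver less per spindle than that, so treat the output as an order-of-magnitude estimate only.

```python
import math


def backend_iops(frontend_iops, read_fraction, write_penalty):
    """Disk-level I/O rate generated by a front-end load, ignoring cache hits."""
    reads = frontend_iops * read_fraction
    writes = frontend_iops * (1 - read_fraction)
    return reads + writes * write_penalty


def spindles_needed(frontend_iops, read_fraction, write_penalty, iops_per_disk):
    """Smallest whole number of disks able to sustain the back-end load."""
    load = backend_iops(frontend_iops, read_fraction, write_penalty)
    return math.ceil(load / iops_per_disk)


RAID10_WRITE_PENALTY = 2   # each write lands on both halves of a mirror
IOPS_PER_DISK = 200        # per-disk figure from the paragraph above

# The read fractions are assumptions; the real mix has to come from iostat.
for read_fraction in (1.0, 0.7, 0.0):
    disks = spindles_needed(1500, read_fraction, RAID10_WRITE_PENALTY, IOPS_PER_DISK)
    print(f"read fraction {read_fraction:.1f}: about {disks} disks for 1500 iops")
```

If other volumes share the same spindles, their load has to be added to the front-end figure before dividing.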
In case you haven't already, I'd look into the usual filesystem performance guides and do things like turning off atime and that lot. My feeling on this is that you're going to need to drive down those iops numbers.
What were your results on trying something other than xfs?
John
--
John Madden
Sr UNIX Systems Engineer
Ivy Tech Community College of Indiana
jmadden@xxxxxxxxxxx