
Re: [Xen-users] Cheap IOMMU hardware and ECC support importance


  • To: xen-users@xxxxxxxxxxxxx
  • From: Gordan Bobic <gordan@xxxxxxxxxx>
  • Date: Sat, 28 Jun 2014 13:26:22 +0100
  • Delivery-date: Sat, 28 Jun 2014 12:26:30 +0000
  • List-id: Xen user discussion <xen-users.lists.xen.org>

On 06/28/2014 12:25 PM, lee wrote:
> Kuba <kuba.0000@xxxxx> writes:
>
>> On 2014-06-28 09:45, lee wrote:
>>
>>> I don't know about ZFS, though, never used that.  How much CPU overhead
>>> is involved with that?  I don't need any more CPU overhead like what
>>> comes with software RAID.

>> ZFS offers you two things a RAID controller AFAIK cannot do for you:
>> end-to-end data checksumming and SSD caching.
>
> There might be RAID controllers that can do SSD caching.

I have never heard of one.

> SSD caching means two extra disks for the cache (or what happens when
> the cache disk fails?),
For ZIL (write caching), yes, you can use a mirrored device. For read 
caching it obviously doesn't matter.
> and ZFS doesn't increase the number of SAS/SATA ports you have.
No, but it does make the RAID and caching parts of a controller 
redundant, so you might as well just use an HBA (cheaper). Covering the 
whole stack, ZFS can also make much better use of on-disk caches (my 
4TB HGSTs have 64MB of cache each; if you have 20 of them on a 4-port 
SATA card with a 5-port multiplier on each port, that's 1280MB of cache, 
more than any comparably priced caching controller). Being aware of 
file-system-level operations, ZFS can be much cleverer about exactly 
when to flush what data to which disk. A caching controller, in 
contrast, being unaware of what is actually going on at the file system 
level, cannot leverage the on-disk cache for write caching; it has to 
rely on its own on-board cache, thus effectively wasting those 1280MB 
of disk cache.
> How does it do the checksumming?
Every block is checksummed, and the checksum is stored and verified on 
every read of that block. In addition, every block (including its 
checksum) is encoded with whatever extra redundancy you specified (e.g. 
mirroring, or n+1, n+2, n+3). So when you read a block, you also read 
the checksum stored with it, and if it checks out, you hand the data to 
the app with nothing else to be done. If the checksum doesn't match the 
data (silent corruption), or the read of one of the disks containing a 
piece of the block fails (non-silent corruption, e.g. a failed sector), 
ZFS will go and reconstruct the data from the remaining redundancy, 
hand the repaired data to the app, and rewrite the bad block.
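The read-verify-repair cycle can be illustrated with a toy sketch (illustrative only, nothing like ZFS's actual on-disk format; the two-copy mirror model and the function names are my own):

```python
# Toy model of end-to-end checksumming with self-healing:
# each block is stored as (data, checksum) on two mirrored "disks";
# a read that fails verification is repaired from the good copy.
import hashlib

def write_block(data: bytes):
    """Store the block with its checksum on two mirror copies."""
    csum = hashlib.sha256(data).hexdigest()
    return [(data, csum), (data, csum)]

def read_block(copies):
    """Return verified data, healing any corrupted copy from a good one."""
    for data, csum in copies:
        if hashlib.sha256(data).hexdigest() == csum:
            # Self-heal: rewrite every copy that fails verification.
            for j, (d, c) in enumerate(copies):
                if hashlib.sha256(d).hexdigest() != c:
                    copies[j] = (data, csum)
            return data
    raise IOError("unrecoverable corruption: no copy verifies")

mirror = write_block(b"important data")
mirror[0] = (b"bit-flipped data", mirror[0][1])  # silent corruption on copy 0
assert read_block(mirror) == b"important data"   # served from the good copy
assert mirror[0][0] == b"important data"         # corrupted copy repaired
```

A plain RAID mirror cannot do this last step: with no checksum, it has no way to know which of two differing copies is the correct one.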
> Read everything after it's been written to verify?
No, it is just written with a checksum on the block and encoded with 
the extra redundancy. If you have Seagate disks that support the 
feature, you can enable Write-Read-Verify at the disk level. I wrote a 
patch for hdparm for toggling the feature.
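Toggling that feature looks roughly like this (a sketch: /dev/sdX is a placeholder, and the exact mode values accepted by `--write-read-verify` depend on your drive and hdparm version, so check the man page first):

```shell
# Check whether the drive reports Write-Read-Verify support at all:
hdparm -I /dev/sdX | grep -i write-read-verify

# Enable the feature (non-zero mode; valid values are drive-specific):
hdparm --write-read-verify 2 /dev/sdX

# Disable it again:
hdparm --write-read-verify 0 /dev/sdX
```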
> I'll consider using it next time I need to create a file system.
ZFS is one of those things that once you start using them you soon 
afterwards have no idea how you ever managed without them. And when you 
have to make do without them, it feels like you're trying to read 
braille with hooks.


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users


 

