
Re: [Xen-users] add_random must be set to 1 for me - Archlinux HVM x64 - XenServer 7 Latest Patched



On Tue, Oct 25, 2016 at 1:29 PM, Austin S. Hemmelgarn
<ahferroin7@xxxxxxxxx> wrote:
> On 2016-10-25 13:40, WebDawg wrote:
>>
>> On Tue, Oct 25, 2016 at 6:29 AM, Austin S. Hemmelgarn
>> <ahferroin7@xxxxxxxxx> wrote:
>>>
>>> On 2016-10-24 14:53, WebDawg wrote:
>>>>
>>>>
>>>> On Wed, Oct 19, 2016 at 2:23 PM, Austin S. Hemmelgarn
>>>> <ahferroin7@xxxxxxxxx> wrote:
>>>> So adding 3 more vCPUs, for a total of 4 on the domU, by itself
>>>> speeds up the dd write to xvda to 20 MB a second.  But the IO load
>>>> also adds sy (system CPU time) load to almost all of the CPUs.  (The
>>>> CPU load has been sy load the entire time.)  All in all it sticks to
>>>> about 200-300% CPU use at this point.
>>>
>>>
>>>
>>>> The only thing that changes anything at this point is adding
>>>> oflag=direct to dd.  When I add that, CPU use drops dramatically and
>>>> write speed goes much higher.  Still, the CPU use compared to Debian
>>>> has not changed.
>>>
>>>
>>> OK, this suggests the issue is somewhere in the caching in the guest OS.
>>> My
>>> first thoughts knowing that are:
>>> 1. How much RAM does the VM have?
>>> 2. What value does `cat /proc/sys/vm/vfs_cache_pressure` show?
>>> 3. Are you doing anything with memory ballooning?
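>>>
>>> For reference, these are quick to check (a sketch; domU-name is just
>>> a placeholder for your VM's name-label):
>>>
>>>   # inside the domU
>>>   free -m
>>>   cat /proc/sys/vm/vfs_cache_pressure
>>>
>>>   # from dom0, to see the ballooning limits XenServer has applied
>>>   xe vm-list name-label=domU-name \
>>>     params=memory-static-min,memory-dynamic-min,memory-dynamic-max,memory-static-max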
>>
>>
>>  /proc/sys/vm/vfs_cache_pressure shows a value of 100 on both domUs.
>>
>> THE BALLOONING IS THE ANSWER.
>>
>> Okay,
>>
>> So I do not know what the deal is, but when you mentioned ballooning
>> I took a look at the memory settings of the domU.  These are the
>> settings I had:
>>
>> Static:  128 MiB/ 2 GiB
>> Dynamic:  2 GiB/ 4 GiB
>>
>> I cannot even set this at the command line.  I wanted to replicate
>> the problem after I fixed it, so I tried this:
>>
>> xe vm-memory-limits-set dynamic-max=400000000 dynamic-min=200000000
>> static-max=200000000 static-min=16777216 name-label=domU-name
>>
>> Error code: MEMORY_CONSTRAINT_VIOLATION
>>
>> Error parameters: Memory limits must satisfy: static_min <=
>> dynamic_min <= dynamic_max <= static_max
>>
>> The dynamic MAX was bigger than the static MAX, which should be
>> impossible to set, but it somehow happened.
>>
>> I do not know if it happened during the import from XenServer 6.5 to
>> XenServer 7, or because of one of the several software products I was
>> using to manage it, or if something just got corrupted.
>>
>> So I have been checking it all out after setting everything to this:
>>
>> Static:  128 MiB/ 2 GiB
>> Dynamic:  2 GiB/ 2 GiB
>>
>> It is all working as expected now!
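>>
>> For the record, an xe call that satisfies the constraint with those
>> values would look something like this (a sketch; the byte values are
>> 128 MiB and 2 GiB, and domU-name is just a placeholder):
>>
>>   xe vm-memory-limits-set name-label=domU-name static-min=134217728 \
>>     dynamic-min=2147483648 dynamic-max=2147483648 static-max=2147483648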
>>
>> Like I said... I cannot understand how the dynamic MAX ended up
>> bigger than the static MAX when XenServer does not allow you to set
>> that.  Does anyone have experience with these bad settings and can
>> explain why they were causing such bad CPU use?
>
> I have no idea how it could have happened, or why it was causing what you
> were seeing to happen.
>>
>>
>>>>
>>>> Debian is:  3.16.0-4-amd64
>>>>
>>>> archlinux is:  4.8.4-1-ARCH #1 SMP PREEMPT Sat Oct 22 18:26:57 CEST
>>>> 2016 x86_64 GNU/Linux
>>>>
>>>>
>>>> I set scsi_mod.use_blk_mq=0 and it looks like it did nothing.  My
>>>> /proc/cmdline shows that it is there and it should be doing
>>>> something... but my scheduler setting in the queue dir still says
>>>> none.
>>>>
>>>> Looking into this, I think the 'none' is a result of the Xen PVHVM
>>>> block front driver?
>>>
>>>
>>> Probably.  I've not been able to find any way to turn it off for the
>>> Xen PV block device driver.  That doesn't really surprise me:
>>> xen-blkfront has multiple (virtual) 'hardware' queues, and that is
>>> exactly the sort of thing blk-mq was designed to address (although
>>> right now it kind of sucks for anything like that except NVMe
>>> devices).
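>>>
>>> (A quick way to see what the guest actually ended up with, assuming
>>> the usual xvda naming, is:
>>>
>>>   cat /sys/block/xvda/queue/scheduler
>>>
>>> On a blk-mq queue that just prints "none" on 4.8-era kernels, and
>>> since xen-blkfront is not a SCSI device, scsi_mod.use_blk_mq has no
>>> effect on it.)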
>>>
>>
>> If I remember right, the kernel command line options to tune this are
>> not in the docs yet :/  At least not the ones I was looking at.
>
> Well, I've been looking straight at the code in Linux, and can't find it
> either (although I could just be missing it, C is not my language of choice,
> and I have even less experience reading kernel code).
>
>>
>>>>
>>>>>>
>>>>>>
>>>>>> If someone could shed some insight into why enabling the feeding of
>>>>>> IO timing/entropy data into /dev/random makes the 'system work',
>>>>>> that would be great.  Like I said, I am just getting into this and
>>>>>> I will be doing more tuning if I can.
>>>>>
>>>>>
>>>>>
>>>> ONCE AGAIN, I am wrong here.  add_random does nothing to help me
>>>> anymore.  In fact I cannot find any setting under queue that does
>>>> anything to help, at least for what I am trying to fix.
>>>>
>>>> I am sorry for this false information.
>>>>
>>>>> I'm kind of surprised at this though, since I've got half a dozen
>>>>> domains running fine with blk-mq, getting within 1% of the disk
>>>>> access speed the host sees (and the host is using blk-mq too, both
>>>>> in the device-mapper layer and the lower block layer).  Some info
>>>>> about the rest of the storage stack might be helpful (a couple of
>>>>> commands that pull most of this together are sketched below):
>>>>> 1. What type of backing storage are you using for the VM disks
>>>>>    (LVM, MD RAID, flat partitions, flat files, etc.)?
>>>>> 2. What Xen driver (raw disk, blktap, something else)?
>>>>> 3. What are you accessing in the VM (raw disk, partition, LVM
>>>>>    volume, etc.)?
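>>>>>
>>>>> A rough way to gather that (just a sketch; domU-name is a
>>>>> placeholder for your VM's name-label) would be, from dom0:
>>>>>
>>>>>   xe vbd-list vm-name-label=domU-name
>>>>>   xe sr-list params=name-label,type,content-type
>>>>>
>>>>> plus `lsblk` inside the domU to see what the guest itself is
>>>>> accessing.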
>>>>
>>>>
>>>>
>>>> This is a RAID 6 SAS Array.
>>>>
>>>> The kernel that I am using (archlinux: linux) is all vanilla except
>>>> for, it looks like, one patch:
>>>>
>>>>
>>>>
>>>> https://git.archlinux.org/svntogit/packages.git/tree/trunk?h=packages/linux
>>>>
>>>> That patch changes CONSOLE_LOGLEVEL_DEFAULT from 7 to 4.
>>>>
>>>> These are the results from some tests:
>>>>
>>>>  dd if=/dev/zero of=/root/test2 bs=1M oflag=direct
>>>>  19.7 MB/s
>>>>  CPU:  20%
>>>>
>>>>  dd if=/dev/zero of=/root/test2 bs=1M
>>>>  2.5 MB/s
>>>>  CPU:  100%
>>>
>>>
>>> This brings one other thing to mind: What filesystem are you using in the
>>> domU?  My guess is that this is some kind of interaction between the
>>> filesystem and the blk-mq code.  One way to check that would be to try
>>> writing directly to a disk from the domU instead of through the
>>> filesystem.
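>>>
>>> (Something along these lines would do it, against a spare or
>>> throwaway virtual disk so nothing important gets overwritten; xvdb
>>> here is only a placeholder:
>>>
>>>   dd if=/dev/zero of=/dev/xvdb bs=1M count=1024 oflag=direct
>>>   dd if=/dev/zero of=/dev/xvdb bs=1M count=1024
>>>
>>> Comparing those against the same runs through the filesystem would
>>> show whether the filesystem is involved at all.)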
>>>
>>
>> I am using ext4.
>
> Even aside from the fact that you figured out what was causing the
> issue, the fact that you're using ext4 pretty much completely rules
> out the filesystem as a potential cause.
>>
>>
>>>>
>>>> This is sy CPU / system CPU use; so something in the kernel?
>>>>
>>>> On the Debian domU almost no CPU is hit.
>>>>
>>>> I am also thinking that 20 MB/s is bad in general for my RAID6, as
>>>> almost nothing else is reading from or writing to it.  But one thing
>>>> at a time; the only reason I mention it is that it might help to
>>>> figure out this issue.
>>>
>>>
>>>
>>
>> So, now that it looks like I have figured out what is wrong (but not
>> how it got that way):  Does anyone have any pointers for increasing
>> the speed of the Local Storage array?  I know I can add the backup
>> battery, but even without that... a SAS RAID6 running at 20 MB a
>> second in the domU seems so slow....
>
> It really depends on a bunch of factors.  Things like how many disks
> are in the array, how many arrays the HBA is managing, whether it is
> set up for multipath, how much RAM dom0 and the domU have, what kind
> of memory bandwidth you have, and even how well the HBA driver is
> written can all have an impact.
>>
>>
>> In dom0 I get about 20-21 MB a second with dd oflag=direct.
>
> I would expect this to be better without oflag=direct.  Direct I/O tends to
> slow things down because it completely bypasses the page cache _and_ it
> pretty much forces things to be synchronous (which means it slows down
> writes _way_ more than it slows down reads).
>
> As a couple of points of comparison, using the O_DIRECT flag on the output
> file (which is what oflag=direct does in dd) cuts write speed by about 25%
> on the high-end SSD I have in my laptop, and roughly 30% on the consumer
> grade SATA drives in my home server system.
>
> And, as a general rule, performance with O_DIRECT isn't a good
> reference point for most cases.  Very little software uses it unless
> explicitly configured to do so, and pretty much everything in
> widespread use that can use it makes it opt-in (and usually pairs it
> with AIO, which dd can't emulate).
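>
> A rough way to compare the two paths on the same box (the test file
> path is just an example) is to time a cached write that ends with a
> flush against a direct write:
>
>   dd if=/dev/zero of=/root/test2 bs=1M count=1024 conv=fdatasync
>   dd if=/dev/zero of=/root/test2 bs=1M count=1024 oflag=direct
>
> conv=fdatasync makes dd call fdatasync() before reporting, so the
> buffered number includes actually flushing the page cache instead of
> just filling it.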
>>
>>
>> I think XenServer or Xen has some type of disk I/O load balancing,
>> or am I wrong?
>
> Not that I know of.  All the I/O scheduling is done by Domain 0 (or at least
> all the scheduling on the host side).  You might try booting dom0 with
> blk-mq disabled if it isn't already.
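>
> (If dom0's kernel is new enough to have blk-mq at all, that would mean
> something like adding these to the dom0 kernel line in the boot
> config; untested on XenServer 7's stock dom0 kernel, so treat it as a
> sketch:
>
>   scsi_mod.use_blk_mq=0
>   dm_mod.use_blk_mq=0
>
> The second one only matters if device-mapper is in the storage path.)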
>

Thanks for all the help with this.  It has been about a week in the
making, and while I am very satisfied to have figured it out, putting
this much work into a configuration error was really dissatisfying.  At
least I learned a great deal about the IO schedulers in Linux and even
more about how Xen handles it all.

I have been meaning to dive into Linux/Xen IO because of some bad
performance in the past, but never got around to it.  I have also been
meaning to work on compiling kernels on Arch Linux and such, so I have
learned a bit about that too.

Overall the experience has not all been that bad.

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
https://lists.xen.org/xen-users

 

