Re: Performance degradation in 4.15 and above
On 2023-05-23 10:16, Tomas Mozes wrote:
Another thing that came to mind: the lockups occurred when the
grant table was full.
domU config:
max_grant_frames = 256
grub config:
GRUB_CMDLINE_XEN="gnttab_max_frames=256 sched=credit ..."
You can check it with:
xen-diag gnttab_query_size [domid]
For Dom0, nr_frames is 1; for the DomUs it is between 15 and 30, while
max_nr_frames is 64 for all of them.
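A minimal sketch of how the check above could be scripted for one domain. It assumes xen-diag prints a single line of the form "domid=N: nr_frames=N, max_nr_frames=N"; that format is an assumption and is not confirmed in this thread, so adjust the pattern if your build prints something else. The function names and the 0.9 threshold are hypothetical.

```python
import re
import subprocess

# ASSUMPTION: xen-diag prints e.g. "domid=3: nr_frames=30, max_nr_frames=64".
_GNTTAB_LINE = re.compile(r"domid=(\d+): nr_frames=(\d+), max_nr_frames=(\d+)")


def parse_gnttab_size(line):
    """Parse one line of (assumed) gnttab_query_size output into a dict."""
    match = _GNTTAB_LINE.search(line)
    if match is None:
        raise ValueError("unexpected xen-diag output: %r" % line)
    domid, used, limit = (int(group) for group in match.groups())
    return {
        "domid": domid,
        "nr_frames": used,
        "max_nr_frames": limit,
        "usage": used / limit,  # fraction of the allowed frames in use
    }


def is_near_limit(info, threshold=0.9):
    """Return True if the domain uses at least `threshold` of its frames."""
    return info["usage"] >= threshold


def query_domain(domid):
    """Run xen-diag for one domain (needs root in Dom0) and parse the result."""
    result = subprocess.run(
        ["xen-diag", "gnttab_query_size", str(domid)],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_gnttab_size(result.stdout)
```

Calling query_domain(domid) for each running DomU and passing the result to is_near_limit() would flag a domain whose nr_frames is approaching max_nr_frames, which is the condition described above.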
On Fri, May 19, 2023 at 1:04 PM Gabor Hudiczius <ghudiczius@xxxxxxxxx>
wrote:
On 2023-05-19 11:48, Tomas Mozes wrote:
On Fri, May 19, 2023 at 11:19 AM Gabor Hudiczius
<ghudiczius@xxxxxxxxx> wrote:
Hi,
I have an old ProLiant DL380 server running Gentoo Linux as Dom0 on
Xen, with several DomUs also running Gentoo Linux. After upgrading to
4.15 I noticed that in some of the DomUs (those used as Kubernetes
nodes) the load slowly keeps climbing until the DomU becomes
unresponsive and needs to be restarted. This issue is not present on
Xen 4.14 and went away once I downgraded back to 4.14. The same issue
appeared again after upgrading to 4.16.
According to some Munin graphs the load increases by 2-4 per day, but
as far as I can tell nothing else really changes (CPU usage, number of
processes), so I don't really have an idea what is causing the issue.
Both the Dom0 and the DomUs are running a hardened-gentoo kernel,
version 5.10.156 (see the attached .config).
I tried kernel version 5.15.110, but that did not help. I will give
6.1.28 a try as well.
If anyone has any pointers regarding where to look or what can be
tweaked, I would be grateful for the information.
Regards,
Gabor
Hello Gabor,
I remember having these problems:
- with credit2 scheduler
I have been using the credit scheduler ever since upgrading to 4.12:
my box stalled several times, and I followed the recommendation from
the Gentoo wiki
(https://wiki.gentoo.org/wiki/Xen#Xen_domU_hanging_with_Xen_4.12.2B),
which seemed to solve the issue.
- with kernel 5.15 at some point in time (around kernel 5.15.32), but
it is OK with current versions.
Tomas
I also noticed that restarting the DomUs has little to no effect on the
load; only restarting the Dom0 brings the load back to normal levels.