
Re: Null scheduler and vwfi native problem



Hi Dario,

On 21/01/2021 18:32, Dario Faggioli wrote:
> On Thu, 2021-01-21 at 11:54 +0100, Anders Törnqvist wrote:
>> Hi,
>> I see a problem with destroy and restart of a domain. Interrupts are
>> not available when trying to restart a domain.
>>
>> The situation seems very similar to the thread "null scheduler bug"
>> https://lists.xenproject.org/archives/html/xen-devel/2018-09/msg01213.html
>> .
>
> Right. Back then, PCI passthrough was involved, if I remember
> correctly. Is it the case for you as well?

PCI passthrough is not yet supported on Arm :). However, the bug was reported with platform device passthrough.

[...]

>> "xl create" results in:
>> (XEN) IRQ 210 is already used by domain 1
>> (XEN) End of domain_destroy function
>>
>> Then repeated "xl create" looks the same until after a few tries I
>> also get:
>> (XEN) Begin of complete_domain_destroy function
>>
>> After that the next "xl create" creates the domain.
>>
>>
>> I have also applied the patch from
>> https://lists.xenproject.org/archives/html/xen-devel/2018-09/msg02469.html
>> .
>> This does seem to change the results.
>
> Ah... Really? That's a bit unexpected, TBH.
>
> Well, I'll think about it.
>
>> Starting the system without "sched=null vwfi=native" does not result
>> in the problem.
>
> Ok, how about, if you're up for some more testing:
>
>   - booting with "sched=null" but not with "vwfi=native"
>   - booting with "sched=null vwfi=native" but not doing the IRQ
>     passthrough that you mentioned above
>
> ?

I think we can skip the testing as the bug was fully diagnosed back then. Unfortunately, I don't think a patch was ever posted. The interesting bits start at [1]. Let me try to summarize here.

This has nothing to do with device passthrough, but the bug is easier to spot with it, as interrupts are only going to be released when the domain is fully destroyed (we should really release them during the relinquish period...).

The last step of the domain destruction (complete_domain_destroy()) will *only* happen when all the CPUs are considered quiescent from the RCU PoV.

As you pointed out on that thread, the RCU implementation in Xen requires each pCPU to enter the hypervisor (via hypercalls, interrupts...) from time to time.

This assumption doesn't hold anymore when using "sched=null vwfi=native" because a vCPU will not exit to Xen when it is idling (vwfi=native) and there may not be any other source of interrupts on that vCPU.

Therefore the quiescent state will never be reached on the pCPU running that vCPU.

From Xen's PoV, any pCPU executing guest context can be considered quiescent. So one way to solve the problem would be to mark the pCPU as quiescent when entering the guest.
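
To make the reasoning above concrete, here is a toy model in C. It is not Xen code: every name in it (toy_rcu, toy_enter_guest(), mark_on_guest_entry, ...) is made up for illustration, and the real RCU implementation is considerably more involved. It only sketches why a pCPU that never leaves the guest blocks the grace period, and how treating guest context as quiescent would avoid that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PCPUS 4

/* Toy model only; all names are hypothetical and not taken from Xen. */
struct toy_rcu {
    bool need_qs[NR_PCPUS];   /* pCPU still owes a quiescent state */
    bool in_guest[NR_PCPUS];  /* pCPU is currently executing guest context */
    bool mark_on_guest_entry; /* idea from this mail: guest context == quiescent */
};

/* Start a grace period: every pCPU has to report a quiescent state. */
static void toy_start_grace_period(struct toy_rcu *rcu)
{
    for (size_t cpu = 0; cpu < NR_PCPUS; cpu++) {
        /* With the idea enabled, a pCPU already in the guest owes nothing. */
        rcu->need_qs[cpu] = !(rcu->mark_on_guest_entry && rcu->in_guest[cpu]);
    }
}

/* pCPU enters the hypervisor (hypercall, interrupt, ...). */
static void toy_enter_hypervisor(struct toy_rcu *rcu, size_t cpu)
{
    assert(cpu < NR_PCPUS);
    rcu->in_guest[cpu] = false;
    rcu->need_qs[cpu] = false; /* quiescent state observed */
}

/* pCPU goes (back) to the guest. */
static void toy_enter_guest(struct toy_rcu *rcu, size_t cpu)
{
    assert(cpu < NR_PCPUS);
    rcu->in_guest[cpu] = true;
    if (rcu->mark_on_guest_entry)
        rcu->need_qs[cpu] = false;
}

/*
 * Once this returns true, deferred work (in the real system, something
 * like complete_domain_destroy()) would be allowed to run.
 */
static bool toy_grace_period_done(const struct toy_rcu *rcu)
{
    for (size_t cpu = 0; cpu < NR_PCPUS; cpu++) {
        if (rcu->need_qs[cpu])
            return false;
    }
    return true;
}
```

In this model, with mark_on_guest_entry left false, a pCPU that sits in the guest forever (the "sched=null vwfi=native" case) keeps toy_grace_period_done() returning false until something forces it into the hypervisor, which would match the repeated "xl create" failures. With the flag set, the grace period can complete without that pCPU ever exiting.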

Cheers,

[1] https://lore.kernel.org/xen-devel/acbeae1c-fda1-a079-322a-786d7528ecfc@xxxxxxx/

--
Julien Grall



 

