Re: [Xen-devel] PV-shim 4.13 assertion failures during vcpu_wake()
On 22.10.19 15:50, Roger Pau Monné wrote:
> On Tue, Oct 22, 2019 at 01:50:44PM +0200, Jürgen Groß wrote:
>> On 22.10.19 13:25, Roger Pau Monné wrote:
>>> On Tue, Oct 22, 2019 at 01:01:09PM +0200, Jürgen Groß wrote:
>>>> On 22.10.19 12:52, Roger Pau Monné wrote:
>>>>> On Tue, Oct 22, 2019 at 11:27:41AM +0200, Jürgen Groß wrote:
>>>>>> Since commit 8d3c326f6756d1 ("xen: let vcpu_create() select
>>>>>> processor") the initial processor for all pv-shim vcpus will be 0,
>>>>>> as no other cpus are online when the vcpus are created. Before that
>>>>>> commit the vcpus would have processors set not being online yet,
>>>>>> which worked just by chance.
>>>>>
>>>>> So all vCPUs for the shim have their hard affinity set to pCPU#0 if I
>>>>
>>>> No, the hard affinity is set to pcpu#(vcpu-id), but the initial cpu to
>>>> run on is pcpu#0 as no other cpu is online when the vcpus are being
>>>> created, and v->processor should always be a valid online cpu.
>
> Oh, I didn't know v->processor must always be valid, even for offline
> vCPUs. I'm quite sure the shim previously set v->processor to pCPUs
> that were not yet online.

Yes, that's the reason I wrote it was working just by chance.

>>>>> understand it correctly. From my reading of sched_setup_dom0_vcpus it
>>>>> seems like in the shim case all sched units are pinned to their id,
>>>>> which would imply sched units != 0 are not pinned to CPU#0?
>>>>
>>>> Right.
>>>>
>>>>> Or maybe there's only one sched unit that contains all the shim vCPUs?
>>>>
>>>> No.
>>>>
>>>>>> When the pv-shim vcpu becomes active it will have a hard affinity not
>>>>>> matching its initial processor assignment leading to failing ASSERT()s
>>>>>> or other problems depending on the selected scheduler.
>>>>>
>>>>> I'm slightly lost here, who has set this hard affinity on the pvshim
>>>>> vCPUs?
>>>>
>>>> That is done in sched_setup_dom0_vcpus().
>>>>
>>>>>> Fix that by redoing the affinity setting after onlining the cpu but
>>>>>> before taking the vcpu up.
>>>>>
>>>>> The change seems fine to me, but I don't understand why the lack of
>>>>> this can cause asserts to trigger, as reported by Sergey.
>>>>>
>>>>> I also wonder why a change to pin vCPU#0 to pCPU#0 is not required,
>>>>> because pv_shim_cpu_up is only used for APs.
>>>>
>>>> When vcpu 0 is being created pcpu 0 is online already. So the affinity
>>>> set in sched_setup_dom0_vcpus() is fine in that case.
>>>
>>> IIRC all shim vCPUs were pinned to their identity pCPU at creation, and
>>> there was no need to do this pinning when the vCPU is brought online. I
>>> guess this is no longer possible.
>>
>> The problem is not the pinning, but the initial cpu stored in
>> v->processor. This results in v->processor not being set in the hard
>> affinity mask of the vcpu (or better: unit) which then triggers the
>> problems.
>
> I guess just setting v->processor in pv_shim_cpu_up directly would be
> too intrusive?

Doing that behind the scheduler's back is asking for trouble.

> In any case, it seems dangerous to allow vCPUs (even when offline) to
> be in a state that when woken up will cause assertions inside the
> scheduling logic. Ie: it would be best IMO to not set the hard
> affinity in sched_setup_dom0_vcpus and instead set it when the pCPU is
> brought online, or maybe have vcpu_wake select a suitable v->processor
> value?

Yes, maybe we should remove the affinity setting for all but vcpu0 from
sched_setup_dom0_vcpus().

In case Sergey can confirm the current patch is working I can resend it
with the affinity setting removed in sched_setup_dom0_vcpus().

All other cases should be fine already, so no need to tweak vcpu_wake().


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
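
For readers following the thread, the mismatch discussed above (v->processor ending up outside the hard affinity mask, and the fix of redoing the affinity setting once the pCPU is online) can be illustrated with a small toy model. This is only a sketch under loud assumptions: `toy_vcpu`, `toy_vcpu_create()`, `toy_set_affinity()` and `toy_wake_invariant_holds()` are hypothetical names invented for illustration, the 32-bit mask is a stand-in for a real cpumask, and none of it is Xen's actual scheduler code or the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these types or helpers are Xen's real ones. */
typedef uint32_t toy_cpumask;           /* bit n set => pCPU n is in the mask */

struct toy_vcpu {
    unsigned int processor;             /* stands in for v->processor */
    toy_cpumask hard_affinity;          /* stands in for the unit's hard affinity */
};

static toy_cpumask mask_of(unsigned int cpu)
{
    return (toy_cpumask)1 << cpu;       /* callers pass cpu < 32 */
}

/* Lowest set bit of a non-empty mask. */
static unsigned int first_cpu(toy_cpumask m)
{
    unsigned int cpu = 0;

    while (!(m & mask_of(cpu)))
        cpu++;
    return cpu;
}

/*
 * Creation as described in the thread: the initial processor must be an
 * online pCPU (only pCPU 0 at that point), while the hard affinity is
 * pinned to pcpu#(vcpu-id). "online" must be non-empty.
 */
static void toy_vcpu_create(struct toy_vcpu *v, unsigned int id,
                            toy_cpumask online)
{
    v->processor = first_cpu(online);
    v->hard_affinity = mask_of(id);
}

/* The condition a wake-up path would want to hold. */
static bool toy_wake_invariant_holds(const struct toy_vcpu *v)
{
    return (v->hard_affinity & mask_of(v->processor)) != 0;
}

/*
 * Redo the affinity setting once the target pCPU is online, moving the
 * vCPU onto a pCPU that is both online and allowed by the new mask.
 */
static bool toy_set_affinity(struct toy_vcpu *v, toy_cpumask affinity,
                             toy_cpumask online)
{
    toy_cpumask valid = affinity & online;

    if (!valid)
        return false;                   /* nothing usable yet */
    v->hard_affinity = affinity;
    if (!(valid & mask_of(v->processor)))
        v->processor = first_cpu(valid);
    return true;
}
```

In this model, creating vCPU 1 while only pCPU 0 is online leaves the invariant broken; calling `toy_set_affinity()` after pCPU 1 has been added to the online mask restores it. vCPU 0 never hits the problem, because pCPU 0 is already online when it is created.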