
Re: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom



On 05.06.2020 18:18, 'Marek Marczykowski-Górecki' wrote:
> On Fri, Jun 05, 2020 at 04:39:56PM +0100, Paul Durrant wrote:
>>> From: Jan Beulich <jbeulich@xxxxxxxx>
>>> Sent: 05 June 2020 14:57
>>>
>>> On 05.06.2020 15:37, Paul Durrant wrote:
>>>>> From: Jan Beulich <jbeulich@xxxxxxxx>
>>>>> Sent: 05 June 2020 14:32
>>>>>
>>>>> On 05.06.2020 13:05, Paul Durrant wrote:
>>>>>> That would mean we wouldn't be seeing the "Unexpected PIO" message.
>>>>>> From that message this clearly came back with X86EMUL_UNHANDLEABLE
>>>>>> which suggests a race with ioreq server teardown, possibly due to
>>>>>> selecting a server but then not finding a vcpu match in
>>>>>> ioreq_vcpu_list.
>>>>>
>>>>> I was suspecting such, but at least the tearing down of all servers
>>>>> happens only from relinquish-resources, which gets started only
>>>>> after ->is_shut_down got set (unless the tool stack invoked
>>>>> XEN_DOMCTL_destroydomain without having observed XEN_DOMINF_shutdown
>>>>> set for the domain).
>>>>>
>>>>> For individually unregistered servers - yes, if qemu did so, this
>>>>> would be a problem. They need to remain registered until all vCPU-s
>>>>> in the domain got paused.
>>>>
>>>> It shouldn't be a problem, should it? Destroying an individual server
>>>> is only done with the domain paused, so no vcpus can be running at the
>>>> time.
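
(As an aside, the "only done with the domain paused" part corresponds to
something like the sketch below - paraphrased from my reading of the server
destroy path in xen/arch/x86/hvm/ioreq.c, with locking, refcounting and
error handling left out, so don't take the details as exact:)

int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
{
    struct hvm_ioreq_server *s = get_ioreq_server(d, id);   /* lock elided */

    if ( !s )
        return -ENOENT;

    /* No vCPU runs while the server is being torn down ... */
    domain_pause(d);

    hvm_ioreq_server_disable(s);
    hvm_ioreq_server_deinit(s);

    domain_unpause(d);

    /*
     * ... yet a vCPU that was already waiting for this server's response
     * resumes afterwards with no matching server left to complete against.
     */
    return 0;
}
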
>>>
>>> Consider the case of one getting destroyed after it has already
>>> returned data, but the originating vCPU didn't consume that data
>>> yet. Once that vCPU gets unpaused, handle_hvm_io_completion()
>>> won't find the matching server anymore, and hence the chain
>>> hvm_wait_for_io() -> hvm_io_assist() ->
>>> vcpu_end_shutdown_deferral() would be skipped. handle_pio()
>>> would then still correctly consume the result.
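
(To make the skipped step a bit more concrete, the assist side looks roughly
like the sketch below - a paraphrase rather than the exact source, with the
server/vCPU lookup done by handle_hvm_io_completion() and all locking
elided:)

static void hvm_io_assist(struct hvm_ioreq_vcpu *sv, uint64_t data)
{
    struct vcpu *v = sv->vcpu;
    ioreq_t *ioreq = &v->arch.hvm.hvm_io.io_req;   /* per-vCPU copy */

    /* Hand back the result (the real code only does so when needed) ... */
    ioreq->data = data;

    /* ... mark the request as consumed ... */
    ioreq->state = STATE_IOREQ_NONE;

    /* ... and drop the shutdown deferral taken when the I/O was issued. */
    vcpu_end_shutdown_deferral(v);

    sv->pending = false;
}
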
>>
>> True, and skipping hvm_io_assist() means the vcpu internal ioreq state will 
>> be left set to IOREQ_READY and *that* explains why we would then exit 
>> hvmemul_do_io() with X86EMUL_UNHANDLEABLE (from the first switch).
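
(That failure mode can be mimicked outside of Xen. The toy program below is
not Xen code - the types and names are invented for illustration - but it
follows the same state handling: with the assist step skipped, the per-vCPU
request stays in IOREQ_READY, so re-entering the emulation path hits the
default case of the state switch and comes back "unhandleable":)

/* Toy model only - not Xen code; structures and names are invented here. */
#include <stdbool.h>
#include <stdio.h>

/* Same ordering as Xen's public ioreq states, for familiarity. */
enum ioreq_state {
    STATE_IOREQ_NONE,
    STATE_IOREQ_READY,
    STATE_IOREQ_INPROCESS,
    STATE_IORESP_READY,
};

enum emul_rc { EMUL_OKAY, EMUL_UNHANDLEABLE };

struct toy_vcpu {
    enum ioreq_state io_state;   /* per-vCPU copy of the request state */
    bool server_registered;      /* stands in for the ioreq server lookup */
};

/* Stands in for hvm_io_assist(): consume the response, reset the state. */
static void toy_io_assist(struct toy_vcpu *v)
{
    v->io_state = STATE_IOREQ_NONE;
    /* (this is also where vcpu_end_shutdown_deferral() would run) */
}

/* Stands in for handle_hvm_io_completion(): assist only on a server match. */
static void toy_io_completion(struct toy_vcpu *v)
{
    if ( v->server_registered )
        toy_io_assist(v);
    /* else: no match, so the state silently stays STATE_IOREQ_READY */
}

/* Stands in for the first switch in hvmemul_do_io(). */
static enum emul_rc toy_do_io(struct toy_vcpu *v)
{
    switch ( v->io_state )
    {
    case STATE_IOREQ_NONE:
        return EMUL_OKAY;               /* fresh request can be issued */
    case STATE_IORESP_READY:
        v->io_state = STATE_IOREQ_NONE; /* consume an earlier response */
        return EMUL_OKAY;
    default:
        return EMUL_UNHANDLEABLE;       /* unexpected in-flight state */
    }
}

int main(void)
{
    struct toy_vcpu v = {
        .io_state = STATE_IOREQ_READY,  /* request sent, response pending */
        .server_registered = false,     /* server destroyed in the meantime */
    };

    toy_io_completion(&v);              /* assist skipped: no matching server */
    printf("toy_do_io() -> %s\n",
           toy_do_io(&v) == EMUL_UNHANDLEABLE ? "UNHANDLEABLE" : "OKAY");
    return 0;
}
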
> 
> I can confirm X86EMUL_UNHANDLEABLE indeed comes from the first switch in
> hvmemul_do_io(). And it happens shortly after the ioreq server is destroyed:
> 
> (XEN) d12v0 XEN_DMOP_remote_shutdown domain 11 reason 0
> (XEN) d12v0 domain 11 domain_shutdown vcpu_id 0 defer_shutdown 1
> (XEN) d12v0 XEN_DMOP_remote_shutdown domain 11 done
> (XEN) d12v0 hvm_destroy_ioreq_server called for 11, id 0

Can either of you tell why this is? As said before, qemu shouldn't
start tearing down ioreq servers until the domain has made it out
of all shutdown deferrals, and all its vCPU-s have been paused.
For the moment I think the proposed changes, while necessary, will
mask another issue elsewhere. The @releaseDomain xenstore watch,
which I would consider the relevant trigger here, fires only once
XEN_DOMINF_shutdown is reported set for a domain, and that flag is
derived from d->is_shut_down (i.e. not mistakenly from
d->is_shutting_down).
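
For reference, the derivation I mean is along these lines (paraphrasing
getdomaininfo(), with the other flag bits left out):

    /* getdomaininfo(), roughly - other bits omitted: */
    info->flags |= d->is_shut_down ? XEN_DOMINF_shutdown : 0;
    /* i.e. keyed on d->is_shut_down, not on d->is_shutting_down */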

Jan



 

