
Re: [Xen-devel] [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type



On 08/05/17 11:52, Zhang, Xiong Y wrote:
>>>>> On 06.05.17 at 03:51, <xiong.y.zhang@xxxxxxxxx> wrote:
>>>>>>> On 05.05.17 at 05:52, <xiong.y.zhang@xxxxxxxxx> wrote:
>>>>> 'commit 1679e0df3df6 ("x86/ioreq server: asynchronously reset
>>>>> outstanding p2m_ioreq_server entries")' calls
>>>>> p2m_change_entry_type_global(), which sets entry.recalc=1. A
>>>>> subsequent get_entry() on a p2m_ioreq_server entry then returns
>>>>> the p2m_ram_rw type.
>>>>> But 'commit 6d774a951696 ("x86/ioreq server: synchronously reset
>>>>> outstanding p2m_ioreq_server entries when an ioreq server unmaps")'
>>>>> assumes get_entry() will return the p2m_ioreq_server type for such
>>>>> entries, and resets them on that basis. That assumption doesn't
>>>>> hold, so the synchronous reset doesn't work. As a result,
>>>>> ioreq.entry_count is still larger than zero after an ioreq server
>>>>> unmaps, and in the end DomU can't reboot.
>>>>
>>>> I've had trouble understanding this part already on v1 (btw, why is
>>>> this one not tagged v2?), and since I still can't figure it out I have to ask:
>>>> Why is it that guest reboot is being impacted here? From what I recall
>>>> a non-zero count should only prevent migration.
>>> [Zhang, Xiong Y] Sorry, although they solve the same issue, the solutions are
>>> totally different, so I didn't mark this as v2. I will mark the next
>>> version, which keeps this solution, as v2.
>>> During DomU reboot, it first unmaps the ioreq server in the shutdown process,
>>> then maps the ioreq server in the boot process. The following statement in
>>> p2m_set_ioreq_server() makes mapping the ioreq server fail, so DomU
>>> can't continue booting.
>>> if ( read_atomic(&p->ioreq.entry_count) )
>>>     goto out;
>>
>> It is clear that this statement would be the problematic one,
>> but I continue to not see why this would affect reboot: The rebooted
>> guest runs in another VM with, hence, a different p2m. I cannot see
>> why there would be a non-zero ioreq.entry_count the first time an
>> ioreq server claims the p2m_ioreq_server type for this new domain.
>>
> [Zhang, Xiong Y] This is what I see from xl dmesg when a DomU reboots:
> 1) unmap io_req_server with old domid
> 2) map io_req_server with old domid
> 3) unmap io_req_server with old domid
> 4) map io_req_server with new domid
>
> Steps 1) and 2) are triggered by our device reset handler in qemu: it
> destroys the old device handler, then creates a device handler with the
> old domid again. So we see ioreq.entry_count > 0 with the old domid, and
> the reboot process terminates.

Oh, so it prevents reboot of XenGT, but not of normal guests?

Why does a reboot cause the device to detach, re-attach, and then
re-detach again?

Also, I'm sorry for missing the bug during review, but it's a bit
annoying to find out that the core functionality of the patch -- detaching
and re-attaching -- wasn't tested at all before submission.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

