[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type

To: George Dunlap <george.dunlap@xxxxxxxxxx>, "Zhang, Xiong Y" <xiong.y.zhang@xxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>
From: Yu Zhang <yu.c.zhang@xxxxxxxxxxxxxxx>
Date: Tue, 9 May 2017 13:21:53 +0800
Cc: "george.dunlap@xxxxxxxxxxxxx" <george.dunlap@xxxxxxxxxxxxx>, "andrew.cooper3@xxxxxxxxxx" <andrew.cooper3@xxxxxxxxxx>, "paul.durrant@xxxxxxxxxx" <paul.durrant@xxxxxxxxxx>, "Lv, Zhiyuan" <zhiyuan.lv@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>
Delivery-date: Tue, 09 May 2017 05:32:16 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>



On 5/8/2017 7:12 PM, George Dunlap wrote:

On 08/05/17 11:52, Zhang, Xiong Y wrote:

On 06.05.17 at 03:51, <xiong.y.zhang@xxxxxxxxx> wrote:

On 05.05.17 at 05:52, <xiong.y.zhang@xxxxxxxxx> wrote:

'commit 1679e0df3df6 ("x86/ioreq server: asynchronously reset
outstanding p2m_ioreq_server entries")' will call
p2m_change_entry_type_global() which set entry.recalc=1. Then
the following get_entry(p2m_ioreq_server) will return
p2m_ram_rw type.
But 'commit 6d774a951696 ("x86/ioreq server: synchronously reset
outstanding p2m_ioreq_server entries when an ioreq server unmaps")'
assume get_entry(p2m_ioreq_server) will return p2m_ioreq_server
type, then reset p2m_ioreq_server entries. The fact is the assumption
isn't true, and sysnchronously reset function couldn't work. Then
ioreq.entry_count is larger than zero after an ioreq server unmaps,
finally this results DomU couldn't reboot.

I've had trouble understanding this part already on v1 (btw, why is
this one not tagged v2?), and since I still can't figure it I have to ask:
Why is it that guest reboot is being impacted here? From what I recall
a non-zero count should only prevent migration.

[Zhang, Xiong Y] Sorry, although they solve the same issue, the solution is
totally different, so I didn't mark this as V2, I will mark the following
as v2 with this solution.
During DomU reboot, it will first unmap ioreq server in shutdown process,
then it call map ioreq server in boot process. The following sentence in
p2m_set_ioreq_server() result mapping ioreq server failure, then DomU
couldn't continue booting.
If ( read_atomic(&p->ioreq.entry_count))
    goto out;

It is clear that it would be this statement to be the problem one,
but I continue to not see why this would affect reboot: The rebooted
guest runs in another VM with, hence, a different p2m. I cannot see
why there would be a non-zero ioreq.entry_count the first time an
ioreq server claims the p2m_ioreq_server type for this new domain.

[Zhang, Xiong Y] This is what I see from xl dmesg when a DomU reboot
1) unmap io_req_server with old domid
2) map io_req_server with old domid
3)unmap io_req_server with old domid
4) map io_req_server with new domid

The 1) and 2) are triggered by our device reset handler in qemu, it will
destroy old device handler, then create device handler with the old domid
again. so we could see ioreq.entry_could > 0 with old domid, then reboot
process terminated.

Oh, so it prevents reboot of XenGT, but not of normal guests?

Why does a reboot cause the device to detach, re-attach, and then
re-detach again?

Also, I'm sorry for missing the bug during review, but it's a bit
annoying to find out that the core functionality of patch -- detaching
and re-attaching -- wasn't tested at all before submission.


Thanks for your reply, George.

This error was introduced in the last version of patch "x86/ioreqserver: Asynchronously resetoutstanding p2m_ioreq_server entries.", which included 2 changes fromits previous version:

1> Do not reset p2m_ioreq_server back to p2m_ram_rw when an ioreq serveris mapped;

2> Use a helper function to return the p2m type - this new helperroutine returns p2m_ram_rwinstead of p2m_ioreq_server if there's no mapping ioreq server. Inprevious versions I just returnedp2m_ioreq_server, and XenGT tests all passed successfully. But I had notrealized this issue for thislast version and did not have enough time to do the test for thisversion in the last moment before

code freeze...

Sorry about my negligence, we will perform all necessary tests nextbefore we send out our new

patches. :-)

Yu

  -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

Follow-Ups:
- Re: [Xen-devel] [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type
  - From: George Dunlap

References:
- [Xen-devel] [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type
  - From: Xiong Zhang
- Re: [Xen-devel] [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type
  - From: Jan Beulich
- Re: [Xen-devel] [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type
  - From: Zhang, Xiong Y
- Re: [Xen-devel] [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type
  - From: Jan Beulich
- Re: [Xen-devel] [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type
  - From: Zhang, Xiong Y
- Re: [Xen-devel] [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type
  - From: George Dunlap

Prev by Date: [Xen-devel] [PATCH V2] x86/ioreq server: Fix XenGT couldn't reboot when XenGT use p2m_ioreq_server p2m_type
Next by Date: [Xen-devel] [xen-unstable baseline-only test] 71266: trouble: blocked/broken
Previous by thread: Re: [Xen-devel] [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type
Next by thread: Re: [Xen-devel] [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.