[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] x86/ioreq server: Fix DomU couldn't reboot when using p2m_ioreq_server p2m_type
>>> On 08.05.17 at 13:59, <xiong.y.zhang@xxxxxxxxx> wrote: >> On 08/05/17 11:52, Zhang, Xiong Y wrote: >> >>>>> On 06.05.17 at 03:51, <xiong.y.zhang@xxxxxxxxx> wrote: >> >>>>>>> On 05.05.17 at 05:52, <xiong.y.zhang@xxxxxxxxx> wrote: >> >>>>> 'commit 1679e0df3df6 ("x86/ioreq server: asynchronously reset >> >>>>> outstanding p2m_ioreq_server entries")' will call >> >>>>> p2m_change_entry_type_global() which set entry.recalc=1. Then >> >>>>> the following get_entry(p2m_ioreq_server) will return >> >>>>> p2m_ram_rw type. >> >>>>> But 'commit 6d774a951696 ("x86/ioreq server: synchronously reset >> >>>>> outstanding p2m_ioreq_server entries when an ioreq server unmaps")' >> >>>>> assume get_entry(p2m_ioreq_server) will return p2m_ioreq_server >> >>>>> type, then reset p2m_ioreq_server entries. The fact is the assumption >> >>>>> isn't true, and sysnchronously reset function couldn't work. Then >> >>>>> ioreq.entry_count is larger than zero after an ioreq server unmaps, >> >>>>> finally this results DomU couldn't reboot. >> >>>> >> >>>> I've had trouble understanding this part already on v1 (btw, why is >> >>>> this one not tagged v2?), and since I still can't figure it I have to >> >>>> ask: >> >>>> Why is it that guest reboot is being impacted here? From what I recall >> >>>> a non-zero count should only prevent migration. >> >>> [Zhang, Xiong Y] Sorry, although they solve the same issue, the solution >> >>> is >> >>> totally different, so I didn't mark this as V2, I will mark the following >> >>> as v2 with this solution. >> >>> During DomU reboot, it will first unmap ioreq server in shutdown process, >> >>> then it call map ioreq server in boot process. The following sentence in >> >>> p2m_set_ioreq_server() result mapping ioreq server failure, then DomU >> >>> couldn't continue booting. >> >>> If ( read_atomic(&p->ioreq.entry_count)) >> >>> goto out; >> >> >> >> It is clear that it would be this statement to be the problem one, >> >> but I continue to not see why this would affect reboot: The rebooted >> >> guest runs in another VM with, hence, a different p2m. I cannot see >> >> why there would be a non-zero ioreq.entry_count the first time an >> >> ioreq server claims the p2m_ioreq_server type for this new domain. >> >> >> > [Zhang, Xiong Y] This is what I see from xl dmesg when a DomU reboot >> > 1) unmap io_req_server with old domid >> > 2) map io_req_server with old domid >> > 3)unmap io_req_server with old domid >> > 4) map io_req_server with new domid >> > >> > The 1) and 2) are triggered by our device reset handler in qemu, it will >> > destroy old device handler, then create device handler with the old domid >> > again. so we could see ioreq.entry_could > 0 with old domid, then reboot >> > process terminated. >> >> Oh, so it prevents reboot of XenGT, but not of normal guests? > [Zhang, Xiong Y] Yes, only XenGT has this issue. The patch description should say so then. >> Why does a reboot cause the device to detach, re-attach, and then >> re-detach again? > [Zhang, Xiong Y] As our vgt_reset() function will destroy a vgt instance, > then create a vgt instance. And this vgt_reset function is also used > in normal boot process. Sounds like that's where the immediate problem is then. (Of course we want the hypervisor side taken care of too.) Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |