Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq page only one time
On 06/10/2015 06:58 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Wen Congyang [mailto:wency@xxxxxxxxxxxxxx]
>> Sent: 10 June 2015 11:55
>> To: Paul Durrant; Andrew Cooper; Yang Hongyang; xen-devel@xxxxxxxxxxxxx
>> Cc: Wei Liu; Ian Campbell; yunhong.jiang@xxxxxxxxx; Eddie Dong;
>> guijianfeng@xxxxxxxxxxxxxx; rshriram@xxxxxxxxx; Ian Jackson
>> Subject: Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq
>> page only one time
>>
>> On 06/10/2015 06:40 PM, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Wen Congyang [mailto:wency@xxxxxxxxxxxxxx]
>>>> Sent: 10 June 2015 10:06
>>>> To: Andrew Cooper; Yang Hongyang; xen-devel@xxxxxxxxxxxxx; Paul Durrant
>>>> Cc: Wei Liu; Ian Campbell; yunhong.jiang@xxxxxxxxx; Eddie Dong;
>>>> guijianfeng@xxxxxxxxxxxxxx; rshriram@xxxxxxxxx; Ian Jackson
>>>> Subject: Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero
>>>> ioreq page only one time
>>>>
>>>> Cc: Paul Durrant
>>>>
>>>> On 06/10/2015 03:44 PM, Andrew Cooper wrote:
>>>>> On 10/06/2015 06:26, Yang Hongyang wrote:
>>>>>> On 06/09/2015 03:30 PM, Andrew Cooper wrote:
>>>>>>> On 09/06/2015 01:59, Yang Hongyang wrote:
>>>>>>>> On 06/08/2015 06:15 PM, Andrew Cooper wrote:
>>>>>>>>> On 08/06/15 10:58, Yang Hongyang wrote:
>>>>>>>>>> On 06/08/2015 05:46 PM, Andrew Cooper wrote:
>>>>>>>>>>> On 08/06/15 04:43, Yang Hongyang wrote:
>>>>>>>>>>>> The ioreq page contains an evtchn which will be set when we
>>>>>>>>>>>> resume the secondary vm for the first time. The hypervisor will
>>>>>>>>>>>> check whether the evtchn is corrupted, so we cannot zero the
>>>>>>>>>>>> ioreq page more than one time.
>>>>>>>>>>>>
>>>>>>>>>>>> The ioreq->state is always STATE_IOREQ_NONE after the vm is
>>>>>>>>>>>> suspended, so it is OK if we only zero it one time.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Yang Hongyang <yanghy@xxxxxxxxxxxxxx>
>>>>>>>>>>>> Signed-off-by: Wen congyang <wency@xxxxxxxxxxxxxx>
>>>>>>>>>>>> CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>>>>>>>>>>
>>>>>>>>>>> The issue here is that we are running the restore algorithm over
>>>>>>>>>>> a domain which has already been running in Xen for a while. This
>>>>>>>>>>> is a brand new use case, as far as I am aware.
>>>>>>>>>>
>>>>>>>>>> Exactly.
>>>>>>>>>>
>>>>>>>>>>> Does the qemu process associated with this domain get frozen
>>>>>>>>>>> while the secondary is being reset, or does the process get
>>>>>>>>>>> destroyed and recreated?
>>>>>>>>>>
>>>>>>>>>> What do you mean by reset? Do you mean the secondary is suspended
>>>>>>>>>> at a checkpoint?
>>>>>>>>>
>>>>>>>>> Well - at the point that the buffered records are being processed,
>>>>>>>>> we are in the process of resetting the state of the secondary to
>>>>>>>>> match the primary.
>>>>>>>>
>>>>>>>> Yes, at this point the qemu process associated with this domain is
>>>>>>>> frozen. The suspend callback will call libxl__qmp_stop() (vm_stop()
>>>>>>>> in qemu) to pause qemu. After we have processed all records, qemu
>>>>>>>> will be restored with the received state; that is why we add a
>>>>>>>> libxl__qmp_restore() (qemu_load_vmstate() in qemu) API to restore
>>>>>>>> qemu with the received state. Currently in libxl, qemu can only
>>>>>>>> start with the received state; there is no API to load a received
>>>>>>>> state while qemu has already been running for a while.
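The "zero it only once" idea under discussion can be pictured with a short
sketch. This is not the actual libxc patch, only an illustration of the
reasoning above: after the guest is suspended every ioreq slot is back to
STATE_IOREQ_NONE, so later checkpoints may skip the memset, and zeroing the
page again would wipe the event channel ports Xen has already written into
it. The context structure and flag names below are hypothetical.

/*
 * Illustrative sketch only (not the real libxc code): zero the mapped
 * ioreq page just once across COLO checkpoints.
 */
#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE 4096

struct colo_restore_ctx {
    void *ioreq_page;        /* mapping of the HVM_PARAM_IOREQ_PFN page */
    bool  ioreq_page_zeroed; /* set after the one and only memset */
};

static void clear_ioreq_page_once(struct colo_restore_ctx *ctx)
{
    if (ctx->ioreq_page_zeroed)
        return;              /* already done on an earlier checkpoint */

    /* Safe only because every slot is STATE_IOREQ_NONE after suspend. */
    memset(ctx->ioreq_page, 0, PAGE_SIZE);
    ctx->ioreq_page_zeroed = true;
}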
>>>>>>>
>>>>>>> Now I consider this more, it is absolutely wrong to not zero the
>>>>>>> page here. The event channel in the page is not guaranteed to be
>>>>>>> the same between the primary and secondary,
>>>>>>
>>>>>> That's why we don't zero it on the secondary.
>>>>>
>>>>> I think you missed my point. Apologies for the double negative. It
>>>>> must, under all circumstances, be zeroed at this point, for safety
>>>>> reasons.
>>>>>
>>>>> The page in question is subject to logdirty just like any other guest
>>>>> page, which means that if the guest writes to it naturally (i.e. not a
>>>>> Xen or Qemu write, both of whom have magic mappings which are not
>>>>> subject to logdirty), it will be transmitted in the stream. As the
>>>>> event channel could be different, the lack of zeroing it at this point
>>>>> means that the event channel would be wrong as opposed to simply
>>>>> missing. This is a worse position to be in.
>>>>
>>>> The guest should not access this page, although I am not sure whether
>>>> the guest can access the ioreq page.
>>>>
>>>> But in the exceptional case, the ioreq page is dirtied and is copied to
>>>> the secondary vm. The ioreq page will then contain a wrong event
>>>> channel, and the hypervisor will check it: if the event channel is
>>>> wrong, the guest will be crashed.
>>>>
>>>>>>> and we don't want to unexpectedly find a pending/in-flight ioreq.
>>>>>>
>>>>>> ioreq->state is always STATE_IOREQ_NONE after the vm is suspended, so
>>>>>> there should be no pending/in-flight ioreq at a checkpoint.
>>>>>
>>>>> In the common case perhaps, but we must consider the exceptional case.
>>>>> The exceptional case here is some corruption which happens to appear
>>>>> as an in-flight ioreq.
>>>>
>>>> If the state is not STATE_IOREQ_NONE, it may be a hypervisor bug. If
>>>> the hypervisor has a bug, anything can happen. I think we should trust
>>>> the hypervisor.
>>>>
>>>>>>> Either qemu needs to take care of re-initialising the event channels
>>>>>>> back to appropriate values, or Xen should tolerate the channels
>>>>>>> disappearing.
>>>>>
>>>>> I still stand by this statement. I believe it is the only safe way of
>>>>> solving the issue you have discovered.
>>>>
>>>> Add a new qemu monitor command to update the ioreq page?
>>>
>>> If you're attaching to a 'new' VM (i.e. one with an updated image) then
>>> I suspect you're going to have to destroy and re-create the ioreq server
>>> so that the shared page gets re-populated with the correct event
>>> channels. Either that, or you're going to have to ensure that the page
>>> is not part of the restored image and sample the new one that Xen should
>>> have set up.
>>
>> I agree. I will try to add a new qemu monitor command (or do it when
>> updating qemu's state) to destroy and re-create it.
>
> The slightly tricky part of that is that you're going to have to cache and
> replay all the registrations that were done on the old instance, but you
> need to do that in any case as it's not state that is transferred in the
> VM save record.

Why do we have to cache and replay all the registrations that were done on
the old instance? We will set the guest to a new state; the old state
should be dropped.

Thanks
Wen Congyang

>
> Paul

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
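Paul's suggestion of destroying and re-creating the ioreq server so the
shared page is re-populated with fresh event channels would look roughly
like the sketch below. It is only a sketch against the libxenctrl
ioreq-server calls of the Xen 4.5/4.6 era (xc_hvm_destroy_ioreq_server /
xc_hvm_create_ioreq_server); the helper name and error handling are invented
for illustration, and exact signatures may differ between versions. As Paul
notes, the device model would still have to re-register its I/O ranges and
PCI devices afterwards, since that state is not in the VM save record.

/*
 * Rough sketch of "destroy and re-create the ioreq server".
 */
#include <xenctrl.h>

static int recreate_ioreq_server(xc_interface *xch, domid_t domid,
                                 ioservid_t *id)
{
    int rc;

    /* Tear down the old instance; its event channels are stale. */
    rc = xc_hvm_destroy_ioreq_server(xch, domid, *id);
    if (rc < 0)
        return rc;

    /*
     * Create a fresh instance: Xen allocates new event channels and
     * writes them into the shared ioreq page, which the device model
     * should re-sample rather than inherit from the restored image.
     */
    return xc_hvm_create_ioreq_server(xch, domid, 1 /* handle_bufioreq */, id);
}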