Re: [Xen-devel] [PATCH V3] X86/vMCE: handle broken page with regard to migration
George Dunlap wrote:
> On 21/11/12 13:26, Liu, Jinsong wrote:
>> Ian Campbell wrote:
>>> On Wed, 2012-11-21 at 11:34 +0000, George Dunlap wrote:
>>>> On 20/11/12 18:42, Ian Jackson wrote:
>>>>> Liu, Jinsong writes ("RE: [Xen-devel] [PATCH V3] X86/vMCE: handle
>>>>> broken page with regard to migration"):
>>>>>> Ian Jackson wrote:
>>>>>>> Liu, Jinsong writes ("RE: [Xen-devel] [PATCH V3] X86/vMCE:
>>>>>>> handle broken page with regard to migration"):
>>>>>>>> No, at last iter, there are 4 points:
>>>>>>>> 1. start last iter
>>>>>>>> 2. get and transfer pfn_type to target
>>>>>>>> 3. copy page to target
>>>>>>>> 4. end last iter
>>>>> ...
>>>>>> It indeed checks mce after point 3 for each page, but what's the
>>>>>> advantage of keeping a separate list?
>>>>> It avoids yet another loop over all the pages. Unless I have
>>>>> misunderstood. Which I may have, because: if it checks for mce
>>>>> after point 3 then surely that is sufficient ? We don't need to
>>>>> worry about mces after that check.
>>>> It's sufficient, but wouldn't each check require a separate
>>>> hypercall? That would surely be slower than just a single hypercall
>>>> and a loop (which is what Jinsong's patch does).
>>>>
>>>> We don't actually need a list -- I think we just need to know,
>>>> "Have any pages broken between reading the p2m table (
>>>> xc_get_pfn_type_batch() ); if so, we do another full iteration.
>>> If a page fails between 2. and 3. above then what happens at point
>>> 3? I presume we can't map and send the page (since it is broken),
>>> do we get some sort of failure to map?
>>>
>>> What happens if the failure occurs during stage 3, i.e. while the
>>> page is mapped and we are reading from it?
>>>
>>> Ian.
>> If read a broken page, it generates more serious error (say, SRAR
>> error).
>> I don't think guest has good opportunity to survive under this case
>> --> most probably it kill itself and of course we don't need care
>> migration now.
>> However, if guest can luckily survive (say complete broken page
>> copying to target), it's OK to continue --> its broken pfn_type will
>> transfer to target next iter so guest will kill itself if access
>> then.
>
> But in this case, I'm asking what happens if the migration code reads
> the page. If reading the page in the migration code causes dom0 to
> crash, then the whole "last iteration" stuff is fairly pointless. :-)
>
> -George

If the migration code reads the page, it will trigger a more serious error
that may kill the hypervisor or the guest. Unfortunately we cannot prevent
this, since we cannot predict whether a vMCE will occur *during* migration.

What we can do is our best to handle it:
1. if the vMCE occurs before migration, we can handle it safely;
2. if the vMCE occurs during migration, we can only do our best:
  2.1 if, fortunately, the vMCE occurs in some window (say, before point 2),
      we can successfully prevent the page from being read;
  2.2 if the vMCE occurs after point 2, the page will be read; in that case
      * if the guest/hypervisor survives, it's OK to transfer the broken
        pfn_type to the target so that no further harm is done to the target;
      * if the guest/hypervisor crashes, we no longer need to care about
        the migration.

The key point is that before migration we have no way to predict it, and we
cannot forbid migration for fear that it might crash the system.

Thanks,
Jinsong

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
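
For readers following the thread, the "single check and another full
iteration" idea discussed above can be sketched as a small toy model. This is
only an illustration under stated assumptions, not the patch itself and not
the libxc API: toy_guest, toy_target, get_types(), broken_since() and
last_iter() are all hypothetical names, the batched type query is modelled by
a plain memcpy, and the model assumes the "lucky" case in which reading a
freshly broken page is survived.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NPAGES     8
#define PAGE_BYTES 16

/* Hypothetical stand-in for the per-page pfn_type. */
enum page_type { PT_NORMAL, PT_BROKEN };

struct toy_guest {
    enum page_type type[NPAGES];
    unsigned char  data[NPAGES][PAGE_BYTES];
    int break_pfn;  /* page that breaks between point 2 and 3, -1 for none */
};

struct toy_target {
    enum page_type type[NPAGES];
    unsigned char  data[NPAGES][PAGE_BYTES];
};

/* Point 2: snapshot all page types (models one batched query). */
static void get_types(const struct toy_guest *g, enum page_type *out)
{
    memcpy(out, g->type, sizeof(g->type));
}

/* One check per pass, not per page: did any page break after the snapshot? */
static bool broken_since(const struct toy_guest *g, const enum page_type *snap)
{
    for (size_t i = 0; i < NPAGES; i++)
        if (g->type[i] == PT_BROKEN && snap[i] != PT_BROKEN)
            return true;
    return false;
}

/* Runs the last iteration, repeating it while new breakage is seen.
 * Returns the number of passes that were needed. */
static int last_iter(struct toy_guest *g, struct toy_target *t)
{
    enum page_type snap[NPAGES];
    int passes = 0;

    assert(g->break_pfn < NPAGES);

    do {
        passes++;
        get_types(g, snap);                   /* point 2: get types ...    */
        memcpy(t->type, snap, sizeof(snap));  /* ... and transfer them     */

        /* Simulated vMCE arriving after point 2 (assumed survivable). */
        if (g->break_pfn >= 0) {
            g->type[g->break_pfn] = PT_BROKEN;
            g->break_pfn = -1;
        }

        for (size_t i = 0; i < NPAGES; i++) { /* point 3: copy pages       */
            if (snap[i] == PT_BROKEN)
                continue;                     /* known broken: never read  */
            memcpy(t->data[i], g->data[i], PAGE_BYTES);
        }
    } while (broken_since(g, snap));          /* new breakage: go again    */

    return passes;
}
```

In this model a page that breaks after point 2 is still copied once (case 2.2
above), but the extra pass transfers its broken type to the target, so the
target never treats the stale contents as good. With no breakage the loop runs
exactly once.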