
Re: [Xen-devel] [PATCH V3] X86/vMCE: handle broken page with regard to migration



Jan Beulich wrote:
>>>> On 21.11.12 at 14:26, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
>>>> Ian Campbell wrote:
>>>> On Wed, 2012-11-21 at 11:34 +0000, George Dunlap wrote:
>>>> On 20/11/12 18:42, Ian Jackson wrote:
>>>>> Liu, Jinsong writes ("RE: [Xen-devel] [PATCH V3] X86/vMCE: handle
>>>>> broken page with regard to migration"):
>>>>>> Ian Jackson wrote:
>>>>>>> Liu, Jinsong writes ("RE: [Xen-devel] [PATCH V3] X86/vMCE:
>>>>>>> handle broken page with regard to migration"):
>>>>>>>> No, at the last iter, there are 4 points:
>>>>>>>> 1. start last iter
>>>>>>>> 2. get and transfer pfn_type to target
>>>>>>>> 3. copy page to target
>>>>>>>> 4. end last iter
>>>>> ...
>>>>>> It indeed checks mce after point 3 for each page, but what's the
>>>>>> advantage of keeping a separate list?
>>>>> It avoids yet another loop over all the pages.  Unless I have
>>>>> misunderstood.  Which I may have, because: if it checks for mce
>>>>> after point 3 then surely that is sufficient ?  We don't need to
>>>>> worry about mces after that check.
>>>> 
>>>> It's sufficient, but wouldn't each check require a separate
>>>> hypercall? That would surely be slower than just a single hypercall
>>>> and a loop (which is what Jinsong's patch does).
>>>> 
>>>> We don't actually need a list -- I think we just need to know,
>>>> "Have any pages broken since reading the p2m table
>>>> (xc_get_pfn_type_batch())?" If so, we do another full iteration.
>>> 
>>> If a page fails between 2. and 3. above then what happens at point
>>> 3? I presume we can't map and send the page (since it is broken),
>>> do we get some sort of failure to map?
>>> 
>>> What happens if the failure occurs during stage 3, i.e. while the
>>> page is mapped and we are reading from it?
>>> 
>>> Ian.
>> 
>> Reading a broken page generates a more serious error (say, an SRAR
>> error).
>> I don't think the guest has a good chance of surviving in that case
>> --> most probably it kills itself, and of course we then don't need
>> to care about migration. However, if the guest luckily survives (say,
>> the broken page finishes copying to the target), it's OK to continue
>> --> its broken pfn_type will be transferred to the target in the next
>> iter, so the guest will kill itself if it accesses the page then.
> 
> I think you misread the question - it said "we", as in "the tools/
> kernel/hypervisor" (at least that's how I'm reading it). The MCE
> would surface in host context in this case, and whether that's
> fatal to the host depends on the precise properties of the event.
> 
> Jan

Yes, depending on the error type, both the hypervisor and the guest may crash.
As for the tools, I think they are OK as long as the hypervisor is OK.

Thanks,
Jinsong
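
The last-iteration flow discussed in this thread (read the page types,
copy the pages, then make one bulk check for pages that broke in the
meantime and repeat if any did) might be sketched roughly as below. This
is a toy model, not the libxc code: last_iteration, get_broken, copy_page
and max_rounds are hypothetical names, and get_broken merely stands in
for whatever query reports broken pages (the thread mentions
xc_get_pfn_type_batch() for reading the types).

```python
from typing import Callable, Dict, List, Set


def last_iteration(
    pfns: List[int],
    get_broken: Callable[[], Set[int]],   # hypothetical: returns broken pfns
    copy_page: Callable[[int], None],     # hypothetical: sends one page
    max_rounds: int = 8,                  # hypothetical safety bound
) -> Dict[int, str]:
    """Repeat the final pass until no page breaks during the pass."""
    for _ in range(max_rounds):
        # Point 2: read the page types once for the whole batch.
        broken_before = get_broken()
        types = {
            pfn: ("broken" if pfn in broken_before else "normal")
            for pfn in pfns
        }
        # Point 3: copy only the pages that were not marked broken.
        for pfn in pfns:
            if types[pfn] == "normal":
                copy_page(pfn)
        # One bulk check after copying, rather than one query per page.
        if get_broken() == broken_before:
            return types
    raise RuntimeError("pages kept breaking during the last iteration")
```

In this model a page that breaks while it is being copied is caught by
the bulk check, and the next pass transfers its "broken" type instead of
its contents. It does not model the host-side MCE that Jan describes,
which may be fatal before any check runs.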
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

