[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 2/2] Xen/vMCE: bugfix to remove problematic is_vmce_ready check
On 16.05.13 07:53, Liu, Jinsong wrote: > Jan Beulich wrote: >>>>> On 14.05.13 at 17:29, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> >>>>> wrote: Christoph Egger wrote: On 13.05.13 17:21, Liu, Jinsong >>>>> wrote: Christoph Egger wrote: >>>>>> On 13.05.13 15:35, Liu, Jinsong wrote: >>>>>>> Christoph Egger wrote: >>>>>>>> On 13.05.13 12:44, Liu, Jinsong wrote: >>>>>>>>>>> Please refer to the description of patch 2/2, especially >>>>>>>>>>> >>>>>>>>>>> * For dom0, if really need check, it should check whether >>>>>>>>>>> vMCE injection for dom0 ready (say, exception trap bounce >>>>>>>>>>> check, which has been done at inject_vmce()), not check >>>>>>>>>>> dom0 mcelog ready (which has been done at mce_softirq() >>>>>>>>>>> before send global virq to dom0). >>>>>>>>>>> >>>>>>>>>>> Which means before hypervisor send error log via virq to >>>>>>>>>>> dom0, current code has checked whether mcelog ready at dom0 >>>>>>>>>>> or not --> that's the right place for your concern, and it >>>>>>>>>>> has indeed done check. >>>>>>>>>> >>>>>>>>>> I think, I do not understand the patch description. >>>>>>>>>> Let me rephrase if I do now due to this discussion: >>>>>>>>>> >>>>>>>>>> The mcelog driver in Dom0 registers itself to the virq handler >>>>>>>>>> to provide the machine check logging service. >>>>>>>>> >>>>>>>>> Yes. >>>>>>>>> >>>>>>>>>> Xen checks if a virq handler has been registered >>>>>>>>> >>>>>>>>> Yes. >>>>>>>>> >>>>>>>>>> but does not check >>>>>>>>>> if the dom0 handler is actually ready to take the errors. >>>>>>>>>> This patch fixes this. >>>>>>>>>> >>>>>>>>> >>>>>>>>> I'm not clear your question 'does not check if the dom0 handler >>>>>>>>> is actually ready to take the errors'. Could you elaborate more >>>>>>>>> your concern at this point? >>>>>>>> >>>>>>>> Yes, this is exactly my question. You got it. >>>>>>>> >>>>>>>> Christoph >>>>>>> >>>>>>> Hmm, seems you misunderstand my word. What I meant is, >>>>>>> I don't know what you are asking by 'does not check if the dom0 >>>>>>> handler is actually ready to take the errors'. Could you >>>>>>> elaborate more your question? >>>>>> >>>>>> I reread your patch description: >>>>>> >>>>>>> * For dom0, if really need check, it should check whether vMCE >>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>>>> injection for dom0 ready (say, exception trap bounce check, which >>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^ >>>>>>> has been done at inject_vmce()), not check dom0 mcelog ready >>>>>>> (which >>>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>>>> has been done at mce_softirq() before send global virq to dom0). >>>>>> >>>>>> My question is: Is it possible when mcelog driver registers >>>>>> the virq handler that it cannot deal with machine check errors >>>>>> immediately? >>>>>> >>>>>> Christoph >>>>> >>>>> Yes, there is a small time window when dom0 mcelog driver init: >>>>> >>>>> The last step of xen_late_init_mcelog() >>>>> --> bind_virq_for_mce >>>>> --> bind_virq_to_irqhandler >>>>> >>>>> irq = bind_virq_to_irq(virq, cpu); >>>>> if (irq < 0) >>>>> return irq; >>>>> retval = request_irq(irq, handler, irqflags, devname, >>>>> dev_id); >>>>> >>>>> Time window: between bind_virq_to_irq and request_irq >>>>> >>>>> If hypervisor inject virq to notify error log to dom0 exactly >>>>> during this init time window, dom0 no-ops handler will not fetch >>>>> error log. However, it's OK since error log in hypervisor is still >>>>> there, until next time when hypervisor inject virq again, dom0 >>>>> mcelog driver will fetch them. Considering it occurs rarely (and >>>>> no harm), I think it's OK. >>>>> >>>>> Patch 2/2 itself is not to fix this issue. Patch 2/2 is to remove >>>>> is_vmce_ready() check since it's problematic (wrong check), >>>>> overkilled (kill system unnecessary), deprecated (vMCE is not bound >>>>> to host MCE any more) and redundant (both vmce trap callback and >>>>> mcelog virq has been checked in other hypervisor code points). >>>>> >>>>> I agree the description is not clear at some points, so I will >>>>> re-write the description later. >>>> >>>> Thanks. From reading the new description >>>> I think, I got it now why this check is wrong: >>>> >>>> Whenever a vmce is injected into dom0 is_vmce_ready() checks if dom0 >>>> installed an virq handler for logging and in case dom0 does no >>>> logging dom0 is crashed instead. >>>> >>>> Assuming above is right then: >>>> the code path in mcaction.c not only runs when a vmce is injected >>>> into dom0, it also runs when a vmce is injected into a domU. >>>> So that means when dom0 does no logging then it is crashed whenever >>>> a vMCE is injected into any guest. OUCH! >>>> >>>> Make Jan happy with his 'goto vmce_failed' concern and you have my >>>> ack. >>>> >>>> Christoph >>> >>> Jan, any more concern? >> >> No, I was fine with your earlier explanation. >> >> Christoph - I wasn't sure whether your above statement implied >> Jinsong needing to do further changes, or whether you really just >> referred to the initial discussion Jinsong and I had which got >> already settled. Please clarify. >> >> Jan > > I think Christoph's statement agrees that we need patch 2/2, so ack patch 2/2 > as far as you have no more concern about 'goto vmce_failed'. > Christoph, am I right? > Yes, right. Christoph _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |