[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 1/2] Xen/MCA: bugfix for mca bank clear
> -----Original Message----- > From: Liu, Jinsong > Sent: Tuesday, March 12, 2013 10:07 AM > To: Jan Beulich; Ren, Yongjie > Cc: xen-devel > Subject: RE: [Xen-devel] [PATCH 1/2] Xen/MCA: bugfix for mca bank clear > > Jan Beulich wrote: > >>>> On 11.03.13 at 11:26, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote: > >> Jan Beulich wrote: > >>>>>> On 28.02.13 at 14:21, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> > >>>>>> wrote: > >>>> Jan Beulich wrote: > >>>>>>>> On 28.02.13 at 09:05, "Jan Beulich" <JBeulich@xxxxxxxx> > wrote: > >>>>>> And then I don't see why you would do physical MSR accesses in > >>>>>> the injection case at all. It should really be mca_wrmsr() to > >>>>>> take care of this, it just needs to have a way to know it got > >>>>>> called in the context of an injection. One fundamental question > >>>>>> here is > >>>>>> why the invalidation of the interposed data is being done > >>>>>> before the MSR write rather than after. One thing we surely > >>>>>> don't care much about in the injection logic is interference with > >>>>>> a real #MC. > >>>>> > >>>>> Like this (RFC only, untested so far), based on having gone > >>>>> through all call sites of mca_wrmsr(): > >>>>> > >>>>> --- a/xen/arch/x86/cpu/mcheck/mce.c > >>>>> +++ b/xen/arch/x86/cpu/mcheck/mce.c > >>>>> @@ -1065,13 +1065,15 @@ static void intpose_add(unsigned int > cpu > >>>>> printk("intpose_add: interpose array full - request > >>>>> dropped\n"); } > >>>>> > >>>>> -void intpose_inval(unsigned int cpu_nr, uint64_t msr) > >>>>> +bool_t intpose_inval(unsigned int cpu_nr, uint64_t msr) { - > >>>>> struct intpose_ent *ent; + struct intpose_ent *ent = > >>>>> intpose_lookup(cpu_nr, msr, NULL); > >>>>> > >>>>> - if ((ent = intpose_lookup(cpu_nr, msr, NULL)) != NULL) { > >>>>> - ent->cpu_nr = -1; > >>>>> - } > >>>>> + if ( !ent ) > >>>>> + return 0; > >>>>> + > >>>>> + ent->cpu_nr = -1; > >>>>> + return 1; > >>>>> } > >>>>> > >>>>> #define IS_MCA_BANKREG(r) \ > >>>>> --- a/xen/arch/x86/cpu/mcheck/mce.h > >>>>> +++ b/xen/arch/x86/cpu/mcheck/mce.h > >>>>> @@ -77,7 +77,7 @@ extern void mce_recoverable_register(mce > >>>>> /* Read an MSR, checking for an interposed value first */ > >>>>> extern struct intpose_ent *intpose_lookup(unsigned int, uint64_t, > >>>>> uint64_t *); -extern void intpose_inval(unsigned int, uint64_t); > >>>>> +extern bool_t intpose_inval(unsigned int, uint64_t); > >>>>> > >>>>> static inline uint64_t mca_rdmsr(unsigned int msr) { > >>>>> @@ -89,9 +89,9 @@ static inline uint64_t mca_rdmsr(unsigne > >>>>> > >>>>> /* Write an MSR, invalidating any interposed value */ > >>>>> #define mca_wrmsr(msr, val) do { \ > >>>>> - intpose_inval(smp_processor_id(), msr); \ > >>>>> - wrmsrl(msr, val); \ > >>>>> -} while (0) > >>>>> + if ( !intpose_inval(smp_processor_id(), msr) ) \ + > >>>>> wrmsrl(msr, val); \ +} while ( 0 ) > >>>>> > >>>>> > >>>>> /* Utility function to "logout" all architectural MCA telemetry > >>>>> from the MCA > >>>> > >>>> No, it doesn't work, considering mce broadcast to all cpus, while > >>>> s/w only simulate 1 cpu. > >>> > >>> I don't see how that case would be handled any better with the > >>> invalidation happening before the MSR write (as is the case now). > >>> > >>> Please explain, Jan > >> > >> Sorry for late reply. > >> > >> What I meant is, mca_wrmsr updated in this way is not quite clean, > >> since per syntax it still has the risk 'sometimes access simulated > >> value, sometimes access physical bank'. > >> > >> But of course your patch works for current xen-mceinj tools, so if > >> you think it's OK we will have a test at the specific machine and > >> ACK later. > > > > As said before, I don't think we need to consider interference > > between injection and real events (or if we do, then I think there > > are other aspects that would need fixing too). > > > > Hence, as long as what I proposed works for the injection case, > > that ought to be better than what we have currently. Perhaps > > explicitly filtering out the not supported case of broadcast being > > requested with the rejection (iirc there's a way to request this) > > would need to be added to the change suggested earlier. > > > > Jan > > Rongjie, > > Would you please test Jan's patch at *the* WSM-EX machine? It should be > fine but test is necessary and better. > Jan, Your patch can work fine during my testing. Tested-by: Yongjie Ren <yongjie.ren@xxxxxxxxx> _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |