Re: [Xen-devel] [BUG] EDAC infomation partially missing
On 16/05/17 10:54, Jan Beulich wrote:
>>>> On 16.05.17 at 05:47, <ehem+debian@xxxxxxx> wrote:
>> On Mon, May 15, 2017 at 02:02:53AM -0600, Jan Beulich wrote:
>>>>>> On 14.05.17 at 00:36, <ehem+debian@xxxxxxx> wrote:
>>>> I haven't yet done as much experimentation as Andreas Pflug has, but I
>>>> can confirm I'm also running into this bug with Xen 4.4.1.
>>>>
>>>> I've only tried Linux kernel 3.16.43, but as Dom0:
>>>>
>>>> EDAC MC: Ver: 3.0.0
>>>> AMD64 EDAC driver v3.4.0
>>>> EDAC amd64: DRAM ECC enabled.
>>>> EDAC amd64: NB MCE bank disabled, set MSR 0x0000017b[4] on node 0 to
>>>> enable.
>>>> EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not
>>>> load.
>>>> AMD64 EDAC driver v3.4.0
>>>> EDAC amd64: DRAM ECC enabled.
>>>> EDAC amd64: NB MCE bank disabled, set MSR 0x0000017b[4] on node 0 to
>>>> enable.
>>>> EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not
>>>> load.
>>> Afaict the driver as is simply can't work in a Xen Dom0; it needs
>>> enabling (read: para-virtualizing). I'm actually glad to see it doesn't
>>> load (the worse alternative would be for it to load and then do the
>>> wrong thing or give you a false sense of safety of your data).
>> I'm unsure of how to evaluate the situation. Since ECC is enabled in the
>> BIOS, data should be safe whether or not the EDAC driver loads. I
>> /suspect/ the EDAC driver failing to load merely means reporting of ECC
>> errors won't happen.
> "Merely" being relative here: The missing reports mean a false feeling
> of safety, as they may be early indications of later double-bit errors.
>
>> I suspect the only paravirtualization needed is to
>> map the physical address of the soft|hard errors to which VM's memory
>> range was affected. What this affects is which VM should panic in case
>> of hard errors.
> Which in turn obviously requires hypervisor interaction. It's not really
> clear to me whether perhaps the driver would better live in the
> hypervisor in the first place for that reason.

The driver should probably live directly in Xen; it needs to program a
number of northbridge and CPU registers, including interrupt information.

For the reporting side of things, it looks like it would require vMCE to
pass on fault information to guests.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
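
For anyone following along, the "NB MCE bank disabled" message quoted
above refers to bit 4 of MSR 0x0000017b. Below is a minimal C sketch of
that bit test, assuming the raw MSR value has already been obtained by
some other means. The macro and function names are hypothetical and are
not taken from the EDAC driver or from Xen. Note also that in a Dom0 the
value the kernel sees may be whatever the hypervisor chooses to present
rather than the hardware register, which is part of why the thread
suggests handling this in Xen itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSR number and bit position as named in the quoted driver message:
 * "set MSR 0x0000017b[4] on node 0 to enable". */
#define MCG_CTL_MSR     0x0000017bu
#define MCG_CTL_NB_BIT  4

/* Hypothetical helper: given a raw 64-bit value read from MCG_CTL_MSR,
 * report whether the northbridge MCE bank enable bit (bit 4) is set. */
static bool nb_mce_bank_enabled(uint64_t mcg_ctl)
{
    return ((mcg_ctl >> MCG_CTL_NB_BIT) & 1u) != 0;
}

/* Hypothetical helper: return the value that would have to be written
 * back to set bit 4, leaving every other bit unchanged. Actually writing
 * the MSR is a privileged operation and is not shown here. */
static uint64_t nb_mce_bank_enable(uint64_t mcg_ctl)
{
    return mcg_ctl | (UINT64_C(1) << MCG_CTL_NB_BIT);
}
```

This only illustrates the check the driver complains about; it does not
address the harder part discussed above, namely attributing an error's
physical address to the right guest.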