Re: [Xen-devel] RFC: MCA/MCE concept
Hi

On 06/01/07 10:48, Petersson, Mats wrote:
> [cut]
>>> Note that Windows kernel drivers are allowed to use the kernel exception
>>> handling, and ARE allowed to "allow" GP faults if they wish to do so.
>>> [Don't ask me why MS allows this, but that's the case, so we have to live
>>> with it].
>> In that case, it will die sooner or later *after* consuming the data in error.
>> That means, the guest continues to live for an unknown time...
> Yes. What I'm worried about is that if you have a "transient" or "few-bit"
> error in a rarely used location, the guest may well live a LONG time with
> incorrect data and potentially not get it detected for quite some time again
> (say two bits have stuck to 0, and the data is then written back with the
> zeros there - next time we read it, no error, since the data has zeros in
> that location).

I don't believe GP faults and uncorrectable errors really overlap that much. In a GP fault the extent of the damage is known - you tried to read from an address not in your address space, you lacked permissions for an operation etc. In an uncorrected error situation it is difficult to understand the bounds of the problem in that way - unless the hardware assists with data poisoning etc, such errors may well be unconstrained and affect a wider area than just the bracket of code that caught a GP fault.

You can often ring-fence critical code sequences by inserting error barrier instructions before and after them. Those operations are usually very expensive (drain the pipeline or similar) and are suitable only in special places.

When running natively it is usually the "owner" of affected data that sees it bad in memory, eg from a read it made. In those cases we have the owner on cpu and can kill/signal it synchronously. There are times when the kernel may be shifting some data on behalf of the application owner (eg, copyin/copyout, shifting network data etc), in which case we still have a handle on the real owner. If the access is from a scrub then we should not panic - just wait and see if the owner does indeed use the bad data, at which time we take appropriate action.

With the virtualisation layer there is the additional case of the HV or dom0 performing operations on behalf of a guest, ie the HV may make the access that traps but its own state is not affected.

CPU errors get trickier still. For example, what do we do when we're told that while running guest A we displaced modified data from the L2 cache that had uncorrectable ECC? We have a physical address only, and no idea of who the data belongs to (guest A, a recently scheduled guest, or the HV?). Where cachelines are tagged with some form of context or guest ID you have a chance, provided that is reported in the error state.
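The cases above can be lined up as a small decision table. What follows is only a sketch in C under stated assumptions: the enum names and decide() are hypothetical and come from neither Xen nor Solaris. Two choices are assumptions of the sketch rather than anything settled in this thread: acting against the guest when the HV trapped on its behalf, and escalating when a cache writeback error carries no owner tag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: who touched the bad data when the error was reported. */
enum access_ctx {
    CTX_OWNER_ON_CPU,     /* the owner itself read the bad data */
    CTX_KERNEL_FOR_OWNER, /* kernel moving data for a known owner (copyin/copyout) */
    CTX_SCRUBBER,         /* found by a scrub; nobody has consumed the data yet */
    CTX_HV_FOR_GUEST,     /* HV or dom0 accessed it on behalf of a guest */
    CTX_CACHE_WRITEBACK   /* displaced modified cacheline; physical address only */
};

enum action {
    ACT_SIGNAL_OWNER, /* kill or signal the identified owner synchronously */
    ACT_DEFER,        /* do not panic; act only if the owner later uses the data */
    ACT_ESCALATE      /* owner unknown, damage cannot be bounded (assumption) */
};

/* guest_tag_reported: the hardware error state names the owning context. */
enum action decide(enum access_ctx ctx, bool guest_tag_reported)
{
    switch (ctx) {
    case CTX_OWNER_ON_CPU:
    case CTX_KERNEL_FOR_OWNER:
        return ACT_SIGNAL_OWNER;
    case CTX_SCRUBBER:
        return ACT_DEFER;
    case CTX_HV_FOR_GUEST:
        /* Assumption: the HV's own state is unaffected, so act on the guest. */
        return ACT_SIGNAL_OWNER;
    case CTX_CACHE_WRITEBACK:
        return guest_tag_reported ? ACT_SIGNAL_OWNER : ACT_ESCALATE;
    }
    assert(0 && "unknown access context");
    return ACT_ESCALATE;
}
```

The only input that changes the answer for a writeback error is whether the hardware reports an owner tag, which is the point made above about tagged cachelines.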
> Also consider the case where one cell (or small block of cells) has gone bad,
> but it's only used by one single piece of code that is using this try/catch
> code? I know, this is probably relatively rare, but I'm still worried that it
> will "break" things...
>>> I'm not sure if Linux, Solaris, *BSD, OS/2 or other OS's will allow
>>> "catching" a Kernel GP fault in a non-precise fashion (I know Linux has
>>> exception handling for EXACT positions in the code). But since at least one
>>> kernel DOES allow this, we can't be sure that a GPF will destroy the guest.
>> When Linux and *BSD see a GPF while they are in userspace, then they kill
>> the process with a SIGSEGV. If they are in kernelspace, then they panic.
Solaris has some wrappers that can be applied, maybe at some expense to performance, to make protected accesses that will catch and survive various types of error including hardware errors, wild pointers etc.
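For contrast with wrapper-style protection, the "EXACT positions" scheme mentioned for Linux above can be pictured as a lookup in a sorted table that maps each instruction allowed to fault to a recovery address. This is a simplified sketch in C, not the kernel's actual data structure: struct fixup_entry and find_fixup() are hypothetical names, and a return value of 0 standing for "no fixup, treat the fault as fatal" is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per instruction that is allowed to fault (hypothetical layout). */
struct fixup_entry {
    uintptr_t fault_ip; /* address of the instruction that may fault */
    uintptr_t fixup_ip; /* where to resume if it does */
};

/*
 * Binary search over a table sorted by fault_ip.
 * Returns the fixup address, or 0 if the faulting instruction is not one of
 * the expected positions (the caller would then treat the fault as fatal).
 */
uintptr_t find_fixup(const struct fixup_entry *table, size_t n, uintptr_t ip)
{
    assert(table != NULL || n == 0);

    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].fault_ip == ip)
            return table[mid].fixup_ip;
        if (table[mid].fault_ip < ip)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

A fault at any address missing from the table gets no recovery, which is why such a scheme is "precise": it cannot silently absorb a fault from an unexpected place in the way a broad try/catch bracket can.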
>>> Second point to note is of course that if the guest is in user-mode when
>>> the GPF happens, then almost all OS's will just kill the application - and
>>> there's absolutely no reason to believe that the application running is
>>> necessarily where the actual memory problem is - it may be caused by memory
>>> scrubbing for example.
Yes, these are the myriad permutations I was alluding to above.
>>> Whatever we do to the guest, it should be a "certain death", unless the
>>> kernel has told us "I can handle MCE's".

Yes, certain and instant death unless it is a PV guest that has registered the ability to deal with these more elegantly.

>> It is obvious that there is no absolute generic way to handle all sorts of
>> buggy guests. I vote for:
>> If DomU has a PV MCA driver use this or inject a GPF.
>> Multiplexing all the MSR's related to emulate MCA/MCE for the guests is much
>> more complex than just injecting a GPF - and slower.
Do we need to send the non-PV guest a signal of any kind to kill it? After all, we can stop it running any further instructions (and perhaps avoid the use of bad data) by deciding within the HV or dom0 simply to abort that guest. There is no loss to diagnosability since the HV/dom0 combination is doing that, anyway.
> Emulating MCE to the guest wasn't my intended alternative suggestion. Instead,
> my idea was that if the guest hasn't registered a "PV MCE handler", we just
> immediately kill the domain as such - e.g. similar to
> "domain_crash_synchronous()".
> Don't let the guest have any chance to "do something wrong" in the process -
> it's already broken, and letting it run any further will almost certainly not
> help matters.
> This may not be the prettiest solution, but then on the other hand, a "Windows
> blue-screen" or Linux "oops" saying GP fault happened at some random place in
> the guest isn't exactly helping the SysAdmin understand the problem either.
Agreed - don't let the affected guest run one more instruction if we can. Sysadmins will learn to consult dom0 diagnostics to see if they explain any sudden guest deaths - no need, as you say, to splurge any raw error data to them.

Gavin
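The policy the thread converges on can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not Xen code: struct guest, handle_uncorrectable() and record_addr() are hypothetical names, and clearing the running flag merely stands in for an immediate domain crash of the kind "domain_crash_synchronous()" is cited for above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical guest descriptor; not a Xen structure. */
struct guest {
    int id;
    bool running;
    /* NULL unless the guest registered a PV MCE handler. */
    void (*pv_mce_handler)(struct guest *g, unsigned long bad_addr);
    unsigned long last_reported_addr;
};

enum outcome { OUT_DELIVERED, OUT_CRASHED };

/* Example PV handler: the guest just records the bad address it was told about. */
void record_addr(struct guest *g, unsigned long bad_addr)
{
    g->last_reported_addr = bad_addr;
}

/*
 * Uncorrectable error attributed to guest g:
 *  - a registered PV handler gets the report and the guest keeps running;
 *  - otherwise the guest is stopped at once, with no fault injected and no
 *    further guest instructions executed.
 */
enum outcome handle_uncorrectable(struct guest *g, unsigned long bad_addr)
{
    assert(g != NULL);

    if (g->pv_mce_handler != NULL) {
        g->pv_mce_handler(g, bad_addr);
        return OUT_DELIVERED;
    }
    g->running = false; /* stands in for an immediate domain crash */
    return OUT_CRASHED;
}
```

In this shape the non-PV path never hands control back to the guest, so diagnosis stays entirely with the HV/dom0 side.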
> --
> Mats

>> Keir, what are your opinions on this thread?
>>
>> Christoph
>>
>> --
>> AMD Saxony, Dresden, Germany
>> Operating System Research Center
>>
>> Legal Information:
>> AMD Saxony Limited Liability Company & Co. KG
>> Registered office (business address): Wilschdorfer Landstr. 101, 01109 Dresden, Germany
>> Register court Dresden: HRA 4896
>> General partner authorized to represent: AMD Saxony LLC (registered office Wilmington, Delaware, USA)
>> Managing directors of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
--
Gavin Maltby, Solaris Kernel Development.