Re: [Xen-devel] [PATCH V2] x86/emulate: synchronize LOCKed instruction emulation
>>> On 23.03.17 at 09:27, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> On 03/23/2017 10:19 AM, Jan Beulich wrote:
>>>>> On 22.03.17 at 18:22, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>> On 03/21/2017 07:02 PM, Jan Beulich wrote:
>>>>>>> On 21.03.17 at 17:44, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>> On 03/21/2017 06:29 PM, Jan Beulich wrote:
>>>>>>>>> On 21.03.17 at 16:38, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>>>> On 03/15/2017 06:57 PM, Jan Beulich wrote:
>>>>>>>>>>> On 15.03.17 at 17:46, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>>>>>> On 03/15/2017 06:30 PM, Jan Beulich wrote:
>>>>>>>>>>>>> On 15.03.17 at 17:04, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>>>>>>>> ---
>>>>>>>>>>> Changes since V1:
>>>>>>>>>>>  - Added Andrew Cooper's credit, as he's kept the patch current
>>>>>>>>>>>    throughout non-trivial code changes since the initial patch.
>>>>>>>>>>>  - Significantly more patch testing (with XenServer).
>>>>>>>>>>>  - Restricted lock scope.
>>>>>>>>>>
>>>>>>>>>> Not by much, as it seems. In particular you continue to take the
>>>>>>>>>> lock even for instructions not accessing memory at all.
>>>>>>>>>
>>>>>>>>> I'll take a closer look.
>>>>>>>>>
>>>>>>>>>> Also, by "reworked" I did assume you mean converted to at least
>>>>>>>>>> the cmpxchg based model.
>>>>>>>>>
>>>>>>>>> I haven't been able to follow the latest emulator changes closely,
>>>>>>>>> could you please clarify what you mean by "the cmpxchg model"?
>>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> This is unrelated to any recent changes. The idea is to make the
>>>>>>>> ->cmpxchg() hook actually behave like what its name says. It's
>>>>>>>> being used for LOCKed insn writeback already, and it could
>>>>>>>> therefore simply force a retry of the full instruction if the
>>>>>>>> compare part of it fails. It may need to be given another
>>>>>>>> parameter, to allow the hook function to tell LOCKed from "normal"
>>>>>>>> uses.
>>>>>>>
>>>>>>> I assume this is what you have in mind?
>>>>>>
>>>>>> Hmm, not exactly. I'd expect an actual CMPXCHG instruction to
>>>>>> be used, with a LOCK prefix when the original access had one.
>>>>>> There should certainly not be any tail call to hvmemul_write()
>>>>>> anymore.
>>>>>
>>>>> There seems to be a misunderstanding here: if I understand this reply
>>>>> correctly, you'd like hvmemul_cmpxchg() to become a stub that really
>>>>> runs CMPXCHG - but if that comes with the smp_lock part as well, it
>>>>> doesn't matter (except for the now-ignored p_old part, i.e. the
>>>>> comparison).
>>>>>
>>>>> I initially asked whether I should bring the old XenServer-queued
>>>>> LOCK patch back, and I understood that it could serve a purpose, at
>>>>> least as a temporary workaround until your and Andrew's emulator
>>>>> changes make it unnecessary.
>>>>
>>>> Well, yes, as said earlier (still visible in context above) I had
>>>> originally assumed "reworked" would mean making the cmpxchg
>>>> hook actually match its name.
>>>>
>>>>> However, it appears that you've understood this to mean the addition
>>>>> of the CMPXCHG stub (speaking of which, could you please point out
>>>>> examples of implementing such stubs in the Xen code? I have never
>>>>> done one of those before.)
>>>>
>>>> There's no talk about stubs here. All I had hoped would have been
>>>> done _instead_ of the smp-lock approach was to use the CMPXCHG
>>>> instruction inside the hook function, at least for the RAM case (I
>>>> think we could live with MMIO still being forwarded to the write
>>>> handler).
>>>>
>>>> So all this isn't to say that the smp-lock approach might not be
>>>> worthwhile to take on its own, provided the locked region gets
>>>> shrunk as much as possible without losing correctness. The
>>>> CMPXCHG model would just not have this locking granularity
>>>> problem in the first place.
>>>
>>> Sadly, I've now written this (rough) patch:
>>>
>>> http://pastebin.com/3DJ5WYt0
>>>
>>> only to find that it does not solve our issue. With multiple
>>> processors per guest and heavy emulation at boot time, the VM got
>>> stuck at roughly the same point in its life as before the patch.
>>> Looks like more than CMPXCHG needs synchronizing.
>>>
>>> So it would appear that the smp_lock patch is the only way to bring
>>> this under control. Either that, or my patch misses something.
>>> Single-processor guests work just fine.
>>
>> Well, first of all the code to return RETRY is commented out. I
>> may guess that you've tried with it not commented out, but I
>> can't be sure.
>
> Indeed I have tried with it on, with the condition for success being at
> first "val != old", and then, as it is in the pasted code, "val ==
> new". Both of these caused BSODs and random crashes. The guest only
> boots without any apparent issues (and with 1 VCPU only) after I've
> stopped returning RETRY.

"val == new" is clearly wrong: whether to retry depends on whether the
cmpxchg was unsuccessful.

>> And then "stuck" of course can mean many things, so a better
>> description of the state the VM is in and/or the operations that
>> it's stuck on would be needed.
>
> I agree, but on first look it would appear that the hang is just like
> the one without the patch, when multiple processors are involved. This
> is also in line with my previous observation that the problem still
> occurs when only cmpxchg() is locked (you may remember that initially
> I tried to synchronize only the cmpxchg() call in x86_emulate()).
>
> I'll try to give the patch a few more spins and compare the guest's
> stuck state with the stuck state without the patch.

Are you sure you don't run into a case of an RMW insn's memory access
crossing a page boundary? Your code does not handle this case at all,
afaics. Other than that I can't see any obvious omission.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
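[Editorial note: as an illustration of the model Jan describes above (a ->cmpxchg() hook that genuinely compares and exchanges, forcing a retry of the whole instruction when the compare fails), here is a minimal user-space sketch. It is not Xen code: `model_cmpxchg`, the `EMUL_*` return values, the `locked` parameter, and the use of a C11 atomic in place of an inline LOCK CMPXCHG are all illustrative assumptions, and, per Jan's closing remark, an access crossing a page boundary is deliberately not handled.]

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Illustrative return codes, loosely mirroring the emulator's X86EMUL_* values. */
enum emul_rc { EMUL_OKAY, EMUL_RETRY, EMUL_UNHANDLEABLE };

/*
 * Sketch of a cmpxchg hook that really compares and exchanges.
 * On compare failure it writes the value actually observed back into
 * p_old and returns EMUL_RETRY so the caller can re-execute the whole
 * instruction, instead of falling through to a plain write.
 */
static enum emul_rc model_cmpxchg(void *mem, void *p_old, const void *p_new,
                                  unsigned int bytes, int locked)
{
    (void)locked;   /* this sketch always uses an atomic (i.e. LOCKed) CAS */

    if (bytes != 4) /* real code would handle 1/2/8 and page-crossing too */
        return EMUL_UNHANDLEABLE;

    uint32_t old, new_val;
    memcpy(&old, p_old, sizeof(old));
    memcpy(&new_val, p_new, sizeof(new_val));

    /* C11 strong compare-exchange; on x86 this compiles to a LOCK CMPXCHG. */
    if (atomic_compare_exchange_strong((_Atomic uint32_t *)mem, &old, new_val))
        return EMUL_OKAY;

    memcpy(p_old, &old, sizeof(old)); /* report what the memory actually held */
    return EMUL_RETRY;
}
```

Note that the success condition here is the CAS itself succeeding, neither "val != old" nor "val == new", matching Jan's point that the retry decision must hinge solely on the cmpxchg being unsuccessful.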