Re: [PATCH v4 for-4.14] x86/monitor: revert default behavior when monitoring register write events
On 6/8/20 11:44 PM, Tamas K Lengyel wrote:
> On Mon, Jun 8, 2020 at 2:14 PM Razvan Cojocaru
> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>
>> On 6/8/20 10:54 PM, Tamas K Lengyel wrote:
>>> On Mon, Jun 8, 2020 at 12:58 PM Razvan Cojocaru
>>>> And last but not least, the proper sequence is: 1. unsubscribe from
>>>> register write events, 2. process all events "still in the chamber"
>>>> (keep checking the ring buffer for a while), 3. detach from the guest
>>>> (disable the vm_event subsystem). Not ideal perhaps (in that it's not
>>>> guaranteed that a VCPU won't resume after a longer period than our
>>>> timeout), but if the sequence is followed there should be no guest hangs
>>>> or crashes (at least none that we or our clients have observed so far).
>>>
>>> Incorrect. That's not enough. You also have to wait for all the vCPUs
>>> to get scheduled before disabling vm_event, or otherwise the emulation
>>> is skipped entirely. Please read the patch message. If the user
>>> decides to disable vm_event after getting a CR3 event delivered, the
>>> CR3 never gets updated and results in the guest crashing in
>>> unpredictable ways. The same happens with all the other registers.
>>
>> I did read the patch message. As I've previously stated ("it's not
>> guaranteed that a VCPU won't resume after a longer period than our
>> timeout"), it's not ideal, and I've made no claim that the problem does
>> not exist or that it shouldn't be fixed - but really, if you've got a
>> VCPU that will wait more than a couple of seconds to get scheduled, then
>> you've got a bigger problem with the VM.
>
> Sorry, missed the part where you say you knew about this issue. We
> didn't, and it was absolutely not documented anywhere and certainly not
> mentioned in the original patch that added the feature. It literally
> took us years to figure out what the hell is going on. While, as you say,
> it can be a couple of seconds before it's safe to disable, it can be a lot
> longer depending on the system state, how many VMs are running and how
> many vCPUs are active on the VM. There are absolutely necessary
> use-cases where you want to keep the VM paused after a CR3 event is
> received and exit vm_event afterwards. This bug totally blocks those
> use-cases. Not to mention that it's a total mess having an interface
> where the user has to guess when it's safe to do something. If this had
> been pointed out when the patch was made, I would have objected to it
> being merged.

No, we didn't know about the issue. It's apparently not my most eloquent
evening. I was merely saying that I did understand what the issue was
from your description, and offering an explanation of why we've never
seen this in QA or production. Of course the issue would have been
mentioned at least (but ideally not exist to begin with) had it been
known.

We do take several passes through the ring buffer (and as a side-effect
reduce the probability of a race occurring to almost zero), but we do
that to make sure we've cleaned up all EPT vm_events; the fact that it
has helped with _this_ issue is a coincidence. IIRC Windows, at least,
will be upset if a VCPU is stuck for too long.

As for how the vm_event system behaves:

1. A security application that is unable to _prevent_ things being done
to a system is not doing a very good job, since in that case you can
only collect stats and not veto anything. I would argue that the default
for such a monitoring application should be the current one (all events
should be pre-action).

2. This is further supported by the fact that that's exactly how the EPT
vm_events work: you get an "I want to write to this page" event _before_
the write occurs. If register writes behave differently, you have _two_
different models: one where you get an event post-action, and one where
you get one pre-action.

Hope that's clearer,
Razvan
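[Editorial note] For readers unfamiliar with the teardown sequence quoted at the top of this mail (1. unsubscribe from register write events, 2. drain events "still in the chamber", 3. detach from the guest), below is a minimal sketch of that pattern written against LibVMI's event API. The domain name, the drain bound/timeouts and the callback body are illustrative assumptions only, not taken from the thread, and exact LibVMI signatures may differ between versions. As Tamas points out above, even with step 2 a vCPU that has not been scheduled since its reply was posted can still miss the emulation, which is what the patch under discussion addresses.

#include <stdio.h>
#include <inttypes.h>
#include <libvmi/libvmi.h>
#include <libvmi/events.h>

/* Pre-write CR3 event: inspect (and, in a real monitor, possibly veto)
 * the pending register write here. */
static event_response_t cr3_cb(vmi_instance_t vmi, vmi_event_t *event)
{
    (void)vmi;
    printf("CR3 write on vCPU %u: 0x%" PRIx64 "\n",
           event->vcpu_id, event->reg_event.value);
    return VMI_EVENT_RESPONSE_NONE;
}

int main(void)
{
    vmi_instance_t vmi = NULL;
    vmi_event_t cr3_event = { 0 };

    /* 0. Attach with event support ("guest-name" is a placeholder). */
    if (VMI_FAILURE == vmi_init_complete(&vmi, "guest-name",
                                         VMI_INIT_DOMAINNAME | VMI_INIT_EVENTS,
                                         NULL, VMI_CONFIG_GLOBAL_FILE_ENTRY,
                                         NULL, NULL))
        return 1;

    SETUP_REG_EVENT(&cr3_event, CR3, VMI_REGACCESS_W, 0, cr3_cb);
    vmi_register_event(vmi, &cr3_event);

    /* ... normal monitoring loop: vmi_events_listen(vmi, 500); ... */

    /* 1. Unsubscribe from the register write event. */
    vmi_clear_event(vmi, &cr3_event, NULL);

    /* 2. Drain events "still in the chamber": keep pulling from the
     *    ring for a while so pending requests get their replies.
     *    The bound of 10 passes x 100 ms is an arbitrary assumption. */
    for (int i = 0; i < 10 && vmi_are_events_pending(vmi) > 0; i++)
        vmi_events_listen(vmi, 100 /* ms */);

    /* 3. Detach from the guest (tears down the vm_event subsystem). */
    vmi_destroy(vmi);
    return 0;
}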