
Re: [Xen-devel] Locking on vm-event operations (monitor)



On 7/22/2016 2:13 PM, Andrew Cooper wrote:


On 22/07/16 11:31, Corneliu ZUZU wrote:
On 7/22/2016 12:51 PM, Andrew Cooper wrote:
On 22/07/16 10:27, Corneliu ZUZU wrote:

Hi,

I've been inspecting the vm-event code to try to understand when and why domain pausing/locking is done. If I understood correctly, domain pausing is done solely to force all the vCPUs of a domain to see a configuration update and act upon it (e.g. in the case of a XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 monitoring, the domain pause/unpause ensures immediate enabling of CR3 load-exiting for all vCPUs), not to synchronize concurrent operations (lock-behavior).

Correct. Without the vcpu/domain pause, a vcpu could be midway through a vmexit path with the monitor configuration changing under its feet. OTOH, it could be running in guest context, and only receive the update one scheduling timeslice later.
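For illustration, the pattern is roughly the one below. This is only a sketch: the helper name is made up, and the field/helper usage approximates what the WRITE_CTRLREG domctl handler does today, it isn't actual Xen code.

#include <xen/sched.h>
#include <public/vm_event.h>
#include <asm/monitor.h>
#include <asm/hvm/hvm.h>

/* Hypothetical helper, illustrating the pause/update/unpause pattern. */
static int monitor_enable_cr3_write(struct domain *d)
{
    struct vcpu *v;

    /*
     * Force every vcpu out of guest context and out of any in-flight
     * vmexit path, so none of them can observe a half-updated monitor
     * configuration.
     */
    domain_pause(d);

    d->arch.monitor.write_ctrlreg_enabled |=
        monitor_ctrlreg_bitmask(VM_EVENT_X86_CR3);

    /* Latch CR3 load-exiting into each vcpu's VMCS via the CR0 path. */
    for_each_vcpu ( d, v )
        hvm_update_guest_cr(v, 0);

    /* On unpause, every vcpu re-enters the guest with the new config. */
    domain_unpause(d);

    return 0;
}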


As for locking, I see that for example vm_event_init_domain(), vm_event_cleanup_domain() and monitor_domctl() are all protected by the domctl-lock, but I don't think that's enough.
Here are a few code-paths that led me to believe that:
* do_hvm_op()->monitor_guest_request() reads the d.monitor.guest_request_* resources, but it doesn't seem to take the domctl lock, so those reads can apparently happen _while_ the resources are being initialized/cleaned up
* monitor_guest_request() also calls monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot(), which attempts a vm_event_ring_lock(ved); that lock could likewise be taken _while_ it is being initialized (vm_event_enable()) or cleaned up (vm_event_disable())
* hvm_monitor_cr() - e.g. the code-path vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX() doesn't seem to take into account the possibility of a concurrent monitor_init_domain()/monitor_cleanup_domain()

Am I missing something with these conclusions?
As a resolution for this, I've been thinking of adding a 'subsys_lock' field to the vm_event_domain structure, either a spinlock or an rw-lock, which would be initialised/uninitialised when d.vm_event is allocated/freed (domain_create()/complete_domain_destroy()).
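Roughly something like this (sketch only; names and exact placement are tentative):

#include <xen/rwlock.h>

struct vm_event_domain {
    rwlock_t subsys_lock;   /* guards init/teardown vs. event-side users */
    /* ... existing members: ring_lock, ring_page, waitqueue, ... */
};

/* Done once, where d.vm_event is allocated (domain_create() time). */
static void vm_event_subsys_lock_init(struct vm_event_domain *ved)
{
    rwlock_init(&ved->subsys_lock);
}

/* Init/teardown paths (vm_event_enable()/vm_event_disable(),
 * monitor_init_domain()/monitor_cleanup_domain()) become writers. */
static void vm_event_reconfigure(struct vm_event_domain *ved)
{
    write_lock(&ved->subsys_lock);
    /* ... (un)initialise ring/monitor state here ... */
    write_unlock(&ved->subsys_lock);
}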

I don't think you are missing anything. It seems like a monitor lock is needed.

FWIW, the domctl lock is only used at the moment because it was easy, and worked when everything was a domctl. Being a global spinlock, it is a severe bottleneck for concurrent domain management, which I need to find some time to break apart, so the less reliance on it the better from my point of view.

~Andrew


I was thinking of not touching the way the domctl lock is acquired in these cases at all, but rather leaving that as it is and introducing the rw-lock separately. Are you suggesting I should also change do_domctl to skip domctl_lock_acquire()-ing when op->cmd is XEN_DOMCTL_vm_event_op? That would certainly be a first step towards removing the bottleneck you mentioned, but it would break the consistency of locking behaviour across domctls. I think it would also add significant complexity to what I'll be doing, so I hope it wouldn't be a problem if we leave that for a future patch.

Sorry - I didn't intend to suggest altering the domctl lock. Just that adding a new monitor lock, and using it everywhere appropriate, would be a good step.
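E.g. the event-side consumers would then take it for reading; something like the below, simplified from the current monitor_guest_request() (the lock name and placement follow your sketch and are of course tentative):

#include <xen/sched.h>
#include <xen/monitor.h>
#include <public/vm_event.h>

void monitor_guest_request(void)
{
    struct vcpu *curr = current;
    struct domain *d = curr->domain;

    /* Readers can run concurrently with each other, but are excluded
     * against init/teardown of the vm-event/monitor state. */
    read_lock(&d->vm_event->monitor.subsys_lock);

    if ( d->monitor.guest_request_enabled )
    {
        vm_event_request_t req = {
            .reason = VM_EVENT_REASON_GUEST_REQUEST,
            .vcpu_id = curr->vcpu_id,
        };

        monitor_traps(curr, d->monitor.guest_request_sync, &req);
    }

    read_unlock(&d->vm_event->monitor.subsys_lock);
}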

Breaking the domctl lock is a very large can of worms, which you definitely won't want to open while attempting to do something else.

~Andrew


Yep, agreed.

Corneliu.
