Re: [Xen-devel] SMP Guest Proposal RFC
* Ian Pratt <m+Ian.Pratt@xxxxxxxxxxxx> [2005-04-01 18:55]:
> > Attached is a proposal authored by Bryan Rosenburg, Orran
> > Krieger and Ryan Harper. Comments, questions, and criticism
> > requested.
>
> Ryan,
>
> Much of what you're proposing closely matches our own plans: It's always
> better that a domain have the minimum number of VCPUs active that are
> required to meet its CPU load, and gang scheduling is clearly preferred
> where possible.

That sounds good.

> However, I'm convinced that pre-emption notifications are not the way to
> go: Kernels typically have no way to back out of holding a lock early,
> so giving them an active call-back is not very useful.

With an interrupt-based notification, the kernel tells the hypervisor when
it is safe to preempt: the notification interrupt is serviced only when no
locks are held, which is exactly what we want for avoiding preemption of a
lock holder. If the kernel does not yield in time, we are no worse off with
respect to preempting lock holders than we would be with an unannounced
preemption. The notification also lets the kernel prepare for the
preemption, for example by migrating applications to other CPUs that are
not being preempted.

> I think it's better to have a counter that the VCPU increments whenever
> it grabs a lock and decrements when it releases a lock. When the
> pre-emption timer goes off, the hypervisor can check the counter. If it's
> non-zero, the hypervisor can choose to hold off the preemption for e.g.
> 50us. It can also set a bit in another word indicating that a
> pre-emption is pending. Whenever the '#locks held' counter is
> decremented to zero, the pre-emption pending bit can be checked, and the
> VCPU should immediately yield if it is.

One of our concerns was the accounting overhead incurred on every spinlock
acquisition and release; Linux acquires and releases spinlocks at an
incredible rate. Rather than touch the fast path of the spinlock code, our
proposal only pays the cost when a preemption is actually needed.

> An alternative/complementary scheme would be to have each lock able to
> store the number of the VCPU that's holding it. If a VCPU finds that a
> lock is already taken, it can look in the shared info page to see if the
> VCPU that's holding the lock is actually running. If it's not, it can
> issue a hypervisor_yield_to_VCPU X hypercall and avoid further spinning,
> passing its time slice to the VCPU holding the lock.

The directed yield is complementary to any of the schemes discussed here,
since it helps once lock-holder preemption has actually occurred. It is the
method currently employed by the IBM production hypervisor; you can see the
Linux/Power implementation in arch/ppc64/lib/locks.h.

Thanks for the comments. I look forward to further discussion.

Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@xxxxxxxxxx
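
For illustration, here is a minimal guest-side sketch of the "#locks held"
counter scheme described above. The shared-info layout, the
HYPERVISOR_yield() wrapper, and the lock/unlock helpers are hypothetical
stand-ins, not the interface of either proposal; the point is only to show
where the two extra atomic operations land on the spinlock fast path.

```c
#include <stdatomic.h>

/* Per-VCPU words the hypervisor can read and write (assumed layout). */
struct vcpu_shared_info {
    atomic_int locks_held;       /* incremented on lock, decremented on unlock */
    atomic_int preempt_pending;  /* set by the hypervisor when it defers a preemption */
};

static struct vcpu_shared_info this_vcpu;

static void HYPERVISOR_yield(void)
{
    /* stand-in for the real "yield my timeslice" hypercall */
}

typedef struct { atomic_flag locked; } guest_spinlock_t;

static void guest_spin_lock(guest_spinlock_t *lock)
{
    /* Advertise to the hypervisor that this VCPU is about to hold a lock. */
    atomic_fetch_add(&this_vcpu.locks_held, 1);
    while (atomic_flag_test_and_set_explicit(&lock->locked, memory_order_acquire))
        ;  /* spin */
}

static void guest_spin_unlock(guest_spinlock_t *lock)
{
    atomic_flag_clear_explicit(&lock->locked, memory_order_release);

    /* Last lock released: if the hypervisor deferred a preemption for us,
     * yield immediately instead of running on borrowed time. */
    if (atomic_fetch_sub(&this_vcpu.locks_held, 1) == 1 &&
        atomic_exchange(&this_vcpu.preempt_pending, 0))
        HYPERVISOR_yield();
}

int main(void)
{
    static guest_spinlock_t lock = { ATOMIC_FLAG_INIT };
    guest_spin_lock(&lock);
    /* ... critical section ... */
    guest_spin_unlock(&lock);
    return 0;
}
```

The trade-off Ryan raises is visible here: every acquire and release pays
for the counter update, whereas the notification approach keeps the
spinlock fast path untouched and pays only when a preemption is imminent.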
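A similarly hedged sketch of the directed-yield idea: the lock records the
id of the VCPU that holds it, and a spinner that sees the holder is not
currently running donates its timeslice rather than spinning. The
vcpu_is_running table, this_vcpu_id(), and HYPERVISOR_yield_to_vcpu() are
assumed names for illustration; the real Linux/Power code is in the file
cited above.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_VCPUS 4

/* Run-state table the hypervisor keeps up to date in the shared info
 * page (assumed layout -- the real interface may differ). */
static volatile bool vcpu_is_running[MAX_VCPUS];

static int this_vcpu_id(void) { return 0; }  /* stand-in */

static void HYPERVISOR_yield_to_vcpu(int vcpu)
{
    /* stand-in for "donate the rest of my timeslice to VCPU <vcpu>" */
    (void)vcpu;
}

typedef struct {
    atomic_int holder;  /* -1 when free, otherwise the holding VCPU's id */
} vlock_t;

#define VLOCK_INIT { ATOMIC_VAR_INIT(-1) }

static void vlock_lock(vlock_t *lock)
{
    for (;;) {
        int expected = -1;
        if (atomic_compare_exchange_weak_explicit(&lock->holder, &expected,
                                                  this_vcpu_id(),
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return;

        /* Lock is taken: if its holder has been preempted, spinning only
         * burns our timeslice, so hand the time to the holder instead. */
        if (expected >= 0 && !vcpu_is_running[expected])
            HYPERVISOR_yield_to_vcpu(expected);
    }
}

static void vlock_unlock(vlock_t *lock)
{
    atomic_store_explicit(&lock->holder, -1, memory_order_release);
}

int main(void)
{
    static vlock_t lock = VLOCK_INIT;
    vcpu_is_running[0] = true;
    vlock_lock(&lock);
    /* ... critical section ... */
    vlock_unlock(&lock);
    return 0;
}
```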