Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
>>> On 20.01.16 at 12:20, <feng.wu@xxxxxxxxx> wrote:
>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> Sent: Wednesday, January 20, 2016 4:35 PM
>> >>> On 20.01.16 at 08:49, <feng.wu@xxxxxxxxx> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> >> Sent: Monday, January 18, 2016 11:14 PM
>> >> >>> On 03.12.15 at 09:35, <feng.wu@xxxxxxxxx> wrote:
>> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> >> > @@ -83,7 +83,131 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
>> >> > +    ASSERT(v->arch.hvm_vmx.pi_block_cpu == NR_CPUS);
>> >> > +
>> >> > +    /*
>> >> > +     * The vCPU is blocking, so we need to add it to one of the per-pCPU
>> >> > +     * lists.  We save v->processor to v->arch.hvm_vmx.pi_block_cpu and
>> >> > +     * use it for the per-CPU list; we also save it to the posted-interrupt
>> >> > +     * descriptor and make it the destination of the wake-up notification
>> >> > +     * event.
>> >> > +     */
>> >> > +    v->arch.hvm_vmx.pi_block_cpu = v->processor;
>> >> > +
>> >> > +    spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
>> >> > +                      v->arch.hvm_vmx.pi_block_cpu), flags);
>> >> > +    list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list,
>> >> > +                  &per_cpu(pi_blocked_vcpu, v->arch.hvm_vmx.pi_block_cpu));
>> >> > +    spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock,
>> >> > +                           v->arch.hvm_vmx.pi_block_cpu), flags);
>> >> > +
>> >> > +    ASSERT(!pi_test_sn(pi_desc));
>> >> > +
>> >> > +    /*
>> >> > +     * We don't need to set the 'NDST' field, since it should point to
>> >> > +     * the same pCPU as v->processor.
>> >> > +     */
>> >> > +
>> >> > +    write_atomic(&pi_desc->nv, pi_wakeup_vector);
>> >>
>> >> Stray blank line between comment and corresponding code.
>> >>
>> >> Also the ASSERT() is rather more disconnected from the write
>> >> than seems desirable: Wouldn't it be better to cmpxchg() the
>> >> whole 32-bit value, validating that SN is clear in the result?
>> >
>> > I don't quite understand this. The control field is 64 bits; do you
>> > mean cmpxchg() the whole 64-bit value as follows:
>> >
>> > +    do {
>> > +        old.control = new.control = pi_desc->control;
>> > +        new.nv = pi_wakeup_vector;
>> > +    } while ( cmpxchg(&pi_desc->control, old.control, new.control)
>> > +              != old.control );
>> >
>> > This is somewhat like the implementation in the earlier versions;
>> > why do you want to change it back?
>>
>> Did you read what I've said? I'm worried about the disconnect of
>> assertion and update: You really care about SN being clear
>> _after_ the update aiui.
>
> No, the ASSERT has no connection with the update. The SN bit should
> be clear before updating pi_desc->nv; adding the ASSERT here is just a
> kind of protective code.

And SN can't get set behind your back between the ASSERT() and the
update? If the answer is "yes", then the code is indeed fine as is.
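For illustration, a minimal sketch of the kind of combined update being
discussed here: folding the SN check into the atomic update itself, so that
a successful cmpxchg() also proves SN was clear at the instant NV was
switched. This assumes the struct pi_desc layout implied by the patch (a
'control' word overlaying the 'sn' and 'nv' bitfields); the function name is
made up, and this is not the code under review:

    static void pi_set_wakeup_vector(struct pi_desc *pi_desc)
    {
        struct pi_desc old, new;

        do {
            old.control = new.control = read_atomic(&pi_desc->control);

            /*
             * SN must be clear here.  Since the cmpxchg() below only
             * succeeds while the descriptor still holds old.control,
             * a successful update also proves SN was still clear at
             * the moment NV changed.
             */
            ASSERT(!old.sn);

            new.nv = pi_wakeup_vector;
        } while ( cmpxchg(&pi_desc->control, old.control, new.control)
                  != old.control );
    }

This keeps the assertion and the update on the same snapshot of the
descriptor, rather than letting an interval open up between them.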
>> >> > +    pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
>> >> > +    if ( pi_block_cpu == NR_CPUS )
>> >> > +        return;
>> >>
>> >> Are this condition and the one in the immediately preceding if()
>> >> connected in any way?
>> >
>> > I am not sure I understand this correctly. Which one is
>> > "the one in the immediately preceding if()"?
>>
>> If you hadn't ripped out too much of the context, it would be a
>> matter of pointing you back up. Now I have to re-quote the full
>> code:
>>
>> +    if ( pi_desc->nv != posted_intr_vector )
>> +        write_atomic(&pi_desc->nv, posted_intr_vector);
>> +
>> +    /* The vCPU is not on any blocking list. */
>> +    pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
>> +    if ( pi_block_cpu == NR_CPUS )
>> +        return;
>>
>> These are the two if()-s the question is about.
>>
>> >> I.e. could the latter perhaps become an
>> >> ASSERT() by connecting the two blocks suitably? I'm simply
>> >> trying to limit the number of different cases to consider mentally
>> >> when looking at this code ...
>> >
>> > If we get a true from the above check, it means the vCPU was removed
>> > by pi_wakeup_interrupt(), and we just need to return. If we get a
>> > false, we will acquire the spinlock as the following code does; there
>> > are two scenarios:
>> > - pi_wakeup_interrupt() acquires the spinlock before us; then by the
>> >   time we get the lock, 'v->arch.hvm_vmx.pi_block_cpu' has already
>> >   been set to NR_CPUS by pi_wakeup_interrupt(), and we need do nothing.
>> > - We get the spinlock before pi_wakeup_interrupt(); this time we need
>> >   to remove the vCPU from the list and set 'v->arch.hvm_vmx.pi_block_cpu'
>> >   to NR_CPUS.
>>
>> This is all only about the second condition, but the question really is
>> whether one being true or false may imply the result of the other. In
>> which case this would better be ASSERT()ed than handled by two
>> conditionals.
>
> Thanks for the explanation. I don't think there is any connection
> between "if ( pi_desc->nv != posted_intr_vector )" and
> "if ( pi_block_cpu == NR_CPUS )".

Well, it would have seemed to me that some of the four possible
combinations can't validly occur, e.g. due to NV always having a
particular value when pi_block_cpu != NR_CPUS.

>> >> > @@ -106,10 +230,17 @@ static int vmx_vcpu_initialise(struct vcpu *v)
>> >> >
>> >> >      spin_lock_init(&v->arch.hvm_vmx.vmcs_lock);
>> >> >
>> >> > +    INIT_LIST_HEAD(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
>> >> > +
>> >> > +    v->arch.hvm_vmx.pi_block_cpu = NR_CPUS;
>> >> > +
>> >> >      v->arch.schedule_tail    = vmx_do_resume;
>> >> >      v->arch.ctxt_switch_from = vmx_ctxt_switch_from;
>> >> >      v->arch.ctxt_switch_to   = vmx_ctxt_switch_to;
>> >> >
>> >> > +    if ( iommu_intpost )
>> >> > +        v->arch.vcpu_block = vmx_vcpu_block;
>> >>
>> >> Considering the locking question above - couldn't this be done only
>> >> when a device gets added, and the pointer zapped upon removal
>> >> of the last device, at once saving the conditional inside the hook?
>> >
>> > That is another solution, but I think it makes a simple thing
>> > difficult; I feel the current logic is simple and clear. Besides, this
>> > hook is not on a critical path, so another conditional in it should
>> > not be a big deal.
>>
>> Then you didn't understand: The question isn't this path, but the
>> path where the hook gets called if non-NULL (and hence the
>> possibility to avoid such needless calls).
>
> I understand you mean the overhead happens when the hook is called. My
> point is that the hook is not called on a critical path, so I doubt it
> is worth making the logic more complex for this.

Are you sure scheduling code is not a critical path?
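To make the alternative concrete, installing the hook from the device
assignment path and clearing it when the last device goes away might look
roughly like the sketch below. iommu_intpost, vmx_vcpu_block and
for_each_vcpu are from the patch context; the function names and the
has_hvm_container_domain() guard are assumptions for illustration, with the
callers being the IOMMU device assign/deassign paths:

    /* Hypothetical: called when the first device is assigned to d. */
    static void vmx_pi_hooks_assign(struct domain *d)
    {
        struct vcpu *v;

        if ( !iommu_intpost || !has_hvm_container_domain(d) )
            return;

        for_each_vcpu ( d, v )
            v->arch.vcpu_block = vmx_vcpu_block;
    }

    /* Hypothetical: called when the last assigned device is removed. */
    static void vmx_pi_hooks_deassign(struct domain *d)
    {
        struct vcpu *v;

        if ( !iommu_intpost || !has_hvm_container_domain(d) )
            return;

        for_each_vcpu ( d, v )
            v->arch.vcpu_block = NULL;
    }

The scheduler-side call site can then stay a plain
"if ( v->arch.vcpu_block ) v->arch.vcpu_block(v);", and domains with no
assigned device never pay for the indirect call at all.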
>> >> > --- a/xen/include/asm-x86/hvm/vmx/vmcs.h
>> >> > +++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
>> >> >
>> >> > +    unsigned int pi_block_cpu;
>> >>
>> >> Looking at all the use sites I wonder whether it wouldn't make for
>> >> easier to read code if you instead stored the lock to acquire. This
>> >> would then perhaps also prepare a road for balancing lists growing
>> >> too large (i.e. rather than having one list per CPU, you could
>> >> then have a variable number of lists, depending on current needs,
>> >> and [kind of] freely select one of them).
>> >
>> > Since this patch is still experimental, can we put the "length of the
>> > list" issue off to the next stage, when we turn the feature on by
>> > default?
>>
>> I in no way meant to say that the list length issue needs to be
>> dealt with right away. All I was saying was that storing a lock
>> pointer here instead of a CPU number may make that easier (and
>> might therefore be a good idea to do right away).
>
> Do you mean storing a pointer to 'pi_blocked_vcpu_lock' here?

Whatever you name it, but yes.

Jan
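For concreteness, the field change being agreed on in that last exchange
might look roughly like the fragment below. The field name is an assumption,
and the surrounding declarations are elided; only the shape of the change is
sketched:

    /* In struct arch_vmx_struct, replacing 'unsigned int pi_block_cpu': */
    spinlock_t *pi_blocking_list_lock;  /* NULL <=> not on any blocking list */

    /* Blocking path - record which list's lock now protects this vCPU: */
    v->arch.hvm_vmx.pi_blocking_list_lock =
        &per_cpu(pi_blocked_vcpu_lock, v->processor);

    /* Unblock/cleanup path - replaces the pi_block_cpu == NR_CPUS check: */
    unsigned long flags;
    spinlock_t *lock = v->arch.hvm_vmx.pi_blocking_list_lock;

    if ( lock == NULL )
        return;

    spin_lock_irqsave(lock, flags);
    /* Re-check under the lock: pi_wakeup_interrupt() may have won the race. */
    if ( v->arch.hvm_vmx.pi_blocking_list_lock == lock )
    {
        list_del(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
        v->arch.hvm_vmx.pi_blocking_list_lock = NULL;
    }
    spin_unlock_irqrestore(lock, flags);

Because the unblock path then dereferences only the stored pointer, the
mapping from vCPUs to lists could later be rebalanced across a variable
number of lists without touching this code, which is the road Jan alludes to.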