[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] xc_hvm_inject_trap() races



On 11/08/2016 10:15 AM, Jan Beulich wrote:
>>>> On 07.11.16 at 18:01, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>> On 11/07/2016 06:10 PM, Jan Beulich wrote:
>>>>>> On 07.11.16 at 16:24, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>> The one-shot vm_event does sound reasonable. I could set a flag
>>>> per-VCPU, basically similar to v->arch.hvm_vcpu.inject_trap.vector, and
>>>> fire a single event from hvm_inject_trap() if it's set (then unset it) -
>>>> the flag would be set via an xc_monitor_next_interrupt() call in libxc.
>>>
>>> Doing this in hvm_inject_trap() would not cover all cases afict.
>>> I'd suggest doing this from hvm_do_resume() _after_ the
>>> (conditional) call to hvm_inject_trap(), if there is _any_ event
>>> pending.
>>
>> But that would only cover the hypercall-injected traps. The condition in
>> hvm_do_resume() is "if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )",
>> and inject_trap.vector seems to only ever be set by the hypercall:
>> [...]
>> So if the next interrupt is not caused by the hypercall, we'll never get
>> another event. Am I reading the code wrong?
> 
> No, maybe I expressed myself ambiguously: I meant to say that the
> event should be delivered from hvm_do_resume(), but _outside_ the
> conditional guarding the call to hvm_inject_trap(). Otherwise things
> would have been worse than when doing it inside hvm_inject_trap().

While working on this patch, I've had a new idea, which would require
less changes and fix the problem in a more elegant manner if validated.
Looking at vmx_idtv_reinject(), the real problem seems to be that
VM_ENTRY_INTR_INFO is being written to directly:

3229 static void vmx_idtv_reinject(unsigned long idtv_info)
3230 {
3231
3232     /* Event delivery caused this intercept? Queue for redelivery. */
3233     if ( unlikely(idtv_info & INTR_INFO_VALID_MASK) )
3234     {
3235         if ( hvm_event_needs_reinjection(MASK_EXTR(idtv_info,
3236
INTR_INFO_INTR_TYPE_MASK),
3237                                          idtv_info &
INTR_INFO_VECTOR_MASK) )
3238         {
3239             /* See SDM 3B 25.7.1.1 and .2 for info about masking
resvd bits. */
3240             __vmwrite(VM_ENTRY_INTR_INFO,
3241                       idtv_info & ~INTR_INFO_RESVD_BITS_MASK);
3242             if ( idtv_info & INTR_INFO_DELIVER_CODE_MASK )
3243             {
3244                 unsigned long ec;
3245
3246                 __vmread(IDT_VECTORING_ERROR_CODE, &ec);
3247                 __vmwrite(VM_ENTRY_EXCEPTION_ERROR_CODE, ec);
3248             }
3249         }
3250
3251         /*
3252          * Clear NMI-blocking interruptibility info if an NMI
delivery faulted.
3253          * Re-delivery will re-set it (see SDM 3B 25.7.1.2).
3254          */
3255         if ( cpu_has_vmx_vnmi &&
3256              ((idtv_info & INTR_INFO_INTR_TYPE_MASK) ==
3257               MASK_INSR(X86_EVENTTYPE_NMI, INTR_INFO_INTR_TYPE_MASK)) )
3258         {
3259             unsigned long intr_info;
3260
3261             __vmread(GUEST_INTERRUPTIBILITY_INFO, &intr_info);
3262             __vmwrite(GUEST_INTERRUPTIBILITY_INFO,
3263                       intr_info & ~VMX_INTR_SHADOW_NMI);
3264         }
3265     }
3266 }

where the hypercall looks at v->arch.hvm_vcpu.inject_trap.vector only.
Then we notice that the hypercall _fails_immediately_ with -EBUSY if
v->arch.hvm_vcpu.inject_trap.vector is already set:

5922     case HVMOP_inject_trap:
5923     {
5924         xen_hvm_inject_trap_t tr;
5925         struct domain *d;
5926         struct vcpu *v;
5927
5928         if ( copy_from_guest(&tr, arg, 1 ) )
5929             return -EFAULT;
5930
5931         rc = rcu_lock_remote_domain_by_id(tr.domid, &d);
5932         if ( rc != 0 )
5933             return rc;
5934
5935         rc = -EINVAL;
5936         if ( !is_hvm_domain(d) )
5937             goto injtrap_fail;
5938
5939         rc = xsm_hvm_control(XSM_DM_PRIV, d, op);
5940         if ( rc )
5941             goto injtrap_fail;
5942
5943         rc = -ENOENT;
5944         if ( tr.vcpuid >= d->max_vcpus || (v = d->vcpu[tr.vcpuid])
== NULL )
5945             goto injtrap_fail;
5946
5947         if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
5948             rc = -EBUSY;
5949         else
5950         {
5951             v->arch.hvm_vcpu.inject_trap.vector = tr.vector;
5952             v->arch.hvm_vcpu.inject_trap.type = tr.type;
5953             v->arch.hvm_vcpu.inject_trap.error_code = tr.error_code;
5954             v->arch.hvm_vcpu.inject_trap.insn_len = tr.insn_len;
5955             v->arch.hvm_vcpu.inject_trap.cr2 = tr.cr2;
5956             rc = 0;
5957         }
5958
5959     injtrap_fail:
5960         rcu_unlock_domain(d);
5961         break;
5962     }

The conclusion: instead of __vmwrite(VM_ENTRY_INTR_INFO, ...);
__vmwrite(VM_ENTRY_EXCEPTION_ERROR_CODE, ...) in vmx_idtv_reinject(),
simply do what the hypercall would have done, i.e. write
inject_trap.vector, inject_trap.type, etc.

This way, the hypercall will fail immediately if there's a pending
interrupt set on exit.

Is this too good to be true? :)


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.