[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V2 3/3] xen/vm_event: Deny register writes if refused by vm_event reply



On 06/26/2015 11:28 AM, Jan Beulich wrote:
>>>> On 15.06.15 at 11:03, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>> Deny register writes if a vm_client subscribed to mov_to_msr events
>> forbids them. Currently supported for MSR, CR0, CR3 and CR4 events.
> 
> Is the first sentence referring the mov_to_msr stale?

Yes, the patch now applies to MSR, CR0, CR3 and CR4. I'll update the
patch description.

>> --- a/xen/arch/x86/hvm/event.c
>> +++ b/xen/arch/x86/hvm/event.c
>> @@ -90,7 +90,7 @@ static int hvm_event_traps(uint8_t sync, 
>> vm_event_request_t *req)
>>      return 1;
>>  }
>>  
>> -void hvm_event_cr(unsigned int index, unsigned long value, unsigned long 
>> old)
>> +bool_t hvm_event_cr(unsigned int index, unsigned long value, unsigned long 
>> old)
>>  {
>>      struct arch_domain *currad = &current->domain->arch;
>>      unsigned int ctrlreg_bitmask = monitor_ctrlreg_bitmask(index);
>> @@ -109,7 +109,10 @@ void hvm_event_cr(unsigned int index, unsigned long 
>> value, unsigned long old)
>>  
>>          hvm_event_traps(currad->monitor.write_ctrlreg_sync & 
>> ctrlreg_bitmask,
>>                          &req);
>> +        return 1;
>>      }
>> +
>> +    return 0;
>>  }
>>  
>>  void hvm_event_msr(unsigned int msr, uint64_t value)
> 
> Why is knowing whether an event was sent relevant for
> hvm_event_cr(), but not for hvm_event_msr()?

MSR events don't honor onchangeonly, so they're always being sent. A CR
event won't be sent if onchangeonly is true and the register is being
set to the same value again.

Since the change prompted the question, I should perhaps add a comment
to explain that?

>> @@ -468,6 +469,37 @@ void hvm_do_resume(struct vcpu *v)
>>          }
>>      }
>>  
>> +    ASSERT(v == current);
>> +
>> +    if ( d->arch.event_write_data )
> 
> unlikely() to not penalize the common case? Also an ASSERT() like
> this belongs either at the beginning of the function or, if really only
> relevant with the changes you do here, inside the if().

Ack, will move the ASSERT() and wrap the condition with an unlikely().

>> @@ -3189,12 +3221,13 @@ static void hvm_update_cr(struct vcpu *v, unsigned 
>> int cr, unsigned long value)
>>      hvm_update_guest_cr(v, cr);
>>  }
>>  
>> -int hvm_set_cr0(unsigned long value)
>> +int hvm_set_cr0(unsigned long value, bool_t event_only)
> 
> "event_only" seems pretty misleading, as it limits the function's
> operation only when ...
> 
>> @@ -3224,6 +3257,22 @@ int hvm_set_cr0(unsigned long value)
>>          goto gpf;
>>      }
>>  
>> +    if ( event_only && unlikely(currad->monitor.write_ctrlreg_enabled &
>> +                                monitor_ctrlreg_bitmask(VM_EVENT_X86_CR0)) 
>> &&
>> +                                value != old_value )
> 
> ... a number of other conditions are true. Maybe "may_defer" or
> some such making clear this is conditional behavior?
> 
> Also - indentation.

Ack on both counts.

>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -2010,9 +2010,9 @@ static int vmx_cr_access(unsigned long 
>> exit_qualification)
>>      }
>>      case VMX_CONTROL_REG_ACCESS_TYPE_CLTS: {
>>          unsigned long old = curr->arch.hvm_vcpu.guest_cr[0];
>> +        hvm_event_crX(CR0, curr->arch.hvm_vcpu.guest_cr[0], old);
>>          curr->arch.hvm_vcpu.guest_cr[0] &= ~X86_CR0_TS;
>>          vmx_update_guest_cr(curr, 0);
>> -        hvm_event_crX(CR0, curr->arch.hvm_vcpu.guest_cr[0], old);
>>          HVMTRACE_0D(CLTS);
>>          break;
>>      }
> 
> I suppose it is intentional to ignore a possible deny here? If so,
> shouldn't the be documented by way of a comment?

Actually on second thought I should probably not ignore a deny there.
While this wasn't required in tests, it's probably better to be consistent.

> Also, since you already touch this code, please add a blank line
> between declaration and statements.

Ack.

>> --- a/xen/arch/x86/hvm/vmx/vvmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
>> @@ -1048,15 +1048,16 @@ static void load_shadow_guest_state(struct vcpu *v)
>>  
>>      nvcpu->guest_cr[0] = __get_vvmcs(vvmcs, CR0_READ_SHADOW);
>>      nvcpu->guest_cr[4] = __get_vvmcs(vvmcs, CR4_READ_SHADOW);
>> -    hvm_set_cr0(__get_vvmcs(vvmcs, GUEST_CR0));
>> -    hvm_set_cr4(__get_vvmcs(vvmcs, GUEST_CR4));
>> -    hvm_set_cr3(__get_vvmcs(vvmcs, GUEST_CR3));
>> +    hvm_set_cr0(__get_vvmcs(vvmcs, GUEST_CR0), 1);
>> +    hvm_set_cr4(__get_vvmcs(vvmcs, GUEST_CR4), 1);
>> +    hvm_set_cr3(__get_vvmcs(vvmcs, GUEST_CR3), 1);
> 
> Are you sure there are no dependencies between the individual
> registers getting set? And what about the order here versus how
> you carry out the deferred writes in hvm_do_resume()? I don't
> think any good can come from updating CR3 before CR4...

Ack, I'll mirror this order in hvm_do_resume().

>>      control = __get_vvmcs(vvmcs, VM_ENTRY_CONTROLS);
>>      if ( control & VM_ENTRY_LOAD_GUEST_PAT )
>>          hvm_set_guest_pat(v, __get_vvmcs(vvmcs, GUEST_PAT));
>>      if ( control & VM_ENTRY_LOAD_PERF_GLOBAL_CTRL )
>> -        hvm_msr_write_intercept(MSR_CORE_PERF_GLOBAL_CTRL, 
>> __get_vvmcs(vvmcs, GUEST_PERF_GLOBAL_CTRL));
>> +        hvm_msr_write_intercept(MSR_CORE_PERF_GLOBAL_CTRL,
>> +                                __get_vvmcs(vvmcs, GUEST_PERF_GLOBAL_CTRL), 
>> 1);
> 
> So one special MSR gets potentially denied writes to - how about
> all the other ones like PAT (visible above), EFER, etc? And once
> you send events for multiple ones - how would you know which
> ones were denied access to be the time to reach hvm_do_resume()?
> 
> All the same questions of course apply to nested SVM code too, just
> that there the problems are leass easily visible from looking at the
> patch.

"Regular" MSRs (the ones that go through hvm_msr_write_intercept()) are
sufficient in all the introspection use cases we've tested. The code
would of course have the problem you've mentioned if the code would be
modified to include PAT & friends, but for now this doesn't seem to be
an issue.

Should I add a comment, maybe in hvm_do_resume()?

>> @@ -1249,15 +1250,16 @@ static void load_vvmcs_host_state(struct vcpu *v)
>>          __vmwrite(vmcs_h2g_field[i].guest_field, r);
>>      }
>>  
>> -    hvm_set_cr0(__get_vvmcs(vvmcs, HOST_CR0));
>> -    hvm_set_cr4(__get_vvmcs(vvmcs, HOST_CR4));
>> -    hvm_set_cr3(__get_vvmcs(vvmcs, HOST_CR3));
>> +    hvm_set_cr0(__get_vvmcs(vvmcs, HOST_CR0), 1);
>> +    hvm_set_cr4(__get_vvmcs(vvmcs, HOST_CR4), 1);
>> +    hvm_set_cr3(__get_vvmcs(vvmcs, HOST_CR3), 1);
>>  
>>      control = __get_vvmcs(vvmcs, VM_EXIT_CONTROLS);
>>      if ( control & VM_EXIT_LOAD_HOST_PAT )
>>          hvm_set_guest_pat(v, __get_vvmcs(vvmcs, HOST_PAT));
>>      if ( control & VM_EXIT_LOAD_PERF_GLOBAL_CTRL )
>> -        hvm_msr_write_intercept(MSR_CORE_PERF_GLOBAL_CTRL, 
>> __get_vvmcs(vvmcs, HOST_PERF_GLOBAL_CTRL));
>> +        hvm_msr_write_intercept(MSR_CORE_PERF_GLOBAL_CTRL,
>> +                                __get_vvmcs(vvmcs, HOST_PERF_GLOBAL_CTRL), 
>> 1);
> 
> Considering these together with the above - do you really want/
> need to intercept and send events for both host and guest shadow
> state changes?

Guest MSRs should suffice indeed. I was just trying to be consistent,
but no, the host changes should not be necessary.

>> --- a/xen/arch/x86/mm/p2m.c
>> +++ b/xen/arch/x86/mm/p2m.c
>> @@ -1417,6 +1417,35 @@ static void p2m_vm_event_fill_regs(vm_event_request_t 
>> *req)
>>  void p2m_mem_access_emulate_check(struct vcpu *v,
>>                                    const vm_event_response_t *rsp)
>>  {
>> +    ASSERT(v->domain->arch.event_write_data != NULL);
>> +
>> +    if ( rsp->flags & MEM_ACCESS_DENY )
>> +    {
>> +        struct monitor_write_data *w =
>> +            &v->domain->arch.event_write_data[v->vcpu_id];
>> +
> 
> I think the ASSERT() above belongs here.

Ack.

>> --- a/xen/include/asm-x86/domain.h
>> +++ b/xen/include/asm-x86/domain.h
>> @@ -250,6 +250,21 @@ struct pv_domain
>>      struct mapcache_domain mapcache;
>>  };
>>  
>> +struct monitor_write_data {
>> +    struct {
>> +        uint8_t msr : 1;
>> +        uint8_t cr0 : 1;
>> +        uint8_t cr3 : 1;
>> +        uint8_t cr4 : 1;
> 
> unsigned int

Ack.

> 
>> +    } do_write;
>> +
>> +    uint64_t msr;
> 
> unsigned int or uint32_t

Ack.

> 
>> --- a/xen/include/public/vm_event.h
>> +++ b/xen/include/public/vm_event.h
>> @@ -158,6 +158,11 @@ struct vm_event_regs_x86 {
>>   * MEM_ACCESS_EMULATE_NOWRITE.
>>   */
>>  #define MEM_ACCESS_SET_EMUL_READ_DATA   (1 << 8)
>> + /*
>> +  * Deny completion of the operation that triggered the event.
>> +  * Currently only useful for MSR write events.
>> +  */
>> +#define MEM_ACCESS_DENY                 (1 << 9)
> 
> Same as for the description - isn't the comment, referring to only
> MSR writes, stale?

It is. Will update it.


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.