[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V5 07/12] xen: Introduce monitor_op domctl



On 13/02/15 16:33, Tamas K Lengyel wrote:

> In preparation for allowing for introspecting ARM and PV domains the old
> control interface via the hvm_op hypercall is retired. A new control mechanism
> is introduced via the domctl hypercall: monitor_op.
>
> This patch aims to establish a base API on which future applications can build
> on.
>
> Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Signed-off-by: Tamas K Lengyel <tamas.lengyel@xxxxxxxxxxxx>
> Acked-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
> Acked-by: Kevin Tian <kevin.tian@xxxxxxxxx>
> Acked-by: Daniel De Graaf <dgdegra@xxxxxxxxxxxxx>

This is a very good step in the right direction!

> diff --git a/tools/libxc/xc_monitor.c b/tools/libxc/xc_monitor.c
> new file mode 100644
> index 0000000..9e807d1
> --- /dev/null
> +++ b/tools/libxc/xc_monitor.c
> @@ -0,0 +1,118 @@
> +/******************************************************************************
> + *
> + * xc_monitor.c
> + *
> + * Interface to VM event monitor
> + *
> + * Copyright (c) 2015 Tamas K Lengyel (tamas@xxxxxxxxxxxxx)
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301 
>  USA
> + */
> +
> +#include "xc_private.h"
> +
> +int xc_monitor_mov_to_cr0(xc_interface *xch, domid_t domain_id,
> +                          unsigned int op, unsigned int sync,
> +                          unsigned int onchangeonly)

As an API review, I would suggest changing these all to bool (as they
are indeed booleans), and rename 'op' to 'enable' (or equivalent) to
avoid giving the impression that it might accept an arbitrary subop of
"monitor_move_to_cr0".

> diff --git a/tools/tests/xen-access/xen-access.c 
> b/tools/tests/xen-access/xen-access.c
> index d951703..d52b175 100644
> --- a/tools/tests/xen-access/xen-access.c
> +++ b/tools/tests/xen-access/xen-access.c
> @@ -432,13 +432,13 @@ int main(int argc, char *argv[])
>      }
>  
>      if ( int3 )
> -        rc = xc_hvm_param_set(xch, domain_id, HVM_PARAM_MEMORY_EVENT_INT3, 
> HVMPME_mode_sync);
> -    else
> -        rc = xc_hvm_param_set(xch, domain_id, HVM_PARAM_MEMORY_EVENT_INT3, 
> HVMPME_mode_disabled);
> -    if ( rc < 0 )
>      {
> -        ERROR("Error %d setting int3 vm_event\n", rc);
> -        goto exit;
> +        rc = xc_monitor_software_breakpoint(xch, domain_id, 1);
> +        if ( rc < 0 )
> +        {
> +            ERROR("Error %d setting int3 vm_event\n", rc);

s/int3/breakpoint/

> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 2c4d0ff..4a05fb2 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -5785,23 +5785,6 @@ long do_hvm_op(unsigned long op, 
> XEN_GUEST_HANDLE_PARAM(void) arg)
>              case HVM_PARAM_ACPI_IOPORTS_LOCATION:
>                  rc = pmtimer_change_ioport(d, a.value);
>                  break;
> -            case HVM_PARAM_MEMORY_EVENT_CR0:
> -            case HVM_PARAM_MEMORY_EVENT_CR3:
> -            case HVM_PARAM_MEMORY_EVENT_CR4:
> -                if ( d == current->domain )
> -                    rc = -EPERM;
> -                break;
> -            case HVM_PARAM_MEMORY_EVENT_INT3:
> -            case HVM_PARAM_MEMORY_EVENT_SINGLE_STEP:
> -            case HVM_PARAM_MEMORY_EVENT_MSR:
> -                if ( d == current->domain )
> -                {
> -                    rc = -EPERM;
> -                    break;
> -                }
> -                if ( a.value & HVMPME_onchangeonly )
> -                    rc = -EINVAL;
> -                break;

These case statements should remain, and unconditionally fail with with
EPERM for guests (to maintain interface consistency), and fail with
something more appropriate for toolstacks (EOPNOTSUPP, ENXIO?)

Currently, removing these case statements will cause the HVM params to
become blanket read/write to everyone, including the guest.  (I should
really get around to fixing this.)

> diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c
> new file mode 100644
> index 0000000..be4bb20
> --- /dev/null
> +++ b/xen/arch/x86/monitor.c
> @@ -0,0 +1,210 @@
> +/*
> + * arch/x86/monitor.c
> + *
> + * Architecture-specific monitor_op domctl handler.
> + *
> + * Copyright (c) 2015 Tamas K Lengyel (tamas@xxxxxxxxxxxxx)
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public
> + * License v2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public
> + * License along with this program; if not, write to the
> + * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
> + * Boston, MA 021110-1307, USA.
> + */
> +
> +#include <xen/config.h>
> +#include <xen/sched.h>
> +#include <xen/mm.h>
> +#include <asm/domain.h>
> +
> +#define DISABLE_OPTION(option)              \
> +    do {                                    \
> +        if ( !option->enabled )             \
> +            return -EFAULT;                 \

EFAULT is very specifically used for signalling a bad virtual address. 
It is not an appropriate error code to use here.

Furthermore, hiding a conditional return in a macro is very antisocial.

What you should be probably be using is:

static void alter_option(struct domain *d, bool_t * opt_ptr, bool_t enable)
{
    domain_pause(d);
    *opt_ptr = enable;
    domain_unpause(d);
}

instead of these macros.


> +        domain_pause(d);                    \
> +        option->enabled = 0;                \
> +        domain_unpause(d);                  \
> +    } while (0)
> +
> +#define ENABLE_OPTION(option)               \
> +    do {                                    \
> +        domain_pause(d);                    \
> +        option->enabled = 1;                \
> +        domain_unpause(d);                  \
> +    } while (0)
> +
> +int monitor_domctl(struct xen_domctl_monitor_op *domctl, struct domain *d)

Convention would dictate that struct domain is the first parameter. 
Also, the monitor op pointer should probably be named 'mop' or something
else more specific than domctl.

> +{
> +    /*
> +     * At the moment only Intel HVM domains are supported. However, event
> +     * delivery could be extended to AMD and PV domains. See comments below.
> +     */
> +    if ( !is_hvm_domain(d) || !cpu_has_vmx)

Space before the closing bracket.

> +        return -ENOSYS;

EOPNOTSUPP is preferred for circumstantial restrictions like this,
especially for ones we hope to relax.

> +
> +    if ( domctl->op != XEN_DOMCTL_MONITOR_OP_ENABLE &&
> +         domctl->op != XEN_DOMCTL_MONITOR_OP_DISABLE )
> +        return -EFAULT;

ENOSYS is the most appropriate for "no such subop".

> +
> +    switch ( domctl->subop )
> +    {
> +    case XEN_DOMCTL_MONITOR_SUBOP_MOV_TO_CR0:
> +    {
> +        /* Note: could be supported on PV domains. */

Every single case in this switch could be supported on PV domains.  I
wouldn't call some of them out specifically.

> +        struct mov_to_cr *options = &d->arch.monitor_options.mov_to_cr0;
> +
> +        if ( domctl->op == XEN_DOMCTL_MONITOR_OP_ENABLE )
> +        {
> +            if ( options->enabled )
> +                return -EBUSY;
> +
> +            options->sync = domctl->u.mov_to_cr.sync;
> +            options->onchangeonly = domctl->u.mov_to_cr.onchangeonly;
> +            ENABLE_OPTION(options);
> +        }
> +        else
> +        {
> +            DISABLE_OPTION(options);
> +        }
> +        break;

All of these case statements follow a similar pattern.  Can I suggest
something along the following lines

At the top:

bool_t enable = (mop->op == XEN_DOMCTL_MONITOR_OP_ENABLE);

And in each case:

struct $FOO *options = &d->$BAR;

if ( !(options->enable ^ enable) )
    return -EEXIST;

if ( enable )
    $BAZ

alter_option(d, &options->enable, enable);


It is a good idea to fail any attempt to redundantly enable/disable
monitoring options, as it is a sign of a toolstack bug or two toolstack
components interfering with each other.

> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> index 85f05a5..ae56cc4 100644
> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -241,6 +241,24 @@ struct time_scale {
>      u32 mul_frac;
>  };
>  
> +/************************************************/
> +/*            monitor event options             */
> +/************************************************/
> +struct mov_to_cr {
> +    uint8_t enabled;
> +    uint8_t sync;
> +    uint8_t onchangeonly;
> +};
> +
> +struct mov_to_msr {
> +    uint8_t enabled;
> +    uint8_t extended_capture;
> +};
> +
> +struct debug_event {
> +    uint8_t enabled;
> +};

I believe that all of these should be bool_t's instead.

> +
>  struct pv_domain
>  {
>      l1_pgentry_t **gdt_ldt_l1tab;
> @@ -335,6 +353,16 @@ struct arch_domain
>      unsigned long pirq_eoi_map_mfn;
>  
>      unsigned int psr_rmid; /* RMID assigned to the domain for CMT */
> +
> +    /* Monitor options */
> +    struct {
> +        struct mov_to_cr mov_to_cr0;
> +        struct mov_to_cr mov_to_cr3;
> +        struct mov_to_cr mov_to_cr4;
> +        struct mov_to_msr mov_to_msr;
> +        struct debug_event singlestep;
> +        struct debug_event software_breakpoint;
> +    } monitor_options;

I would name this struct simply 'monitor'.  {d,v}->arch.$FOO.$BAR.BAZ
lines are already long enough.

>  } __cacheline_aligned;
>  
>  #define has_arch_pdevs(d)    (!list_empty(&(d)->arch.pdev_list))
>
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 9d4972a..0242914 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -804,7 +804,6 @@ struct xen_domctl_gdbsx_domstatus {
>  
>  #define XEN_VM_EVENT_MONITOR_ENABLE                           0
>  #define XEN_VM_EVENT_MONITOR_DISABLE                          1
> -#define XEN_VM_EVENT_MONITOR_ENABLE_INTROSPECTION             2
>  
>  /*
>   * Sharing ENOMEM helper.
> @@ -1002,6 +1001,53 @@ struct xen_domctl_psr_cmt_op {
>  typedef struct xen_domctl_psr_cmt_op xen_domctl_psr_cmt_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cmt_op_t);
>  
> +/*  XEN_DOMCTL_MONITOR_*
> + *
> + * Enable/disable monitoring various VM events.
> + * This domctl configures what events will be reported to helper apps
> + * via the ring buffer "MONITOR". The ring has to be first enabled
> + * with the domctl XEN_DOMCTL_VM_EVENT_OP_MONITOR.
> + *
> + * NOTICE: mem_access events are also delivered via the "MONITOR" ring 
> buffer;
> + * however, enabling/disabling those events is performed with the use of
> + * memory_op hypercalls!
> + */
> +#define XEN_DOMCTL_MONITOR_OP_ENABLE   0
> +#define XEN_DOMCTL_MONITOR_OP_DISABLE  1
> +
> +#define XEN_DOMCTL_MONITOR_SUBOP_MOV_TO_CR0            0
> +#define XEN_DOMCTL_MONITOR_SUBOP_MOV_TO_CR3            1
> +#define XEN_DOMCTL_MONITOR_SUBOP_MOV_TO_CR4            2
> +#define XEN_DOMCTL_MONITOR_SUBOP_MOV_TO_MSR            3
> +#define XEN_DOMCTL_MONITOR_SUBOP_SINGLESTEP            4
> +#define XEN_DOMCTL_MONITOR_SUBOP_SOFTWARE_BREAKPOINT   5
> +
> +struct xen_domctl_monitor_op {
> +    uint32_t op; /* XEN_DOMCTL_MONITOR_OP_* */
> +    uint32_t subop; /* XEN_DOMCTL_MONITOR_SUBOP_* */

These are not really subops.  I would s/subop/event/ everywhere.

> +
> +    /*
> +     * Further options when issuing XEN_DOMCTL_MONITOR_OP_ENABLE.
> +     */
> +    union {
> +        struct {
> +            /* Pause vCPU until response */
> +            uint8_t sync;
> +            /* Send event only on a change of value */
> +            uint8_t onchangeonly;
> +            uint8_t _pad[6];

You don't need to explicitly pad.  This struct itself lives inside a
padded struct domctl union, and is safe to change under the safety of
the DOMCTL interface version.

> +        } mov_to_cr;
> +
> +        struct {
> +            /* Enable the capture of an extended set of MSRs */
> +            uint8_t extended_capture;
> +            uint8_t _pad[7];
> +        } mov_to_msr;
> +    } u;
> +};
> +typedef struct xen_domctl__op xen_domctl_monitor_op_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_monitor_op_t);
> +
>  struct xen_domctl {
>      uint32_t cmd;
>  #define XEN_DOMCTL_createdomain                   1
> @@ -1077,6 +1123,7 @@ struct xen_domctl {
>  #define XEN_DOMCTL_setvnumainfo                  74
>  #define XEN_DOMCTL_psr_cmt_op                    75
>  #define XEN_DOMCTL_arm_configure_domain          76
> +#define XEN_DOMCTL_monitor_op                    77
>  #define XEN_DOMCTL_gdbsx_guestmemio            1000
>  #define XEN_DOMCTL_gdbsx_pausevcpu             1001
>  #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
> @@ -1142,6 +1189,7 @@ struct xen_domctl {
>          struct xen_domctl_gdbsx_domstatus   gdbsx_domstatus;
>          struct xen_domctl_vnuma             vnuma;
>          struct xen_domctl_psr_cmt_op        psr_cmt_op;
> +        struct xen_domctl_monitor_op        monitor_op;
>          uint8_t                             pad[128];
>      } u;
>  };
> diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
> index 6efcc0b..5de6a4b 100644
> --- a/xen/include/public/hvm/params.h
> +++ b/xen/include/public/hvm/params.h
> @@ -162,21 +162,6 @@
>   */
>  #define HVM_PARAM_ACPI_IOPORTS_LOCATION 19
>  
> -/* Enable blocking memory events, async or sync (pause vcpu until response) 
> - * onchangeonly indicates messages only on a change of value */
> -#define HVM_PARAM_MEMORY_EVENT_CR0          20
> -#define HVM_PARAM_MEMORY_EVENT_CR3          21
> -#define HVM_PARAM_MEMORY_EVENT_CR4          22
> -#define HVM_PARAM_MEMORY_EVENT_INT3         23
> -#define HVM_PARAM_MEMORY_EVENT_SINGLE_STEP  25
> -#define HVM_PARAM_MEMORY_EVENT_MSR          30
> -
> -#define HVMPME_MODE_MASK       (3 << 0)
> -#define HVMPME_mode_disabled   0
> -#define HVMPME_mode_async      1
> -#define HVMPME_mode_sync       2
> -#define HVMPME_onchangeonly    (1 << 2)
> -

I think these params need to stay (to avoid their accidental reuse) and
be identified explicitly as deprecated.

~Andrew

>  /* Boolean: Enable nestedhvm (hvm only) */
>  #define HVM_PARAM_NESTEDHVM    24
>  


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.