
Re: [Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.



> From: Yu Zhang [mailto:yu.c.zhang@xxxxxxxxxxxxxxx]
> Sent: Tuesday, March 21, 2017 10:53 AM
> 
> A new DMOP - XEN_DMOP_map_mem_type_to_ioreq_server, is added to let
> one ioreq server claim/disclaim its responsibility for the handling of guest
> pages with p2m type p2m_ioreq_server. Users of this DMOP can specify
> which kind of operation is supposed to be emulated in a parameter named
> flags. Currently, this DMOP only support the emulation of write operations.
> And it can be further extended to support the emulation of read ones if an
> ioreq server has such requirement in the future.

p2m_ioreq_server was already introduced earlier. Do you want to
give some background on the current state of that type? That would
help explain the purpose of this patch.

> 
> For now, we only support one ioreq server for this p2m type, so once an
> ioreq server has claimed its ownership, subsequent calls of the
> XEN_DMOP_map_mem_type_to_ioreq_server will fail. Users can also
> disclaim the ownership of guest ram pages with p2m_ioreq_server, by
> triggering this new DMOP, with ioreq server id set to the current owner's and
> flags parameter set to 0.
> 
> Note both XEN_DMOP_map_mem_type_to_ioreq_server and
> p2m_ioreq_server are only supported for HVMs with HAP enabled.
> 
> Also note that only after one ioreq server claims its ownership of
> p2m_ioreq_server, will the p2m type change to p2m_ioreq_server be
> allowed.
> 
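
For reference, the expected usage on the device model side would look
roughly like the sketch below. Only the hypervisor side is in this patch,
so the tools-side wrapper name (xendevicemodel_map_mem_type_to_ioreq_server)
and headers are assumptions for illustration; the same call both claims and
disclaims, distinguished only by the flags value:

    /* Sketch only: wrapper/header names are assumptions, not part of this patch. */
    #include <xendevicemodel.h>
    #include <xen/hvm/dm_op.h>
    #include <xen/hvm/hvm_op.h>

    /* Claim write emulation for all p2m_ioreq_server pages of the guest. */
    static int claim_write_emulation(xendevicemodel_handle *dmod,
                                     domid_t domid, ioservid_t id)
    {
        return xendevicemodel_map_mem_type_to_ioreq_server(
            dmod, domid, id, HVMMEM_ioreq_server,
            XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);
    }

    /* Disclaim ownership again by passing flags == 0. */
    static int disclaim_write_emulation(xendevicemodel_handle *dmod,
                                        domid_t domid, ioservid_t id)
    {
        return xendevicemodel_map_mem_type_to_ioreq_server(
            dmod, domid, id, HVMMEM_ioreq_server, 0);
    }
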
> Signed-off-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
> Signed-off-by: Yu Zhang <yu.c.zhang@xxxxxxxxxxxxxxx>
> Acked-by: Tim Deegan <tim@xxxxxxx>
> ---
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: Paul Durrant <paul.durrant@xxxxxxxxxx>
> Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> Cc: Jun Nakajima <jun.nakajima@xxxxxxxxx>
> Cc: Kevin Tian <kevin.tian@xxxxxxxxx>
> Cc: Tim Deegan <tim@xxxxxxx>
> 
> changes in v8:
>   - According to comments from Jan & Paul: comments changes in
> hvmemul_do_io().
>   - According to comments from Jan: remove the redundant code which
> would only
>     be useful for read emulations.
>   - According to comments from Jan: change interface which maps mem type
> to
>     ioreq server, removed uint16_t pad and added an uint64_t opaque.
>   - Address other comments from Jan, i.e. correct return values; remove stray
>     cast.
> 
> changes in v7:
>   - Use new ioreq server interface -
> XEN_DMOP_map_mem_type_to_ioreq_server.
>   - According to comments from George: removed domain_pause/unpause()
> in
>     hvm_map_mem_type_to_ioreq_server(), because it's too expensive,
>     and we can avoid the:
>     a> deadlock issue existed in v6 patch, between p2m lock and ioreq server
>        lock by using these locks in the same order - solved in patch 4;
>     b> for race condition between vm exit and ioreq server unbinding, we can
>        just retry this instruction.
>   - According to comments from Jan and George: continue to clarify logic in
>     hvmemul_do_io().
>   - According to comments from Jan: clarify comment in
> p2m_set_ioreq_server().
> 
> changes in v6:
>   - Clarify logic in hvmemul_do_io().
>   - Use recursive lock for ioreq server lock.
>   - Remove debug print when mapping ioreq server.
>   - Clarify code in ept_p2m_type_to_flags() for consistency.
>   - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
>   - Add comments for HVMMEM_ioreq_server to note only changes
>     to/from HVMMEM_ram_rw are permitted.
>   - Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server()
>     to avoid the race condition when a vm exit happens on a write-
>     protected page, just to find the ioreq server has been unmapped
>     already.
>   - Introduce a seperate patch to delay the release of p2m
>     lock to avoid the race condition.
>   - Introduce a seperate patch to handle the read-modify-write
>     operations on a write protected page.
> 
> changes in v5:
>   - Simplify logic in hvmemul_do_io().
>   - Use natual width types instead of fixed width types when possible.
>   - Do not grant executable permission for p2m_ioreq_server entries.
>   - Clarify comments and commit message.
>   - Introduce a seperate patch to recalculate the p2m types after
>     the ioreq server unmaps the p2m_ioreq_server.
> 
> changes in v4:
>   - According to Paul's advice, add comments around the definition
>     of HVMMEM_iore_server in hvm_op.h.
>   - According to Wei Liu's comments, change the format of the commit
>     message.
> 
> changes in v3:
>   - Only support write emulation in this patch;
>   - Remove the code to handle race condition in hvmemul_do_io(),
>   - No need to reset the p2m type after an ioreq server has disclaimed
>     its ownership of p2m_ioreq_server;
>   - Only allow p2m type change to p2m_ioreq_server after an ioreq
>     server has claimed its ownership of p2m_ioreq_server;
>   - Only allow p2m type change to p2m_ioreq_server from pages with type
>     p2m_ram_rw, and vice versa;
>   - HVMOP_map_mem_type_to_ioreq_server interface change - use uint16,
>     instead of enum to specify the memory type;
>   - Function prototype change to p2m_get_ioreq_server();
>   - Coding style changes;
>   - Commit message changes;
>   - Add Tim's Acked-by.
> 
> changes in v2:
>   - Only support HAP enabled HVMs;
>   - Replace p2m_mem_type_changed() with p2m_change_entry_type_global()
>     to reset the p2m type, when an ioreq server tries to claim/disclaim
>     its ownership of p2m_ioreq_server;
>   - Comments changes.
> ---
>  xen/arch/x86/hvm/dm.c            | 37 ++++++++++++++++++--
>  xen/arch/x86/hvm/emulate.c       | 65 ++++++++++++++++++++++++++++++++---
>  xen/arch/x86/hvm/ioreq.c         | 38 +++++++++++++++++++++
>  xen/arch/x86/mm/hap/nested_hap.c |  2 +-
>  xen/arch/x86/mm/p2m-ept.c        |  8 ++++-
>  xen/arch/x86/mm/p2m-pt.c         | 19 +++++++----
>  xen/arch/x86/mm/p2m.c            | 74 ++++++++++++++++++++++++++++++++++++++++
>  xen/arch/x86/mm/shadow/multi.c   |  3 +-
>  xen/include/asm-x86/hvm/ioreq.h  |  2 ++
>  xen/include/asm-x86/p2m.h        | 26 ++++++++++++--
>  xen/include/public/hvm/dm_op.h   | 28 +++++++++++++++
>  xen/include/public/hvm/hvm_op.h  |  8 ++++-
>  12 files changed, 290 insertions(+), 20 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c index
> 333c884..3f9484d 100644
> --- a/xen/arch/x86/hvm/dm.c
> +++ b/xen/arch/x86/hvm/dm.c
> @@ -173,9 +173,14 @@ static int modified_memory(struct domain *d,
> 
>  static bool allow_p2m_type_change(p2m_type_t old, p2m_type_t new)  {
> +    if ( new == p2m_ioreq_server )
> +        return old == p2m_ram_rw;
> +
> +    if ( old == p2m_ioreq_server )
> +        return new == p2m_ram_rw;
> +
>      return p2m_is_ram(old) ||
> -           (p2m_is_hole(old) && new == p2m_mmio_dm) ||
> -           (old == p2m_ioreq_server && new == p2m_ram_rw);
> +           (p2m_is_hole(old) && new == p2m_mmio_dm);
>  }
> 
>  static int set_mem_type(struct domain *d, @@ -202,6 +207,19 @@ static int
> set_mem_type(struct domain *d,
>           unlikely(data->mem_type == HVMMEM_unused) )
>          return -EINVAL;
> 
> +    if ( data->mem_type  == HVMMEM_ioreq_server )
> +    {
> +        unsigned int flags;
> +
> +        /* HVMMEM_ioreq_server is only supported for HAP enabled hvm. */
> +        if ( !hap_enabled(d) )
> +            return -EOPNOTSUPP;
> +
> +        /* Do not change to HVMMEM_ioreq_server if no ioreq server mapped.
> */
> +        if ( !p2m_get_ioreq_server(d, &flags) )
> +            return -EINVAL;
> +    }
> +
>      while ( iter < data->nr )
>      {
>          unsigned long pfn = data->first_pfn + iter; @@ -365,6 +383,21 @@
> static int dm_op(domid_t domid,
>          break;
>      }
> 
> +    case XEN_DMOP_map_mem_type_to_ioreq_server:
> +    {
> +        const struct xen_dm_op_map_mem_type_to_ioreq_server *data =
> +            &op.u.map_mem_type_to_ioreq_server;
> +
> +        rc = -EOPNOTSUPP;
> +        /* Only support for HAP enabled hvm. */

Isn't this obvious from the code?

> +        if ( !hap_enabled(d) )
> +            break;
> +
> +        rc = hvm_map_mem_type_to_ioreq_server(d, data->id,
> +                                              data->type, data->flags);
> +        break;
> +    }
> +
>      case XEN_DMOP_set_ioreq_server_state:
>      {
>          const struct xen_dm_op_set_ioreq_server_state *data = diff --git
> a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index
> f36d7c9..37139e6 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -99,6 +99,7 @@ static int hvmemul_do_io(
>      uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)  {
>      struct vcpu *curr = current;
> +    struct domain *currd = curr->domain;
>      struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
>      ioreq_t p = {
>          .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO, @@ -140,7
> +141,7 @@ static int hvmemul_do_io(
>               (p.dir != dir) ||
>               (p.df != df) ||
>               (p.data_is_ptr != data_is_addr) )
> -            domain_crash(curr->domain);
> +            domain_crash(currd);
> 
>          if ( data_is_addr )
>              return X86EMUL_UNHANDLEABLE; @@ -177,8 +178,64 @@ static int
> hvmemul_do_io(
>          break;
>      case X86EMUL_UNHANDLEABLE:
>      {
> -        struct hvm_ioreq_server *s =
> -            hvm_select_ioreq_server(curr->domain, &p);
> +        /*
> +         * Xen isn't emulating the instruction internally, so see if
> +         * there's an ioreq server that can handle it. Rules:
> +         *
> +         * - PIO and "normal" MMIO run through hvm_select_ioreq_server()

Why highlight "normal" here? What would an "abnormal" MMIO access
be here, one to a page of p2m_ioreq_server type?

> +         * to choose the ioreq server by range. If no server is found,
> +         * the access is ignored.
> +         *
> +         * - p2m_ioreq_server accesses are handled by the designated
> +         * ioreq_server for the domain, but there are some corner
> +         * cases:

Since only one case is listed, this should read "there is a corner case".

> +         *
> +         *   - If the domain ioreq_server is NULL, assume there is a
> +         *   race between the unbinding of ioreq server and guest fault
> +         *   so re-try the instruction.
> +         */
> +        struct hvm_ioreq_server *s = NULL;
> +        p2m_type_t p2mt = p2m_invalid;
> +
> +        if ( is_mmio )
> +        {
> +            unsigned long gmfn = paddr_to_pfn(addr);
> +
> +            get_gfn_query_unlocked(currd, gmfn, &p2mt);
> +
> +            if ( p2mt == p2m_ioreq_server )
> +            {
> +                unsigned int flags;
> +
> +                /*
> +                 * Value of s could be stale, when we lost a race

Better to describe it at a higher level, e.g. just "no ioreq server
is found".

What is the meaning of "lost a race"? Shouldn't it say "we likely
raced with..."?

> +                 * with dm_op which unmaps p2m_ioreq_server from the
> +                 * ioreq server. Yet there's no cheap way to avoid

Again, don't refer to the specific code; focus on the operation,
e.g. "a race with an unmap operation on the ioreq server".

> +                 * this, so device model need to do the check.
> +                 */

How is the above comment related to the line below?

> +                s = p2m_get_ioreq_server(currd, &flags);
> +
> +                /*
> +                 * If p2mt is ioreq_server but ioreq_server is NULL,

p2mt is definitely ioreq_server within this if condition.

> +                 * we probably lost a race with unbinding of ioreq
> +                 * server, just retry the access.
> +                 */

This looks redundant with the earlier comment. Or should the earlier
one simply be removed?

> +                if ( s == NULL )
> +                {
> +                    rc = X86EMUL_RETRY;
> +                    vio->io_req.state = STATE_IOREQ_NONE;
> +                    break;
> +                }
> +            }
> +        }
> +
> +        /*
> +         * Value of s could be stale, when we lost a race with dm_op
> +         * which unmaps this PIO/MMIO address from the ioreq server.
> +         * The device model side need to do the check.
> +         */

Another duplicated comment. Also, the code below is actually for the
'normal' MMIO case...

> +        if ( !s )
> +            s = hvm_select_ioreq_server(currd, &p);
> 
>          /* If there is no suitable backing DM, just ignore accesses */
>          if ( !s )
> @@ -189,7 +246,7 @@ static int hvmemul_do_io(
>          else
>          {
>              rc = hvm_send_ioreq(s, &p, 0);
> -            if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
> +            if ( rc != X86EMUL_RETRY || currd->is_shutting_down )
>                  vio->io_req.state = STATE_IOREQ_NONE;
>              else if ( data_is_addr )
>                  rc = X86EMUL_OKAY;
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c index
> ad2edad..746799f 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -753,6 +753,8 @@ int hvm_destroy_ioreq_server(struct domain *d,
> ioservid_t id)
> 
>          domain_pause(d);
> 
> +        p2m_destroy_ioreq_server(d, s);
> +
>          hvm_ioreq_server_disable(s, 0);
> 
>          list_del(&s->list_entry);
> @@ -914,6 +916,42 @@ int
> hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>      return rc;
>  }
> 
> +int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
> +                                     uint32_t type, uint32_t flags) {
> +    struct hvm_ioreq_server *s;
> +    int rc;
> +
> +    /* For now, only HVMMEM_ioreq_server is supported. */

obvious comment

> +    if ( type != HVMMEM_ioreq_server )
> +        return -EINVAL;
> +
> +    /* For now, only write emulation is supported. */

ditto. 

> +    if ( flags & ~(XEN_DMOP_IOREQ_MEM_ACCESS_WRITE) )
> +        return -EINVAL;
> +
> +    spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    rc = -ENOENT;
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
> +            continue;

Any reason why we cannot let the default server claim this
new type?

> +
> +        if ( s->id == id )
> +        {
> +            rc = p2m_set_ioreq_server(d, flags, s);
> +            break;
> +        }
> +    }
> +
> +    spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    return rc;
> +}
> +
>  int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
>                                 bool_t enabled)  { diff --git
> a/xen/arch/x86/mm/hap/nested_hap.c
> b/xen/arch/x86/mm/hap/nested_hap.c
> index 162afed..408ea7f 100644
> --- a/xen/arch/x86/mm/hap/nested_hap.c
> +++ b/xen/arch/x86/mm/hap/nested_hap.c
> @@ -172,7 +172,7 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m,
> paddr_t L1_gpa, paddr_t *L0_gpa,
>      if ( *p2mt == p2m_mmio_direct )
>          goto direct_mmio_out;
>      rc = NESTEDHVM_PAGEFAULT_MMIO;
> -    if ( *p2mt == p2m_mmio_dm )
> +    if ( *p2mt == p2m_mmio_dm || *p2mt == p2m_ioreq_server )
>          goto out;
> 
>      rc = NESTEDHVM_PAGEFAULT_L0_ERROR;
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index 568944f..cc1eb21 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -131,6 +131,13 @@ static void ept_p2m_type_to_flags(struct
> p2m_domain *p2m, ept_entry_t *entry,
>              entry->r = entry->w = entry->x = 1;
>              entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>              break;
> +        case p2m_ioreq_server:
> +            entry->r = 1;
> +            entry->w = !(p2m->ioreq.flags &
> XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);
> +            entry->x = 0;
> +            entry->a = !!cpu_has_vmx_ept_ad;
> +            entry->d = entry->w && entry->a;
> +            break;
>          case p2m_mmio_direct:
>              entry->r = entry->x = 1;
>              entry->w = !rangeset_contains_singleton(mmio_ro_ranges,
> @@ -170,7 +177,6 @@ static void ept_p2m_type_to_flags(struct
> p2m_domain *p2m, ept_entry_t *entry,
>              entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>              break;
>          case p2m_grant_map_ro:
> -        case p2m_ioreq_server:
>              entry->r = 1;
>              entry->w = entry->x = 0;
>              entry->a = !!cpu_has_vmx_ept_ad; diff --git
> a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c index
> 07e2ccd..f6c45ec 100644
> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -70,7 +70,9 @@ static const unsigned long pgt[] = {
>      PGT_l3_page_table
>  };
> 
> -static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
> +static unsigned long p2m_type_to_flags(const struct p2m_domain *p2m,
> +                                       p2m_type_t t,
> +                                       mfn_t mfn,
>                                         unsigned int level)  {
>      unsigned long flags;
> @@ -92,8 +94,12 @@ static unsigned long p2m_type_to_flags(p2m_type_t t,
> mfn_t mfn,
>      default:
>          return flags | _PAGE_NX_BIT;
>      case p2m_grant_map_ro:
> -    case p2m_ioreq_server:
>          return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT;
> +    case p2m_ioreq_server:
> +        flags |= P2M_BASE_FLAGS | _PAGE_RW | _PAGE_NX_BIT;
> +        if ( p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
> +            return flags & ~_PAGE_RW;
> +        return flags;
>      case p2m_ram_ro:
>      case p2m_ram_logdirty:
>      case p2m_ram_shared:
> @@ -440,7 +446,8 @@ static int do_recalc(struct p2m_domain *p2m,
> unsigned long gfn)
>              p2m_type_t p2mt = p2m_is_logdirty_range(p2m, gfn & mask, gfn |
> ~mask)
>                                ? p2m_ram_logdirty : p2m_ram_rw;
>              unsigned long mfn = l1e_get_pfn(e);
> -            unsigned long flags = p2m_type_to_flags(p2mt, _mfn(mfn), level);
> +            unsigned long flags = p2m_type_to_flags(p2m, p2mt,
> +                                                    _mfn(mfn), level);
> 
>              if ( level )
>              {
> @@ -578,7 +585,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
> unsigned long gfn, mfn_t mfn,
>          ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
>          l3e_content = mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt)
>              ? l3e_from_pfn(mfn_x(mfn),
> -                           p2m_type_to_flags(p2mt, mfn, 2) | _PAGE_PSE)
> +                           p2m_type_to_flags(p2m, p2mt, mfn, 2) |
> + _PAGE_PSE)
>              : l3e_empty();
>          entry_content.l1 = l3e_content.l3;
> 
> @@ -615,7 +622,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
> unsigned long gfn, mfn_t mfn,
> 
>          if ( mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt) )
>              entry_content = p2m_l1e_from_pfn(mfn_x(mfn),
> -                                             p2m_type_to_flags(p2mt, mfn, 
> 0));
> +                                         p2m_type_to_flags(p2m, p2mt,
> + mfn, 0));
>          else
>              entry_content = l1e_empty();
> 
> @@ -652,7 +659,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
> unsigned long gfn, mfn_t mfn,
>          ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
>          if ( mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt) )
>              l2e_content = l2e_from_pfn(mfn_x(mfn),
> -                                       p2m_type_to_flags(p2mt, mfn, 1) |
> +                                       p2m_type_to_flags(p2m, p2mt,
> + mfn, 1) |
>                                         _PAGE_PSE);
>          else
>              l2e_content = l2e_empty();
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index
> a5651a3..dd4e477 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -82,6 +82,8 @@ static int p2m_initialise(struct domain *d, struct
> p2m_domain *p2m)
>      else
>          p2m_pt_init(p2m);
> 
> +    spin_lock_init(&p2m->ioreq.lock);
> +
>      return ret;
>  }
> 
> @@ -286,6 +288,78 @@ void p2m_memory_type_changed(struct domain *d)
>      }
>  }
> 
> +int p2m_set_ioreq_server(struct domain *d,
> +                         unsigned int flags,
> +                         struct hvm_ioreq_server *s) {
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +    int rc;
> +
> +    /*
> +     * Use lock to prevent concurrent setting attempts
> +     * from multiple ioreq serers.

serers -> servers

> +     */
> +    spin_lock(&p2m->ioreq.lock);
> +
> +    /* Unmap ioreq server from p2m type by passing flags with 0. */
> +    if ( flags == 0 )
> +    {
> +        rc = -EINVAL;
> +        if ( p2m->ioreq.server != s )
> +            goto out;
> +
> +        p2m->ioreq.server = NULL;
> +        p2m->ioreq.flags = 0;
> +    }
> +    else
> +    {
> +        rc = -EBUSY;
> +        if ( p2m->ioreq.server != NULL )
> +            goto out;
> +
> +        p2m->ioreq.server = s;
> +        p2m->ioreq.flags = flags;
> +    }
> +
> +    rc = 0;
> +
> + out:
> +    spin_unlock(&p2m->ioreq.lock);
> +
> +    return rc;
> +}
> +
> +struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
> +                                              unsigned int *flags) {
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +    struct hvm_ioreq_server *s;
> +
> +    spin_lock(&p2m->ioreq.lock);
> +
> +    s = p2m->ioreq.server;
> +    *flags = p2m->ioreq.flags;
> +
> +    spin_unlock(&p2m->ioreq.lock);
> +    return s;
> +}
> +
> +void p2m_destroy_ioreq_server(const struct domain *d,
> +                              const struct hvm_ioreq_server *s) {
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +
> +    spin_lock(&p2m->ioreq.lock);
> +
> +    if ( p2m->ioreq.server == s )
> +    {
> +        p2m->ioreq.server = NULL;
> +        p2m->ioreq.flags = 0;
> +    }
> +
> +    spin_unlock(&p2m->ioreq.lock);
> +}
> +
>  void p2m_enable_hardware_log_dirty(struct domain *d)  {
>      struct p2m_domain *p2m = p2m_get_hostp2m(d); diff --git
> a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
> index 7ea9d81..521b639 100644
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3269,8 +3269,7 @@ static int sh_page_fault(struct vcpu *v,
>      }
> 
>      /* Need to hand off device-model MMIO to the device model */
> -    if ( p2mt == p2m_mmio_dm
> -         || (p2mt == p2m_ioreq_server && ft == ft_demand_write) )
> +    if ( p2mt == p2m_mmio_dm )
>      {
>          gpa = guest_walk_to_gpa(&gw);
>          goto mmio;
> diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-
> x86/hvm/ioreq.h index fbf2c74..b43667a 100644
> --- a/xen/include/asm-x86/hvm/ioreq.h
> +++ b/xen/include/asm-x86/hvm/ioreq.h
> @@ -37,6 +37,8 @@ int hvm_map_io_range_to_ioreq_server(struct domain
> *d, ioservid_t id,  int hvm_unmap_io_range_from_ioreq_server(struct
> domain *d, ioservid_t id,
>                                           uint32_t type, uint64_t start,
>                                           uint64_t end);
> +int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
> +                                     uint32_t type, uint32_t flags);
>  int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
>                                 bool_t enabled);
> 
> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h index
> 470d29d..3786680 100644
> --- a/xen/include/asm-x86/p2m.h
> +++ b/xen/include/asm-x86/p2m.h
> @@ -89,7 +89,8 @@ typedef unsigned int p2m_query_t;
>                         | p2m_to_mask(p2m_ram_paging_out)      \
>                         | p2m_to_mask(p2m_ram_paged)           \
>                         | p2m_to_mask(p2m_ram_paging_in)       \
> -                       | p2m_to_mask(p2m_ram_shared))
> +                       | p2m_to_mask(p2m_ram_shared)          \
> +                       | p2m_to_mask(p2m_ioreq_server))
> 
>  /* Types that represent a physmap hole that is ok to replace with a shared
>   * entry */
> @@ -111,8 +112,7 @@ typedef unsigned int p2m_query_t;
>  #define P2M_RO_TYPES (p2m_to_mask(p2m_ram_logdirty)     \
>                        | p2m_to_mask(p2m_ram_ro)         \
>                        | p2m_to_mask(p2m_grant_map_ro)   \
> -                      | p2m_to_mask(p2m_ram_shared)     \
> -                      | p2m_to_mask(p2m_ioreq_server))
> +                      | p2m_to_mask(p2m_ram_shared))
> 
>  /* Write-discard types, which should discard the write operations */
>  #define P2M_DISCARD_WRITE_TYPES (p2m_to_mask(p2m_ram_ro)     \
> @@ -336,6 +336,20 @@ struct p2m_domain {
>          struct ept_data ept;
>          /* NPT-equivalent structure could be added here. */
>      };
> +
> +     struct {
> +         spinlock_t lock;
> +         /*
> +          * ioreq server who's responsible for the emulation of
> +          * gfns with specific p2m type(for now, p2m_ioreq_server).
> +          */
> +         struct hvm_ioreq_server *server;
> +         /*
> +          * flags specifies whether read, write or both operations
> +          * are to be emulated by an ioreq server.
> +          */
> +         unsigned int flags;
> +     } ioreq;
>  };
> 
>  /* get host p2m table */
> @@ -827,6 +841,12 @@ static inline unsigned int
> p2m_get_iommu_flags(p2m_type_t p2mt, mfn_t mfn)
>      return flags;
>  }
> 
> +int p2m_set_ioreq_server(struct domain *d, unsigned int flags,
> +                         struct hvm_ioreq_server *s); struct
> +hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
> +                                              unsigned int *flags);
> +void p2m_destroy_ioreq_server(const struct domain *d, const struct
> +hvm_ioreq_server *s);
> +
>  #endif /* _XEN_ASM_X86_P2M_H */
> 
>  /*
> diff --git a/xen/include/public/hvm/dm_op.h
> b/xen/include/public/hvm/dm_op.h index f54cece..2a36833 100644
> --- a/xen/include/public/hvm/dm_op.h
> +++ b/xen/include/public/hvm/dm_op.h
> @@ -318,6 +318,32 @@ struct xen_dm_op_inject_msi {
>      uint64_aligned_t addr;
>  };
> 
> +/*
> + * XEN_DMOP_map_mem_type_to_ioreq_server : map or unmap the
> IOREQ Server <id>
> + *                                      to specific memroy type <type>

memroy -> memory

> + *                                      for specific accesses <flags>
> + *
> + * For now, flags only accept the value of
> +XEN_DMOP_IOREQ_MEM_ACCESS_WRITE,
> + * which means only write operations are to be forwarded to an ioreq
> server.
> + * Support for the emulation of read operations can be added when an
> +ioreq
> + * server has such requirement in future.
> + */
> +#define XEN_DMOP_map_mem_type_to_ioreq_server 15
> +
> +struct xen_dm_op_map_mem_type_to_ioreq_server {
> +    ioservid_t id;      /* IN - ioreq server id */
> +    uint16_t type;      /* IN - memory type */
> +    uint32_t flags;     /* IN - types of accesses to be forwarded to the
> +                           ioreq server. flags with 0 means to unmap the
> +                           ioreq server */
> +
> +#define XEN_DMOP_IOREQ_MEM_ACCESS_READ (1u << 0) #define
> +XEN_DMOP_IOREQ_MEM_ACCESS_WRITE (1u << 1)
> +
> +    uint64_t opaque;    /* IN/OUT - only used for hypercall continuation,
> +                           has to be set to zero by the caller */ };
> +
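
As an illustration of the interface above, a minimal sketch of populating
the request; the surrounding helper and the way the buffer is then handed
to the dm_op hypercall are assumptions, not part of this header:

    /* Illustrative sketch; not part of the patch under review. */
    #include <string.h>
    #include <xen/hvm/dm_op.h>
    #include <xen/hvm/hvm_op.h>

    /* server_id stands for an id previously obtained via
     * XEN_DMOP_create_ioreq_server. */
    static void build_map_mem_type_op(struct xen_dm_op *op, ioservid_t server_id)
    {
        memset(op, 0, sizeof(*op));   /* also zeroes pad and opaque */
        op->op = XEN_DMOP_map_mem_type_to_ioreq_server;
        op->u.map_mem_type_to_ioreq_server.id = server_id;
        op->u.map_mem_type_to_ioreq_server.type = HVMMEM_ioreq_server;
        /* Only write emulation is supported; flags == 0 would unmap. */
        op->u.map_mem_type_to_ioreq_server.flags =
            XEN_DMOP_IOREQ_MEM_ACCESS_WRITE;
    }
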
>  struct xen_dm_op {
>      uint32_t op;
>      uint32_t pad;
> @@ -336,6 +362,8 @@ struct xen_dm_op {
>          struct xen_dm_op_set_mem_type set_mem_type;
>          struct xen_dm_op_inject_event inject_event;
>          struct xen_dm_op_inject_msi inject_msi;
> +        struct xen_dm_op_map_mem_type_to_ioreq_server
> +                map_mem_type_to_ioreq_server;
>      } u;
>  };
> 
> diff --git a/xen/include/public/hvm/hvm_op.h
> b/xen/include/public/hvm/hvm_op.h index bc00ef0..0bdafdf 100644
> --- a/xen/include/public/hvm/hvm_op.h
> +++ b/xen/include/public/hvm/hvm_op.h
> @@ -93,7 +93,13 @@ typedef enum {
>      HVMMEM_unused,             /* Placeholder; setting memory to this type
>                                    will fail for code after 4.7.0 */  #endif
> -    HVMMEM_ioreq_server
> +    HVMMEM_ioreq_server        /* Memory type claimed by an ioreq server;
> type
> +                                  changes to this value are only allowed 
> after
> +                                  an ioreq server has claimed its ownership.
> +                                  Only pages with HVMMEM_ram_rw are allowed 
> to
> +                                  change to this type; conversely, pages with
> +                                  this type are only allowed to be changed 
> back
> +                                  to HVMMEM_ram_rw. */
>  } hvmmem_type_t;
> 
>  /* Hint from PV drivers for pagetable destruction. */
> --
> 1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

