
[Xen-devel] abstract model of IOMMU unmaping/mapping failures



All,

Here is a summary of my investigation of the abstract model:

The following policies are adopted when deciding whether to roll back a call 
chain:

1. A non-hardware domain is crashed immediately within iommu_{,un}map_page(), 
with the failure treated as fatal. Whether the call chain still needs to roll 
back then depends on the hardware domain's needs;

2. For the hardware domain, roll back on a best-effort basis. When rollback is 
not feasible (in the early initialization phase, or as a trade-off against 
complexity), at least unmap the pages already mapped upon a map error, and 
then print an error message;
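The hardware-domain policy in (2) can be sketched as below. This is a minimal, self-contained illustration, not Xen code: iommu_map_page()/iommu_unmap_page() here are hypothetical single-argument stand-ins for the real interfaces, and hwdom_map_range(), fail_at and mapped[] are invented for the sketch.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the real Xen interfaces: map/unmap one
 * pfn for the hardware domain; 0 on success, -errno on failure. */
static int fail_at = -1;      /* pfn at which to inject a map failure */
static int mapped[16];        /* which pfns are currently mapped */

static int iommu_map_page(unsigned long pfn)
{
    if ( (int)pfn == fail_at )
        return -14;           /* -EFAULT */
    mapped[pfn] = 1;
    return 0;
}

static void iommu_unmap_page(unsigned long pfn)
{
    mapped[pfn] = 0;
}

/*
 * Best-effort policy for the hardware domain: on a map failure, unmap
 * whatever was mapped so far, print an error, and return the error
 * instead of crashing the domain.
 */
static int hwdom_map_range(unsigned long spfn, unsigned long epfn)
{
    unsigned long i;
    int rc = 0;

    for ( i = spfn; i < epfn; i++ )
    {
        rc = iommu_map_page(i);
        if ( rc )
        {
            fprintf(stderr, "IOMMU map failed at pfn %lu (rc %d)\n", i, rc);
            while ( i-- > spfn )  /* unwind the pages mapped so far */
                iommu_unmap_page(i);
            break;
        }
    }
    return rc;
}
```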

Below is a detailed analysis of all the existing callers of the IOMMU 
interfaces (items 8-11 need more discussion):

----
1. memory_add(): rollback (no change)

(Existing code)
   >>...
    if ( ret )
        goto destroy_m2p;

    if ( iommu_enabled && !iommu_passthrough && !need_iommu(hardware_domain) )
    {
        for ( i = spfn; i < epfn; i++ )
            if ( iommu_map_page(hardware_domain, i, i,
                                IOMMUF_readable|IOMMUF_writable) )
                break;
        if ( i != epfn )
        {
            while ( i-- > old_max )
                iommu_unmap_page(hardware_domain, i);
            goto destroy_m2p;
        }
    }
   <<...
The existing code already rolls back cleanly through destroy_m2p.

----
2. vtd_set_hwdom_mapping(): no rollback (no change).

It is in the early initialization phase, and fully rolling back this path 
looks tricky, so I propose to leave it as it is.

----
3. __gnttab_map_grant_ref(): rollback (no change)


(Existing code)
        >>...
        if ( (act_pin & (GNTPIN_hstw_mask|GNTPIN_devw_mask)) &&
             !(old_pin & (GNTPIN_hstw_mask|GNTPIN_devw_mask)) )
        {
            if ( !(kind & MAPKIND_WRITE) )
                err = iommu_map_page(ld, frame, frame,
                                     IOMMUF_readable|IOMMUF_writable);
        }
        else if ( act_pin && !old_pin )
        {
            if ( !kind )
                err = iommu_map_page(ld, frame, frame, IOMMUF_readable);
        }
        if ( err )
        {
            double_gt_unlock(lgt, rgt);
            rc = GNTST_general_error;
            goto undo_out;
        }
        <<...

Rollback is already handled cleanly.

----
4. __gnttab_unmap_common(): rollback (no change)

(Existing code)
       >>...
        if ( !kind )
            err = iommu_unmap_page(ld, op->frame);
        else if ( !(kind & MAPKIND_WRITE) )
            err = iommu_map_page(ld, op->frame, op->frame, IOMMUF_readable);

        double_gt_unlock(lgt, rgt);

        if ( err )
            rc = GNTST_general_error;
       <<...

Similarly, no change is required, as the error is already handled correctly.

----
5. guest_physmap_add_entry(): rollback (no change)

(Existing code)
   >>...
    rc = iommu_map_page(
        d, mfn + i, mfn + i, IOMMUF_readable|IOMMUF_writable);
    if ( rc != 0 )
    {
        while ( i-- > 0 )
            iommu_unmap_page(d, mfn + i);
        return rc;
    }
   <<...

The above is at the start of the function, so there is no additional state to 
clear; it is in good shape too.

----
6. clear_identity_p2m_entry(): rollback (no change).

(Existing code)
  >>...
    if ( !paging_mode_translate(d) )
    {
        if ( !need_iommu(d) )
            return 0;
        return iommu_unmap_page(d, gfn);
    }
  <<...

This is at the start of the function, and the error is already propagated to 
the caller.

----
7. set_identity_p2m_entry(): rollback (no change).

(Existing code)
  >>...
    if ( !paging_mode_translate(p2m->domain) )
    {
        if ( !need_iommu(d) )
            return 0;
        return iommu_map_page(d, gfn, gfn, IOMMUF_readable|IOMMUF_writable);
    }
  <<...
  >>...
    if ( is_hardware_domain(d) && !iommu_use_hap_pt(d) )
        ret = iommu_map_page(d, gfn, gfn, IOMMUF_readable|IOMMUF_writable);
  <<...

Error handling and rollback are already in place.

----
8. p2m_remove_page(): rollback one level (change required).

(Existing code)
  >>...
    if ( !paging_mode_translate(p2m->domain) )
    {
        if ( need_iommu(p2m->domain) )
            for ( i = 0; i < (1 << page_order); i++ )
                iommu_unmap_page(p2m->domain, mfn + i);
        return 0;
    }
  <<...

There is no error checking of iommu_unmap_page(), which we need to add.
However, several callers of this function do not handle errors at all, and I 
am not sure whether the hardware domain will use this function.
Since it is in core p2m logic (which I don't know well), I hope we can print 
a warning and simply return the error here (given that non-hardware domains 
are all crashed immediately within the unmap call).
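The proposed "warn and return the error" shape could look like the following. This is only a sketch with stand-ins: iommu_unmap_page() is a hypothetical one-argument stub, and remove_range(), unmap_fail_at and unmap_calls are invented for illustration; the real loop would of course keep the p2m->domain and mfn arguments.

```c
#include <stdio.h>

/* Hypothetical stub: unmap one mfn; 0 on success, -errno on failure. */
static int unmap_fail_at = -1;  /* mfn at which to inject a failure */
static int unmap_calls;

static int iommu_unmap_page(unsigned long mfn)
{
    unmap_calls++;
    return ((int)mfn == unmap_fail_at) ? -14 /* -EFAULT */ : 0;
}

/*
 * Proposed shape for the p2m_remove_page() loop: keep unmapping the
 * whole range, remember the first failure, warn, and return it so the
 * callers can react (non-hardware domains are crashed inside the
 * unmap call itself under policy 1 anyway).
 */
static int remove_range(unsigned long mfn, unsigned int page_order)
{
    unsigned long i;
    int rc = 0, ret;

    for ( i = 0; i < (1UL << page_order); i++ )
    {
        ret = iommu_unmap_page(mfn + i);
        if ( ret && !rc )
            rc = ret;           /* remember only the first error */
    }
    if ( rc )
        fprintf(stderr, "IOMMU unmap of mfn %lu, order %u failed (%d)\n",
                mfn, page_order, rc);
    return rc;
}
```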

----
9. p2m_pt_set_entry(): open (propose no-rollback).

(Existing code)
  >>...
            for ( i = 0; i < (1UL << page_order); i++ )
                iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
                               iommu_pte_flags);
        else
            for ( i = 0; i < (1UL << page_order); i++ )
                iommu_unmap_page(p2m->domain, gfn + i);
  <<...

The above is at the end of the function.

I am not sure whether it will be called for the hardware domain (PVH mode is 
one potential concern: as we know, PVH was introduced in Xen 4.4 as a DomU 
and in Xen 4.5 as a Dom0).

If not, we can leave it as it is today. Otherwise, I would like to hear your 
suggestions on whether we can use no-rollback (just logging an error) for the 
hardware domain here, since this code path is very complex.
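If no-rollback is acceptable, the hardware-domain variant could be sketched as below. Again this is only an illustration under assumptions: iommu_map_page() is a hypothetical one-argument stub, and set_entry_map_range(), map_fail_at and mapped[] are invented names, not the real p2m/EPT code.

```c
#include <stdio.h>

/* Hypothetical stub: map one gfn; 0 on success, -errno on failure. */
static int map_fail_at = -1;   /* gfn at which to inject a failure */
static int mapped[16];

static int iommu_map_page(unsigned long gfn)
{
    if ( (int)gfn == map_fail_at )
        return -12;            /* -ENOMEM */
    mapped[gfn] = 1;
    return 0;
}

/*
 * "No rollback" option for the hardware domain in the p2m/EPT paths:
 * keep mapping the rest of the range on failure, leave the pages that
 * did map in place, and surface only the first error with a warning.
 */
static int set_entry_map_range(unsigned long gfn, unsigned int order)
{
    unsigned long i;
    int rc = 0, ret;

    for ( i = 0; i < (1UL << order); i++ )
    {
        ret = iommu_map_page(gfn + i);
        if ( ret && !rc )
        {
            rc = ret;          /* report the first failure only */
            fprintf(stderr, "IOMMU map of gfn %lu failed (%d)\n",
                    gfn + i, ret);
        }
    }
    return rc;
}
```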

----
10. ept_set_entry(): open (propose no-rollback).

(Existing code)
  >>...
            if ( iommu_flags )
                for ( i = 0; i < (1 << order); i++ )
                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
            else
                for ( i = 0; i < (1 << order); i++ )
                    iommu_unmap_page(d, gfn + i);
  <<...

The above is at the end of the function.

This has the same open question as 9): whether we can use no-rollback here.

----
11. __get_page_type(): open (propose no-rollback).
The following is in detail:
    >>...
      if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
        {
            if ( (x & PGT_type_mask) == PGT_writable_page )
                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
            else if ( type == PGT_writable_page )
                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
                               page_to_mfn(page),
                               IOMMUF_readable|IOMMUF_writable);
        }
    <<...

It is almost at the end of the function.
Given the limited set of cases in which __get_page_type() calls 
iommu_{,un}map_page(), and that a get_page_type() invocation was previously 
known to never fail (or at least so we hope, based on the existing code) 
[I got this from Jan's reply], we can classify this as the normal PV domain 
case and treat the failure as fatal.
So I am inclined to leave it as it is today.


I welcome your comments and feedback!

Quan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

