|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] abstract model of IOMMU unmaping/mapping failures
On April 08, 2016 6:44am, <JBeulich@xxxxxxxx> wrote:
> >>> On 06.04.16 at 09:38, <quan.xu@xxxxxxxxx> wrote:
> > On April 01, 2016 7:57pm, <JBeulich@xxxxxxxx> wrote:
> >> >>> On 31.03.16 at 11:06, <quan.xu@xxxxxxxxx> wrote:
> >> > 4. __gnttab_unmap_common():rollback (no change)
> >> >
> >> > (Existing code)
> >> > >>...
> >> > if ( !kind )
> >> > err = iommu_unmap_page(ld, op->frame);
> >> > else if ( !(kind & MAPKIND_WRITE) )
> >> > err = iommu_map_page(ld, op->frame, op->frame,
> >> > IOMMUF_readable);
> >> >
> >> > double_gt_unlock(lgt, rgt);
> >> >
> >> > if ( err )
> >> > rc = GNTST_general_error;
> >> > <<...
> >> >
> >> > Similarly no change required, as error has been correctly handled.
> >>
> >> I wouldn't call this "correctly handled", but for the hardware domain
> >> it should be good enough, and by crashing DomU-s simply reporting the
> >> error up the call tree is sufficient. One question though is whether
> >> the loops higher up the call tree in grant_table.c shouldn't be
> >> exited when noticing the domain has crashed, both to avoid unnecessary
> work and to reduce the risk of secondary problems.
> >>
> >
> > For this point, I suppose that the domain structure is not destructed
> > (we are safe to call domain's element, e.g. ld->grant_table) in
> > do_grant_table_op hypercall cycle, even the domain is crashed down. I
> > am not quite sure, whether it is correct or not, could you explain
> > more?
>
> Explain what more?
> Sure, struct domain stays around until the domain gets
> actually cleaned up,
Explain why it is sure. Now i have got it.
> so accesses to its grant table (and alike) remain valid while
> execution didn't leave the context of that guest (vCPU) yet.
>
Thanks for your reminding.
> >> > 8. p2m_remove_page():rollback one level(change required).
> >> >
> >> > (Existing code)
> >> > >>...
> >> > if ( !paging_mode_translate(p2m->domain) )
> >> > {
> >> > if ( need_iommu(p2m->domain) )
> >> > for ( i = 0; i < (1 << page_order); i++ )
> >> > iommu_unmap_page(p2m->domain, mfn + i);
> >> > return 0;
> >> > }
> >> > <<...
> >> >
> >> > There is no error checking of iommu_unmap_page. We need to add.
> >> > However, there are several callers of this function which don't
> >> > handle error at all. I'm not sure whether hardware domain will use this
> function.
> >> > Since it is in core p2m logic (which I don't have much knowledge),
> >> > I hope we can print a warning and simply return error here (given
> >> > the fact that non-hardware domains are all crashed immediately
> >> > within unmap call)
> >>
> >> Yes, at least error propagation needs to be added here. I don't think
> >> more is required. (Be careful, btw, with adding warnings - you must
> >> not spam the log with such.)
> >>
> >
> >
> > agreed. A quick question about error propagation:
> > ...
> > for ( i = 0; i < (1UL << page_order); i++ )
> > iommu_unmap_page(p2m->domain, gfn + i); ...
> >
> > As you mentioned, as a special case, unmapping should perhaps continue
> > despite an error, in an attempt to do best effort cleanup.
> > Then i could modify the code as:
> >
> > ...
> > for ( i = 0; i < (1UL << page_order); i++ ) {
> > rc = iommu_unmap_page(p2m->domain, gfn + i);
> > if ( rc )
> > ret = rc;
> > }
> > ..
> > return ret;
> > ...
> >
> > It looks cumbersome to me, any suggestion?
>
> What's cumbersome here?
>
Sorry, I don't like to introduce 'rc'/'ret' in the same function. Ignore me, if
it is working.
> >> > 11. __get_page_type(): open (propose no-rollback).
> >> > The following is in detail:
> >> > >>...
> >> > if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
> >> > {
> >> > if ( (x & PGT_type_mask) == PGT_writable_page )
> >> > iommu_unmap_page(d, mfn_to_gmfn(d,
> >> page_to_mfn(page)));
> >> > else if ( type == PGT_writable_page )
> >> > iommu_map_page(d, mfn_to_gmfn(d,
> >> page_to_mfn(page)),
> >> > page_to_mfn(page),
> >> >
> >> IOMMUF_readable|IOMMUF_writable);
> >> > }
> >> > <<...
> >> >
> >> > It's almost in the end of the func.
> >> > As the limited set of cases where __get_page_type() calls
> >> > iommu_{,un}map_page(), and a get_page_type() invocation that
> >> > previously was known to never fail (or at least so we hope, based
> >> > on the existing code) [I got it from Jan's reply.],
> >>
> >> Please don't mis-quote me: I said this for a specific caller of the
> >> function, not about the function itself.
> >>
> >
> > Sorry for that.
> >
> >> > we can classify it as normal PV domain and be treated as a fatal error.
> >> > So for I am inclined to leave it as today.
> >>
> >
> > I think one level propagation, only propagated in __get_page_type(),
> > is enough.
>
> Not sure what you mean with "one level propagation" here.
>
one level propagation is propagating error only in __get_page_type(), and
leaving the callers (one or more levels up) as they are.
e.g. the call tree,
__get_page_type() -- get_page_type() -- do_mmu_update()--..
propagate error only in __get_page_type(), and leaving the get_page_type() /
do_mmu_update() as they are.
Jan, thanks for your review. I'd try my best on the release window.
Quan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |