
Re: [Xen-devel] [PATCH 4/4] xen/arm: clean and invalidate all guest caches by VMID after domain build.



On Wed, 2014-01-29 at 15:01 +0000, Jan Beulich wrote:
> >>> On 29.01.14 at 15:15, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
> > On Wed, 2014-01-29 at 13:00 +0000, Jan Beulich wrote:
> >> >>> On 29.01.14 at 13:11, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:
> >> > +    case XEN_DOMCTL_cacheflush:
> >> > +    {
> >> > +        long rc = p2m_cache_flush(d, &domctl->u.cacheflush.start_mfn);
> >> > +        if ( __copy_to_guest(u_domctl, domctl, 1) )
> >> 
> >> While you certainly say so in the public header change, I think you
> >> recall that we pretty recently changed another hypercall to not be
> >> the only inconsistent one modifying the input structure in order to
> >> handle hypercall preemption.
> > 
> > That was XENMEM though IIRC -- is the same requirement also true of
> > domctl's?
> 
> Not necessarily - I was just trying to point out the issue to
> avoid needing to fix it later on.

OK, but you do think it should be fixed "transparently" rather than made
an explicit part of the API?

> > How/where would you recommend saving the progress here?
> 
> Depending on the nature, a per-domain or per-vCPU field that
> gets acted upon before issuing any new, similar operation. I.e.
> something along the lines of x86's old_guest_table. It's ugly, I
> know. But with exposing domctls to semi-trusted guests in
> mind, you may use state modifiable by the caller here only if
> tampering with that state isn't going to harm the whole system
> (if the guest being started is affected in the case here that
> obviously wouldn't be a problem).

Hrm, thanks for raising this -- it made me realise that we cannot
necessarily rely on the disaggregated domain builder to even issue this
call at all.

That would be OK from the point of view of not flushing the things which
the builder touched (as you say, that can only harm the domain it is
building). But it is not OK from the PoV of flushing scrubbed data from
the cache, i.e. ensuring that the scrubbed bytes actually reach RAM
(otherwise old data can be leaked).

So I think I need an approach which flushes the scrubbed pages as the
scrubbing is done (this makes a certain logical sense anyway) and has
the toolstack issue hypercalls to flush the regions it has written. (The
first attempt at this issue tried to do the latter, but used a system
call provided by Linux which didn't have quite the right semantics;
using a version of this hypercall which takes a range should work.)
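
To make the first half of that concrete, the sort of thing I have in
mind is roughly the sketch below, i.e. pushing each page out to RAM at
the point where it is scrubbed. The mapping/flush helper names here
(map_domain_page, clean_and_invalidate_dcache_va_range, etc.) are just
placeholders for whatever the ARM cache maintenance primitives end up
being, not a claim about the final interface:

        /*
         * Illustrative only: scrub a page and make sure the zeroed bytes
         * reach RAM, so stale data cannot survive in dirty cache lines
         * above it.  Helper names are assumptions for the sketch.
         */
        static void scrub_and_flush_page(struct page_info *pg)
        {
            void *p = map_domain_page(page_to_mfn(pg));

            clear_page(p);                                      /* the scrub */
            clean_and_invalidate_dcache_va_range(p, PAGE_SIZE); /* push to RAM */

            unmap_domain_page(p);
        }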

Before I get too deep into that, do you think that
        struct xen_domctl_cacheflush {
            /* start_mfn is updated for progress over preemption. */
            xen_pfn_t start_mfn;
            xen_pfn_t end_mfn;
        };
        
is acceptable, or do you want me to try to find a way to do preemption
without the write-back?

The blobs written by the toolstack aren't likely to be >1GB in size, so
rejecting overly large ranges would be a potential option, but it's not
totally satisfactory.
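
For what it's worth, if the write-back were made an explicit part of the
API (rather than hidden behind a hypercall continuation), I'd expect the
toolstack side to be no more than a loop like the one below, re-issuing
the domctl from the updated start_mfn after each EAGAIN. The
DECLARE_DOMCTL/do_domctl plumbing and the errno convention are only
assumed here for illustration:

        /*
         * Sketch of a caller loop for the proposed XEN_DOMCTL_cacheflush,
         * assuming the hypervisor advances start_mfn to the first MFN not
         * yet flushed before returning EAGAIN on preemption.
         */
        static int cacheflush_range(xc_interface *xch, uint32_t domid,
                                    xen_pfn_t start_mfn, xen_pfn_t end_mfn)
        {
            DECLARE_DOMCTL;
            int rc;

            domctl.cmd = XEN_DOMCTL_cacheflush;
            domctl.domain = domid;
            domctl.u.cacheflush.start_mfn = start_mfn;
            domctl.u.cacheflush.end_mfn = end_mfn;

            do {
                /* On preemption start_mfn has been advanced, so just retry. */
                rc = do_domctl(xch, &domctl);
            } while ( rc < 0 && errno == EAGAIN );

            return rc;
        }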

> >> >          /* Preempt every 2MiB (mapped) or 32 MiB (unmapped) - arbitrary */
> >> > -        if ( op == RELINQUISH && count >= 0x2000 )
> >> > +        switch ( op )
> >> >          {
> >> > -            if ( hypercall_preempt_check() )
> >> > +        case RELINQUISH:
> >> > +        case CACHEFLUSH:
> >> > +            if (count >= 0x2000 && hypercall_preempt_check() )
> >> >              {
> >> >                  p2m->lowest_mapped_gfn = addr >> PAGE_SHIFT;
> >> >                  rc = -EAGAIN;
> >> >                  goto out;
> >> >              }
> >> >              count = 0;
> >> > +            break;
> >> > +        case INSERT:
> >> > +        case ALLOCATE:
> >> > +        case REMOVE:
> >> > +            /* No preemption */
> >> > +            break;
> >> >          }
> >> 
> >> Unrelated to the patch here, but don't you have a problem if you
> >> don't preempt _at all_ here for certain operation types? Or is a
> >> limit on the number of iterations being enforced elsewhere for
> >> those?
> > 
> > Good question.
> > 
> > The tools/guest accessible paths here are through
> > guest_physmap_add/remove_page. I think the only paths which are exposed
> > that pass a non-zero order are XENMEM_populate_physmap and
> > XENMEM_exchange, both of which restrict the maximum order.
> > 
> > I don't think those guest_physmap_* are preemptible on x86 either?
> 
> They aren't, but they have a strict upper limit of at most dealing
> with a 1Gb page at a time. If that's similar for ARM, I don't see
> an immediate issue.

Same on ARM (through common code using MAX_ORDER == 20).

> > It's possible that we should nevertheless handle preemption on those
> > code paths as well, but I don't think it is critical right now (*or at
> > least not critical enough to warrant a freeze exception for 4.4).
> 
> Indeed. Of course the 1Gb limit mentioned above, while perhaps
> acceptable to process without preemption right now, is still pretty
> high for achieving really good responsiveness, so we may need to
> do something about that going forward.

Right. I won't worry about it now though.

> 
> Jan
> 


