
Re: [Xen-devel] [PATCH] xen/HVM: atomically access pointers in bufioreq handling



On Wed, 22 Jul 2015, Jan Beulich wrote:
> >> The number of slots per page being 511 (i.e. not a power of two) means
> >> that the (32-bit) read and write indexes going beyond 2^32 will likely
> >> disturb operation. The hypervisor side gets I/O req server creation
> >> extended so we can indicate that we're using suitable atomic accesses
> >> where needed (not all accesses to the two pointers really need to be
> >> atomic), allowing it to atomically canonicalize both pointers when both
> >> have gone through at least one cycle.
> > 
> > The description is a bit terse: which accesses don't really need to be
> > atomic?
> 
> Perhaps I should drop this part - I more or less copied the hypervisor
> side's commit message, and the above really applies to e.g.
> 
>     if ( (pg->ptrs.write_pointer - pg->ptrs.read_pointer) >=
>          (IOREQ_BUFFER_SLOT_NUM - qw) )
> 
> in hypervisor code.

Yes, please remove it as it is confusing.
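
For reference, here is a minimal sketch of the wraparound problem the
first part of the description refers to (assuming the usual
pointer % IOREQ_BUFFER_SLOT_NUM slot lookup used in xen-hvm.c): 2^32 is
not a multiple of 511, so the slot index does not advance by one across
a 32-bit wrap, and the ring loses its place unless the hypervisor
canonicalizes both pointers before they can wrap.

    #include <stdint.h>
    #include <stdio.h>

    #define IOREQ_BUFFER_SLOT_NUM 511

    int main(void)
    {
        uint32_t before = UINT32_MAX;   /* pointer just before the wrap */
        uint32_t after  = before + 1;   /* wraps around to 0 */

        /* 2^32 % 511 == 32, so the slot index jumps from 31 back to 0
         * instead of advancing to 32. */
        printf("%u -> %u\n",
               before % IOREQ_BUFFER_SLOT_NUM,
               after  % IOREQ_BUFFER_SLOT_NUM);
        return 0;
    }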


> >> --- a/xen-hvm.c
> >> +++ b/xen-hvm.c
> >> @@ -981,19 +981,30 @@ static void handle_ioreq(XenIOState *sta
> >>  
> >>  static int handle_buffered_iopage(XenIOState *state)
> >>  {
> >> +    buffered_iopage_t *buf_page = state->buffered_io_page;
> >>      buf_ioreq_t *buf_req = NULL;
> >>      ioreq_t req;
> >>      int qw;
> >>  
> >> -    if (!state->buffered_io_page) {
> >> +    if (!buf_page) {
> >>          return 0;
> >>      }
> >>  
> >>      memset(&req, 0x00, sizeof(req));
> >>  
> >> -    while (state->buffered_io_page->read_pointer != state->buffered_io_page->write_pointer) {
> >> -        buf_req = &state->buffered_io_page->buf_ioreq[
> >> -            state->buffered_io_page->read_pointer % IOREQ_BUFFER_SLOT_NUM];
> >> +    for (;;) {
> >> +        uint32_t rdptr = buf_page->read_pointer, wrptr;
> >> +
> >> +        xen_rmb();
> > 
> > We don't need this barrier.
> 
> How would we not? We need to make sure we read, in this order:
> read_pointer, write_pointer, and read_pointer again (in the
> comparison). Only that way can we be certain to hold a matching
> pair in hand at the end.

See below


> >> +        wrptr = buf_page->write_pointer;
> >> +        xen_rmb();
> >> +        if (rdptr != buf_page->read_pointer) {
> > 
> > I think you have to use atomic_read to be sure that the second read to
> > buf_page->read_pointer is up to date and not optimized away.
> 
> No, suppressing such an optimization is an intended (side) effect
> of the barriers used.

I understand what you are saying, but I am not sure that your assumption
is correct: can the compiler still optimize away the second read despite
the barrier? In any case, if you follow my suggestion below, it won't
matter.


> > But I think that it would be best to simply use atomic_read to read
> > both pointers at once, using uint64_t as the type, so you are sure to
> > get a consistent view and there is no need for this check.
> 
> But I'm specifically trying to avoid e.g. a locked cmpxchg8b here on
> ix86.

OK, but we don't need cmpxchg8b, just:

#define atomic_read(ptr)       (*(__typeof__(*ptr) volatile*) (ptr))

something like:

 for (;;) {
     uint64_t ptrs;
     uint32_t rdptr, wrptr;
 
     ptrs = atomic_read((uint64_t*)&state->buffered_io_page->read_pointer);
     rdptr = (uint32_t)ptrs;
     wrptr = *(((uint32_t*)&ptrs) + 1);
 
     if (rdptr == wrptr) {
         break;      /* ring empty: no more buffered requests */
     }
 
     [work]
 
     atomic_add(&buf_page->read_pointer, qw + 1);
 }

it would work, wouldn't it?


> >>          handle_ioreq(state, &req);
> >>  
> >> -        xen_mb();
> >> -        state->buffered_io_page->read_pointer += qw ? 2 : 1;
> >> +        atomic_add(&buf_page->read_pointer, qw + 1);
> > 
> > I couldn't get specific info on the type of barrier implemented by
> > __sync_fetch_and_add, so I cannot tell for sure whether removing
> > xen_mb() is appropriate. Do you have a link? I suspect that given the
> > strong guarantees of the x86 architecture we'll be fine. I would be less
> > confident if this code was used on other archs.
> 
> gcc.pdf, in the section covering them, says "In most cases, these
> built-in functions are considered a full barrier. [...] Further,
> instructions are issued as necessary to prevent the processor from
> speculating loads across the operation and from queuing stores
> after the operation." Details on individual builtins subsequently
> tell the exceptions from this general rule, but the one used here is
> not among the exceptions.

Good, thanks for looking it up.
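
Just to spell out what the change amounts to (a sketch; I am assuming
atomic_add here is QEMU's wrapper around __sync_fetch_and_add):

    /* before: explicit full barrier, then a plain, non-atomic update */
    xen_mb();
    state->buffered_io_page->read_pointer += qw ? 2 : 1;

    /* after: the builtin behind atomic_add is documented as a full
     * barrier, so it both orders the preceding handle_ioreq() work and
     * updates the pointer atomically; qw + 1 is the same as qw ? 2 : 1
     * because qw is only ever 0 or 1 */
    atomic_add(&buf_page->read_pointer, qw + 1);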


> >> --- a/include/hw/xen/xen_common.h
> >> +++ b/include/hw/xen/xen_common.h
> >> @@ -370,7 +370,8 @@ static inline void xen_unmap_pcidev(XenX
> >>  static inline int xen_create_ioreq_server(XenXC xc, domid_t dom,
> >>                                            ioservid_t *ioservid)
> >>  {
> >> -    int rc = xc_hvm_create_ioreq_server(xc, dom, 1, ioservid);
> >> +    int rc = xc_hvm_create_ioreq_server(xc, dom, HVM_IOREQSRV_BUFIOREQ_ATOMIC,
> >> +                                        ioservid);
> > 
> > I am concerned that passing 2 instead of 1 could break older
> > hypervisors. However handle_bufioreq was never defined as a true
> > boolean, so maybe it is OK?
> 
> Indeed I'm building on it only having done == 0 or != 0 checks.
> 
> > The alternative would be to create a xen_xc_hvm_create_ioreq_server
> > versioned wrapper in include/hw/xen/xen_common.h.
> 
> Which is what I aimed at avoiding.

OK
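
For completeness, the versioned wrapper I had in mind would have looked
roughly like the sketch below. It is hypothetical (we agreed not to add
it), and it assumes the CONFIG_XEN_CTRL_INTERFACE_VERSION convention
already used in xen_common.h, with HVM_IOREQSRV_BUFIOREQ_ATOMIC only
available from the 4.6 interface onwards:

    #if CONFIG_XEN_CTRL_INTERFACE_VERSION >= 460
    static inline int xen_xc_hvm_create_ioreq_server(XenXC xc, domid_t dom,
                                                      ioservid_t *ioservid)
    {
        /* new enough interface: ask for atomic buffered ioreq handling */
        return xc_hvm_create_ioreq_server(xc, dom,
                                          HVM_IOREQSRV_BUFIOREQ_ATOMIC,
                                          ioservid);
    }
    #else
    static inline int xen_xc_hvm_create_ioreq_server(XenXC xc, domid_t dom,
                                                      ioservid_t *ioservid)
    {
        /* older interface: keep passing 1, as before this patch */
        return xc_hvm_create_ioreq_server(xc, dom, 1, ioservid);
    }
    #endif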
