[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] Re: [PATCH 1/3] libxc: add xc_gnttab_map_grant_ref_notify



On Wed, 2011-09-21 at 16:02 +0100, Daniel De Graaf wrote:
> On 09/21/2011 06:03 AM, Ian Campbell wrote:
> > On Mon, 2011-09-19 at 23:43 +0100, Daniel De Graaf wrote:
> >> @@ -116,4 +116,35 @@ struct ioctl_gntdev_set_max_grants {
> >>    uint32_t count;
> >>  };
> >>  
> >> +/*
> >> + * Sets up an unmap notification within the page, so that the other side 
> >> can do
> >> + * cleanup if this side crashes. Required to implement cross-domain robust
> >> + * mutexes or close notification on communication channels.
> >> + *
> >> + * Each mapped page only supports one notification; multiple calls 
> >> referring to
> >> + * the same page overwrite the previous notification. You must clear the
> >> + * notification prior to the IOCTL_GNTALLOC_DEALLOC_GREF if you do not 
> >> want it
> >> + * to occur.
> >> + */
> >> +#define IOCTL_GNTDEV_SET_UNMAP_NOTIFY \
> >> +_IOC(_IOC_NONE, 'G', 7, sizeof(struct ioctl_gntdev_unmap_notify))
> >> +struct ioctl_gntdev_unmap_notify {
> >> +  /* IN parameters */
> >> +  /* Offset in the file descriptor for a byte within the page (same as
> >> +   * used in mmap).
> > 
> > I'm probably being thick but I don't understand what this means, i.e.
> > what this thing is relative to.
> 
> This is an offset that acts like a byte offset in the /dev/xen/gntdev device.
> Conceptually, if /dev/xen/evtchn implemented pwrite() then this offset would
> be the offset you would pass to modify your target byte.
> 
> For example, if you use gntdev to map two pages, the first will be at offset
> 0 and the second at 4096; you would pass 4098 here to indicate the third byte
> of the second page. The offsets (0, 4096) are returned by the map-grant 
> ioctls.

Hmm. I think I was confused because it was an offset into the file
rather than, say, a virtual address.

When I map a page how do I know what the offset of it is wrt the file
descriptor? DO I just have to remember how many pages I mapped an *=
4096?

> 
> >>  If using UNMAP_NOTIFY_CLEAR_BYTE, this is the byte to
> >> +   * be cleared. Otherwise, it can be any byte in the page whose
> >> +   * notification we are adjusting.
> >> +   */
> >> +  uint64_t index;
> >> +  /* Action(s) to take on unmap */
> >> +  uint32_t action;
> >> +  /* Event channel to notify */
> >> +  uint32_t event_channel_port;
> > 
> > evtchn_port_t ?
> 
> Using that would require an include dependency on event_channel.h which is
> not exposed as a userspace-visible header. Linux's evtchn.h also does not
> use evtchn_port_t (it uses unsigned int).
> 
> Since this is just a direct copy of the header from the linux source tree,
> any changes really need to happen there first.

OK, that's fine as it is then.

> 
> >> +};
> >> +
> >> +/* Clear (set to zero) the byte specified by index */
> >> +#define UNMAP_NOTIFY_CLEAR_BYTE 0x1
> >> +/* Send an interrupt on the indicated event channel */
> >> +#define UNMAP_NOTIFY_SEND_EVENT 0x2
> >> +
> >>  #endif /* __LINUX_PUBLIC_GNTDEV_H__ */
> >> diff --git a/tools/libxc/xc_gnttab.c b/tools/libxc/xc_gnttab.c
> >> index 4f55fce..3d3c63b 100644
> >> --- a/tools/libxc/xc_gnttab.c
> >> +++ b/tools/libxc/xc_gnttab.c
> >> @@ -18,6 +18,7 @@
> >>   */
> >>  
> >>  #include "xc_private.h"
> >> +#include <errno.h>
> >>  
> >>  int xc_gnttab_op(xc_interface *xch, int cmd, void * op, int op_size, int 
> >> count)
> >>  {
> >> @@ -174,6 +175,28 @@ void *xc_gnttab_map_domain_grant_refs(xc_gnttab *xcg,
> >>                                                    count, domid, refs, 
> >> prot);
> >>  }
> >>  
> >> +void *xc_gnttab_map_grant_ref_notify(xc_gnttab *xcg,
> >> +                                     uint32_t domid,
> >> +                                     uint32_t ref,
> >> +                                     uint32_t notify_offset,
> >> +                                     evtchn_port_t notify_port,
> >> +                                     int *notify_result)
> >> +{
> >> +  if (xcg->ops->u.gnttab.map_grant_ref_notify)
> >> +          return xcg->ops->u.gnttab.map_grant_ref_notify(xcg, 
> >> xcg->ops_handle,
> >> +                  domid, ref, notify_offset, notify_port, notify_result);
> >> +  else {
> >> +          void* area = xcg->ops->u.gnttab.map_grant_ref(xcg, 
> >> xcg->ops_handle,
> >> +                  domid, ref, PROT_READ|PROT_WRITE);
> >> +          if (area && notify_result) {
> >> +                  *notify_result = -1;
> >> +                  errno = ENOSYS;
> >> +          }
> >> +          return area;
> >> +  }
> > 
> > I think the new public interface is fine but do we really need a new
> > internal interface here?
> > 
> > I think you can just add the notify_* arguments to the existing OSDEP
> > function and have those OS backends which don't implement that feature
> > return ENOSYS if notify_offset != 0 (or ~0 or whatever invalid value
> > works).
> > 
> > Why doesn't the *_notify variant take a prot argument?
> 
> At least for the byte-clear portion of the notify, the page must be writable
> or the notify will not work. I suppose an event channel alone could be used
> for a read-only mapping, but normally the unmapping of a read-only mapping is
> not an important event - although I guess you could contrive a use for this
> type of notificaiton, so there's no reason not to allow it.

If you combine this the two functions then returning EINVAL for attempts
to map without PROT_WRITE (or whatever perm is necessary) would be
reasonable IMHO.

> > I'd be tempted to do away with notify_result too -- if the caller asked
> > for notification and we fail to give that then we can cleanup and return
> > an error. If they want to try again without the notification then that's
> > up to them.
> 
> The source of the error might be unclear, but this would make the interface
> cleaner.
> 
> > 
> >> +}
> >> +
> >> +
> >>  int xc_gnttab_munmap(xc_gnttab *xcg,
> >>                       void *start_address,
> >>                       uint32_t count)
> >> diff --git a/tools/libxc/xc_linux_osdep.c b/tools/libxc/xc_linux_osdep.c
> >> index dca6718..3040cb6 100644
> >> --- a/tools/libxc/xc_linux_osdep.c
> >> +++ b/tools/libxc/xc_linux_osdep.c
> >> @@ -613,6 +613,62 @@ static void 
> >> *linux_gnttab_map_domain_grant_refs(xc_gnttab *xcg, xc_osdep_handle
> >>      return do_gnttab_map_grant_refs(xcg, h, count, &domid, 0, refs, prot);
> >>  }
> >>  
> >> +static void *linux_gnttab_map_grant_ref_notify(xc_gnttab *xch, 
> >> xc_osdep_handle h,
> >> +                                               uint32_t domid, uint32_t 
> >> ref,
> >> +                                               uint32_t notify_offset,
> >> +                                               evtchn_port_t notify_port,
> >> +                                               int *notify_result)
> >> +{
> >> +    int fd = (int)h;
> >> +    int rv = 0;
> >> +    struct ioctl_gntdev_map_grant_ref map;
> >> +    struct ioctl_gntdev_unmap_notify notify;
> >> +    void *addr;
> >> +
> >> +    map.count = 1;
> >> +    map.refs[0].domid = domid;
> >> +    map.refs[0].ref = ref;
> >> +
> >> +    if ( ioctl(fd, IOCTL_GNTDEV_MAP_GRANT_REF, &map) ) {
> >> +        PERROR("xc_gnttab_map_grant_ref: ioctl MAP_GRANT_REF failed");
> >> +        return NULL;
> >> +    }
> >> +
> >> +    addr = mmap(NULL, XC_PAGE_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 
> >> map.index);
> >> +    if ( addr == MAP_FAILED )
> >> +    {
> >> +        int saved_errno = errno;
> >> +        struct ioctl_gntdev_unmap_grant_ref unmap_grant;
> >> +
> >> +        PERROR("xc_gnttab_map_grant_ref: mmap failed");
> >> +        unmap_grant.index = map.index;
> >> +        unmap_grant.count = 1;
> >> +        ioctl(fd, IOCTL_GNTDEV_UNMAP_GRANT_REF, &unmap_grant);
> >> +        errno = saved_errno;
> >> +        return NULL;
> >> +    }
> > 
> > The non-notify variant handles EAGAIN, why doesn't this one need to do
> > so?
> 
> The non-notify variant doesn't need to handle EAGAIN anymore (with modern
> kernels, at least... perhaps it should remain for older kernels). Also,
> do_gnttab_map_grant_refs does not handle EAGAIN.

OK I guess that is fine (although if you combine them as I suggest it
comes back?)

I hadn't noticed that we have both map_gratn_ref and map_grant_refs,
that seems like an area which could be cleaned up. (not saying you
should, just noticing it)

> 
> >> +
> >> +    notify.index = map.index;
> >> +    notify.action = 0;
> >> +    if (notify_offset >= 0) {
> >> +        notify.index += notify_offset;
> >> +        notify.action |= UNMAP_NOTIFY_CLEAR_BYTE;
> >> +    }
> >> +    if (notify_port >= 0) {
> >> +        notify.event_channel_port = notify_port;
> >> +        notify.action |= UNMAP_NOTIFY_SEND_EVENT;
> >> +    }
> >> +    if (notify.action)
> >> +        rv = ioctl(fd, IOCTL_GNTDEV_SET_UNMAP_NOTIFY, &notify);
> > 
> > Is there a race if the other end (or this process) dies between the MAP
> > ioctl and here?
> > 
> > Ian.
> > 
> 
> Technically it's a race, but it doesn't impact any reasonable use of the
> notification. The local process can't actually be using the shared page
> at this point, and the other side will not be certain that the map has
> actually succeeded until after the function returns (and it is notified
> in some way - libvchan changes the notify byte from 2->1 at this point).
> 
> If the domain whose memory we are mapping crashes, this ioctl will succeed
> unless the event channel it refers to has already been invalidated - but
> either way, the notifications are now irrelevant as there is nobody to
> listen to them.

OK.

Thanks,
Ian.

> 




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.