[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] Re: [PATCH 1/3] libxc: add xc_gnttab_map_grant_ref_notify



On 09/21/2011 06:03 AM, Ian Campbell wrote:
> On Mon, 2011-09-19 at 23:43 +0100, Daniel De Graaf wrote:
>> @@ -116,4 +116,35 @@ struct ioctl_gntdev_set_max_grants {
>>      uint32_t count;
>>  };
>>  
>> +/*
>> + * Sets up an unmap notification within the page, so that the other side 
>> can do
>> + * cleanup if this side crashes. Required to implement cross-domain robust
>> + * mutexes or close notification on communication channels.
>> + *
>> + * Each mapped page only supports one notification; multiple calls 
>> referring to
>> + * the same page overwrite the previous notification. You must clear the
>> + * notification prior to the IOCTL_GNTALLOC_DEALLOC_GREF if you do not want 
>> it
>> + * to occur.
>> + */
>> +#define IOCTL_GNTDEV_SET_UNMAP_NOTIFY \
>> +_IOC(_IOC_NONE, 'G', 7, sizeof(struct ioctl_gntdev_unmap_notify))
>> +struct ioctl_gntdev_unmap_notify {
>> +    /* IN parameters */
>> +    /* Offset in the file descriptor for a byte within the page (same as
>> +     * used in mmap).
> 
> I'm probably being thick but I don't understand what this means, i.e.
> what this thing is relative to.

This is an offset that acts like a byte offset in the /dev/xen/gntdev device.
Conceptually, if /dev/xen/evtchn implemented pwrite() then this offset would
be the offset you would pass to modify your target byte.

For example, if you use gntdev to map two pages, the first will be at offset
0 and the second at 4096; you would pass 4098 here to indicate the third byte
of the second page. The offsets (0, 4096) are returned by the map-grant ioctls.

>>  If using UNMAP_NOTIFY_CLEAR_BYTE, this is the byte to
>> +     * be cleared. Otherwise, it can be any byte in the page whose
>> +     * notification we are adjusting.
>> +     */
>> +    uint64_t index;
>> +    /* Action(s) to take on unmap */
>> +    uint32_t action;
>> +    /* Event channel to notify */
>> +    uint32_t event_channel_port;
> 
> evtchn_port_t ?

Using that would require an include dependency on event_channel.h which is
not exposed as a userspace-visible header. Linux's evtchn.h also does not
use evtchn_port_t (it uses unsigned int).

Since this is just a direct copy of the header from the linux source tree,
any changes really need to happen there first.

>> +};
>> +
>> +/* Clear (set to zero) the byte specified by index */
>> +#define UNMAP_NOTIFY_CLEAR_BYTE 0x1
>> +/* Send an interrupt on the indicated event channel */
>> +#define UNMAP_NOTIFY_SEND_EVENT 0x2
>> +
>>  #endif /* __LINUX_PUBLIC_GNTDEV_H__ */
>> diff --git a/tools/libxc/xc_gnttab.c b/tools/libxc/xc_gnttab.c
>> index 4f55fce..3d3c63b 100644
>> --- a/tools/libxc/xc_gnttab.c
>> +++ b/tools/libxc/xc_gnttab.c
>> @@ -18,6 +18,7 @@
>>   */
>>  
>>  #include "xc_private.h"
>> +#include <errno.h>
>>  
>>  int xc_gnttab_op(xc_interface *xch, int cmd, void * op, int op_size, int 
>> count)
>>  {
>> @@ -174,6 +175,28 @@ void *xc_gnttab_map_domain_grant_refs(xc_gnttab *xcg,
>>                                                      count, domid, refs, 
>> prot);
>>  }
>>  
>> +void *xc_gnttab_map_grant_ref_notify(xc_gnttab *xcg,
>> +                                     uint32_t domid,
>> +                                     uint32_t ref,
>> +                                     uint32_t notify_offset,
>> +                                     evtchn_port_t notify_port,
>> +                                     int *notify_result)
>> +{
>> +    if (xcg->ops->u.gnttab.map_grant_ref_notify)
>> +            return xcg->ops->u.gnttab.map_grant_ref_notify(xcg, 
>> xcg->ops_handle,
>> +                    domid, ref, notify_offset, notify_port, notify_result);
>> +    else {
>> +            void* area = xcg->ops->u.gnttab.map_grant_ref(xcg, 
>> xcg->ops_handle,
>> +                    domid, ref, PROT_READ|PROT_WRITE);
>> +            if (area && notify_result) {
>> +                    *notify_result = -1;
>> +                    errno = ENOSYS;
>> +            }
>> +            return area;
>> +    }
> 
> I think the new public interface is fine but do we really need a new
> internal interface here?
> 
> I think you can just add the notify_* arguments to the existing OSDEP
> function and have those OS backends which don't implement that feature
> return ENOSYS if notify_offset != 0 (or ~0 or whatever invalid value
> works).
> 
> Why doesn't the *_notify variant take a prot argument?

At least for the byte-clear portion of the notify, the page must be writable
or the notify will not work. I suppose an event channel alone could be used
for a read-only mapping, but normally the unmapping of a read-only mapping is
not an important event - although I guess you could contrive a use for this
type of notificaiton, so there's no reason not to allow it.

> I'd be tempted to do away with notify_result too -- if the caller asked
> for notification and we fail to give that then we can cleanup and return
> an error. If they want to try again without the notification then that's
> up to them.

The source of the error might be unclear, but this would make the interface
cleaner.

> 
>> +}
>> +
>> +
>>  int xc_gnttab_munmap(xc_gnttab *xcg,
>>                       void *start_address,
>>                       uint32_t count)
>> diff --git a/tools/libxc/xc_linux_osdep.c b/tools/libxc/xc_linux_osdep.c
>> index dca6718..3040cb6 100644
>> --- a/tools/libxc/xc_linux_osdep.c
>> +++ b/tools/libxc/xc_linux_osdep.c
>> @@ -613,6 +613,62 @@ static void 
>> *linux_gnttab_map_domain_grant_refs(xc_gnttab *xcg, xc_osdep_handle
>>      return do_gnttab_map_grant_refs(xcg, h, count, &domid, 0, refs, prot);
>>  }
>>  
>> +static void *linux_gnttab_map_grant_ref_notify(xc_gnttab *xch, 
>> xc_osdep_handle h,
>> +                                               uint32_t domid, uint32_t ref,
>> +                                               uint32_t notify_offset,
>> +                                               evtchn_port_t notify_port,
>> +                                               int *notify_result)
>> +{
>> +    int fd = (int)h;
>> +    int rv = 0;
>> +    struct ioctl_gntdev_map_grant_ref map;
>> +    struct ioctl_gntdev_unmap_notify notify;
>> +    void *addr;
>> +
>> +    map.count = 1;
>> +    map.refs[0].domid = domid;
>> +    map.refs[0].ref = ref;
>> +
>> +    if ( ioctl(fd, IOCTL_GNTDEV_MAP_GRANT_REF, &map) ) {
>> +        PERROR("xc_gnttab_map_grant_ref: ioctl MAP_GRANT_REF failed");
>> +        return NULL;
>> +    }
>> +
>> +    addr = mmap(NULL, XC_PAGE_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 
>> map.index);
>> +    if ( addr == MAP_FAILED )
>> +    {
>> +        int saved_errno = errno;
>> +        struct ioctl_gntdev_unmap_grant_ref unmap_grant;
>> +
>> +        PERROR("xc_gnttab_map_grant_ref: mmap failed");
>> +        unmap_grant.index = map.index;
>> +        unmap_grant.count = 1;
>> +        ioctl(fd, IOCTL_GNTDEV_UNMAP_GRANT_REF, &unmap_grant);
>> +        errno = saved_errno;
>> +        return NULL;
>> +    }
> 
> The non-notify variant handles EAGAIN, why doesn't this one need to do
> so?

The non-notify variant doesn't need to handle EAGAIN anymore (with modern
kernels, at least... perhaps it should remain for older kernels). Also,
do_gnttab_map_grant_refs does not handle EAGAIN.

>> +
>> +    notify.index = map.index;
>> +    notify.action = 0;
>> +    if (notify_offset >= 0) {
>> +        notify.index += notify_offset;
>> +        notify.action |= UNMAP_NOTIFY_CLEAR_BYTE;
>> +    }
>> +    if (notify_port >= 0) {
>> +        notify.event_channel_port = notify_port;
>> +        notify.action |= UNMAP_NOTIFY_SEND_EVENT;
>> +    }
>> +    if (notify.action)
>> +        rv = ioctl(fd, IOCTL_GNTDEV_SET_UNMAP_NOTIFY, &notify);
> 
> Is there a race if the other end (or this process) dies between the MAP
> ioctl and here?
> 
> Ian.
> 

Technically it's a race, but it doesn't impact any reasonable use of the
notification. The local process can't actually be using the shared page
at this point, and the other side will not be certain that the map has
actually succeeded until after the function returns (and it is notified
in some way - libvchan changes the notify byte from 2->1 at this point).

If the domain whose memory we are mapping crashes, this ioctl will succeed
unless the event channel it refers to has already been invalidated - but
either way, the notifications are now irrelevant as there is nobody to
listen to them.

-- 
Daniel De Graaf
National Security Agency

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.