[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v15 3/7] remus netbuffer: implement remus network buffering for nic devices



Yang Hongyang writes ("[PATCH v15 3/7] remus netbuffer: implement remus network 
buffering for nic devices"):
> diff --git a/tools/hotplug/Linux/remus-netbuf-setup 
> b/tools/hotplug/Linux/remus-netbuf-setup
> new file mode 100644
> index 0000000..58c46f3
> --- /dev/null
> +++ b/tools/hotplug/Linux/remus-netbuf-setup
> @@ -0,0 +1,203 @@
> +#!/bin/bash
> +#============================================================================
> +# ${XEN_SCRIPT_DIR}/remus-netbuf-setup
...
> +# IFB         ifb interface to be cleaned up (required). [for teardown op 
> only]

It occurs to me to ask: Is it wise to use such a bare name for the
variable IFB ?  I would suggest that it might be better to use a
variable with XEN or REMUS or something in it.  Otherwise perhaps
someone somewhere might think IFB stood for something else and would
set it to something irrelevant (or your setting of it might break
their usage...)

I had some other comments about this script in v10, which we were
having a conversation about.  I'm afraid I seem to have dropped the
thread of that.  I will reply separately in that thread.

> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index bcbd02b..b7d62c1 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -788,6 +788,24 @@ int libxl_domain_remus_start(libxl_ctx *ctx, 
> libxl_domain_remus_info *info,
...
> +    /* Setup network buffering */

This comment is misleading.  This code doesn't actually set up any
network buffering.  It doesn't call the setup script.  All it does is
some checks and computing the script path.

It would probably be better to simply remove the comment.  Or you
could replace it with something more accurate.

> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 8ef20a0..2fe36a6 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -310,6 +310,10 @@ struct libxl__gc {
>      libxl_ctx *owner;
>  };
>  
> +/* remus device ops specific structures start */
> +typedef struct libxl__remus_netbuf_state libxl__remus_netbuf_state;
> +/* remus device ops specific structures end */

I don't think these comments add anything.  You should move these
struct typedefs alongside the other struct typedefs (libxl__ao et al).
There is then no need to comment what they are.

>  struct libxl__ctx {
>      xentoollog_logger *lg;
>      xc_interface *xch;
> @@ -374,6 +378,9 @@ struct libxl__ctx {
>      LIBXL_LIST_ENTRY(libxl_ctx) sigchld_users_entry;
>  
>      libxl_version_info version_info;
> +
> +    /* remus device ops specific structures */
> +    libxl__remus_netbuf_state *rns;

Again, I think this comment adds nothing that isn't in the type name.
There is no need to say in a comment anything that is plain from the
code.

But before you delete it (and other comments that I'm going to quibble
about), please wait for a comment from the other tools maintainer, Ian
Campbell, who may disagree.

> @@ -2631,6 +2639,8 @@ struct libxl__remus_state {
>      libxl__ao *ao;
>      uint32_t domid;
>      libxl__remus_callback *callback;
> +    /* Script to setup/teardown network buffers */
> +    const char *netbufscript;

And again.

> diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c
> index 52d593c..025ee89 100644
> --- a/tools/libxl/libxl_netbuffer.c
> +++ b/tools/libxl/libxl_netbuffer.c
> @@ -17,11 +17,486 @@
>  
>  #include "libxl_internal.h"
>  
> +#include <netlink/cache.h>
> +#include <netlink/socket.h>
> +#include <netlink/attr.h>
> +#include <netlink/route/link.h>
> +#include <netlink/route/route.h>
> +#include <netlink/route/qdisc.h>
> +#include <netlink/route/qdisc/plug.h>
> +
> +struct libxl__remus_netbuf_state {
> +    libxl__ao *ao;
> +    uint32_t domid;
> +    const char *netbufscript;

Why is there a copy of netbufscript here ?

> +/*----- init() and destroy() -----*/
> +
> +static int nic_init(const libxl__remus_device_ops *self,
> +                    libxl__remus_state *rs)
> +{

This function seems to leak the things it allocates, on error.
The `out' section should probably call destroy.

> +    ns->nlsock = nl_socket_alloc();
> +    if (!ns->nlsock) {
> +        LOG(ERROR, "cannot allocate nl socket");
> +        goto out;
> +    }

Last time I commented on the error handling in this patch.  I see that
you have added an nl_geterror in some places.

When I make a comment like that, you need to apply it whereever
relevant in the code, and to the whole patch series.

> +static void nic_destroy(const libxl__remus_device_ops *self,
> +                        libxl__remus_state *rs)
...
> +    /* free qdisc cache */
> +    if (ns->qdisc_cache) {
> +        nl_cache_clear(ns->qdisc_cache);
> +        nl_cache_free(ns->qdisc_cache);

Can these functions fail ?  If so, what does it mean and what should
the program do ?

> +/*----- checkpointing APIs -----*/
> +
> +/* The buffer_op's value, not the value passed to kernel */

I would say "The value of buffer_op, not the value passed to kernel".

> +static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
> +                           libxl__remus_netbuf_state *netbuf_state,
> +                           int buffer_op)
> +{
> +    int rc;
> +
> +    STATE_AO_GC(netbuf_state->ao);
> +
> +    if (buffer_op == tc_buffer_start)
> +        rc = rtnl_qdisc_plug_buffer(remus_nic->qdisc);
> +    else
> +        rc = rtnl_qdisc_plug_release_one(remus_nic->qdisc);

Error handling again.

Also do not put anything other than a libxl error code in a variable
called "rc".

> +/*----- main flow of control -----*/
> +
> +/* helper functions */
> +
> +/*
> + * If the device has a vifname, then use that instead of
> + * the vifX.Y format.
> + * it must ONLY be used for remus because if driver domains
> + * were in use it would constitute a security vulnerability.
> + */
> +static const char *get_vifname(libxl__remus_device *dev,
> +                               const libxl_device_nic *nic)
> +{
...
> +    path = libxl__sprintf(gc, "%s/backend/vif/%d/%d/vifname",
> +                          libxl__xs_get_dompath(gc, 0), domid, nic->devid);

Please use GCSPRINTF where applicable.

> +    rc = libxl__xs_read_checked(gc, XBT_NULL, path, &vifname);
> +    if (!rc && !vifname) {
> +        /* use the default name */

This comment adds nothing to the comment above.

> +        vifname = libxl__device_nic_devname(gc, domid,
> +                                            nic->devid,
> +                                            nic->nictype);

> +    rc = ERROR_FAIL;
> +    ifindex = rtnl_link_get_ifindex(ifb);
> +    if (!ifindex) {
> +        LOG(ERROR, "interface %s has no index", remus_nic->ifb);
> +        goto out;

How does rtnl_link_get_ifindex report its error code ?  Or can it fail
only one way ?

> +    qdisc = rtnl_qdisc_get_by_parent(netbuf_state->qdisc_cache, ifindex,
> +                                     TC_H_ROOT);
> +
> +    if (qdisc) {
> +        const char *tc_kind = rtnl_tc_get_kind(TC_CAST(qdisc));
> +        /* Sanity check: Ensure that the root qdisc is a plug qdisc. */
> +        if (!tc_kind || strcmp(tc_kind, "plug")) {
> +            nl_object_put((struct nl_object *)qdisc);

This freeing of qdisc should be in the `out' section.

You can do this either by arranging that the `out' section is not used
at all on successful completion, or by saying something like
    if (rc) {
        if (qdisc) nl_object_put((struct nl_object *)qdisc);
    }

> +            LOG(ERROR, "plug qdisc is not installed on %s", remus_nic->ifb);
> +            goto out;
> +        }
> +        remus_nic->qdisc = qdisc;
> +        rc = 0;
> +    } else {
> +        LOG(ERROR, "Cannot get qdisc handle from ifb %s", remus_nic->ifb);

This path out of the function seems not to set rc.

... In fact, I see that you initialise rc to ERROR_FAIL above.  IMO
this is not a good programming paradigm.  rc should be set, obviously,
on every exit path, at the point the error is generated.

> +/*
> + * In return, the script writes the name of IFB device (during setup) to be
> + * used for output buffering into XENBUS_PATH/ifb
> + */
> +static void netbuf_setup_script_cb(libxl__egc *egc,
> +                                   libxl__async_exec_state *aes,
> +                                   int status)

This callback is not in chronological order.

I'm afraid I've run out of time now so I'm going to stop here.  I hope
these comments have been helpful.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.