
Re: [Xen-devel] [PATCH V3] Switch from select() to poll() in xenconsoled's IO loop.



On Fri, 2013-01-04 at 17:17 +0000, Wei Liu wrote:
> In Linux select() typically supports up to 1024 file descriptors. This can be
> a problem when a user tries to boot up many guests. Switching to poll() has
> minimal impact on existing code and better scalability.
> 
> Tracking arrays are dynamically allocated / reallocated. If the tracking
> arrays fail to expand, we just ignore the incoming fd.
> 
> Change from V2:
>   * remove unnecessary malloc in initialize_pollfd_arrays
>   * use ROUND_UP to get new size of arrays
> 
> Signed-off-by: Wei Liu <wei.liu2@xxxxxxxxxx>
> ---
>  tools/console/daemon/io.c |  147 +++++++++++++++++++++++++++++++++------------
>  1 file changed, 109 insertions(+), 38 deletions(-)
> 
> diff --git a/tools/console/daemon/io.c b/tools/console/daemon/io.c
> index 48fe151..0c53d30 100644
> --- a/tools/console/daemon/io.c
> +++ b/tools/console/daemon/io.c
> @@ -28,7 +28,7 @@
>  #include <stdlib.h>
>  #include <errno.h>
>  #include <string.h>
> -#include <sys/select.h>
> +#include <poll.h>
>  #include <fcntl.h>
>  #include <unistd.h>
>  #include <termios.h>
> @@ -928,9 +928,87 @@ static void handle_log_reload(void)
>         }
>  }
> 
> +
> +/* Should have at least max_fd + 1 elements */
> +static struct pollfd  *fds;
> +static struct pollfd **fd_to_pollfd;
> +static unsigned int current_array_size;
> +static unsigned int nr_fds;
> +
> +static void initialize_pollfd_arrays(void)
> +{
> +       fds = NULL;
> +       fd_to_pollfd = NULL;
> +       current_array_size = 0;
> +}

These can all be done as part of the definition of the variables.
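
i.e. something like (just a sketch):

        static struct pollfd  *fds = NULL;
        static struct pollfd **fd_to_pollfd = NULL;
        static unsigned int current_array_size = 0;
        static unsigned int nr_fds = 0;

(and since these are file-scope statics they are zero-initialised anyway, so
the explicit initialisers are arguably redundant.)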

> +
> +static void destroy_pollfd_arrays(void)
> +{
> +       free(fds);
> +       free(fd_to_pollfd);
> +       current_array_size = 0;
> +}

Having done the above I'd be inclined to inline this at the single
callsite.
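
i.e. at the (presumably single) call site just:

        free(fds);
        free(fd_to_pollfd);

(clearing current_array_size doesn't buy anything if this only happens on
teardown.)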

> +
> +#define ROUNDUP(_x,_w) (((unsigned long)(_x)+(1UL<<(_w))-1) & ~((1UL<<(_w))-1))
> +static void set_fds(int fd, short events)
> +{
> +       if (current_array_size < fd+1) {
> +               struct pollfd  *p1 = NULL;
> +               struct pollfd **p2 = NULL;
> +               unsigned long newsize;
> +
> +               /* Round up to 2^8 boundary, in practice this just
> +                * makes newsize larger than current_array_size.
> +                */
> +               newsize = ROUNDUP(fd+1, 8);
> +
> +               p1 = realloc(fds, sizeof(struct pollfd)*newsize);
> +               if (!p1)
> +                       goto fail;
> +               fds = p1;
> +
> +               p2 = realloc(fd_to_pollfd, sizeof(struct pollfd *)*newsize);
> +               if (!p2)
> +                       goto fail;
> +               fd_to_pollfd = p2;

Do these two arrays really need to be the same size?

I can see why fd_to_pollfd needs to be as big as the highest fd we are
polling but the fds array only actually needs to scale with the total
*number* of fds we are polling. Is it the case that the set of fds we
are waiting on is pretty densely packed?

Or maybe even if the fds are sparse it doesn't matter and this is
simpler at the cost of a little bit of extra memory usage?
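
If it is worth splitting them, I'd imagine something roughly like this
(sketch only; "fds_size" and "lookup_size" are made-up names replacing
current_array_size, and the failure handling / zeroing of the new tail is
omitted):

        /* fds[] only needs to grow with the number of registered fds... */
        if (nr_fds + 1 > fds_size) {
                fds_size = ROUNDUP(nr_fds + 1, 8);
                fds = realloc(fds, sizeof(struct pollfd) * fds_size);
        }

        /* ...while fd_to_pollfd[] has to cover the highest fd value seen. */
        if (fd + 1 > lookup_size) {
                lookup_size = ROUNDUP(fd + 1, 8);
                fd_to_pollfd = realloc(fd_to_pollfd,
                                       sizeof(struct pollfd *) * lookup_size);
        }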


> +
> +               memset(&fds[0] + current_array_size, 0,
> +                      sizeof(struct pollfd) * (newsize-current_array_size));
> +               memset(&fd_to_pollfd[0] + current_array_size, 0,
> +                      sizeof(struct pollfd *) * (newsize-current_array_size));
> +               current_array_size = newsize;
> +       }
> +
> +       fds[nr_fds].fd = fd;
> +       fds[nr_fds].events = events;
> +       fd_to_pollfd[fd] = &fds[nr_fds];
> +       nr_fds++;
> +
> +       return;
> +fail:
> +       dolog(LOG_ERR, "realloc failed, ignoring fd %d\n", fd);
> +       return;
> +}
[...]
> -               if (log_hv && FD_ISSET(xc_evtchn_fd(xce_handle), &readfds))
> +               if (log_hv && fd_revents(xc_evtchn_fd(xce_handle)) & POLLIN)
>                         handle_hv_logs();
> 
>                 if (ret <= 0)
>                         continue;
> 
> -               if (FD_ISSET(xs_fileno(xs), &readfds))
> +               if (fd_revents(xs_fileno(xs)) & POLLIN)

It seems like both this and the logging fd could be handled with a
struct pollfd * variable specifically for each. Which combined with...

>                         handle_xs();
> 
>                 for (d = dom_head; d; d = n) {
>                         n = d->next;
>                         if (d->event_count < RATE_LIMIT_ALLOWANCE) {
>                                 if (d->xce_handle != NULL &&
> -                                   FD_ISSET(xc_evtchn_fd(d->xce_handle),
> -                                            &readfds))
> +                                   fd_revents(xc_evtchn_fd(d->xce_handle)) &

... adding some struct pollfd pointers to struct domain (for this and
master_fd) would remove the need for the fd_to_pollfd lookup array
altogether.
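
Roughly (made-up field names, untested, and assuming the fds array isn't
reallocated between registering the fds and checking revents):

        struct domain {
                /* ... existing fields ... */
                struct pollfd *xce_pollfd;    /* slot for xc_evtchn_fd(d->xce_handle) */
                struct pollfd *master_pollfd; /* slot for d->master_fd */
        };

which set_fds() (or a variant that returns the slot it filled in) could
populate each iteration, so the checks become e.g.:

        if (d->xce_pollfd && (d->xce_pollfd->revents & POLLIN))
                handle_ring_read(d);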

> +                                   POLLIN)
>                                         handle_ring_read(d);
>                         }
> 
> -                       if (d->master_fd != -1 && FD_ISSET(d->master_fd,
> -                                                          &readfds))
> +                       if (d->master_fd != -1 &&
> +                           fd_revents(d->master_fd) & POLLIN)
>                                 handle_tty_read(d);
> 
> -                       if (d->master_fd != -1 && FD_ISSET(d->master_fd,
> -                                                          &writefds))
> +                       if (d->master_fd != -1 &&
> +                           fd_revents(d->master_fd) & POLLOUT)
>                                 handle_tty_write(d);
> 
>                         if (d->last_seen != enum_pass)
[...]

Ian.

