[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 09/20] libxl: Crash (more sensibly) on malloc failure



On Fri, 2012-03-16 at 16:26 +0000, Ian Jackson wrote:
> Formally change the libxl memory allocation failure policy to "crash".
> 
> Previously we had a very uneven approach; much code assumed that
> libxl__sprintf (for example) would never return NULL, but some code
> was written more carefully.
> 
> We think it is unlikely that we will be able to make the library
> actually robust against allocation failure (since that would be an
> awful lot of never-tested error paths) and few calling environments
> will be able to cope anyway.  So, instead, adopt the alternative
> approach: provide allocation functions which never return null, but
> will crash the whole process instead.
> 
> Consequently,
>  - New noreturn function libxl__alloc_failed which may be used for
>    printing a vaguely-useful error message, rather than simply
>    dereferencing a null pointer.

I guess the log infrastructure does (or could do) memory allocation and
so can't be used here in the normal way.

Perhaps libxl could format a (short) message into a static emergency
buffer and use a (new, optional) emergency variant of the logger
callback which does not accept a format string, which should maximise
the chances of writing to the log? Even if we could just get the message
"Out of memory, aborting" into the logs that would be useful.

Alternatively perhaps an application could be allowed to provide a
function which is called first in libxl__alloc_failed, the application
ought should be in a better position to fprintf to the appropriate fd to
log the issue somewhere useful.

My concern is that many users of libxl will not necessarily be capturing
stderr and so the appearance will be of random unexplained exiting.

>  - libxl__ptr_add now returns void as it crashes on failure.
>  - libxl__zalloc, _calloc, _strdup, _strndup, crash on failure using
>    libxl__alloc_failed.  So all the code that uses these can no longer
>    dereference null on malloc failure.

I took a look for a gcc __attribute__ which means "cannot return NULL"
but sadly there doesn't seem to be one, this would have allowed gcc to
warn us about (now) pointless error handling. I don't know if gcc is
smart enough to catch this if we make each of these a macro which did
        if (!res) abort()
(or something similar) after calling the inner-function which does the
work, I don't think it's worth the effort anyway.

> diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
> index 12c32dc..dfa2153 100644
> --- a/tools/libxl/libxl_internal.c
> +++ b/tools/libxl/libxl_internal.c
> @@ -17,40 +17,40 @@
>  
>  #include "libxl_internal.h"
>  
> -int libxl__error_set(libxl__gc *gc, int code)
> -{
> -    return 0;
> +void libxl__alloc_failed(const char *func, size_t nmemb, size_t size) {
> +    fprintf(stderr,
> +            "libxl: FATAL ERROR: memory allocation failure (%s, %lu x 
> %lu)\n",
> +            func, (unsigned long)nmemb, (unsigned long)size);
> +    fflush(stderr);
> +    _exit(-1);
>  }
>  
> -int libxl__ptr_add(libxl__gc *gc, void *ptr)
> +void libxl__ptr_add(libxl__gc *gc, void *ptr)
>  {
>      int i;
> -    void **re;
>  
>      if (!ptr)
> -        return 0;
> +        return;
>  
>      /* fast case: we have space in the array for storing the pointer */
>      for (i = 0; i < gc->alloc_maxsize; i++) {
>          if (!gc->alloc_ptrs[i]) {
>              gc->alloc_ptrs[i] = ptr;
> -            return 0;
> +            return;
>          }
>      }
> -    /* realloc alloc_ptrs manually with calloc/free/replace */
> -    re = calloc(gc->alloc_maxsize + 25, sizeof(void *));
> -    if (!re)
> -        return -1;
> -    for (i = 0; i < gc->alloc_maxsize; i++)
> -        re[i] = gc->alloc_ptrs[i];
> -    /* assign the next pointer */
> -    re[i] = ptr;
> +    int new_maxsize = gc->alloc_maxsize * 2 + 25;
> +    assert(new_maxsize < INT_MAX / sizeof(void*) / 2);

This is not exactly the same as ENOMEM but there is an argument for
treating it similarly rather than asserting? I suppose the counter
argument is that the cap on max size is so huge that we "assert" it can
never be reached in normal use?

[...]

> @@ -126,16 +118,13 @@ char *libxl__sprintf(libxl__gc *gc, const char *fmt, 
> ...)
>      ret = vsnprintf(NULL, 0, fmt, ap);
>      va_end(ap);
>  
> -    if (ret < 0) {
> -        return NULL;
> -    }
> +    assert(ret >= 0);

If ret is -ve then it can be ENOMEM in which case we should call the
emergency function?

Oh, it can't be ENOMEM because you called with NULL. I guess all the
other possible returns (EILSEQ, EINVAL, EOVERFLOW) are abort-worthy.

>  
>      s = libxl__zalloc(gc, ret + 1);
> -    if (s) {
> -        va_start(ap, fmt);
> -        ret = vsnprintf(s, ret + 1, fmt, ap);
> -        va_end(ap);
> -    }
> +    va_start(ap, fmt);
> +    ret = vsnprintf(s, ret + 1, fmt, ap);

No check on ret here? In this case I think it can be ENOMEM.

> +    va_end(ap);
> +
>      return s;
>  }
>  




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.