[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 08/28] libxc: ocaml: add simple binding for xentoollog (output only).



On 11/04/13 12:31, Ian Campbell wrote:
On Fri, 2013-04-05 at 15:04 +0100, Rob Hoes wrote:
I think we should avoid returning "bare pointers" to the OCaml heap for two
reasons:

malloc is the "C" runtime heap, is that the same as the ocaml heap?
(From your description I understand the distinction isn't relevant in
this context, but I'm curious).

The OCaml heap is a set of blocks acquired via malloc(), so you could say that the OCaml heap is a subset of the C heap :-) I guess in a long-running program you'd end up with some interleaving of <ocaml heap block>; <normal C data> throughout application memory.

To make the GC work, the OCaml heap has to have enough structure for pointers to other OCaml values to be found. The current OCaml convention is that a "value" is a pointer if its least significant bit = 0 (the Val_int macro sets the bit to 1 to mark a value as a plain integer) and the pointer must point to somewhere within a previously-allocated OCaml heap block. This last check is what makes it possible to stuff the normal result of malloc() directly into an OCaml value and have it be ignored by the GC... for a while :-)

It's also worth knowing that every heap block has a tag and the tags are divided into two groups: one for blocks which should be introspected and assumed to contain arrays of OCaml values (tuples, records, arrays etc), and one group for blocks which should be ignored (and which are safe for us to stick our stuff in). For example an OCaml string is a block with a 'String_tag' and won't be examined further by the GC.

in ocaml/byterun/mlvalues.h:

  /* The lowest tag for blocks containing no value. */
  #define No_scan_tag 251

  /* Strings. */
  #define String_tag 252

  /* Arrays of floating-point numbers. */
  #define Double_array_tag 254
  ...


[...]

I see, thanks for pointing that out.

Instead of returning a "bare pointer" I think we should use a "Custom"
value. This involves declaring a "struct custom_operations" like this:

    static struct custom_operations foo_custom_operations = {
       "foo_custom_operations",
       custom_finalize_default,
       custom_compare_default,
       custom_hash_default,
       custom_serialize_default,
       custom_deserialize_default
    };

And then wrapping and unwrapping "Custom" blocks using something like:

    #define Foo_val(x) (*((struct foo *)Data_custom_val(x)))

    static value
    Val_foo (struct foo *x)
    {
      CAMLparam0 ();
      CAMLlocal1 (result);
      result = caml_alloc_custom (&foo_custom_operations,
                                 sizeof (struct foo*), 0, 1);
      Foo_val (result) = x;
      CAMLreturn (result);
   }

I'll update all occurrences of this pattern. The same think happens
for the libxl context as well.

And probably the libxc context too in that set of bindings?

There's also a malloc in stub_xc_domain_get_pfn_list but that is free'd
in the same C function, which I guess is not subject to this issue?

It looks ok to me -- the result of the malloc is not being stored in an OCaml "value". The OCaml values look like they're being created properly (caml_copy_nativeint) and stored in a place the GC can see (CAMLlocal2 and Store_field):

        CAMLlocal2(array, v);
        ...
        c_array = malloc(sizeof(uint64_t) * c_nr_pfns);
        ...
        array = caml_alloc(ret, 0);
        for (i = 0; i < ret; i++) {
                v = caml_copy_nativeint(c_array[i]);
                Store_field(array, i, v);
        }
        free(c_array);


BTW when looking I found the mmap library uses:
         result = caml_alloc(sizeof(struct mmap_interface), Abstract_tag);

Is that valid? If so then it avoids all the custom operations stuff,
although if you want the finalizer callback then this is worthwhile
anyway.

I think so... Looking in in ocaml/byterun/mlvalues.h:

  /* Abstract things.  Their contents is not traced by the GC;
     therefore they must not contain any [value]. */
  #define Abstract_tag 251


  /* Custom blocks.  They contain a pointer to a "method suite"
     of functions (for finalization, comparison, hashing, etc)
     followed by raw data.  The contents of custom blocks is not traced
     by the GC; therefore, they must not contain any [value].
     See [custom.h] for operations on method suites. */
  #define Custom_tag 255

So it looks like both Abstract_tag and Custom_tag shield the data within from the GC. So it should be safe to store anything which isn't an OCaml "value", including the raw result of a malloc(). If we stored an OCaml "value" in there it would be hidden from the GC and probably prematurely deallocated.

As for which option we should pick, as you say I think it just depends whether we want the finalizer.


Also in the same bit of the mmap library the mmap(2) result is stored as
a bare pointer inside that "result" from above, is that visible to the
GC and therefore also dangerous?

I think it's ok because

   result = caml_alloc(sizeof(struct mmap_interface), Abstract_tag);

-- this creates a block on the OCaml heap where the contents will be ignored by the GC ("not traced by the GC" above). If we created a different type of block then it would probably be an immediate disaster since the GC would look inside and see something different to what it expects: it expects a simple array of "value"s and we've stuck a custom struct in there.

Does that make sense?

Cheers,
Dave

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.