
Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream



On 16/06/15 15:31, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> From: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
>>
>> Signed-off-by: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> CC: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
>> CC: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
>> CC: Wei Liu <wei.liu2@xxxxxxxxxx>
> Overall looks good, I've got some comments below and I think it almost
> certainly wants eyes from Ian who knows more about the dc infra etc.
>
>> +void libxl__stream_read_start(libxl__egc *egc,
>> +                              libxl__stream_read_state *stream)
>> +{
>> +    libxl__datacopier_state *dc = &stream->dc;
>> +    int ret = 0;
>> +
>> +    /* State initialisation. */
>> +    assert(!stream->running);
>> +
>> +    memset(dc, 0, sizeof(*dc));
> libxl__datacopier_init, please

That call is made by libxl__datacopier_start() each and every time, and
unlike here, is matched with an equivalent _kill() call.

>
>> +    dc->ao = stream->ao;
>> +    dc->readfd = stream->fd;
>> +    dc->writefd = -1;
>> +
>> +    /* Start reading the stream header. */
>> +    dc->readwhat = "stream header";
>> +    dc->readbuf = &stream->hdr;
>> +    stream->expected_len = dc->bytes_to_read = sizeof(stream->hdr);
>> +    dc->used = 0;
>> +    dc->callback = stream_header_done;
> This pattern of resetting and reinitialising the dc occurs in multiple
> places, I think a helper would be in order, some sort of
> stream_next_record_init or something perhaps?

The only feasible helper would have to take everything as parameters; 
there is insufficient similarity between all users.

I dunno whether that would be harder to read...
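
For illustration only, a fully parameterised helper might look roughly
like the sketch below.  The struct and function names (sketch_dc,
sketch_stream, sketch_setup_read) are hypothetical stand-ins carrying
only the fields touched in the patch; they are not the libxl definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the libxl types; only the fields used here. */
struct sketch_dc;
typedef void sketch_dc_callback(struct sketch_dc *dc,
                                int onwrite, int errnoval);

struct sketch_dc {
    int readfd, writefd;
    const char *readwhat;
    void *readbuf;
    size_t bytes_to_read, used;
    sketch_dc_callback *callback;
};

struct sketch_stream {
    int fd;
    size_t expected_len;
    struct sketch_dc dc;
};

/* Hypothetical helper: every per-caller difference becomes a parameter. */
static void sketch_setup_read(struct sketch_stream *stream,
                              const char *what, void *buf, size_t len,
                              sketch_dc_callback *cb)
{
    struct sketch_dc *dc = &stream->dc;

    /* Reset the copier, then fill in the fields common to all readers. */
    memset(dc, 0, sizeof(*dc));
    dc->readfd = stream->fd;
    dc->writefd = -1;

    /* Caller-specific parts. */
    dc->readwhat = what;
    dc->readbuf = buf;
    stream->expected_len = dc->bytes_to_read = len;
    dc->used = 0;
    dc->callback = cb;
}
```

With five arguments per call, each call site is about as long as the
open-coded assignments it replaces, which is the readability concern.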

>
>> +void libxl__stream_read_abort(libxl__egc *egc,
>> +                              libxl__stream_read_state *stream, int rc)
>> +{
>> +    stream_failed(egc, stream, rc);
>> +}
>> +
>> +static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream)
>> +{
>> +    stream->rc = 0;
>> +    stream->running = false;
>> +
>> +    stream_done(egc, stream);
> Push the running = false into stream_done and flip the assert there?
> Logically the stream is still running until it is done, so having done
> assert it isn't running seems counter-intuitive.

This is more for peace of mind.  stream_done() must strictly only ever be
called once, hence its assert.

>
>> +static void stream_done(libxl__egc *egc,
>> +                        libxl__stream_read_state *stream)
>> +{
>> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
>> +
>> +    assert(!stream->running);
>> +
>> +    stream->completion_callback(egc, dcs, stream->rc);
>> +}
>> +
>> +static void stream_header_done(libxl__egc *egc,
>> +                               libxl__datacopier_state *dc,
>> +                               int onwrite, int errnoval)
>> +{
>> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
>> +    libxl_sr_hdr *hdr = &stream->hdr;
>> +    STATE_AO_GC(dc->ao);
>> +    int ret = 0;
>> +
>> +    if (onwrite || dc->used != stream->expected_len) {
>> +        ret = ERROR_FAIL;
>> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
>> +            onwrite, errnoval, stream->expected_len, dc->used);
>> +        goto err;
>> +    }
> I think you need to check errnoval == 0 in the !onwrite case, otherwise
> you may miss a read error?

"dc->used != stream->expected_len" covers all possible read errors, in
the "something went wrong" kind of way.

>
> Also it looks like onwrite can be -1, which is a separate error case.
>
>> +
>> +static void record_header_done(libxl__egc *egc,
>> +                               libxl__datacopier_state *dc,
>> +                               int onwrite, int errnoval)
>> +{
>> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
>> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
>> +    STATE_AO_GC(dc->ao);
>> +    int ret = 0;
>> +
>> +    if (onwrite || dc->used != stream->expected_len) {
>> +        ret = ERROR_FAIL;
>> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
>> +            onwrite, errnoval, stream->expected_len, dc->used);
>> +        goto err;
>> +    }
> Same comments wrt the arguments as the previous one.
>
> Maybe a common helper to check (and log) the status at the head of each
> callback? So you can effectively do if (!everything_ok(stream, dc)) goto
> err?

I will see what I can do.
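
One possible shape for such a check, as a sketch under assumptions: the
types and the name sketch_check_read are hypothetical, SKETCH_ERROR_FAIL
stands in for ERROR_FAIL, and fprintf stands in for LOG().  It tests
errnoval as well, which is redundant if the length comparison already
catches every read failure, but is cheap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_ERROR_FAIL (-3)  /* stand-in for libxl's ERROR_FAIL */

/* Hypothetical stand-ins; only the fields the check looks at. */
struct sketch_dc { size_t used; };
struct sketch_stream { size_t expected_len; struct sketch_dc dc; };

/*
 * Returns 0 if the read completed as expected, SKETCH_ERROR_FAIL otherwise.
 * Any non-zero onwrite (including -1) is treated as a failure.
 */
static int sketch_check_read(const struct sketch_stream *stream,
                             int onwrite, int errnoval)
{
    const struct sketch_dc *dc = &stream->dc;

    if (onwrite || errnoval || dc->used != stream->expected_len) {
        fprintf(stderr, "write %d, err %d, expected %zu, got %zu\n",
                onwrite, errnoval, stream->expected_len, dc->used);
        return SKETCH_ERROR_FAIL;
    }

    return 0;
}
```

Each callback would then start with something like
"ret = sketch_check_read(stream, onwrite, errnoval); if (ret) goto err;".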

>
>> +    assert(!ret);
>> +    if (rec_hdr->length) {
>> +        free(stream->rec_body);
>> +        stream->rec_body = NULL;
> reset length too?
>
>> +static void read_emulator_body(libxl__egc *egc,
>> +                               libxl__stream_read_state *stream)
>> +{
>> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
>> +    libxl__datacopier_state *dc = &stream->dc;
>> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
>> +    libxl_sr_emulator_hdr *emu_hdr = stream->rec_body;
>> +    STATE_AO_GC(stream->ao);
>> +    char path[256];
>> +    int ret = 0;
>> +
>> +    sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u", dcs->guest_domid);
>> +
>> +    dc->readwhat = "save/migration stream";
>> +    dc->copywhat = "emulator context";
>> +    dc->writewhat = "qemu save file";
>> +    dc->readbuf = NULL;
>> +    dc->writefd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
> Since this is all done in the same process (or children of it) with
> no setuid etc, I think 0600 would be better to avoid accidentally
> leaving the save state world readable (just in case it matters).

Probably best.
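
A minimal sketch of the suggested change, using plain POSIX calls.  The
function name open_private_save_file is hypothetical, and this does not
address the carefd question.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open (creating or truncating) a save file with owner-only permissions.
 * The mode argument only applies when the file is created; an existing
 * file keeps whatever permissions it already had.
 */
static int open_private_save_file(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
}
```

Because the mode is ignored for a pre-existing file, a stale save file
created with 0666 would stay world readable unless it is removed first.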

>
> Also, should consider whether this fd needs to be subject to the carefd
> machinery.

Probably does.

>
> Sharing the dc between all these differing usages is starting to rankle a
> little, but I think it is necessary because it may have queued data from
> a previous read which was larger than the current record, correct?
>
> Hrm, isn't setting dc->used = 0 on each reset potentially throwing some
> stuff away?

We should never be in a case where we are setting up a new read/write
from the dc with any previous IO pending.

>
>> +    if (dc->writefd == -1) {
>> +        ret = ERROR_FAIL;
>> +        LOGE(ERROR, "Unable to open '%s'", path);
>> +        goto err;
>> +    }
>> +    dc->maxsz = dc->bytes_to_read = rec_hdr->length - sizeof(*emu_hdr);
>> +    stream->expected_len = dc->used = 0;
> expecting 0? This differs from the pattern common everywhere else and
> I'm not sure why.

The datacopier has been overloaded so many times, it is messy to use.

In this case, we are splicing from a read fd to a write fd, rather than to
a local buffer.

Therefore, when the IO is complete, we expect 0 bytes in the local
buffer, as all data should end up in the fd.

>
>> +    dc->callback = emulator_body_done;
>> +
>> +    ret = libxl__datacopier_start(dc);
>> +    if (ret)
>> +        goto err;
>> +    return;
>> +
>> + err:
>> +    assert(ret);
>> +    stream_failed(egc, stream, ret);
>> +}
>> +
>> +static void emulator_body_done(libxl__egc *egc,
>> +                               libxl__datacopier_state *dc,
>> +                               int onwrite, int errnoval)
>> +{
>> +    /* Safe to be static, as it is a write-only discard buffer. */
>> +    static char padding[1U << REC_ALIGN_ORDER];
>> +
>> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
>> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
>> +    STATE_AO_GC(dc->ao);
>> +    unsigned int nr_padding_bytes = (1U << REC_ALIGN_ORDER);
>> +    int ret = 0;
>> +
>> +    if (onwrite || dc->used != stream->expected_len) {
>> +        ret = ERROR_FAIL;
>> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
>> +            onwrite, errnoval, stream->expected_len, dc->used);
>> +        goto err;
>> +    }
>> +
>> +    /* Undo modifications for splicing the emulator context. */
> Hrm, not so much undo as nuke and rebuild. Is that really necessary,
> can't you just reset what you need to in the inverse of the other thing?
>
> If there isn't a problem with buffered stuff on callback, then perhaps
> it would be clearer to use a separate dc, at least for the qemu side. Or
> to _always_ teardown and restart the dc from scratch instead of doing it
> partially in some places and fully in others.
>
>
>> +    memset(dc, 0, sizeof(*dc));
>> +    dc->ao = stream->ao;
>> +    dc->readfd = stream->fd;
>> +    dc->writefd = -1;
>> +
>> +    /* Do we need to eat some padding out of the stream? */
> Why only now and not for e.g. the xenstore stuff (which doesn't appear
> to be explicitly padded).

Any record which is read into a local buffer has the local buffer
aligned up, and the padding read onto the end.

>
> And given that why not handle this in some central place rather than in
> the emulator only place?

Experimentally, some versions of Qemu barf if they have trailing zeros
in the save file.  I think they expect to find EOF on a qemu record boundary.
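
The effect can be sketched with plain read()/write() rather than the
datacopier: copy exactly the body to the output fd, then consume the
padding from the stream without writing it.  All names here are
hypothetical, the 8-byte alignment is an assumption, and for simplicity
the padding is computed from the body length alone.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

#define SKETCH_REC_ALIGN_ORDER 3U  /* assumed: 8-byte record alignment */

/* Read exactly len bytes; 0 on success, -1 on error or early EOF. */
static int sketch_read_exact(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len) {
        ssize_t n = read(fd, p, len);

        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes; 0 on success, -1 on error. */
static int sketch_write_exact(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len) {
        ssize_t n = write(fd, p, len);

        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copy body_len bytes from infd to outfd, then discard the padding. */
static int sketch_copy_body_drop_padding(int infd, int outfd,
                                         size_t body_len)
{
    char buf[4096];
    size_t align = (size_t)1 << SKETCH_REC_ALIGN_ORDER;
    size_t pad = (align - (body_len & (align - 1))) & (align - 1);

    while (body_len) {
        size_t chunk = body_len < sizeof(buf) ? body_len : sizeof(buf);

        if (sketch_read_exact(infd, buf, chunk) ||
            sketch_write_exact(outfd, buf, chunk))
            return -1;
        body_len -= chunk;
    }

    /* The padding is read from the stream but never reaches outfd. */
    return pad ? sketch_read_exact(infd, buf, pad) : 0;
}
```

The output file then ends exactly where the body ends, while the input
stream is left positioned at the start of the next record.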

~Andrew
