|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 08/10] libxl: Kill QEMU by uid when possible
On 12/12/18 4:17 PM, Ian Jackson wrote:
> George Dunlap writes ("[PATCH v2 08/10] libxl: Kill QEMU by uid when
> possible"):
>> The privcmd fd that a dm_restrict'ed QEMU has gives it permission to
>> one specific domain ID. This domain ID will probably eventually be
>> used again. It is therefore necessary to make absolutely sure that a
>> rogue QEMU process cannot hang around after its domain has exited.
>
> Thanks. Detailed comments mostly on error handling follow...
>
>> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
>> index f9e0bf6578..cd3208f4b8 100644
>> --- a/tools/libxl/libxl_internal.h
>> +++ b/tools/libxl/libxl_internal.h
>> @@ -1135,7 +1135,7 @@ typedef struct {
>> const char *shim_cmdline;
>> const char *pv_cmdline;
>>
>> - char *dm_runas;
>> + char *dm_runas, *dm_uid;
>
> I think `dm_uid' is misnamed. It is only set if the dm has a
> *dedicated* uid. If the dm is run as uid 0 or as a shared qemu uid,
> it is set to NULL.
dm_kill_uid? If not some suggestions would be helpful.
I'll add a comment here describing these as well.
>> @@ -148,8 +150,10 @@ static int libxl__domain_get_device_model_uid(libxl__gc
>> *gc,
>> LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
>> user);
>> rc = ERROR_INVAL;
>> - } else
>> + } else {
>> intended_uid = user_base->pw_uid;
>> + kill_by_uid = true;
>> + }
>>
>> goto out;
>> }
>
> I think your changes to the out block newly imply that all `goto out'
> with `rc=0' must also set kill_by_uid.
Yes; I didn't initialize kill_by_uid in the hopes (perhaps overly
optimistic) that the compiler would notice if there were paths where it
wasn't explicitly set and complain.
>> + /*
>> + * If we're starting the dm with a non-root UID, save the UID so
>> + * that we can reliably kill it and any subprocesses
>> + */
>> + if (state->dm_uid)
>> + libxl__xs_printf(gc, XBT_NULL,
>> + GCSPRINTF("%s/image/device-model-uid", dom_path),
>
> My comment about the misnamed libxl variable applies to
> device-model-uid too I think.
Ack
>> +#define PROPAGATE_RC if(!ddms->rc) ddms->rc = rc
> ^
>
> I like this macro. But (i) it needs proper macro hygiene - either a
> do{ }while(0) block, or refactoring into an expression (and then
> putting in parens). And (ii) you missed out a space.
[snip]
>> +#undef PROPAGATE_RC
>
> I am tempted to suggest replacing each call
> PROPAGATE_RC;
> with
> ACCUMULATE_RC(ddms);
> and put the definition in libxl_internal.h for use elsewhere.
I like that better. What about `ACCUMULATE_RC(ddms->rc)` instead? Then
the same macro could be used for a local variable.
[snip]
>> + /*
>> + * See if we should try to kill by uid
>> + */
>> + path = GCSPRINTF("/local/domain/%d/image/device-model-uid", domid);
>> + rc = libxl__xs_read_checked(gc, XBT_NULL, path, &dm_uid_str);
>> +
>> + /*
>> + * If there was an error here, accumulate the error and fall back
>> + * to killing by pid.
>> + */
>> + if (rc) {
>> + PROPAGATE_RC;
>> + LOGD(ERROR, domid, "Reading dm UID path failed for %s", path);
>> + }
>
> From the comment for libxl__xs_read_checked:
> | * On error, *result_out is undefined.
> Arguably this is a bear trap. Maybe you would like to fix it there
> rather than by setting dm_uid_str to 0 here.
Saying it's "undefined" is probably a bear trap. But you don't like the
way it's actually written -- i.e., that the pointer is only modified if
the value is successfully read?
>> + reaper_pid = libxl__ev_child_fork(gc, &ddms->destroyer,
>> + kill_device_model_uid_cb);
>> + if (reaper_pid < 0) {
>> + rc = ERROR_FAIL;
>> + PROPAGATE_RC;
>> + /*
>> + * Note that if this fails, we still don't kill by pid, to
>> + * make sure that an untrusted DM has not "maliciously"
>> + * exited (potentially causing us to kill an unrelated
>> + * process which happened to get the same pid).
>> + */
>> + goto out;
>> + }
>> +
>> + if (!reaper_pid) { /* child */
>> + rc = kill_device_model_uid_child(ddms, dm_uid_str);
>> + _exit(rc);
>> + }
>
> You cannot _exit(rc). See my comments below...
Is the 'not' attached to "rc" (i.e., the value must be positive), or
'_exit()' (i.e., you must call exit() rather than _exit())?
>
> Personally I like to put the child process block right after the fork,
> especially when (like here) it is very short because the meat has been
> lifted elsewhere. Just a suggestion; you can leave it here if you
> prefer.
OK -- I think this is long enough that I prefer it separate.
>
>> - /* We should try to destroy the device model anyway. */
>> - rc = kill_device_model(gc,
>> - GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
>> + /*
>> + * No uid to kill; attept to kill by pid.
>> + */
>> + LOGD(DEBUG, domid, "Didn't find dm UID; destroying by pid");
>> +
>> + path = GCSPRINTF("/local/domain/%d/image/device-model-pid", domid);
>> + rc = kill_device_model(gc, path);
>> +
>> + if (rc) {
>> + PROPAGATE_RC;
>> + LOGD(ERROR, domid, "Killing device model pid from path %s", path);
>> + }
>> +
>> +out:
>
> I would prefer the apparently-redundant:
>
> + rc = 0;
>> +out:
OK -- if I don't set rc initially it won't be redundant. :-)
>> + /*
>> + * NB that we always return '0' here for the "status of exited
>> + * process"; since there is no process, it always "succeeds".
>
> I had to read this three times to figure out what this meant. Perhaps
> I'm being dense but would you mind writing
>
> + * NB that we always pass '0' here for the "status of exited
> ^^^^
Ack
>> +static void kill_device_model_uid_cb(libxl__egc *egc,
>> + libxl__ev_child *destroyer,
>> + pid_t pid, int status)
>> +{
>> + libxl__destroy_devicemodel_state *ddms = CONTAINER_OF(destroyer, *ddms,
>> destroyer);
>> + STATE_AO_GC(ddms->ao);
>> +
>> + if (status) {
>> + int rc = ERROR_FAIL;;
>> +
>> + if (WIFEXITED(status))
>> + rc = WEXITSTATUS(status) - 128;
>
> This can't be right. Where does this 128 come from ?
Looks like I was trying to figure out how WEXITSTATUS worked from
looking at the child of devices_destroy_cb() put in _exit() and how
domain_destroy_domid_cb() interpreted it: Namely, I inferred that
_exit(-1) would get you WEXITSTATUS(status) == 127, and extrapolated
that -2 would get you 126, and so on.
> I don't see a comment anywhere about your encoding of the libxl error
> value in the exit status.
>
> A libxl error code does not necessarily fit in an exit status since an
> exit status is just one byte. Your protocol reserves the exit status
> 0 for success. The C implementation typically reserves -1 (255) and
> sometimes also 127. By convention the exit status is normally
> regarded as positive and libxl error codes are negative.
>
> I suggest one of the following strategies:
>
> - Give up on the idea of distinguishing these error codes at all and
> simply _exit(!!rc). (After all the real error is logged and the
> function only ever returns FAIL.)
>
> - Say that the child function may only return one of a limited
> subset of libxl error codes (since only FAIL is currently needed),
> and assert that -125 <= rc <= -1, and _exit(-rc). Then the exit
> status can be recovered with -WEXITSTATUS(status).
For some reason, I really don't want to drop the exit code. If you'd
rather I do #1 I will, but given the choice I'd go with #2.
Thanks,
-George
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |