|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] libxl: Retry QMP PCI device_add
On Wed, Mar 30, 2022 at 03:46:56PM -0400, Jason Andryuk wrote:
> PCI device assignment to an HVM with stubdom is potentially racy. First
> the PCI device is assigned to the stubdom via the PV PCI protocol. Then
> QEMU is sent a QMP command to attach the PCI device to QEMU running
> within the stubdom. However, the sysfs entries within the stubdom may
> not have appeared by the time QEMU receives the device_add command
> resulting in errors like:
>
> libxl_qmp.c:1838:qmp_ev_parse_error_messages:Domain 10:Could not open
> '/sys/bus/pci/devices/0000:00:1f.3/config': No such file or directory
>
> This patch retries the device assignment up to 10 times with a 1 second
> delay between. That roughly matches the overall hotplug timeout.
>
> The qmp_ev_parse_error_messages error is still printed since it happens
> at a lower level than the pci code controlling the retries. With that,
> the "Retrying PCI add %d" message is also printed at ERROR level to
> clarify what is happening.
>
> Signed-off-by: Jason Andryuk <jandryuk@xxxxxxxxx>
> ---
> @@ -1252,10 +1258,22 @@ static void pci_add_qmp_device_add_cb(libxl__egc *egc,
> const libxl__json_object *response,
> int rc)
> {
> - EGC_GC;
> pci_add_state *pas = CONTAINER_OF(qmp, *pas, qmp);
> + STATE_AO_GC(pas->aodev->ao);
I think this changes are wrong, what is the reason to use STATE_AO_GC
instead of EGC_GC?
I think in this case, it is fine to keep using EGC_GC, as there doesn't
seems to be any allocation that needs to live after this function
returns. And you could just pass `pas->aodev->ao` to
libxl__ev_time_register_rel().
>
> - if (rc) goto out;
> + if (rc) {
> + if (pas->retries++ < 10) {
> + LOGD(ERROR, qmp->domid, "Retrying PCI add %d", pas->retries);
> + rc = libxl__ev_time_register_rel(ao, &pas->timeout_retries,
> + pci_add_qmp_device_add_retry,
> + 1000);
> + if (rc) goto out;
> +
> + return; /* wait for the timeout to then retry */
> + } else {
> + goto out;
> + }
> + }
So this retry logic would be done regardless of whether stubdom is in
use or not. It's not going to be useful in the non-stubdom case, is it?
When adding a pci device to a domain that has its device model in a
stubdomain, there's a first call to do_pci_add() which works fine,
right? Then there's a callback to device_pci_add_stubdom_wait(), which
is supposed to wait for the stubdom to be ready, but the sysfs entry
might still be missing at that time, right? Then, there is a second call
to do_pci_add() for the guest, and it's the one that fail in your case,
right?
If my reasoning is correct, could we enable the retry logic only for the
second do_pci_add() call? So that guests without stubdom aren't impacted
as I don't think retrying in this case would be useful and would just
delay the error.
Cheers,
--
Anthony PERARD
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |