Xen project Mailing List

Re: [PATCH] libxl: Retry QMP PCI device_add

From: Anthony PERARD <anthony.perard@xxxxxxxxxx>

Date: Fri, 8 Apr 2022 15:56:24 +0100

Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>

Delivery-date: Fri, 08 Apr 2022 14:56:53 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, Mar 30, 2022 at 03:46:56PM -0400, Jason Andryuk wrote:
> PCI device assignment to an HVM with stubdom is potentially racy. First
> the PCI device is assigned to the stubdom via the PV PCI protocol. Then
> QEMU is sent a QMP command to attach the PCI device to QEMU running
> within the stubdom. However, the sysfs entries within the stubdom may
> not have appeared by the time QEMU receives the device_add command
> resulting in errors like:
>
> libxl_qmp.c:1838:qmp_ev_parse_error_messages:Domain 10:Could not open
> '/sys/bus/pci/devices/0000:00:1f.3/config': No such file or directory
>
> This patch retries the device assignment up to 10 times with a 1 second
> delay between. That roughly matches the overall hotplug timeout.
>
> The qmp_ev_parse_error_messages error is still printed since it happens
> at a lower level than the pci code controlling the retries. With that,
> the "Retrying PCI add %d" message is also printed at ERROR level to
> clarify what is happening.
>
> Signed-off-by: Jason Andryuk <jandryuk@xxxxxxxxx>
> ---
> @@ -1252,10 +1258,22 @@ static void pci_add_qmp_device_add_cb(libxl__egc *egc,
>                                        const libxl__json_object *response,
>                                        int rc)
>  {
> -    EGC_GC;
>      pci_add_state *pas = CONTAINER_OF(qmp, *pas, qmp);
> +    STATE_AO_GC(pas->aodev->ao);

I think this change is wrong. What is the reason to use STATE_AO_GC
instead of EGC_GC? In this case it should be fine to keep using EGC_GC,
as there doesn't seem to be any allocation that needs to live after this
function returns. And you could just pass `pas->aodev->ao` to
libxl__ev_time_register_rel().

>
> -    if (rc) goto out;
> +    if (rc) {
> +        if (pas->retries++ < 10) {
> +            LOGD(ERROR, qmp->domid, "Retrying PCI add %d", pas->retries);
> +            rc = libxl__ev_time_register_rel(ao, &pas->timeout_retries,
> +                                             pci_add_qmp_device_add_retry,
> +                                             1000);
> +            if (rc) goto out;
> +
> +            return; /* wait for the timeout to then retry */
> +        } else {
> +            goto out;
> +        }
> +    }

So this retry logic would run regardless of whether a stubdom is in use
or not. It's not going to be useful in the non-stubdom case, is it?

When adding a PCI device to a domain that has its device model in a
stubdomain, there's a first call to do_pci_add() which works fine,
right? Then there's a callback to device_pci_add_stubdom_wait(), which
is supposed to wait for the stubdom to be ready, but the sysfs entry
might still be missing at that time, right? Then there is a second call
to do_pci_add() for the guest, and that's the one that fails in your
case, right?

If my reasoning is correct, could we enable the retry logic only for the
second do_pci_add() call? That way guests without a stubdom aren't
impacted, as I don't think retrying would be useful in that case; it
would just delay the error.

Cheers,

-- 
Anthony PERARD
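
To illustrate the suggestion of enabling the retry only for the second
do_pci_add() call, here is a minimal standalone sketch of the decision
logic. It is not libxl code: the struct pci_add_state_sketch, the
retry_on_error flag and the pci_add_next_step() helper are all
hypothetical names made up for this example. Only the limit of 10
retries comes from the patch. The assumption is that the caller would
set retry_on_error just before the second do_pci_add() call in the
stubdom case and leave it unset otherwise.

```c
#include <stdbool.h>

/* Retry limit taken from the patch under review. */
#define PCI_ADD_MAX_RETRIES 10

/* Hypothetical, trimmed-down stand-in for libxl's pci_add_state. */
struct pci_add_state_sketch {
    int retries;          /* retries attempted so far */
    bool retry_on_error;  /* hypothetical flag: set only for the second
                           * do_pci_add() call when a stubdom is in use */
};

/* Hypothetical outcome of handling a device_add response. */
enum pci_add_step {
    PCI_ADD_DONE,   /* device_add succeeded */
    PCI_ADD_RETRY,  /* arm the timer and try again later */
    PCI_ADD_FAIL    /* report the error right away */
};

/*
 * Decide what to do with the return code of a QMP device_add.
 * Without the flag, an error is reported immediately, so guests
 * without a stubdom see no extra delay.
 */
enum pci_add_step pci_add_next_step(struct pci_add_state_sketch *pas, int rc)
{
    if (!rc)
        return PCI_ADD_DONE;

    if (pas->retry_on_error && pas->retries < PCI_ADD_MAX_RETRIES) {
        pas->retries++;
        return PCI_ADD_RETRY;
    }

    return PCI_ADD_FAIL;
}
```

In the real callback, the PCI_ADD_RETRY case would correspond to
registering the timeout with libxl__ev_time_register_rel() and
returning, and PCI_ADD_FAIL to the existing "goto out" path.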
