
Re: [PATCH v2 RESEND] xen: Fix SEGV on domain disconnect



On 20/04/2023 12:02, mark.syms@xxxxxxxxxx wrote:
From: Mark Syms <mark.syms@xxxxxxxxxx>

Ensure the PV ring is drained on disconnect. Also ensure all pending
AIO is complete, otherwise AIO tries to complete into a mapping of the
ring which has been torn down.

Signed-off-by: Mark Syms <mark.syms@xxxxxxxxxx>
---
CC: Stefano Stabellini <sstabellini@xxxxxxxxxx>
CC: Anthony Perard <anthony.perard@xxxxxxxxxx>
CC: Paul Durrant <paul@xxxxxxx>
CC: xen-devel@xxxxxxxxxxxxxxxxxxxx

v2:
  * Ensure all inflight requests are completed before teardown
  * RESEND to fix formatting
---
  hw/block/dataplane/xen-block.c | 31 +++++++++++++++++++++++++------
  1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 734da42ea7..d9da4090bf 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -523,6 +523,10 @@ static bool xen_block_handle_requests(XenBlockDataPlane *dataplane)
 
     dataplane->more_work = 0;
 
+    if (dataplane->sring == 0) {
+        return done_something;
+    }
+

I think you could just return false here... Nothing is ever going to be done if there's no ring :-)
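
For reference, a minimal sketch of that early return (using the idiomatic NULL comparison for the pointer check):

    if (dataplane->sring == NULL) {
        /* No shared ring is mapped, so there is nothing to consume
         * and nothing can have been done. */
        return false;
    }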

     rc = dataplane->rings.common.req_cons;
     rp = dataplane->rings.common.sring->req_prod;
     xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
@@ -666,14 +670,35 @@ void xen_block_dataplane_destroy(XenBlockDataPlane *dataplane)
 void xen_block_dataplane_stop(XenBlockDataPlane *dataplane)
 {
     XenDevice *xendev;
+    XenBlockRequest *request, *next;
 
     if (!dataplane) {
         return;
     }
 
+    /* We're about to drain the ring. We can cancel the scheduling of any
+     * bottom half now */
+    qemu_bh_cancel(dataplane->bh);
+
+    /* Ensure we have drained the ring */
+    aio_context_acquire(dataplane->ctx);
+    do {
+        xen_block_handle_requests(dataplane);
+    } while (dataplane->more_work);
+    aio_context_release(dataplane->ctx);
+

I don't think we want to be taking new requests, do we?
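
If consuming fresh ring entries during teardown is the concern, one alternative (a sketch only, assuming requests_inflight is kept accurate by the completion path) would be to wait for the already-submitted requests without calling xen_block_handle_requests():

    aio_context_acquire(dataplane->ctx);
    /* Sketch: poll the dataplane context until requests already
     * submitted to the backend complete, without consuming any new
     * entries from the shared ring. */
    while (dataplane->requests_inflight > 0) {
        aio_poll(dataplane->ctx, true);
    }
    aio_context_release(dataplane->ctx);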

+    /* Now ensure that all inflight requests are complete */
+    while (!QLIST_EMPTY(&dataplane->inflight)) {
+        QLIST_FOREACH_SAFE(request, &dataplane->inflight, list, next) {
+            blk_aio_flush(request->dataplane->blk, xen_block_complete_aio,
+                        request);
+        }
+    }
+

I think this could possibly be simplified by doing the drain after the call to blk_set_aio_context(), as long as we set dataplane->ctx to qemu_get_aio_context(). Also, as long as more_work is not set, it should still be safe to cancel the bh before the drain AFAICT.
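
For illustration, that reordering might look roughly like the sketch below (untested; it assumes blk_drain() on the main-loop context is enough to complete the outstanding AIO):

    /* Sketch: cancel the bh first (safe while more_work is not set),
     * move the BlockBackend back to the main loop, then drain there. */
    qemu_bh_cancel(dataplane->bh);

    aio_context_acquire(dataplane->ctx);
    blk_set_aio_context(dataplane->blk, qemu_get_aio_context(), &error_abort);
    aio_context_release(dataplane->ctx);

    dataplane->ctx = qemu_get_aio_context();
    blk_drain(dataplane->blk);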

  Paul

     xendev = dataplane->xendev;
 
     aio_context_acquire(dataplane->ctx);
+
      if (dataplane->event_channel) {
          /* Only reason for failure is a NULL channel */
          xen_device_set_event_channel_context(xendev, dataplane->event_channel,
@@ -684,12 +709,6 @@ void xen_block_dataplane_stop(XenBlockDataPlane *dataplane)
      blk_set_aio_context(dataplane->blk, qemu_get_aio_context(), &error_abort);
      aio_context_release(dataplane->ctx);
 
-    /*
-     * Now that the context has been moved onto the main thread, cancel
-     * further processing.
-     */
-    qemu_bh_cancel(dataplane->bh);
-
      if (dataplane->event_channel) {
          Error *local_err = NULL;

