Re: [PATCH v3 13/20] block/export: rewrite vduse-blk drain code
On Wed, Apr 26, 2023 at 12:43 AM Stefan Hajnoczi <stefanha@xxxxxxxxxx> wrote:
>
> On Fri, Apr 21, 2023 at 11:36:02AM +0800, Yongji Xie wrote:
> > Hi Stefan,
> >
> > On Thu, Apr 20, 2023 at 7:39 PM Stefan Hajnoczi <stefanha@xxxxxxxxxx> wrote:
> > >
> > > vduse_blk_detach_ctx() waits for in-flight requests using
> > > AIO_WAIT_WHILE(). This is not allowed according to a comment in
> > > bdrv_set_aio_context_commit():
> > >
> > >     /*
> > >      * Take the old AioContex when detaching it from bs.
> > >      * At this point, new_context lock is already acquired, and we are now
> > >      * also taking old_context. This is safe as long as bdrv_detach_aio_context
> > >      * does not call AIO_POLL_WHILE().
> > >      */
> > >
> > > Use this opportunity to rewrite the drain code in vduse-blk:
> > >
> > > - Use the BlockExport refcount so that vduse_blk_exp_delete() is only
> > >   called when there are no more requests in flight.
> > >
> > > - Implement .drained_poll() so in-flight request coroutines are stopped
> > >   by the time .bdrv_detach_aio_context() is called.
> > >
> > > - Remove AIO_WAIT_WHILE() from vduse_blk_detach_ctx() to solve the
> > >   .bdrv_detach_aio_context() constraint violation. It's no longer
> > >   needed due to the previous changes.
> > >
> > > - Always handle the VDUSE file descriptor, even in drained sections. The
> > >   VDUSE file descriptor doesn't submit I/O, so it's safe to handle it in
> > >   drained sections. This ensures that the VDUSE kernel code gets a fast
> > >   response.
> > >
> > > - Suspend virtqueue fd handlers in .drained_begin() and resume them in
> > >   .drained_end(). This eliminates the need for the
> > >   aio_set_fd_handler(is_external=true) flag, which is being removed from
> > >   QEMU.
> > >
> > > This is a long list but splitting it into individual commits would
> > > probably lead to git bisect failures - the changes are all related.
> > >
> > > Signed-off-by: Stefan Hajnoczi <stefanha@xxxxxxxxxx>
> > > ---
> > >  block/export/vduse-blk.c | 132 +++++++++++++++++++++++++++------------
> > >  1 file changed, 93 insertions(+), 39 deletions(-)
> > >
> > > diff --git a/block/export/vduse-blk.c b/block/export/vduse-blk.c
> > > index f7ae44e3ce..35dc8fcf45 100644
> > > --- a/block/export/vduse-blk.c
> > > +++ b/block/export/vduse-blk.c
> > > @@ -31,7 +31,8 @@ typedef struct VduseBlkExport {
> > >      VduseDev *dev;
> > >      uint16_t num_queues;
> > >      char *recon_file;
> > > -    unsigned int inflight;
> > > +    unsigned int inflight; /* atomic */
> > > +    bool vqs_started;
> > >  } VduseBlkExport;
> > >
> > >  typedef struct VduseBlkReq {
> > > @@ -41,13 +42,24 @@ typedef struct VduseBlkReq {
> > >
> > >  static void vduse_blk_inflight_inc(VduseBlkExport *vblk_exp)
> > >  {
> > > -    vblk_exp->inflight++;
> > > +    if (qatomic_fetch_inc(&vblk_exp->inflight) == 0) {
> >
> > I wonder why we need to use atomic operations here.
>
> The inflight counter is only modified by the vhost-user export thread,
> but it may be read by another thread here:
>

I see. I mean, is it enough to just use the volatile keyword here, since
the writers would not access the variable concurrently?

> static bool vduse_blk_drained_poll(void *opaque)
> {
>     BlockExport *exp = opaque;
>     VduseBlkExport *vblk_exp = container_of(exp, VduseBlkExport, export);
>
>     return qatomic_read(&vblk_exp->inflight) > 0;
>
> BlockDevOps->drained_poll() calls are invoked when BlockDriverStates are
> drained (e.g. blk_drain_all() and related APIs).
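
For reference, the single-writer, cross-thread-reader pattern discussed above can be sketched in standalone C. This is only an illustration under stated assumptions: it uses C11 <stdatomic.h> in place of QEMU's qatomic_*() helpers so that it compiles on its own, and the Export type and function names are hypothetical, not taken from the patch. In C11, volatile alone provides neither atomicity nor inter-thread ordering, and a plain read that races with a write is a data race even when there is only one writer, so the reader side uses an atomic load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for VduseBlkExport; only the counter is modelled. */
typedef struct {
    atomic_uint inflight; /* written by one thread, read by others */
} Export;

static void export_init(Export *e)
{
    atomic_init(&e->inflight, 0);
}

/*
 * Writer side: assumed to run only in the export's own thread.
 * Returns true on the 0 -> 1 transition, i.e. for the first
 * in-flight request.
 */
static bool inflight_inc(Export *e)
{
    return atomic_fetch_add(&e->inflight, 1) == 0;
}

/* Returns true on the 1 -> 0 transition, i.e. for the last request. */
static bool inflight_dec(Export *e)
{
    unsigned int prev = atomic_fetch_sub(&e->inflight, 1);

    assert(prev > 0); /* a decrement must pair with an earlier increment */
    return prev == 1;
}

/*
 * Reader side: may run in a different thread. A relaxed atomic load is
 * enough to avoid a data race; it makes no ordering promise about other
 * memory.
 */
static bool drained_poll(Export *e)
{
    return atomic_load_explicit(&e->inflight, memory_order_relaxed) > 0;
}
```

The boolean return values mark the two transitions where a caller could take or drop a reference. What the patch does on those transitions is not visible in the hunk quoted above, so that part is left to the caller here.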
>
> > > @@ -355,13 +410,12 @@ static void vduse_blk_exp_delete(BlockExport *exp)
> > >      g_free(vblk_exp->handler.serial);
> > >  }
> > >
> > > +/* Called with exp->ctx acquired */
> > >  static void vduse_blk_exp_request_shutdown(BlockExport *exp)
> > >  {
> > >      VduseBlkExport *vblk_exp = container_of(exp, VduseBlkExport, export);
> > >
> > > -    aio_context_acquire(vblk_exp->export.ctx);
> > > -    vduse_blk_detach_ctx(vblk_exp);
> > > -    aio_context_acquire(vblk_exp->export.ctx);
> > > +    vduse_blk_stop_virtqueues(vblk_exp);
> >
> > Can we add an AIO_WAIT_WHILE() here? Then we don't need to
> > increase/decrease the BlockExport refcount during I/O processing.
>
> I don't think so because vduse_blk_exp_request_shutdown() is not the
> only place where we wait for requests to complete. There would still
> need to be a way to wait for requests to finish (without calling
> AIO_WAIT_WHILE()) in vduse_blk_drained_poll().
>

But the BlockExport would not be freed until we call
vduse_blk_exp_request_shutdown(). If we can ensure that there will be no
in-flight I/O after we call vduse_blk_exp_request_shutdown(), the
BlockExport can be freed safely without increasing/decreasing the
BlockExport refcount during I/O processing.

Thanks,
Yongji
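
To make the lifetime question above concrete, here is a minimal single-threaded sketch of the refcount approach named in the commit message: a reference is held while any request is in flight, so the delete step cannot run until the last request completes. All names are hypothetical. The deleted flag stands in for vduse_blk_exp_delete(), and plain integers are used instead of atomics because the sketch models ordering only, not concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an export with an owner reference. */
typedef struct {
    unsigned int refcount; /* starts at 1 for the owner */
    unsigned int inflight; /* requests currently being processed */
    bool deleted;          /* set where the real code would free the export */
} RefcountedExport;

static void export_ref(RefcountedExport *e)
{
    e->refcount++;
}

static void export_unref(RefcountedExport *e)
{
    assert(e->refcount > 0);
    if (--e->refcount == 0) {
        e->deleted = true; /* stand-in for the delete callback */
    }
}

/* The first in-flight request pins the export. */
static void request_begin(RefcountedExport *e)
{
    if (e->inflight++ == 0) {
        export_ref(e);
    }
}

/* The last in-flight request unpins it, which may trigger deletion. */
static void request_end(RefcountedExport *e)
{
    assert(e->inflight > 0);
    if (--e->inflight == 0) {
        export_unref(e);
    }
}
```

If the owner drops its reference while two requests are still pending, deleted stays false until the second request_end() call. The alternative raised in the reply above, waiting at shutdown until inflight reaches zero, would make the extra reference unnecessary on that path, but it would not cover the other places where requests must be waited for without AIO_WAIT_WHILE().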