[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 06/12] xen-blkfront: add callbacks for PM suspend and hibernation]

To: "Agarwal, Anchal" <anchalag@xxxxxxxxxx>
From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
Date: Thu, 4 Jun 2020 09:05:48 +0200
Authentication-results: esa1.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none
Cc: "Valentin, Eduardo" <eduval@xxxxxxxxxx>, "len.brown@xxxxxxxxx" <len.brown@xxxxxxxxx>, "peterz@xxxxxxxxxxxxx" <peterz@xxxxxxxxxxxxx>, "benh@xxxxxxxxxxxxxxxxxxx" <benh@xxxxxxxxxxxxxxxxxxx>, "x86@xxxxxxxxxx" <x86@xxxxxxxxxx>, "linux-mm@xxxxxxxxx" <linux-mm@xxxxxxxxx>, "pavel@xxxxxx" <pavel@xxxxxx>, "hpa@xxxxxxxxx" <hpa@xxxxxxxxx>, "tglx@xxxxxxxxxxxxx" <tglx@xxxxxxxxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, "Kamata, Munehisa" <kamatam@xxxxxxxxxx>, "mingo@xxxxxxxxxx" <mingo@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Singh, Balbir" <sblbir@xxxxxxxxxx>, "axboe@xxxxxxxxx" <axboe@xxxxxxxxx>, "konrad.wilk@xxxxxxxxxx" <konrad.wilk@xxxxxxxxxx>, "bp@xxxxxxxxx" <bp@xxxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, "jgross@xxxxxxxx" <jgross@xxxxxxxx>, "netdev@xxxxxxxxxxxxxxx" <netdev@xxxxxxxxxxxxxxx>, "linux-pm@xxxxxxxxxxxxxxx" <linux-pm@xxxxxxxxxxxxxxx>, "rjw@xxxxxxxxxxxxx" <rjw@xxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>, "vkuznets@xxxxxxxxxx" <vkuznets@xxxxxxxxxx>, "davem@xxxxxxxxxxxxx" <davem@xxxxxxxxxxxxx>, "Woodhouse, David" <dwmw@xxxxxxxxxxxx>
Delivery-date: Thu, 04 Jun 2020 07:06:16 +0000
Ironport-sdr: 9zAhf83GSgEwVQAlNLIU0pGVdGDAshbC0Sjobq4FVFCviv2UJIg1ceYEsq+Yy0UQBCh1NFfEVy iMLTX3yV7vbF6Waw25yG9y+zEinLLz1zR9Lro1huou6GtYPZBtzM7yyyrAbkeV8wwEgZAesGPR 9t4kRazxWGZKk+HF81fNvmBeezr3rFw/eWaf2A0imOXifNpt7cl+sNxzIbSZAhzYx7lP3DUvFq UuLfxFcBZb2rPfRlU+omrackTfSfGDW1cThP56PbO0AyoTGYXRLwTK9XnJksPbOfMd9H9U2L+p U9U=
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hello,

On Wed, Jun 03, 2020 at 11:33:52PM +0000, Agarwal, Anchal wrote:
>  CAUTION: This email originated from outside of the organization. Do not 
> click links or open attachments unless you can confirm the sender and know 
> the content is safe.
> 
> 
> 
>     On Tue, May 19, 2020 at 11:27:50PM +0000, Anchal Agarwal wrote:
>     > From: Munehisa Kamata <kamatam@xxxxxxxxxx>
>     > 
>     > S4 power transition states are much different than xen
>     > suspend/resume. Former is visible to the guest and frontend drivers 
> should
>     > be aware of the state transitions and should be able to take appropriate
>     > actions when needed. In transition to S4 we need to make sure that at 
> least
>     > all the in-flight blkif requests get completed, since they probably 
> contain
>     > bits of the guest's memory image and that's not going to get saved any
>     > other way. Hence, re-issuing of in-flight requests as in case of xen 
> resume
>     > will not work here. This is in contrast to xen-suspend where we need to
>     > freeze with as little processing as possible to avoid dirtying RAM late 
> in
>     > the migration cycle and we know that in-flight data can wait.
>     > 
>     > Add freeze, thaw and restore callbacks for PM suspend and hibernation
>     > support. All frontend drivers that needs to use 
> PM_HIBERNATION/PM_SUSPEND
>     > events, need to implement these xenbus_driver callbacks. The freeze 
> handler
>     > stops block-layer queue and disconnect the frontend from the backend 
> while
>     > freeing ring_info and associated resources. Before disconnecting from 
> the
>     > backend, we need to prevent any new IO from being queued and wait for 
> existing
>     > IO to complete. Freeze/unfreeze of the queues will guarantee that there 
> are no
>     > requests in use on the shared ring. However, for sanity we should check
>     > state of the ring before disconnecting to make sure that there are no
>     > outstanding requests to be processed on the ring. The restore handler
>     > re-allocates ring_info, unquiesces and unfreezes the queue and 
> re-connect to
>     > the backend, so that rest of the kernel can continue to use the block 
> device
>     > transparently.
>     > 
>     > Note:For older backends,if a backend doesn't have commit'12ea729645ace'
>     > xen/blkback: unmap all persistent grants when frontend gets 
> disconnected,
>     > the frontend may see massive amount of grant table warning when freeing
>     > resources.
>     > [   36.852659] deferring g.e. 0xf9 (pfn 0xffffffffffffffff)
>     > [   36.855089] xen:grant_table: WARNING:e.g. 0x112 still in use!
>     > 
>     > In this case, persistent grants would need to be disabled.
>     > 
>     > [Anchal Changelog: Removed timeout/request during blkfront freeze.
>     > Reworked the whole patch to work with blk-mq and incorporate upstream's
>     > comments]
> 
>     Please tag versions using vX and it would be helpful if you could list
>     the specific changes that you performed between versions. There where
>     3 RFC versions IIRC, and there's no log of the changes between them.
> 
> I will elaborate on "upstream's comments" in my changelog in my next round of 
> patches.

Sorry for being picky, but can you please make sure your email client
properly quotes previous emails on reply. Note the lack of '>' added
to the quoted parts of your reply.

>     > +                     }
>     > +
>     >                       break;
>     > +             }
>     > +
>     > +             /*
>     > +              * We may somehow receive backend's Closed again while 
> thawing
>     > +              * or restoring and it causes thawing or restoring to 
> fail.
>     > +              * Ignore such unexpected state regardless of the backend 
> state.
>     > +              */
>     > +             if (info->connected == BLKIF_STATE_FROZEN) {
> 
>     I think you can join this with the previous dev->state == 
> XenbusStateClosed?
> 
>     Also, won't the device be in the Closed state already if it's in state
>     frozen?
> Yes but I think this mostly due to a hypothetical case if during thawing 
> backend switches to Closed state.
> I am not entirely sure if that could happen. Could use some expertise here.

I think the frontend seeing the backend in the closed state during
restore would be a bug that should prevent the frontend from
resuming.

>     > +     /* Kick the backend to disconnect */
>     > +     xenbus_switch_state(dev, XenbusStateClosing);
>     > +
>     > +     /*
>     > +      * We don't want to move forward before the frontend is 
> diconnected
>     > +      * from the backend cleanly.
>     > +      */
>     > +     timeout = 
> wait_for_completion_timeout(&info->wait_backend_disconnected,
>     > +                                           timeout);
>     > +     if (!timeout) {
>     > +             err = -EBUSY;
> 
>     Note err is only used here, and I think could just be dropped.
> 
> This err is what's being returned from the function. Am I missing anything?

Just 'return -EBUSY;' directly, and remove the top level variable. You
can also use -EBUSY directly in the xenbus_dev_error call. Anyway, not
that important.

>     > +             xenbus_dev_error(dev, err, "Freezing timed out;"
>     > +                              "the device may become inconsistent 
> state");
> 
>     Leaving the device in this state is quite bad, as it's in a closed
>     state and with the queues frozen. You should make an attempt to
>     restore things to a working state.
> 
> You mean if backend closed after timeout? Is there a way to know that? I 
> understand it's not good to 
> leave it in this state however, I am still trying to find if there is a good 
> way to know if backend is still connected after timeout.
> Hence the message " the device may become inconsistent state".  I didn't see 
> a timeout not even once on my end so that's why 
> I may be looking for an alternate perspective here. may be need to thaw 
> everything back intentionally is one thing I could think of.

You can manually force this state, and then check that it will behave
correctly. I would expect that on a failure to disconnect from the
backend you should switch the frontend to the 'Init' state in order to
try to reconnect to the backend when possible.

>     > +     }
>     > +
>     > +     return err;
>     > +}
>     > +
>     > +static int blkfront_restore(struct xenbus_device *dev)
>     > +{
>     > +     struct blkfront_info *info = dev_get_drvdata(&dev->dev);
>     > +     int err = 0;
>     > +
>     > +     err = talk_to_blkback(dev, info);
>     > +     blk_mq_unquiesce_queue(info->rq);
>     > +     blk_mq_unfreeze_queue(info->rq);
>     > +     if (!err)
>     > +         blk_mq_update_nr_hw_queues(&info->tag_set, info->nr_rings);
> 
>     Bad indentation. Also shouldn't you first update the queues and then
>     unfreeze them?
> Please correct me if I am wrong, blk_mq_update_nr_hw_queues freezes the queue
> So I don't think the order could be reversed.

Regardless of what blk_mq_update_nr_hw_queues does, I don't think it's
correct to unfreeze the queues without having updated them. Also the
freezing/unfreezing uses a refcount, so I think it's perfectly fine to
call blk_mq_update_nr_hw_queues first and then unfreeze the queues.

Also note that talk_to_blkback returning an error should likely
prevent any unfreezing, as the queues won't be updated to match the
parameters of the backend.

Thanks, Roger.

Follow-Ups:
- Re: [PATCH 06/12] xen-blkfront: add callbacks for PM suspend and hibernation]
  - From: Anchal Agarwal

References:
- Re: [PATCH 06/12] xen-blkfront: add callbacks for PM suspend and hibernation]
  - From: Agarwal, Anchal

Prev by Date: Re: [PATCH for-4.14] x86/shim: Fix defconfig selection and trim the build further
Next by Date: Re: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom
Previous by thread: Re: [PATCH 06/12] xen-blkfront: add callbacks for PM suspend and hibernation]
Next by thread: Re: [PATCH 06/12] xen-blkfront: add callbacks for PM suspend and hibernation]
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.