[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: “Backend has not unmapped grant” errors



On 24.08.22 22:38, SeongJae Park wrote:
+ Roger

On Wed, 24 Aug 2022 17:44:42 +0000 SeongJae Park <sj@xxxxxxxxxx> wrote:

Hello,

On Wed, 24 Aug 2022 08:02:40 +0200 Juergen Gross <jgross@xxxxxxxx> wrote:


[-- Attachment #1.1.1: Type: text/plain, Size: 4312 bytes --]

On 24.08.22 02:20, Marek Marczykowski-Górecki wrote:
FWIW, I hit this issue twice already in this week CI run, while it never
happened before. The difference compared to previous run is Linux
5.15.57 vs 5.15.61. The latter reports persistent grants disabled. The
only related commits I see there are three commits indeed related to
persistent grants:

    c98e956ef489 xen-blkfront: Apply 'feature_persistent' parameter when connect
    ef26b5d530d4 xen-blkback: Apply 'feature_persistent' parameter when connect
    7304be4c985d xen-blkback: fix persistent grants negotiation

But none of the commit messages suggests intentional disabling it
without explicit request for doing so. I did not requested disabling it
in toolstack (although I have set backend as "trusted" - XSA-403).
I have confirmed it's the frontend version that matters. Running older
frontend kernel with 5.15.61 backend results in persistent grants
enabled (and both frontend and backend xenstore "feature-persistent"
entries are "1" in this case).

This is a mess.

I think the main problem seems to be that the feature negotiation process
isn't specified in a sane way.

  From the blkif.h header:

Backend-side:
   * feature-persistent
   *      Values:         0/1 (boolean)
   *      Default Value:  0
   *      Notes: 7
   *
   *      A value of "1" indicates that the backend can keep the grants used
   *      by the frontend driver mapped, so the same set of grants should be
   *      used in all transactions. The maximum number of grants the backend
   *      can map persistently depends on the implementation, but ideally it
   *      should be RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST. Using this
   *      feature the backend doesn't need to unmap each grant, preventing
   *      costly TLB flushes. The backend driver should only map grants
   *      persistently if the frontend supports it. If a backend driver chooses
   *      to use the persistent protocol when the frontend doesn't support it,
   *      it will probably hit the maximum number of persistently mapped grants
   *      (due to the fact that the frontend won't be reusing the same grants),
   *      and fall back to non-persistent mode. Backend implementations may
   *      shrink or expand the number of persistently mapped grants without
   *      notifying the frontend depending on memory constraints (this might
   *      cause a performance degradation).

Frontend-side:
   * feature-persistent
   *      Values:         0/1 (boolean)
   *      Default Value:  0
   *      Notes: 7, 8, 9
   *
   *      A value of "1" indicates that the frontend will reuse the same grants
   *      for all transactions, allowing the backend to map them with write
   *      access (even when it should be read-only). If the frontend hits the
   *      maximum number of allowed persistently mapped grants, it can fallback
   *      to non persistent mode. This will cause a performance degradation,
   *      since the the backend driver will still try to map those grants
   *      persistently. Since the persistent grants protocol is compatible with
   *      the previous protocol, a frontend driver can choose to work in
   *      persistent mode even when the backend doesn't support it.

Those definitions don't make clear, which side is the one to decide whether
the feature should be used or not. In my understanding the related drivers
should just advertise their setting (the _ability_ to use the feature), and
it should be used only if both sides have written a "1".

With above patches applied, the frontend will set 'feature-persistent' in
Xenstore only, if the backend has done so, but the backend will set it
only, if the frontend has done it. This results in persistent grants
always being disabled.

Sorry for making the mess, and thank you for the kind report and detailed
explanation of the problem.


This is wrong, as the value written should not reflect the current state
of the interface. That state should be set according to both sides' value,
probably a cached one on the blkback side (using a new flag for caching it,
not the current state).

Agreed.  So, I think the issue comes from the fact that we are using one field,
which was a place for saving only the negotiation result, for yet another
purpose: caching of the parameter value.  As a result, the advertisement, which
should follow only the parameter value, becomes inconsistent.

How about simply adding another field for the caching purpose, so that the
advertisation could be done regardless of the negotiation?  For example:

diff --git a/drivers/block/xen-blkback/common.h 
b/drivers/block/xen-blkback/common.h
index bda5c815e441..a28473470e66 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -226,6 +226,9 @@ struct xen_vbd {
        sector_t                size;
        unsigned int            flush_support:1;
        unsigned int            discard_secure:1;
+       /* Connect-time cached feature_persistent parameter value */
+       unsigned int            feature_gnt_persistent_parm:1;
+       /* Persistent grants feature negotiation result */
        unsigned int            feature_gnt_persistent:1;
        unsigned int            overflow_max_grants:1;
  };
diff --git a/drivers/block/xen-blkback/xenbus.c 
b/drivers/block/xen-blkback/xenbus.c
index ee7ad2fb432d..c0227dfa4688 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -907,7 +907,7 @@ static void connect(struct backend_info *be)
        xen_blkbk_barrier(xbt, be, be->blkif->vbd.flush_support);
err = xenbus_printf(xbt, dev->nodename, "feature-persistent", "%u",
-                       be->blkif->vbd.feature_gnt_persistent);
+                       be->blkif->vbd.feature_gnt_persistent_parm);
        if (err) {
                xenbus_dev_fatal(dev, err, "writing %s/feature-persistent",
                                 dev->nodename);
@@ -1085,7 +1085,9 @@ static int connect_ring(struct backend_info *be)
                return -ENOSYS;
        }
- blkif->vbd.feature_gnt_persistent = feature_persistent &&
+       blkif->vbd.feature_gnt_persistent_parm = feature_persistent;
+       blkif->vbd.feature_gnt_persistent =
+               blkif->vbd.feature_gnt_persistent_parm &&
                xenbus_read_unsigned(dev->otherend, "feature-persistent", 0);
blkif->vbd.overflow_max_grants = 0;
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 8e56e69fb4c4..dfae08115450 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -213,6 +213,9 @@ struct blkfront_info
        unsigned int feature_fua:1;
        unsigned int feature_discard:1;
        unsigned int feature_secdiscard:1;
+       /* Connect-time cached feature_persistent parameter */
+       unsigned int feature_persistent_parm:1;
+       /* Persistent grants feature negotiation result */
        unsigned int feature_persistent:1;
        unsigned int bounce:1;
        unsigned int discard_granularity;
@@ -1848,7 +1851,7 @@ static int talk_to_blkback(struct xenbus_device *dev,
                goto abort_transaction;
        }
        err = xenbus_printf(xbt, dev->nodename, "feature-persistent", "%u",
-                       info->feature_persistent);
+                       info->feature_persistent_parm);
        if (err)
                dev_warn(&dev->dev,
                         "writing persistent grants feature to xenbus");
@@ -2281,7 +2284,8 @@ static void blkfront_gather_backend_features(struct 
blkfront_info *info)
        if (xenbus_read_unsigned(info->xbdev->otherend, "feature-discard", 0))
                blkfront_setup_discard(info);
- if (feature_persistent)
+       info->feature_persistent_parm = feature_persistent;
+       if (info->feature_persistent_parm)
                info->feature_persistent =
                        !!xenbus_read_unsigned(info->xbdev->otherend,
                                               "feature-persistent", 0);


Yes, this is much better IMO.

Could you please send it as two proper patches (one for each driver) with
the correct "Fixes:" tags?


Juergen

Attachment: OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature
Description: OpenPGP digital signature


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.