|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] RE: [XEN RFC PATCH 24/40] xen/arm: introduce a helper to parse device tree NUMA distance map
On Tue, 31 Aug 2021, Wei Chen wrote:
> Hi Stefano,
>
> > -----Original Message-----
> > From: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> > Sent: 2021年8月31日 8:48
> > To: Wei Chen <Wei.Chen@xxxxxxx>
> > Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; sstabellini@xxxxxxxxxx; julien@xxxxxxx;
> > jbeulich@xxxxxxxx; Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
> > Subject: Re: [XEN RFC PATCH 24/40] xen/arm: introduce a helper to parse
> > device tree NUMA distance map
> >
> > On Wed, 11 Aug 2021, Wei Chen wrote:
> > > A NUMA aware device tree will provide a "distance-map" node to
> > > describe distance between any two nodes. This patch introduce a
> > > new helper to parse this distance map.
> > >
> > > Signed-off-by: Wei Chen <wei.chen@xxxxxxx>
> > > ---
> > > xen/arch/arm/numa_device_tree.c | 67 +++++++++++++++++++++++++++++++++
> > > 1 file changed, 67 insertions(+)
> > >
> > > diff --git a/xen/arch/arm/numa_device_tree.c
> > b/xen/arch/arm/numa_device_tree.c
> > > index bbe081dcd1..6e0d1d3d9f 100644
> > > --- a/xen/arch/arm/numa_device_tree.c
> > > +++ b/xen/arch/arm/numa_device_tree.c
> > > @@ -200,3 +200,70 @@ device_tree_parse_numa_memory_node(const void *fdt,
> > int node,
> > >
> > > return 0;
> > > }
> > > +
> > > +/* Parse NUMA distance map v1 */
> > > +int __init
> > > +device_tree_parse_numa_distance_map_v1(const void *fdt, int node)
> > > +{
> > > + const struct fdt_property *prop;
> > > + const __be32 *matrix;
> > > + int entry_count, len, i;
> > > +
> > > + printk(XENLOG_INFO "NUMA: parsing numa-distance-map\n");
> > > +
> > > + prop = fdt_get_property(fdt, node, "distance-matrix", &len);
> > > + if ( !prop )
> > > + {
> > > + printk(XENLOG_WARNING
> > > + "NUMA: No distance-matrix property in distance-map\n");
> > > +
> > > + return -EINVAL;
> > > + }
> > > +
> > > + if ( len % sizeof(uint32_t) != 0 )
> > > + {
> > > + printk(XENLOG_WARNING
> > > + "distance-matrix in node is not a multiple of u32\n");
> > > + return -EINVAL;
> > > + }
> > > +
> > > + entry_count = len / sizeof(uint32_t);
> > > + if ( entry_count <= 0 )
> > > + {
> > > + printk(XENLOG_WARNING "NUMA: Invalid distance-matrix\n");
> > > +
> > > + return -EINVAL;
> > > + }
> > > +
> > > + matrix = (const __be32 *)prop->data;
> > > + for ( i = 0; i + 2 < entry_count; i += 3 )
> > > + {
> > > + uint32_t from, to, distance;
> > > +
> > > + from = dt_read_number(matrix, 1);
> > > + matrix++;
> > > + to = dt_read_number(matrix, 1);
> > > + matrix++;
> > > + distance = dt_read_number(matrix, 1);
> > > + matrix++;
> > > +
> > > + if ( (from == to && distance != NUMA_LOCAL_DISTANCE) ||
> > > + (from != to && distance <= NUMA_LOCAL_DISTANCE) )
> > > + {
> > > + printk(XENLOG_WARNING
> > > + "Invalid nodes' distance from node#%d to node#%d
> > = %d\n",
> > > + from, to, distance);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + printk(XENLOG_INFO "NUMA: distance from node#%d to node#%d
> > = %d\n",
> > > + from, to, distance);
> > > + numa_set_distance(from, to, distance);
> > > +
> > > + /* Set default distance of node B->A same as A->B */
> > > + if (to > from)
> > > + numa_set_distance(to, from, distance);
> >
> > I am a bit unsure about this last 2 lines: why calling numa_set_distance
> > in the opposite direction only when to > from? Wouldn't it be OK to
> > always do both:
> >
> > numa_set_distance(from, to, distance);
> > numa_set_distance(to, from, distance);
> >
> > ?
> >
> I borrowed this code from Linux, but here is my understanding:
>
> First, I read some notes in Documentation/devicetree/bindings/numa.txt
> 1. Each entry represents distance from first node to second node.
> The distances are equal in either direction.
> 2. distance-matrix should have entries in lexicographical ascending
> order of nodes.
>
> Here is an example of distance-map node in DTB:
> Sample#1, full list:
> distance-map {
> compatible = "numa-distance-map-v1";
> distance-matrix = <0 0 10>,
> <0 1 20>,
> <0 2 40>,
> <0 3 20>,
> <1 0 20>,
> <1 1 10>,
> <1 2 20>,
> <1 3 40>,
> <2 0 40>,
> <2 1 20>,
> <2 2 10>,
> <2 3 20>,
> <3 0 20>,
> <3 1 40>,
> <3 2 20>,
> <3 3 10>;
> };
>
> Call numa_set_distance when "to > from" will prevent Xen to call
> numa_set_distance(0, 1, 20) again when it's setting distance for <1 0 20>.
> But, numa_set_distance(1, 0, 20) will be call twice.
>
> Normally, distance-map node will be optimized in following sample#2,
> all redundant entries are removed:
> Sample#2, partial list:
> distance-map {
> compatible = "numa-distance-map-v1";
> distance-matrix = <0 0 10>,
> <0 1 20>,
> <0 2 40>,
> <0 3 20>,
> <1 1 10>,
> <1 2 20>,
> <1 3 40>,
> <2 2 10>,
> <2 3 20>,
> <3 3 10>;
> };
>
> There is not any "from > to" entry in the map. But using this partial map
> still can set all distances for all pairs. And numa_set_distance(1, 0, 20)
> will be only once.
I see. I can't find in Documentation/devicetree/bindings/numa.txt where
it says that "from > to" nodes can be omitted. If it is not written
down, then somebody could easily optimize it the opposite way:
distance-matrix = <0 0 10>,
<1 0 20>,
<2 0 40>,
<3 0 20>,
<1 1 10>,
<2 1 20>,
<3 1 40>,
<2 2 10>,
<3 2 20>,
<3 3 10>;
I think the code in Xen should be resilient and able to cope with a
device tree like the one you wrote or the one I wrote. From a code
perspective, it should be very easy to do. If nothing else it would make
Xen more resilient against buggy firmware.
> > But in any case, I have a different suggestion. The binding states that
> > "distances are equal in either direction". Also it has an example where
> > only one direction is expressed unfortunately (at the end of the
> > document).
> >
>
> Oh, I should see this comment first, then I will not post above
> comment : )
>
> > So my suggestion is to parse it as follows:
> >
> > - call numa_set_distance just once from
> > device_tree_parse_numa_distance_map_v1
> >
> > - in numa_set_distance:
> > - set node_distance_map[from][to] = distance;
> > - check node_distance_map[to][from]
> > - if unset, node_distance_map[to][from] = distance;
> > - if already set to the same value, return success;
> > - if already set to a different value, return error;
>
> I don't really like this implementation. I want the behavior of
> numa_set_distance just like the function name, do not include
> implicit operations. Otherwise, except the user read this function
> implementation before he use it, he probably doesn't know this
> function has done so many things.
You can leave numa_set_distance as-is without any implicit operations.
In that case, just call numa_set_distance twice from numa_set_distance
for both from/to and to/from. numa_set_distance could return error is
the entry was already set to a different value or success otherwise
(also in the case it was already set to the same value). This would
allow Xen to cope with both scenarios above.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |