
RE: [PATCH v5 2/6] xen/x86: move generically usable NUMA code from x86 to common


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Wei Chen <Wei.Chen@xxxxxxx>
  • Date: Fri, 30 Sep 2022 01:45:31 +0000
  • Cc: nd <nd@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 30 Sep 2022 01:46:00 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: 29 September 2022 20:14
> To: Wei Chen <Wei.Chen@xxxxxxx>
> Cc: nd <nd@xxxxxxx>; Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Roger Pau
> Monné <roger.pau@xxxxxxxxxx>; Wei Liu <wl@xxxxxxx>; George Dunlap
> <george.dunlap@xxxxxxxxxx>; Julien Grall <julien@xxxxxxx>; Stefano
> Stabellini <sstabellini@xxxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH v5 2/6] xen/x86: move generically usable NUMA code
> from x86 to common
> 
> On 29.09.2022 09:43, Wei Chen wrote:
> > On 2022/9/27 16:19, Jan Beulich wrote:
> >> On 20.09.2022 11:12, Wei Chen wrote:
> >>> +        nodes_used++;
> >>> +        if ( epdx > memtop )
> >>> +            memtop = epdx;
> >>> +    }
> >>> +
> >>> +    if ( nodes_used <= 1 )
> >>> +        i = BITS_PER_LONG - 1;
> >>
> >> Is this actually going to be correct for all architectures? Aiui
> >> Arm64 has only up to 48 physical address bits, but what about an
> >> architecture allowing the use of all 64 bits? I think at the very
> >> least we want BUILD_BUG_ON(PADDR_BITS >= BITS_PER_LONG) here.
> >>
> >
> > OK, I will add the above BUILD_BUG_ON. And I also have a question: why
> > can't we use PADDR_BITS here directly?
> 
> Well, if you used PADDR_BITS, then you would use it without subtracting
> 1, and you'd be in trouble again when PADDR_BITS == BITS_PER_LONG. What
> may be possible to do instead of BUILD_BUG_ON() is
> 
>     if ( nodes_used <= 1 )
>         i = min(PADDR_BITS, BITS_PER_LONG - 1);
> 

This one seems better; I will follow it.
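
To make sure we read it the same way, here is a minimal sketch of the end
result (my understanding only, not the final code; nodes_used, bitfield and
memtop are as computed by the surrounding loop):

    /*
     * With at most one node there are no boundary bits to scan, so fall
     * back to the widest usable shift: clamp to the physical address
     * width, but stay below BITS_PER_LONG so that the "memtop >> i"
     * further down remains well defined.
     */
    if ( nodes_used <= 1 )
        i = min(PADDR_BITS, BITS_PER_LONG - 1);
    else
        i = find_first_bit(&bitfield, sizeof(unsigned long) * 8);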

> >>> +    else
> >>> +        i = find_first_bit(&bitfield, sizeof(unsigned long) * 8);
> >>> +
> >>> +    memnodemapsize = (memtop >> i) + 1;
> >>
> >> Again perhaps the subject of a separate patch: Isn't there an off-by-1
> >> mistake here? memtop is the maximum of all epdx-es, which are
> >> calculated to be the first PDX following the region. Hence I'd expect
> >>
> >>      memnodemapsize = ((memtop - 1) >> i) + 1;
> >>
> >> here. I guess I'll make patches for both issues, which you may then
> >> need to re-base over.
> >>
> >
> > Thanks, I will wait for your patches.
> 
> Already sent out yesterday.
> 

Ok.
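
To spell out the off-by-one for the archive: memtop is the maximum of the
epdx values, i.e. the first PDX *after* the highest region, so the last PDX
the map has to cover is memtop - 1. A sketch of the corrected sizing, per
the quoted suggestion:

    /*
     * E.g. with memtop == (1UL << i) the old formula yielded two map
     * entries where a single one suffices.
     */
    memnodemapsize = ((memtop - 1) >> i) + 1;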

> >>> +static void cf_check dump_numa(unsigned char key)
> >>> +{
> >>> +    s_time_t now = NOW();
> >>> +    unsigned int i, j, n;
> >>> +    struct domain *d;
> >>> +    const struct page_info *page;
> >>> +    unsigned int page_num_node[MAX_NUMNODES];
> >>> +    const struct vnuma_info *vnuma;
> >>> +
> >>> +    printk("'%c' pressed -> dumping numa info (now = %"PRI_stime")\n",
> >>> +           key, now);
> >>> +
> >>> +    for_each_online_node ( i )
> >>> +    {
> >>> +        paddr_t pa = pfn_to_paddr(node_start_pfn(i) + 1);
> >>> +
> >>> +        printk("NODE%u start->%lu size->%lu free->%lu\n",
> >>> +               i, node_start_pfn(i), node_spanned_pages(i),
> >>> +               avail_node_heap_pages(i));
> >>> +        /* Sanity check phys_to_nid() */
> >>> +        if ( phys_to_nid(pa) != i )
> >>> +            printk("phys_to_nid(%"PRIpaddr") -> %d should be %u\n",
> >>> +                   pa, phys_to_nid(pa), i);
> >>> +    }
> >>> +
> >>> +    j = cpumask_first(&cpu_online_map);
> >>> +    n = 0;
> >>> +    for_each_online_cpu ( i )
> >>> +    {
> >>> +        if ( i != j + n || cpu_to_node[j] != cpu_to_node[i] )
> >>> +        {
> >>> +            if ( n > 1 )
> >>> +                printk("CPU%u...%u -> NODE%d\n", j, j + n - 1,
> >>> +                       cpu_to_node[j]);
> >>> +            else
> >>> +                printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
> >>> +            j = i;
> >>> +            n = 1;
> >>> +        }
> >>> +        else
> >>> +            ++n;
> >>> +    }
> >>> +    if ( n > 1 )
> >>> +        printk("CPU%u...%u -> NODE%d\n", j, j + n - 1,
> >>> +               cpu_to_node[j]);
> >>> +    else
> >>> +        printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
> >>> +
> >>> +    rcu_read_lock(&domlist_read_lock);
> >>> +
> >>> +    printk("Memory location of each domain:\n");
> >>> +    for_each_domain ( d )
> >>> +    {
> >>> +        process_pending_softirqs();
> >>> +
> >>> +        printk("Domain %u (total: %u):\n", d->domain_id,
> >>> +               domain_tot_pages(d));
> >>> +
> >>> +        for_each_online_node ( i )
> >>> +            page_num_node[i] = 0;
> >>
> >> I'd be inclined to suggest to use memset() here, but I won't insist
> >> on you doing this "on the fly". Along with this would likely go the
> >> request to limit the scope of page_num_node[] (and then perhaps also
> >> vnuma and page).
> >>
> >
> > Using memset for page_num_node makes sense; I will do it before
> > for_each_domain ( d ).
> 
> That won't be right - array elements need clearing on every iteration.
> Plus ...
> 

Oh, yes.

> > About limiting the scope, did you mean we should move:
> >
> > "const struct page_info *page;
> > unsigned int page_num_node[MAX_NUMNODES];
> > const struct vnuma_info *vnuma;"
> >
> > to the block of for_each_domain ( d )?
> 
> ... this limiting of scope (yes to your question) would also conflict
> with the movement you suggest. It is in fact exactly such a mistaken
> movement (among other things) that the narrower scope is intended to
> prevent.
> 

Thanks,
Wei Chen

> Jan
