RE: [PATCH v5 2/6] xen/x86: move generically usable NUMA code from x86 to common
Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: 29 September 2022 20:14
> To: Wei Chen <Wei.Chen@xxxxxxx>
> Cc: nd <nd@xxxxxxx>; Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Roger Pau
> Monné <roger.pau@xxxxxxxxxx>; Wei Liu <wl@xxxxxxx>; George Dunlap
> <george.dunlap@xxxxxxxxxx>; Julien Grall <julien@xxxxxxx>; Stefano
> Stabellini <sstabellini@xxxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH v5 2/6] xen/x86: move generically usable NUMA code
> from x86 to common
>
> On 29.09.2022 09:43, Wei Chen wrote:
> > On 2022/9/27 16:19, Jan Beulich wrote:
> >> On 20.09.2022 11:12, Wei Chen wrote:
> >>> +        nodes_used++;
> >>> +        if ( epdx > memtop )
> >>> +            memtop = epdx;
> >>> +    }
> >>> +
> >>> +    if ( nodes_used <= 1 )
> >>> +        i = BITS_PER_LONG - 1;
> >>
> >> Is this actually going to be correct for all architectures? Aiui
> >> Arm64 has only up to 48 physical address bits, but what about an
> >> architecture allowing the use of all 64 bits? I think at the very
> >> least we want BUILD_BUG_ON(PADDR_BITS >= BITS_PER_LONG) here.
> >
> > Ok, I will add the above BUILD_BUG_ON. And I also have a question:
> > why can't we use PADDR_BITS here directly?
>
> Well, if you used PADDR_BITS, then you would use it without subtracting
> 1, and you'd be in trouble again when PADDR_BITS == BITS_PER_LONG. What
> may be possible to do instead of BUILD_BUG_ON() is
>
>     if ( nodes_used <= 1 )
>         i = min(PADDR_BITS, BITS_PER_LONG - 1);

This one seems better, I will follow it.

> >>> +    else
> >>> +        i = find_first_bit(&bitfield, sizeof(unsigned long) * 8);
> >>> +
> >>> +    memnodemapsize = (memtop >> i) + 1;
> >>
> >> Again perhaps the subject of a separate patch: Isn't there an off-by-1
> >> mistake here? memtop is the maximum of all epdx-es, which are
> >> calculated to be the first PDX following the region. Hence I'd expect
> >>
> >>     memnodemapsize = ((memtop - 1) >> i) + 1;
> >>
> >> here.
> >> I guess I'll make patches for both issues, which you may then
> >> need to re-base over.
> >
> > Thanks, I will wait for your patches.
>
> Already sent out yesterday.

Ok.

> >>> +static void cf_check dump_numa(unsigned char key)
> >>> +{
> >>> +    s_time_t now = NOW();
> >>> +    unsigned int i, j, n;
> >>> +    struct domain *d;
> >>> +    const struct page_info *page;
> >>> +    unsigned int page_num_node[MAX_NUMNODES];
> >>> +    const struct vnuma_info *vnuma;
> >>> +
> >>> +    printk("'%c' pressed -> dumping numa info (now = %"PRI_stime")\n",
> >>> +           key, now);
> >>> +
> >>> +    for_each_online_node ( i )
> >>> +    {
> >>> +        paddr_t pa = pfn_to_paddr(node_start_pfn(i) + 1);
> >>> +
> >>> +        printk("NODE%u start->%lu size->%lu free->%lu\n",
> >>> +               i, node_start_pfn(i), node_spanned_pages(i),
> >>> +               avail_node_heap_pages(i));
> >>> +        /* Sanity check phys_to_nid() */
> >>> +        if ( phys_to_nid(pa) != i )
> >>> +            printk("phys_to_nid(%"PRIpaddr") -> %d should be %u\n",
> >>> +                   pa, phys_to_nid(pa), i);
> >>> +    }
> >>> +
> >>> +    j = cpumask_first(&cpu_online_map);
> >>> +    n = 0;
> >>> +    for_each_online_cpu ( i )
> >>> +    {
> >>> +        if ( i != j + n || cpu_to_node[j] != cpu_to_node[i] )
> >>> +        {
> >>> +            if ( n > 1 )
> >>> +                printk("CPU%u...%u -> NODE%d\n", j, j + n - 1,
> >>> +                       cpu_to_node[j]);
> >>> +            else
> >>> +                printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
> >>> +            j = i;
> >>> +            n = 1;
> >>> +        }
> >>> +        else
> >>> +            ++n;
> >>> +    }
> >>> +    if ( n > 1 )
> >>> +        printk("CPU%u...%u -> NODE%d\n", j, j + n - 1, cpu_to_node[j]);
> >>> +    else
> >>> +        printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
> >>> +
> >>> +    rcu_read_lock(&domlist_read_lock);
> >>> +
> >>> +    printk("Memory location of each domain:\n");
> >>> +    for_each_domain ( d )
> >>> +    {
> >>> +        process_pending_softirqs();
> >>> +
> >>> +        printk("Domain %u (total: %u):\n", d->domain_id,
> >>> +               domain_tot_pages(d));
> >>> +
> >>> +        for_each_online_node ( i )
> >>> +            page_num_node[i] = 0;
> >>
> >> I'd be inclined to suggest to use memset()
> >> here, but I won't insist
> >> on you doing this "on the fly". Along with this would likely go the
> >> request to limit the scope of page_num_node[] (and then perhaps also
> >> vnuma and page).
> >
> > memset for page_num_node makes sense, I will do it before
> > for_each_domain ( d ).
>
> That won't be right - array elements need clearing on every iteration.
> Plus ...

Oh, yes.

> > About limiting the scope, did you mean we should move:
> >
> > "const struct page_info *page;
> >  unsigned int page_num_node[MAX_NUMNODES];
> >  const struct vnuma_info *vnuma;"
> >
> > to the block of for_each_domain ( d )?
>
> ... this limiting of scope (yes to your question) would also conflict
> with the movement you suggest. It is actually (among other things)
> such a mistaken movement which the more narrow scope is intended to
> prevent.

Thanks,
Wei Chen

> Jan