Re: [PATCH v4 4/7] xen/riscv: introduce functionality to work with CPU info
On Wed, 2024-08-14 at 17:22 +0200, Jan Beulich wrote:
> On 14.08.2024 16:45, oleksii.kurochko@xxxxxxxxx wrote:
> > On Tue, 2024-08-13 at 10:54 +0200, Jan Beulich wrote:
> > > On 09.08.2024 18:19, Oleksii Kurochko wrote:
> > > > --- a/xen/arch/riscv/include/asm/smp.h
> > > > +++ b/xen/arch/riscv/include/asm/smp.h
> > > > @@ -5,6 +5,8 @@
> > > >  #include <xen/cpumask.h>
> > > >  #include <xen/percpu.h>
> > > >
> > > > +#define INVALID_HARTID UINT_MAX
> > > > +
> > > >  DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_mask);
> > > >  DECLARE_PER_CPU(cpumask_var_t, cpu_core_mask);
> > > >
> > > > @@ -14,6 +16,14 @@ DECLARE_PER_CPU(cpumask_var_t, cpu_core_mask);
> > > >   */
> > > >  #define park_offline_cpus false
> > > >
> > > > +void smp_set_bootcpu_id(unsigned long boot_cpu_hartid);
> > > > +
> > > > +/*
> > > > + * Mapping between linux logical cpu index and hartid.
> > > > + */
> > > > +extern unsigned long cpuid_to_hartid_map[NR_CPUS];
> > > > +#define cpuid_to_hartid(cpu) cpuid_to_hartid_map[cpu]
> > >
> > > How wide can hart IDs be? Wider than 32 bits? If not, why unsigned
> > > long?
> > According to the spec:
> > ```
> > The mhartid CSR is an MXLEN-bit read-only register containing the
> > integer ID of the hardware thread running the code
> > ```
> > where MXLEN can be 32 or 64.
>
> Hmm, okay. If the full MXLEN bits can be used, then using unsigned long
> is indeed the right thing here.
>
> > > > @@ -40,6 +41,19 @@ void __init noreturn start_xen(unsigned long bootcpu_id,
> > > >  {
> > > >      remove_identity_mapping();
> > > >
> > > > +    /*
> > > > +     * tp register contains an address of physical cpu information.
> > > > +     * So write physical CPU info of boot cpu to tp register.
> > > > +     * It will be used later by get_processor_id() ( look at
> > > > +     * <asm/processor.h> ):
> > > > +     *   #define get_processor_id() (tp->processor_id)
> > > > +     */
> > > > +    asm volatile ( "mv tp, %0" : : "r"((unsigned long)&pcpu_info[0]) );
> > >
> > > Perhaps be on the safe side and also add a memory barrier?
> > Do you mean a compiler barrier? ( asm volatile ( "..." :: ... : "memory" )? )
>
> Yes.
>
> > > Perhaps be on the safe side and do this absolutely first in the
> > > function, or even in assembly (such that initializers of future
> > > variables declared at the top of the function end up safe, too)?
> > I am not sure that I am following the part about putting this code in
> > assembler ( instructions in assembly code could still be re-ordered,
> > which can affect when tp will be set with the correct value )
>
> I'm afraid I don't understand this. Instructions can be re-ordered, sure,
> but later instructions are guaranteed to observe the effects on register
> state that earlier instructions caused.
>
> > and what do you mean by "initializers of future variables declared at
> > the top of the function end up safe"
>
> With
>
> void start_xen()
> {
>     int var = func();
>
>     asm volatile ( "mv tp, %0" :: ...);
>     ...
>
> you end up with the requirement that func() must not use anything that
> depends on tp being set. In this simple example it may be easy to say
> "don't use an initializer and call the function afterwards". But that is
> going to be a risky game to play. Look at x86'es __start_xen(). An
> insertion into such a big set of declarations may not pay attention to
> tp not being set yet, when _all_ other C code may reasonably assume tp
> is set.

Thanks for the clarification.
I missed the possibility that someone might accidentally use tp before it
is set. In that case, I agree it would be better to create a setup_tp()
function and call it before start_xen().

> > > > --- /dev/null
> > > > +++ b/xen/arch/riscv/smp.c
> > > > @@ -0,0 +1,4 @@
> > > > +#include <xen/smp.h>
> > > > +
> > > > +/* tp points to one of these per cpu */
> > > > +struct pcpu_info pcpu_info[NR_CPUS];
> > >
> > > And they all need setting up statically? Is there a plan to make this
> > > dynamic (which could be recorded in a "fixme" in the comment)?
> > I didn't plan to make allocation of this array dynamic. I don't expect
> > that NR_CPUS will be big.
>
> What is this expectation of yours based on? Other architectures permit
> systems with hundreds or even thousands of CPUs; why would RISC-V be
> different there?

That expectation is based on the available dev boards (which isn't a
really strong argument). I checked other architectures and they use static
allocation too:

    struct cpuinfo_x86 cpu_data[NR_CPUS];

    u32 x86_cpu_to_apicid[NR_CPUS] __read_mostly =
        { [0 ... NR_CPUS-1] = BAD_APICID };
    ...
    /* Arm */
    struct cpuinfo_arm cpu_data[NR_CPUS];

I also wanted to check which API should be used to allocate this array
dynamically. xmalloc()? And I am curious how xmalloc() could be used at
this early stage, given that the page allocator is not initialized yet.
Or should pcpu_info be allocated statically only for the boot CPU, with
xmalloc() used for the others later?

> > I can add a "fixme", but I don't really understand what the advantages
> > would be if pcpu_info[] were allocated dynamically.
>
> Where possible it's better to avoid static allocations, of which on
> some systems only a very small part may be used. Even if you put yourself
> on the position that many take - memory being cheap - you then still
> waste cache and TLB bandwidth. Furthermore as long as struct pcpu_info
> isn't big enough (and properly aligned) for two successive array entries
> to not share cache lines, you may end up playing cacheline ping-pong
> when a CPU writes to its own array slot.

Why don't the mentioned issues apply to dynamic memory as well? We would
still allocate sizeof(pcpu_info) * NR_CPUS bytes, accesses to that memory
would still go through the cache, the CPU/MMU would still use the TLB for
address translation of that region, and without proper alignment of the
pcpu_info size there could still be an issue with cache line sharing.

~ Oleksii
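
For reference, a minimal sketch of the setup_tp() helper discussed above
(the helper name, its parameter, and its placement are assumptions from the
discussion, not code taken from the series):

```
/*
 * Sketch only: struct pcpu_info / pcpu_info[] as introduced by the patch
 * are assumed to exist; setup_tp() is the name suggested in the thread,
 * not something the series currently defines.
 */
void __init setup_tp(unsigned int cpuid)
{
    /*
     * Point tp at this CPU's pcpu_info slot before any C code that may
     * (indirectly) rely on get_processor_id() runs.  The "memory" clobber
     * acts as a compiler barrier, so accesses that depend on tp cannot be
     * moved ahead of the write.
     */
    asm volatile ( "mv tp, %0"
                   : : "r" ((unsigned long)&pcpu_info[cpuid]) : "memory" );
}
```

Called from the startup path before start_xen() (or as the very first
statement in it), this would avoid the situation described above, where an
initializer among start_xen()'s declarations runs before tp is valid.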
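
On the cacheline ping-pong point: whether the array is static or dynamically
allocated, the sharing issue only goes away if each entry is padded and
aligned to a cache line. A rough sketch, assuming Xen's __aligned() and
SMP_CACHE_BYTES and an illustrative (not the series') struct layout:

```
#include <xen/cache.h>

/*
 * Illustrative only: the real struct pcpu_info layout is defined by the
 * patch.  The point is the per-entry alignment, which keeps CPU N's writes
 * from invalidating the cache line holding CPU N+1's entry.
 */
struct pcpu_info {
    unsigned int processor_id;   /* returned by get_processor_id() */
    unsigned long hart_id;       /* physical hart ID */
} __aligned(SMP_CACHE_BYTES);

struct pcpu_info pcpu_info[NR_CPUS];
```

The memory and TLB footprint argument is separate: it is about whether slots
for CPUs that never come online get allocated at all, and alignment does not
help with that part.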