[Xen-devel] Re: [RFC][PATCH] Per-cpu xentrace buffers
Keir, would you mind commenting on this new design in the next few
days?  If it looks like a good design, I'd like to do some more testing
and get this into our next XenServer release.

 -George

On Thu, Jan 7, 2010 at 3:13 PM, George Dunlap <dunlapg@xxxxxxxxx> wrote:
> In the current xentrace configuration, xentrace buffers are all
> allocated in a single contiguous chunk, and then divided among logical
> cpus, one buffer per cpu.  The size of an allocatable chunk is fairly
> limited, in my experience about 128 pages (512KiB).  As the number of
> logical cores increases, this means a much smaller maximum trace
> buffer per cpu; on my dual-socket quad-core Nehalem box with
> hyperthreading (16 logical cpus), that comes to 8 pages per logical
> cpu.
>
> The attached patch addresses this issue by allocating per-cpu buffers
> separately.  This allows larger trace buffers; however, it requires an
> interface change to xentrace, which is why I'm making a Request For
> Comments.  (I'm not expecting this patch to be included in the 4.0
> release.)
>
> The old interface to get trace buffers was fairly simple: you ask for
> the info, and it gives you:
>  * the mfn of the first page in the buffer allocation
>  * the total size of the trace buffer
>
> The tools then mapped [mfn,mfn+size), calculated where the per-pcpu
> buffers were, and went on to consume records from them.
>
> -- Interface --
>
> The proposed interface works as follows.
>
> * XEN_SYSCTL_TBUFOP_get_info still returns an mfn and a size (so no
>   changes to the library).  However, the mfn now refers to a trace
>   buffer info area (t_info), allocated once at boot time.  The trace
>   buffer info area contains the mfns of the per-pcpu buffers.
>
> * The t_info struct contains an array of "offset pointers", one per
>   pcpu.  Each is an offset into the t_info data area of the array of
>   mfns for that pcpu.  So logically, the layout looks like this:
>
>   struct {
>       int16_t tbuf_size;       /* Number of pages per cpu */
>       int16_t offset[NR_CPUS]; /* Offset into the t_info area of each
>                                 * cpu's mfn array */
>       uint32_t mfn[NR_CPUS][TBUF_SIZE];
>   };
>
>   So if NR_CPUS was 16, and TBUF_SIZE was 32, we'd have:
>
>   struct {
>       int16_t tbuf_size;  /* Number of pages per cpu */
>       int16_t offset[16]; /* Offset into the t_info area of the array */
>       uint32_t p0_mfn_list[32];
>       uint32_t p1_mfn_list[32];
>       ...
>       uint32_t p15_mfn_list[32];
>   };
>
> * So the new way to map trace buffers is as follows:
>   + Call TBUFOP_get_info to get the mfn and size of the t_info area,
>     and map it.
>   + Get the number of cpus.
>   + For each cpu:
>     - Calculate the start of that cpu's mfn list thus:
>       unsigned long *mfn_list = ((unsigned long *)t_info)
>                                 + t_info->offset[cpu];
>     - Map t_info->tbuf_size mfns from mfn_list using
>       xc_map_foreign_batch().
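[For concreteness, here is a rough, untested sketch of the tools-side
sequence just described, written against the old-style libxc calls
(int xc_handle, xc_map_foreign_range(), xc_map_foreign_batch()).  The
struct layout, the get_t_info_mfn() wrapper, and the assumption that
the offsets are counted in uint32_t entries come from the description
in this mail, not from the patch itself:]

#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <xenctrl.h>

/* Layout as described above; offsets assumed to be in uint32_t units. */
struct t_info {
    int16_t tbuf_size;   /* pages per cpu */
    int16_t offset[];    /* per-cpu offsets into this area */
};

/* Hypothetical wrapper around XEN_SYSCTL_TBUFOP_get_info; returns the
 * mfn and size (in bytes) of the t_info area. */
int get_t_info_mfn(int xc_handle, unsigned long *mfn, unsigned long *size);

/* Map the t_info area and then each cpu's trace buffer.  num_cpus is
 * obtained by the caller (e.g. from xc_physinfo()); bufs must have
 * room for one pointer per cpu.  Error handling and unmapping are
 * omitted for brevity. */
static struct t_info *map_trace_bufs(int xc_handle, int num_cpus,
                                     void **bufs)
{
    unsigned long t_info_mfn, t_info_size;
    struct t_info *t_info;
    int cpu;

    if (get_t_info_mfn(xc_handle, &t_info_mfn, &t_info_size))
        return NULL;

    /* The t_info area lives in Xen's heap, so map it from DOMID_XEN. */
    t_info = xc_map_foreign_range(xc_handle, DOMID_XEN, t_info_size,
                                  PROT_READ, t_info_mfn);
    if (t_info == NULL)
        return NULL;

    for (cpu = 0; cpu < num_cpus; cpu++) {
        /* This cpu's mfn list starts offset[cpu] uint32_t's into t_info. */
        uint32_t *mfn_list = (uint32_t *)t_info + t_info->offset[cpu];
        xen_pfn_t pfns[t_info->tbuf_size];
        int i;

        for (i = 0; i < t_info->tbuf_size; i++)
            pfns[i] = mfn_list[i];

        /* Map this cpu's tbuf_size pages of trace buffer. */
        bufs[cpu] = xc_map_foreign_batch(xc_handle, DOMID_XEN, PROT_READ,
                                         pfns, t_info->tbuf_size);
    }

    return t_info;
}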
> In the current implementation, the t_info size is fixed at 2 pages,
> allowing about 2000 pages total to be mapped.  For a 32-way system,
> this would allow up to 63 pages per cpu (about 256KiB per cpu).
> Bumping this up to 4 pages would allow even larger systems if
> required.
>
> The current implementation also allocates each trace buffer
> contiguously, since that's the easiest way to get contiguous virtual
> address space.  But this interface allows Xen the flexibility, in the
> future, to allocate buffers in several chunks if necessary, without
> having to change the interface again.
>
> -- Implementation notes --
>
> The t_info area is allocated once at boot.  Trace buffers are
> allocated either at boot (if a parameter is passed) or when
> TBUFOP_set_size is called.  Due to the complexity of tracking pages
> mapped by dom0, unmapping or resizing trace buffers is not supported.
>
> I introduced a new per-cpu spinlock guarding trace data and buffers.
> This allows per-cpu data to be safely accessed and modified without
> racing with trace events currently being generated.  The per-cpu
> spinlock is grabbed whenever a trace event is generated; but in the
> (very very very) common case, the lock should be in the cache already.
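[Again purely for illustration, a minimal sketch of the per-cpu
locking scheme described in the paragraph above.  The percpu and
spinlock primitives (DEFINE_PER_CPU, this_cpu, per_cpu,
spin_lock_irqsave, for_each_online_cpu) are Xen's own, but the
function names ending in _sketch are hypothetical and are not taken
from the patch:]

#include <xen/cpumask.h>
#include <xen/percpu.h>
#include <xen/spinlock.h>
#include <xen/types.h>

/* One lock per pcpu, guarding that pcpu's trace buffer and metadata. */
static DEFINE_PER_CPU(spinlock_t, t_lock);

/* Called once, e.g. at boot, to initialise the per-cpu locks. */
static void trace_lock_init_sketch(void)
{
    unsigned int cpu;

    for_each_online_cpu ( cpu )
        spin_lock_init(&per_cpu(t_lock, cpu));
}

/* Producer path: taken every time a trace record is generated.  In
 * the common case the lock is uncontended and already cache-hot on
 * the local cpu. */
static void trace_record_sketch(u32 event, unsigned int extra_bytes,
                                const void *extra_data)
{
    unsigned long flags;

    spin_lock_irqsave(&this_cpu(t_lock), flags);
    /* ... write the record into this cpu's trace buffer here ... */
    spin_unlock_irqrestore(&this_cpu(t_lock), flags);
}

/* Control path (e.g. buffer allocation via TBUFOP_set_size): take a
 * given cpu's lock before touching that cpu's buffer pointers, so the
 * update cannot race with records being generated on that cpu. */
static void trace_update_cpu_sketch(unsigned int cpu,
                                    void (*update)(unsigned int cpu))
{
    unsigned long flags;

    spin_lock_irqsave(&per_cpu(t_lock, cpu), flags);
    update(cpu);
    spin_unlock_irqrestore(&per_cpu(t_lock, cpu), flags);
}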
> Feedback welcome.
>
>  -George
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel