Re: [Xen-ia64-devel] [PATCH 0/4] [RFC] performance tuning of vTLB flush
Hi Alex.
Thank you very much for the measurement. It's interesting.

First, I found a big bug in the deferred page freeing patch yesterday:
it flushes unnecessarily. That patch is still under development, so the
bug probably causes the degradation.
When I checked a kernel compile with the per vcpu vhpt patch and the tlb
tracking patch (without the deferred page freeing patch), I saw an
improvement.

I should have explained the patches.

- per vcpu vhpt
  What is this patch for?
    It focuses on vcpu migration between physical cpus.
    With the credit scheduler, vcpus are migrated heavily.
    This patch tries to reduce vTLB flushes when a vcpu is migrated.
  Expected effect
    When vcpu migration occurs frequently, performance should increase.

- tlb tracking
  What is this patch for?
    It focuses on grant table mapping.
    When a page is unmapped, a full vTLB flush is necessary.
    By tracking tlb inserts on grant mapped pages, the full vTLB flush
    can be avoided.
    In particular, vbd does only DMA, so dom0 doesn't insert a tlb entry
    for the grant mapped page. In that case no vTLB flush is needed at all.
  Expected effect
    vbd performance increase.
    vnif packet sending performance increase.

- deferred page freeing
  What is this patch for?
    When a page whose tlb inserts aren't tracked is unmapped/zapped from
    a domain, a full vTLB flush is again necessary.
    The balloon driver and grant table page transfer are such cases.
    This patch focuses on them. It tries to batch the freeing/zapping of
    pages from a domain in order to reduce full vTLB flushes.
  Expected effect
    vnif packet receiving performance increase.
    balloon driver performance increase.

On Mon, Aug 07, 2006 at 01:54:38PM -0600, Alex Williamson wrote:
> On Fri, 2006-08-04 at 21:27 +0900, Isaku Yamahata wrote:
> > Hi all
> > These patches are for performance tuning.
> > They are for comment, review and evaluation.
> >
> > - per vcpu vhpt
> > - tlb tracking
> > - deferred page freeing
> >   NEW: This patch is incomplete yet. It must be polished more.
>
> Here are my performance numbers:
>
> System: 2 Cell HP Superdome, 8-way 1.5GHz/6M, 12GB RAM
>
> The test: UP dom0 (2GB, single user mode), 7-way domU (3GB, single user
> mode, no network), kernel build time w/ make -j8 (4 runs, 1st run thrown
> out, average of other 3 runs)
>
> Stock (cset 10931):
>
> real:  282.643s
> user: 1733.523s
> sys:   132.650s
>
> Patches applied, TLB tracking NOT enabled (fixed domain.c build):
>
> real:  282.209s (0.998)
> user: 1734.533s (1.001)
> sys:   130.253s (0.982)
>
> Patches applied, TLB tracking enabled:
>
> real:  288.591s (1.021)
> user: 1770.453s (1.021)
> sys:   143.297s (1.080)
>
> So it looks like w/o TLB tracking enabled, the patch is probably within
> the noise of my test. With TLB tracking enabled, there is a small, but
> noticeable performance degradation. Under what conditions might we see
> a performance improvement? Thanks,

The per vcpu vhpt patch focuses on vcpu migration cost.
Given your setup, where # of vcpus = # of physical cpus, there was
probably no vcpu migration.
This can be checked by running "xm vcpu-list" periodically.
If vcpu migration didn't occur, your result means that the per vcpu vhpt
patch doesn't introduce overhead, so it's a good result.

To see the per vcpu vhpt effect, vcpu migration is necessary.
So make # of vcpus > # of pcpus by creating more domUs and compiling on
the domUs simultaneously with the credit scheduler.
In that case I expect to see a difference.

For TLB tracking, I'm somewhat shocked. I expected a large performance
increase.
I hope it was the deferred page freeing patch that spoiled it; a
benchmark without that patch will show.

I tested the deferred page freeing patch very roughly with wget.
Although network performance is still horrible, it showed an improvement.
However, your result suggests that it causes overhead.

Thanks.
-- 
yamahata

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel
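
To make the intent of the tlb tracking and deferred page freeing patches
concrete, here is a toy model in Python. It is only a sketch and not Xen
code: the function name count_flushes, the three page labels, and the
batch_size parameter are all made up for illustration, and the idea that a
tracked page with a tlb entry needs only a narrower flush is an assumption.
It counts how many full vTLB flushes a sequence of page unmaps would cost.

```python
def count_flushes(unmaps, batch_size=1):
    """Count flushes for a sequence of page unmaps in a toy model.

    Each element of `unmaps` is one of (hypothetical labels):
      "untracked"    - tlb inserts were not tracked for this page
      "tracked_used" - tracked, and a tlb entry was inserted
      "tracked_idle" - tracked, and no tlb entry was inserted (DMA only)

    batch_size=1 models "flush on every unmap"; a larger value models
    deferring the free so that one full flush covers a whole batch.
    Returns (full_flushes, targeted_flushes).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    full = 0
    targeted = 0
    pending = 0  # untracked pages waiting to be freed
    for kind in unmaps:
        if kind == "tracked_idle":
            continue  # nothing was inserted, so nothing to flush
        if kind == "tracked_used":
            targeted += 1  # assumption: a narrower flush is enough
            continue
        if kind != "untracked":
            raise ValueError(f"unknown page kind: {kind!r}")
        pending += 1
        if pending == batch_size:
            full += 1  # one full flush covers the whole batch
            pending = 0
    if pending:
        full += 1  # the last partial batch still needs one flush
    return full, targeted


if __name__ == "__main__":
    pages = ["untracked"] * 10
    print(count_flushes(pages))                # no batching
    print(count_flushes(pages, batch_size=4))  # deferred freeing
    print(count_flushes(["tracked_idle"] * 5)) # vbd-like, DMA only
```

In this model, batching only helps untracked pages (the balloon driver and
page transfer cases), while tracking removes the full flush entirely for
pages that were never inserted into the tlb (the vbd case).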