
Re: [Xen-ia64-devel] [PATCH][RFC] performance tuning TAKE 5



Hi, all

We, the Fujitsu team, evaluated the effect of these patches on vnif
performance.

Conclusion:
These patches are very effective for transfers from DomU.
The details of this conclusion are being published at Xen Summit right now!

patch list
11334:3f4b18d026de_avoid_long_time_interrupt_masking.patch
11335:05f4b6f69eee_perfc_for_vtlb_flush.patch
11336:2885dcbd70e0_perfc_mm_c.patch
11337:eb2a3b4b22da_perfc_dom0vp_p2m_and_m2p.patch
11338:c8f39b6e1ece_p2m_exposure_xen_side.patch
11339:33d306f87a3d_p2m_exposure_linux_side.patch
11340:fa62cd7eea19_p2m_exposure_test_module.patch
11341:1b9a35a82cf1_pervcpu_vhpt.patch
11342:b158294df78d_fix_pte_flags_conflict.patch
11343:543338390a16_import_linux_hash.h.patch
11344:1aeb66069203_tlb_track.patch
11345:7c8573def928_deferred_page_freeing.patch
11346:a1ae2f9af64d_skbuff_tlb_tracking_xen_side.patch
11347:b3bf1d38f023_skbuff_tlb_tracking_linux_side.patch
11348:9b5a7b16bfcd_tlb_track_netfront_page_xen_side.patch
11349:3a4e3fbd2956_tlb_tracking_on_netfront_page_linux_side.patch
11350:d109c748960c_tlbflush_clock.patch

patched case:     all patches applied.
unpatched case:  only 11335, 11336 and 11337 applied.  These only add perf
counters.
patched-2 case:  patches applied except 11346,11347,11348,11349

measurement environment

Tiger4----------Gbit Ethernet hub----------Tiger4(Native)
xen cs11333                                RHEL4.2 memory 8GB
Dom0: RHEL4.2 memory 1GB
DomU: RHEL4.2 memory 1GB

tool : netperf2.4.1
measurement time 100sec
=========================================
Native to Native (CPU load as reported by top)
 client (0.7%) -> server (1.7%)
 transfer rate 764.81(Mbit/sec)
(Mbit/sec)      unpatched   patched   patched-2

Dom0->Native    798.18      798.36    797.78
Native->Dom0    688.94      752.22    740.77

DomU->Native    356.90      707.10    354.95   <------LOOK!
Native->DomU    109.27      129.69    108.76
==========================================
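
The relative gains can be worked out from the table above with a little
arithmetic. A minimal sketch in Python, using only the figures reported in
this mail (the dictionary layout and the function name are illustrative,
not part of the measurement tooling):

```python
# Throughput in Mbit/sec, copied from the table above.
# Columns: (unpatched, patched, patched-2)
RESULTS = {
    "Dom0->Native": (798.18, 798.36, 797.78),
    "Native->Dom0": (688.94, 752.22, 740.77),
    "DomU->Native": (356.90, 707.10, 354.95),
    "Native->DomU": (109.27, 129.69, 108.76),
}


def gain_percent(baseline, value):
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


if __name__ == "__main__":
    for path, (unpatched, patched, patched2) in RESULTS.items():
        print(f"{path:14s} patched {gain_percent(unpatched, patched):+6.1f}%  "
              f"patched-2 {gain_percent(unpatched, patched2):+6.1f}%")
```

For DomU->Native this gives roughly +98% in the patched case, while
patched-2 (without 11346-11349) stays within about 1% of the unpatched
figure, which points to those four patches as the source of the gain on
this path.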
The attached file contains the measurement logs, including the xentop logs.
Special thanks to Isaku!

Best regards.
katase.

----- Original Message ----- From: "Isaku Yamahata" <yamahata@xxxxxxxxxxxxx>
To: <xen-ia64-devel@xxxxxxxxxxxxxxxxxxx>
Sent: Thursday, August 31, 2006 12:17 PM
Subject: [Xen-ia64-devel] [PATCH][RFC] performance tuning TAKE 5


Hi all

These patches are for performance tuning TAKE 5.
They are for comment, review and evaluation.
I analyzed the behaviour of these patches based on the performance counters.
I'll send a separate mail about that.

PATCHES:
- performance counter
- p2m exposure
- per vcpu vhpt
- tlb tracking
- grant table transfer
 - netback skbuff preregister
 - netfront page preregister
- deferred page freeing
- tlb flush clock
 Now stabilized

CHANGES:
- various bug fixes
 in particular, the tlb flush clock is now stabilized

PATCH DETAIL:
- per vcpu vhpt
 This focuses on vcpu migration between physical cpus.
 With the credit scheduler, vcpus are migrated frequently.
 This patch tries to reduce vTLB flushes when a vcpu is migrated.

- p2m exposure
 DMA paravirtualization requires conversion from pseudo physical addresses
 to machine addresses. Currently this is done by hypercall.
 This patch tries to reduce the conversion overhead by mapping the Xen
 p2m table read-only into the domain.
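
To illustrate the p2m exposure idea, here is a toy model in Python (not Xen
code): the domain converts a pseudo physical address to a machine address
by indexing a read-only table instead of issuing a hypercall. The page
size (PAGE_SHIFT = 14, i.e. 16KB), the table contents and all names below
are assumptions made for this sketch.

```python
from types import MappingProxyType

PAGE_SHIFT = 14                      # assumed 16KB pages, for illustration only
PAGE_MASK = (1 << PAGE_SHIFT) - 1

# Hypothetical p2m table: pseudo physical frame number -> machine frame number.
# MappingProxyType stands in for the read-only mapping given to the domain.
_xen_p2m = {0: 0x1200, 1: 0x0345, 2: 0x7FFF}
p2m_readonly = MappingProxyType(_xen_p2m)


def pseudo_phys_to_machine(paddr):
    """Convert a pseudo physical address by table lookup, without a hypercall."""
    pfn = paddr >> PAGE_SHIFT
    offset = paddr & PAGE_MASK
    mfn = p2m_readonly[pfn]          # KeyError if the frame is not populated
    return (mfn << PAGE_SHIFT) | offset


if __name__ == "__main__":
    print(hex(pseudo_phys_to_machine((1 << PAGE_SHIFT) + 0x10)))
```

The point of the sketch is only that the lookup is a plain memory read on
the domain side; how the table is laid out and mapped in Xen/IA64 is
defined by the patches themselves.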

- tlb tracking
 This focuses on grant table mapping.
 When a page is unmapped, a full vTLB flush is necessary.
 By tracking tlb inserts on grant mapped pages, the full vTLB flush
 can be avoided.
 In particular, vbd does only DMA, so dom0 doesn't insert a tlb entry
 on the grant mapped page. In that case no vTLB flush is needed at all.

- netback skbuff/netfront page tlb tracking
 This focuses on grant table transfer.
 When a page is transferred, a full vTLB flush is necessary on both the
 sender domain and the receiver domain. By preregistering the page,
 Xen/IA64 begins to track tlb inserts on registered pages.
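
The decision made possible by tlb tracking in the two items above can be
sketched as a toy model in Python (not Xen code). The class and method
names are hypothetical, and the "targeted" result for a page that did get
a tlb insert is an assumption of this sketch; the description above only
states that the full vTLB flush can be avoided.

```python
class TlbTracker:
    """Toy model: remember which tracked pages had a TLB insert."""

    def __init__(self):
        self._inserted = {}          # page -> True once a TLB entry was inserted

    def register(self, page):
        """Start tracking a page (grant mapping or preregistration)."""
        self._inserted[page] = False

    def note_tlb_insert(self, page):
        """Record a TLB insert; inserts on untracked pages are not recorded."""
        if page in self._inserted:
            self._inserted[page] = True

    def unmap(self, page):
        """Return the flush needed when the page is unmapped or transferred."""
        if page not in self._inserted:
            return "full"            # nothing is known, so flush everything
        inserted = self._inserted.pop(page)
        return "targeted" if inserted else "none"


if __name__ == "__main__":
    tracker = TlbTracker()
    tracker.register("vbd_page")     # DMA only, never inserted into a TLB
    print(tracker.unmap("vbd_page"))
```

The vbd case corresponds to the "none" branch: the page was tracked and no
tlb insert was ever seen, so no flush is required on unmap.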

- deferred page freeing
 When a page on which tlb inserts aren't tracked is unmapped/zapped from
 a domain, a full vTLB flush is again necessary.
 The balloon driver and grant table page transfer are such cases,
 and this patch focuses on them.
 It tries to batch the freeing/zapping of pages from a domain in order
 to reduce the number of full vTLB flushes.
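
The batching idea can be sketched as a toy model in Python (not Xen code):
pages are queued instead of being freed one by one, and a single full vTLB
flush covers the whole batch. The batch size and all names are
hypothetical.

```python
class DeferredFreer:
    """Toy model: queue untracked pages and flush once per batch."""

    def __init__(self, batch_size=4):
        self.batch_size = batch_size  # hypothetical threshold
        self.pending = []             # pages zapped but not yet freed
        self.freed = []               # pages actually released
        self.full_flushes = 0         # number of full vTLB flushes issued

    def zap(self, page):
        """Queue a page for freeing; flush when the batch is full."""
        self.pending.append(page)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """One full vTLB flush covers every page queued so far."""
        if not self.pending:
            return
        self.full_flushes += 1
        self.freed.extend(self.pending)
        self.pending.clear()


if __name__ == "__main__":
    freer = DeferredFreer(batch_size=4)
    for page in range(8):
        freer.zap(page)
    print(freer.full_flushes, "full flushes for", len(freer.freed), "pages")
```

Without batching, each of these pages would need its own full vTLB flush;
in the model the flush count drops to one per batch.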

- tlb flush clock
 This is intended to be a counterpart of the Xen/x86 tlb flush clock.
 However, it is used only at vcpu context switch, not for lazy tlb flush.

thanks.
--
yamahata



--------------------------------------------------------------------------------


_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel

Attachment: yamahata.patch.log2.tar.gz
Description: GNU Zip compressed data


 

