
[Xen-ia64-devel] [PATCH][RFC] performance tuning TAKE 6



Hi.
These patches are performance tuning TAKE 6.
They are posted for comment, review and evaluation.
Fujitsu's benchmark results revealed that
netperf native -> domU performance isn't improved enough.
The tlb zap page hypercall and the deferred free timered list address this issue.
Based on perfc-based analysis, I concluded that
this patch reduces the vTLB flush cost.
However, I'm not sure whether the netperf performance is really improved.
I hope this patch also improves the dom0 -> domU case within the same box.

These patches are for the changeset
11456:3e4fa8b5b245889a89a894faf6f5b8398a8f9907
of xen-ia64-unstable.hg

PATCHES:
- performance counter
- p2m exposure
- per vcpu vhpt
- tlb tracking
  - grant table transfer 
  - netback skbuff preregister
  - netfront page preregister
- deferred page freeing
- tlb flush clock
- tlb zap page hypercall and deferred free timered list
  NEW. This addresses netperf native -> domU performance.

CHANGES:
- tlb zap page hypercall and deferred free timered list

PATCH DETAIL:
- per vcpu vhpt
  It focuses on vcpu migration between physical cpus.
  With the credit scheduler, vcpus are migrated frequently.
  This patch tries to reduce vTLB flushes when a vcpu is migrated.

- p2m exposure
  DMA paravirtualization requires conversion from pseudo physical address
  to machine address. Currently it is done by hypercall.
  This patch tries to reduce the conversion overhead by mapping
  the Xen p2m table read-only into the domain.
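
As a rough illustration of the p2m exposure idea, the sketch below converts a pseudo physical address to a machine address with a plain table lookup instead of a hypercall. It assumes a flat read-only array indexed by pseudo physical frame number; the names p2m_view, p2m_lookup and pseudo_phys_to_machine are hypothetical, and the layout of the real exposed table may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_MFN UINT64_MAX

/* Hypothetical read-only view of the p2m table as mapped into the guest.
 * Assumption: a flat array where table[pfn] holds the machine frame number. */
struct p2m_view {
    const uint64_t *table;
    size_t nr_entries;
};

/* Look up the machine frame for a pseudo physical frame, no hypercall. */
static uint64_t p2m_lookup(const struct p2m_view *v, uint64_t pfn)
{
    if (pfn >= v->nr_entries)
        return INVALID_MFN;
    return v->table[pfn];
}

/* Convert a full pseudo physical address, keeping the in-page offset. */
static uint64_t pseudo_phys_to_machine(const struct p2m_view *v,
                                       uint64_t paddr, unsigned page_shift)
{
    uint64_t offset_mask = (UINT64_C(1) << page_shift) - 1;
    uint64_t mfn = p2m_lookup(v, paddr >> page_shift);

    if (mfn == INVALID_MFN)
        return INVALID_MFN;
    return (mfn << page_shift) | (paddr & offset_mask);
}
```

The lookup is a bounds check and one memory read, which is where the saving over a per-conversion hypercall would come from.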

- tlb tracking
  It focuses on grant table mapping.
  When a page is unmapped, a full vTLB flush is necessary.
  By tracking tlb inserts on grant mapped pages, the full vTLB flush
  can be avoided.
  In particular, vbd does only DMA, so dom0 doesn't insert a tlb entry
  on the grant mapped page. In such a case no vTLB flush is needed.
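
The decision made at unmap time can be sketched as a small state machine. This is only an illustration under stated assumptions: the structure and function names are hypothetical, it ignores which vcpu did the insert, and the single-entry flush case is an assumed refinement that the description above does not spell out.

```c
#include <assert.h>
#include <stdbool.h>

/* What must be flushed when a grant mapped page is unmapped. */
enum flush_kind { FLUSH_NONE, FLUSH_ONE_ENTRY, FLUSH_FULL };

/* How many distinct tlb inserts were seen on a tracked page. */
enum insert_state { SEEN_NONE, SEEN_ONE, SEEN_MANY };

/* Hypothetical per-page tracking record (not the real Xen structure). */
struct tracked_page {
    bool tracked;             /* false: inserts on this page are not recorded */
    enum insert_state state;
    unsigned long vaddr;      /* valid only when state == SEEN_ONE */
};

/* Called on every tlb insert that hits the page. */
static void track_insert(struct tracked_page *p, unsigned long vaddr)
{
    if (!p->tracked)
        return;
    if (p->state == SEEN_NONE) {
        p->state = SEEN_ONE;
        p->vaddr = vaddr;
    } else if (p->state == SEEN_ONE && p->vaddr != vaddr) {
        p->state = SEEN_MANY;
    }
}

/* Called on unmap: pick the cheapest flush that is still safe. */
static enum flush_kind flush_on_unmap(const struct tracked_page *p)
{
    if (!p->tracked)
        return FLUSH_FULL;                   /* nothing known, be conservative */
    switch (p->state) {
    case SEEN_NONE: return FLUSH_NONE;       /* e.g. a DMA-only vbd page */
    case SEEN_ONE:  return FLUSH_ONE_ENTRY;  /* flush only p->vaddr */
    default:        return FLUSH_FULL;
    }
}
```

A page that was only used for DMA never reaches track_insert, so it stays in SEEN_NONE and the unmap needs no flush at all.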
  
- netback skbuff/netfront page tlb tracking
  This focuses on grant table transfer.
  When a page is transferred, a full vTLB flush is necessary on both
  the sender domain and the receiver domain.
  By preregistering the page, Xen/IA64 begins to track tlb inserts on
  registered pages.

- deferred page freeing
  When a page on which tlb inserts aren't tracked is unmapped/zapped from
  a domain, a full vTLB flush is again necessary.
  The balloon driver and grant table page transfer are such cases.
  This patch focuses on them.
  It tries to batch freeing/zapping pages from a domain in order
  to reduce the number of full vTLB flushes.
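
The batching idea can be sketched as a small queue that pays for one full flush per batch rather than one per page. The batch size, the structure and the function names are hypothetical, and the flush and the free are represented only by counters.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_SIZE 4   /* assumption: chosen only for illustration */

/* Hypothetical per-domain queue of pages waiting to be freed. */
struct deferred_free {
    unsigned long pages[BATCH_SIZE];
    size_t nr;
    unsigned long full_flushes;   /* stands in for full vTLB flushes issued */
    unsigned long pages_freed;    /* stands in for pages actually released */
};

/* Flush once, then release every queued page. */
static void deferred_drain(struct deferred_free *d)
{
    if (d->nr == 0)
        return;
    d->full_flushes++;
    d->pages_freed += d->nr;
    d->nr = 0;
}

/* Queue a page instead of flushing and freeing it immediately. */
static void deferred_free_page(struct deferred_free *d, unsigned long mfn)
{
    d->pages[d->nr++] = mfn;
    if (d->nr == BATCH_SIZE)
        deferred_drain(d);
}
```

With this batch size, freeing ten pages followed by a final drain costs three full flushes instead of ten.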

- tlb flush clock
  This is intended to be a counterpart of the Xen/x86 tlb flush clock.
  But it is used only at vcpu context switch, not for lazy tlb flush.
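
One way to picture a flush clock at context switch is the sketch below. It is a reading of the general technique, not the patch's actual logic: the names are hypothetical, clock wrap-around is ignored, and it assumes that a full flush on a physical cpu after the vcpu last ran there makes a further flush at switch-in unnecessary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Global flush clock. Assumption: wrap-around is ignored in this sketch. */
static uint32_t flush_clock = 1;

struct pcpu { uint32_t last_flush; };  /* clock value at this cpu's last full flush */
struct vcpu { uint32_t last_ran; };    /* clock value when the vcpu was switched out */

/* Every full flush advances the clock and stamps the physical cpu. */
static void pcpu_full_flush(struct pcpu *c)
{
    c->last_flush = ++flush_clock;
}

/* Stamp the vcpu with the current clock when it stops running. */
static void vcpu_switch_out(struct vcpu *v)
{
    v->last_ran = flush_clock;
}

/* A flush is needed unless the cpu was flushed after the vcpu last ran. */
static bool flush_needed_on_switch_in(const struct pcpu *c, const struct vcpu *v)
{
    return c->last_flush <= v->last_ran;
}
```

The comparison is the only work done at switch-in, so the check itself is cheap compared with the flush it may avoid.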

- tlb zap page hypercall and deferred free timered list
  Introduces the tlb zap page hypercall,
  modifies the tlb track page hypercall semantics and
  reimplements the tlb untrack page hypercall.
  This patch tries to reduce the vTLB flush cost of the
  tlb track/untrack/zap page hypercalls by batching them with a timer.

FWIW my dot configs are as follows
- xen dot config
crash_debug=y
debug=y
verbose=y
xen_ia64_tlb_track=y
xen_ia64_tlb_track_cnt=y
xen_ia64_tlb_track_grant_table_page_transfer=y
xen_ia64_tlb_track_skbuff=y
xen_ia64_tlb_track_netfront_page=y
xen_ia64_tlb_track_deferred_flush=y
xen_ia64_pervcpu_vhpt=y
xen_ia64_deferred_free=y
xen_ia64_tlbflush_clock=y
xen_ia64_tlbflush_clock_tlb_track_entry=y

perfc=y
perfc_arrays=y

- Linux dot config includes
CONFIG_XEN_IA64_VDSO_PARAVIRT=y
CONFIG_XEN_IA64_EXPOSE_P2M=y
CONFIG_XEN_IA64_EXPOSE_P2M_USE_DTR=y
CONFIG_XEN_IA64_TLB_TRACK_SKBUFF=y
CONFIG_XEN_IA64_TLB_TRACK_NETFRONT_PAGE=y


thanks.
-- 
yamahata

Attachment: perf-tuning-6.tar.bz2
Description: Binary data
