RE: [Xen-devel] [PATCH][RFC] FPU LWP 0/5: patch description
Hi Dan,

This isn't the cycle count of a single switch. It is the total cycle count, accumulated over a period. I dumped the numbers at an arbitrary point while a guest was running.

Thanks,
-Wei

-----Original Message-----
From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Dan Magenheimer
Sent: Friday, April 15, 2011 3:16 PM
To: Huang2, Wei; Keir Fraser
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: RE: [Xen-devel] [PATCH][RFC] FPU LWP 0/5: patch description

Wait... a context switch takes over 4 billion cycles? Not likely!

And please check your division. I get the same answer from "dc" only when I use lowercase hex numbers and dc complains about unimplemented chars, else I get 0.033%... also unlikely.

> -----Original Message-----
> From: Wei Huang [mailto:wei.huang2@xxxxxxx]
> Sent: Thursday, April 14, 2011 4:57 PM
> To: Keir Fraser
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-devel] [PATCH][RFC] FPU LWP 0/5: patch description
>
> Hi Keir,
>
> I ran a quick test to calculate the overhead of __fpu_unlazy_save() and
> __fpu_unlazy_restore(), which are used to save/restore LWP state. Here
> are the results:
>
> (1) tsc_total: total time used for context_switch() in x86/domain.c
> (2) tsc_unlazy: total time used for __fpu_unlazy_save() +
> __fpu_unlazy_restore()
>
> One example:
> (XEN) tsc_unlazy=0x00000000008ae174
> (XEN) tsc_total=0x00000001028b4907
>
> So the overhead is about 0.2% of total time used by context_switch(). Of
> course, this is just one example. I would say the overhead ratio would
> be <1% for most cases.
>
> Thanks,
> -Wei
>
> On 04/14/2011 04:09 PM, Keir Fraser wrote:
> > On 14/04/2011 21:37, "Wei Huang"<wei.huang2@xxxxxxx> wrote:
> >
> >> The following patches support AMD lightweight profiling.
> >>
> >> Because LWP isn't tracked by CR0.TS bit, we clean up the FPU code to
> >> handle lazy and unlazy FPU states differently. Lazy FPU state (such as
> >> SSE, YMM) is handled when #NM is triggered. Unlazy state, such as LWP,
> >> is saved and restored on each vcpu context switch. To simplify the code,
> >> we also add a mask option to xsave/xrstor function.
> > How much cost is added to context switch paths in the (overwhelmingly
> > likely) case that LWP is not being used by the guest? Is this adding a whole
> > lot of unconditional overhead for a feature that no one uses?
> >
> > -- Keir
> >
> >> Thanks,
> >> -Wei
> >>
> >> _______________________________________________
> >> Xen-devel mailing list
> >> Xen-devel@xxxxxxxxxxxxxxxxxxx
> >> http://lists.xensource.com/xen-devel
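
Since the thread disputes the division, here is a minimal sketch that re-does the arithmetic on the two counters from Wei's dump. The function name `overhead_percent` is hypothetical and exists only for this illustration; the hex values are the ones quoted in the thread. Both counters are treated as totals accumulated over a period, as Wei clarifies, not as the cost of a single context switch.

```python
def overhead_percent(unlazy_hex: str, total_hex: str) -> float:
    """Return tsc_unlazy as a percentage of tsc_total.

    Both arguments are hexadecimal strings as printed by the (XEN) dump.
    int(..., 16) accepts upper- or lowercase digits and an optional 0x prefix.
    """
    unlazy = int(unlazy_hex, 16)
    total = int(total_hex, 16)
    if total == 0:
        raise ValueError("tsc_total must be non-zero")
    return 100.0 * unlazy / total


# Values quoted in the thread.
TSC_UNLAZY = "0x00000000008ae174"  # 9,101,684 cycles
TSC_TOTAL = "0x00000001028b4907"   # 4,337,649,927 cycles

if __name__ == "__main__":
    pct = overhead_percent(TSC_UNLAZY, TSC_TOTAL)
    print(f"unlazy save/restore overhead: {pct:.2f}% of context_switch() time")
```

With these inputs the ratio comes out at roughly 0.21%, which is consistent with the "about 0.2%" figure in the original message. The total of about 4.3 billion cycles is the sum over many switches, so it says nothing about the cost of one switch.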