
Re: [Xen-devel] Questions about Using Perf at Dom0



On 08/30/2016 11:22 AM, Sanghyun Hong wrote:
>> As I said, this works for me for Xen guests. I thought perf kvm might
>> check whether there is a KVM guest, but it does not. The error message
>> you see is generated at the end of sampling if no samples are found.
>
> Oh, I see. I tried the same command and checked the contents of the
> perf.data.kvm file, and it seems it does not contain hardware
> counters. I’ve been using the command ‘/perf stat -d -d -d -a/’ to
> collect the hardware counters. Is there any way to collect hardware
> counters and distinguish them from the host’s?
>
>> What Xen and Linux versions are you running (xl info)?
>
> Here is the information from the command (xl info):
>
> release                : 3.10.0+10
> version                : #19 SMP Mon Aug 29 10:17:18 EDT 2016


This is a very old kernel; Xen PMU support went in around 4.2. I
understand that you backported the VPMU patches, but I have no idea what
the state of perf is for that release, in both the kernel and userland.

I'd suggest you upgrade to something more up-to-date and try again.

-boris
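Before upgrading, a quick sanity check along these lines might confirm both the kernel version and whether the vPMU sysfs interface is present (a sketch; the 4.2 cutoff is approximate, and the /sys/hypervisor/pmu path is taken from the kernel ABI document cited later in the thread):

```shell
#!/bin/sh
# Sanity-check sketch: is the dom0 kernel new enough for Xen PMU support
# (merged around Linux 4.2), and is the vPMU sysfs node visible?
# The 4.2 cutoff and the sysfs path are assumptions, not authoritative.

kernel_at_least() {
    # Return 0 if "major.minor" version $1 >= version $2.
    have_major=${1%%.*}; have_minor=${1#*.}
    want_major=${2%%.*}; want_minor=${2#*.}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] &&
          [ "$have_minor" -ge "$want_minor" ]; }
}

ver=$(uname -r | cut -d. -f1,2)
if kernel_at_least "$ver" 4.2; then
    echo "kernel $ver: recent enough for Xen PMU support"
else
    echo "kernel $ver: too old, upgrade (support landed around 4.2)"
fi

# Only present when running as a Xen dom0/guest with vPMU support:
[ -f /sys/hypervisor/pmu/pmu_mode ] \
    && cat /sys/hypervisor/pmu/pmu_mode \
    || echo "no /sys/hypervisor/pmu/pmu_mode node"
```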


> machine                : x86_64
> nr_cpus                : 4
> max_cpu_id             : 3
> nr_nodes               : 1
> cores_per_socket       : 4
> threads_per_core       : 1
> cpu_mhz                : 3292
> hw_caps                :
> b7ebfbff:17bae3ff:28100800:00000001:00000001:00000000:00000000:00000100
> virt_caps              : hvm
> total_memory           : 8079
> free_memory            : 7076
> sharing_freed_memory   : 0
> sharing_used_memory    : 0
> outstanding_claims     : 0
> free_cpus              : 0
> xen_major              : 4
> xen_minor              : 6
> xen_extra              : .1
> xen_version            : 4.6.1
> xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32
> hvm-3.0-x86_32p hvm-3.0-x86_64 
> xen_scheduler          : credit
> xen_pagesize           : 4096
> platform_params        : virt_start=0xffff800000000000
> xen_changeset          : 
> xen_commandline        : dom0_mem=752M,max:752M watchdog=0 ucode=scan
> dom0_max_vcpus=4 crashkernel=128M@256M console=vga vga=mode-0x0311
> vpmu=bts loglvl=all guest_loglvl=all hap=0
> cc_compiler            : gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
> cc_compile_by          : root
> cc_compile_domain      : 
> cc_compile_date        : Mon Aug 29 12:11:32 EDT 2016
> build_id               : 16b0170b4215754184393e894660df6ceac6f940
> xend_config_format     : 4
>
>> Do you know whether performance counters are enabled (dmesg | grep
>> Performance)?
>
> Here is the output from the command, and I think the performance
> counter (vPMU driver) is enabled.
>
> [    0.016096] PerformanceEvents: 16-deep LBR, SandyBridge events,
> Intel PMU driver.
>
> I really appreciate your support,
> Sanghyun.
>
>
>> On Aug 30, 2016, at 10:39 AM, Boris Ostrovsky
>> <boris.ostrovsky@xxxxxxxxxx> wrote:
>>
>> On 08/29/2016 08:12 PM, Sanghyun Hong wrote:
>>> Hi Boris, 
>>>
>>> /[I appreciate your support, and I have one last question]/
>>>
>>> Since we’re using the */Xen/* hypervisor (not KVM, which is
>>> type-II), I think we cannot use the /*perf kvm*/ command. Here are
>>> the outputs:
>>>
>>> --------
>>> # perf kvm --host --guest record -C 1 sleep 1
>>> [ perf record: Woken up 1 times to write data ]
>>> Warning:
>>> 1 unprocessable samples recorded.
>>> Do you have a KVM guest running and not using 'perf kvm'?
>>> [ perf record: Captured and wrote 0.242 MB perf.data.kvm ]
>>
>> As I said, this works for me for Xen guests. I thought perf kvm might
>> check whether there is a KVM guest, but it does not. The error message
>> you see is generated at the end of sampling if no samples are found.
>>
>> What Xen and Linux versions are you running (xl info)?
>>
>> Do you know whether performance counters are enabled (dmesg | grep
>> Performance)?
>>
>>>
>>> # perf kvm --guest --host report --stdio
>>> Warning:
>>> 1 unprocessable samples recorded.
>>> Do you have a KVM guest running and not using 'perf kvm'?
>>> Error:
>>> The perf.data.kvm file has no samples!
>>> # To display the perf.data header info, please use
>>> --header/--header-only options.
>>> #
>>> --------
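Before concluding that nothing was captured, it may be worth inspecting what actually landed in the file. The --header options are suggested by perf's own error message in the output above; 'perf evlist' lists which events were recorded (both require perf and the perf.data.kvm file on the Xen host):

```shell
# Sketch: inspect what (if anything) was recorded into perf.data.kvm.
perf evlist -i perf.data.kvm                        # recorded events
perf report -i perf.data.kvm --header-only --stdio  # header, incl. sample stats
```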
>>>
>>> In addition, we used perf to collect hardware performance counters,
>>> not symbols, kernel functions, or other data.
>>
>> Symbols are added to samples at the end --- perf doesn't collect them
>> explicitly.
>>
>>> Thus, what we really need is to separate the hardware counters by
>>> domain, if possible (i.e. we want to verify that some counters are
>>> from Dom0, others are from Dom1, still others from Dom2, etc.)
>>>
>>> I’m fairly sure that the code you used to distinguish each domain
>>> in the samples will work. I wonder whether you agree.
>>
>>
>> It does work. Note though that you will only be able to distinguish
>> dom0 from guests, i.e. all non-dom0 guests are presented by perf as a
>> single guest; you won't be able to separate guest1 from guest2. I don't
>> know whether this is a perf kvm limitation or (more likely) something
>> specific to how perf kvm processes samples from Xen. I've never looked
>> at that.
>>
>> -boris
>>
>>>
>>> Best,
>>> Sanghyun.
>>>
>>>
>>>> On Aug 29, 2016, at 6:37 PM, Boris Ostrovsky
>>>> <boris.ostrovsky@xxxxxxxxxx> wrote:
>>>>
>>>> On 08/29/2016 05:48 PM, Sanghyun Hong wrote:
>>>>> I really appreciate your answers. Last but not least, I want to
>>>>> make one thing clearer:
>>>>>
>>>>>> The hypervisor will provide dom0 with a raw sample (guest's,
>>>>>> dom0's or
>>>>>> hypervisor's) and then it's the job of dom0 kernel to properly
>>>>>> tag and
>>>>>> format it and make it available to the userland (i.e. perf itself).
>>>>> Does this mean that if I run the perf command on Dom0 while other
>>>>> domains are running, we collect the performance counter values
>>>>> without distinguishing each domain?
>>>>
>>>> With 'self' mode each domain collects samples only for itself.
>>>>
>>>> And for 'all' mode apparently this kind of works. With perf build from
>>>> Linux 4.7 tree:
>>>>
>>>> root@haswell> xl vcpu-list
>>>> Name                                ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
>>>> Domain-0                             0     0    0   -b-      55.0  0 / all
>>>> Domain-0                             0     1    1   r--      35.9  1 / all
>>>> Domain-0                             0     2    2   -b-      36.9  2 / all
>>>> Domain-0                             0     3    3   -b-      36.2  3 / all
>>>> fedora                               1     0    1   -b-     183.7  1 / all
>>>> root@haswell> ./perf kvm --guest --host --guestkallsyms=/tmp/kallsyms record -C 1 sleep 1
>>>> [ perf record: Woken up 1 times to write data ]
>>>> [ perf record: Captured and wrote 0.139 MB perf.data.kvm (60 samples) ]
>>>> root@haswell> ./perf kvm --guest --host --guestkallsyms=/tmp/kallsyms report --stdio
>>>> ...
>>>> # Overhead  Command    Shared Object            Symbol
>>>> # ........  .........  .......................  ..............................
>>>> #
>>>>   14.48%  swapper    [unknown]                [k] 0xffff82d0801c7a37
>>>>    6.06%  swapper    [kernel.kallsyms]        [k] update_blocked_averages
>>>>    4.38%  swapper    [unknown]                [k] 0xffff82d08013690f
>>>>    2.75%  swapper    [kernel.kallsyms]        [k] update_rq_clock
>>>>    2.34%  swapper    [kernel.kallsyms]        [k] irq_enter
>>>>    2.33%  [guest/0]  [guest.kernel.kallsyms]  [g] fb_pad_aligned_buffer
>>>>    2.24%  1.hda-0    [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
>>>>    2.11%  [guest/0]  [guest.kernel.kallsyms]  [g] native_safe_halt
>>>>    2.05%  [guest/0]  [guest.kernel.kallsyms]  [g] update_blocked_averages
>>>> ...
>>>>
>>>> The "unknown" samples are from the hypervisor.
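Not something from the thread, but given the report layout above, a small awk filter could split the reported overhead into dom0, guest, and hypervisor buckets (a sketch: the column positions and the "unresolved [unknown] kernel samples are Xen" rule are assumptions based on this one report):

```shell
# Sketch: tally 'perf kvm report --stdio' overhead by origin.
# [g] rows are guest samples; [k] rows with an "[unknown]" DSO are
# attributed to the hypervisor; everything else is counted as dom0.
tally_report() {
    awk '
        /^[[:space:]]*[0-9]+\.[0-9]+%/ {
            pct = $1; sub(/%/, "", pct)
            if ($4 == "[g]")            guest += pct
            else if ($3 == "[unknown]") xen   += pct
            else                        dom0  += pct
        }
        END { printf "dom0=%.2f guest=%.2f xen=%.2f\n", dom0, guest, xen }
    '
}
```

Usage would be something like: ./perf kvm --guest --host report --stdio | tally_report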
>>>>
>>>>
>>>>>
>>>>>> The tool will then look at the tag and display the event as
>>>>>> belonging to
>>>>>> host (dom0, really) or guest. This is supported for KVM, with 'perf
>>>>>> kvm'
>>>>>> commands.
>>>>>>
>>>>>> So if you can run perf kvm commands then you may be able to
>>>>>> differentiate dom0's and guest's samples (you will also see
>>>>>> hypervisor
>>>>>> samples but they won't get resolved to symbols). I am just not
>>>>>> sure perf kvm will run on a non-KVM host. I had a private copy
>>>>>> where it did, but this was based on a fairly old version of Linux.
>>>>>>
>>>>>> (Also, stack profiling is not supported at all).
>>>>>
>>>>> I wonder if you could share your private copy of the code. I’m
>>>>> looking into the source code of the patches, and it seems there’s a
>>>>> way to do it. Even if your code is outdated, I think I can manage to
>>>>> port it. I promise that once I finish the porting, I will share my
>>>>> adjustments to your code as well.
>>>>
>>>> Looks like you may not need it.
>>>>
>>>> -boris
>>>>
>>>>>
>>>>> Best,
>>>>> Sanghyun.
>>>>>
>>>>>
>>>>>> On Aug 29, 2016, at 5:24 PM, Boris Ostrovsky
>>>>>> <boris.ostrovsky@xxxxxxxxxx> wrote:
>>>>>>
>>>>>> On 08/29/2016 05:08 PM, Sanghyun Hong wrote:
>>>>>>>> Yes, this will allow the hypervisor to collect samples from
>>>>>>>> multiple
>>>>>>>> guests. However, the tool (perf) probably won't be able to properly
>>>>>>>> process these samples. But you can try.
>>>>>>> I understand; thus, I applied the patches and set
>>>>>>> the /pmu_mode/ to *all*. However, I’m really curious what you mean
>>>>>>> by "the tool (*perf*) probably won’t be able to properly process
>>>>>>> these samples." Is there anything I have to keep in mind while
>>>>>>> collecting the counters, or will the counter values be incorrect?
>>>>>> The hypervisor will provide dom0 with a raw sample (guest's,
>>>>>> dom0's or
>>>>>> hypervisor's) and then it's the job of dom0 kernel to properly
>>>>>> tag and
>>>>>> format it and make it available to the userland (i.e. perf
>>>>>> itself). The
>>>>>> tool will then look at the tag and display the event as belonging to
>>>>>> host (dom0, really) or guest. This is supported for KVM, with 'perf
>>>>>> kvm'
>>>>>> commands.
>>>>>>
>>>>>> So if you can run perf kvm commands then you may be able to
>>>>>> differentiate dom0's and guest's samples (you will also see
>>>>>> hypervisor
>>>>>> samples but they won't get resolved to symbols). I am just not
>>>>>> sure perf kvm will run on a non-KVM host. I had a private copy
>>>>>> where it did, but this was based on a fairly old version of Linux.
>>>>>>
>>>>>> (Also, stack profiling is not supported at all).
>>>>>>
>>>>>>>> You will want to run dom0 on all physical processors (i.e. no
>>>>>>>> dom0_max_vcpus boot option) and pin all VCPUs for both dom0 and the
>>>>>>>> guest.
>>>>>>> Sure, thanks! My machine has four physical cores, and I’ve set
>>>>>>> /dom0_max_vcpus=4/. I think this has the same effect as removing
>>>>>>> the /dom0_max_vcpus/ option.
>>>>>> Yes.
>>>>>>
>>>>>> -boris
>>>>>>
>>>>>>> Best,
>>>>>>> Sanghyun.
>>>>>>>
>>>>>>>
>>>>>>>> On Aug 29, 2016, at 4:59 PM, Boris Ostrovsky
>>>>>>>> <boris.ostrovsky@xxxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> On 08/29/2016 02:42 PM, Sanghyun Hong wrote:
>>>>>>>>> Hi Boris,
>>>>>>>>>
>>>>>>>>> I’ve found the documentation
>>>>>>>>> (https://github.com/torvalds/linux/blob/master/Documentation/ABI/testing/sysfs-hypervisor-pmu)
>>>>>>>>> in the kernel source tree, and it seems that if we change the
>>>>>>>>> mode from *self* to *all*, we can collect counters for all the
>>>>>>>>> domains, right?
>>>>>>>> Yes, this will allow the hypervisor to collect samples from
>>>>>>>> multiple
>>>>>>>> guests. However, the tool (perf) probably won't be able to properly
>>>>>>>> process these samples. But you can try.
>>>>>>>>
>>>>>>>> You will want to run dom0 on all physical processors (i.e. no
>>>>>>>> dom0_max_vcpus boot option) and pin all VCPUs for both dom0 and the
>>>>>>>> guest.
>>>>>>>>
>>>>>>>>
>>>>>>>> -boris
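Put together, the setup suggested above might look like this (a sketch only: it assumes a 4-pCPU machine, a guest named 'fedora', and the sysfs path from the kernel's sysfs-hypervisor-pmu document; it must run in dom0 on a vpmu-enabled Xen):

```shell
# Sketch of the setup steps discussed above.

# 1. Switch the vPMU from per-domain ('self') to system-wide ('all') mode:
cat /sys/hypervisor/pmu/pmu_mode         # typically 'self' by default
echo all > /sys/hypervisor/pmu/pmu_mode

# 2. Pin every vCPU to a fixed pCPU so samples map to stable CPUs,
#    e.g. dom0 vCPU n -> pCPU n, and the guest's vCPU 0 -> pCPU 1:
for n in 0 1 2 3; do xl vcpu-pin Domain-0 "$n" "$n"; done
xl vcpu-pin fedora 0 1
```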
>>>>>>>>
>>>>>>>>> All the best,
>>>>>>>>> Sanghyun.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Aug 29, 2016, at 12:36 PM, Boris Ostrovsky
>>>>>>>>>> <boris.ostrovsky@xxxxxxxxxx> wrote:
>>>>>>>>>>
>>>>>>>>>> On 08/29/2016 09:18 AM, Sanghyun Hong wrote:
>>>>>>>>>>> Dear Xen-Devel Community:
>>>>>>>>>>>
>>>>>>>>>>> I’m a grad student working on measuring performance counters
>>>>>>>>>>> in Xen domains. I read this thread
>>>>>>>>>>> (https://wiki.xenproject.org/wiki/Xen_Profiling:_oprofile_and_perf)
>>>>>>>>>>> on the web, and it says that using the Linux perf command will
>>>>>>>>>>> let us collect performance counters in both dom0 and domU.
>>>>>>>>>>> Does it mean that we can collect both of them at once if we
>>>>>>>>>>> run the perf command in dom0? (If not, does it mean we can
>>>>>>>>>>> collect counters for each domain separately by running the
>>>>>>>>>>> perf command in each domain?)
>>>>>>>>>> Profiling both guest and dom0 (and the hypervisor) requires
>>>>>>>>>> changes to
>>>>>>>>>> perf and those are not there yet.
>>>>>>>>>>
>>>>>>>>>> But you can run perf in each guest (including dom0)
>>>>>>>>>> separately. Make sure you have the vpmu=true boot option.
>>>>>>>>>>
>>>>>>>>>> -boris
>


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

