
[Xen-users] Re: xenoprof: operation 9 failed for dom0 (status: -1)


  • To: "Santos, Jose Renato G" <joserenato.santos@xxxxxx>
  • From: Dante Cinco <dantecinco@xxxxxxxxx>
  • Date: Thu, 12 Nov 2009 18:05:43 -0800
  • Cc: Xen-users <xen-users@xxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 12 Nov 2009 18:06:47 -0800
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

I'm using the same patched (xen-r2.patch) oprofile-0.9.3 in both dom0 and domU. I'm using Ubuntu 9.04 with kernel 2.6.30.1 in domU and I do not see a "xenoprof" directory under "drivers/xen" in my 2.6.30.1 kernel tree. So that's my problem, right?

Dante

On Thu, Nov 12, 2009 at 5:54 PM, Santos, Jose Renato G <joserenato.santos@xxxxxx> wrote:
It looks like your domU does not have the right version of oprofile.
You should have seen a few printks for "dom 1". In particular dom1 should have called XENOPROF_enable_virq when you run "opcontrol --start" in the guest.
That is why it is failing.
Either your kernel or your user-level tools do not have Xen support. Did you apply the patch "oprofile-0.9.3-xen-r2.patch" to OProfile in the guest? If so, your kernel does not have Xenoprof support.
 
Renato
 


From: Dante Cinco [mailto:dantecinco@xxxxxxxxx]
Sent: Thursday, November 12, 2009 5:44 PM

To: Santos, Jose Renato G
Cc: Xen-users
Subject: Re: xenoprof: operation 9 failed for dom0 (status: -1)

I added this line (after the close bracket of "switch(op)") to xenoprof.c so that I can see the state transitions and the values of the static variables:

    printk("xenoprof: operation %d for dom %d (xenoprof_state : %d, ret : %d, activated : %d, adomains : %d)\n",
           op, current->domain->domain_id, xenoprof_state, ret, activated, adomains);

Here's the Xen console output after running "opcontrol --start-daemon --active-domains=0,1"

(XEN) xenoprof: operation 1 for dom 0 (xenoprof_state : 1, ret : 0, activated : 0, adomains : 0)
(XEN) xenoprof: operation 3 for dom 0 (xenoprof_state : 1, ret : 0, activated : 0, adomains : 1)
(XEN) xenoprof: operation 3 for dom 0 (xenoprof_state : 1, ret : 0, activated : 0, adomains : 2)
(XEN) xenoprof: operation 14 for dom 0 (xenoprof_state : 1, ret : 0, activated : 0, adomains : 2)
(XEN) xenoprof: operation 5 for dom 0 (xenoprof_state : 2, ret : 0, activated : 0, adomains : 2)
(XEN) xenoprof: operation 6 for dom 0 (xenoprof_state : 2, ret : 0, activated : 0, adomains : 2)
(XEN) xenoprof: operation 6 for dom 0 (xenoprof_state : 2, ret : 0, activated : 0, adomains : 2)
(XEN) xenoprof: operation 7 for dom 0 (xenoprof_state : 3, ret : 0, activated : 0, adomains : 2)
(XEN) xenoprof: operation 8 for dom 0 (xenoprof_state : 3, ret : 0, activated : 1, adomains : 2)

Notice that "activated" is incremented only to 1 while "adomains" is 2. This mismatch is what causes XENOPROF_start to fail after running "opcontrol --start" in dom0:

(XEN) xenoprof: operation 9 for dom 0 (xenoprof_state : 3, ret : -1, activated : 1, adomains : 2)
(XEN) xenoprof: operation 9 failed for dom 0 (status : -1)
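For longer console captures, the trace lines can be checked mechanically. The following is a minimal sketch that assumes the exact format string of the debugging printk quoted above; the struct and function names (xenoprof_trace, parse_trace, activation_pending) are hypothetical helpers, not part of Xen or OProfile.

```c
#include <stdio.h>

/* Fields printed by the debugging printk added to xenoprof.c (see above). */
struct xenoprof_trace {
    int op, dom, state, ret, activated, adomains;
};

/* Parse one "(XEN) xenoprof: operation N for dom D (...)" line.
 * Returns 1 if all six fields were read, 0 if the line has another format
 * (for example the stock "operation 9 failed ..." message). */
int parse_trace(const char *line, struct xenoprof_trace *t)
{
    int n = sscanf(line,
        "(XEN) xenoprof: operation %d for dom %d "
        "(xenoprof_state : %d, ret : %d, activated : %d, adomains : %d)",
        &t->op, &t->dom, &t->state, &t->ret, &t->activated, &t->adomains);
    return n == 6;
}

/* Non-zero when fewer domains have activated than were registered as active. */
int activation_pending(const struct xenoprof_trace *t)
{
    return t->activated < t->adomains;
}
```

Feeding each console line to parse_trace and then calling activation_pending on the parsed result flags the "activated : 1, adomains : 2" condition seen in the failing operation 9 line.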

My question is what is supposed to trigger "activated" to go to 2? It seems that XENOPROF_enable_virq (calls set_active which increments "activated") needs to be called twice (once for each active domain). I found only one instance in the dom0 kernel that calls "HYPERVISOR_xenoprof_op(XENOPROF_enable_virq, NULL)" and it is in drivers/xen/xenoprof/xenoprofile.c inside the function xenoprof_setup(void).
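The counting behaviour described here can be illustrated with a small user-space model. This is only a sketch of the behaviour inferred from the trace (operation 3 raises "adomains", enable_virq raises "activated", and start is refused while the two differ); it is not the hypervisor code, and every name in it is hypothetical.

```c
/* Toy model of the two counters shown in the trace; not the Xen source. */
struct profiler_model {
    int adomains;   /* domains registered through --active-domains */
    int activated;  /* domains that have issued enable_virq */
};

/* Models one registration of an active domain (operation 3 in the trace). */
void model_set_active_domain(struct profiler_model *m)
{
    m->adomains++;
}

/* Models one domain issuing enable_virq (operation 8 in the trace). */
void model_enable_virq(struct profiler_model *m)
{
    m->activated++;
}

/* Returns 0 if start may proceed, -1 otherwise (mirrors "status : -1"). */
int model_start(const struct profiler_model *m)
{
    return (m->activated == m->adomains) ? 0 : -1;
}
```

With two registered domains and a single enable_virq call, model_start returns -1, which matches the failing trace; a second enable_virq call, here expected to come from the guest, lets it return 0.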

I did run "opcontrol --start" in domU before running it in dom0. Here's the sequence:

domU: opcontrol --shutdown; opcontrol --reset
dom0: opcontrol --shutdown; opcontrol --reset; opcontrol --start-daemon --active-domains=0,1
domU: opcontrol --start
dom0: opcontrol --start

Dante


On Wed, Nov 11, 2009 at 6:06 PM, Santos, Jose Renato G <joserenato.santos@xxxxxx> wrote:
This is not the right way to use active domains.
You should specify "--active-domains=1" and not "--passive-domains=1".
By specifying passive domains the samples for domain 1 are being delivered to dom0.
Probably the samples you are seeing in the guest are from a previous run.
This is definitely not right.
Also, you do not need to specify dom0 as an active domain. The option --active-domains=0 should have no effect.
 
If "--active-domains=1"  is giving you the same error as before, this is probably because you are not following the right sequence of commands between dom0 and the guest.
It is working fine for me after the patch
 
Renato


From: Dante Cinco [mailto:dantecinco@xxxxxxxxx]
Sent: Wednesday, November 11, 2009 5:55 PM

To: Santos, Jose Renato G
Cc: Xen-users
Subject: Re: xenoprof: operation 9 failed for dom0 (status: -1)

Yes, your patch made a big difference. I can now get profile data in domU. Thanks.

There's a slight twist though in order to make this work. I have to use this command line in dom0:

opcontrol --start-daemon --passive-domains=1 --passive-images=/boot/vmlinux-2.6.30.1 --active-domains=0 --vmlinux=/boot/vmlinux-2.6.30.3 --xen=/boot/xen-syms-3.4.1

If I try to use "--active-domains=1" and not specify --passive-domains and --passive-images, I still get the same error message when I run "opcontrol --start" in dom0. I thought passive-domains are domains that are not running OProfile but I am running OProfile in domU (dom ID#1).

When I run "opreport -l" in dom0, I see samples from domain1-kernel and when I run "opreport -l" in domU, I see samples from vmlinux-2.6.30.1. My question is how does Xenoprof know which samples to route to domain1-kernel in dom0 and which samples to route to vmlinux-2.6.30.1 in domU?

Dante

On Wed, Nov 11, 2009 at 12:16 AM, Santos, Jose Renato G <joserenato.santos@xxxxxx> wrote:
I think I found the bug.
This patch should fix it.
Please let me know if it works

Thanks

Renato

diff -r 7422afed66ee xen/common/xenoprof.c
--- a/xen/common/xenoprof.c     Mon Sep 07 08:53:07 2009 +0100
+++ b/xen/common/xenoprof.c     Tue Nov 10 23:45:48 2009 -0800
@@ -681,8 +681,9 @@ int do_xenoprof_op(int op, XEN_GUEST_HAN
    {
    case XENOPROF_init:
        ret = xenoprof_op_init(arg);
-        if ( !ret )
-            xenoprof_state = XENOPROF_INITIALIZED;
+        if ( (ret == 0) &&
+             (current->domain == xenoprof_primary_profiler) )
+                xenoprof_state = XENOPROF_INITIALIZED;
        break;

    case XENOPROF_get_buffer:
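The effect of the guard added by the patch can be sketched outside the hypervisor. The model below assumes that the state values 1 and 3 seen in the console trace correspond to "initialized" and "ready"; the enum and function names are hypothetical and only mirror the two versions of the condition in the diff.

```c
/* State values as they appear in the console trace; the names are assumptions. */
enum model_state {
    MODEL_IDLE = 0,
    MODEL_INITIALIZED = 1,
    MODEL_READY = 3
};

/* Before the patch: any successful init call resets the global state. */
int init_state_unpatched(int state, int ret, int caller_is_primary)
{
    (void)caller_is_primary;  /* the old condition ignored the caller */
    return (ret == 0) ? MODEL_INITIALIZED : state;
}

/* After the patch: only the primary profiler's init call changes the state. */
int init_state_patched(int state, int ret, int caller_is_primary)
{
    return (ret == 0 && caller_is_primary) ? MODEL_INITIALIZED : state;
}
```

In this model, an init call from a non-primary domain arriving while the state is MODEL_READY drops it back to MODEL_INITIALIZED in the unpatched version and leaves it untouched in the patched one.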


> -----Original Message-----
> From: Dante Cinco [mailto:dantecinco@xxxxxxxxx]
> Sent: Tuesday, November 10, 2009 11:21 AM
> To: Santos, Jose Renato G
> Cc: Xen-users
> Subject: Re: xenoprof: operation 9 failed for dom0 (status: -1)
>
> As you can see from the output of "opreport -l" below, most
> of the cycles are coming from domain1-modules so I do need to
> focus on --active-domains=1 since --passive-domains=1 does
> not provide the profiling details for the domU modules.
>
> The steps described in the xenoprof tutorial for active
> domains are pretty straightforward but I cannot get past the
> "write error" when I run "opcontrol --start" in dom0.
>
> After running "opcontrol --start" in domU, I see the response
> below and I use lsmod to verify that oprofile module is
> loaded. Given the results from --passive-domains=1, the dom0
> and Xenoprof interface is working. It's the domU and dom0
> interface that has some problem. Is there some other way I
> can tell from dom0 that domU is "ready" before running
> "opcontrol --start" in dom0?
>
> FYI: I'm using the same oprofile-0.9.3 with xen-r2.patch in
> dom0 and domU. If I boot the Debian 2.6.30.1 kernel (the same
> one I'm using in domU) in bare-metal (no Xen VMM), I'm able
> to successfully run oprofile-0.9.3.
>
> opcontrol --start (from domU before "opcontrol --start" in dom0):
> Using 2.6+ OProfile kernel interface.
> Reading module info.
> Using log file /var/lib/oprofile/samples/oprofiled.log
> Daemon started.
> Profiler running.
>
> opreport -l (from dom0 using --passive-domains=1
> --passive-images=/boot/vmlinux-2.6.30.1):
> CPU: Core 2, speed 2533.51 MHz (estimated) Counted
> CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
> unit mask of 0x00 (Unhalted core cycles) count 100000
> samples  %        image name               app name
>       symbol name
> 617835   38.6156  domain1-modules          domain1-modules
>       (no symbols)
> 448911   28.0576  domain1-xen-unknown
> domain1-xen-unknown      (no symbols)
> 72460     4.5289  domain1-kernel           domain1-kernel
>       __down
> 43294     2.7059  domain1-kernel           domain1-kernel
>       __down_killable
> 34145     2.1341  domain1-kernel           domain1-kernel
>       validate_slab_slab
>
> Dante
>
>
> On Tue, Nov 10, 2009 at 10:22 AM, Santos, Jose Renato G
> <joserenato.santos@xxxxxx> wrote:
>
>
>       With passive domains you cannot have detailed profiling
> information on modules, only in kernel builtin functions and
> on Xen. All the samples associated with modules will be
> grouped under the same symbol "domain1-modules".
>       If you are interested in one particular module you
> should try to recompile the kernel with the associated code
> builtin (or you can use active domains, but follow the steps
> on the xenoprof tutorial to coordinate opcontrol in dom0 and
> in the guest)
>
>       Renato
>
>
> ________________________________
>
>
>               From: Dante Cinco [mailto:dantecinco@xxxxxxxxx]
>
>               Sent: Tuesday, November 10, 2009 10:13 AM
>
>               To: Santos, Jose Renato G
>               Cc: Xen-users
>               Subject: Re: xenoprof: operation 9 failed for
> dom0 (status: -1)
>
>
>               Renato,
>
>               I think I'm making progress. I followed your
> suggestion of using --passive-domains and --passive-images.
> When I run opreport, it is unable to find /domain1-modules
> and /domain1-xen-unknown. Where or how do I specify the
> kernel modules I have running in domU/domain1? I tried
> copying the *.ko files in /boot in dom0 and used
> --image-path=/boot in opreport and it is still not finding them.
>
>               Thanks.
>
>               Dante
>
>
>               On Mon, Nov 9, 2009 at 6:44 PM, Santos, Jose
> Renato G <joserenato.santos@xxxxxx> wrote:
>
>
>                       Try replacing "--active-domains=1" with
> "--passive-domains=1 --passive-images=<domU-kernel-image>" (use
> the uncompressed version of your kernel image for the guest,
> vmlinux-*)
>
>                       To use "active-domains" you need to run
> opcontrol in the guest in addition to running it in dom0 and
> you need to coordinate the execution of both instances. This
> requires the guest opcontrol to be ready before running
> "opcontrol --start" in dom0. That is why it is failing. I
> suspect you have not executed opcontrol in the guest.
>                       Using active-domains is very tricky. I
> suggest that you use --passive-domains, unless you really
> need active domains (it is only useful in case you need
> detailed profiles for user processes running in the guest)
>
>                       Renato
>
>
> ________________________________
>
>
>                               From: Dante Cinco
> [mailto:dantecinco@xxxxxxxxx]
>
>                               Sent: Monday, November 09, 2009 6:13 PM
>
>                               To: Santos, Jose Renato G
>
>                               Cc: Xen-users
>                               Subject: Re: xenoprof:
> operation 9 failed for dom0 (status: -1)
>
>
>                               Renato,
>
>                               I've narrowed down the
> opcontrol command sequence that causes the "write error" I'm
> having. If I just run "opcontrol --start" in dom0, it runs
> with no error and after "opcontrol --shutdown" I can run
> "opreport" and get a real report.
>
>                               If I run "opcontrol
> --start-daemon --active-domains=1" in dom0, run "opcontrol
> --start" in domU (ID#1) and go back to dom0 and run
> "opcontrol --start" I get the "write error" message. It's as
> if "--start-daemon" is grabbing the file handle for
> /dev/oprofile/enable so when "--start" tries to write "1" to
> /dev/oprofile/enable, it is unable to do so because it is
> already locked.
>
>                               So I can run OProfile in normal
> (non-Xen) mode but as soon as I start using "--start-daemon" I
> have problems. To me it seems like a Xenoprofile problem.
>
>                               I'm using OProfile 0.9.3 and
> oprofile-0.9.3-xen-r2.patch applied successfully.
>
>                               Dante
>
>
>                               On Thu, Nov 5, 2009 at 6:05 PM,
> Santos, Jose Renato G <joserenato.santos@xxxxxx> wrote:
>
>
>                                       What version of
> OProfile are you using?
>                                       Did you apply the Xen
> patch available in http://xenoprof.sourceforge.net ?
>
>                                       Renato
>
> ________________________________
>
>                                               From: Dante
> Cinco [mailto:dantecinco@xxxxxxxxx]
>                                               Sent: Thursday,
> November 05, 2009 5:16 PM
>                                               To: Santos,
> Jose Renato G
>                                               Cc: Xen-devel
>                                               Subject:
> xenoprof: operation 9 failed for dom0 (status: -1)
>
>                                               Renato,
>
>                                               When I tried
> running "opcontrol --start" (after previously running
> "opcontrol --start-daemon") in dom0, I get this error message:
>
>
> /usr/local/bin/opcontrol: line 1639: echo: write error:
> Operation not permitted
>
>                                               and this
> message in the Xen console:
>
>                                               (XEN) xenoprof:
> operation 9 failed for dom 0 (status : -1)
>
>                                               It looks like
> opcontrol is trying to do this: echo 1 > /dev/oprofile/enable
>
>                                               and it is
> failing. "operation 9" maps to XENOPROF_start which is
> consistent with running "opcontrol --start." At first, I
> ignored the error because it gave the indication "Profiler
> running" but after I ran "opcontrol --shutdown" followed by
> "opreport" in dom0, I got this error message:
>
>                                               error: no
> sample files found: profile specification too strict ?
>
>                                               Do you know why
> the write error is occurring? I followed the steps in
> xenoprof_tutorial.ppt.
>
>                                               Dante
>
>
>
>
>
>




 

