
[Xen-devel] [xen-unstable test] 24354: regressions - trouble: broken/fail/pass



flight 24354 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/24354/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xend-qemut-winxpsp3   3 host-install(3)         broken REGR. vs. 24334
 test-amd64-amd64-xl-qemut-win7-amd64  7 windows-install         fail   REGR. vs. 24334
 test-amd64-i386-xl-win7-amd64        12 guest-localmigrate/x10  fail   REGR. vs. 24320

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl                       9 guest-start       fail never pass
 test-amd64-amd64-xl-pcipt-intel           9 guest-start       fail never pass
 test-amd64-i386-xend-winxpsp3            16 leak-check/check  fail never pass
 test-amd64-i386-xl-qemut-win7-amd64      13 guest-stop        fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64     13 guest-stop        fail never pass
 test-amd64-amd64-xl-win7-amd64           13 guest-stop        fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3       13 guest-stop        fail never pass
 test-amd64-amd64-xl-winxpsp3             13 guest-stop        fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 13 guest-stop        fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1       13 guest-stop        fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3       13 guest-stop        fail never pass

version targeted for testing:
 xen                  4fad2dc72a8607f50c3783e1cbcb3fb25e3af932
baseline version:
 xen                  2d03be65d5c50053fec4a5fa1d691972e5d953c9

------------------------------------------------------------
People who touched revisions under test:
  Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  Daniel Kiper <daniel.kiper@xxxxxxxxxx>
  David Scott <dave.scott@xxxxxxxxxxxxx>
  David Vrabel <david.vrabel@xxxxxxxxxx>
  Don Slutz <dslutz@xxxxxxxxxxx>
  Ian Campbell <ian.campbell@xxxxxxxxxx>
  Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
  Jan Beulich <jbeulich@xxxxxxxx>
  Julien Grall <julien.grall@xxxxxxxxxx>
  Mukesh Rathor <mukesh.rathor@xxxxxxxxxx>
  Rob Hoes <rob.hoes@xxxxxxxxxx>
------------------------------------------------------------

jobs:
 build-amd64-xend                                             pass    
 build-i386-xend                                              pass    
 build-amd64                                                  pass    
 build-armhf                                                  pass    
 build-i386                                                   pass    
 build-amd64-oldkern                                          pass    
 build-i386-oldkern                                           pass    
 build-amd64-pvops                                            pass    
 build-armhf-pvops                                            pass    
 build-i386-pvops                                             pass    
 test-amd64-amd64-xl                                          pass    
 test-armhf-armhf-xl                                          fail    
 test-amd64-i386-xl                                           pass    
 test-amd64-i386-rhel6hvm-amd                                 pass    
 test-amd64-i386-qemut-rhel6hvm-amd                           pass    
 test-amd64-i386-qemuu-rhel6hvm-amd                           pass    
 test-amd64-i386-freebsd10-amd64                              pass    
 test-amd64-amd64-xl-qemut-win7-amd64                         fail    
 test-amd64-i386-xl-qemut-win7-amd64                          fail    
 test-amd64-amd64-xl-qemuu-win7-amd64                         fail    
 test-amd64-amd64-xl-win7-amd64                               fail    
 test-amd64-i386-xl-win7-amd64                                fail    
 test-amd64-i386-xl-credit2                                   pass    
 test-amd64-i386-freebsd10-i386                               pass    
 test-amd64-amd64-xl-pcipt-intel                              fail    
 test-amd64-i386-rhel6hvm-intel                               pass    
 test-amd64-i386-qemut-rhel6hvm-intel                         pass    
 test-amd64-i386-qemuu-rhel6hvm-intel                         pass    
 test-amd64-i386-xl-multivcpu                                 pass    
 test-amd64-amd64-pair                                        pass    
 test-amd64-i386-pair                                         pass    
 test-amd64-amd64-xl-sedf-pin                                 pass    
 test-amd64-amd64-pv                                          pass    
 test-amd64-i386-pv                                           pass    
 test-amd64-amd64-xl-sedf                                     pass    
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1                     fail    
 test-amd64-i386-xl-winxpsp3-vcpus1                           fail    
 test-amd64-i386-xend-qemut-winxpsp3                          broken  
 test-amd64-amd64-xl-qemut-winxpsp3                           fail    
 test-amd64-amd64-xl-qemuu-winxpsp3                           fail    
 test-amd64-i386-xend-winxpsp3                                fail    
 test-amd64-amd64-xl-winxpsp3                                 fail    


------------------------------------------------------------
sg-report-flight on woking.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
    http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
    http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Not pushing.

------------------------------------------------------------
commit 4fad2dc72a8607f50c3783e1cbcb3fb25e3af932
Author: Ian Campbell <ian.campbell@xxxxxxxxxx>
Date:   Tue Jan 7 15:52:29 2014 +0000

    Revert "tools: libxc: flush data cache after loading images into guest memory"
    
    This reverts commit a0035ecc0d82c1d4dcd5e429e2fcc3192d89747a.
    
    Even with this fix there is a period between the flush and the unmap where
    the processor may speculate data into the cache. The solution is to map this
    region uncached or to use the HCR.DC bit to mark all guest accesses cached.
    89eb02c2204a "xen: arm: force guest memory accesses to cacheable when MMU is
    disabled" has arranged to do the latter.
    
    Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>

commit 89eb02c2204a0b42a0aa169f107bc346a3fef802
Author: Ian Campbell <ian.campbell@xxxxxxxxxx>
Date:   Wed Jan 8 14:09:01 2014 +0000

    xen: arm: force guest memory accesses to cacheable when MMU is disabled
    
    On ARM, guest OSes are started with MMU and caches disabled (as they are on
    native); however, caching is enabled in the domain running the builder and
    therefore we must ensure cache consistency.
    
    The existing solution to this problem (a0035ecc0d82 "tools: libxc: flush data
    cache after loading images into guest memory") is to flush the caches after
    loading the various blobs into guest RAM. However this approach has two
    shortcomings:
    
     - The cache flush primitives available to userspace on arm32 are not
       sufficient for our needs.
     - There is a race between the cache flush and the unmap of the guest page
       where the processor might speculatively dirty the cache line again.
    
    (of these the second is the more fundamental)
    
    This patch makes use of the hardware functionality to force all accesses
    made from guest mode to be cached (the HCR.DC == default cached bit). This
    means that we don't need to worry about the domain builder's writes being
    cached because the guest's "uncached" accesses will actually be cached.
    
    Unfortunately the use of HCR.DC is incompatible with the guest enabling its
    MMU (SCTLR.M bit). Therefore we must trap accesses to the SCTLR so that we can
    detect when this happens and disable HCR.DC. This is done with the HCR.TVM
    (trap virtual memory controls) bit which also causes various other registers
    to be trapped, all of which can be passed straight through to the underlying
    register. Once the guest has enabled its MMU we no longer need to trap so
    there is no ongoing overhead. In my tests Linux makes about half a dozen
    accesses to these registers before the MMU is enabled, I would expect other
    OSes to behave similarly (the sequence of writes needed to setup the MMU is
    pretty obvious).
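
The trap-then-disable sequence described above can be sketched as a small model. This is only an illustration under stated assumptions: `struct vcpu_model` and both functions are hypothetical and are not taken from the Xen patch. Only the bit positions (HCR.DC bit 12, HCR.TVM bit 26, SCTLR.M bit 0) come from the ARM architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions. */
#define HCR_DC   (UINT32_C(1) << 12)  /* default cacheable */
#define HCR_TVM  (UINT32_C(1) << 26)  /* trap virtual memory controls */
#define SCTLR_M  (UINT32_C(1) << 0)   /* stage 1 MMU enable */

/* Hypothetical per-vCPU state for illustration; not Xen's struct vcpu. */
struct vcpu_model {
    uint32_t hcr;
    uint32_t sctlr;
};

/* The guest starts with its MMU off: force its accesses to be cacheable
 * and trap writes to the virtual memory control registers. */
static void vcpu_model_init(struct vcpu_model *v)
{
    v->sctlr = 0;
    v->hcr = HCR_DC | HCR_TVM;
}

/* Models the handler for a trapped guest write to SCTLR.
 * Returns true while further writes will still trap. */
static bool vcpu_model_sctlr_write(struct vcpu_model *v, uint32_t val)
{
    assert(v->hcr & HCR_TVM);  /* only reachable while trapping is on */

    v->sctlr = val;            /* pass the value straight through */

    /* Once the guest enables its MMU, HCR.DC must go, and there is
     * nothing left to watch for, so trapping is switched off as well. */
    if (val & SCTLR_M)
        v->hcr &= ~(HCR_DC | HCR_TVM);

    return (v->hcr & HCR_TVM) != 0;
}
```

In this model a write that leaves SCTLR.M clear keeps both bits set, and the first write that sets SCTLR.M clears them, which matches the "no ongoing overhead" claim in the text.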
    
    Apart from this unfortunate need to trap these accesses this approach is
    incompatible with guests which attempt to do DMA operations with their MMU
    disabled. In practice this means guests with passthrough which we do not yet
    support. Since a typical guest (including dom0) does not access devices which
    require DMA until after it is fully up and running with paging enabled the
    main risk is to in-guest firmware which does DMA i.e. running EFI in a guest,
    with a disk passed through and booting from that disk. Since we know that dom0
    is not using any such firmware and we do not support device passthrough to
    guests yet we can live with this restriction. Once passthrough is implemented
    this will need to be revisited.
    
    The patch includes a couple of seemingly unrelated but necessary changes:
    
     - HSR_SYSREG_CRN_MASK was incorrectly defined, which happened to be benign
       with the existing set of system registers we handled, but broke with the
       new ones introduced here.
     - The defines used to decode the HSR system register fields were named the
       same as the register. This breaks the accessor macros. This had gone
       unnoticed because the handling of the existing trapped registers did not
       require accessing the underlying hardware register. Rename those constants
       with an HSR_SYSREG prefix (in line with HSR_CP32/64 for 32-bit registers).
    
    This patch has survived thousands of boot loops on a Midway system.
    
    Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
    Acked-by: Julien Grall <julien.grall@xxxxxxxxxx>

commit ca6bf20d4157b3b0b270e384e47c1e351964be16
Author: Julien Grall <julien.grall@xxxxxxxxxx>
Date:   Fri Jan 10 03:27:55 2014 +0000

    xen/arm: Scrub heap pages during boot
    
    Scrubbing of heap pages was disabled because it was slow on the models. Now
    that Xen supports real hardware, it's possible to enable scrubbing by default.
    
    Signed-off-by: Julien Grall <julien.grall@xxxxxxxxxx>
    Acked-by: Ian Campbell <ian.campbell@xxxxxxxxxx>

commit 8aba7e1ce9e26cdf9d2b002ed87b4bd75fce4af3
Author: Rob Hoes <rob.hoes@xxxxxxxxxx>
Date:   Fri Jan 10 13:52:04 2014 +0000

    libxl: ocaml: use 'for_app_registration' in osevent callbacks
    
    This allows the application to pass a token to libxl in the fd/timeout
    registration callbacks, which it receives back in modification or
    deregistration callbacks.
    
    It turns out that this is essential for timeout handling, in order to
    identify which timeout to change on a modify event.
    
    Signed-off-by: Rob Hoes <rob.hoes@xxxxxxxxxx>
    Acked-by: David Scott <dave.scott@xxxxxxxxxxxxx>
    Acked-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>

commit 0896bd8bea84526b00e00d2d076f4f953a3d73cb
Author: David Vrabel <david.vrabel@xxxxxxxxxx>
Date:   Fri Jan 10 17:46:33 2014 +0100

    x86: map portion of kexec crash area that is within the direct map area
    
    Commit 7113a45451a9f656deeff070e47672043ed83664 (kexec/x86: do not map
    crash kernel area) causes fatal page faults when loading a crash
    image.  The attempt to zero the first control page allocated from the
    crash region will fault as the VA returned by map_domain_page() has no
    mapping.
    
    The fault will occur on non-debug builds of Xen when the crash area is
    below 5 TiB (which will be most systems).
    
    The assumption that the crash area mapping was not used is incorrect.
    map_domain_page() is used when loading an image and building the
    image's page tables to temporarily map the crash area, thus the
    mapping is required if the crash area is in the direct map area.
    
    Reintroduce the mapping, but only the portions of the crash area that
    are within the direct map area.
    
    Reported-by: Don Slutz <dslutz@xxxxxxxxxxx>
    Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx>
    Tested-by: Don Slutz <dslutz@xxxxxxxxxxx>
    Reviewed-by: Daniel Kiper <daniel.kiper@xxxxxxxxxx>
    Tested-by: Daniel Kiper <daniel.kiper@xxxxxxxxxx>
    
    This is really just a band aid - kexec shouldn't rely on the crash area
    being always mapped when in the direct mapping range (and it didn't use
    to in its previous form). That's primarily because map_domain_page()
    (needed when the area is outside the direct mapping range) may be
    unusable when wanting to kexec due to a crash, but also because in the
    case of PFN compression the kexec range (if specified on the command
    line) could fall into a hole between used memory ranges (while we're
    currently only ignoring memory at the top of the physical address
    space, it's pretty clear that sooner or later we will want that
    selection to become more sophisticated in order to maximize the memory
    made use of).
    
    Acked-by: Jan Beulich <jbeulich@xxxxxxxx>
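
The idea of mapping only the part of the crash area that lies within the direct map amounts to clamping an address range. A minimal sketch, assuming a direct map that covers physical addresses `[0, direct_map_end)`; `struct range` and `clamp_to_direct_map()` are hypothetical names for illustration and do not appear in the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open physical address range [start, end). Hypothetical type. */
struct range {
    uint64_t start;
    uint64_t end;
};

/* Computes the portion of the crash area that falls below direct_map_end.
 * Returns false (leaving *out untouched) when no part of it does. */
static bool clamp_to_direct_map(struct range crash, uint64_t direct_map_end,
                                struct range *out)
{
    assert(crash.start <= crash.end);

    /* Empty area, or area entirely above the direct map: nothing to map. */
    if (crash.start == crash.end || crash.start >= direct_map_end)
        return false;

    out->start = crash.start;
    out->end = crash.end < direct_map_end ? crash.end : direct_map_end;
    return true;
}
```

With the 5 TiB limit the commit message mentions for non-debug builds passed as `direct_map_end`, a crash area straddling that boundary would have only its lower part mapped, and one entirely above it would not be mapped at all.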

commit 3dbab7a8bf4bef1bb2967cb3a8c7ed2146482ab3
Author: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Date:   Fri Jan 10 17:45:01 2014 +0100

    dbg_rw_guest_mem: need to call put_gfn in error path
    
    Using a 1G hvm domU (in grub) and gdbsx:
    
    (gdb) set arch i8086
    warning: A handler for the OS ABI "GNU/Linux" is not built into this configuration
    of GDB.  Attempting to continue with the default i8086 settings.
    
    The target architecture is assumed to be i8086
    (gdb) target remote localhost:9999
    Remote debugging using localhost:9999
    Remote debugging from host 127.0.0.1
    0x0000d475 in ?? ()
    (gdb) x/1xh 0x6ae9168b
    
    Will reproduce this bug.
    
    With a debug=y build you will get:
    
    Assertion '!preempt_count()' failed at preempt.c:37
    
    For a debug=n build you will get a dom0 VCPU hung (at some point) in:
    
             [ffff82c4c0126eec] _write_lock+0x3c/0x50
              ffff82c4c01e43a0  __get_gfn_type_access+0x150/0x230
              ffff82c4c0158885  dbg_rw_mem+0x115/0x360
              ffff82c4c0158fc8  arch_do_domctl+0x4b8/0x22f0
              ffff82c4c01709ed  get_page+0x2d/0x100
              ffff82c4c01031aa  do_domctl+0x2ba/0x11e0
              ffff82c4c0179662  do_mmuext_op+0x8d2/0x1b20
              ffff82c4c0183598  __update_vcpu_system_time+0x288/0x340
              ffff82c4c015c719  continue_nonidle_domain+0x9/0x30
              ffff82c4c012938b  add_entry+0x4b/0xb0
              ffff82c4c02223f9  syscall_enter+0xa9/0xae
    
    And gdb output:
    
    (gdb) x/1xh 0x6ae9168b
    0x6ae9168b:     0x3024
    (gdb) x/1xh 0x6ae9168b
    0x6ae9168b:     Ignoring packet error, continuing...
    Reply contains invalid hex digit 116
    
    The 1st one worked because the p2m.lock is recursive and the PCPU
    had not yet changed.
    
    crash reports (for example):
    
    crash> mm_rwlock_t 0xffff83083f913010
    struct mm_rwlock_t {
      lock = {
        raw = {
          lock = 2147483647
        },
        debug = {<No data fields>}
      },
      unlock_level = 0,
      recurse_count = 1,
      locker = 1,
      locker_function = 0xffff82c4c022c640 <__func__.13514> "__get_gfn_type_access"
    }
    
    Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
    Signed-off-by: Don Slutz <dslutz@xxxxxxxxxxx>
    Acked-by: Mukesh Rathor <mukesh.rathor@xxxxxxxxxx>
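
The class of bug fixed by the dbg_rw_guest_mem commit, a lookup that must be released on every exit path including failure, can be modelled in a few lines. This is a sketch under stated assumptions: the `model_*` functions and the `gfn_refs` counter are hypothetical stand-ins and do not reproduce Xen's `get_gfn()`/`put_gfn()` implementation or the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts lookups that have not yet been released. In this model the
 * lookup takes its lock even when it fails to find a usable page. */
static int gfn_refs;

static const char page[16] = "guest memory";

/* Hypothetical stand-in for a gfn lookup; returns NULL on failure. */
static const char *model_get_gfn(int mapped)
{
    gfn_refs++;
    return mapped ? page : NULL;
}

/* Hypothetical stand-in for the matching release. */
static void model_put_gfn(void)
{
    assert(gfn_refs > 0);
    gfn_refs--;
}

/* Copies len bytes of the modelled guest page into dst.
 * Returns 0 on success and -1 if the page is not available. */
static int model_dbg_read(int mapped, char *dst, size_t len)
{
    const char *src;

    assert(len <= sizeof(page));

    src = model_get_gfn(mapped);
    if (src == NULL) {
        model_put_gfn();  /* the release that the error path was missing */
        return -1;
    }

    memcpy(dst, src, len);
    model_put_gfn();
    return 0;
}
```

Without the release in the failure branch, `gfn_refs` would stay at 1 after a failed read, which corresponds to the `recurse_count = 1` seen in the crash output above.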
(qemu changes not included)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

