Re: [win-pv-devel] Problems with xenvbd



On 07/09/2015 13:57, Paul Durrant wrote:
-----Original Message-----
From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
Sent: 07 September 2015 11:33
To: Paul Durrant; Stefano Stabellini
Cc: Rafał Wojdyła; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [win-pv-devel] Problems with xenvbd

On 07/09/2015 11:26, Paul Durrant wrote:
Fabio,

    Can you confirm that you don't see any problem if you use standard IDE
emulated disks? I certainly don't.
    Paul
With ide instead of ahci I got the same results. As for the udev
problem, I now seem to have found the cause: it appears to be the dom0
kernel.
With kernel 3.2.0-4-amd64 version 3.2.68-1+deb7u3 (from the wheezy
repository) it doesn't work without the udev file; with
3.16.0-0.bpo.4-amd64 version 3.16.7-ckt11-1+deb8u3~bpo70+1 it works.
Initially the new pv drivers had non-working networking with kernels
<3.14. That seems to have been solved later (I don't know the exact
commit), but it seems that with xen without the udev file a newer
kernel is still needed.
With the 3.16 kernel I had other problems instead, for example with
ballooning (even though it should not be used).
In a second test with kernel 3.16 I tried removing a workaround for the
ballooning problem (dom0_mem=2G,max:3G in grub.cfg instead of
dom0_mem=2G,max:2G). I no longer saw the kern.log spam, but the W7 domU
crashed at boot.
Another strange thing is that, even with trace enabled, the pv drivers'
debug lines are not shown with the 3.16 kernel (in older tests with 3.16
they were, if I remember correctly).
The Windows minidump is attached.
That yielded:

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

SYSTEM_SERVICE_EXCEPTION (3b)
An exception happened while executing a system service routine.
Arguments:
Arg1: 00000000c0000005, Exception code that caused the bugcheck
Arg2: fffff80002a8a7c5, Address of the instruction which caused the bugcheck
Arg3: fffff88001e86c00, Address of the context record for the exception that 
caused the bugcheck
Arg4: 0000000000000000, zero.

Debugging Details:
------------------


EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced 
memory at 0x%08lx. The memory could not be %s.

FAULTING_IP:
nt!ExpInterlockedPopEntrySListFault16+0
fffff800`02a8a7c5 498b08          mov     rcx,qword ptr [r8]

CONTEXT:  fffff88001e86c00 -- (.cxr 0xfffff88001e86c00;r)
rax=0000000026f60003 rbx=0000000000000001 rcx=fffff80002c1fc00
rdx=6c8b4830245c8b41 rsi=fffff80002ccf8d0 rdi=0000000000000000
rip=fffff80002a8a7c5 rsp=fffff88001e875e0 rbp=fffff88001e87640
  r8=6c8b4830245c8b40  r9=fffff80002a1e000 r10=fffff80002c1fc00
r11=0000000000000001 r12=fffff88000967000 r13=0000000000000020
r14=0000000000000000 r15=0000000000001000
iopl=0         nv up ei pl nz na pe nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
nt!ExpInterlockedPopEntrySListFault16:
fffff800`02a8a7c5 498b08          mov     rcx,qword ptr [r8] 
ds:002b:6c8b4830`245c8b40=????????????????
Last set context:
rax=0000000026f60003 rbx=0000000000000001 rcx=fffff80002c1fc00
rdx=6c8b4830245c8b41 rsi=fffff80002ccf8d0 rdi=0000000000000000
rip=fffff80002a8a7c5 rsp=fffff88001e875e0 rbp=fffff88001e87640
  r8=6c8b4830245c8b40  r9=fffff80002a1e000 r10=fffff80002c1fc00
r11=0000000000000001 r12=fffff88000967000 r13=0000000000000020
r14=0000000000000000 r15=0000000000001000
iopl=0         nv up ei pl nz na pe nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
nt!ExpInterlockedPopEntrySListFault16:
fffff800`02a8a7c5 498b08          mov     rcx,qword ptr [r8] 
ds:002b:6c8b4830`245c8b40=????????????????
Resetting default scope

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  WIN7_DRIVER_FAULT

BUGCHECK_STR:  0x3B

PROCESS_NAME:  lsass.exe

CURRENT_IRQL:  0

ANALYSIS_VERSION: 6.3.9600.17237 (debuggers(dbg).140716-0327) x86fre

LAST_CONTROL_TRANSFER:  from 0000000000000000 to fffff80002a8a7c5

STACK_TEXT:
fffff880`01e875e0 00000000`00000000 : 00000000`00000000 00000000`00000000 
00000000`00000000 00000000`00000000 : nt!ExpInterlockedPopEntrySListFault16


FOLLOWUP_IP:
nt!ExpInterlockedPopEntrySListFault16+0
fffff800`02a8a7c5 498b08          mov     rcx,qword ptr [r8]

SYMBOL_STACK_INDEX:  0

SYMBOL_NAME:  nt!ExpInterlockedPopEntrySListFault16+0

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: nt

IMAGE_NAME:  ntkrnlmp.exe

DEBUG_FLR_IMAGE_TIMESTAMP:  556356e8

IMAGE_VERSION:  6.1.7601.18869

STACK_COMMAND:  .cxr 0xfffff88001e86c00 ; kb

FAILURE_BUCKET_ID:  X64_0x3B_nt!ExpInterlockedPopEntrySListFault16+0

BUCKET_ID:  X64_0x3B_nt!ExpInterlockedPopEntrySListFault16+0

ANALYSIS_SOURCE:  KM

FAILURE_ID_HASH_STRING:  km:x64_0x3b_nt!expinterlockedpopentryslistfault16+0

FAILURE_ID_HASH:  {b390bf2a-9c11-079f-34b0-5dffcabffe4b}

Followup: MachineOwner
---------

0: kd> .cxr 0xfffff88001e86c00;r
rax=0000000026f60003 rbx=0000000000000001 rcx=fffff80002c1fc00
rdx=6c8b4830245c8b41 rsi=fffff80002ccf8d0 rdi=0000000000000000
rip=fffff80002a8a7c5 rsp=fffff88001e875e0 rbp=fffff88001e87640
  r8=6c8b4830245c8b40  r9=fffff80002a1e000 r10=fffff80002c1fc00
r11=0000000000000001 r12=fffff88000967000 r13=0000000000000020
r14=0000000000000000 r15=0000000000001000
iopl=0         nv up ei pl nz na pe nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
nt!ExpInterlockedPopEntrySListFault16:
fffff800`02a8a7c5 498b08          mov     rcx,qword ptr [r8] 
ds:002b:6c8b4830`245c8b40=????????????????
Last set context:
rax=0000000026f60003 rbx=0000000000000001 rcx=fffff80002c1fc00
rdx=6c8b4830245c8b41 rsi=fffff80002ccf8d0 rdi=0000000000000000
rip=fffff80002a8a7c5 rsp=fffff88001e875e0 rbp=fffff88001e87640
  r8=6c8b4830245c8b40  r9=fffff80002a1e000 r10=fffff80002c1fc00
r11=0000000000000001 r12=fffff88000967000 r13=0000000000000020
r14=0000000000000000 r15=0000000000001000
iopl=0         nv up ei pl nz na pe nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
nt!ExpInterlockedPopEntrySListFault16:
fffff800`02a8a7c5 498b08          mov     rcx,qword ptr [r8] 
ds:002b:6c8b4830`245c8b40=????????????????

That's pretty strange. I'd say something is probably corrupt.
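
One detail in the register dump is consistent with that reading: the
faulting pointer in r8 (6c8b4830245c8b40) is not a canonical x86-64
address, so it cannot be a valid list entry. Its bytes in memory order
(40 8b 5c 24 30 48 8b 6c) also look more like instruction encodings than
like a pointer, though that part is only a guess. Below is a minimal
Python sketch of the canonical-address check, assuming 48-bit virtual
addressing; the helper name is_canonical is made up for illustration and
is not part of any debugger API.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if addr is a canonical x86-64 virtual address.

    Assumes va_bits-bit virtual addressing (48 on the CPUs of that era):
    bits 63..(va_bits - 1) must be all zeros or all ones.
    """
    top = addr >> (va_bits - 1)                 # bits 63..47 for va_bits=48
    all_ones = (1 << (64 - va_bits + 1)) - 1    # 17 one-bits for va_bits=48
    return top == 0 or top == all_ones


r8 = 0x6C8B4830245C8B40    # faulting pointer from the .cxr output
rcx = 0xFFFFF80002C1FC00   # an ordinary kernel pointer from the same dump

print(is_canonical(r8))    # False
print(is_canonical(rcx))   # True

# Bytes of r8 as they would sit in memory (little-endian)
print(r8.to_bytes(8, "little").hex())   # 408b5c2430488b6c
```

The check only says the value in r8 is garbage; it does not say what
overwrote the list header.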

The only corruption I have definitely seen on disks was when using btrfs as the dom0 fs (in rare cases) and with a qcow2 overlay.
These tests instead use ext4 and raw domU disks.
About other kinds of corruption (not only on disks) I don't know.


I'm going crazy with all these problems and don't have time to do all
the useful tests :(
Too many moving parts I'd say. I've been running with a 4.2-rc dom0 kernel, a 
Xen from about 3 weeks ago and upstream qemu from Xen's upstream tag (again 
from about 3 weeks ago) and I'm not seeing any problems. I do have a fairly 
standard config though; ide disks and std-vga graphics.

   Paul

About the dom0 kernel, are you using a build from a package or a custom build?
Would it be useful for me to try kernel 4.2 instead?


Can you advise me on more useful tests to try to find/solve these
problems?

Thanks for any reply and sorry for my bad English.


_______________________________________________
win-pv-devel mailing list
win-pv-devel@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/win-pv-devel