
Re: [win-pv-devel] Blue screen on xenvif install



On 17/10/2014 14:02, Paul Durrant wrote:
-----Original Message-----
From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
Sent: 17 October 2014 12:02
To: Paul Durrant; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [win-pv-devel] Blue screen on xenvif install

On 17/10/2014 11:48, Paul Durrant wrote:
-----Original Message-----
From: win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx [mailto:win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On Behalf Of Fabio Fantoni
Sent: 17 October 2014 10:39
To: Paul Durrant; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [win-pv-devel] Blue screen on xenvif install

On 16/10/2014 17:58, Paul Durrant wrote:
-----Original Message-----
From: win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx [mailto:win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On Behalf Of Fabio Fantoni
Sent: 16 October 2014 16:05
To: Paul Durrant; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [win-pv-devel] Blue screen on xenvif install

On 16/10/2014 16:54, Paul Durrant wrote:
-----Original Message-----
From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
Sent: 16 October 2014 15:46
To: Paul Durrant; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: Re: Blue screen on xenvif install

On 16/10/2014 15:45, Paul Durrant wrote:
-----Original Message-----
From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
Sent: 16 October 2014 14:40
To: Paul Durrant; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: Re: Blue screen on xenvif install

On 16/10/2014 13:04, Paul Durrant wrote:
-----Original Message-----
From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
Sent: 16 October 2014 11:59
To: Paul Durrant; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: Re: Blue screen on xenvif install

On 16/10/2014 12:00, Paul Durrant wrote:
-----Original Message-----
From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
Sent: 16 October 2014 10:35
To: Paul Durrant; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: Re: Blue screen on xenvif install

On 16/10/2014 10:59, Paul Durrant wrote:
-----Original Message-----
From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
Sent: 16 October 2014 09:58
To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Cc: Paul Durrant
Subject: Blue screen on xenvif install

Today I tried to install the new Windows PV drivers on three more
Windows 7 Pro 64-bit domUs.
A few days ago I installed them successfully on one Windows 7 Pro 64-bit
domU and one Windows 8.1 Enterprise 64-bit domU.
On the first domU I tried today, xenbus and xenvbd installed
successfully, but during the xenvif install Windows crashed with a blue
screen; if I read it correctly, it was a kernel in-page error or
something similar.
After a reboot I retried the xenvif install, but now it always fails.

Sounds odd. Do you have a MEMORY.DMP?

          Paul
Windows is set to write a kernel memory dump on system error, but
%systemroot%\memory.dmp is missing :(
I also checked the Windows event log, but the only critical entry I
found is Kernel-Power, which does not seem to contain useful data.

I found what is probably useful data in xl dmesg (copied below, from
domU start to the Xen bugcheck output):
(d98) HVM Loader
(d98) Detected Xen v4.5-unstable
(d98) Xenbus rings @0xfeffc000, event channel 1
(d98) System requested SeaBIOS
(d98) CPU speed is 2660 MHz
(d98) Relocating guest memory for lowmem MMIO space disabled
(XEN) irq.c:270: Dom98 PCI link 0 changed 0 -> 5
(d98) PCI-ISA link 0 routed to IRQ5
(XEN) irq.c:270: Dom98 PCI link 1 changed 0 -> 10
(d98) PCI-ISA link 1 routed to IRQ10
(XEN) irq.c:270: Dom98 PCI link 2 changed 0 -> 11
(d98) PCI-ISA link 2 routed to IRQ11
(XEN) irq.c:270: Dom98 PCI link 3 changed 0 -> 5
(d98) PCI-ISA link 3 routed to IRQ5
(d98) pci dev 01:3 INTA->IRQ10
(d98) pci dev 02:0 INTA->IRQ11
(d98) pci dev 03:0 INTA->IRQ5
(d98) pci dev 04:0 INTA->IRQ5
(d98) pci dev 05:0 INTA->IRQ10
(d98) pci dev 06:0 INTA->IRQ11
(d98) pci dev 1d:0 INTA->IRQ10
(d98) pci dev 1d:1 INTB->IRQ11
(d98) pci dev 1d:2 INTC->IRQ5
(d98) pci dev 1d:7 INTD->IRQ5
(d98) No RAM in high memory; setting high_mem resource base to 100000000
(d98) pci dev 05:0 bar 10 size 004000000: 0f0000000
(d98) pci dev 05:0 bar 14 size 004000000: 0f4000000
(d98) pci dev 02:0 bar 14 size 001000000: 0f8000008
(d98) pci dev 06:0 bar 30 size 000040000: 0f9000000
(d98) pci dev 05:0 bar 30 size 000010000: 0f9040000
(d98) pci dev 03:0 bar 10 size 000004000: 0f9050000
(d98) pci dev 05:0 bar 18 size 000002000: 0f9054000
(d98) pci dev 04:0 bar 14 size 000001000: 0f9056000
(d98) pci dev 1d:7 bar 10 size 000001000: 0f9057000
(d98) pci dev 02:0 bar 10 size 000000100: 00000c001
(d98) pci dev 06:0 bar 10 size 000000100: 00000c101
(d98) pci dev 06:0 bar 14 size 000000100: 0f9058000
(d98) pci dev 04:0 bar 10 size 000000020: 00000c201
(d98) pci dev 05:0 bar 1c size 000000020: 00000c221
(d98) pci dev 1d:0 bar 20 size 000000020: 00000c241
(d98) pci dev 1d:1 bar 20 size 000000020: 00000c261
(d98) pci dev 1d:2 bar 20 size 000000020: 00000c281
(d98) pci dev 01:1 bar 20 size 000000010: 00000c2a1
(d98) Multiprocessor initialisation:
(d98)  - CPU0 ... 36-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d98)  - CPU1 ... 36-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d98) Testing HVM environment:
(d98)  - REP INSB across page boundaries ... passed
(d98)  - GS base MSRs and SWAPGS ... passed
(d98) Passed 2 of 2 tests
(d98) Writing SMBIOS tables ...
(d98) Loading SeaBIOS ...
(d98) Creating MP tables ...
(d98) Loading ACPI ...
(d98) S3 disabled
(d98) S4 disabled
(d98) vm86 TSS at fc00a100
(d98) BIOS map:
(d98)  10000-100d3: Scratch space
(d98)  c0000-fffff: Main BIOS
(d98) E820 table:
(d98)  [00]: 00000000:00000000 - 00000000:000a0000: RAM
(d98)  HOLE: 00000000:000a0000 - 00000000:000c0000
(d98)  [01]: 00000000:000c0000 - 00000000:00100000: RESERVED
(d98)  [02]: 00000000:00100000 - 00000000:78000000: RAM
(d98)  HOLE: 00000000:78000000 - 00000000:fc000000
(d98)  [03]: 00000000:fc000000 - 00000001:00000000: RESERVED
(d98) Invoking SeaBIOS ...
(d98) SeaBIOS (version debian/1.7.5-1-0-g506b58d-20140603_102943-testVS01OU)
(d98)
(d98) Found Xen hypervisor signature at 40000100
(d98) Running on QEMU (i440fx)
(d98) xen: copy e820...
(d98) Relocating init from 0x000df619 to 0x77fae600 (size 71995)
(d98) CPU Mhz=2660
(d98) Found 13 PCI devices (max PCI bus is 00)
(d98) Allocated Xen hypercall page at 77fff000
(d98) Detected Xen v4.5-unstable
(d98) xen: copy BIOS tables...
(d98) Copying SMBIOS entry point from 0x00010010 to 0x000f0f40
(d98) Copying MPTABLE from 0xfc001170/fc001180 to 0x000f0e40
(d98) Copying PIR from 0x00010030 to 0x000f0dc0
(d98) Copying ACPI RSDP from 0x000100b0 to 0x000f0d90
(d98) Using pmtimer, ioport 0xb008
(d98) Scan for VGA option rom
(d98) Running option rom at c000:0003
(XEN) stdvga.c:147:d98v0 entering stdvga and caching modes
(d98) pmm call arg1=0
(d98) Turning on vga text mode console
(d98) SeaBIOS (version debian/1.7.5-1-0-g506b58d-20140603_102943-testVS01OU)
(d98) Machine UUID f4cdeb74-0db1-4748-948d-42579a494120
(d98) EHCI init on dev 00:1d.7 (regs=0xf9057020)
(d98) Found 0 lpt ports
(d98) Found 0 serial ports
(d98) ATA controller 1 at 1f0/3f4/0 (irq 14 dev 9)
(d98) ATA controller 2 at 170/374/0 (irq 15 dev 9)
(d98) ata0-0: QEMU HARDDISK ATA-7 Hard-Disk (40720 MiBytes)
(d98) Searching bootorder for: /pci@i0cf8/*@1,1/drive@0/disk@0
(d98) DVD/CD [ata0-1: QEMU DVD-ROM ATAPI-4 DVD/CD]
(d98) Searching bootorder for: /pci@i0cf8/*@1,1/drive@0/disk@1
(d98) UHCI init on dev 00:1d.0 (io=c240)
(d98) UHCI init on dev 00:1d.1 (io=c260)
(d98) UHCI init on dev 00:1d.2 (io=c280)
(d98) PS2 keyboard initialized
(d98) All threads complete.
(d98) Scan for option roms
(d98) Running option rom at c980:0003
(d98) pmm call arg1=1
(d98) pmm call arg1=0
(d98) pmm call arg1=1
(d98) pmm call arg1=0
(d98) Searching bootorder for: /pci@i0cf8/*@6
(d98)
(d98) Press F12 for boot menu.
(d98)
(d98) Searching bootorder for: HALT
(d98) drive 0x000f0d40: PCHS=16383/16/63 translation=lba LCHS=1024/255/63 s=83394560
(d98) Space available for UMB: ca800-ee800, f0000-f0ce0
(d98) Returned 258048 bytes of ZoneHigh
(d98) e820 map has 6 items:
(d98)   0: 0000000000000000 - 000000000009fc00 = 1 RAM
(d98)   1: 000000000009fc00 - 00000000000a0000 = 2 RESERVED
(d98)   2: 00000000000f0000 - 0000000000100000 = 2 RESERVED
(d98)   3: 0000000000100000 - 0000000077fff000 = 1 RAM
(d98)   4: 0000000077fff000 - 0000000078000000 = 2 RESERVED
(d98)   5: 00000000fc000000 - 0000000100000000 = 2 RESERVED
(d98) enter handle_19:
(d98)   NULL
(d98) Booting from Hard Disk...
(d98) Booting from 0000:7c00
(XEN) d98: VIRIDIAN GUEST_OS_ID: vendor: 1 os: 4 major: 6 minor: 1 sp: 1 build: 1db1
(XEN) d98: VIRIDIAN HYPERCALL: enabled: 1 pfn: 3ffff
(XEN) d98v0: VIRIDIAN APIC_ASSIST: enabled: 1 pfn: 3fffe
(XEN) d98v1: VIRIDIAN APIC_ASSIST: enabled: 1 pfn: 3fffd
(XEN) irq.c:270: Dom98 PCI link 0 changed 5 -> 0
(XEN) irq.c:270: Dom98 PCI link 1 changed 10 -> 0
(XEN) irq.c:270: Dom98 PCI link 2 changed 11 -> 0
(XEN) irq.c:270: Dom98 PCI link 3 changed 5 -> 0
(XEN) irq.c:380: Dom98 callback via changed to GSI 24
(d98) XEN|BUGCHECK: ====>
(d98) XEN|BUGCHECK: 0000007A: FFFFF6FC000171C8 FFFFFFFFC0000185 000000001BD6A860 FFFF
(d98) F80002E39000
(d98) XEN|BUGCHECK: CONTEXT (FFFFF8800310E530):
(d98) XEN|BUGCHECK: - GS = 002B
(d98) XEN|BUGCHECK: - FS = 0053
(d98) XEN|BUGCHECK: - ES = 002B
(d98) XEN|BUGCHECK: - DS = 002B
(d98) XEN|BUGCHECK: - SS = 0018
(d98) XEN|BUGCHECK: - CS = 0010
(d98) XEN|BUGCHECK: - EFLAGS = 00000086
(d98) XEN|BUGCHECK: - RDI = 00000000000171C8
(d98) XEN|BUGCHECK: - RSI = 00000000C0000185
(d98) XEN|BUGCHECK: - RBX = 00000000038AC8F8
(d98) XEN|BUGCHECK: - RDX = 0000000000000000
(d98) XEN|BUGCHECK: - RCX = 000000000310E530
(d98) XEN|BUGCHECK: - RAX = 000000002ECF4722
(d98) XEN|BUGCHECK: - RBP = 000000001BD6A860
(d98) XEN|BUGCHECK: - RIP = 00000000038A2A43
(d98) XEN|BUGCHECK: - RSP = 000000000310E510
(d98) XEN|BUGCHECK: - R8 = 0000000000000000
(d98) XEN|BUGCHECK: - R9 = 0000000000000000
(d98) XEN|BUGCHECK: - R10 = 0000000000000000
(d98) XEN|BUGCHECK: - R11 = 0000000000000000
(d98) XEN|BUGCHECK: - R12 = 000000000000007A
(d98) XEN|BUGCHECK: - R13 = 0000000000000001
(d98) XEN|BUGCHECK: - R14 = 0000000002E39000
(d98) XEN|BUGCHECK: - R15 = 0000000002CBAC40
(d98) XEN|BUGCHECK: STACK:
(d98) XEN|BUGCHECK: 000000000310EA20: (0000000000000003 00000000038A50F0 00000000038A
(d98) 4860 000000000000007A) xen.sys + 00000000000049AC
(d98) XEN|BUGCHECK: 000000000310EA70: (00000000038AD0D0 0000000000000000 000000000000
(d98) 0004 0000000002C83500) ntoskrnl.exe + 0000000000128585
(d98) XEN|BUGCHECK: 000000000310EAA0: (00000000038AD0D0 0000000002C83500 000000000000
(d98) 000F 0000000001780660) ntoskrnl.exe + 0000000000167C0D
(d98) XEN|BUGCHECK: 000000000310F170: (0000000003816470 0000000002B4C8C1 000000000000
(d98) 00FE 0000000000000000) ntoskrnl.exe + 0000000000075CC4
(d98) XEN|BUGCHECK: 000000000310F1B0: (000000000000007A 00000000000171C8 00000000C000
(d98) 0185 000000001BD6A860) ntoskrnl.exe + 00000000000E8752
(d98) XEN|BUGCHECK: 000000000310F290: (00000000038163B0 000000000310F320 0000000002CB
(d98) D540 00000000038163B0) ntoskrnl.exe + 000000000009C91F
(d98) XEN|BUGCHECK: 000000000310F360: (0000000000000000 0000000000000008 00000000FFFF
(d98) FFFF 00000000003500F0) ntoskrnl.exe + 00000000000831B9
(d98) XEN|BUGCHECK: 000000000310F4C0: (0000000000000008 0000000002E39000 000000000000
(d98) 0000 0000000000000000) ntoskrnl.exe + 0000000000073CEE
(d98) XEN|BUGCHECK: 000000000310F658: (0000000002E5CFD5 0000000000000000 0000000003EB
(d98) 8710 00000000FFFFFFFF) ntoskrnl.exe + 000000000042C000
If you need more tests or information, tell me and I'll post them.
Thanks for any reply and sorry for my bad English.
Well, that tells me you had a 7A BSOD, which is a
KERNEL_DATA_INPAGE_ERROR. The error status (param 2) was C0000185, so
that's a STATUS_IO_DEVICE_ERROR. The documentation at
http://msdn.microsoft.com/en-gb/library/windows/hardware/ff559211%28v=vs.85%29.aspx
tells me that this means:
"improper termination or defective cabling on SCSI devices or that two
devices are trying to use the same IRQ."
But you said you had xenvbd already installed so you'll be using a PV
storage path. Is there any indication of problems with your storage?
         Paul
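
For reference, the decoding above can be reproduced from the xl dmesg text with a short script. This is only a minimal sketch: the function names are made up for illustration, the lookup tables hold just a few values from the Microsoft documentation for bug check 0x7A, and the unwrapping rule is an assumption based on how the log appears in this thread (a "(d98)" line without the XEN|BUGCHECK tag is a console-wrapped continuation, a line without any "(d98)" prefix is a mail-wrapped one). It also assumes the excerpt ends with the bugcheck output.

```python
import re

# Deliberately tiny tables, not exhaustive.
BUGCHECK_NAMES = {0x7A: "KERNEL_DATA_INPAGE_ERROR"}
NTSTATUS_NAMES = {
    0xC0000185: "STATUS_IO_DEVICE_ERROR",
    0xC000009C: "STATUS_DEVICE_DATA_ERROR",
    0xC000009D: "STATUS_DEVICE_NOT_CONNECTED",
}

CONSOLE_PREFIX = re.compile(r"^\(d\d+\)\s?")
TAG = "XEN|BUGCHECK:"
# "<8-digit code>:" followed by four 16-digit parameters.
HEADER = re.compile(re.escape(TAG) + r"\s*([0-9A-F]{8}):((?:\s+[0-9A-F]{16}){4})")


def join_wrapped(lines):
    """Rebuild logical XEN|BUGCHECK lines from wrapped input.

    Assumption: console-wrapped pieces ('(dNN)' prefix, no tag) are glued
    on without a space; mail-wrapped pieces (no prefix) with one space.
    Anything before the first tagged line is ignored.
    """
    joined = []
    for raw in lines:
        text, had_prefix = CONSOLE_PREFIX.subn("", raw.strip())
        if text.startswith(TAG):
            joined.append(text)
        elif joined and text:
            joined[-1] += text if had_prefix else " " + text
    return joined


def decode_bugcheck(lines):
    """Return a dict describing the first bug check header, or None."""
    for line in join_wrapped(lines):
        match = HEADER.search(line)
        if not match:
            continue
        code = int(match.group(1), 16)
        params = [int(p, 16) for p in match.group(2).split()]
        status = None
        if code == 0x7A:
            # For 0x7A, parameter 2 is the I/O error status. It is logged
            # sign-extended to 64 bits, so keep only the low 32 bits.
            value = params[1] & 0xFFFFFFFF
            status = NTSTATUS_NAMES.get(value, hex(value))
        return {
            "code": code,
            "name": BUGCHECK_NAMES.get(code, "unknown"),
            "params": params,
            "status": status,
        }
    return None


# Lines as they appear in the log quoted in this thread.
sample = [
    "(d98) XEN|BUGCHECK: ====>",
    "(d98) XEN|BUGCHECK: 0000007A: FFFFF6FC000171C8",
    "FFFFFFFFC0000185",
    "000000001BD6A860 FFFF",
    "(d98) F80002E39000",
    "(d98) XEN|BUGCHECK: CONTEXT (FFFFF8800310E530):",
]
result = decode_bugcheck(sample)
print(result["name"], result["status"])
```

Masking parameter 2 to 32 bits is what turns the logged FFFFFFFFC0000185 into the C0000185 status quoted above.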
The domU disks are all raw files on local dom0 disks: RAID 1 on an
"LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2"
controller; the dom0 partitions are GPT and the filesystem is ext4.
Dom0 is wheezy with kernel linux-image-3.16-0.bpo.2-amd64, version
3.16.3-2~bpo70+1.
There are no errors in kern or syslog, only many of these warnings,
which I was told should not be a problem:
Oct 15 10:45:13 mtorMN01OU kernel: [773197.117518] xen:balloon: reserve_additional_memory: add_memory() failed: -17
The domU now sees the disk as a Xen PV disk; during the xenvif install
it was probably still a fully emulated disk, even though xenvbd had been
installed successfully before the xenvif install (which failed with the
BSOD).
The IRQs seem to be visible in the xl dmesg output below; can you check
whether anything there is wrong or strange? Some of them look
"duplicated" to me, but I don't know whether that is correct.

DomU xl cfg:
name='office1_w7'
builder="hvm"
memory=2048
vcpus=2
acpi_s3=0
acpi_s4=0
vif=['bridge=xenbr0,mac=00:16:3e:41:ae:8b']

disk=['/mnt/vm/disks/office1_w7.disk1.xm,raw,hda,rw',',raw,hdb,ro,cdrom']
boot='c'
device_model_version="qemu-xen"
viridian=1
vnc=0
keymap="it"
on_crash="destroy"
vga="qxl"
spice=1
spicehost='0.0.0.0'
spiceport=6001
spicedisable_ticketing=0
spicepasswd="password"
spicevdagent=1
spice_clipboard_sharing=0
spiceusbredirection=4
soundhw="hda"
localtime=1
If you need more tests or information, tell me and I'll post them.

Do you have the qemu log (with xen platform logging enabled)? This is
where the PV drivers log failures/warnings.
        Paul


I retried with xen platform debug enabled in the qemu trace, but no
lines are added to the log when I try to install xenvif and it fails.
I attach the log anyway in case it is useful.
All I can see there is an apparently clean shutdown of domain 101; no
sign of XENVIF and no sign of a BSOD.
I'll also try restoring last night's backup of the domU on which I had
the BSOD, to try to reproduce it with xen debug in the qemu trace and
with different Windows memory dump options.
Ok.

       Paul

I reproduced it, this time with xen platform trace enabled in qemu (see
attachment).
The memory dump is again not present (probably because the disk has
"failed").
You're using an emulated disk in this case. Did your BSOD not indicate
that it was dumping?
      Paul

I did not have time to see whether it wrote a dump, but I suppose it
can't write to the disk since its driver fails.
Is the qemu log not enough? Do you need the Windows memory dump to find
and fix the bug?
I don't know what the bug is. My guess is that it's a deadlock somewhere
which is causing Windows to believe there's a storage issue, but since the
stack is entirely in the kernel and without a crashdump I cannot decode to
symbols there's not much more I can do unless I happen to repro.
     Paul
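
To illustrate the point about the stack, a rough sketch like the one below lists which module each logged frame belongs to. It only reads the "module + offset" tail of each frame and assumes the frame format seen in the xl dmesg output in this thread; the function name is made up for illustration. Turning the offsets into symbol names still needs a crash dump or matching symbol files, which this script does not attempt.

```python
import re
from collections import Counter

CONSOLE_PREFIX = re.compile(r"^\(d\d+\)\s?")
# A frame ends with "<module> + <16-digit hex offset>".
FRAME = re.compile(r"([\w.-]+\.(?:sys|exe))\s*\+\s*([0-9A-F]{16})")


def frame_modules(lines):
    """Return [(module, offset), ...] in the order the frames were logged.

    All lines are flattened into one string first, so it does not matter
    where the console or the mail client wrapped the frame arguments.
    """
    flat = " ".join(CONSOLE_PREFIX.sub("", line.strip()) for line in lines)
    return [(m.group(1), int(m.group(2), 16)) for m in FRAME.finditer(flat)]


# First two frames, wrapped as they appear in the log quoted in this thread.
stack = [
    "(d98) XEN|BUGCHECK: 000000000310EA20:",
    "(0000000000000003",
    "00000000038A50F0 00000000038A",
    "(d98) 4860 000000000000007A) xen.sys + 00000000000049AC",
    "(d98) XEN|BUGCHECK: 000000000310EA70:",
    "(00000000038AD0D0",
    "0000000000000000 000000000000",
    "(d98) 0004 0000000002C83500) ntoskrnl.exe +",
    "0000000000128585",
]
frames = frame_modules(stack)
per_module = Counter(module for module, _ in frames)
for module, offset in frames:
    print(f"{module} + {offset:#x}")
```

Run over the full stack in the log, this gives one xen.sys frame followed by eight ntoskrnl.exe frames.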

If needed, I'll do other tests tomorrow.
I retried and it seems to be always reproducible on the same domU.
I have now disabled automatic reboot on crash and taken a screenshot
(attached), but the memory dump is still missing :( Probably it can't
be written while the disk is "failed".
Is there another way to take or save the memory dump, or the data you
need?
Could you remove xenvbd and make sure the emulated disk is functional.
Then, install Xenvif and see if it BSODs.

I have now tried installing xenvif without installing xenvbd, and the
installation was OK. After that I installed xennet (OK), xenvbd (OK)
and xeniface, but on this last one I got the same blue screen.
Can I (and should I try to) install everything except xenvbd?
I have also attached the qemu log of the new test.
You obviously have not rebooted yet. Try installing everything except xenvbd. 
Reboot. Then install xenvbd and reboot. Any hand over from an emulated device 
to a PV device requires a reboot.

   Paul

Yes, I know that the PV driver installation requires a reboot, but the crash happened before I had finished installing all of them and rebooted. I have now installed the other 4 components (not xenvbd), rebooted, installed xenvbd and rebooted, and no crash has happened so far. I suppose this is only a workaround and the disk crash can still happen in one or more cases, or am I wrong?
What can I do to verify whether it can still crash in some cases?

Thanks for any reply.

About the IRQs: did you check in the xl dmesg output I posted whether they are all OK?

Yes, I don't think there's a problem there.

    Paul

Thanks for any reply and sorry for my bad English.

