
Re: [win-pv-devel] Problems with xenvbd

On 21/08/2015 15:14, Fabio Fantoni wrote:
On 21/08/2015 10:12, Fabio Fantoni wrote:
On 21/08/2015 00:03, Rafał Wojdyła wrote:
On 2015-08-19 23:25, Paul Durrant wrote:
-----Original Message-----
From: win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx [mailto:win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On Behalf Of Rafal Wojdyla
Sent: 18 August 2015 14:33
To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: [win-pv-devel] Problems with xenvbd


I've been testing the current pvdrivers code in preparation for
creating upstream patches for my xeniface additions and I noticed
that xenvbd seems to be very unstable for me. I'm not sure if it's
a problem with xenvbd itself or my code because it seemed to only
manifest when the full suite of our guest tools was installed along
with xenvbd. In short, most of the time the system crashed with
kernel memory corruption in seemingly random processes shortly
after start. Driver Verifier didn't seem to catch anything. You can
see a log from one such crash in the attachment crash1.txt.

Today I tried to perform some more tests but this time without our
guest tools (only pvdrivers and our shared libraries were
installed). To my surprise now Driver Verifier was crashing the
system every time in xenvbd (see crash2.txt). I don't know why it
didn't catch that previously... If adding some timeout to the
offending wait doesn't break anything I'll try that to see if I can
reproduce the previous memory corruptions.

Those crashes do look odd. I'm on PTO for the next week but I'll have
a look when I get back to the office. I did run verifier on all the
drivers a week or so back (while running vbd plug/unplug tests) but
there have been a couple of changes since then.


No problem. I attached some more logs. The last one was during system
shutdown, after that the OS failed to boot (probably corrupted
filesystem since the BSOD itself seemed to indicate that). I think
this time there is a BLKIF_RSP_ERROR somewhere but I'm not yet familiar with
Xen PV device interfaces so not sure what that means.

In the meantime I've run more tests on my modified xeniface driver to
make sure it's not contributing to these issues but everything seemed to
be fine there.
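
For reference on the BLKIF_RSP_ERROR mentioned above: it is one of the response status values defined in Xen's public block interface header (io/blkif.h). The sketch below only maps those status values to names; the helper name describe_blkif_status is hypothetical and is not part of xenvbd or Xen.

```python
# Response status values as defined in Xen's public io/blkif.h header.
BLKIF_RSP_OKAY = 0         # operation completed successfully
BLKIF_RSP_ERROR = -1       # operation failed for an unspecified reason
BLKIF_RSP_EOPNOTSUPP = -2  # operation not supported by the backend

_STATUS_NAMES = {
    BLKIF_RSP_OKAY: "BLKIF_RSP_OKAY",
    BLKIF_RSP_ERROR: "BLKIF_RSP_ERROR",
    BLKIF_RSP_EOPNOTSUPP: "BLKIF_RSP_EOPNOTSUPP",
}


def describe_blkif_status(status: int) -> str:
    """Return a readable name for a blkif response status (hypothetical helper)."""
    return _STATUS_NAMES.get(status, f"unknown status {status}")


if __name__ == "__main__":
    for code in (0, -1, -2, 7):
        print(code, describe_blkif_status(code))
```

The status alone does not say why the backend rejected a request, so the backend side (for example the qemu log in dom0) is usually the next place to look.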

I also had disk corruption on Windows 10 Pro 64-bit with the PV drivers
build of 11 August, but I'm not sure it is related to the winpv drivers:
on the same domU I had also started testing snapshots with a qcow2 disk overlay.
For this case I don't have useful information because it didn't even try to
boot Windows, but if it happens again I'll try to gather other useful

It happened again, but this time too I was unable to understand exactly
what the cause is.
After a Windows reboot everything seemed OK and I did a clean shutdown, but
on the next boot SeaBIOS didn't find a bootable disk and the qemu log doesn't
show useful information.
qemu-img check shows errors:
/usr/lib/xen/bin/qemu-img check W10.disk1.cow-sn1
ERROR cluster 143 refcount=1 reference=2
Leaked cluster 1077 refcount=1 reference=0
ERROR cluster 1221 refcount=1 reference=2
Leaked cluster 2703 refcount=1 reference=0
Leaked cluster 5212 refcount=1 reference=0
Leaked cluster 13375 refcount=1 reference=0

2 errors were found on the image.
Data may be corrupted, or further writes to the image may corrupt it.

4 leaked clusters were found on the image.
This means waste of disk space, but no harm to data.
27853/819200 = 3.40% allocated, 22.65% fragmented, 0.00% compressed
Image end offset: 1850736640
I created it with:
/usr/lib/xen/bin/qemu-img create -o
backing_file=W10.disk1.xm,backing_fmt=raw -f qcow2 W10.disk1.cow-sn1
and changed the xl domU configuration:
Dom0 runs Xen 4.6-rc1 and qemu 2.4.0.
DomU is Windows 10 Pro 64-bit with the PV drivers build of 11 August.
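
When comparing several images or repeated runs, the qemu-img check output shown above can be summarised mechanically. The following is a minimal sketch that assumes the per-cluster line format is exactly as printed in this message (other qemu versions may word it differently); summarise_check and CheckSummary are hypothetical names, not part of qemu.

```python
import re
from typing import List, NamedTuple


class CheckSummary(NamedTuple):
    error_clusters: List[int]   # clusters reported on "ERROR cluster" lines
    leaked_clusters: List[int]  # clusters reported on "Leaked cluster" lines


# Matches lines such as "ERROR cluster 143 refcount=1 reference=2".
_CLUSTER_LINE = re.compile(
    r"^(ERROR|Leaked) cluster (\d+) refcount=(\d+) reference=(\d+)"
)


def summarise_check(output: str) -> CheckSummary:
    """Collect the cluster numbers reported as errors or leaks."""
    errors: List[int] = []
    leaks: List[int] = []
    for line in output.splitlines():
        match = _CLUSTER_LINE.match(line.strip())
        if match is None:
            continue  # summary lines, blank lines, anything unrecognised
        cluster = int(match.group(2))
        if match.group(1) == "ERROR":
            errors.append(cluster)
        else:
            leaks.append(cluster)
    return CheckSummary(errors, leaks)


# Per-cluster lines copied from the report above.
SAMPLE = """\
ERROR cluster 143 refcount=1 reference=2
Leaked cluster 1077 refcount=1 reference=0
ERROR cluster 1221 refcount=1 reference=2
Leaked cluster 2703 refcount=1 reference=0
Leaked cluster 5212 refcount=1 reference=0
Leaked cluster 13375 refcount=1 reference=0
"""

if __name__ == "__main__":
    summary = summarise_check(SAMPLE)
    print("error clusters:", summary.error_clusters)
    print("leaked clusters:", summary.leaked_clusters)
```

Keeping the cluster numbers makes it possible to see whether the same clusters are affected after each failed boot or whether the damage moves around.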

How can I know for sure whether it is a winpv, qemu or other problem, and
gather useful information to report?

Thanks for any reply and sorry for my bad English.

I have two Windows 10 domUs on my test server with Xen 4.6.0-rc2 that are
unable to boot with the new Windows PV drivers, both with the build of 11 August.
Both use raw disks.
I was unable to find useful information about it. Booting from the W10
DVD, boot repair doesn't work and chkdsk doesn't find errors.
After trying Windows boot repair, it now gives a blue screen (see attachment)
instead of freezing on the Windows logo.
I suppose boot repair has disabled testsigning, is that right?
If so, is there a way to enable it by changing a file offline from the W10 ISO
command prompt or a Linux live ISO? I did a quick Google search without finding one.
Another W10 domU with the old GPLPV drivers still boots correctly, however.

The new PV drivers are now also used for the next XenServer and are being
kept tested, right? If so, and a similar bug was not found there, there are
probably patches that solve or work around the problem.
For example, after some major changes the new PV drivers did not work on
Xen 4.5, but after backporting these 2 patches they worked correctly again:
- x86/hvm: add per-vcpu evtchn upcalls
- x86/hvm: extend HVM cpuid leaf with vcpu id
I saw that these patches have also been backported in the XenServer patchqueue.
I tried to find a probable fix/workaround for this case too, supposing
that it works correctly on XenServer, but I did not find one.
I also did not find a 4.6 patchqueue on the XenServer GitHub for a better
comparison, only the 4.5 one.

If you need more information or tests, tell me and I'll post them.

Thanks for any reply and sorry for my bad English.

Attachment: pvbluescreen.png
Description: PNG image
