
[win-pv-devel] xenvbd (8.x) - blkback/tapdisk3 problems


  • To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Martin Cerveny <martin@xxxxxxxxx>
  • Date: Fri, 28 Oct 2016 11:40:17 +0200 (CEST)
  • Delivery-date: Fri, 28 Oct 2016 09:40:31 +0000
  • List-id: Developer list for the Windows PV Drivers subproject <win-pv-devel.lists.xenproject.org>

Hello.

I have problems with xenvbd (8.x). There was no such problem with the older PV drivers' xenvbd (7.2x). My questions are at the bottom.

I use a remote raw disk as the source (multipath+iscsi+iser+ib).
I tested two configurations:

--------------------------

1) use direct blkback (format=raw, vdev=hda, access=rw, target=/dev/mapper/3600144f07a0542580000568ba94a0001)

Performance is good, but the system is __unusable__ for real work.

Every few seconds or minutes (randomly, depending on disk load) Windows hangs on I/O operations. I saw this more often during write operations.

Sometimes (roughly 1 case in 10) I saw "PdoReset" in "DebugView" (DomU):

00003034        10:12:32        XENVBD|__PdoReset:Target[0] ====>
00003035        10:12:32        XENVBD|__PdoPauseDataPath:Target[0] : Waiting for 5 Submitted requests
00003036        10:12:52        XENVBD|NotifierDpc:Target[0] : Paused, 5 outstanding
00003037        10:12:53        XENVBD|NotifierDpc:Target[0] : Paused, 4 outstanding
00003038        10:12:53        XENVBD|NotifierDpc:Target[0] : Paused, 3 outstanding
00003039        10:12:53        XENVBD|NotifierDpc:Target[0] : Paused, 2 outstanding
00003040        10:12:53        XENVBD|NotifierDpc:Target[0] : Paused, 1 outstanding
00003041        10:12:53        XENVBD|__PdoPauseDataPath:Target[0] : 0/5 Submitted requests left (21711 iterrations)
00003042        10:12:53        XENVBD|__FrontendSetState:Target[0] : ENABLED ----> CLOSING
00003043        10:12:53        XENVBD|__FrontendSetState:Target[0] : in state CONNECTED
00003044        10:12:53        XENVBD|__FrontendSetState:Target[0] : in state CLOSING
00003045        10:12:53        XENVBD|__FrontendSetState:Target[0] : CLOSING ----> CLOSED
00003046        10:12:53        XENVBD|__FrontendSetState:Target[0] : in state CLOSED
00003047        10:12:53        XENVBD|__FrontendSetState:Target[0] : CLOSED ----> ENABLED
00003048        10:12:53        XENVBD|FrontendWriteUsage:Target[0] : DUMP NOT_HIBER PAGE
00003049        10:12:53        XENVBD|PdoUpdateInquiryData:Target[0] : VDI-UUID = {00000000-0000-0000-0000-000000000000}
00003050        10:12:53        XENVBD|FrontendPrepare:Target[0] : BackendId 0 (/local/domain/0/backend/vbd/3/768)
00003051        10:12:53        XENVBD|__FrontendSetState:Target[0] : in state PREPARED
00003052        10:12:53        XENVBD|__FrontendSetState:Target[0] : in state CONNECTED
00003053        10:12:53        XENVBD|__FrontendSetState:Target[0] : in state ENABLED
00003054        10:12:53        XENVBD|__PdoReset:Target[0] <====
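
The timestamps in the log above show where the time goes: the pause starts at 10:12:32 and the first "Paused, 5 outstanding" message arrives at 10:12:52. A minimal Python sketch for finding the longest gap in a saved DebugView log follows. It assumes the "<sequence> <HH:MM:SS> <message>" layout seen above; the function names and the embedded sample are illustrative only and are not part of any driver tooling. Wrapped continuation lines are skipped, and a log that crosses midnight is not handled.

```python
import re
from datetime import datetime

# Assumed layout: "<sequence>  <HH:MM:SS>  <message>", as in the excerpt above.
LINE_RE = re.compile(r"^(\d+)\s+(\d{2}:\d{2}:\d{2})\s+(.*)$")


def parse_lines(text):
    """Return (sequence, time, message) tuples for lines matching the layout."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            seq, stamp, msg = m.groups()
            entries.append((int(seq), datetime.strptime(stamp, "%H:%M:%S"), msg))
    return entries


def largest_gap(entries):
    """Return (seconds, message_before, message_after) for the longest gap."""
    best = (0.0, None, None)
    for (_, t0, m0), (_, t1, m1) in zip(entries, entries[1:]):
        gap = (t1 - t0).total_seconds()
        if gap > best[0]:
            best = (gap, m0, m1)
    return best


# First four lines of the excerpt above, with wrapped lines joined.
SAMPLE = """\
00003034        10:12:32        XENVBD|__PdoReset:Target[0] ====>
00003035        10:12:32        XENVBD|__PdoPauseDataPath:Target[0] : Waiting for 5 Submitted requests
00003036        10:12:52        XENVBD|NotifierDpc:Target[0] : Paused, 5 outstanding
00003037        10:12:53        XENVBD|NotifierDpc:Target[0] : Paused, 4 outstanding
"""

if __name__ == "__main__":
    seconds, before, after = largest_gap(parse_lines(SAMPLE))
    print(f"{seconds:.0f} s between:\n  {before}\n  {after}")
```

On the sample this reports the 20 second wait between the "Waiting for 5 Submitted requests" line and the first "Paused" line.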

There is also a restart logged in Dom0, but no errors on the disks or iSCSI:

[ 3919.034421] xen-blkback:backend/vbd/3/768: prepare for reconnect
[ 3919.039869] xen-blkback:ring-ref 32, event-channel 40, protocol 1 (x86_64-abi)

Sometimes (roughly 1 case in 1000) the system hangs completely (this is from a screenshot; I was not able to save the log):

XENDISK:PdoSendTrimSynchronous:fail2
XENDISK:PdoSendTrimSynchronous:fail1 (c0000185)

With the OLDER PV drivers 7.2x there are no hangs, but there are also some interesting logs in "DebugView" (DomU):

00000035        556.56622314    XENVBD|__BufferReaperThread:Reaping Buffers (185 > 32)
00000036        557.56567383    XENVBD|__BufferReaperThread:Reaping Buffers (362 > 32)
00000037        558.56231689    XENVBD|__BufferReaperThread:Reaping Buffers (209 > 32)

---------------------

2) use tapdisk3 (format=raw, vdev=hda, access=rw, script=block-tap, target=aio:/dev/mapper/3600144f07a0542580000568ba94a0001)

Performance is __bad__, but the system is usable for work.

There are __no__ errors, but performance dropped by ~20-50%! Also, as expected, when running "CrystalDiskMark" the Dom0 "tapdisk" process takes 100% of one CPU (is it single-threaded?), and vmstat reports ~100000 context switches!
(I think there is some further optimization available, as described at
http://xenserver.org/discuss-virtualization/virtualization-blog/entry/tapdisk3.html )
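
For reference, the context-switch rate that vmstat shows in its "cs" column can also be sampled directly in Dom0 from the cumulative "ctxt" counter in /proc/stat. The following is a minimal Python sketch of that measurement. The function names are illustrative, the figure is system-wide rather than specific to the tapdisk process, and per-process counters would have to come from /proc/<pid>/status instead.

```python
import time


def parse_ctxt(stat_text):
    """Extract the cumulative context-switch counter from /proc/stat contents."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def switches_per_second(before, after, interval):
    """Convert two cumulative counter readings into a per-second rate."""
    return (after - before) / interval


def sample(interval=1.0, path="/proc/stat"):
    """Read the counter twice, `interval` seconds apart, and return the rate."""
    with open(path) as f:
        first = parse_ctxt(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_ctxt(f.read())
    return switches_per_second(first, second, interval)


if __name__ == "__main__":
    # Run in Dom0 while the benchmark is active in the guest.
    print(f"{sample():.0f} context switches/s")
```

Comparing the rate with the guest idle and with the benchmark running gives a rough idea of how much of the load is attributable to the disk path.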

---------------------

Environment:
- Windows 7 x64
- tested with the signed WinPV drivers 8.1 and primarily with the development drivers 8.2
- Xen 4.5.3, 4.6 and primarily 4.7.0
- "XenServer" kernels: kernel-3.10.41-353.380450 (and others from XS6.5) and kernel-3.10.96-495.383045.x86_64 (and others from XS7)
- blktap3: blktap-3.0.0.xs1001-xs6.5.0 and blktap-3.2.0.xs1087-xs7.0.0.x86_64

---------------------

Questions:

What is buggy in the "direct blkback" chain?
Was it tested?
Are there any simple observability scripts for Dom0 (for example for systemtap) to study blkback behaviour?

How can I check that I am using the "optimized" tapdisk3?
Is installing tapdisk3 and compiling Xen with "--disable-blktap2" sufficient?
Are the performance drop with tapdisk3 and the high load in Dom0 expected?

Thanks for answers, Martin Cerveny


 

