Re: [Xen-devel] Some trouble to use NVIDIA CUDA with Xen



Hello.

On Thu, 15 Aug 2013, Konrad Rzeszutek Wilk wrote:
HAVE_NV_XEN is NOT defined.

HAVE_NV_XEN is defined only if "nv-xen.h" is present (tested in
/usr/src/nvidia-319.37/conftest.h), and that header seems to have been
removed from the distributed source (~ in nvidia driver 19x.x.x versions).
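
As a rough illustration of the mechanism described above (a macro that exists only when the header was found, and code that is compiled only when the macro exists), here is a minimal sketch in plain C. It is not the NVIDIA build logic: the compiler's `__has_include` check merely stands in for the conftest step, and `xen_path_compiled_in` is a hypothetical helper name.

```c
#include <stdio.h>

/*
 * Assumption: this models the header-presence test in plain C.
 * The real driver build generates conftest.h with its own script;
 * __has_include only stands in for that step here.
 */
#if defined(__has_include)
#  if __has_include("nv-xen.h")
#    define HAVE_NV_XEN 1
#  endif
#endif

/* Hypothetical helper: reports whether the Xen-specific path was compiled in. */
static int xen_path_compiled_in(void)
{
#if defined(HAVE_NV_XEN)
    return 1;   /* header was found, so the macro is defined */
#else
    return 0;   /* header missing, so the Xen-specific code is left out */
#endif
}
```

Built in a directory without nv-xen.h, the helper returns 0. With a file of that name next to the source, and a compiler that supports `__has_include`, it returns 1.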

OK, I downloaded an older version of "nv-xen.h" from the net to

Do you know what it contains? Perhaps there are some oddities in there?

I took the first match from Google :-) https://github.com/lll-project/nvidia/blob/master/include/nvidia/nv-xen.h

It may not be the latest one, but I asked NVIDIA to provide an up-to-date version.

But some programs hang PCIe and the kernel:

[55799.433278] BUG: Bad rss-counter state mm:ffff8800723e0000 idx:1 val:21
[55800.139090] abrt-handle-eve[10175]: segfault at 18 ip 0000003f20ebb6d3 sp 00007fffa7e6ef00 error 4 in libc-2.16.so[3f20e00000+1ad000]
[55800.375196] BUG: Bad rss-counter state mm:ffff8800723e2680 idx:1 val:5
[55845.124636] BUG: Bad rss-counter state mm:ffff8800723e0000 idx:1 val:8
[55962.186275] BUG: Bad rss-counter state mm:ffff880074a27800 idx:0 val:5
[55962.192811] BUG: Bad rss-counter state mm:ffff880074a27800 idx:1 val:795
[55962.262019] traps: abrt-handle-eve[10287] general protection ip:3f20ebb7a6 sp:7fffbd613410 error:0 in libc-2.16.so[3f20e00000+1ad000]
[55962.394789] BUG: Bad rss-counter state mm:ffff8800723e0380 idx:1 val:13

That and those errors above imply that the nvidia driver is not doing
a good job of converting the WC pages back to WB. And when they
go back to the general pool of memory they still have the WC bit
set. Which is really really bad.
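
For anyone who wants to tally these reports from a saved kernel log (for example, to see which mm addresses repeat), a small userspace sketch in C follows. The message layout is taken only from the log lines quoted in this thread; the `rss_report` struct and `parse_rss_report` function are hypothetical names, not part of any kernel or driver API.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one "Bad rss-counter state" report. */
struct rss_report {
    unsigned long long mm;  /* address printed after "mm:" */
    int idx;                /* counter index printed after "idx:" */
    long val;               /* leftover counter value printed after "val:" */
};

/* Returns 1 and fills *out if the line holds such a report, else 0. */
static int parse_rss_report(const char *line, struct rss_report *out)
{
    /* Skip the timestamp and "BUG:" prefix by locating the message text. */
    const char *p = strstr(line, "Bad rss-counter state");
    if (p == NULL)
        return 0;
    return sscanf(p, "Bad rss-counter state mm:%llx idx:%d val:%ld",
                  &out->mm, &out->idx, &out->val) == 3;
}
```

Feeding each log line to parse_rss_report and counting by the mm field would show, for instance, that ffff8800723e0000 appears twice in the excerpt above.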

I presume there was some code that did the 'mark_WC' and then
'unmark_WC' (or mark_WB) or perhaps set_pages_wb and set_pages_wc.

(The set_pages_wb and set_pages_wc fix is in the one pageattr.c file.
You could also add a printk in the code there to make sure that
it is indeed working correctly - or use this little module:

http://xenbits.xen.org/gitweb/?p=xentesttools/bootstrap.git;a=blob;f=root_image/drivers/wb_to_wc/wb_to_wc.c;h=cd2439ac103150229f14f732a9a7a271ca6f397e;hb=HEAD

to double check that it is working correctly).

I will try it at the weekend.

M.C>
