
Re: [Xen-devel] Rebooting domu fails in nfs share exported from another domu on the same dom0




On 2014/7/28 10:14, David Vrabel wrote:
On 16/07/14 21:36, annie li wrote:
Hi

I hit a problem in the following scenario: vm1 is running and exports an NFS
share, dom0 mounts this share, and vm2 is booted from this NFS location. vm1
and vm2 are running on the same dom0.

When this bug happens, the data flow is:  vm2 blkfront-> vm2 blkback->
loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
-> vm1 netfront.

In the above data flow, NFS uses direct I/O, and blkfront and blkback use
grant mapping. This lets the page mapping work all the way from vm2 blkfront
to vm1 netback. However, when netback does a grant copy, an error occurs in
this call chain:
__gnttab_copy->__get_paged_frame->get_page_from_gfn->get_page.
See get_page() in /xen/arch/x86/mm.c:
     if ( likely(owner == domain) )
         return 1;
In the above if condition, the source page comes from vm2, so owner is vm2,
while domain is dom0 (id 0) here. get_page therefore returns 0,
get_page_from_gfn returns NULL, and __get_paged_frame returns GNTST_bad_page.
Finally, put_page is called directly in __gnttab_copy and the grant copy fails
in netback. As a result, the write to the NFS file fails, which damages the
file, and the VM can then not be rebooted successfully.
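
The failing ownership check can be modelled in isolation. The following is only a simplified sketch, not Xen's actual code: the struct layouts, the plain refcount, and every name prefixed with model_ or MODEL_ are assumptions made for illustration. It shows a lookup made on behalf of dom0 being rejected because the source page is owned by vm2.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT Xen's real definitions. */
struct domain {
    unsigned int domain_id;
};

struct page_info {
    struct domain *owner;    /* domain that owns the frame */
    unsigned long refcount;  /* hypothetical plain counter */
};

/* Hypothetical status codes; the values are not Xen's GNTST_* values. */
enum { MODEL_GNTST_okay = 0, MODEL_GNTST_bad_page = -1 };

/*
 * Take a reference only if 'domain' owns the page.
 * Returns 1 on success and 0 otherwise, mirroring the
 * "owner == domain" test quoted above.
 */
static int model_get_page(struct page_info *page, const struct domain *domain)
{
    assert(page != NULL && domain != NULL);
    if (page->owner == domain) {
        page->refcount++;
        return 1;
    }
    return 0;
}

/*
 * Model of the source-page lookup done for a grant copy: it fails
 * whenever the domain the lookup is made for does not own the page.
 */
static int model_copy_source_lookup(struct page_info *src,
                                    const struct domain *caller)
{
    return model_get_page(src, caller) ? MODEL_GNTST_okay
                                       : MODEL_GNTST_bad_page;
}
```

With a page whose owner is vm2, model_copy_source_lookup() returns MODEL_GNTST_bad_page when called for dom0 and MODEL_GNTST_okay when called for vm2, which matches the failure described above.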

Disabling NFS direct I/O can be a workaround, but it causes a performance
penalty. Alternatively, introducing a copy anywhere between vm2 blkfront and
vm1 netback would probably help in this case. But zero-copy is best for
performance, so are there any suggestions for this issue?
I planned (eventually) for the struct pages of grant-mapped foreign frames
to be marked as such, and for the gref and original domain to then be
accessible. The netback-specific code for dealing with foreign pages could
then be made generic.

It would be good if the handling of foreign pages in netback could be made generic.


The difficulty lies in extending struct page without actually making it
bigger and without adding Xen-specific fields to it...

Yes...


Other alternatives I explored were using the guest mapping to copy
to/from instead of having to use the grant ref to find the page.  But
page sharing etc. made this look like a nightmare.

What I am thinking of is adding one more field named "frame" to the grant_mapping structure; see xen/include/xen/grant_table.h. With it, we can find the ref from the foreign page, though this probably involves some searching. But I was interrupted by other work and have not started on it yet.

For example,

struct grant_mapping {
    u32      ref;           /* grant ref */
    u16      flags;         /* 0-4: GNTMAP_* ; 5-15: unused */
    domid_t  domid;         /* granting domain */
+  unsigned long frame;  /* grant frame */
};
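
The "searching work" mentioned above could look roughly like the sketch below. It is a minimal illustration only: find_mapping_by_frame is a hypothetical helper, the flat array stands in for whatever table actually holds the mappings, and treating flags == 0 as "entry not in use" is an assumption. The standard fixed-width types replace u32/u16 so the fragment compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t domid_t;

/* The structure from above, including the proposed 'frame' field. */
struct grant_mapping {
    uint32_t      ref;    /* grant ref */
    uint16_t      flags;  /* GNTMAP_* bits; 0 assumed to mean unused */
    domid_t       domid;  /* granting domain */
    unsigned long frame;  /* grant frame (proposed addition) */
};

/*
 * Hypothetical helper: linear search for the in-use mapping that covers
 * 'frame'. Returns a pointer to the entry, or NULL if none matches.
 * The caller can then read ->ref and ->domid from the result.
 */
static const struct grant_mapping *
find_mapping_by_frame(const struct grant_mapping *maps, size_t count,
                      unsigned long frame)
{
    assert(maps != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (maps[i].flags != 0 && maps[i].frame == frame)
            return &maps[i];
    }
    return NULL;
}
```

A linear scan costs O(n) per lookup; if this ran on every grant copy, an index keyed by frame would probably be needed, but that is beyond this sketch.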

Thanks
Annie

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

