Re: [Xen-devel] netback BUG_ON when using copy_skb=1
>>> On 17.10.13 at 12:26, jerry <jerry.lilijun@xxxxxxxxxx> wrote:
> Hi Jan,

please don't top post.

> In my test, the grant table copy error may cause that VM crash.
> The stack is as follows:
> kernel BUG at /linux/driver/redhat6.2/xen-vnif/xen-netfront.c:372!
> ...
> The BUG code in xen-netfront.c xennet_tx_buf_gc() is:
>     if (unlikely(gnttab_query_foreign_access(
>             np->grant_tx_ref[id]) != 0)) {
>             printk(KERN_ALERT "xennet_tx_buf_gc: warning "
>                    "-- grant still in use by backend "
>                    "domain.\n");
>             BUG();
>
> In my guess the reason may be as follows:
> 1) XEN: The function _set_status() called in hypercall __gnttab_copy() and
> __acquire_grant_for_copy() is executed failed and the grant ref is not ended.
> So GTF_reading bit cannot be cleared.
> 2) Netfront: this module invokes a BUG when it checks the GTF_reading bit is
> still set.

If that was the case, this would be a hypervisor bug: a grant copy
operation is supposed to hold the grant active only for as long as
the copy operation takes. You'll in particular notice that
__acquire_grant_for_copy() in its error path clears GTF_reading
(and GTF_writing, as appropriate) again.

You'd likely need to instrument the code to demonstrate (via a
couple of extra log messages) what you think is not working
properly here.

Jan

> On 2013/10/17 16:00, Jan Beulich wrote:
>>>>> On 17.10.13 at 09:41, jerry <jerry.lilijun@xxxxxxxxxx> wrote:
>>> But there may be still concurrency problems in my test.
>>> If the page replacing in copy_pending_req() was done after
>>> netif_get_page_ext() in netbk_gop_frag(), copy_gop->flags is wrongly marked
>>> with GNTCOPY_source_gref.
>>> Here the memory of that page in skb has been replaced with Dom0 local
>>> memory, so the later HYPERVISOR_multicall() with GNTTABOP_copy in
>>> netbk_rx_actions() will get errors.
>>> The messages is shown as:
>>>
>>> (XEN) grant_table.c:305:d0 Bad flags (0) or dom (0). (expected dom 0)
>>>
>>> Would you like to share some opinions?
>>
>> At a first glance that seems possible, but the question is - does it
>> cause any problems other than the quoted message to be issued
>> (and the problematic packet getting re-transmitted)? I'm asking
>> mainly because fixing this would appear to imply adding locking to
>> these paths - with the risk of adversely affecting performance.
>>
>> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
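
The invariant discussed in this thread (a grant copy keeps GTF_reading, and
GTF_writing where appropriate, set only for as long as the copy takes, and
the error path clears them again) can be sketched as a small stand-alone C
model. This is not the hypervisor or netfront code: struct toy_grant, all
toy_* functions, and the "fail" parameter are hypothetical names made up for
illustration, and the bit positions of the two flags are illustrative only.
The fprintf() calls stand in for the kind of extra log messages one might add
when instrumenting the real paths.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative bit positions; the names come from the thread, the values are assumed. */
#define GTF_reading (1U << 3)
#define GTF_writing (1U << 4)

/* Hypothetical stand-in for the flags word of a shared grant entry. */
struct toy_grant {
    uint16_t flags;
};

/* Hypothetical trace helper: the "extra log message" used for instrumentation. */
static void toy_log(const char *where, const struct toy_grant *g, int rc)
{
    fprintf(stderr, "toy_grant_copy: %s: flags=%#x rc=%d\n",
            where, (unsigned int)g->flags, rc);
}

/* Mark the grant as in use; on a (simulated) failure undo the marking first. */
static int toy_acquire_for_copy(struct toy_grant *g, int readonly, int fail)
{
    uint16_t taken = (uint16_t)(GTF_reading | (readonly ? 0U : GTF_writing));

    assert(g != NULL);
    g->flags = (uint16_t)(g->flags | taken);
    if (fail) {
        /* Error path: the in-use bits must not survive the failed call. */
        g->flags = (uint16_t)(g->flags & ~taken);
        return -1;
    }
    return 0;
}

/* Drop the in-use bits once the copy has finished. */
static void toy_release_after_copy(struct toy_grant *g, int readonly)
{
    uint16_t taken = (uint16_t)(GTF_reading | (readonly ? 0U : GTF_writing));

    g->flags = (uint16_t)(g->flags & ~taken);
}

/* One copy operation: acquire, (pretend to) copy, release. */
static int toy_grant_copy(struct toy_grant *g, int readonly, int fail)
{
    int rc = toy_acquire_for_copy(g, readonly, fail);

    toy_log("after acquire", g, rc);
    if (rc != 0)
        return rc;
    /* The actual data copy would happen here. */
    toy_release_after_copy(g, readonly);
    toy_log("after release", g, 0);
    return 0;
}

/* Frontend-side check, analogous in spirit to the test that precedes BUG(). */
static int toy_grant_still_in_use(const struct toy_grant *g)
{
    return (g->flags & (GTF_reading | GTF_writing)) != 0;
}
```

In this model, toy_grant_copy(&g, 0, 1) on a zeroed grant returns -1 and
toy_grant_still_in_use(&g) is 0 afterwards. If equivalent logging in the real
code showed the bits still set after a failed copy, that would point at the
hypervisor side rather than at netfront.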