[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel][PV-ops][PATCH] Netback: Fix PV network issue for netback multiple threads patchset
On Thu, 2010-07-01 at 16:29 +0100, Jeremy Fitzhardinge wrote: > On 07/01/2010 04:48 PM, Ian Campbell wrote: > > On Mon, 2010-06-21 at 12:14 +0100, Jeremy Fitzhardinge wrote: > > > >> Subject: [PATCH] xen/netback: make sure all the group structures are > >> initialized before starting async code > >> > >> Split the netbk group array initialization into two phases: one to do > >> simple "can't fail" initialization of lists, timers, locks, etc; and a > >> second pass to allocate memory and start the async code. > >> > >> This makes sure that all the basic stuff is in place by the time the async > >> code > >> is running. > >> > > Paul noticed a crash in netback init because... > > > > > >> @@ -1764,16 +1766,6 @@ static int __init netback_init(void) > >> netbk->netbk_tx_pending_timer.function = > >> netbk_tx_pending_timeout; > >> > >> - netbk->mmap_pages = > >> - alloc_empty_pages_and_pagevec(MAX_PENDING_REQS); > >> - if (!netbk->mmap_pages) { > >> - printk(KERN_ALERT "%s: out of memory\n", __func__); > >> - del_timer(&netbk->netbk_tx_pending_timer); > >> - del_timer(&netbk->net_timer); > >> - rc = -ENOMEM; > >> - goto failed_init; > >> - } > >> - > >> for (i = 0; i < MAX_PENDING_REQS; i++) { > >> page = netbk->mmap_pages[i]; > >> SetPageForeign(page, netif_page_release); > >> > > ...this dereference of netbk->mmap_pages[i]... > > > > > >> @@ -1786,6 +1778,26 @@ static int __init netback_init(void) > >> > > [...] > > > >> + netbk->mmap_pages = > >> + alloc_empty_pages_and_pagevec(MAX_PENDING_REQS); > >> > > ... happens before this initialisation of the array. > > > > Hm, I hadn't meant to commit that properly. I had it locally and > accidentally pushed it out. > > I only did that patch as an RFC in response to an issue alluded to by > Dongxiao (or was it you?) about things not being fully initialized by > the time the async code starts. Is this a real issue, and if so, what's > the correct fix? I don't think there is an actual current issue, just a potential one since we are relying on data structures being zeroed rather than properly initialised to keep the async code from running off into the weeds, it just seemed a little fragile this way. Originally I said: >> The crash is in one of the calls to list_move_tail and I think it is >> because netbk->pending_inuse_head not being initialised until after >> the >> threads and/or tasklets are created (I was running in threaded mode). >> Perhaps even though we are now zeroing the netbk struct those fields >> should still be initialised before kicking off any potentially >> asynchronous tasks? this specific issue was fixed by zeroing the netbk array as it is allocated, I just thought we could make things more robust by not triggering the async code until everything was fully setup. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |