[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] netback Oops then xenwatch stuck in D state



>>> On 13.02.13 at 21:17, Wei Liu <wei.liu2@xxxxxxxxxx> wrote:
> On Wed, Feb 13, 2013 at 07:20:32PM +0000, David Vrabel wrote:
>> On 13/02/13 18:37, Wei Liu wrote:
>> > A slightly upgraded version of the *UNTESTED* patch.
>> > 
>> > 
>> > Wei.
>> > 
>> > ----8<----
>> > commit df4c929d034cec7043fbd96ba89833eb639c336e
>> > Author: Wei Liu <wei.liu2@xxxxxxxxxx>
>> > Date:   Wed Feb 13 18:17:01 2013 +0000
>> > 
>> >     netback: fix netbk_count_requests
>> >     
>> >     There are two paths in the original code, a) test against work_to_do, 
> b) test
>> >     against first->size, could return 0 even when error happens.
>> >     
>> >     Simply return -1 in error paths should work. Modify all error paths to 
> return
>> >     -1 to be consistent.
>> 
>> You also need to remove the netbk_tx_err() after checking the result of
>> netbk_count_requests().  Otherwise you will have a double xenvif_put(),
>> which will screw up ref counting.
> 
> I just realized that we were talking about different code path when I
> walked home.
> 
> The path you mentioned is correct. As excution flow should never reach
> there if there is error in netbk_count_requests.
> 
> The path I'm not sure is that in the netbk_fatal_tx_err, it calls
> xenvif_carrier_off which calls xenvif_put, and then it calls xenvif_put,
> which is likely to mess up the refcount.

I first thought so too when looking at this again yesterday, but
then convinced myself that this double put _here_ is correct. And
Ian's patch specifically removed to call to netbk_tx_err() after the
caller netbk_count_requests() recognized the error, knowing that
the latter function now itself issues a call to netbk_fatal_tx_err()
(whether that wouldn't better have replaced the caller's call to
netbk_tx_err() is a different question).

What I do support is that fact that the third error path in
netbk_count_requests() has the potential of returning zero
instead of a negative value. As the caller doesn't really do
anything with the specific negative value (it only did so before
Ian's patch), returning -1 or -EINVAL on all four error paths
would seem the right change to me.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.