
Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles



Thursday, February 27, 2014, 4:57:26 PM, you wrote:

> On Thu, Feb 27, 2014 at 04:26:55PM +0100, Sander Eikelenboom wrote:
> [...]
>> >> Added some more printk's:
>> >> 
>> >> @@ -2072,7 +2076,11 @@ __gnttab_copy(
>> >>                                        &s_frame, &s_pg,
>> >>                                        &source_off, &source_len, 1);
>> >>          if ( rc != GNTST_okay )
>> >> -            goto error_out;
>> >> +            PIN_FAIL(error_out, GNTST_general_error,
>> >> +                     "?!?!? src_is_gref: aquire grant for copy failed current_dom_id:%d src_dom_id:%d dest_dom_id:%d\n",
>> >> +                     current->domain->domain_id, op->source.domid, op->dest.domid);
>> >> +
>> >> +
>> >>          have_s_grant = 1;
>> >>          if ( op->source.offset < source_off ||
>> >>               op->len > source_len )
>> >> @@ -2096,7 +2104,11 @@ __gnttab_copy(
>> >>                                        current->domain->domain_id, 0,
>> >>                                        &d_frame, &d_pg, &dest_off, &dest_len, 1);
>> >>          if ( rc != GNTST_okay )
>> >> -            goto error_out;
>> >> +            PIN_FAIL(error_out, GNTST_general_error,
>> >> +                     "?!?!? dest_is_gref: aquire grant for copy failed current_dom_id:%d src_dom_id:%d dest_dom_id:%d\n",
>> >> +                     current->domain->domain_id, op->source.domid, op->dest.domid);
>> >> +
>> >> +
>> >>          have_d_grant = 1;
>> >> 
>> >> 
>> >> this comes out:
>> >> 
>> >> (XEN) [2014-02-27 02:34:37] grant_table.c:2109:d0 ?!?!? dest_is_gref: aquire grant for copy failed current_dom_id:0 src_dom_id:32752 dest_dom_id:7
>> >> 
>> 
>> > If it fails in gnttab_copy then I very much suspect this is a network
>> > driver problem, as persistent grants in the blk driver don't use grant
>> > copy.
>> 
>> Does dest_is_gref or src_is_gref by any chance give some sort of direction?
>> 

> Yes, there's an indication. For the network driver, dest_is_gref means the
> DomU RX path and src_is_gref means the DomU TX path.

> In the particular error message you mentioned, it means that this
> happens in DomU's RX path, but it does not give us a clear idea of what
> happened. As the ring is skewed anyway, it's not surprising to see a
> garbage gref in the hypervisor.

> Wei.
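
To make that mapping concrete, here is a rough C sketch of how the flags and
domids in the log line above could be read. The three constants are copied
from Xen's public headers (xen.h and grant_table.h); the helper names
copy_path() and domid_name() are made up for illustration and exist in neither
Xen nor the Linux drivers. Note that 32752 is 0x7FF0, i.e. DOMID_SELF, so the
failing copy has a local (Dom0) source and a grant reference in domain 7 as
its destination, which fits the DomU RX path.

```c
#include <stdint.h>
#include <stdio.h>

/* Values mirrored from Xen's public headers (xen.h, grant_table.h). */
#define DOMID_SELF          0x7FF0U
#define GNTCOPY_source_gref (1U << 0)
#define GNTCOPY_dest_gref   (1U << 1)

/*
 * Hypothetical helper: name the network path a grant copy belongs to,
 * following the rule quoted above (dest is a gref -> DomU RX,
 * source is a gref -> DomU TX). Anything else is reported as "other".
 */
static const char *copy_path(unsigned int flags)
{
    int src = !!(flags & GNTCOPY_source_gref);
    int dst = !!(flags & GNTCOPY_dest_gref);

    if (dst && !src)
        return "DomU RX";
    if (src && !dst)
        return "DomU TX";
    return "other";
}

/*
 * Hypothetical helper: turn a domid, as printed with %d in the log,
 * into something readable. Writes into buf and returns it.
 */
static const char *domid_name(uint16_t domid, char *buf, size_t len)
{
    if (domid == DOMID_SELF)
        snprintf(buf, len, "DOMID_SELF");
    else
        snprintf(buf, len, "d%u", (unsigned int)domid);
    return buf;
}
```

With the values from the log line, copy_path(GNTCOPY_dest_gref) returns
"DomU RX", domid_name(32752, ...) returns "DOMID_SELF" and domid_name(7, ...)
returns "d7".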

>> >> 
>> >> > My suggestion is, if you have a working baseline, you can try to set up
>> >> > different frontend / backend combinations to help narrow down the
>> >> > problem.
>> >> 
>> >> Will see what I can do after the weekend.
>> >> 
A small update

I tried reverting the latest netback / netfront patches, but to no avail.
I also tried to trigger it with netperf by generating a lot of frags
(which would make it easier to reproduce), but that didn't work either.
It seems to trigger only occasionally, and only with my specific workload.

So I pressed ahead and tried Zoltan's series v6
(since it also changes the way the network code uses the grant tables).
I ran it overnight under the same workload as before and
haven't triggered anything yet .. looking good so far :-)

--
Sander



>> 
>> > Thanks
>> 
>> >> > Wei.
>> >> 
>> >> <snip>
>> >> 
>> 



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

