Re: [Xen-devel] [PATCH] Fix locking bug in vcpu_migrate
On 4/22/2011 11:43 AM, Keir Fraser wrote:
> It's odd that it seemed to lead to such a big difference for me, then.
> I'll do some further tests -- maybe I changed something else to cause
> the behavior, or the problem is more random than I thought and just
> hasn't occurred for me yet in all the new tests.
I did further testing and determined that my domU only appeared to be
starting properly: I had tested just once or twice with Debian Squeeze
after applying the patch, and had then done more extensive testing only
under a Win2k3 domU. It seems that Win2k3 domUs don't have the same issue.
Back on the Squeeze domU, I am reliably seeing the BUG again, with
I have rolled back schedule.c to pre-22948, when it was much simpler,
and that seems to have resolved this particular bug.
Now, a different credit2 bug has occurred, though only once for me so
far; with the other bug, I was seeing a panic with every 1 or 2 domU
startups, but I have seen the new bug on one test out of 15.
Specifically, I have triggered the BUG_ON in csched_dom_cntl. The line
number is not the standard one because I have added further debugging,
but the BUG_ON is:
BUG_ON(svc->rqd != RQD(ops, svc->vcpu->processor));
The backtrace is:
(XEN) [<ffff82c480119578>] csched_dom_cntl+0x11a/0x185
(XEN) [<ffff82c48011f24d>] sched_adjust+0x102/0x1f9
(XEN) [<ffff82c480102ee5>] do_domctl+0xb25/0x1250
(XEN) [<ffff82c4801ff0e8>] syscall_enter+0xc8/0x122
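As a reading aid for the BUG_ON above: it checks that the runqueue pointer cached for the vcpu (svc->rqd) is the same runqueue that the vcpu's current processor maps to. The following is only a minimal standalone sketch of that kind of invariant, not the credit2 code. Every type and name in it (struct runq, struct sched_priv, struct vcpu_model, runq_of, rqd_consistent, move_vcpu) is hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4   /* assumption for the sketch */

/* Hypothetical stand-in for a scheduler runqueue. */
struct runq { int id; };

/* Hypothetical scheduler-private data: two runqueues and a cpu -> runqueue map. */
struct sched_priv {
    struct runq rqd[2];
    int cpu_to_runq[NR_CPUS];
};

/* Hypothetical per-vcpu state: current processor plus a cached runqueue pointer. */
struct vcpu_model {
    int processor;
    struct runq *rqd;
};

/* Plays the role of RQD(ops, cpu) in the BUG_ON: look up the runqueue for a cpu. */
static struct runq *runq_of(struct sched_priv *p, int cpu)
{
    return &p->rqd[p->cpu_to_runq[cpu]];
}

/* The invariant: the cached pointer must match the processor's runqueue. */
static bool rqd_consistent(struct sched_priv *p, const struct vcpu_model *v)
{
    return v->rqd == runq_of(p, v->processor);
}

/* A move that updates both fields together keeps the invariant intact. */
static void move_vcpu(struct sched_priv *p, struct vcpu_model *v, int cpu)
{
    v->processor = cpu;
    v->rqd = runq_of(p, cpu);
    assert(rqd_consistent(p, v));
}
```

In this model, changing processor to a cpu on a different runqueue without also updating rqd makes rqd_consistent() return false, which is the condition the BUG_ON reports.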
Also, in three of those last 15 startups, my domU froze
(consuming no CPU and seemingly doing nothing), somewhere in this block
of code in ring_read in tools/firmware/hvmloader/xenbus.c -- I added
debug information that allowed me to narrow it down. This function is
being called when it is writing the SMBIOS tables. I can't tell whether
this is related to the credit2 problem. (The domU can be "destroyed" to
get out of it).
        /* Don't overrun the producer pointer */
        while ( (part = MASK_XENSTORE_IDX(rings->rsp_prod -
                                          rings->rsp_cons)) == 0 )
            ring_wait();
        /* Don't overrun the end of the ring */
        if ( part > (XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(rings->rsp_cons)) )
            part = XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(rings->rsp_cons);
        /* Don't read more than we were asked for */
        if ( part > len )
            part = len;
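To make the arithmetic in that block easier to follow, here is a minimal standalone sketch of the same three steps, written as a pure function so it can be run outside hvmloader. It is not the hvmloader code: RING_SIZE, MASK_IDX and chunk_len are hypothetical names, and the ring size of 1024 is an assumption of the sketch. A return value of 0 corresponds to the case where the loop quoted above keeps waiting for the producer.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 1024u                          /* assumed; must be a power of two */
#define MASK_IDX(idx) ((idx) & (RING_SIZE - 1))  /* wrap a free-running index */

/*
 * Return how many bytes can be copied in one contiguous chunk, given
 * free-running producer/consumer indices and the number of bytes wanted.
 */
static uint32_t chunk_len(uint32_t prod, uint32_t cons, uint32_t len)
{
    /* Bytes produced but not yet consumed, masked to the ring size. */
    uint32_t part = MASK_IDX(prod - cons);
    /* Bytes left between the consumer position and the end of the buffer. */
    uint32_t to_end = RING_SIZE - MASK_IDX(cons);

    /* Don't run past the end of the ring buffer. */
    if ( part > to_end )
        part = to_end;
    /* Don't return more than the caller asked for. */
    if ( part > len )
        part = len;

    assert(part <= len);
    return part;
}
```

For example, with prod = 1030, cons = 1020 and len = 16, ten bytes are pending but only four remain before the end of the buffer, so the first chunk is four bytes and the caller would loop for the rest.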
Note that I am using stubdoms.
I would be happy to temporarily turn over the reins on this machine to
you or George, if you'd like to debug any of these issues directly. I
may not be able to continue experimenting in the short term here myself
due to time constraints.