[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5][XSA-97] x86/paging: make log-dirty operations preemptible



>>> On 11.09.14 at 20:27, <andrew.cooper3@xxxxxxxxxx> wrote:
> On inspection of the libxc logs, I am feeing quite glad I left this
> debugging message in:
> 
> xenguest-75-save[11876]: xc: detail: Bitmap contained more entries than
> expected...
> xenguest-83-save[32123]: xc: detail: Bitmap contained more entries than
> expected...
> xenguest-84-save[471]: xc: detail: Bitmap contained more entries than
> expected...
> xenguest-88-save[3823]: xc: detail: Bitmap contained more entries than
> expected...
> xenguest-89-save[4656]: xc: detail: Bitmap contained more entries than
> expected...
> xenguest-95-save[9379]: xc: detail: Bitmap contained more entries than
> expected...
> xenguest-98-save[11784]: xc: detail: Bitmap contained more entries than
> expected...
> 
> This means that periodically, a XEN_DOMCTL_SHADOW_OP_{CLEAN,PEEK}
> hypercall gives us back a bitmap with more set bits than
> stats.dirty_count which it hands back at the same time.

Which has quite simple an explanation, at least when (as I assume is
the case) XEN_DOMCTL_SHADOW_OP_CLEAN is being used: We
clear

d->arch.paging.log_dirty.{dirty,fault}_count on the each step of the
continuation instead of just on the final one. That code needs moving
down.

This, however, points at a broader problem: The necessary dropping
of the paging lock between continuation steps implies that subsequent
setting of bits in the map would need to distinguish between ranges
where the bitmap already got copied and those still outstanding when
it comes to updating dirty_count. I.e. no matter whether we leave
the current copying/clearing where it is or move it down, the counts
wouldn't be precise.

Now both from your description and from looking at current code
I conclude that this is with your new migration code, since current
code uses both counts only for printing messages. Which in turn
raises the question whether use of these counts, getting exported
just as statistics, for anything other than statistics is actually
appropriate. Otoh, if your new migration code doesn't use the
counts for non-statistical, non-logging purposes I still can't see why
things go wrong for you but not for me or osstest.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.