Re: [Xen-devel] [PATCH v5][XSA-97] x86/paging: make log-dirty operations preemptible
>>> On 15.09.14 at 09:50, <andrew.cooper3@xxxxxxxxxx> wrote:
> It is indeed migration v2, which is necessary in XenServer given our
> recent switch from 32bit dom0 to 64bit. The counts are only used for
> logging, and debugging purposes; all movement of pages is based off the
> bits in the bitmap alone. In particular, the dirty count is used as a
> basis of the statistics for the present iteration of migration. While
> getting it wrong is not the end of the world, it would certainly be
> preferable for the count to be accurate.
>
> As for the memory corruption, XenRT usually tests pairs of VMs at a time
> (32 and 64bit variants) and all operations as back-to-back as possible.
> Therefore, it is highly likely that a continued operation on one domain
> intersects with other paging operations on another.

But there's nothing I can see where domains would have a way of
getting mismatched. It is in particular this one

(XEN) [ 7832.953068] mm.c:827:d0v0 pg_owner 100 l1e_owner 100, but real_pg_owner 99

which puzzles me: Assuming Dom99 was the original one, how would
Dom100 get hold of any of Dom99's pages (IOW why would Dom0
map one of Dom99's pages into Dom100)? The patch doesn't alter
any of the page refcounting after all. Nor does your v2 migration
series I would think.

In general I understand you - as much as I - suspect that we're
losing one or more bits from the dirty bitmap (too many being set
wouldn't do any harm other than affecting performance afaict), but
that scenario doesn't seem to fit with your observations.

> The results (now they have run fully) are 10 tests each. 10 passes
> without this patch, and 10 failures in similar ways with the patch,
> spread across a randomly selected set of hardware.

I was meanwhile considering the call to
d->arch.paging.log_dirty.clean_dirty_bitmap() getting made only in
the final success exit case to be a problem (with the paging lock
dropped perhaps multiple times in between), but I'm pretty certain
it isn't: Newly dirtied pages would get accounted correctly in the
bitmap no matter whether they're in the range already processed
or the remainder, and ones already having been p2m_ram_rw would
have no problem if further writes to them happen while we do
continuations. The only thing potentially suffering here seems
efficiency: We might return a few pages to p2m_ram_logdirty
without strict need (but that issue existed before already, we're
just widening the window).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel