Re: [Xen-devel] frequently ballooning results in qemu exit
At 05:54 +0000 on 15 Mar (1363326854), Hanweidong wrote:
> > > I'm also curious about this. There is a window between memory balloon out
> > > and QEMU invalidate mapcache.
> >
> > That by itself is OK; I don't think we need to provide any meaningful
> > semantics if the guest is accessing memory that it's ballooned out.
> >
> > The question is where the SIGBUS comes from: either qemu has a mapping
> > of the old memory, in which case it can write to it safely, or it
> > doesn't, in which case it shouldn't try.
>
> The error always happened at memcpy in if (is_write) branch in
> address_space_rw.

Sure, but _why_?  Why does this access cause SIGBUS?  Presumably there's
some part of the mapcache code that thinks it has a mapping there when
it doesn't.

> We found that, after the last xen_invalidate_map_cache, the mapcache entry
> related to the failed address was mapped:
> ==xen_map_cache== phys_addr=7a3c1ec0 size=0 lock=0
> ==xen_remap_bucket== begin size=1048576 ,address_index=7a3
> ==xen_remap_bucket== end
> entry->paddr_index=7a3,entry->vaddr_base=2a2d9000,size=1048576,address_index=7a3

OK, so that's 0x2a2d9000 -- 0x2a3d8fff.

> ==address_space_rw== ptr=2a39aec0
> ==xen_map_cache== phys_addr=7a3c1ec4 size=0 lock=0
> ==xen_map_cache==first return 2a2d9000+c1ec4=2a39aec4
> ==address_space_rw== ptr=2a39aec4
> ==xen_map_cache== phys_addr=7a3c1ec8 size=0 lock=0
> ==xen_map_cache==first return 2a2d9000+c1ec8=2a39aec8
> ==address_space_rw== ptr=2a39aec8
> ==xen_map_cache== phys_addr=7a3c1ecc size=0 lock=0
> ==xen_map_cache==first return 2a2d9000+c1ecc=2a39aecc
> ==address_space_rw== ptr=2a39aecc

These are all to page 0x2a39a___.
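The address arithmetic in the log above can be reproduced with a few lines of Python. This is only a sketch for checking the numbers, not QEMU code: the 1 MiB bucket size is taken from size=1048576 in the log, the 4 KiB page size is an assumption, and all function names are made up for illustration.

```python
# Hypothetical model of the arithmetic visible in the debug log; not QEMU code.
BUCKET_SHIFT = 20                  # assumed from size=1048576 in the log
BUCKET_SIZE = 1 << BUCKET_SHIFT
PAGE_SHIFT = 12                    # assumed 4 KiB pages


def bucket_index(phys_addr):
    """Which bucket a guest physical address falls in (address_index)."""
    return phys_addr >> BUCKET_SHIFT


def bucket_offset(phys_addr):
    """Byte offset of the address inside its bucket."""
    return phys_addr & (BUCKET_SIZE - 1)


def host_pointer(vaddr_base, phys_addr):
    """Pointer handed back for phys_addr, given the bucket's vaddr_base."""
    return vaddr_base + bucket_offset(phys_addr)


def bucket_range(vaddr_base):
    """First and last host virtual address covered by the bucket."""
    return vaddr_base, vaddr_base + BUCKET_SIZE - 1


def host_page(ptr):
    """Host page number that a pointer falls in."""
    return ptr >> PAGE_SHIFT


lo, hi = bucket_range(0x2a2d9000)
print(hex(lo), hex(hi))
print(hex(host_pointer(0x2a2d9000, 0x7a3c1ec0)))
```

With vaddr_base=2a2d9000 and phys_addr=7a3c1ec0 this gives the bucket range and the ptr value shown in the log, and the four consecutive accesses all land in the same host page.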
> ==xen_map_cache== phys_addr=7a16c108 size=0 lock=0
> ==xen_map_cache== return 2a407000+6c108=2a473108
> ==xen_map_cache== phys_addr=7a16c10c size=0 lock=0
> ==xen_map_cache==first return 2a407000+6c10c=2a47310c
> ==xen_map_cache== phys_addr=7a16c110 size=0 lock=0
> ==xen_map_cache==first return 2a407000+6c110=2a473110
> ==xen_map_cache== phys_addr=7a395000 size=0 lock=0
> ==xen_map_cache== return 2a2d9000+95000=2a36e000
> ==address_space_rw== ptr=2a36e000

And this is to page 0x2a36e___, a different page in the same bucket.

> here, the SIGBUS error occurred.

So that page isn't mapped.  Which means:
 - it was never mapped (and the mapcache code didn't handle the error
   correctly at map time); or
 - it was never mapped (and the mapcache hasn't checked its own records
   before using the map); or
 - it was mapped (and something unmapped it in the meantime).

Why not add some tests in xen_remap_bucket to check that all the pages
that qemu records as mapped are actually there?

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
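The second case in the list above (the cache not checking its own records before handing out a pointer) can be illustrated with a small model. This is a sketch under stated assumptions, not QEMU's mapcache: the Bucket class, the map_errors list (one status per page, 0 meaning success) and the rule that a size=0 lookup is checked as if it touched one byte are all hypothetical, chosen because the log shows size=0 lookups followed by real accesses.

```python
# Hypothetical model, not QEMU's mapcache: each bucket records which of its
# pages were mapped successfully, and every lookup consults that record.
PAGE_SHIFT = 12                    # assumed 4 KiB pages
BUCKET_SHIFT = 20                  # assumed 1 MiB buckets, as in the log
PAGES_PER_BUCKET = 1 << (BUCKET_SHIFT - PAGE_SHIFT)


class Bucket:
    def __init__(self, vaddr_base, map_errors):
        # map_errors: one entry per page, 0 for success, nonzero for failure
        # (an assumed shape for what the mapping call reports).
        if len(map_errors) != PAGES_PER_BUCKET:
            raise ValueError("need one status per page in the bucket")
        self.vaddr_base = vaddr_base
        self.valid = [err == 0 for err in map_errors]

    def unmapped_pages(self):
        """Page indexes recorded as not mapped; a test at remap time could log these."""
        return [i for i, ok in enumerate(self.valid) if not ok]

    def lookup(self, offset, size):
        """Return a host pointer only if every page the access touches is recorded as mapped."""
        length = max(size, 1)      # assumption: size=0 still touches one byte
        first = offset >> PAGE_SHIFT
        last = (offset + length - 1) >> PAGE_SHIFT
        if last >= PAGES_PER_BUCKET:
            raise ValueError("access runs past the end of the bucket")
        bad = [p for p in range(first, last + 1) if not self.valid[p]]
        if bad:
            raise LookupError("pages not mapped: " + ", ".join(hex(p) for p in bad))
        return self.vaddr_base + offset
```

In this model a lookup into a page whose mapping failed raises an error at lookup time instead of returning a pointer that faults later, which is the behaviour one would want to confirm or rule out with the tests suggested above.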