Re: [Xen-devel] Crashing kernel with dom0/libxc gnttab/gntshr
On 07/30/2013 12:58 PM, David Vrabel wrote:
[...]
> [ 902.729307] BUG: Bad page map in process vchan-node1 pte:12bfff167 pmd:b9b5c067
> [ 902.729312] page:ffffea0004afffc0 count:1 mapcount:-1 mapping: (null) index:0xffffffffffffffff
>
> I think this is the test for page_mapcount(page) < 0 in zap_pte_range().
>
> This has looked up the page using the PTE it is trying to clear. Has it
> found the correct page? Since the MFN is currently mapped into the same
> domain, has the m2p_override stuff confused the look up and it is
> checking the grantee page not the granter?
>
> David

I think something like this is happening, since while reproducing this on
my test system, some linked list corruption was found that I believe to be
the cause of this problem.

The gnttab_map_refs function on PV uses m2p_add_override on the page, which
threads page->lru to an m2p_overrides list. However, something else is using
page->lru during the use of gntdev, as shown by the following debug patch:

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 3c8803f..198e57e 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -294,6 +294,11 @@ static int map_grant_pages(struct grant_map *map)
 	if (err)
 		return err;
 
+	printk("map page0 lru: %p prev=%p:%p next=%p:%p\n",
+		&map->pages[0]->lru,
+		map->pages[0]->lru.prev, map->pages[0]->lru.prev->next,
+		map->pages[0]->lru.next, map->pages[0]->lru.next->prev);
+
 	for (i = 0; i < map->count; i++) {
 		if (map->map_ops[i].status)
 			err = -EINVAL;
@@ -320,6 +325,10 @@ static int __unmap_grant_pages(struct grant_map *map, int offset, int pages)
 		}
 	}
 
+	printk("unmap page0 lru: %p prev=%p:%p next=%p:%p\n",
+		&map->pages[0]->lru,
+		map->pages[0]->lru.prev, map->pages[0]->lru.prev->next,
+		map->pages[0]->lru.next, map->pages[0]->lru.next->prev);
 	err = gnttab_unmap_refs(map->unmap_ops + offset,
 			use_ptemod ? map->kmap_ops + offset : NULL,
 			map->pages + offset, pages);

Output:

[ 88.610644] map page0 lru: ffffea0001dee160 prev=ffffffff82f2d510:ffffea0001dee160 next=ffffffff82f2d510:ffffea0001dee160
[ 88.611515] BUG: Bad page map in process a.out pte:8000000077b85167 pmd:2541a067
[ 88.611525] page:ffffea0001dee140 count:1 mapcount:-1 mapping: (null) index:0xffffffffffffffff
[ 88.611532] page flags: 0x1000000000000814(referenced|dirty|private)
[ 88.611541] addr:00007f1adaef3000 vm_flags:140400fb anon_vma: (null) mapping:ffff8800692974a0 index:0
[ 88.611547] vma->vm_ops->fault: (null)
[ 88.611555] vma->vm_file->f_op->mmap: gntalloc_mmap+0x0/0x1d0
[...backtrace cropped...]
[ 88.614301] unmap page0 lru: ffffea0001dee160 prev=ffff8800254c9d08:ffff88001ea0b120 next=ffff8800254c9d08:ffff88001ea0b938

The initial map is a linked list with only that element, so the address
0xffffffff82f2d510 is the m2p_overrides entry. This means the page being
found by zap_pte_range is not a valid struct page. The struct page* being
used by the gntalloc device was 0xffffea0000952740, for reference; it's not
a direct collision between the page used by the gntdev and gntalloc devices.

Not sure what the best fix is for this at the moment.

-- 
Daniel De Graaf
National Security Agency

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel