
Re: [Xen-devel] dom0 linux 3.6.0-rc4, crash due to ballooning although dom0_mem=X, max:X set



On Wed, 12 Sep 2012, Sander Eikelenboom wrote:
> Tuesday, September 11, 2012, 6:02:47 PM, you wrote:
> 
> > On Wed, 5 Sep 2012, Konrad Rzeszutek Wilk wrote:
> >> On Tue, Sep 04, 2012 at 04:27:20PM -0400, Robert Phillips wrote:
> >> > Ben,
> >> > 
> >> > You have asked me to provide the rationale behind the gnttab_old_mfn 
> >> > patch, which you emailed to Sander earlier today. 
> >> > Here are my findings.
> >> > 
> >> > I found that xen_blkbk_map() in drivers/block/xen-blkback/blkback.c has 
> >> > changed from our previous version.  It now calls gnttab_map_refs() in 
> >> > drivers/xen/grant-table.c.
> >> > 
> >> > That function first calls 
> >> > HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, ... ) and then calls 
> >> > m2p_add_override() in p2m.c
> >> 
> >> And HYPERVISOR_grant_table_op .. would populate map_ops[i].bus_addr with 
> >> the machine address..
> >> 
> >> > which is where I made my change.
> >> > 
> >> > The unpatched code was saving the pfn's old mfn in 
> >> > kmap_op->dev_bus_addr.  
> >> > 
> >> > kmap_op is of type struct gnttab_map_grant_ref.  That data type is used 
> >> > to record grant table mappings so later they can be unmapped correctly.
> >> 
> >> Right, but the blkback makes a distinction by passing NULL as kmap_op, 
> >> which means it should use the old mechanism. Meaning that once the 
> >> hypercall is done, the map_ops[i].bus_addr is not used anymore..
> >> 
> >> > 
> >> > The problem with saving the old mfn in kmap_op->dev_bus_addr is that it 
> >> > is later overwritten by __gnttab_map_grant_ref() in 
> >> > xen/common/grant_table.c
> >> 
> >> Uh, so the problem of saving the old mfn in dev_bus_addr has been there 
> >> for a long long time then?
> >> Even before this patch set?
> 
> > I think that Robert identified the real problem: dev_bus_addr shouldn't
> > have been used here. However, the bug only shows up if we are batching
> > the grant table operations, which we started doing in
> > f62805f1f30a40e354bd036b4cb799863a39be4b.
> > That's why Sander's bisection found that
> > f62805f1f30a40e354bd036b4cb799863a39be4b is the culprit.
> 
> > However the fix is incorrect because it is modifying a struct that is
> > part of the Xen ABI.
> > I am appending an alternative fix that doesn't need any changes to
> > public headers.
> 
> > Sander, could you please let me know if it fixes the problem for you?
> 
> It does !
> 
> Tested-By: Sander Eikelenboom <linux@xxxxxxxxxxxxxx>
> 

Thanks for testing!
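
For anyone reading this thread later, the failure mode discussed above can
be sketched in a few lines of user-space C. This is only an illustration
under stated assumptions: struct fake_map_op, fake_batched_map() and the two
stash_* helpers are hypothetical names, not kernel or hypervisor code, and
the "separate storage" variant is just one generic way to avoid the
overwrite, not necessarily the fix that was appended earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct gnttab_map_grant_ref.
 * Only the field discussed in the thread is modelled. */
struct fake_map_op {
    uint64_t dev_bus_addr;
};

/* Arbitrary value the simulated hypervisor writes back. */
#define FAKE_MADDR_BASE 0x100000ULL

/* Simulated batched map operation: like the hypervisor side described
 * in the thread, it writes a machine address into dev_bus_addr of every
 * op when the batch finally runs. */
static void fake_batched_map(struct fake_map_op *ops, size_t n)
{
    assert(ops != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        ops[i].dev_bus_addr = FAKE_MADDR_BASE + i * 0x1000ULL;
}

/* Problematic pattern: stash the old mfn in the op itself, then let the
 * deferred batch run. The stashed value is overwritten. */
static uint64_t stash_in_op_then_map(uint64_t old_mfn)
{
    struct fake_map_op op = { 0 };

    op.dev_bus_addr = old_mfn;  /* stash in the shared field */
    fake_batched_map(&op, 1);   /* batch runs later and clobbers it */
    return op.dev_bus_addr;     /* no longer the old mfn */
}

/* Generic alternative: keep the old mfn in storage that the map
 * operation never writes to. */
static uint64_t stash_separately_then_map(uint64_t old_mfn)
{
    struct fake_map_op op = { 0 };
    uint64_t saved_mfn = old_mfn; /* private copy, untouched by the batch */

    fake_batched_map(&op, 1);
    return saved_mfn;
}
```

Calling stash_in_op_then_map() with some mfn returns the value written by
the simulated map operation instead, while stash_separately_then_map()
returns the mfn unchanged. If the stash and the write-back happened in the
opposite order the problem would not show, which matches the observation
above that the bug only appears once the operations are batched.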




 

