
Re: Re: Memory corruption bug with Xen PV Dom0 and BOSS-S1 RAID card



On Wed, Feb 19, 2025 at 07:37:47PM +0100, Paweł Srokosz wrote:
> Hello,
> 
> > So the issue doesn't happen on debug=y builds? That's unexpected.  I would
> > expect the opposite, that some code in Linux assumes that pfn + 1 == mfn +
> > 1, and hence breaks when the relation is reversed.
> 
> It was also surprising to me, but I think the key thing is that debug=y
> causes the whole mapping to be reversed, so each PFN lands on a completely
> different MFN, e.g. MFN=0x1300000 is mapped to PFN=0x20e50c in ndebug, but in
> debug it's mapped to PFN=0x5FFFFF. I guess that's why I can't reproduce the
> problem.
> 
> > Can you see if you can reproduce with dom0-iommu=strict in the Xen command
> > line?
> 
> Unfortunately, it doesn't help. But I have a few more observations.
> 
> Firstly, I checked the "xen-mfndump dump-m2p" output and found that misread
> blocks are mapped to suspiciously round MFNs. I have different versions of
> Xen and Linux kernel on each machine and I see some coincidence.
> 
> I'm writing a few huge files without Xen to ensure that they have been written
> correctly (because under Xen both reads and writeback are affected). Then I'm
> booting into Xen, memory-mapping the files and reading each page. I see that
> when a block is corrupted, it is mapped to a round MFN, e.g.
> pfn=0x5095d9/mfn=0x1600000, another at pfn=0x4095d9/mfn=0x1500000 etc.
> 
> On another machine with different Linux/Xen version these faults appear on
> pfn=0x20e50c/mfn=0x1300000, pfn=0x30e50c/mfn=0x1400000 etc.
> 
> I also noticed that during a read of the page that is mapped to
> pfn=0x20e50c/mfn=0x1300000, I'm getting these faults from DMAR:
> 
> ```
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 1200000000
> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 1200001000
> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 1200006000
> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 1200008000
> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 1200009000
> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 120000a000
> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 120000c000
> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
> ```

That's interesting, it seems to me that Linux is assuming that pages
at certain boundaries are superpages, and thus it can just increase
the mfn to get the next physical page.
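
To make the "round MFN" observation concrete, here is a small sketch that
computes how strongly each reported MFN is aligned. It assumes 4 KiB base
pages and the usual x86-64 superpage sizes (2 MiB and 1 GiB); the helper name
largest_alignment is made up for this example and is not taken from Xen or
Linux.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB base pages

# MFNs of the corrupted pages as reported earlier in this thread.
corrupted_mfns = [0x1300000, 0x1400000, 0x1500000, 0x1600000]

SUPERPAGE_2M = 1 << 21
SUPERPAGE_1G = 1 << 30


def largest_alignment(mfn):
    """Return, in bytes, the largest power-of-two boundary this frame starts on.

    Hypothetical helper: the lowest set bit of the MFN gives the alignment
    in frames, which is then scaled to bytes.
    """
    frames = mfn & -mfn
    return frames << PAGE_SHIFT


for mfn in corrupted_mfns:
    align = largest_alignment(mfn)
    print(
        f"mfn={mfn:#x} alignment={align >> 30} GiB "
        f"2M-aligned={align >= SUPERPAGE_2M} 1G-aligned={align >= SUPERPAGE_1G}"
    )
```

Under these assumptions every MFN in the list starts on a boundary of 4 GiB
or more, so each one is also the first frame of a potential 2 MiB or 1 GiB
superpage.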

> and every time I'm dropping the cache and reading this region, I'm getting
> DMAR faults on a few random addresses in the 1200000000-120000f000 range (I
> guess MFNs 0x1200000-0x120000f). MFNs 0x1200000-0x12000ff are not mapped to
> any PFN in Dom0 (based on xen-mfndump output).

It would be very interesting to figure out where those requests
originate, iow: which entity in Linux creates the bios with the
faulting address(es).
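
When matching the faults against xen-mfndump output, it may help to convert
the fault addresses to MFNs first. The sketch below does that for the
addresses in the log above, assuming 4 KiB pages (so MFN = address >> 12).
The function name addr_to_mfn is illustrative only, and the unmapped range is
the one reported in this thread, not something the script discovers.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages

# Fault addresses copied from the DMAR log quoted above.
fault_addrs = [
    0x1200000000,
    0x1200001000,
    0x1200006000,
    0x1200008000,
    0x1200009000,
    0x120000A000,
    0x120000C000,
]

# MFN range reported in this thread as not mapped to any Dom0 PFN (inclusive).
UNMAPPED_FIRST = 0x1200000
UNMAPPED_LAST = 0x12000FF


def addr_to_mfn(addr):
    """Convert a machine address to its machine frame number."""
    return addr >> PAGE_SHIFT


fault_mfns = [addr_to_mfn(a) for a in fault_addrs]

for addr, mfn in zip(fault_addrs, fault_mfns):
    in_hole = UNMAPPED_FIRST <= mfn <= UNMAPPED_LAST
    print(f"addr={addr:#x} mfn={mfn:#x} in-unmapped-range={in_hole}")
```

All seven addresses land inside the range that was reported as unmapped,
which is consistent with the "PTE Write access is not set" reason in the log.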

It's a wild guess, but could you try to boot Linux with swiotlb=force
on the command line and attempt to trigger the issue?  I wonder
whether imposing the usage of the swiotlb will surface the issues as
CPU accesses, rather than IOMMU faults, and that could get us a trace
inside Linux of how those requests are generated.

> On the other hand, I'm not getting these DMAR faults while reading other
> regions. Also I can't trigger the bug with reversed Dom0 mapping, even if I
> fill the page cache with reads.

There's possibly some condition we are missing that causes a component
in Linux to assume the next address is mfn + 1, instead of doing the
full address translation from the linear or pfn space.
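
As a toy illustration of the kind of assumption I mean (this is not actual
Linux code, and the p2m contents below are made-up values), compare deriving
the frames of a multi-page buffer by incrementing the first MFN with
translating every PFN individually:

```python
# Hypothetical pfn -> mfn table; the values are invented for illustration
# and deliberately not machine-contiguous.
p2m = {
    0x10: 0x5000,
    0x11: 0x9000,
    0x12: 0x5002,
}


def mfns_assuming_contiguity(first_pfn, count):
    """Suspected buggy pattern: translate once, then assume mfn + 1."""
    base = p2m[first_pfn]
    return [base + i for i in range(count)]


def mfns_by_translation(first_pfn, count):
    """Safe pattern on PV: translate each pfn on its own."""
    return [p2m[first_pfn + i] for i in range(count)]


assumed = mfns_assuming_contiguity(0x10, 3)
actual = mfns_by_translation(0x10, 3)

for i, (a, b) in enumerate(zip(assumed, actual)):
    status = "ok" if a == b else "MISMATCH"
    print(f"page {i}: assumed mfn={a:#x} actual mfn={b:#x} {status}")
```

With this table the second page differs between the two approaches, so a
device programmed with the assumed address would target a frame that does not
belong to the buffer.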

Thanks, Roger.



 

