
Re: [BUG]SMMU-V3 queue need no-cache memory



On Wed, 7 Dec 2022, Julien Grall wrote:
> Hi,
> 
> I only noticed this e-mail because I was skimming xen-devel. If you want to
> get our attention, then I would suggest to CC both of us because I (and I
> guess Stefano) have filter rules so those e-mails land directly in my inbox.
> 
> On 07/12/2022 10:24, Rahul Singh wrote:
> > > On 7 Dec 2022, at 2:04 am, sisyphean <sisyphean@zlw.email> wrote:
> > > 
> > > Hi,
> > > 
> > >      I am trying to run Xen on my ARM board (sorry, for commercial
> > >      reasons I can't say which platform) with SMMUv3 enabled, but all
> > >      commands in the cmdq fail when Xen starts.
> > > 
> > >      After tracing with a debugger, I found the cause: the queue in
> > >      the SMMUv3 driver is not in non-cacheable memory, so after
> > >      arm_smmu_cmdq_build_cmd() has run, the command is still in the
> > >      cache. The SMMUv3 hardware therefore cannot fetch the correct
> > >      command from memory for execution.
> > 
> > Yes, you are right. As of now we allocate the memory for the command
> > queue via _xzalloc(), which returns cached memory, and that is why you
> > are observing the issue. We have tested the Xen SMMUv3 driver on an SoC
> > where the SMMUv3 HW is in the coherency domain, so we have not
> > encountered this issue.
> > 
> > I think in your case the SMMUv3 HW is not in the coherency domain. Please
> > confirm from your side that the "dma-coherent" property is not set in
> > the DT.
> > 
> > I think there is no function available as of now to request Xen to allocate
> > memory that is not cached.
> 
> You are correct.
> 
> > 
> > @Julien and @Stefano, do you have any suggestions on how we can request
> > memory from Xen that is not cached, something like dma_alloc_coherent()
> > in Linux?
> 
> At the moment all the RAM is mapped cacheable in Xen. So it will require some
> work to have some memory uncacheable.
> 
> There are two options:
>  1) Allocate a pool of memory at boot time that will be mapped with different
> memory attributes. This means we would need a separate pool and the user will
> have to size it.
>  2) Modify the caching attributes of the memory after allocation and then
> revert them after freeing. The downside is that we would end up shattering
> superpages. We also can't re-create superpages (yet), but that might be fine
> if the memory is never freed.
> 
> Option two would probably be the best. But before going that route I have one
> question...
> 
> > The temporary solution I use is to execute function clean_dcache every
> > time cmd is copied to cmdq in function queue_write. But it is obvious
> > that this will seriously affect the efficiency.
> 
> I agree you will see some performance impact in micro-benchmarks. But I am not
> sure about normal use-cases. How often do you expect the command queue to be
> used?

That is a good question. But even for the micro-benchmark, is the
difference significant? 

My gut feeling (to be discussed and confirmed) is that for this use-case
it might not be worth doing option 1) or option 2) above. Calling
clean_dcache as needed might be good enough?
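
To make the "clean as needed" idea concrete, here is a minimal sketch, not
the actual driver code. It assumes a per-device "coherent" flag (for example
derived from the "dma-coherent" DT property) and skips the clean when the
SMMU can snoop the CPU caches. All sketch_* names are hypothetical, and
sketch_clean_dcache_range() is a stub that only counts calls. In Xen the
stub would be replaced by the real cache-maintenance helper, and a real
driver would also handle endianness conversion, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bookkeeping for the stub below, so the behaviour can be observed. */
static unsigned int sketch_clean_calls;
static size_t sketch_clean_bytes;

/*
 * Hypothetical stand-in for a real "clean data cache by VA range" helper.
 * It performs no cache maintenance; it only records that it was called.
 */
static void sketch_clean_dcache_range(const void *addr, size_t size)
{
    (void)addr;
    sketch_clean_calls++;
    sketch_clean_bytes += size;
}

/*
 * Copy one queue entry of n_dwords 64-bit words into the queue slot at dst.
 * If the device is not cache-coherent, clean the freshly written slot so
 * that the hardware would see the data in memory rather than stale content.
 */
static void sketch_queue_write(uint64_t *dst, const uint64_t *src,
                               size_t n_dwords, bool coherent)
{
    size_t i;

    assert(dst != NULL && src != NULL);

    for (i = 0; i < n_dwords; i++)
        dst[i] = src[i];

    if (!coherent)
        sketch_clean_dcache_range(dst, n_dwords * sizeof(*dst));
}
```

With this shape, a coherent SMMU pays nothing extra, and a non-coherent one
pays one clean per queue entry written, which is the cost being asked about
above.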


> Also, I am a bit surprised you are seeing issues with the command queue but
> not with the stage-2 page-tables. Does your SMMU support coherent walk but
> cannot snoop for the command queue?

 

