[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/2 v4] iommu/amd: Fix logic for clearing the IOMMU interrupt bits



>>> On 10.06.13 at 15:49, George Dunlap <George.Dunlap@xxxxxxxxxxxxx> wrote:
> On Mon, Jun 10, 2013 at 1:41 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>> The IOMMU interrupt bits in the IOMMU status registers are
>> "read-only, and write-1-to-clear (RW1C).  Therefore, the existing
>> logic which reads the register, set the bit, and then writing back
>> the values could accidentally clear certain bits if it has been set.
>>
>> The correct logic would just be writing only the value which only
>> set the interrupt bits, and leave the rest to zeros.
>>
>> This patch also, clean up #define masks as Jan has suggested.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
>>
>> With iommu_interrupt_handler() properly having got switched its readl()
>> from status to control register, the subsequent writel() needed to be
>> switched too (and the RW1C comment there was bogus).
>>
>> Some of the cleanup went too far - undone.
>>
>> Further, with iommu_interrupt_handler() now actually disabling the
>> interrupt sources, they also need to get re-enabled by the tasklet once
>> it finished processing the respective log.
>>
>> Finally, guest write emulation to the status register needs to be done
>> with the RW1C (and RO for all other bits) semantics in mind too.
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> 
> What's the impact of this as a bug?

The primary issue really comes very close to a security one: The
workaround to the live lock (observed on VT-d iirc) consisting of
disabling the interrupt and deferring the actual handling to a
tasklet might have been completely broken. Whether that's
actually the case very much depends on how close documentation
documents actual hardware: If the wording there is precise, we'd
be susceptible to this on the event log side, but be safe from this
for the PPR log.

> This looks like it has a fairly high risk of introducing regressions
> in existing working system.  So unless it has a pretty wide impact, I
> think we should wait and include this in 4.3.1.

If it turns out the documentation is imprecise in the way that we
would hope for, I'm fine with postponing this, as then the only real
problem would be that we might be seeing too few interrupts (i.e.
there might be a silently non-functional device somewhere in the
system because of this, but there wouldn't be any risk to the
system as a whole). So when to apply this depends heavily on AMD
clarifying actual hardware behavior vs. documentation.

But of course you also need to view this in the context of patch
2, which is something I think we want to have in 4.3 (but which
will [minimally] need reworking if this patch it to be deferred).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.