
Re: [Xen-devel] [PATCH][XSA-126] xen: limit guest control of PCI command register



On Mon, Jun 08, 2015 at 08:42:57AM +0100, Jan Beulich wrote:
> >>> On 07.06.15 at 08:23, <mst@xxxxxxxxxx> wrote:
> > On Mon, Apr 20, 2015 at 04:32:12PM +0200, Michael S. Tsirkin wrote:
> >> On Mon, Apr 20, 2015 at 03:08:09PM +0100, Jan Beulich wrote:
> >> > >>> On 20.04.15 at 15:43, <mst@xxxxxxxxxx> wrote:
> >> > > On Mon, Apr 13, 2015 at 01:51:06PM +0100, Jan Beulich wrote:
> >> > >> >>> On 13.04.15 at 14:47, <mst@xxxxxxxxxx> wrote:
> >> > >> > Can you check device capabilities register, offset 0x4 within
> >> > >> > pci express capability structure?
> >> > >> > Bit 15 is Role-Based Error Reporting.
> >> > >> > Is it set?
> >> > >> > 
> >> > >> > The spec says:
> >> > >> > 
> >> > >> >     15
> >> > >> >     On platforms where robust error handling and PC-compatible
> >> > >> >     Configuration Space probing is required, it is suggested that
> >> > >> >     software or firmware have the Unsupported Request Reporting
> >> > >> >     Enable bit Set for Role-Based Error Reporting Functions, but
> >> > >> >     clear for 1.0a Functions. Software or firmware can distinguish
> >> > >> >     the two classes of Functions by examining the Role-Based Error
> >> > >> >     Reporting bit in the Device Capabilities register.
> >> > >> 
> >> > >> Yes, that bit is set.
> >> > > 
> >> > > curiouser and curiouser.
> >> > > 
> >> > > So with functions that do support Role-Based Error Reporting, we have
> >> > > this:
> >> > > 
> >> > > 
> >> > >        With device Functions implementing Role-Based Error Reporting,
> >> > >        setting the Unsupported Request Reporting Enable bit will not
> >> > >        interfere with PC-compatible Configuration Space probing,
> >> > >        assuming that the severity for UR is left at its default of
> >> > >        non-fatal. However, setting the Unsupported Request Reporting
> >> > >        Enable bit will enable the Function to report UR errors
> >> > >        detected with posted Requests, helping avoid this case for
> >> > >        potential silent data corruption.
> >> > 
> >> > I still don't see what the PC-compatible config space probing has to
> >> > do with our issue.
> >> 
> >> I'm not sure but I think it's listed here because it causes a ton of URs
> >> when device scan probes unimplemented functions.
> >> 
> >> > > Did firmware reconfigure this device to report URs as fatal errors
> >> > > then?
> >> > 
> >> > No, the Unsupported Request Error Severity flag is zero.
> >> 
> >> OK, that's the correct configuration, so how come the box crashes when
> >> there's a UR then?
> > 
> > Ping - any update on this?
> 
> Not really. All we concluded so far is that _maybe_ the bridge, upon
> seeing the UR, generates a Master Abort, rendering the whole thing
> fatal.

But Master Abort is the equivalent of the UR, so I think that a
reasonable system would not be configured to trigger a fatal error
in this case - and you previously said it's configured reasonably.
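
For reference, here is a rough sketch of how the "configured reasonably"
claim could be re-checked from a raw config space dump. It is only an
illustration: the helper names (find_cap, find_ext_cap, ur_config) are
made up for this mail, and it assumes the dump is the 4096-byte
little-endian image of the function's config space. It reports whether
Role-Based Error Reporting is advertised, whether Unsupported Request
Reporting Enable is set in Device Control, and whether AER marks UR as
fatal.

```python
import struct

PCI_CAPABILITY_LIST = 0x34   # Capabilities Pointer in the standard header
PCI_CAP_ID_EXP = 0x10        # PCI Express capability
PCI_EXT_CAP_ID_AER = 0x0001  # Advanced Error Reporting extended capability

EXP_DEVCAP = 0x04            # Device Capabilities, relative to the PCIe capability
EXP_DEVCTL = 0x08            # Device Control, relative to the PCIe capability
AER_UNCOR_SEVER = 0x0C       # Uncorrectable Error Severity, relative to AER

DEVCAP_RBER = 1 << 15        # Role-Based Error Reporting
DEVCTL_URRE = 1 << 3         # Unsupported Request Reporting Enable
AER_UNC_UNSUP = 1 << 20      # Unsupported Request Error Severity (1 = fatal)


def find_cap(cfg, cap_id):
    """Walk the standard capability list; return the offset or None."""
    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC
    for _ in range(48):                      # bound the walk against loops
        if pos < 0x40 or pos + 1 >= len(cfg):
            return None
        if cfg[pos] == cap_id:
            return pos
        pos = cfg[pos + 1] & 0xFC
    return None


def find_ext_cap(cfg, cap_id):
    """Walk the extended capability list starting at 0x100."""
    pos = 0x100
    for _ in range(960):                     # bound the walk against loops
        if pos < 0x100 or pos + 4 > len(cfg):
            return None
        (hdr,) = struct.unpack_from("<I", cfg, pos)
        if hdr in (0, 0xFFFFFFFF):
            return None
        if hdr & 0xFFFF == cap_id:
            return pos
        pos = (hdr >> 20) & 0xFFC            # next pointer, dword aligned
    return None


def ur_config(cfg):
    """Summarize the UR-related settings, or None without a PCIe capability."""
    exp = find_cap(cfg, PCI_CAP_ID_EXP)
    if exp is None or exp + EXP_DEVCTL + 2 > len(cfg):
        return None
    (devcap,) = struct.unpack_from("<I", cfg, exp + EXP_DEVCAP)
    (devctl,) = struct.unpack_from("<H", cfg, exp + EXP_DEVCTL)
    ur_fatal = None                          # unknown when AER is not visible
    aer = find_ext_cap(cfg, PCI_EXT_CAP_ID_AER)
    if aer is not None and aer + AER_UNCOR_SEVER + 4 <= len(cfg):
        (sever,) = struct.unpack_from("<I", cfg, aer + AER_UNCOR_SEVER)
        ur_fatal = bool(sever & AER_UNC_UNSUP)
    return {
        "role_based": bool(devcap & DEVCAP_RBER),
        "ur_reporting_enabled": bool(devctl & DEVCTL_URRE),
        "ur_severity_fatal": ur_fatal,
    }
```

On a Linux dom0 the dump could presumably come from the sysfs "config"
file of the device, read as root so that the extended space is included;
that source is my assumption, not something established in this thread.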

> Otoh the respective root port also has
> - Received Master Abort set in its Secondary Status register (but
>   that's also already the case in the log that we have before the UR
>   occurs, i.e. that doesn't mean all that much),
> - Received System Error set in its Secondary Status register (and
>   after the UR the sibling endpoint [UR originating from 83:00.0,
>   sibling being 83:00.1] also shows Signaled System Error set).

It's another function of the same physical device, correct?
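
To make the question concrete, a small sketch of what I mean by
"sibling", plus decoding of the error bits mentioned above. The function
names are hypothetical, the PCI domain is ignored, and the config image
is assumed to be a little-endian dump of at least the first 64 bytes.

```python
import struct

PCI_STATUS = 0x06             # Status register (all header types)
PCI_SEC_STATUS = 0x1E         # Secondary Status register (type 1 header only)

STATUS_SIG_SYSTEM_ERROR = 1 << 14   # Status: Signaled System Error
STATUS_REC_MASTER_ABORT = 1 << 13   # Status: Received Master Abort
SEC_REC_SYSTEM_ERROR = 1 << 14      # Secondary Status: Received System Error
SEC_REC_MASTER_ABORT = 1 << 13      # Secondary Status: Received Master Abort


def parse_bdf(bdf):
    """Split 'bb:dd.f' (optionally prefixed by a domain) into integers."""
    addr, func = bdf.rsplit(".", 1)
    fields = addr.split(":")
    return int(fields[-2], 16), int(fields[-1], 16), int(func, 16)


def are_siblings(a, b):
    """True if a and b are different functions of the same bus/device."""
    pa, pb = parse_bdf(a), parse_bdf(b)
    return pa[:2] == pb[:2] and pa[2] != pb[2]


def error_flags(cfg, is_bridge=False):
    """Decode the error bits discussed above from a config space image."""
    (status,) = struct.unpack_from("<H", cfg, PCI_STATUS)
    flags = {
        "signaled_system_error": bool(status & STATUS_SIG_SYSTEM_ERROR),
        "received_master_abort": bool(status & STATUS_REC_MASTER_ABORT),
    }
    if is_bridge:  # root ports and bridges also carry a Secondary Status
        (sec,) = struct.unpack_from("<H", cfg, PCI_SEC_STATUS)
        flags["sec_received_system_error"] = bool(sec & SEC_REC_SYSTEM_ERROR)
        flags["sec_received_master_abort"] = bool(sec & SEC_REC_MASTER_ABORT)
    return flags
```

With this, are_siblings("83:00.0", "83:00.1") holds, which is the
relationship I am asking you to confirm.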

So is this sibling the only function sending SERR?
What happens if you disable SERR# in the command register
of 83:00.1?
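
In case it helps with the experiment, a minimal sketch of the change I
have in mind: clear SERR# Enable (bit 8) in the 16-bit Command register
at offset 0x04 and leave every other bit alone. The helper names are
hypothetical, and the code works on any seekable binary file object, so
it can be tried on an in-memory copy first.

```python
import io
import struct

PCI_COMMAND = 0x04            # Command register offset in the header
PCI_COMMAND_SERR = 1 << 8     # SERR# Enable


def clear_serr(command):
    """Return the Command value with SERR# Enable cleared."""
    return command & ~PCI_COMMAND_SERR & 0xFFFF


def disable_serr(cfg_file):
    """Clear SERR# Enable through a seekable binary file object.

    Returns (old, new) Command values; writes only if the bit was set.
    """
    cfg_file.seek(PCI_COMMAND)
    (old,) = struct.unpack("<H", cfg_file.read(2))
    new = clear_serr(old)
    if new != old:
        cfg_file.seek(PCI_COMMAND)
        cfg_file.write(struct.pack("<H", new))
    return old, new


if __name__ == "__main__":
    # Dry run against an in-memory image with a made-up Command value.
    image = io.BytesIO(bytes(4) + struct.pack("<H", 0x0147) + bytes(58))
    old, new = disable_serr(image)
    print(f"Command {old:#06x} -> {new:#06x}")
```

On a Linux dom0, the file object could presumably be the device's sysfs
"config" file opened as root in "r+b" mode with buffering disabled; the
exact path, including the PCI domain, is an assumption on my side.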



> > So can we chalk this up to hardware bugs on a specific box?
> 
> I have to admit that I'm still very uncertain whether to consider all
> this correct behavior, a firmware flaw, or a hardware bug.
> 
> Jan

Questions:
1.  Does this only happen with a specific endpoint?
    What if you add another endpoint to the same system?
2.  Has a driver initialized this endpoint? What if you don't
    load a driver before sending the transaction resulting in the UR?


-- 
MST

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

