Re: Config space access to Mediatek MT7922 doesn't work after device reset in Xen PV dom0 (regression, Linux 6.12)



On Wed, Jan 29, 2025 at 03:10:49AM +0100, Marek Marczykowski-Górecki wrote:
> On Tue, Jan 28, 2025 at 07:15:26PM -0600, Bjorn Helgaas wrote:
> > On Fri, Jan 17, 2025 at 01:05:30PM +0100, Marek Marczykowski-Górecki wrote:
> > > After updating PV dom0 to Linux 6.12, the Mediatek MT7922 device reports
> > > all 0xff when accessing its config space. This happens only after device
> > > reset (which is also triggered when binding the device to the
> > > xen-pciback driver).
> > 
> > Thanks for the report and for all the debugging you've already done!
> > 
> > > Reproducer:
> > > 
> > >     # lspci -xs 01:00.0
> > >     01:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter
> > >     00: c3 14 16 06 00 00 10 00 00 00 80 02 10 00 00 00
> > >     ...
> > >     # echo 1 > /sys/bus/pci/devices/0000:01:00.0/reset
> > >     # lspci -xs 01:00.0
> > >     01:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter
> > >     00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > >
> > > The same operation done on Linux 6.12 running without Xen works fine.
> > > 
> > > git bisect points at:
> > > 
> > >     commit d591f6804e7e1310881c9224d72247a2b65039af
> > >     Author: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> > >     Date:   Tue Aug 27 18:48:46 2024 -0500
> > > 
> > >     PCI: Wait for device readiness with Configuration RRS
> > > 
> > > part of that commit:
> > > @@ -1311,9 +1320,15 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
> > >                         return -ENOTTY;
> > >                 }
> > >  
> > > -               pci_read_config_dword(dev, PCI_COMMAND, &id);
> > > -               if (!PCI_POSSIBLE_ERROR(id))
> > > -                       break;
> > > +               if (root && root->config_crs_sv) {
> > > +                       pci_read_config_dword(dev, PCI_VENDOR_ID, &id);
> > > +                       if (!pci_bus_crs_vendor_id(id))
> > > +                               break;
> > > +               } else {
> > > +                       pci_read_config_dword(dev, PCI_COMMAND, &id);
> > > +                       if (!PCI_POSSIBLE_ERROR(id))
> > > +                               break;
> > > +               }
> > >  
> > >     
> > > With some debugging added, the PCI_VENDOR_ID read in pci_dev_wait()
> > > initially returns 0xffffffff. If I extend the condition with
> > > "&& !PCI_POSSIBLE_ERROR(id)", then the issue disappears. But reading the
> > > patch description, that would break VFs.
> > > I'm not sure where the issue is, but given it breaks only when running
> > > with Xen, I guess something is wrong with "Configuration RRS Software
> > > Visibility" in that case.
> > 
> > I'm missing something.  If you get 0xffffffff, that is not the 0x0001
> > Vendor ID, so pci_dev_wait() should exit immediately.  
> 
> I'm not sure what is going on there either, but my _guess_ is that the
> loop exits too early due to the above, and that makes some later
> operations fail.

Seems like a good guess worth investigating.  Maybe log all config
accesses to this device after the FLR and see what we're doing?
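
To make the exit condition discussed above concrete, here is a minimal
userspace model of the two readiness checks in pci_dev_wait(). It is a
sketch, not kernel code: the helper names is_rrs_vendor_id(),
is_possible_error() and wait_loop_would_exit() are hypothetical, and the
model assumes that 0x0001 is the reserved Vendor ID a device returns while
it is still answering with RRS (with RRS Software Visibility enabled).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: reserved Vendor ID seen when a Vendor ID read completes
 * with RRS and RRS Software Visibility is enabled. */
#define RRS_VENDOR_ID 0x0001u

/* Hypothetical stand-in for the kernel's pci_bus_crs_vendor_id():
 * only the low 16 bits (the Vendor ID field) are compared. */
static bool is_rrs_vendor_id(uint32_t id)
{
	return (id & 0xffffu) == RRS_VENDOR_ID;
}

/* Hypothetical stand-in for PCI_POSSIBLE_ERROR(): an all-ones dword. */
static bool is_possible_error(uint32_t id)
{
	return id == UINT32_MAX;
}

/*
 * Returns true if one iteration of the wait loop would break out.
 * 'id' is the dword the loop just read: at PCI_VENDOR_ID when RRS SV
 * is enabled, at PCI_COMMAND otherwise.
 */
static bool wait_loop_would_exit(bool rrs_sv_enabled, uint32_t id)
{
	if (rrs_sv_enabled)
		return !is_rrs_vendor_id(id);
	return !is_possible_error(id);
}
```

In this model, wait_loop_would_exit(true, 0xffffffff) is true, so the RRS
branch treats an all-ones read as "ready", while
wait_loop_would_exit(false, 0xffffffff) is false and the older check would
keep polling. That is consistent with the report that adding
"&& !PCI_POSSIBLE_ERROR(id)" makes the problem go away, though it does not
say why the read returns all ones under Xen in the first place.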



 

