[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 5/9] xen/vpci: add handlers to map the BARs
On Fri, Jun 23, 2017 at 02:58:28AM -0600, Jan Beulich wrote: > >>> On 22.06.17 at 19:13, <roger.pau@xxxxxxxxxx> wrote: > > On Fri, May 19, 2017 at 09:21:56AM -0600, Jan Beulich wrote: > >> >>> On 27.04.17 at 16:35, <roger.pau@xxxxxxxxxx> wrote: > >> > +static int vpci_modify_bars(struct pci_dev *pdev, const bool map) > >> > +{ > >> > + struct vpci_header *header = &pdev->vpci->header; > >> > + unsigned int i; > >> > + int rc = 0; > >> > + > >> > + for ( i = 0; i < ARRAY_SIZE(header->bars); i++ ) > >> > + { > >> > + paddr_t gaddr = map ? header->bars[i].gaddr > >> > + : header->bars[i].mapped_addr; > >> > + paddr_t paddr = header->bars[i].paddr; > >> > + > >> > + if ( header->bars[i].type != VPCI_BAR_MEM && > >> > + header->bars[i].type != VPCI_BAR_MEM64_LO ) > >> > + continue; > >> > + > >> > + rc = modify_mmio(pdev->domain, _gfn(PFN_DOWN(gaddr)), > >> > + _mfn(PFN_DOWN(paddr)), > >> > PFN_UP(header->bars[i].size), > >> > >> The PFN_UP() indicates a problem: For sub-page BARs you can't > >> blindly map/unmap them without taking into consideration other > >> devices sharing the same page. > > > > I'm not sure I follow, the start address of BARs is always aligned to > > a 4KB boundary, so there's no chance of the same page being used by > > two different BARs at the same time. > > I'm not sure where you're taking this from. Modern BIOSes may > aim at doing so, but for one I'm sure I've seen smaller alignment > quite often on older machines, and then my most modern AMD > one has these three devices, for example: Right, I guess I will have to somehow check for overlapping regions, how inconvenient. > 00:11.0 SATA controller: Advanced Micro Devices [AMD] nee ATI > SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (prog-if 01 [AHCI 1.0]) > Subsystem: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 SATA > Controller [AHCI mode] > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- > Stepping- SERR- FastB2B- DisINTx- > Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- > <TAbort- <MAbort- >SERR- <PERR- INTx- > Latency: 64 > Interrupt: pin A routed to IRQ 22 > Region 0: I/O ports at 2430 [size=8] > Region 1: I/O ports at 2424 [size=4] > Region 2: I/O ports at 2428 [size=8] > Region 3: I/O ports at 2420 [size=4] > Region 4: I/O ports at 2400 [size=16] > Region 5: Memory at c8014000 (32-bit, non-prefetchable) [size=1K] > Capabilities: [60] Power Management version 2 > Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA > PME(D0-,D1-,D2-,D3hot-,D3cold-) > Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- > Capabilities: [70] SATA HBA v1.0 InCfgSpace > Kernel driver in use: ahci > Kernel modules: ahci > > 00:12.2 USB controller: Advanced Micro Devices [AMD] nee ATI > SB7x0/SB8x0/SB9x0 USB EHCI Controller (prog-if 20 [EHCI]) > Subsystem: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB > EHCI Controller > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- > Stepping- SERR- FastB2B- DisINTx- > Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- > <TAbort- <MAbort- >SERR- <PERR- INTx- > Latency: 64, Cache Line Size: 32 bytes > Interrupt: pin B routed to IRQ 17 > Region 0: Memory at c8014400 (32-bit, non-prefetchable) [size=256] > Capabilities: [c0] Power Management version 2 > Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA > PME(D0+,D1+,D2+,D3hot+,D3cold-) > Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- > Bridge: PM- B3+ > Capabilities: [e4] Debug port: BAR=1 offset=00e0 > Kernel driver in use: ehci_hcd > Kernel modules: ehci-hcd > > 00:13.2 USB controller: Advanced Micro Devices [AMD] nee ATI > SB7x0/SB8x0/SB9x0 USB EHCI Controller (prog-if 20 [EHCI]) > Subsystem: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB > EHCI Controller > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- > Stepping- SERR- FastB2B- DisINTx- > Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- > <TAbort- <MAbort- >SERR- <PERR- INTx- > Latency: 64, Cache Line Size: 32 bytes > Interrupt: pin B routed to IRQ 19 > Region 0: Memory at c8014800 (32-bit, non-prefetchable) [size=256] > Capabilities: [c0] Power Management version 2 > Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA > PME(D0+,D1+,D2+,D3hot+,D3cold-) > Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- > Bridge: PM- B3+ > Capabilities: [e4] Debug port: BAR=1 offset=00e0 > Kernel driver in use: ehci_hcd > Kernel modules: ehci-hcd > > > The size is indeed not aligned to 4KB, but I don't see how this can > > cause collisions with other BARs unless the domain is actively trying > > to make the BARs overlap, in which case there's not much Xen can do. > > The above is not what Dom0 did, but how the system boots up. > And this "there's not much Xen can do" is what I've been trying > to get at with my comment: A solution is needed here for your > approach to vPCI handling to be viable. Checking for overlap seem to be the only sensible option here. Xen is in no position to relocate BARs. > >> > +static int vpci_bar_write(struct pci_dev *pdev, unsigned int reg, > >> > + union vpci_val val, void *data) > >> > +{ > >> > + struct vpci_bar *bar = data; > >> > + uint32_t wdata = val.double_word; > >> > + bool hi = false, unset = false; > >> > + > >> > + ASSERT(bar->type == VPCI_BAR_MEM || bar->type == VPCI_BAR_MEM64_LO > >> > || > >> > + bar->type == VPCI_BAR_MEM64_HI); > >> > + > >> > + if ( wdata == GENMASK(31, 0) ) > >> > >> I'm afraid this again doesn't match real hardware behavior: As the > >> low bits are r/o, writes with them having any value, but all other > >> bits being 1 should have the same effect. I notice that while I had > >> fixed this for the ROM BAR in Linux'es pciback, I should have also > >> fixed this for ordinary ones. > > > > I've changed this to: > > > > switch ( bar->type ) > > { > > case VPCI_BAR_MEM: > > size_mask = GENMASK(31, 12); > > Relating to the comment further up - where's this 12 coming from? Hm, this is from the "PCI Express Technology" Mindshare book, which states that for 32bit memory BARs bits [11,4] are hardcoded to 0 and for 64bit BARs bits [25,4] are also hardcoded to 0. I've been searching for such statement in the PCI Local Bus specification version 3.0, but I don't seem to be able to find any references to this. I guess I will do: switch ( bar->type ) { case VPCI_BAR_MEM32: case VPCI_BAR_MEM64_LO: size_mask = PCI_BASE_ADDRESS_MEM_MASK; break; case VPCI_BAR_MEM64_HI: size_mask = ~0u; default: ASSERT_UNREACHABLE(); return; } > > break; > > case VPCI_BAR_MEM64_LO: > > size_mask = GENMASK(31, 26); > > And this 26? > > > break; > > case VPCI_BAR_MEM64_HI: > > size_mask = GENMASK(31, 0); > > break; > > default: > > ASSERT_UNREACHABLE(); > > break; > > You want to return here. > > >> > + } > >> > + > >> > + ASSERT(IS_ALIGNED(bar->gaddr, PAGE_SIZE)); > >> > >> Urgh. > > > > Removed. > > With your comment further up, you should have refused to do so > (i.e. I'm getting the impression you're not really sure about that > supposed 4k alignment). No, it's not 4K aligned. > >> > + if ( (bars[i].type == VPCI_BAR_MEM && addr == GENMASK(31, 12)) > >> > || > >> > + addr == GENMASK(63, 26) ) > >> > >> Where is this 26 coming from? > >> Perhaps > >> > >> if ( addr == GENMASK(bars[i].type == VPCI_BAR_MEM ? 31 : 63, 12) ) > > > > I'm checking the memory decode bit here instead in order to figure out > > if the BAR is not positioned. > > > >> ? Albeit I'm unconvinced GENMASK() is useful to be used here anyway > >> (see also below). > > > > Right, regardless of the specific usage above, what would you > > recommend regarding the usage of GENMASK? > > > > Julien suggested introducing GENMASK_ULL. Should I go that route, or > > introduce something locally for vPCI? > > Back when GENMASK() was introduced to our code base I've > already indicated that I'm not really in favor of it. I don't think > it really helps readability all that much (to me, plain hex > numbers are easier to grok, albeit I admit ones extending > beyond 8 or 10 digits are less easy to digest; sadly the once > proposed [by Intel, I think, in the early ia64 days] language > extension to permit _ separators in numbers doesn't appear > to have made it anywhere). I could also switch GENMASK to use long long instead, but I'm not sure if that's going to break existing callers. Let me try to see if I can get away without using it (although I kind of liked it for coding masks). > >> > + } bars[6]; > >> > >> What about the ROM and SR-IOV ones? > > > > I've implemented support for the expansion ROM BAR (which I still need > > to figure out how to test), > > There should be hardly any graphics card without a ROM. For > remote boot purposes also most NICs come with a ROM, albeit > many BIOSes allow turning it off. Most SCSI cards I've seem > have a (configuration) ROM too. OK, but testing NICs ROMs is going to be impossible from a PVH Dom0. I guess graphics cards, although most of my boxes are headless. > > but I would like to defer SR-IOV for later > > because it involves a non-trivial amount of work, and with this series > > one can already boot a PVH Dom0 (minus SR-IOV of course). > > That's likely okay as long as there's a suitable, much beloved > "fixme" comment somewhere. OK, the vpci.h header seems like the best place to add such a fixme comment. Thanks, Roger. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |