[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH for-4.17] x86/efi: don't translate EFI memory map MMIO regions to e820 RESERVED
On Tue, Oct 04, 2022 at 03:15:02PM +0200, Jan Beulich wrote: > On 04.10.2022 14:59, Roger Pau Monné wrote: > > On Tue, Oct 04, 2022 at 02:21:20PM +0200, Jan Beulich wrote: > >> On 04.10.2022 14:17, Roger Pau Monné wrote: > >>> On Tue, Oct 04, 2022 at 12:40:10PM +0200, Jan Beulich wrote: > >>>> On 04.10.2022 11:27, Roger Pau Monné wrote: > >>>>> On Tue, Oct 04, 2022 at 11:01:18AM +0200, Jan Beulich wrote: > >>>>>> On 30.09.2022 16:17, Roger Pau Monne wrote: > >>>>>>> The EFI memory map contains two memory types (EfiMemoryMappedIO and > >>>>>>> EfiMemoryMappedIOPortSpace) used to describe IO memory areas of > >>>>>>> devices used by EFI. > >>>>>>> > >>>>>>> The current parsing of the EFI memory map was translating > >>>>>>> EfiMemoryMappedIO and EfiMemoryMappedIOPortSpace to E820_RESERVED on > >>>>>>> x86. This is an issue because device MMIO regions (BARs) should not > >>>>>>> be positioned on reserved regions. Any BARs positioned on non-hole > >>>>>>> areas of the memory map will cause is_memory_hole() to return false, > >>>>>>> which would then cause memory decoding to be disabled for such device. > >>>>>>> This leads to EFI firmware malfunctions when using runtime services. > >>>>>>> > >>>>>>> The system under which this was observed has: > >>>>>>> > >>>>>>> EFI memory map: > >>>>>>> [...] > >>>>>>> 00000fd000000-00000fe7fffff type=11 attr=800000000000100d > >>>>>>> [...] > >>>>>>> 0000:00:1f.5 disabled: BAR [0xfe010, 0xfe010] overlaps with memory map > >>>>>>> > >>>>>>> The device behind this BAR is: > >>>>>>> > >>>>>>> 00:1f.5 Serial bus controller [0c80]: Intel Corporation Lewisburg SPI > >>>>>>> Controller (rev 09) > >>>>>>> Subsystem: Super Micro Computer Inc Device 091c > >>>>>>> Flags: fast devsel > >>>>>>> Memory at fe010000 (32-bit, non-prefetchable) [size=4K]well > >>>>>>> > >>>>>>> For the record, the symptom observed in that machine was a hard freeze > >>>>>>> when attempting to set an EFI variable (XEN_EFI_set_variable). > >>>>>>> > >>>>>>> Fix by not adding regions with type EfiMemoryMappedIO or > >>>>>>> EfiMemoryMappedIOPortSpace to the e820 memory map, that allows BARs to > >>>>>>> be positioned there. > >>>>>>> > >>>>>>> Fixes: 75cc460a1b ('xen/pci: detect when BARs are not suitably > >>>>>>> positioned') > >>>>>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > >>>>>> > >>>>>> In the best case this is moving us from one way of being wrong to > >>>>>> another: > >>>>>> So far we wrongly include BARs in E820_RESERVED (_if_ they can be > >>>>>> legitimately covered by a EfiMemoryMappedIO region in the first place, > >>>>>> which I'm not sure is actually permitted - iirc just like E820_RESERVED > >>>>>> may not be used for BARs, this memory type also may not be), whereas > >>>>>> with > >>>>>> your change we would no longer report non-BAR MMIO space (chipset > >>>>>> specific > >>>>>> ranges for example) as reserved. In fact I think the example you > >>>>>> provide > >>>>>> is at least partly due to bogus firmware behavior: The BAR is put in > >>>>>> space > >>>>>> normally used for firmware specific memory (MMIO) ranges. I think > >>>>>> firmware > >>>>>> should either assign the BAR differently or exclude the range from the > >>>>>> memory map. > >>>>> > >>>>> Hm, I'm not sure the example is bogus, how would firmware request a BAR > >>>>> to be mapped for run time services to access it otherwise if it's not > >>>>> using EfiMemoryMappedIO? > >>>>> > >>>>> Not adding the BAR to the memory map in any way would mean the OS is > >>>>> free to not map it for runtime services to access. > >>>> > >>>> My view is that BARs should not be marked for runtime services use. Doing > >>>> so requires awareness of the driver inside the OS, which I don't think > >>>> one can expect. If firmware needs to make use of a device in a system, it > >>>> ought to properly hide it from the OS. Note how the potential sharing of > >>>> an RTC requires special provisions in the spec, mandating driver > >>>> awareness. > >>>> > >>>> Having a BAR expressed in the memory map also contradicts the ability of > >>>> an OS to relocate all BARs of all devices, if necessary. > >>> > >>> I've failed to figure out if there's a way in UEFI to report a device > >>> is in use by the firmware. I've already looked before sending the > >>> patch (see also the post commit notes about for example not passing > >>> through the device to any guest for obvious reason). > >>> > >>> I've got no idea if Linux has any checks to avoid trying to move BARs > >>> residing in EfiMemoryMappedIO ranges, we have now observed this > >>> behavior in two systems already. > >>> > >>> Maybe we could do a special check for PCI devices and allow them > >>> having BARs in EfiMemoryMappedIO, together with printing a warning > >>> message. > >> > >> Right, that's one of the possible quirk workarounds I was thinking of. > >> At the risk of stating the obvious - the same would presumably apply to > >> E820_RESERVED on non-EFI systems then. > > > > One option would be to strictly limit to EfiMemoryMappedIO, by taking > > the EFI memory map into account also if present. > > > > Another maybe simpler option is to allow BARs to be placed in > > E820_RESERVED regions, and translate EfiMemoryMappedIO into > > E820_RESERVED like we have been doing. > > > > I will attempt the later if you are OK with the approach. > > I might be okay with the approach, but first of all I continue to be > worried of us violating specifications if we make this the default > behavior. In any case it would be the firmware violating the specification by not hiding those PCI devices, Xen is just trying to cope. Xen went from not checking the position of the BARs at all to enforcing them to not be placed over any regions on the memory map. I think we need to relax the checks a bit to match reality. Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |