[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH for-4.17] x86/efi: don't translate EFI memory map MMIO regions to e820 RESERVED
On 04.10.2022 15:55, Roger Pau Monné wrote: > On Tue, Oct 04, 2022 at 03:15:02PM +0200, Jan Beulich wrote: >> On 04.10.2022 14:59, Roger Pau Monné wrote: >>> On Tue, Oct 04, 2022 at 02:21:20PM +0200, Jan Beulich wrote: >>>> On 04.10.2022 14:17, Roger Pau Monné wrote: >>>>> On Tue, Oct 04, 2022 at 12:40:10PM +0200, Jan Beulich wrote: >>>>>> On 04.10.2022 11:27, Roger Pau Monné wrote: >>>>>>> On Tue, Oct 04, 2022 at 11:01:18AM +0200, Jan Beulich wrote: >>>>>>>> On 30.09.2022 16:17, Roger Pau Monne wrote: >>>>>>>>> The EFI memory map contains two memory types (EfiMemoryMappedIO and >>>>>>>>> EfiMemoryMappedIOPortSpace) used to describe IO memory areas of >>>>>>>>> devices used by EFI. >>>>>>>>> >>>>>>>>> The current parsing of the EFI memory map was translating >>>>>>>>> EfiMemoryMappedIO and EfiMemoryMappedIOPortSpace to E820_RESERVED on >>>>>>>>> x86. This is an issue because device MMIO regions (BARs) should not >>>>>>>>> be positioned on reserved regions. Any BARs positioned on non-hole >>>>>>>>> areas of the memory map will cause is_memory_hole() to return false, >>>>>>>>> which would then cause memory decoding to be disabled for such device. >>>>>>>>> This leads to EFI firmware malfunctions when using runtime services. >>>>>>>>> >>>>>>>>> The system under which this was observed has: >>>>>>>>> >>>>>>>>> EFI memory map: >>>>>>>>> [...] >>>>>>>>> 00000fd000000-00000fe7fffff type=11 attr=800000000000100d >>>>>>>>> [...] >>>>>>>>> 0000:00:1f.5 disabled: BAR [0xfe010, 0xfe010] overlaps with memory map >>>>>>>>> >>>>>>>>> The device behind this BAR is: >>>>>>>>> >>>>>>>>> 00:1f.5 Serial bus controller [0c80]: Intel Corporation Lewisburg SPI >>>>>>>>> Controller (rev 09) >>>>>>>>> Subsystem: Super Micro Computer Inc Device 091c >>>>>>>>> Flags: fast devsel >>>>>>>>> Memory at fe010000 (32-bit, non-prefetchable) [size=4K]well >>>>>>>>> >>>>>>>>> For the record, the symptom observed in that machine was a hard freeze >>>>>>>>> when attempting to set an EFI variable (XEN_EFI_set_variable). >>>>>>>>> >>>>>>>>> Fix by not adding regions with type EfiMemoryMappedIO or >>>>>>>>> EfiMemoryMappedIOPortSpace to the e820 memory map, that allows BARs to >>>>>>>>> be positioned there. >>>>>>>>> >>>>>>>>> Fixes: 75cc460a1b ('xen/pci: detect when BARs are not suitably >>>>>>>>> positioned') >>>>>>>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> >>>>>>>> >>>>>>>> In the best case this is moving us from one way of being wrong to >>>>>>>> another: >>>>>>>> So far we wrongly include BARs in E820_RESERVED (_if_ they can be >>>>>>>> legitimately covered by a EfiMemoryMappedIO region in the first place, >>>>>>>> which I'm not sure is actually permitted - iirc just like E820_RESERVED >>>>>>>> may not be used for BARs, this memory type also may not be), whereas >>>>>>>> with >>>>>>>> your change we would no longer report non-BAR MMIO space (chipset >>>>>>>> specific >>>>>>>> ranges for example) as reserved. In fact I think the example you >>>>>>>> provide >>>>>>>> is at least partly due to bogus firmware behavior: The BAR is put in >>>>>>>> space >>>>>>>> normally used for firmware specific memory (MMIO) ranges. I think >>>>>>>> firmware >>>>>>>> should either assign the BAR differently or exclude the range from the >>>>>>>> memory map. >>>>>>> >>>>>>> Hm, I'm not sure the example is bogus, how would firmware request a BAR >>>>>>> to be mapped for run time services to access it otherwise if it's not >>>>>>> using EfiMemoryMappedIO? >>>>>>> >>>>>>> Not adding the BAR to the memory map in any way would mean the OS is >>>>>>> free to not map it for runtime services to access. >>>>>> >>>>>> My view is that BARs should not be marked for runtime services use. Doing >>>>>> so requires awareness of the driver inside the OS, which I don't think >>>>>> one can expect. If firmware needs to make use of a device in a system, it >>>>>> ought to properly hide it from the OS. Note how the potential sharing of >>>>>> an RTC requires special provisions in the spec, mandating driver >>>>>> awareness. >>>>>> >>>>>> Having a BAR expressed in the memory map also contradicts the ability of >>>>>> an OS to relocate all BARs of all devices, if necessary. >>>>> >>>>> I've failed to figure out if there's a way in UEFI to report a device >>>>> is in use by the firmware. I've already looked before sending the >>>>> patch (see also the post commit notes about for example not passing >>>>> through the device to any guest for obvious reason). >>>>> >>>>> I've got no idea if Linux has any checks to avoid trying to move BARs >>>>> residing in EfiMemoryMappedIO ranges, we have now observed this >>>>> behavior in two systems already. >>>>> >>>>> Maybe we could do a special check for PCI devices and allow them >>>>> having BARs in EfiMemoryMappedIO, together with printing a warning >>>>> message. >>>> >>>> Right, that's one of the possible quirk workarounds I was thinking of. >>>> At the risk of stating the obvious - the same would presumably apply to >>>> E820_RESERVED on non-EFI systems then. >>> >>> One option would be to strictly limit to EfiMemoryMappedIO, by taking >>> the EFI memory map into account also if present. >>> >>> Another maybe simpler option is to allow BARs to be placed in >>> E820_RESERVED regions, and translate EfiMemoryMappedIO into >>> E820_RESERVED like we have been doing. >>> >>> I will attempt the later if you are OK with the approach. >> >> I might be okay with the approach, but first of all I continue to be >> worried of us violating specifications if we make this the default >> behavior. > > In any case it would be the firmware violating the specification by > not hiding those PCI devices, Xen is just trying to cope. > > Xen went from not checking the position of the BARs at all to > enforcing them to not be placed over any regions on the memory map. I > think we need to relax the checks a bit to match reality. Sure, as a workaround. Yet we don't want to relax too much, or else a primary purpose of the check will be lost. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |