Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
On Thu, Mar 6, 2025 at 3:02 PM Frediano Ziglio
<frediano.ziglio@xxxxxxxxx> wrote:
>
> On Thu, Mar 6, 2025 at 2:26 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
> >
> > On 26.02.2025 19:54, Marek Marczykowski-Górecki wrote:
> > > On Mon, Feb 24, 2025 at 02:31:00PM +0000, Frediano Ziglio wrote:
> > >> On Mon, Feb 24, 2025 at 1:16 PM Marek Marczykowski-Górecki
> > >> <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> > >>>
> > >>> On Mon, Feb 24, 2025 at 12:57:13PM +0000, Frediano Ziglio wrote:
> > >>>> On Fri, Feb 21, 2025 at 8:20 PM Marek Marczykowski-Górecki
> > >>>> <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> > >>>>>
> > >>>>> On Mon, Feb 17, 2025 at 04:26:59PM +0000, Frediano Ziglio wrote:
> > >>>>>> Although the code is compiled with the -fpic option, data is not
> > >>>>>> position independent. This causes data pointers to become invalid
> > >>>>>> if the code is not relocated properly, which is what happens for
> > >>>>>> efi_multiboot2, which is called by the multiboot entry code.
> > >>>>>>
> > >>>>>> Tested by adding
> > >>>>>> PrintErrMesg(L"Test message", EFI_BUFFER_TOO_SMALL);
> > >>>>>> in efi_multiboot2 before the call to efi_arch_edd (this function
> > >>>>>> can potentially call PrintErrMesg).
> > >>>>>>
> > >>>>>> Before the patch (XenServer installation on Qemu, xen replaced
> > >>>>>> with vanilla xen.gz):
> > >>>>>> Booting `XenServer (Serial)'Booting `XenServer (Serial)'
> > >>>>>> Test message: !!!! X64 Exception Type - 0E(#PF - Page-Fault) CPU Apic ID - 00000000 !!!!
> > >>>>>> ExceptionData - 0000000000000000 I:0 R:0 U:0 W:0 P:0 PK:0 SS:0 SGX:0
> > >>>>>> RIP - 000000007EE21E9A, CS - 0000000000000038, RFLAGS - 0000000000210246
> > >>>>>> RAX - 000000007FF0C1B5, RCX - 0000000000000050, RDX - 0000000000000010
> > >>>>>> RBX - 0000000000000000, RSP - 000000007FF0C180, RBP - 000000007FF0C210
> > >>>>>> RSI - FFFF82D040467CE8, RDI - 0000000000000000
> > >>>>>> R8 - 000000007FF0C1C8, R9 - 000000007FF0C1C0, R10 - 0000000000000000
> > >>>>>> R11 - 0000000000001020, R12 - FFFF82D040467CE8, R13 - 000000007FF0C1B8
> > >>>>>> R14 - 000000007EA33328, R15 - 000000007EA332D8
> > >>>>>> DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030
> > >>>>>> GS - 0000000000000030, SS - 0000000000000030
> > >>>>>> CR0 - 0000000080010033, CR2 - FFFF82D040467CE8, CR3 - 000000007FC01000
> > >>>>>> CR4 - 0000000000000668, CR8 - 0000000000000000
> > >>>>>> DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
> > >>>>>> DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
> > >>>>>> GDTR - 000000007F9DB000 0000000000000047, LDTR - 0000000000000000
> > >>>>>> IDTR - 000000007F48E018 0000000000000FFF, TR - 0000000000000000
> > >>>>>> FXSAVE_STATE - 000000007FF0BDE0
> > >>>>>> !!!! Find image based on IP(0x7EE21E9A) (No PDB) (ImageBase=000000007EE20000, EntryPoint=000000007EE23935) !!!!
> > >>>>>>
> > >>>>>> After the patch:
> > >>>>>> Booting `XenServer (Serial)'Booting `XenServer (Serial)'
> > >>>>>> Test message: Buffer too small
> > >>>>>> BdsDxe: loading Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
> > >>>>>> BdsDxe: starting Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
> > >>>>>>
> > >>>>>> This partially rolls back commit 00d5d5ce23e6.
> > >>>>>>
> > >>>>>> Fixes: 9180f5365524 ("x86: add multiboot2 protocol support for EFI platforms")
> > >>>>>> Signed-off-by: Frediano Ziglio <frediano.ziglio@xxxxxxxxx>
> > >>>>>
> > >>>>> I tried testing this patch, but it seems I cannot reproduce the
> > >>>>> original failure...
> > >>>>>
> > >>>>> I did as the commit message suggests here:
> > >>>>> https://gitlab.com/xen-project/people/marmarek/xen/-/commit/ca3d6911c448eb886990f33d4380b5646617a982
> > >>>>>
> > >>>>> With blexit() in PrintErrMesg(), it went back to the bootloader, so
> > >>>>> I'm sure this code path was reached. But with blexit() commented out,
> > >>>>> Xen started correctly both with and without this patch... The branch
> > >>>>> I used is here:
> > >>>>> https://gitlab.com/xen-project/people/marmarek/xen/-/commits/automation-tests?ref_type=heads
> > >>>>>
> > >>>>> Is there some extra condition needed to reproduce the issue? Maybe it
> > >>>>> depends on the compiler version? I guess I can also try on QEMU, but
> > >>>>> based on the description, I would expect it to crash in any case.
> > >>>>>
> > >>>>
> > >>>> Did you see the correct message in both cases?
> > >>>> Did you use Grub or direct EFI?
> > >>>>
> > >>>> With Grub and without this patch you won't see the message; with Grub
> > >>>> and with the patch you see the correct message.
> > >>>
> > >>> I did use grub, and I didn't see the message indeed.
> > >>> But in the case where it was supposed to crash (with added
> > >>> PrintErrMesg(), commented out blexit and without your patch) it did
> > >>> _not_ crash and continued to normal boot. Is that #PF non-fatal here?
> > >>>
> > >>
> > >> Hi,
> > >> I tried again with my test environment.
> > >> With the PrintErrMesg line added before the efi_arch_edd call, I got
> > >> a #PF; in my case the system hangs. With the fix applied, the machine
> > >> reboots and I can see the message in the logs.
> > >> I'm testing with Xen started inside Qemu, EFI firmware, xen.gz
> > >> compiled as an ELF file. The host system is Ubuntu 22.04.5 LTS. Gcc is
> > >> version 11.4.
> > >
> > > My test was wrong, commenting out blexit made "mesg" variable unused.
> > > After fixing that, I can reproduce it on both QEMU and real hardware:
> > > without your patch it crashes and with your patch it works just fine.
> > > While there may be more places with a similar issue, this patch clearly
> > > improves the situation, so:
> > >
> > > Acked-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> >
> > This had to be reverted, for breaking the build with old Clang. See the
> > respective Matrix conversation.
> >
> > Jan
> >
>
> To sum up, the failure is:
>
> clang: error: unknown argument: '-fno-jump-tables'
>
Now that the minimum clang version supports this option, can this
change be applied?
Frediano
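
For readers following the thread, here is a minimal sketch of the class of
problem described in the commit message. It is not the Xen code: the status
codes, table and function names below are hypothetical and only illustrate
the idea. A table of string pointers stores absolute addresses in data, so
it needs relocations to be applied before use, whereas a switch returning
string literals lets a -fpic build compute each address relative to the
instruction pointer. The -fno-jump-tables option mentioned above would
presumably keep the compiler from turning such a switch back into a table
lookup.

```c
#include <stddef.h>

/* Hypothetical status codes, for illustration only (not the EFI values). */
enum status { ST_OK = 0, ST_BUFFER_TOO_SMALL = 1, ST_NOT_FOUND = 2 };

/*
 * Variant A: table of pointers. Every slot holds the absolute address of
 * a string, so the table is only valid once relocations were applied for
 * the address the image actually runs at.
 */
static const char *const mesg_table[] = {
    [ST_OK]               = "Success",
    [ST_BUFFER_TOO_SMALL] = "Buffer too small",
    [ST_NOT_FOUND]        = "Not found",
};

static const char *mesg_from_table(enum status s)
{
    size_t n = sizeof(mesg_table) / sizeof(mesg_table[0]);

    return (size_t)s < n ? mesg_table[s] : "Unknown";
}

/*
 * Variant B: switch. No pointer is stored in data; with -fpic each string
 * address is computed in code, typically relative to the instruction
 * pointer on x86-64 (assuming no jump table is generated).
 */
static const char *mesg_from_switch(enum status s)
{
    switch (s) {
    case ST_OK:               return "Success";
    case ST_BUFFER_TOO_SMALL: return "Buffer too small";
    case ST_NOT_FOUND:        return "Not found";
    default:                  return "Unknown";
    }
}
```

In a normally loaded program both variants return the same strings; the
difference only shows up when the code runs at an address for which the
data relocations were not applied, which is the situation the commit
message describes for efi_multiboot2.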