
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2


  • To: Frediano Ziglio <frediano.ziglio@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 21 Mar 2025 07:47:57 +0100
  • Cc: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, "Daniel P. Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 21 Mar 2025 06:48:19 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 20.03.2025 21:10, Frediano Ziglio wrote:
> On Thu, Mar 20, 2025 at 3:15 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>
>> On 20.03.2025 15:33, Frediano Ziglio wrote:
>>> On Thu, Mar 6, 2025 at 3:02 PM Frediano Ziglio
>>> <frediano.ziglio@xxxxxxxxx> wrote:
>>>>
>>>> On Thu, Mar 6, 2025 at 2:26 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>>>>
>>>>> On 26.02.2025 19:54, Marek Marczykowski-Górecki wrote:
>>>>>> On Mon, Feb 24, 2025 at 02:31:00PM +0000, Frediano Ziglio wrote:
>>>>>>> On Mon, Feb 24, 2025 at 1:16 PM Marek Marczykowski-Górecki
>>>>>>> <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> On Mon, Feb 24, 2025 at 12:57:13PM +0000, Frediano Ziglio wrote:
>>>>>>>>> On Fri, Feb 21, 2025 at 8:20 PM Marek Marczykowski-Górecki
>>>>>>>>> <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 17, 2025 at 04:26:59PM +0000, Frediano Ziglio wrote:
>>>>>>>>>>> Although the code is compiled with the -fpic option, data is not
>>>>>>>>>>> position independent. This causes data pointers to become invalid
>>>>>>>>>>> if the code is not relocated properly, which is what happens for
>>>>>>>>>>> efi_multiboot2, which is called by the multiboot entry code.
>>>>>>>>>>>
>>>>>>>>>>> Code tested adding
>>>>>>>>>>>    PrintErrMesg(L"Test message", EFI_BUFFER_TOO_SMALL);
>>>>>>>>>>> in efi_multiboot2 before calling efi_arch_edd (this function
>>>>>>>>>>> can potentially call PrintErrMesg).
>>>>>>>>>>>
>>>>>>>>>>> Before the patch (XenServer installation on Qemu, xen replaced
>>>>>>>>>>> with vanilla xen.gz):
>>>>>>>>>>>   Booting `XenServer (Serial)'Booting `XenServer (Serial)'
>>>>>>>>>>>   Test message: !!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000000 !!!!
>>>>>>>>>>>   ExceptionData - 0000000000000000  I:0 R:0 U:0 W:0 P:0 PK:0 SS:0 SGX:0
>>>>>>>>>>>   RIP  - 000000007EE21E9A, CS  - 0000000000000038, RFLAGS - 0000000000210246
>>>>>>>>>>>   RAX  - 000000007FF0C1B5, RCX - 0000000000000050, RDX - 0000000000000010
>>>>>>>>>>>   RBX  - 0000000000000000, RSP - 000000007FF0C180, RBP - 000000007FF0C210
>>>>>>>>>>>   RSI  - FFFF82D040467CE8, RDI - 0000000000000000
>>>>>>>>>>>   R8   - 000000007FF0C1C8, R9  - 000000007FF0C1C0, R10 - 0000000000000000
>>>>>>>>>>>   R11  - 0000000000001020, R12 - FFFF82D040467CE8, R13 - 000000007FF0C1B8
>>>>>>>>>>>   R14  - 000000007EA33328, R15 - 000000007EA332D8
>>>>>>>>>>>   DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
>>>>>>>>>>>   GS   - 0000000000000030, SS  - 0000000000000030
>>>>>>>>>>>   CR0  - 0000000080010033, CR2 - FFFF82D040467CE8, CR3 - 000000007FC01000
>>>>>>>>>>>   CR4  - 0000000000000668, CR8 - 0000000000000000
>>>>>>>>>>>   DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
>>>>>>>>>>>   DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
>>>>>>>>>>>   GDTR - 000000007F9DB000 0000000000000047, LDTR - 0000000000000000
>>>>>>>>>>>   IDTR - 000000007F48E018 0000000000000FFF,   TR - 0000000000000000
>>>>>>>>>>>   FXSAVE_STATE - 000000007FF0BDE0
>>>>>>>>>>>   !!!! Find image based on IP(0x7EE21E9A) (No PDB)  (ImageBase=000000007EE20000, EntryPoint=000000007EE23935) !!!!
>>>>>>>>>>>
>>>>>>>>>>> After the patch:
>>>>>>>>>>>   Booting `XenServer (Serial)'Booting `XenServer (Serial)'
>>>>>>>>>>>   Test message: Buffer too small
>>>>>>>>>>>   BdsDxe: loading Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
>>>>>>>>>>>   BdsDxe: starting Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
>>>>>>>>>>>
>>>>>>>>>>> This partially rolls back commit 00d5d5ce23e6.
>>>>>>>>>>>
>>>>>>>>>>> Fixes: 9180f5365524 ("x86: add multiboot2 protocol support for EFI platforms")
>>>>>>>>>>> Signed-off-by: Frediano Ziglio <frediano.ziglio@xxxxxxxxx>
>>>>>>>>>>
>>>>>>>>>> I tried testing this patch, but it seems I cannot reproduce the original
>>>>>>>>>> failure...
>>>>>>>>>>
>>>>>>>>>> I did as the commit message suggests here:
>>>>>>>>>> https://gitlab.com/xen-project/people/marmarek/xen/-/commit/ca3d6911c448eb886990f33d4380b5646617a982
>>>>>>>>>>
>>>>>>>>>> With blexit() in PrintErrMesg(), it went back to the bootloader, so I'm
>>>>>>>>>> sure this code path was reached. But with blexit() commented out, Xen
>>>>>>>>>> started correctly both with and without this patch... The branch I used
>>>>>>>>>> is here:
>>>>>>>>>> https://gitlab.com/xen-project/people/marmarek/xen/-/commits/automation-tests?ref_type=heads
>>>>>>>>>>
>>>>>>>>>> Is there some extra condition to reproduce the issue? Maybe it depends
>>>>>>>>>> on the compiler version? I guess I can also try on QEMU, but based on
>>>>>>>>>> the description, I would expect it to crash in any case.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Did you see the correct message in both cases?
>>>>>>>>> Did you use Grub or direct EFI?
>>>>>>>>>
>>>>>>>>> With Grub and without this patch you won't see the message; with Grub
>>>>>>>>> and the patch you see the correct message.
>>>>>>>>
>>>>>>>> I did use grub, and I didn't see the message indeed.
>>>>>>>> But in the case where it was supposed to crash (with added PrintErrMesg(),
>>>>>>>> commented out blexit and without your patch) it did _not_ crash and
>>>>>>>> continued to normal boot. Is that #PF non-fatal here?
>>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>    I tried again with my test environment.
>>>>>>> I added the PrintErrMesg line before the efi_arch_edd call and got a #PF;
>>>>>>> in my case the system hangs. With the fix applied, the machine reboots
>>>>>>> and I can see the message in the logs.
>>>>>>> I'm trying with Xen starting inside Qemu, EFI firmware, xen.gz
>>>>>>> compiled as ELF file. Host system is an Ubuntu 22.04.5 LTS. Gcc is
>>>>>>> version 11.4.
>>>>>>
>>>>>> My test was wrong, commenting out blexit made "mesg" variable unused.
>>>>>> After fixing that, I can reproduce it on both QEMU and real hardware:
>>>>>> without your patch it crashes and with your patch it works just fine.
>>>>>> While there may be more places with similar issue, this patch clearly
>>>>>> improves the situation, so:
>>>>>>
>>>>>> Acked-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
>>>>>
>>>>> This had to be reverted, for breaking the build with old Clang. See the
>>>>> respective Matrix conversation.
>>>>
>>>> To sum up, the failure is:
>>>>
>>>>     clang: error: unknown argument: '-fno-jump-tables'
>>>
>>> Now that the minimum clang version supports this option, can this
>>> change be applied?
>>
>> Not sure. I for one would expect that we actively reject building with
>> too old tool chains then, which is yet to be carried out. Plus I think
>> you'd want to re-submit, with all tags dropped. The change was wrong to
>> go in at that earlier point, and hence any such tags weren't quite
>> accurate.
> 
> Not sure what you mean by "tags" in the above sentence. Git tags?

Acks and R-b-s.

> Not sure we need to carry on using old tool chains if we decide to
> bump the minimal versions.

I fear I don't understand this remark in this context. In any event,
Andrew meanwhile has sent a patch to the effect of what my comment was
saying.

Jan
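
The position-dependence problem described in the quoted commit message can
be sketched in plain C. This is a minimal illustration only, not the code
from the patch: the names demo_status, demo_msg_table and demo_msg_switch
are hypothetical, and the numeric codes are placeholders rather than real
EFI_STATUS values. The idea is that a static array of string pointers stores
absolute addresses that must be fixed up when the image is relocated, whereas
a switch returning string literals lets a -fpic build (on x86-64) compute
each address relative to the running code.

```c
#include <stddef.h>

/* Hypothetical status codes; placeholders, not real EFI_STATUS values. */
enum demo_status {
    DEMO_NOT_FOUND        = 1,
    DEMO_BUFFER_TOO_SMALL = 2,
    DEMO_OUT_OF_RESOURCES = 3,
};

/*
 * Table form: the array itself is data holding absolute pointers.
 * Those pointers are only valid once relocations have been applied,
 * so reading them from code running at an unrelocated address can
 * yield a bogus pointer.
 */
static const char *const demo_table[] = {
    [DEMO_NOT_FOUND]        = "Not found",
    [DEMO_BUFFER_TOO_SMALL] = "Buffer too small",
    [DEMO_OUT_OF_RESOURCES] = "Out of resources",
};

const char *demo_msg_table(unsigned int status)
{
    size_t count = sizeof(demo_table) / sizeof(demo_table[0]);

    if (status < count && demo_table[status] != NULL)
        return demo_table[status];
    return "Unknown error";
}

/*
 * Switch form: each string address is produced by code, which a -fpic
 * build can do relative to the instruction pointer, provided the
 * compiler does not lower the switch into a table again.
 */
const char *demo_msg_switch(unsigned int status)
{
    switch (status) {
    case DEMO_NOT_FOUND:
        return "Not found";
    case DEMO_BUFFER_TOO_SMALL:
        return "Buffer too small";
    case DEMO_OUT_OF_RESOURCES:
        return "Out of resources";
    default:
        return "Unknown error";
    }
}
```

In a normally loaded program both functions return the same strings; the
difference only matters when the code runs at an address it was not relocated
for. The thread shows that the patch also needed -fno-jump-tables, presumably
so that the compiler does not turn such a switch back into a table.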

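The point above about actively rejecting too-old tool chains can be
illustrated with a compile-time guard. This is a sketch under assumptions:
DEMO_MIN_CLANG_MAJOR and demo_version_at_least are hypothetical names, the
value 4 is a placeholder (the thread does not state the actual minimum
version), and the patch Andrew sent is not quoted here and may well do this
differently, for example in the build system rather than in C.

```c
/* Hypothetical minimum; the thread does not state the real value. */
#define DEMO_MIN_CLANG_MAJOR 4

/*
 * Fail the build early with a clear message, instead of later with
 * "unknown argument" for an option the compiler does not know.
 * __clang__ and __clang_major__ are predefined by Clang.
 */
#if defined(__clang__) && (__clang_major__ < DEMO_MIN_CLANG_MAJOR)
#error "Clang is older than the assumed minimum version"
#endif

/* Return non-zero if major.minor is at least min_major.min_minor. */
int demo_version_at_least(int major, int minor, int min_major, int min_minor)
{
    return major > min_major ||
           (major == min_major && minor >= min_minor);
}
```

With such a guard in place, an unsupported compiler is rejected up front, so
options that only newer versions understand can be used unconditionally.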


 

