[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 2/5] x86/boot: Split bootsym() into four types of relocations
On Fri, 2019-08-30 at 17:10 +0200, Jan Beulich wrote: > On 21.08.2019 18:35, David Woodhouse wrote: > > --- a/xen/arch/x86/boot/head.S > > +++ b/xen/arch/x86/boot/head.S > > @@ -699,14 +699,30 @@ trampoline_setup: > > cmp $sym_offs(__trampoline_rel_stop),%edi > > jb 1b > > > > - /* Patch in the trampoline segment. */ > > + mov $sym_offs(__trampoline32_rel_start),%edi > > +1: > > + mov %fs:(%edi),%eax > > + add %edx,%fs:(%edi,%eax) > > + add $4,%edi > > + cmp $sym_offs(__trampoline32_rel_stop),%edi > > + jb 1b > > + > > + mov $sym_offs(__bootsym_rel_start),%edi > > +1: > > + mov %fs:(%edi),%eax > > + add %edx,%fs:(%edi,%eax) > > + add $4,%edi > > + cmp $sym_offs(__bootsym_rel_stop),%edi > > + jb 1b > > With the smaller sets now - are we risking misbehavior if one > of the relocation sets ends up empty? This wasn't reasonable to > expect before, but I think it would be nice to have a build-time > check rather than a hard to debug crash in case this happens. Or just code it differently as a while() instead of a do{}while() so that it actually copes with a zero-length section. > > --- a/xen/arch/x86/boot/trampoline.S > > +++ b/xen/arch/x86/boot/trampoline.S > > @@ -16,21 +16,62 @@ > > * not guaranteed to persist. > > */ > > > > -/* NB. bootsym() is only usable in real mode, or via BOOT_PSEUDORM_DS. */ > > +/* > > + * There are four sets of relocations: > > + * > > + * bootsym(): Boot-time code relocated to low memory and run only once. > > + * Only usable at boot; in real mode or via > > BOOT_PSEUDORM_DS. > > + * bootdatasym(): Boot-time BIOS-discovered data, relocated back up to Xen > > + * image after discovery. > > + * trampsym(): Permanent trampoline code relocated into low memory for > > AP > > + * startup and wakeup. > > + * tramp32sym(): 32-bit trampoline code which at boot can be used directly > > + * from the Xen image in memory, but which will need to be > > + * relocated into low (well, into *mapped*) memory in order > > + * to be used for AP startup. > > + */ > > #undef bootsym > > #define bootsym(s) ((s)-trampoline_start) > > > > #define bootsym_rel(sym, off, opnd...) \ > > bootsym(sym),##opnd; \ > > 111:; \ > > - .pushsection .trampoline_rel, "a"; \ > > + .pushsection .bootsym_rel, "a"; \ > > .long 111b - (off) - .; \ > > .popsection > > > > #define bootsym_segrel(sym, off) \ > > $0,$bootsym(sym); \ > > 111:; \ > > - .pushsection .trampoline_seg, "a"; \ > > + .pushsection .bootsym_seg, "a"; \ > > + .long 111b - (off) - .; \ > > + .popsection > > + > > +#define bootdatasym(s) ((s)-trampoline_start) > > +#define bootdatasym_rel(sym, off, opnd...) \ > > + bootdatasym(sym),##opnd; \ > > +111:; \ > > + .pushsection .bootdatasym_rel, "a";\ > > + .long 111b - (off) - .; \ > > + .popsection > > + > > +#undef trampsym > > Why this and ... > > > +#define trampsym(s) ((s)-trampoline_start) > > + > > +#define trampsym_rel(sym, off, opnd...) \ > > + trampsym(sym),##opnd; \ > > +111:; \ > > + .pushsection .trampsym_rel, "a"; \ > > + .long 111b - (off) - .; \ > > + .popsection > > + > > +#undef tramp32sym > > ... this #undef? You have none ahead of the bootdatasym #define-s, > and (other than for bootsym) there's not conflicting C level one > afaics. > > > +#define tramp32sym(s) ((s)-trampoline_start) > > + > > +#define tramp32sym_rel(sym, off, opnd...) \ > > + tramp32sym(sym),##opnd; \ > > +111:; \ > > + .pushsection .tramp32sym_rel, "a"; \ > > .long 111b - (off) - .; \ > > .popsection > > After your reply to my comment regarding the redundancy here I've > checked (in your git branch) how things end up. Am I mistaken, or > are the trampsym and tramp32sym #define-s entirely identical > (except for the relocations section name)? Even between the others > there's little enough difference, so it continues to be unclear to > me why you think it's better to have four instances of about the > same (not entirely trivial) thing. The distinction is that in a no-real-mode boot tramp32 is used in place in the Xen image at the physical address it happened to be loaded at, and then *again* later in the AP/wakeup path. In the latter case it needs to be moved to low memory (or we need to put the physical location into idle_pg_table which seemed to be harder, as discussed). So tramp32 symbols get relocated *twice*, while the plain tramp symbols don't, but actually we could probably ditch the distinction and treat them all the same, which would reduce the four categories to three. I'll take a look. I suppose we could also combine bootsym and bootdatasym, and copy that *whole* section back up to the Xen image; both code and data. But I'm inclined to prefer keeping them separate and only copying the data back up. > > @@ -48,16 +89,19 @@ > > GLOBAL(trampoline_realmode_entry) > > mov %cs,%ax > > mov %ax,%ds > > - movb $0xA5,bootsym(trampoline_cpu_started) > > + movb $0xA5,trampsym(trampoline_cpu_started) > > cld > > cli > > - lidt bootsym(idt_48) > > - lgdt bootsym(gdt_48) > > + lidt trampsym(idt_48) > > + lgdt trampsym(gdt_48) > > mov $1,%bl # EBX != 0 indicates we are an AP > > xor %ax, %ax > > inc %ax > > lmsw %ax # CR0.PE = 1 (enter protected > > mode) > > - ljmpl $BOOT_CS32,$bootsym_rel(trampoline_protmode_entry,6) > > + ljmpl $BOOT_CS32,$tramp32sym_rel(trampoline_protmode_entry,6) > > + > > +GLOBAL(trampoline_cpu_started) > > + .byte 0 > > The movement of this item here seems unrelated to this change; it's > also not mentioned in the description. Andy's already moved that elsewhere anyway; I'll undo that as I rebase. > > @@ -115,10 +115,10 @@ static void __init relocate_trampoline(unsigned long > > phys) > > trampoline_ptr < __trampoline_rel_stop; > > ++trampoline_ptr ) > > *(u32 *)(*trampoline_ptr + (long)trampoline_ptr) += phys; > > - for ( trampoline_ptr = __trampoline_seg_start; > > - trampoline_ptr < __trampoline_seg_stop; > > + for ( trampoline_ptr = __trampoline32_rel_start; > > + trampoline_ptr < __trampoline32_rel_stop; > > ++trampoline_ptr ) > > - *(u16 *)(*trampoline_ptr + (long)trampoline_ptr) = phys >> 4; > > + *(u32 *)(*trampoline_ptr + (long)trampoline_ptr) += phys; > > } > > Seeing this and adding in the comment about the redundant tramp*sym > macros I wonder why the relocations can't be put together in a single > section, and there be just a single loop here. (I realize this > entire function gets deleted from here later on, but anyway.) Yeah, I think it's worth the harmless double-relocation in the non-EFI case to treat everything (well tramp vs. tramp32) the same there. I'll do that. Thanks. Attachment:
smime.p7s _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |