[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 3/4] libelf: rewrite symtab/strtab loading



>>> On 26.02.16 at 18:02, <roger.pau@xxxxxxxxxx> wrote:
> El 26/2/16 a les 14:15, Jan Beulich ha escrit:
>>>>> On 16.02.16 at 18:37, <roger.pau@xxxxxxxxxx> wrote:
>>> --- a/xen/common/libelf/libelf-loader.c
>>> +++ b/xen/common/libelf/libelf-loader.c
>>> @@ -164,20 +164,33 @@ void elf_parse_bsdsyms(struct elf_binary *elf, 
>>> uint64_t pstart)
>>>      sz = sizeof(uint32_t);
>>>  
>>>      /* Space for the elf and elf section headers */
>>> -    sz += (elf_uval(elf, elf->ehdr, e_ehsize) +
>>> -           elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize));
>>> +    sz += elf_uval(elf, elf->ehdr, e_ehsize) +
>>> +          3 * elf_uval(elf, elf->ehdr, e_shentsize);
>> 
>> I think a literal 3 like this either needs a #define (at once serving
>> as documentation) or a comment. Perhaps rather the former,
>> considering that the (same?) 3 appears again further down. Plus -
>> what guarantees there are 3 section headers in the image?
> 
> This is not related to the image itself, but to the metadata that libelf
> places at the end of the loaded kernel in order to stash the symtab and
> strtab for the guest to use.

I understand this, yet imo the literal 3 still should be replaced.

> The layout is as follows (I should add this to the patch itself as a
> comment, since I guess this is still quite confusing):
> 
> +------------------------+
> |                        |
> | KERNEL                 |
> |                        |
> +------------------------+ pend
> | ALIGNMENT              |
> +------------------------+ bsd_symtab_pstart
> |                        |
> | size                   | bsd_symtab_pend - bsd_symtab_pstart
> |                        |
> +------------------------+
> |                        |
> | ELF header             |
> |                        |
> +------------------------+
> |                        |
> | ELF section header 0   | Dummy section header
> |                        |
> +------------------------+
> |                        |
> | ELF section header 1   | SYMTAB section header
> |                        |
> +------------------------+
> |                        |
> | ELF section header 2   | STRTAB section header
> |                        |
> +------------------------+
> |                        |
> | SYMTAB                 |
> |                        |
> +------------------------+
> |                        |
> | STRTAB                 |
> |                        |
> +------------------------+ bsd_symtab_pend
> 
> There are always only tree sections because that's all we need, section
> 0 is ignored, section 1 is used to describe the SYMTAB, and section 2 is
> used to describe the STRTAB.

All fine, but this still doesn't clarify how the kernel learns where
bsd_symtab_pstart is.

> According to the ELF spec there can only be
> one SYMTAB, so that's all we need.

Did you perhaps overlook the spec saying "... but this restriction
may be relaxed in the future", plus the fact that we're talking
about an executable file here (which commonly have two symbol
tables - .dynsym and .symtab), not an object one? (This isn't to
say that you need to make the code handle multiple ones, if you
know that *BSD will only ever have one.)

>>>      sz = elf_round_up(elf, sz);
>>>  
>>> -    /* Space for the symbol and string tables. */
>>> +    /* Space for the symbol and string table. */
>>>      for ( i = 0; i < elf_shdr_count(elf); i++ )
>>>      {
>>>          shdr = elf_shdr_by_index(elf, i);
>>>          if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) )
>>>              /* input has an insane section header count field */
>>>              break;
>>> -        type = elf_uval(elf, shdr, sh_type);
>>> -        if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) )
>>> -            sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size));
>>> +
>>> +        if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB )
>>> +            continue;
>>> +
>>> +        sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size));
>>> +        shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link));
>>> +
>>> +        if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) )
>>> +            /* input has an insane section header count field */
>>> +            break;
>>> +
>>> +        if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB )
>>> +            /* Invalid symtab -> strtab link */
>>> +            break;
>> 
>> This is not sufficient - what if sh_link is out of bounds, but the
>> bogusly accessed field happens to hold SHT_STRTAB? (Oddly
>> enough you have at least an SHN_UNDEF check in the second
>> loop below.)
> 
> The out-of-bounds access would be detected by elf_access_ok (if it's out
> of the scope of the ELF file, which is all we care about). In fact the
> elf_access_ok above could be removed because elf_uval already performs
> out-of-bounds checks. The result is definitely no worse that what we are
> doing ATM.

No, the out of bounds check should be more strict than just
considering the whole image: The image is broken if the link
points to a non-existing section.

>>> @@ -186,79 +199,169 @@ void elf_parse_bsdsyms(struct elf_binary *elf, 
>>> uint64_t pstart)
>>>  
>>>  static void elf_load_bsdsyms(struct elf_binary *elf)
>>>  {
>>> -    ELF_HANDLE_DECL(elf_ehdr) sym_ehdr;
>>> -    unsigned long sz;
>>> -    elf_ptrval maxva;
>>> -    elf_ptrval symbase;
>>> -    elf_ptrval symtab_addr;
>>> -    ELF_HANDLE_DECL(elf_shdr) shdr;
>>> -    unsigned i, type;
>>> +    /*
>>> +     * Header that is placed at the end of the kernel and allows
>>> +     * the OS to find where the symtab and strtab have been loaded.
>>> +     * It mimics a valid ELF file header, although it only contains
>>> +     * a symtab and a strtab section.
>>> +     *
>>> +     * NB: according to the ELF spec there's only ONE symtab per ELF
>>> +     * file, and accordingly we will only load the corresponding
>>> +     * strtab, so we only need three section headers in our fake ELF
>>> +     * header (first section header is always a dummy).
>>> +     */
>>> +    struct {
>>> +        uint32_t size;
>>> +        struct {
>>> +            elf_ehdr header;
>>> +            elf_shdr section[3];
>>> +        } __attribute__((packed)) elf_header;
>>> +    } __attribute__((packed)) header;
>> 
>> If this is placed at the end, how can the OS reasonably use it
>> without knowing that there are exactly 3 section header
>> entries? I.e. how would it find the base of this structure?
> 
> See the commend below, the base is found at the end of the kernel, and
> then the ELF header contains the right pointers to the sections headers
> (by appropriately setting e_shoff).

Thing is - I can't see which "comment below" you try to refer me to.

>>> +    ELF_HANDLE_DECL(elf_ehdr) header_handle;
>>> +    unsigned long shdr_size;
>>> +    ELF_HANDLE_DECL(elf_shdr) section_handle;
>>> +    ELF_HANDLE_DECL(elf_shdr) image_handle;
>>> +    unsigned int i, link;
>>> +    elf_ptrval header_base;
>>> +    elf_ptrval elf_header_base;
>>> +    elf_ptrval symtab_base;
>>> +    elf_ptrval strtab_base;
>>>  
>>>      if ( !elf->bsd_symtab_pstart )
>>>          return;
>>>  
>>> -#define elf_hdr_elm(_elf, _hdr, _elm, _val)     \
>>> -do {                                            \
>>> -    if ( elf_64bit(_elf) )                      \
>>> -        elf_store_field(_elf, _hdr, e64._elm, _val);  \
>>> -    else                                        \
>>> -        elf_store_field(_elf, _hdr, e32._elm, _val);  \
>>> +#define elf_store_field_bitness(_elf, _hdr, _elm, _val)             \
>>> +do {                                                                \
>>> +    if ( elf_64bit(_elf) )                                          \
>>> +        elf_store_field(_elf, _hdr, e64._elm, _val);                \
>>> +    else                                                            \
>>> +        elf_store_field(_elf, _hdr, e32._elm, _val);                \
>>>  } while ( 0 )
>>>  
>>> -    symbase = elf_get_ptr(elf, elf->bsd_symtab_pstart);
>>> -    symtab_addr = maxva = symbase + sizeof(uint32_t);
>>> -
>>> -    /* Set up Elf header. */
>>> -    sym_ehdr = ELF_MAKE_HANDLE(elf_ehdr, symtab_addr);
>>> -    sz = elf_uval(elf, elf->ehdr, e_ehsize);
>>> -    elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(sym_ehdr), 
>>> ELF_HANDLE_PTRVAL(elf->ehdr), sz);
>>> -    maxva += sz; /* no round up */
>>> +#define SYMTAB_INDEX    1
>>> +#define STRTAB_INDEX    2
>>>  
>>> -    elf_hdr_elm(elf, sym_ehdr, e_phoff, 0);
>>> -    elf_hdr_elm(elf, sym_ehdr, e_shoff, elf_uval(elf, elf->ehdr, 
>>> e_ehsize));
>>> -    elf_hdr_elm(elf, sym_ehdr, e_phentsize, 0);
>>> -    elf_hdr_elm(elf, sym_ehdr, e_phnum, 0);
>>> +    /* Allow elf_memcpy_safe to write to symbol_header. */
>>> +    elf->caller_xdest_base = &header;
>>> +    elf->caller_xdest_size = sizeof(header);
>>>  
>>> -    /* Copy Elf section headers. */
>>> -    shdr = ELF_MAKE_HANDLE(elf_shdr, maxva);
>>> -    sz = elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize);
>>> -    elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(shdr),
>>> -                    ELF_IMAGE_BASE(elf) + elf_uval(elf, elf->ehdr, 
>>> e_shoff),
>>> -                    sz);
>>> -    maxva = elf_round_up(elf, (unsigned long)maxva + sz);
>>> +    /*
>>> +     * Calculate the position of the various elements in GUEST MEMORY 
>>> SPACE.
>>> +     * This addresses MUST only be used with elf_load_image.
>>> +     *
>>> +     * NB: strtab_base cannot be calculated at this point because we don't
>>> +     * know the size of the symtab yet, and the strtab will be placed 
>>> after it.
>>> +     */
>>> +    header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart);
>>> +    elf_header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart) +
>>> +                      sizeof(uint32_t);
>>> +    symtab_base = elf_round_up(elf, header_base + sizeof(header));
>>> +
>>> +    /* Fill the ELF header, copied from the original ELF header. */
>>> +    header_handle = ELF_MAKE_HANDLE(elf_ehdr,
>>> +                                
>>> ELF_REALPTR2PTRVAL(&header.elf_header.header));
>>> +    elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(header_handle),
>>> +                    ELF_HANDLE_PTRVAL(elf->ehdr),
>>> +                    elf_uval(elf, elf->ehdr, e_ehsize));
>>> +
>>> +    /* Set the offset to the shdr array. */
>>> +    elf_store_field_bitness(elf, header_handle, e_shoff,
>>> +                            offsetof(typeof(header.elf_header), section));
>>> +
>>> +    /* Set the right number of section headers. */
>>> +    elf_store_field_bitness(elf, header_handle, e_shnum, 3);
>>> +
>>> +    /* Clear a couple of fields we don't use. */
>>> +    elf_store_field_bitness(elf, header_handle, e_phoff, 0);
>>> +    elf_store_field_bitness(elf, header_handle, e_phentsize, 0);
>>> +    elf_store_field_bitness(elf, header_handle, e_phnum, 0);
>> 
>> Perhaps better just give the structure above an initializer?
> 
> Not really, the problem is that we copy the original header from the ELF
> binary, and then we mangle it. This is just a fixup to make it clear
> that no program headers are present (we just use section headers in
> order to describe the SYMTAB and STRTAB positions).

Ah, I see now.

>>> +    /* Zero the dummy section. */
>> 
>> Dummy? And anyway I think you mean "section header" here.
> 
> Yes, the right comment would be: zero the dummy section header.

But then still - why "dummy"? The 0-th section header has a purpose
beyond being a filler, even if that purpose doesn't matter for the
intentions you have here.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.