Some ELF files have zero-sized segments at offsets that are invalid or
out of bounds. This commit changes the default `ReadRef` implementation
for `&[u8]` so that an out-of-bounds read succeeds and returns `&[]`
when the requested length is zero.
Debug files created with `objcopy --only-keep-debug` are one source of
such files, for example
`/usr/lib/debug/.build-id/bd/dd2eaf3326ffce6d173666b5f3e62a376e123a.debug`
in package `libgedit-gfls-1-0-dbgsym_0.2.1-2_arm64.deb`:
```
~ cp /usr/lib/debug/.build-id/bd/dd2eaf3326ffce6d173666b5f3e62a376e123a.debug /tmp/bad_file.elf
~ r2 /tmp/bad_file.elf
[0x0000027c]> iI~binsz
binsz 64184
[0x0000027c]> iSS
[Segments]
nth paddr size vaddr vsize perm type name
―――――――――――――――――――――――――――――――――――――――――――――――――――――――
0 0x00000000 0x27c 0x00000000 0x55e0 -r-x MAP LOAD0
1 0x0000fab8 0x0 0x0001fab8 0x5c0 -rw- MAP LOAD1
2 0x0000fab8 0x0 0x0001fb28 0x240 -rw- MAP DYNAMIC
[...]
8 0x0000fab8 0x0 0x0001fab8 0x548 -r-- MAP GNU_RELRO
```
This file has multiple segments starting at `0xfab8` (the end of the
file), with a physical size of `0`, while the file size is also `0xfab8`
(64184).
The current `ReadRef` implementation for `&[u8]` makes
`ProgramHeader::data` fail for these segments, which in turn makes
e.g. `ElfSegment::bytes` fail.
A caller could handle this error, or provide their own type that
implements `ReadRef` this way. However, I think it makes sense to do
this by default: no data is *actually* being read out of bounds, yet
the current implementation still tries to index out of bounds and then
returns an error.
Notably, this change also brings the `&[u8]` implementation in line
with the `ReadRef` implementation for `ReadCache`, whose
`read_bytes_at` already has a similar check.
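
The zero-length rule described above can be sketched as a standalone function. This is a minimal illustration, not the crate's actual `ReadRef` code: the free function `read_bytes_at` and its `Result<&[u8], ()>` signature are assumptions made for the example.

```rust
/// Hypothetical standalone helper illustrating the rule; it is not the
/// crate's `ReadRef` implementation.
fn read_bytes_at(data: &[u8], offset: u64, size: u64) -> Result<&[u8], ()> {
    // A zero-length read never touches a byte, so accept it at any offset.
    if size == 0 {
        return Ok(&[]);
    }
    // Reject ranges whose end overflows or lies past the end of the data.
    let end = offset.checked_add(size).ok_or(())?;
    if end > data.len() as u64 {
        return Err(());
    }
    // `end <= data.len()`, so both casts are lossless.
    Ok(&data[offset as usize..end as usize])
}
```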
Previously, if the symtab was empty then we were writing a strtab
that had 0 bytes of data. This gives a linker error:
    SHT_STRTAB string table section [index 4] is empty
Also, if there is a symtab then there must be a strtab,
otherwise the error is:
    invalid sh_type for string table section [index 0]: expected SHT_STRTAB, but got SHT_NULL
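
One way to avoid a zero-byte strtab is to always emit the leading null byte that ELF string tables conventionally start with, so offset 0 names the empty string. The sketch below assumes that approach. `StringTable` and its methods are hypothetical names for illustration, not the crate's writer API.

```rust
/// Hypothetical minimal string table builder (not the crate's API).
struct StringTable {
    data: Vec<u8>,
}

impl StringTable {
    /// The table always starts with one null byte, so it is never empty
    /// and offset 0 refers to the empty string.
    fn new() -> Self {
        StringTable { data: vec![0] }
    }

    /// Appends a null-terminated string and returns its offset.
    fn add(&mut self, s: &str) -> u32 {
        if s.is_empty() {
            return 0;
        }
        let offset = self.data.len() as u32;
        self.data.extend_from_slice(s.as_bytes());
        self.data.push(0);
        offset
    }
}
```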
When using compression, debug sections in Mach-O produced by the Go
compiler have a `__zdebug_` section name prefix, and the section data
has the same format as GNU `.zdebug_` compression for ELF.
Support these section names in `Object::section_by_name`, and support
the compressed section data in `ObjectSection::compressed_data`.
This commit extracts the GNU-style section compression logic from
`read::elf::section` into a module under `read`, and uses it in
`read::macho` as well.
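
For reference, the GNU-style format consists of the ASCII magic `ZLIB`, an 8-byte big-endian uncompressed size, and then the zlib stream. The sketch below parses only that header. `GnuCompressed`, `parse_gnu_compressed`, and `is_gnu_compressed_name` are hypothetical names, not the functions in the shared module, and decompression is left out.

```rust
/// Hypothetical parsed header of a GNU-style compressed debug section.
#[derive(Debug, PartialEq)]
struct GnuCompressed<'a> {
    uncompressed_size: u64,
    zlib_data: &'a [u8],
}

/// Splits section data into the recorded size and the zlib stream.
/// Returns `None` if the magic is missing or the header is truncated.
fn parse_gnu_compressed(data: &[u8]) -> Option<GnuCompressed<'_>> {
    if data.len() < 12 || !data.starts_with(b"ZLIB") {
        return None;
    }
    // Bytes 4..12 hold the uncompressed size as a big-endian u64.
    let mut size = [0u8; 8];
    size.copy_from_slice(&data[4..12]);
    Some(GnuCompressed {
        uncompressed_size: u64::from_be_bytes(size),
        zlib_data: &data[12..],
    })
}

/// Recognises the ELF (`.zdebug_`) and Mach-O (`__zdebug_`) name prefixes.
fn is_gnu_compressed_name(name: &str) -> bool {
    name.starts_with(".zdebug_") || name.starts_with("__zdebug_")
}
```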
For Mach-O, `add_symbol_data` now ensures that the symbol size
is at least 1 when subsections via symbols are enabled.
This change was made to support linking with ld-prime. It is
unclear how this previously worked with ld64.
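
The minimum-size rule can be sketched as follows. This assumes the size is enforced by padding empty symbol data with a single zero byte, which is a guess at the mechanism. `symbol_bytes` is a hypothetical helper and not part of `write::Object`.

```rust
/// Hypothetical helper: choose the bytes to store for a symbol so that
/// the symbol occupies at least one byte when subsections via symbols
/// are enabled.
fn symbol_bytes(data: &[u8], subsections_via_symbols: bool) -> &[u8] {
    // Assumed padding: a single zero byte.
    let pad: &'static [u8] = &[0];
    if subsections_via_symbols && data.is_empty() {
        pad
    } else {
        data
    }
}
```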
`write::Object::add_subsection` no longer enables subsections
via symbols for Mach-O. Use `set_subsections_via_symbols` instead.
This change was made because Mach-O subsections are all or nothing,
so this decision must be made before any symbols are added.
`write::Object::add_subsection` no longer adds data to the subsection.
This change was made because it was done with `append_section_data`,
but this is often not the correct way to add data to the subsection.
Usually `add_symbol_data` is a better choice.
It's possible to have local symbols in the dynamic symbol table,
and `sh_info` should count these, the same as it already does
for `SHT_SYMTAB`.
Also exclude these from the GNU hash table.
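
As a sketch of the rule above: in an ELF symbol table, `sh_info` is one greater than the index of the last local symbol, so locals are placed first, after the null symbol. The types and functions below are hypothetical and simplified. In particular, a real GNU hash table also leaves out undefined symbols, which this example ignores.

```rust
/// Hypothetical, simplified symbol record.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Sym {
    name: &'static str,
    local: bool,
}

/// Orders symbols as: null symbol, locals, then globals. Returns the
/// table and `sh_info` (one greater than the last local symbol's index).
fn layout_dynsym(symbols: &[Sym]) -> (Vec<Sym>, u32) {
    // Index 0 is the null symbol, which counts as local.
    let mut table = vec![Sym { name: "", local: true }];
    table.extend(symbols.iter().copied().filter(|s| s.local));
    let sh_info = table.len() as u32;
    table.extend(symbols.iter().copied().filter(|s| !s.local));
    (table, sh_info)
}

/// Symbols eligible for the GNU hash table: everything after the locals.
fn gnu_hash_candidates(table: &[Sym], sh_info: u32) -> &[Sym] {
    &table[sh_info as usize..]
}
```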
.dynstr doesn't need to be present if there are no dynamic strings.
- handle section index 0 in read::elf::SectionTable::strings
- don't require .dynstr when writing if there are no strings
- don't print name of symbol index 0 in readobj example
Replace uses of these variants with read::Relocation::flags
and write::Relocation::flags.
Additionally, for write::Relocation, move the kind/encoding/size
fields into RelocationFlags::Generic, since these are not
required when using format specific variants.
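
The resulting shape can be modelled roughly as below. This is a simplified, hypothetical mirror of the layout described above, not the crate's definitions. The real types have more variants and fields, and `Kind`, `Encoding`, and `is_generic` are names invented for the sketch.

```rust
/// Hypothetical stand-ins for the generic relocation properties.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Absolute,
    Relative,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Encoding {
    Generic,
}

/// Simplified model: kind/encoding/size exist only in the generic
/// variant, while a format-specific variant carries the raw type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum RelocationFlags {
    Generic { kind: Kind, encoding: Encoding, size: u8 },
    Elf { r_type: u32 },
}

/// Simplified relocation record for writing.
struct Relocation {
    offset: u64,
    addend: i64,
    flags: RelocationFlags,
}

/// True if the relocation uses the format-independent description.
fn is_generic(relocation: &Relocation) -> bool {
    matches!(relocation.flags, RelocationFlags::Generic { .. })
}
```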