Disclaimer: I’m a total newbie to all this executable file format stuff. Slowly learning!
The strings command lists all sufficiently long printable character strings in a file.
I recently needed to find where a string from an ELF executable ends up in the memory of the running program. strings did its job just fine in reporting the location of the string in the executable file:
$ strings --radix=x ./bin.x86_64
[snip]
143fd5 really interesting string
[snip]
However, strings simply scans raw bytes and doesn’t care whether the input is an ELF executable, a dump of random data or even a plain text file. So I had to figure out how to map this file offset to a memory address in the running program.
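To make the "merely operates on binary data" point concrete, here is a minimal Python sketch of what strings --radix=x roughly does. It is not the real implementation: the function name find_strings is made up for illustration, and treating only bytes 0x20 to 0x7e as printable with a minimum length of 4 is a simplifying assumption.

```python
import re

def find_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of at least min_len printable ASCII bytes."""
    # Assumption: "printable" means bytes 0x20..0x7e; the real tool has more options.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")

# Made-up sample data: "abc" is too short to be reported.
blob = b"\x00\x01abc\x00really interesting string\x00\xff"
for offset, text in find_strings(blob):
    print(f"{offset:x} {text}")  # prints: 6 really interesting string
```

Nothing here knows about ELF: the offset is just the position of the first byte of the run in the file.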
Time to head to Wikipedia for a quick readup on the ELF file format:
“Elf-layout–en” by Surueña – Own work. Licensed under CC BY-SA 3.0 via Wikimedia Commons.
So it turns out the location of the string is pretty simple to figure out, and doesn’t even require the program to be run. The silver bullet here is readelf, which provides all kinds of information on the contents of an ELF file:
$ readelf --program-headers ./bin.x86_64
Elf file type is EXEC (Executable file)
Entry point 0x406840
There are 8 program headers, starting at offset 64
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001c0 0x00000000000001c0  R E    8
  INTERP         0x0000000000000200 0x0000000000400200 0x0000000000400200
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000001af658 0x00000000001af658  R E    200000
  LOAD           0x00000000001af658 0x00000000007af658 0x00000000007af658
                 0x0000000000002ce8 0x0000000000008418  RW     200000
  DYNAMIC        0x00000000001afd18 0x00000000007afd18 0x00000000007afd18
                 0x0000000000000270 0x0000000000000270  RW     8
  NOTE           0x000000000000021c 0x000000000040021c 0x000000000040021c
                 0x0000000000000020 0x0000000000000020  R      4
  GNU_EH_FRAME   0x00000000001709d0 0x00000000005709d0 0x00000000005709d0
                 0x000000000000b5fc 0x000000000000b5fc  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     8
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame .gcc_except_table
03 .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag
06 .eh_frame_hdr
07
For each segment, the Offset and FileSiz fields give its position and size in the file, while the VirtAddr and MemSiz fields give its location and size in memory (i.e., at runtime).
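For the curious, the same fields can be read without readelf. The following is a rough sketch, assuming a 64-bit little-endian ELF file like the x86_64 binary here; the function name read_load_segments is my own, and the byte positions follow the standard Elf64_Ehdr and Elf64_Phdr layouts.

```python
import struct

PT_LOAD = 1  # program header type of loadable segments

def read_load_segments(path):
    """Return (offset, vaddr, filesz, memsz) for each LOAD segment.

    Assumes a 64-bit little-endian ELF file; anything else is rejected.
    """
    with open(path, "rb") as f:
        data = f.read()
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a 64-bit little-endian ELF file")
    # ELF64 header: e_phoff at byte 32, e_phentsize at byte 54, e_phnum at byte 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    segments = []
    for i in range(e_phnum):
        # Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
        # p_filesz, p_memsz, p_align
        fields = struct.unpack_from("<IIQQQQQQ", data, e_phoff + i * e_phentsize)
        p_type, _flags, p_offset, p_vaddr, _paddr, p_filesz, p_memsz, _align = fields
        if p_type == PT_LOAD:
            segments.append((p_offset, p_vaddr, p_filesz, p_memsz))
    return segments
```

Calling read_load_segments("./bin.x86_64") should, if the assumptions hold, return the two LOAD entries shown in the readelf output.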
So my string at file offset 0x143fd5 lies inside the first LOAD segment, which covers file range [0x000000, 0x1af658) and is mapped to memory range [0x400000, 0x5af658).
Hence the location of the string in memory: 0x400000 + (0x143fd5 - 0x000000) = 0x543fd5.
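The arithmetic can be written down as a small Python sketch. The segment values are copied from the readelf output above; the names LOAD_SEGMENTS and file_offset_to_vaddr are made up for illustration.

```python
# (file offset, virtual address, file size) of the two LOAD segments,
# copied from the readelf output.
LOAD_SEGMENTS = [
    (0x000000, 0x400000, 0x1af658),
    (0x1af658, 0x7af658, 0x002ce8),
]

def file_offset_to_vaddr(offset, segments=LOAD_SEGMENTS):
    """Map a file offset to a virtual address, or return None if no segment holds it."""
    for seg_offset, seg_vaddr, seg_filesz in segments:
        # A segment covers the half-open file range [offset, offset + filesz).
        if seg_offset <= offset < seg_offset + seg_filesz:
            return seg_vaddr + (offset - seg_offset)
    return None

print(hex(file_offset_to_vaddr(0x143fd5)))  # prints 0x543fd5
```

Only the file size is used for the range check: bytes beyond FileSiz (such as .bss) exist in memory but have no file offset to map from.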
This can be confirmed easily with other tools:
$ objdump --full-contents --file-offsets --section=.rodata --start-address=0x543fd5 --stop-address=$((0x543fd5 + 26)) ./bin.x86_64
./bin.x86_64: file format elf64-x86-64
Contents of section .rodata: (Starting at file offset: 0x143fd5)
543fd5 726561 6c6c7920 696e7465 72657374 69 really interesti
543fe5 6e6720 73747269 6e6700 ng string.
$ gdb ./bin.x86_64
(gdb) x/s 0x543fd5
0x543fd5: "really interesting string"
Note: I do not have the slightest idea how all of this plays with ASLR.
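One hint, as far as I understand it: readelf reported this binary as type EXEC, which means it is linked to load at the fixed addresses listed in the program headers. For a position-independent executable (type DYN), the VirtAddr values would instead be offsets from a load base chosen at runtime. Here is a hedged sketch for telling the two apart by reading the e_type field of the ELF header; the function name is_position_independent is my own.

```python
import struct

ET_EXEC = 2  # executable linked at fixed addresses
ET_DYN = 3   # shared object or position-independent executable

def is_position_independent(path):
    """Return True if the ELF file's e_type is ET_DYN, False if it is ET_EXEC."""
    with open(path, "rb") as f:
        header = f.read(18)
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] gives the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_type is the 16-bit field right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    if e_type not in (ET_EXEC, ET_DYN):
        raise ValueError(f"unexpected ELF type {e_type}")
    return e_type == ET_DYN
```

If this returns True for some other binary, the address computed from the program headers would still need the runtime load base added to it.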