Unwrapping an LZ4-compressed kernel

So I’m reinstalling my Gentoo system from scratch, and I want it to boot with UEFI and Secure Boot. That means I want to embed the kernel’s initramfs into the kernel image, so that the signature-checking performed by the firmware covers both the kernel and the initramfs.

Roughly following Sakaki’s awesome EFI install guide, I got to the point where I’ve got a kernel with its embedded initramfs – courtesy of genkernel. However, now I want to double-check that the contents of the initramfs are fine. Also, since I like to make things harder for myself, I want to check the actual initramfs embedded in the kernel image – not the copy that’s still sitting in /var/tmp/genkernel.

First off, I’ve never tried to pick apart a kernel image before, so this is gonna be… exploratory, shall we say. A bit of googling quickly points me to binwalk, so I try that approach:

$ binwalk /boot/kernel

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             Microsoft executable, portable (PE)
4178666       0x3FC2EA        MySQL MISAM compressed data file Version 2
5317291       0x5122AB        mcrypt 2.5 encrypted data, algorithm: "E", keysize: 18609 bytes, mode: ")",
7809361       0x772951        Certificate in DER format (x509 v3), header length: 4, sequence length: 18
7809365       0x772955        Certificate in DER format (x509 v3), header length: 4, sequence length: 18
7809369       0x772959        Certificate in DER format (x509 v3), header length: 4, sequence length: 18
7809373       0x77295D        Certificate in DER format (x509 v3), header length: 4, sequence length: 18
7809377       0x772961        Certificate in DER format (x509 v3), header length: 4, sequence length: 18
7809381       0x772965        Certificate in DER format (x509 v3), header length: 4, sequence length: 17
8283104       0x7E63E0        xz compressed data
8471664       0x814470        Unix path: /505V/F505/F707/F717/P8
8549858       0x8275E2        ELF, 64-bit LSB processor-specific,
8839487       0x86E13F        SHA256 hash constants, little endian
9228073       0x8CCF29        Certificate in DER format (x509 v3), header length: 4, sequence length: 1342
9240733       0x8D009D        Executable script, shebang: "/bin/ash"
9399653       0x8F6D65        SHA256 hash constants, little endian
9400048       0x8F6EF0        xz compressed data
9819275       0x95D48B        xz compressed data
11495515      0xAF685B        Copyright string: "Copyright (C) 2009 Red Hat, Inc. All !"
11698317      0xB2808D        CramFS filesystem, big endian size 5399878 hole_support CRC 0x2B007379, edition 1937113192, 1885762304 blocks, 1239979513 files
17206081      0x1068B41       lzop compressed data,b09,
17252766      0x107419E       xz compressed data
18215265      0x115F161       Unix path: /lib/gcc/x86_64-pc-linux-gnu/5.4.0"
18633646      0x11C53AE       SHA256 hash constants, little endian
18788285      0x11EAFBD       SHA256 hash constants, little endian

Ok so my kernel starts off with a PE executable, that seems reasonable, must be the EFI stub. However afterwards… MySQL MISAM? mcrypt? /505V/F505/F707/F717/P8? CramFS filesystem with 1239979513 files?
Something isn't right. I quickly look up the text bits (copyright string, shebang) with hexdump, and notice they are both isolated strings interspersed with other bits of binary and text, i.e. not part of actual scripts. I guess this must be a side effect of the compression: LZ4 stores data it cannot match verbatim, so fragments of plain text survive in the compressed stream. I also guess that a lot of the patterns reported by binwalk must be false positives, because binwalk didn't pick up the LZ4-compressed blob(s?) and scanned the compressed data as if it were raw.

So, looks like I must find another way. Time to look for the LZ4 magic pattern (there has to be one, right?) myself. Reading the specs for the LZ4 frame format, I try to grep for the LZ4 magic:

$ grep -abo $'\x04\x22\x4d\x18' /boot/kernel
$ grep -abo $'\x18\x4d\x22\x04' /boot/kernel        # let's try big endian too, just in case

Nothing. Meh. :(
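
For reference, the frame spec gives the magic as the 32-bit value 0x184D2204 stored little-endian, which is where the byte order in the first grep comes from. Here is a minimal Python sketch of the same search, roughly what `grep -abo` does. The names `magic_bytes` and `find_magic` are made up for this post, and the script reads the whole image into memory, which is fine for a kernel but not for huge files.

```python
import struct
import sys

LZ4_FRAME_MAGIC = 0x184D2204  # value given in the LZ4 frame format spec


def magic_bytes(value, little_endian=True):
    """Return the 4 bytes a 32-bit magic number occupies on disk."""
    return struct.pack("<I" if little_endian else ">I", value)


def find_magic(data, pattern):
    """Return every byte offset at which pattern occurs in data."""
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(pattern, pos + 1)
    return offsets


# Only runs when a file name is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        image = f.read()
    for little in (True, False):
        pattern = magic_bytes(LZ4_FRAME_MAGIC, little)
        print(pattern.hex(), find_magic(image, pattern))
```

Run against the kernel image, an empty list for both byte orders corresponds to the empty grep output above.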

Ok, perhaps the kernel is not using LZ4's frame format (it happens). So let's have a look at how compression is performed. Skimming through the occurrences of lz4 and lz4c in the kernel sources yields some interesting results:

scripts/extract-ikconfig
64:try_decompress '\002\041\114\030' xyy 'lz4 -d -l'

scripts/gen_initramfs_list.sh
263:                && compr="lz4 -l -9 -f"

scripts/Makefile.lib
373:    lz4c -l -c1 stdin stdout && $(call size_append, $(filter-out FORCE,$^))) > $@ || \

What's that -l flag? From the lz4(1) man page:

      -l     Use Legacy format (typically for Linux Kernel compression)

This smells good! Turns out this format is mentioned a little bit further down the page on the LZ4 frame format. It has a different magic, so let's try to grep for it:

$ grep -abo $'\x02\x21\x4c\x18' /boot/kernel
17332:!L
18902071:!L
18902178:!L
18902255:!L
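
Four hits for a four-byte pattern means some of them could be coincidences. As I read the legacy format description, a frame is the magic 0x184C2102 (little-endian) followed by blocks, each prefixed with its compressed size as a 4-byte little-endian integer, and each block holds at most 8 MB of uncompressed data. That suggests a cheap sanity check for a candidate offset: the bytes right after the magic should be a plausible block size. The sketch below is only a heuristic under those assumptions. `legacy_block_sizes` is a made-up name, and the upper bound uses LZ4's worst-case formula (size + size/255 + 16).

```python
import struct

LEGACY_MAGIC = struct.pack("<I", 0x184C2102)   # b"\x02\x21\x4c\x18"
LEGACY_BLOCK_SIZE = 8 * 1024 * 1024            # uncompressed bytes per block
# Worst-case compressed size of one block (LZ4's compress-bound formula).
MAX_COMPRESSED = LEGACY_BLOCK_SIZE + LEGACY_BLOCK_SIZE // 255 + 16


def legacy_block_sizes(data, offset):
    """Walk the size-prefixed blocks of a legacy frame starting at offset.

    Returns the list of compressed block sizes. An empty list means the
    bytes at offset do not look like the start of a legacy frame.
    """
    if data[offset:offset + 4] != LEGACY_MAGIC:
        return []
    sizes = []
    pos = offset + 4
    while pos + 4 <= len(data):
        (size,) = struct.unpack_from("<I", data, pos)
        # Stop at an implausible size or a block that would run past the end.
        if size == 0 or size > MAX_COMPRESSED or pos + 4 + size > len(data):
            break
        sizes.append(size)
        pos += 4 + size
    return sizes
```

An offset that yields several multi-megabyte blocks is a much better candidate than one that yields nothing. Whatever follows the last block can still be misread as one more size, so treat the result as a hint rather than proof.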

Neat! The first match looks like the one to try, so let's uncompress the blob starting at offset 17332:

$ dd if=/boot/kernel bs=17332 skip=1 | lz4 -dlc >/tmp/kernel-image
$ file /tmp/kernel-image
/tmp/kernel-image: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=5e2c94abac046f5e20c55ccba9e07208eba5830b, stripped, with debug_info
$ binwalk /tmp/kernel-image
DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             ELF, 64-bit LSB executable, AMD x86-64, version 1 (SYSV)
14684320      0xE010A0        Linux kernel version "4.9.11-hardened (root@sysresccd) (gcc version 5.4.0 (Gentoo Hardened 5.4.0-r3 p1.3, pie-0.6.5) ) #4 SMP Sun Feb 26 23:55:23 CET "
14830216      0xE24A88        gzip compressed data, maximum compression, from Unix, NULL date (1970-01-01 00:00:00)
15155136      0xE73FC0        CRC32 polynomial table, little endian
15301919      0xE97D1F        Unix path: /370R4V/370R5E/3570RE/370R5V
16988114      0x10337D2       Unix path: /arch/x86/include/asm/irqflags.h
[...]
22949458      0x15E2E52       Unix path: /arch/x86/platform/efi/efi.c
22991298      0x15ED1C2       Unix path: /drivers/firmware/efi/memattr.c
22991458      0x15ED262       Unix path: /drivers/firmware/efi/memmap.c
23052576      0x15FC120       Certificate in DER format (x509 v3), header length: 4, sequence length: 1342
23154632      0x1614FC8       ASCII cpio archive (SVR4 with no CRC), file name: ".", file name length: "0x00000002", file size: "0x00000000"
23154744      0x1615038       ASCII cpio archive (SVR4 with no CRC), file name: "proc", file name length: "0x00000005", file size: "0x00000000"
23154860      0x16150AC       ASCII cpio archive (SVR4 with no CRC), file name: "var", file name length: "0x00000004", file size: "0x00000000"
[...]
44243788      0x2A31B4C       ASCII cpio archive (SVR4 with no CRC), file name: "usr/lib64", file name length: "0x0000000A", file size: "0x00000003"
44243912      0x2A31BC8       ASCII cpio archive (SVR4 with no CRC), file name: "TRAILER!!!", file name length: "0x0000000B", file size: "0x00000000"

Eureka! Those ASCII cpio archive entries are the files from the initramfs. The cpio archive that holds them is uncompressed, as it is embedded into the kernel and thus already benefits from the compression of the kernel itself.
Reading cpio(5) (specifically the section on the "New ASCII Format", newc), it looks like a cpio archive is just a collection of files, each one with its cpio header. So the first entry at 0x1614fc8 must be the start of the archive:

$ dd if=/tmp/kernel-image bs=$((0x1614fc8)) skip=1 of=/tmp/initramfs.cpio
$ file /tmp/initramfs.cpio
/tmp/initramfs.cpio: ASCII cpio archive (SVR4 with no CRC)
$ mkdir /tmp/initramfs
$ cpio -idHnewc --no-absolute-filenames -D /tmp/initramfs </tmp/initramfs.cpio
41191 blocks

And there it is! The initramfs is now successfully extracted from the kernel, ready to be scrutinized.
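
For a quick look at the archive without unpacking it, the newc layout from cpio(5) is simple enough to walk by hand: each entry is a 110-byte ASCII header (the magic "070701" plus 13 fields of 8 hex digits), then the NUL-terminated file name, then the file data, with name and data each padded to a 4-byte boundary. The following Python sketch lists entries under that reading of the man page. `list_newc` and `align4` are made-up names, and it only handles the "070701" variant (no CRC), which is what binwalk reported here.

```python
HEADER_LEN = 110          # 6-byte magic + 13 fields of 8 hex digits each
NEWC_MAGIC = b"070701"


def align4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def list_newc(data, offset=0):
    """Yield (name, file_size) for each entry of a newc archive.

    offset is where the archive starts inside data. Padding is computed
    relative to that start, not to the beginning of data.
    """
    pos = 0  # position relative to the start of the archive
    while True:
        hdr = data[offset + pos:offset + pos + HEADER_LEN]
        if len(hdr) < HEADER_LEN or hdr[:6] != NEWC_MAGIC:
            return
        fields = [int(hdr[6 + 8 * i:14 + 8 * i], 16) for i in range(13)]
        file_size, name_size = fields[6], fields[11]
        name_start = pos + HEADER_LEN
        # name_size counts the terminating NUL byte, so drop it.
        raw = data[offset + name_start:offset + name_start + name_size - 1]
        name = raw.decode("utf-8", "replace")
        if name == "TRAILER!!!":
            return
        yield name, file_size
        pos = align4(align4(name_start + name_size) + file_size)
```

Something like `list(list_newc(open("/tmp/kernel-image", "rb").read(), 0x1614fc8))` should then walk the entries straight from the decompressed image, with no need for the intermediate initramfs.cpio file.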
