Linux Kernel Module Symbol Versioning
The Linux kernel itself and its modules may export symbols, so that other modules can import and use them. As the functions are written in C, it is important that the function signature matches:
- the number of arguments must match
- the ordering of the arguments must match
- the data types must match, which includes the structure and layout of all input and output parameters
If any of them changes, the Application Binary Interface (ABI) changes and you risk crashing the kernel. If you’re lucky, recompiling the kernel and the modules is enough for both ends to pick up the new Application Programming Interface (API).
To detect such breaking changes, the Linux kernel can be compiled with CONFIG_MODVERSIONS enabled.
This calculates a Cyclic Redundancy Check (CRC) checksum over the function signature and embeds this information in the kernel and the modules.
The dynamic linker of the Linux kernel checks that, for each requested symbol, its CRC matches the CRC recorded by the Linux kernel or by an already loaded module.
A module is only loaded if a match is found for all symbols.
Otherwise loading fails.
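The check described above can be sketched in a few lines of user-space C. This is only an illustration of the idea, not the kernel's implementation: struct versioned_symbol, find_export and versions_match are hypothetical names, and the real loader works on ELF sections rather than plain arrays.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of one versioned symbol. */
struct versioned_symbol {
    const char *name;
    uint32_t crc;
};

/* Find a symbol by name; returns NULL if it is not exported. */
static const struct versioned_symbol *
find_export(const struct versioned_symbol *exports, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(exports[i].name, name) == 0)
            return &exports[i];
    return NULL;
}

/* Returns 1 if every import has an export with an identical CRC, else 0. */
static int versions_match(const struct versioned_symbol *exports, size_t n_exports,
                          const struct versioned_symbol *imports, size_t n_imports)
{
    for (size_t i = 0; i < n_imports; i++) {
        const struct versioned_symbol *e =
            find_export(exports, n_exports, imports[i].name);
        if (e == NULL || e->crc != imports[i].crc)
            return 0;   /* unknown symbol or changed signature */
    }
    return 1;
}
```

A single mismatching or missing symbol is enough to reject the whole module, which mirrors the all-or-nothing behaviour described above.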
Rust goes DWARF
The mechanism described above does not work with Rust.
As such the Linux kernel learned a new trick and can use the DWARF (Debugging With Arbitrary Record Formats) debugging information to calculate the CRC.
When CONFIG_RUST is enabled, gendwarfksyms is used instead of genksyms.
The two tools are incompatible, as they calculate different CRCs for the same function.
But they work similarly enough, so I will not go into details here.
If you're interested, look for CONFIG_EXTENDED_MODVERSIONS.
Executable and Linkable Format
Linux kernel modules are object files using the Executable and Linkable Format (ELF).
Instead of using the well-known suffix .o, they use the suffix .ko, but are otherwise the same.
They are composed of multiple sections containing executable code, read-only constants, initialized data and other information required for linking.
Example: ELF sections of a Linux Kernel Module
$ objdump --section-headers --wide avm-modver.ko
$ LC_ALL=C readelf --wide --section-headers avm-modver.ko
There are 38 section headers, starting at offset 0x25d80:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al Usage
🔵[ 0] NULL 0 0 0 0 0 0 0 ELF header
🟠[ 1] .note.gnu.build-id NOTE 0 40 24 0 A 0 0 4 unique build ID bitstring
🟣[ 2] .note.Linux NOTE 0 64 30 0 A 0 0 4 Architecture data
🟢[ 3] .text PROGBITS 0 a0 1f 0 AX 0 0 16 Code
⚪[ 4] .rela.text RELA 0 14e50 30 18 I 35 3 8
🔴[ 5] __ksymtab PROGBITS 0 c0 c 0 A 0 0 4 EXPORT_SYMBOL
⚪[ 6] .rela__ksymtab RELA 0 14e80 48 18 I 35 5 8
🔴[ 7] __kcrctab PROGBITS 0 cc 4 0 A 0 0 4 CRC
🟣[ 8] __mcount_loc PROGBITS 0 d0 8 0 A 0 0 1 ftrace()
⚪[ 9] .rela__mcount_loc RELA 0 14ec8 18 18 I 35 8 8
🟣[10] .modinfo PROGBITS 0 d8 92 0 A 0 0 1 MODULE_INFO
🟣[11] .return_sites PROGBITS 0 16a 4 0 A 0 0 1 Live patching
⚪[12] .rela.return_sites RELA 0 14ee0 18 18 I 35 11 8
🟣[13] .call_sites PROGBITS 0 16e 4 0 A 0 0 1 Live patching
⚪[14] .rela.call_sites RELA 0 14ef8 18 18 I 35 13 8
🔴[15] __ksymtab_strings PROGBITS 0 172 d 1 AMS 0 0 1 EXPORT_SYMBOL
🔴[16] __versions PROGBITS 0 180 51 0 A 0 0 32 CRC
🟣[17] __patchable_function_entries PROGBITS 58 1d8 8 0 WAL 3 0 8 NOPs
⚪[18] .rela__patchable_function_entries RELA 0 14f10 18 18 I 35 17 8
⚫[19] .data PROGBITS 0 1e0 0 0 WA 0 0 1 Initialized data
🟣[20] .gnu.linkonce.this_module PROGBITS 0 200 500 0 WA 0 0 64
⚫[21] .bss NOBITS 0 700 0 0 WA 0 0 1 Uninitialized data
🟠[22] .debug_info PROGBITS 0 700 b51f 0 0 0 1
⚪[23] .rela.debug_info RELA 0 14f28 fc00 18 I 35 22 8
🟠[24] .debug_abbrev PROGBITS 0 bc1f 71e 0 0 0 1
🟠[25] .debug_aranges PROGBITS 0 c33d 50 0 0 0 1
⚪[26] .rela.debug_aranges RELA 0 24b28 48 18 I 35 25 8
🟠[27] .debug_line PROGBITS 0 c38d 3be 0 0 0 1
⚪[28] .rela.debug_line RELA 0 24b70 1050 18 I 35 27 8
🟠[29] .debug_str PROGBITS 0 c74b 7792 1 MS 0 0 1
🟠[30] .debug_line_str PROGBITS 0 13edd 943 1 MS 0 0 1
🟡[31] .comment PROGBITS 0 14820 58 1 MS 0 0 1 Compiler version
🟡[32] .note.GNU-stack PROGBITS 0 14878 0 0 0 0 1 Stack hardening flag
🟠[33] .debug_frame PROGBITS 0 14878 40 0 0 0 8
⚪[34] .rela.debug_frame RELA 0 25bc0 30 18 I 35 33 8
🔵[35] .symtab SYMTAB 0 148b8 438 18 36 40 8 Symbols
🔵[36] .strtab STRTAB 0 14cf0 15f 0 0 0 1 Symbol names
🔵[37] .shstrtab STRTAB 0 25bf0 18b 0 0 0 1 Section names
Key to Flags:
Write, Alloc, eXecute, Merge, Strings, Info, Link order, extra Os processing required, Group, TLS,
Compressed, x=unknown, o=OS specific, Exclude, mbinD, large, processor specific
- 🔴 Linux kernel module specific sections
- 🟣 Linux specific sections
- ⚫ data
- 🟢 executable code
- ⚪ relocations
- 🟠 debug information
- 🟡 compiler information
- 🔵 ELF
The section names have varying lengths.
As such the names are collected in their own section called .shstrtab, which is referenced by index in the ELF file header.
All sections are listed in the section header table and their names are referenced by offset.
Run readelf -p .shstrtab avm-job.ko to dump those names.
The same applies to symbols:
Their names are collected in the section .strtab and referenced via offset from .symtab.
Run readelf -p .strtab avm-job.ko to dump those names.
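Referencing a name "by offset" simply means pointing into a block of NUL-terminated strings. A minimal sketch, assuming the table has already been read into memory; strtab_name is a hypothetical helper, not part of any ELF library:

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the NUL-terminated name stored at byte offset `off` inside a
 * string table of `size` bytes, or NULL if the offset is out of range
 * or the string is not terminated inside the table.
 */
static const char *strtab_name(const char *strtab, size_t size, size_t off)
{
    if (off >= size)
        return NULL;
    if (memchr(strtab + off, '\0', size - off) == NULL)
        return NULL;
    return strtab + off;
}
```

For a table holding the bytes "\0.symtab\0.strtab\0.shstrtab\0", offset 1 yields .symtab and offset 9 yields .strtab. Offset 0 is the empty name, which is why ELF string tables start with a NUL byte.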
.symtab contains all symbols and .strtab their names.
When shared objects (.so) are used, the linker moves those symbols to .dynsym and their names to .dynstr.
Already resolved symbols may be removed, or both tables .symtab and .strtab may be stripped completely.
The remaining dynamic symbols are only resolved by the dynamic linker when the section is loaded.
The dynamic linker has to go through the section and substitute the placeholders with the then-correct addresses.
For that the ELF file contains relocation sections, of which there are two types:
REL (relocation without addend) and RELA (relocation with addend), which allows a constant to be added to the resolved address.
Either one may be used per section, and each relocation table references the symbol table it uses.
Not all sections are loaded into memory, and some are freed again once the linker no longer needs them. Only those sections that contain information necessary for runtime execution of the file are kept. Multiple (similar) sections can be combined and are then called segments. But that is only relevant for fully linked executables: only they have a program header.
References to functions are then resolved by the linker and the place-holders get replaced by the real addresses. This is where versioning kicks in.
Anatomy of a Linux kernel module
When you write and export a function in the Linux kernel or a module, the following happens:
void my_function(void) {
return;
}
- The compiler/assembler puts the code into the .text section.
- The name of the function is added to the .strtab section.
- An entry is added to the .symtab section linking the offset within the .text section to the name via its offset in the .strtab section.
Using EXPORT_SYMBOL adds more magic:
#include <linux/module.h>
EXPORT_SYMBOL(my_function);
- It puts the name of the function into a section called __ksymtab_strings.
$ LC_ALL=C readelf --wide --string-dump=__ksymtab_strings avm-modver.ko
String dump of section '__ksymtab_strings':
[ 0] my_function
- It creates a new section called __ksymtab+my_function with a single struct kernel_symbol linking the address of the function to its name. Later on these sections will be collected by the linker script scripts/module-common.lds and will be put into the section called __ksymtab. The same happens for EXPORT_SYMBOL_GPL, EXPORT_SYMBOL_GPL_FUTURE and EXPORT_SYMBOL_NS, but with different prefixes.
$ LC_ALL=C readelf --wide --relocated-dump=__ksymtab --relocs avm-modver.ko | grep -A4 __ksymtab
Relocation section '.rela__ksymtab' at offset 0x14fc8 contains 3 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000002e00000002 R_X86_64_PC32 0000000000000010 my_function + 0
0000000000000004 0000001b00000002 R_X86_64_PC32 0000000000000000 __kstrtab_my_function + 0
0000000000000008 0000001c00000002 R_X86_64_PC32 000000000000000c __kstrtabns_my_function + 0
--
Hex dump of section '__ksymtab':
0x00000000 10000000 fcffffff 04000000 ............
To see more details, use make avm-modver.i to run the pre-processor and get the intermediate file, in which all macros have been expanded.
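The three words in the __ksymtab hex dump above can be re-derived by hand from the relocation entries. A minimal sketch, assuming the usual x86-64 rule for R_X86_64_PC32 (stored value = S + A - P) and that every section of the unlinked .ko starts at address 0, as the section table suggests; reloc_pc32 is a hypothetical helper, not a kernel function:

```c
#include <stdint.h>

/*
 * R_X86_64_PC32: the 32-bit value stored at the relocated place is
 * S + A - P, where S is the symbol's address, A the addend and
 * P the address of the place being patched. The result is truncated
 * to 32 bits, so negative distances wrap around (two's complement).
 */
static uint32_t reloc_pc32(uint64_t sym_addr, int64_t addend, uint64_t place)
{
    return (uint32_t)(sym_addr + (uint64_t)addend - place);
}
```

With the values from the dump, reloc_pc32(0x10, 0, 0) gives 0x00000010, reloc_pc32(0, 0, 4) gives 0xfffffffc and reloc_pc32(0xc, 0, 8) gives 0x00000004. Written as little-endian bytes these are exactly 10000000 fcffffff 04000000. Each field therefore holds a distance relative to its own position, not an absolute address.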
With CONFIG_MODVERSIONS enabled even more magic happens.
If a module uses EXPORT_SYMBOL, then genksyms is called.
The source code of the module is pre-processed again via cpp, but with a different definition for EXPORT_SYMBOL.
- For each function exported via EXPORT_SYMBOL a CRC for the function signature is computed by parsing the C function declaration. A new section called ___kcrctab+my_function with a single long containing the CRC is created. Later on these sections will be collected by the linker script scripts/module-common.lds and will be put into the section called __kcrctab. The same happens for EXPORT_SYMBOL_GPL, EXPORT_SYMBOL_GPL_FUTURE and EXPORT_SYMBOL_NS, but with different prefixes.
- For each used symbol the CRC is looked up in the Module.symvers files. They are created as part of the kernel or any module compilation process when CONFIG_MODVERSIONS is enabled. The file collects the CRC and module path for each symbol. The symbol names and their CRCs are collected in a const char __versions[] array in section __versions.
Module loading
When a kernel module is loaded, the Linux kernel linker resolves all dynamic symbols of the module.
It looks up each unresolved symbol from .symtab and resolves it against all symbols loaded so far.
You can view them from user-space in /proc/kallsyms.
In addition to that simple lookup the loader also checks the module's license from .modinfo:
Symbols exported via EXPORT_SYMBOL_GPL can only be resolved if the module declares MODULE_LICENSE("GPL") or another GPL-compatible license.
When CONFIG_MODVERSIONS is enabled, the linker inside the Linux kernel also checks the CRC:
For every undefined symbol there is a matching entry in section __versions, which contains the CRC of the symbol from compile time.
$ LC_ALL=C readelf --wide -s avm-modver.ko | grep UND
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
42: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __fentry__
43: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _printk
44: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __x86_return_thunk
But there are two different layouts used:
Upstream Linux Kernel
$ LC_ALL=C readelf --wide --hex-dump=__versions avm-modver.ko
Hex dump of section '__versions':
0x00000000 bb6dfbbd 00000000 5f5f6665 6e747279 .m......__fentry
0x00000010 5f5f0000 00000000 00000000 00000000 __..............
0x00000020 00000000 00000000 00000000 00000000 ................
0x00000030 00000000 00000000 00000000 00000000 ................
0x00000040 d87e9992 00000000 5f707269 6e746b00 .~......_printk.
0x00000050 00000000 00000000 00000000 00000000 ................
0x00000060 00000000 00000000 00000000 00000000 ................
0x00000070 00000000 00000000 00000000 00000000 ................
0x00000080 cb8119bf 00000000 6d6f6475 6c655f6c ........module_l
0x00000090 61796f75 74000000 00000000 00000000 ayout...........
0x000000a0 00000000 00000000 00000000 00000000 ................
0x000000b0 00000000 00000000 00000000 00000000 ................
The original Linux kernel uses const struct modversion_info __version[].
The structure has a fixed size of 64 bytes:
- the first 8 bytes contain the CRC.
- the remaining 56 bytes contain the symbol name.
Longer symbol names are not supported and require the use of the extended modversions.
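Decoding the fixed-size layout from a raw dump is straightforward. A minimal sketch, assuming a little-endian 64-bit target as in the dump above, where only the low 4 bytes of the 8-byte CRC field are non-zero; struct version_entry and decode_entry are hypothetical names for illustration:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENTRY_SIZE 64u   /* fixed entry size described above */
#define CRC_SIZE    8u   /* CRC field; the dump only uses the low 4 bytes */
#define NAME_SIZE  (ENTRY_SIZE - CRC_SIZE)

/* Decoded form of one fixed-size entry (hypothetical helper type). */
struct version_entry {
    uint32_t crc;
    char name[NAME_SIZE + 1];   /* +1 so the copy is always terminated */
};

/* Decode entry number `idx` from a raw little-endian __versions dump. */
static struct version_entry decode_entry(const unsigned char *section, size_t idx)
{
    const unsigned char *p = section + idx * ENTRY_SIZE;
    struct version_entry e;

    e.crc = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
            (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    memcpy(e.name, p + CRC_SIZE, NAME_SIZE);
    e.name[NAME_SIZE] = '\0';
    return e;
}
```

Applied to the first 64 bytes of the dump, this yields the CRC 0xbdfb6dbb for __fentry__. The fixed stride makes indexing trivial, but it also caps the name length at what fits into the remaining 56 bytes.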
Ubuntu Linux Kernel
$ LC_ALL=C readelf --wide --hex-dump=__versions avm-modver.ko
Hex dump of section '__versions':
0x00000000 14000000 bb6dfbbd 5f5f6665 6e747279 .....m..__fentry
0x00000010 5f5f0000 10000000 7e3a2c12 5f707269 __......~:,._pri
0x00000020 6e746b00 1c000000 ca39825b 5f5f7838 ntk......9.[__x8
0x00000030 365f7265 7475726e 5f746875 6e6b0000 6_return_thunk..
0x00000040 18000000 eb7b33e1 6d6f6475 6c655f6c .....{3.module_l
0x00000050 61796f75 74000000 00000000 00000000 ayout...........
0x00000060 00
Ubuntu has changed this and uses const char ____versions[]:
- the first 4 bytes contain the length of the entry, which is also the offset to the next entry.
- the next 4 bytes contain the CRC.
- next follows the symbol name with a terminating NUL byte.
- more NUL bytes follow as padding up to the next address divisible by 4.
Ubuntu changed this to support longer symbol names, which Ubuntu claims is required for Rust support. See modpost: support arbitrary symbol length in modversion for details. This has been reverted by 2039010.
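In the Ubuntu dump above, the 8-byte header of each entry splits into two little-endian 32-bit words: the first matches the total entry length (0x14, 0x10, 0x1c, 0x18) and the second is the CRC. A minimal sketch of walking such a section, assuming that reading of the dump and a zero length as terminator; le32 and lookup_crc are hypothetical helpers, not kernel functions:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Walk the variable-length entries and store the CRC recorded for `name`
 * in *crc. Returns 1 on success, 0 if the name is not present or the
 * data is malformed.
 */
static int lookup_crc(const unsigned char *sec, size_t size,
                      const char *name, uint32_t *crc)
{
    size_t off = 0;

    while (off + 8 <= size) {
        uint32_t len = le32(sec + off);        /* length of this entry */
        if (len == 0)                          /* terminator reached */
            return 0;
        if (len < 9 || len > size - off)       /* header plus at least a NUL */
            return 0;
        const char *entry_name = (const char *)(sec + off + 8);
        if (memchr(entry_name, '\0', len - 8) == NULL)
            return 0;                          /* name not terminated */
        if (strcmp(entry_name, name) == 0) {
            *crc = le32(sec + off + 4);
            return 1;
        }
        off += len;                            /* jump to the next entry */
    }
    return 0;
}
```

Looking up _printk in the bytes of the dump yields 0x122c3a7e. Unlike the fixed 64-byte layout, entries cannot be indexed directly, but the name length is no longer bounded by the entry size.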
…