[Translation] How the sizes of C arrays became part of the library's binary interface


Most C compilers let you access an extern array whose bounds are not specified, for example:

  extern int external_array[];

  int
  array_get (long int index)
  {
    return external_array[index];
  }

The definition of external_array may be in another translation unit and look like this:

  int external_array[3] = {1, 2, 3};

The question is what happens if this separate definition changes like this:

  int external_array[4] = {1, 2, 3, 4};

Or like this:

  int external_array[2] = {1, 2};

Is the binary interface preserved (assuming there is a mechanism that lets the application determine the size of the array at run time)?

Curiously, on many architectures, increasing the size of the array breaks application binary interface (ABI) compatibility. Reducing the size of the array can also cause compatibility problems. In this article, we take a closer look at ABI compatibility and explain how to avoid these problems.

Data references in executables


To understand how the size of the array becomes part of the binary interface, we first need to examine data references in executables. Of course, the details depend on the specific architecture; here we focus on x86-64.

The x86-64 architecture supports addressing relative to the program counter, which means that an access to a global array variable, as in the array_get function shown earlier, can be compiled into a single movl instruction:

  array_get:
          movl    external_array(,%rdi,4), %eax
          ret

From this, the assembler creates an object file in which the instruction carries an R_X86_64_32S relocation.

  0000000000000000 <array_get>:
     0: mov    0x0(,%rdi,4),%eax
          3: R_X86_64_32S external_array
     7: retq

This relocation tells the link editor (ld) to fill in the actual location of the external_array variable during linking, when it creates the executable.

This has two important consequences.

  • Since the offset of the variable is determined at build time, there is no run-time overhead for computing it. The only cost is the memory access itself.
  • To determine the offset, the sizes of all data variables must be known. Otherwise, it would be impossible to compute the layout of the data section at link time.

For C implementations that target the Executable and Linking Format (ELF), as on GNU/Linux, references to extern variables do not carry object sizes. In the array_get example, the size of the object is unknown even to the compiler. In fact, the entire assembler file looks like this (omitting only the unwinding information, by using -fno-asynchronous-unwind-tables; that information is technically required for psABI compliance):

          .file   "get.c"
          .text
          .p2align 4,,15
          .globl  array_get
          .type   array_get, @function
  array_get:
          movl    external_array(,%rdi,4), %eax
          ret
          .size   array_get, .-array_get
          .ident  "GCC: (GNU) 8.3.1 20190223 (Red Hat 8.3.1-2)"
          .section        .note.GNU-stack,"",@progbits

This assembler file contains no size information for external_array: the only reference to the symbol is in the line with the movl instruction, and the only numeric data in that instruction is the size of an array element (implied by movl and the scale factor of 4).

If ELF required sizes for undefined variables, it would not even be possible to compile the array_get function.

How does the linker get the actual symbol size? It looks at the definition of the symbol and uses the size information it finds there. This allows the linker to compute the layout of the data section and fill in the data relocations with the appropriate offsets.
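The split between declaration and definition can be seen from the C side as well. The following is a minimal sketch: element_size is a hypothetical helper added for illustration, and the definition at the bottom stands in for the one that would normally live in another translation unit.

```c
#include <stddef.h>

/* Declaration only: the array type is incomplete, so its length is
   unknown at this point.  */
extern int external_array[];

/* The element size follows from the declared type.  It is all the
   compiler needs in order to scale the index.  (Hypothetical helper.)  */
size_t
element_size (void)
{
  return sizeof external_array[0];
}

/* Writing "sizeof external_array" here would be rejected by the
   compiler, because the array type is still incomplete.  */

int
array_get (long int index)
{
  return external_array[index];
}

/* Stand-in for the definition in another translation unit.  Only the
   definition carries the total size of the object.  */
int external_array[3] = {1, 2, 3};
```

The functions compile without any knowledge of the array length, which matches the assembler output above: the total size only exists where the array is defined.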

ELF shared objects


C implementations for ELF do not require the programmer to add source code markup to indicate whether a function or a variable is in the current object (which can be a library or the main executable file) or in another object. The linker and the dynamic loader will take care of this.

At the same time, for executables there was a desire not to degrade performance by changing the compilation model. This means that when the source code of the main program is compiled (that is, without -fPIC, and in this particular case also without -fPIE), the array_get function compiles into exactly the same instruction sequence as before the introduction of dynamic shared objects. In addition, it does not matter whether the external_array variable is defined in the main executable itself or in a shared object that is loaded separately at run time. The instructions produced by the compiler are the same in both cases.

How is this possible? After all, ELF shared objects are position-independent. They are loaded at unpredictable, randomized addresses at run time. Yet the compiler generates a machine code sequence that requires these variables to be located at a fixed offset computed at build time, long before the program starts.

The answer is that only one loaded object (the main executable) uses these fixed offsets. All other objects (the dynamic loader itself, the C run-time library, and any other library used by the program) are compiled and linked as fully position-independent code (PIC). For such objects, the compiler loads the actual address of each variable from the global offset table (GOT). We can see this indirection if we compile the array_get example with -fPIC, which results in this assembler code:

  array_get:
          movq    external_array@GOTPCREL(%rip), %rax
          movl    (%rax,%rdi,4), %eax
          ret

As a result, the address of the external_array variable is no longer hard-coded and can be changed at run time by initializing the GOT entry appropriately. This means that at run time the definition of external_array can be in the same shared object, in another shared object, or in the main program. The dynamic loader finds the appropriate definition based on the ELF symbol lookup rules and binds the undefined symbol reference to its definition by setting the GOT entry to the actual address.
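Conceptually, the position-independent code behaves like an access through a pointer slot that someone else fills in before the first use. The sketch below models that idea in plain C. All names in it (got_external_array, model_bind_got, array_get_pic, definition_in_some_object) are hypothetical; a real GOT is set up by the linker and the dynamic loader, not by C code.

```c
/* Stand-in for the definition, which could live in any loaded object.  */
static int definition_in_some_object[3] = {1, 2, 3};

/* Hypothetical model of one GOT entry: a pointer slot that is filled
   in once the symbol has been resolved.  */
static int *got_external_array;

/* Models the dynamic loader binding the symbol reference to its
   definition at load time.  */
void
model_bind_got (void)
{
  got_external_array = definition_in_some_object;
}

/* Models the -fPIC code: first load the address from the GOT entry
   (the movq instruction), then index through it (the movl).  */
int
array_get_pic (long int index)
{
  int *base = got_external_array;
  return base[index];
}
```

The extra pointer load is the price of position independence: the code itself never contains the address of the variable, only the location of the slot that holds it.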

Let us return to the original example, where the array_get function is in the main program and the address of the variable is therefore specified directly. The key idea implemented in the linker is that the main program provides the definition of the external_array variable, even if it is actually defined in a shared object at run time. Instead of pointing to the original definition of the variable in the shared object, the dynamic loader selects the copy of the variable in the data section of the executable.

This has two important consequences. First of all, recall that external_array is defined as:

  int external_array[3] = {1, 2, 3};

There is an initializer that must be applied to the definition in the main executable. For this purpose, the main executable contains a copy relocation for the symbol. The readelf -rW command displays it as R_X86_64_COPY.

 Relocation section '.rela.dyn' at offset 0x408 contains 3 entries:
  Offset Info Type Symbol's Value Symbol's Name + Addend
 0000000000403ff0 0000000100000006 R_X86_64_GLOB_DAT 0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0
 0000000000403ff8 0000000200000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
 0000000000404020 0000000300000005 R_X86_64_COPY 0000000000404020 external_array + 0 

Like other relocations, copy relocations are processed by the dynamic loader. Processing one involves a simple, bitwise copy operation. The target of the copy is determined by the relocation offset (0000000000404020 in the example). The source is determined at run time based on the symbol name (external_array) and its value. When making the copy, the dynamic loader also looks at the size of the symbol to obtain the number of bytes to copy. To make all this possible, the external_array symbol is automatically exported from the executable as a defined symbol, so that it is visible to the dynamic loader at run time. The dynamic symbol table (.dynsym) reflects this, as shown by the readelf -sW command:

 Symbol table '.dynsym' contains 4 entries:
  Num: Value Size Type Bind Vis Ndx Name
  0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
  1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.2.5 (2)
  2: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
  3: 0000000000404020 12 OBJECT GLOBAL DEFAULT 22 external_array 

Where does the object size (12 bytes in this example) come from? The linker opens all shared objects, searches for the definition of the symbol, and takes the size information from there. As before, this allows the linker to compute the layout of the data section so that fixed offsets can be used. Again, the size of the definition in the main executable is fixed and cannot be changed at run time.
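The copy step can be modeled in plain C. This is a minimal sketch and not the dynamic loader's actual code: struct model_symbol and model_copy_relocation are hypothetical names, and copying the smaller of the two recorded sizes is an assumption of this model.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a symbol table entry: where the
   object lives and how many bytes it occupies.  */
struct model_symbol
{
  void *address;
  size_t size;
};

/* Models the processing of one copy relocation.  The initializer is
   copied bitwise from the definition in the shared object into the
   space that the executable reserved at link time.  Assumption of
   this model: the smaller of the two sizes limits the copy.
   Returns 1 if the two sizes differ, 0 otherwise.  */
int
model_copy_relocation (const struct model_symbol *in_executable,
                       const struct model_symbol *in_shared_object)
{
  size_t n = in_executable->size < in_shared_object->size
             ? in_executable->size
             : in_shared_object->size;
  memcpy (in_executable->address, in_shared_object->address, n);
  return in_executable->size != in_shared_object->size;
}
```

In this model the destination never grows: whatever size the executable recorded at link time is the upper bound for what can arrive there at run time.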

The dynamic loader also redirects symbol references in shared objects to the relocated copy in the main executable. This ensures that the entire program has only one copy of the variable, as required by the semantics of the C language. Otherwise, if the variable were modified after initialization, updates from the main executable would not be visible to the dynamic shared objects, and vice versa.

Impact on binary compatibility


What happens if we change the definition of external_array in the shared object without relinking (or recompiling) the main program? First, consider adding an array element.

  int external_array[4] = {1, 2, 3, 4};

This results in a warning from the dynamic loader at run time:

main-program: Symbol `external_array' has different size in shared object, consider re-linking

The main program still contains a definition of external_array with only 12 bytes of space. This means that the copy is incomplete: only the first three elements of the array are copied. As a result, accessing the element external_array[3] is undefined. This affects not only the main program but all code in the process, because all references to external_array have been redirected to the definition in the main program. That includes the shared object that provides the definition of external_array. It is probably not prepared for an array element to have disappeared from its own definition.

What about changing in the opposite direction, removing the item?

  int external_array[2] = {1, 2};

If the program avoids accessing the element external_array[2] because it somehow detects the reduced array length, this will work. There is a bit of unused memory after the array, but that will not break the program.
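One possible way to detect the reduced length is a companion value that the library exports next to the array. The following is a sketch under that assumption: external_array_length and array_get_checked are hypothetical names that do not appear in the original example, and the two definitions at the top stand in for what the library would provide.

```c
#include <stddef.h>

/* Stand-ins for the library's exports.  external_array_length is a
   hypothetical companion variable that holds the element count.  */
int external_array[2] = {1, 2};
size_t external_array_length = 2;

/* Stores the element in *value and returns 1 if index is valid.
   Returns 0 without touching the array otherwise.  */
int
array_get_checked (long int index, int *value)
{
  if (index < 0 || (size_t) index >= external_array_length)
    return 0;
  *value = external_array[index];
  return 1;
}
```

An exported scalar like this is subject to a copy relocation as well, but its size is always that of a size_t, so changing the stored value does not change the size of the symbol.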

This means that we get the following rule:

  • Adding elements to a global array variable breaks binary compatibility.
  • Removing elements can break compatibility if there is no mechanism that prevents access to the removed elements.

Unfortunately, the warning from the dynamic loader looks more harmless than it actually is, and there is no warning at all for removed elements.

How to avoid this situation


Detecting ABI changes is fairly easy with tools such as libabigail.

The easiest way to avoid this situation is to implement a function that returns the address of the array:

  static int local_array[3] = {1, 2, 3};

  int *
  get_external_array (void)
  {
    return local_array;
  }

If the array definition cannot be made static because of how it is used within the library, we can give it hidden visibility instead. This also prevents its export and therefore avoids the truncation problem:

  int local_array [3] __attribute__ ((visibility ("hidden"))) =
  {1, 2, 3};  

Things are more complicated if the array variable has to remain exported for backward compatibility. Because the array from the library is truncated, an old main program with a shorter array definition cannot give new client code access to the full array if that code uses the same global array. Instead, the accessor function can use a separate (static or hidden) array, or perhaps a separate array for the elements added at the end. The disadvantage is that it is not possible to store everything in one contiguous array if the array variable is exported for backward compatibility. The design of the additional interface needs to reflect this.
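The variant with a separate array for the added elements could look roughly like this. It is only a sketch: added_elements, external_array_count, and external_array_element are hypothetical names, and the split into three legacy elements plus one added element is chosen to match the running example.

```c
#include <stddef.h>

/* Still exported for old binaries.  Its size stays frozen at three
   elements.  */
int external_array[3] = {1, 2, 3};

/* Hypothetical storage for elements added in later library versions.
   It is static, so it is never exported or copied.  */
static int added_elements[1] = {4};

#define LEGACY_COUNT (sizeof external_array / sizeof external_array[0])
#define ADDED_COUNT (sizeof added_elements / sizeof added_elements[0])

/* Total number of elements across both arrays.  */
size_t
external_array_count (void)
{
  return LEGACY_COUNT + ADDED_COUNT;
}

/* Returns a pointer to the element at index, or NULL if index is out
   of range.  The caller cannot assume that consecutive elements are
   adjacent in memory.  */
int *
external_array_element (size_t index)
{
  if (index < LEGACY_COUNT)
    return &external_array[index];
  if (index - LEGACY_COUNT < ADDED_COUNT)
    return &added_elements[index - LEGACY_COUNT];
  return NULL;
}
```

Returning one element at a time instead of a base pointer is what makes the non-contiguous storage visible in the interface, which is the design constraint described above.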

Using symbol versioning, you can export multiple versions with different sizes and never change the size under a particular version. With this model, newly linked programs always use the latest version, presumably the one with the largest size. Because the link editor fixes the version and the size of a symbol at the same time, they are always consistent. The GNU C Library uses this approach for the historic variables sys_errlist and sys_siglist. However, this still does not provide a single contiguous array.

All things considered, an accessor function (such as the get_external_array function above) is the best approach to avoid this ABI compatibility problem.
