M0AGX / LB9MG

Amateur radio and embedded systems

Abusing reserved interrupt vectors on Cortex-M for metadata

Bootloaders in embedded systems need a way to tell whether a valid application is present in memory. Executables on devices with an operating system use elaborate file formats like ELF and EXE, but small bare-metal bootloaders need something simpler to keep both the bootloader and the application small. In this post I present the simplest possible scheme: a single CRC32 checksum appended to a raw binary file, and a way to generate it.

To verify the checksum of an application, the bootloader first has to know where the application binary starts and where it ends. The starting point is usually fixed by the linker script, but the application can grow (or shrink) from release to release. A brute-force approach would be to always make an image that fills all available memory, but that would be wasteful and limiting. For example, the application may want to use the last sectors of flash to store its settings or logs. It is much more future-proof to use only as much memory for the application as is actually needed.

The task is basically to place two magic numbers (the application length and the checksum) somewhere in the raw binary. For simplicity both numbers will be 4 bytes wide and little-endian, so that they look like a regular uint32_t to the C "world".
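As a quick illustration of that encoding, here is a minimal Python sketch that packs a number into four little-endian bytes and reads it back. The helper names are made up for this example and 38684 is just a sample value.

```python
import struct

def pack_u32_le(value: int) -> bytes:
    """Encode an unsigned 32-bit integer as 4 little-endian bytes."""
    return struct.pack("<I", value)

def unpack_u32_le(raw: bytes, offset: int = 0) -> int:
    """Decode 4 little-endian bytes starting at the given offset."""
    return struct.unpack_from("<I", raw, offset)[0]

encoded = pack_u32_le(38684)   # 38684 == 0x0000971C
print(encoded.hex())           # 1c970000 (least significant byte first)
print(unpack_u32_le(encoded))  # 38684
```

The byte order is the same one a Cortex-M uses when it loads a uint32_t from flash, so the C side needs no conversion.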

Cortex-M binaries usually have a vector table at the beginning. This table is a very convenient place to store magic numbers (e.g. the length of the binary) because it has unused elements and they are at a fixed address.

Real-life example

This is an interrupt vector table example from an EFM32 which has a Cortex-M3 CPU. This array is placed at address zero in flash thanks to the vectors linker section.

const pFunc __Vectors[] __attribute__ ((section(".vectors"), used)) = {
  /* Cortex-M Exception Handlers */
  (pFunc)&__StackTop,    /*      Initial Stack Pointer */
  Reset_Handler,         /*      Reset Handler         */
  NMI_Handler,           /*      NMI Handler           */
  HardFault_Handler,     /*      Hard Fault Handler    */
  MemManage_Handler,     /*      MPU Fault Handler     */
  BusFault_Handler,      /*      Bus Fault Handler     */
  UsageFault_Handler,    /*      Usage Fault Handler   */
  Default_Handler,       /*      Reserved              */
  Default_Handler,       /*      Reserved              */
  Default_Handler,       /*      Reserved              */
  Default_Handler,       /*      Reserved              */
  SVC_Handler,           /*      SVCall Handler        */
  DebugMon_Handler,      /*      Debug Monitor Handler */
  Default_Handler,       /*      Reserved              */
  PendSV_Handler,        /*      PendSV Handler        */
  SysTick_Handler,       /*      SysTick Handler       */

  /* External interrupts */
  DMA_IRQHandler,       /*  0 - DMA                    */
  GPIO_EVEN_IRQHandler, /*  1 - GPIO_EVEN              */
  TIMER0_IRQHandler,    /*  2 - TIMER0                 */
  USART0_RX_IRQHandler, /*  3 - USART0_RX              */
  USART0_TX_IRQHandler, /*  4 - USART0_TX              */
  USB_IRQHandler,       /*  5 - USB                    */
  ACMP0_IRQHandler,     /*  6 - ACMP0                  */
  ADC0_IRQHandler,      /*  7 - ADC0                   */
  DAC0_IRQHandler,      /*  8 - DAC0                   */
  I2C0_IRQHandler,      /*  9 - I2C0                   */
  I2C1_IRQHandler,      /*  10 - I2C1                  */
  GPIO_ODD_IRQHandler,  /*  11 - GPIO_ODD              */
  TIMER1_IRQHandler,    /*  12 - TIMER1                */
  TIMER2_IRQHandler,    /*  13 - TIMER2                */
  TIMER3_IRQHandler,    /*  14 - TIMER3                */
  USART1_RX_IRQHandler, /*  15 - USART1_RX             */
  USART1_TX_IRQHandler, /*  16 - USART1_TX             */
  LESENSE_IRQHandler,   /*  17 - LESENSE               */
  USART2_RX_IRQHandler, /*  18 - USART2_RX             */
  USART2_TX_IRQHandler, /*  19 - USART2_TX             */
  UART0_RX_IRQHandler,  /*  20 - UART0_RX              */
  UART0_TX_IRQHandler,  /*  21 - UART0_TX              */
  UART1_RX_IRQHandler,  /*  22 - UART1_RX              */
  UART1_TX_IRQHandler,  /*  23 - UART1_TX              */
  LEUART0_IRQHandler,   /*  24 - LEUART0               */
  LEUART1_IRQHandler,   /*  25 - LEUART1               */
  LETIMER0_IRQHandler,  /*  26 - LETIMER0              */
  PCNT0_IRQHandler,     /*  27 - PCNT0                 */
  PCNT1_IRQHandler,     /*  28 - PCNT1                 */
  PCNT2_IRQHandler,     /*  29 - PCNT2                 */
  RTC_IRQHandler,       /*  30 - RTC                   */
  BURTC_IRQHandler,     /*  31 - BURTC                 */
  CMU_IRQHandler,       /*  32 - CMU                   */
  VCMP_IRQHandler,      /*  33 - VCMP                  */
  LCD_IRQHandler,       /*  34 - LCD                   */
  MSC_IRQHandler,       /*  35 - MSC                   */
  AES_IRQHandler,       /*  36 - AES                   */
  EBI_IRQHandler,       /*  37 - EBI                   */
  EMU_IRQHandler,       /*  38 - EMU                   */
};

You can see the interrupt handlers for peripherals (timers, UARTs, etc.), exception handlers for the CPU (hard fault, bus fault, usage fault etc.) but also some handlers reserved for architectural reasons. For example a Cortex-M0 will not have a bus fault handler and that slot will be reserved.

The layout of the vector table is hardwired so the reserved handlers will never be called on a particular chip. These interrupt lines simply do not exist which means that the locations in the vector table can be used to store any data. Every vector is a pointer to a function. On 32-bit ARM architecture that is simply a 4 byte unsigned integer (effectively a uint32_t).

The ELF file (built with arm-none-eabi-gcc) can be converted to a raw binary with arm-none-eabi-objcopy file.elf -O binary file.bin. The beginning of the file holds the vector table. A raw dump (using od -N 320 -vtx4 file.bin) looks like this (the offsets in the first column are octal):

0000000 20002000 00000eb9 00000eb5 000006fd
0000020 00000eb5 00000eb5 00000eb5 00000eb5
0000040 00000eb5 00000eb5 00000eb5 00000eb5
0000060 00000eb5 00000eb5 00000eb5 00006415
0000100 0000160d 0000432d 00000eb5 00000eb5
0000120 00003d3d 00000eb5 0000432d 00005385
0000140 00006ab5 00000eb5 00000eb5 00000eb5
0000160 00000eb5 00000eb5 00000eb5 00000eb5
0000200 00000eb5 000054f5 00005449 00002825
0000220 00000eb5 02922240 46921a9a 46c04770
0000240 2b004b17 4b13d100 f7ff469d 2100fff3
0000260 460f468b 4a144813 f0061a12 4b0eff4c
0000300 d0002b00 4b0d4798 d0002b00 20004798
0000320 00042100 480d000d d0022800 e000480c
0000340 f006bf00 0020ff0b f0060029 f006f84d
0000360 46c0feef 00080000 00000000 00000000
0000400 20002000 2000035c 20000c70 00000000
0000420 00000000 68c26903 d8014293 181b6880
0000440 1a983b01 b5f74770 68c46883 1b1f9301
0000460 00056843 1918000e 429718a4 003ad807

The vectors are addresses of functions in flash (low addresses starting from zero, for example 0000160d) or RAM (addresses starting with 0x2000xxxx, not present in this example). You can see the number 00000eb5 appearing many times. This is the vector of the Default_Handler. Bit 0 of a Cortex-M vector is set to mark Thumb code, so the handler itself starts at address 0xEB4 in flash. Other numbers (like 46921a9a) are CPU instructions ("the code").
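To make such a dump easier to read, the following sketch decodes the first words of a raw binary into vectors. It is only an illustration: the function name, the RAM boundary at 0x20000000 (taken from the addresses mentioned above) and the sample bytes (the first two rows of the dump) are assumptions of this example.

```python
import struct

RAM_BASE = 0x20000000  # assumption: SRAM starts here, as in the dump above

def decode_vectors(image: bytes, count: int):
    """Return (index, raw_word, target_address, region) for the first `count` words."""
    words = struct.unpack_from("<%dI" % count, image, 0)
    decoded = []
    for index, word in enumerate(words):
        if index == 0:
            # Word 0 is the initial stack pointer, not a function address
            decoded.append((index, word, word, "stack"))
            continue
        region = "RAM" if word >= RAM_BASE else "flash"
        # Bit 0 only marks Thumb state; the code starts at the even address
        decoded.append((index, word, word & ~1, region))
    return decoded

# First two rows of the dump above, re-encoded as little-endian bytes
sample = struct.pack("<8I", 0x20002000, 0x0EB9, 0x0EB5, 0x06FD,
                     0x0EB5, 0x0EB5, 0x0EB5, 0x0EB5)

for index, word, target, region in decode_vectors(sample, 8):
    print("%2d  0x%08X -> 0x%08X (%s)" % (index, word, target, region))
```

Running it on a real file only requires replacing sample with the bytes read from file.bin.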

Vocabulary

There are many confusingly similar words used when dealing with interrupts. Here is a short explanation (as I see it 😊):

  • interrupt - an event that happens unexpectedly (from CPU's point of view), example: the UART has received a data byte
  • interrupt request, IRQ - the signal from a peripheral to an interrupt controller (logic pulse or logic level)
  • interrupt line - a physical wire (inside or outside the chip) that carries the interrupt request
  • interrupt controller - a part of the CPU that has inputs for multiple interrupt lines, it receives interrupt requests over these lines and makes the CPU execute appropriate interrupt handlers (on Cortex-M it is called the NVIC)
  • interrupt number - a logical name/number for a particular interrupt line (example from the table above: TIMER0 uses interrupt number 2)
  • interrupt handler - a function that gets magically called thanks to the interrupt controller when an interrupt request arrives at the CPU, this function is supposed to service the event
  • interrupt vector - the address of an interrupt handler
  • interrupt table, vector table - an array that holds the vectors in a particular order that the interrupt controller reads to find out the location of a handler for a particular line

If that were not enough, the Cortex-M documentation also uses the term exception. Exceptions can be generated by the CPU itself (e.g. division by zero) or come from outside (and then they are called interrupts).

What is the risk of using reserved vectors?

As the reserved vectors are not used the risk of using them for storage of arbitrary data on a particular chip is exactly zero. The chip will not change. The number and location of interrupt vectors will not change. Everything is hardwired.

The risk I see is that porting the firmware to a different MCU can be more difficult. If the firmware started its life on a Cortex-M0 and is being ported to a Cortex-M3 or M7, then it will be an extra hassle to free up the vectors for bus faults, usage faults etc. if they were repurposed on the original chip. It is not the end of the world though. 🙂

Alternative approaches

A cleaner approach would be to reserve some space up front at a fixed address in the output binary. This can be done by adding an extra section (called, for example, metadata) to the linker script and placing it right after the vector table. The vector table has a fixed size and address, so the new section will also end up at a fixed address. In C code the section can be dummy-filled by a simple array with the attribute __attribute__((section("metadata"), used)). This dummy array is needed to prevent the whole section from being optimized out during linking.

Adding length and CRC with Python

This script reads the binary file, writes the size of the file in little-endian format into the word at OFFSET_BLOB_LENGTH and appends a CRC-32 at the end. The length does not include the CRC itself. The offset points at word 7 of the table counting from zero, that is the first reserved slot right after UsageFault_Handler.

#!/usr/bin/env python3
"""Store the image length in a reserved vector and append a CRC-32."""
import binascii
import os
import struct
import sys

# Use the reserved core interrupt vectors to store some metadata
OFFSET_BLOB_LENGTH = 7 * 4  # word index 7 (counting from zero) holds the blob length

input_file_path = sys.argv[1]
with open(input_file_path, "rb") as input_file:
    input_data = bytearray(input_file.read())

# The output file is created in the current working directory
output_file_name = os.path.splitext(os.path.basename(input_file_path))[0] + "_with_crc.bin"

# Overwrite the reserved vector with the length as a little-endian uint32
total_length = len(input_data)
struct.pack_into("<I", input_data, OFFSET_BLOB_LENGTH, total_length)

# The CRC covers the whole image, including the patched length word
crc = binascii.crc32(input_data)

with open(output_file_name, "wb") as output_file:
    output_file.write(input_data)
    output_file.write(crc.to_bytes(4, byteorder="little"))

print("Blob length is %d, CRC is 0x%08X" % (total_length, crc))

The output is something along these lines:

Blob length is 38684, CRC is 0xB25316AE

Dump of the new file looks like this:

0000000 20002000 00000eb9 00000eb5 000006fd
0000020 00000eb5 00000eb5 00000eb5 0000971c <-- difference here
0000040 00000eb5 00000eb5 00000eb5 00000eb5
0000060 00000eb5 00000eb5 00000eb5 00006415
0000100 0000160d 0000432d 00000eb5 00000eb5
0000120 00003d3d 00000eb5 0000432d 00005385
0000140 00006ab5 00000eb5 00000eb5 00000eb5
0000160 00000eb5 00000eb5 00000eb5 00000eb5
0000200 00000eb5 000054f5 00005449 00002825
0000220 00000eb5 02922240 46921a9a 46c04770
0000240 2b004b17 4b13d100 f7ff469d 2100fff3
0000260 460f468b 4a144813 f0061a12 4b0eff4c
0000300 d0002b00 4b0d4798 d0002b00 20004798
0000320 00042100 480d000d d0022800 e000480c
0000340 f006bf00 0020ff0b f0060029 f006f84d
0000360 46c0feef 00080000 00000000 00000000
0000400 20002000 2000035c 20000c70 00000000
0000420 00000000 68c26903 d8014293 181b6880
0000440 1a983b01 b5f74770 68c46883 1b1f9301
0000460 00056843 1918000e 429718a4 003ad807

You can see that the last word in the 2nd row is now 0000971c instead of 00000eb5. 0x971c is equal to 38684, which is the size of the file in bytes. The CRC-32 is appended at the very end of the file, beyond the part shown in this dump.

The bootloader can now use the stored length to calculate the CRC of the application and find the location of the stored CRC to compare against. If they match it means that there is a valid application in memory that can be started. 🙂
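The real check would run in C on the target, but the logic can be sketched on the host in Python, which is also handy for testing the generated file. This is only a sketch under assumptions: the function name and the sanity checks on the length are my additions, and the buffer stands for the flash area reserved for the application.

```python
import binascii
import struct

OFFSET_BLOB_LENGTH = 7 * 4  # same reserved vector as in the script above
CRC_SIZE = 4

def image_is_valid(memory: bytes) -> bool:
    """Mirror the bootloader check on a buffer holding the application area."""
    if len(memory) < OFFSET_BLOB_LENGTH + 4:
        return False
    length = struct.unpack_from("<I", memory, OFFSET_BLOB_LENGTH)[0]
    # Reject lengths that cannot be right before touching any data
    if length < OFFSET_BLOB_LENGTH + 4 or length + CRC_SIZE > len(memory):
        return False
    # The stored CRC sits right after the last byte of the application
    stored_crc = struct.unpack_from("<I", memory, length)[0]
    return binascii.crc32(memory[:length]) == stored_crc

# Build a small fake image the same way the script does
blob = bytearray(range(64))
struct.pack_into("<I", blob, OFFSET_BLOB_LENGTH, len(blob))
image = bytes(blob) + struct.pack("<I", binascii.crc32(blob))

# Erased flash (0xFF) after the image must not affect the result
print(image_is_valid(image + b"\xff" * 32))  # True

corrupted = bytearray(image)
corrupted[40] ^= 0x01                        # flip a single bit
print(image_is_valid(bytes(corrupted)))      # False
```

Checking the length against the size of the application area first matters in practice: an erased flash reads as 0xFFFFFFFF at the length slot, and the bootloader should not try to run a CRC over 4 GiB of address space.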

Watch out for binaries built with IAR, as their size does not have to be a multiple of 4 bytes.
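If the bootloader or the flash programming routine works in 32-bit words, one possible workaround is to pad the binary before the length is written and the CRC is computed. The helper below is a hypothetical sketch; the fill value 0xFF is an assumption (it matches the erased state of typical NOR flash).

```python
def pad_to_word(data: bytes, fill: int = 0xFF) -> bytearray:
    """Return a copy of data padded with `fill` to a multiple of 4 bytes."""
    padded = bytearray(data)
    padded.extend(bytes([fill]) * (-len(padded) % 4))
    return padded

print(len(pad_to_word(b"\x00" * 38682)))  # 38684
print(len(pad_to_word(b"\x00" * 38684)))  # 38684 (already aligned, unchanged)
```

The padding has to be applied first, so that both the stored length and the CRC describe the padded image.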