Fast integer scaling on Cortex-M0

Integer scaling is very common in embedded systems, for example to turn ADC readings into human-readable values (correcting for gain or voltage dividers). Scaling without floating-point numbers needs two operations: multiplication and division. Cortex-M0 is a nice little beast with a single-cycle integer multiplier. ARM also offers a smaller-area 32-cycle multiplier, but I have not yet come across a chip that has it; all chip makers seem to choose the faster option. Cortex-M0, however, totally lacks integer division, so division is handled by library functions. Simple code like y = x * 123 / 120; gets compiled to a function call. Software division of course takes time (and energy). There is a nice hack though 🙂
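
The excerpt does not say which hack the post uses. One common approach, shown here only as a minimal sketch, is to replace the division by a constant with a multiplication and a right shift. The helper name scale_fast, the 16-bit shift and the 12-bit input range are assumptions made for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper (not taken from the post): scales a 12-bit ADC
 * reading by 123/120 with one multiply and one shift, no division.
 * SCALE_K = round(123 / 120 * 2^16) = 67174. */
#define SCALE_K     67174u
#define SCALE_SHIFT 16u

static inline uint32_t scale_fast(uint32_t x)
{
    /* Assumes x <= 4095, so x * SCALE_K still fits in 32 bits. */
    return (x * SCALE_K) >> SCALE_SHIFT;
}
```

Because SCALE_K is rounded down, the result can be one count lower than the exact x * 123 / 120 for some inputs. A larger shift reduces the error but also narrows the input range that fits in a 32-bit product.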

Reducing FFT code size of CMSIS DSP

CMSIS DSP is a fantastic library developed by ARM that provides various math primitives (like matrices, filters and FFT). It is usually my first pick when implementing signal processing on microcontrollers because it is highly optimized for ARM Cortex-M cores and is free. I was therefore quite surprised when my project suddenly stopped fitting into a 128 KB flash MCU after adding a simple FFT call.

Debouncing buttons on EFM32 Happy Gecko

Button bounce is always a problem for microcontrollers. There are many ways to deal with the issue. The pins can be sampled at a low frequency, so that the bounce will settle between consecutive samplings. They can be low-pass filtered in software. Some approaches require the pin to be stable for some amount of time to register a press.

Interrupts are usually avoided because the MCU could register almost every edge of the button bounce. With a small piece of code, however, an interrupt can be used in one-shot fashion and “rearmed” later from a timer interrupt (after the bounce period). This is the approach the driver uses. A major benefit is that the system reacts to the press immediately.
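
The one-shot idea can be sketched in portable C as follows. This is not the driver from the post: the button_t structure, the function names and the DEBOUNCE_TICKS value are assumptions, and the irq_enabled flag merely stands in for the real GPIO interrupt enable bit.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEBOUNCE_TICKS 20u  /* assumed bounce period, in timer ticks */

typedef struct {
    bool     irq_enabled;   /* stand-in for the GPIO interrupt enable bit */
    uint32_t rearm_ticks;   /* ticks left until the interrupt is rearmed */
    uint32_t press_count;   /* presses registered so far */
} button_t;

static void button_init(button_t *b)
{
    b->irq_enabled = true;
    b->rearm_ticks = 0;
    b->press_count = 0;
}

/* Call from the GPIO edge interrupt handler. */
static void button_on_edge(button_t *b)
{
    if (!b->irq_enabled) {
        return;                 /* a masked interrupt would not fire on hardware */
    }
    b->press_count++;           /* react to the press immediately */
    b->irq_enabled = false;     /* one-shot: ignore the remaining bounce edges */
    b->rearm_ticks = DEBOUNCE_TICKS;
}

/* Call from a periodic timer interrupt handler. */
static void button_on_tick(button_t *b)
{
    if (b->rearm_ticks > 0 && --b->rearm_ticks == 0) {
        b->irq_enabled = true;  /* bounce period is over, rearm */
    }
}
```

On real hardware the flag updates would be replaced by writes to the interrupt enable register of the pin, and pending flags would typically need clearing before rearming.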

STM32L4 I2C driver for FreeRTOS without HAL

I2C remains a popular communication interface between MCUs and all kinds of auxiliary chips like ADCs, digipots and GPIO expanders. I had to make a simple and universal driver for an upcoming STM32L432 project to control Microchip digipots. The STM32 I2C peripheral is simple enough to use without the burden of HAL libraries. Additionally, I needed a custom driver because my application uses FreeRTOS.

STM32L4 UART driver for FreeRTOS without HAL

This is a driver for the STM32L432 LPUART. It should also work with the “full” UART. The LPUART is a simple peripheral (compared to the clock tree or ADC). In this case it is easier to master a couple of registers than to use the full-size HAL drivers. Those are very generic in order to cover every possible flavor of a peripheral across the whole STM32 line, which makes them big in terms of code size and actually harder to follow than the register layout.

The driver can be safely used within FreeRTOS. It can even be used by multiple tasks, although that probably makes little sense unless different devices can be connected to the same UART at runtime or the application has separate operating modes implemented in different tasks.

Generating signals with STM32L4 timer, DMA and DAC

Generating arbitrary signals using an MCU is extremely useful. It can be used, for example, to play back audio or to build a modulator for a modem. The key MCU peripheral is of course a DAC, but other peripherals are also needed to play back the samples efficiently without loading the CPU.

This post shows how to implement a signal generator on an STM32L432 without using HAL libraries.
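
The excerpt does not include the post's code. As a rough illustration of the data side only, the sketch below fills a sample table that a timer-paced DMA channel could stream to a 12-bit DAC. The function name fill_triangle_table and the choice of a triangle waveform are assumptions, and the peripheral setup itself is not shown.

```c
#include <stddef.h>
#include <stdint.h>

#define DAC_MAX 4095u   /* full scale of a 12-bit DAC */

/* Hypothetical helper: fills buf with one period of a triangle wave.
 * Assumes n is even and at least 2. */
static void fill_triangle_table(uint16_t *buf, size_t n)
{
    size_t half = n / 2;
    for (size_t i = 0; i < n; i++) {
        /* Ramp up over the first half, back down over the second. */
        size_t ramp = (i < half) ? i : (n - i);
        buf[i] = (uint16_t)((ramp * DAC_MAX) / half);
    }
}
```

Any other waveform can be stored the same way. The output frequency is then the timer update rate divided by the table length.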


Cortex-M – Debugging runtime memory corruption

Runtime memory corruption is one of the worst classes of bugs a C/C++ application can have. I do not mean design problems like abuse of global variables, but seemingly correct code clobbering memory it should never touch (for example due to runaway pointers). Compared to “regular” crashes, which are obvious and much simpler to fix (even if they are rare, they leave a stack trace), memory corruption is often silent. It can go unnoticed for a long period and manifest itself in subtle ways, for example the application sometimes acts weirdly or a particular variable is sometimes wrong. Fortunately, Cortex-M3 and M4 cores are equipped with special hardware that can assist in catching rogue memory accesses.

Preserving debugging breadcrumbs across reboots in Cortex-M

Debugging embedded systems during development can be hard, even with the best tools. Certainly a good debug probe makes life easier, but what do you do after the product is shipped? What if the customer complains that something strange happens from time to time, or a bug makes the device reboot, but only once a week? You make the firmware gather diagnostic information for you. This is the first post in a series.
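
The excerpt does not describe how the post stores its breadcrumbs. One possible scheme, shown here purely as an assumption, keeps a small record in a RAM region that the startup code does not clear, and validates it with a marker and a check word after reset. The crumb_t type, the function names and the CRUMB_MAGIC value are all made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define CRUMB_MAGIC 0xC0FFEE42u  /* arbitrary marker value (assumption) */

typedef struct {
    uint32_t magic;   /* CRUMB_MAGIC when the record is valid */
    uint32_t reason;  /* application-defined diagnostic code */
    uint32_t check;   /* magic ^ reason, guards against random RAM content */
} crumb_t;

/* Call before a deliberate reset or from a fault handler. */
static void crumb_store(crumb_t *c, uint32_t reason)
{
    c->magic  = CRUMB_MAGIC;
    c->reason = reason;
    c->check  = CRUMB_MAGIC ^ reason;
}

/* Call early after boot. Returns true and fills *reason if a valid
 * record was left behind, then invalidates the record. */
static bool crumb_load(crumb_t *c, uint32_t *reason)
{
    bool valid = (c->magic == CRUMB_MAGIC) &&
                 (c->check == (c->magic ^ c->reason));
    if (valid) {
        *reason = c->reason;
    }
    c->magic = 0;  /* consume the record so it is reported only once */
    return valid;
}
```

On a real target the crumb_t instance would have to live in a section that the linker script and startup code leave untouched, and the scheme only works for resets that preserve SRAM content.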

Reducing firmware size by removing libc

The C standard library (libc) is a component that gets little attention. It is just there. For embedded systems, however, it brings some challenges and overhead in terms of code size. As firmware size is often critical, it sometimes makes sense to use a trimmed version of the standard library or to remove it entirely. I will focus on reducing code size, which may be beneficial for a small application like a bootloader.

Fixing Cortex-M startup code for link-time optimization

Link-time optimization is a powerful feature for reducing output size. Even though it is (as of 2018) still regarded as somewhat experimental, LTO is worth trying if the binary size is very important and the application can be reliably tested afterwards, as link-time optimized code is hard to debug. A bootloader is an ideal example. LTO is very easy to enable, but there are some small quirks that have to be taken care of. I will use GCC 7.2.1 from GNU Arm Embedded as an example.