CMSIS DSP is a fantastic library developed by ARM that provides various math primitives (such as matrices, filters and FFTs). It is usually my first pick when implementing signal processing on microcontrollers because it is highly optimized for ARM Cortex-M cores and is free. I was therefore quite surprised when my project suddenly stopped fitting into a 128 KB flash MCU after adding a simple FFT call.
I dug into the code of arm_rfft_init_q15.c and found that the library uses lookup tables to speed up the FFT. The problem is that no matter which FFT bin sizes you actually use, the tables for all possible bin sizes get compiled in: the init function picks its table at run time based on the requested size, so every table is referenced and the linker has to keep them all. This could be solved by enabling link-time optimization in the toolchain, so that the requested bin size is constant-propagated into the init function and the unused tables are dropped, but LTO comes with its own set of risks, so I decided to force the bin size in the init function.
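The idea can be sketched roughly as follows. This is a self-contained illustration, not the actual patch to arm_rfft_init_q15.c: `DEMO_FFT_LEN`, `demo_fft_instance`, `demo_fft_init` and the `demo_twiddle_*` tables are made-up stand-ins for the library's real names, and the table contents are left empty. The point is only that the `switch` selects on a compile-time constant instead of the runtime argument, so with optimization enabled the compiler can drop the other cases and the tables they reference. An assert catches callers that ask for a size that was not built in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build-time setting: the only FFT size this firmware uses. */
#ifndef DEMO_FFT_LEN
#define DEMO_FFT_LEN 256u
#endif

/* Stand-ins for the library's per-size lookup tables (contents omitted). */
static const int16_t demo_twiddle_128[128] = {0};
static const int16_t demo_twiddle_256[256] = {0};
static const int16_t demo_twiddle_512[512] = {0};

typedef struct {
    uint16_t       fft_len;  /* FFT size this instance was set up for */
    const int16_t *twiddle;  /* lookup table matching fft_len */
} demo_fft_instance;

/* Returns 0 on success, -1 if the built-in size is not supported. */
int demo_fft_init(demo_fft_instance *s, uint32_t fft_len)
{
    /* Fail loudly if the caller wants a size other than the one built in. */
    assert(fft_len == DEMO_FFT_LEN);
    (void)fft_len; /* avoids an unused-parameter warning when NDEBUG is set */

    s->fft_len = (uint16_t)DEMO_FFT_LEN;

    /* Selecting on the constant, not on the argument, leaves only one
     * reachable case; the other cases and their tables become dead code. */
    switch (DEMO_FFT_LEN) {
    case 128u: s->twiddle = demo_twiddle_128; break;
    case 256u: s->twiddle = demo_twiddle_256; break;
    case 512u: s->twiddle = demo_twiddle_512; break;
    default:   s->twiddle = NULL; return -1;
    }
    return 0;
}
```

Whether the unused tables really disappear from the image depends on the toolchain and optimization level, so it is worth confirming in the linker map file.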
This tiny fix hardcodes the FFT bin size at compile time, which lets the compiler optimize out all the other lookup tables. The hack could be supplemented by an assertion that gives an obvious error when the requested bin size differs from the one that was built in. I am using CMSIS DSP V.1.5.1 in this particular project. The issue has since been fixed by ARM. However, updating complete libraries is often not easy in embedded projects (due to testing, compliance, certification and other reasons), so simple hacks like this can save the day. :)