Making call graphs with GCC, egypt and cflow


Call graphs are a visual way of showing the relations between functions in a piece of code. They are useful for analyzing dependencies and for getting a basic understanding of a large, unfamiliar codebase. Tracing program flow can also help in finding bugs.

I will use my antenna switch and rotator controller as an example. Its firmware is written in C and uses AVR GCC.

egypt

One tool I’ve found is egypt. It is a Perl script that turns GCC’s RTL dumps into a graph description, which Graphviz then renders. First, all the code has to be compiled with the GCC option -fdump-rtl-expand. This generates lots of .c.170r.expand files (the number in the suffix varies between GCC versions). All those files have to be passed to egypt:

egypt *.expand | dot -Gsize=3000,3000 -Grankdir=LR -Tpng -o /tmp/callgraph.png

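This command tells Graphviz to lay the graph out left to right (-Grankdir=LR) and to render it as a PNG image. The -Gsize option sets a maximum drawing size in inches, not pixels, so 3000,3000 effectively means that the image is never scaled down. In my case the result looks like this: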

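Each node (vertex) is a function. Each solid line (edge) is a direct function call. Dashed lines indicate indirect references: places where a function takes the address of another function, usually in order to call it later through a function pointer.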
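
To make the two kinds of edges concrete, here is a minimal sketch with one direct call and one call through a function pointer. All names are hypothetical and are not taken from my firmware. As far as I understand egypt, the solid edge would go from call_direct to double_it, and the dashed edge from call_indirect to negate_it, because that is where the address is taken.

```c
/* Hypothetical example, not part of the real firmware. */
typedef int (*handler_t)(int);

static int double_it(int x) { return 2 * x; }  /* reached by a direct call */
static int negate_it(int x) { return -x; }     /* reached only via a pointer */

/* Direct call: the callee's name is visible at the call site. */
int call_direct(int x)
{
    return double_it(x);
}

/* Indirect call: the address of negate_it is taken here and the call
   itself goes through the pointer h. */
int call_indirect(int x)
{
    handler_t h = negate_it;
    return h(x);
}
```

With optimization enabled the compiler may inline such small functions or turn the pointer call into a direct one, so the graph of an optimized build can look different from the source.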

Besides visualizing the caller-callee relations, the graph shows some less obvious information. Most functions can be traced back to main (or, on ARM, to the startup function). Some functions, however, are “dangling”: they are not connected to main in any way. What are they doing and how are they called?

There are two main reasons for that. One is that a particular function is not used at all (e.g. adc_init and adc_task in the top-left), so it has no callers (I only know that because I have written the code and I remember what is not used 🙂 ). GCC generates expand files for all .c files, and egypt does not see the linker’s removal of unused code.
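
A minimal sketch of the first case, with hypothetical names loosely modeled on an ADC driver. The compiler dumps every function in the file, whether or not anything calls it, so the unused one still becomes a node in the graph.

```c
/* Hypothetical example, not the real ADC driver. */
static int adc_ready = 0;

/* Nothing in the project calls this function. It is still compiled and
   dumped, so it would appear as a node without any incoming edges. */
void adc_init_sketch(void)
{
    adc_ready = 1;
}

/* Assumed to be called from application code, so it is connected to
   the rest of the graph. */
int adc_is_ready(void)
{
    return adc_ready;
}
```

Whether adc_init_sketch ends up in the final binary depends on the build. One common setup, which is an assumption about the project and not something the graph can show, is compiling with -ffunction-sections and linking with -Wl,--gc-sections.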

The other case in embedded code is interrupt service routines. From the main code’s point of view they are functions that are literally called out of the blue, so they do not have obvious callers. My examples: __vector_14 in the top-left or __vector_20 in the bottom-right.

In my opinion the most interesting things during analysis are the ones that are not shown in the call graph. My example: __vector_14 and button_task_from_ISR in the top-left. This task reads the physical buttons, but does not call anything afterwards. So how does it “communicate” with the rest of the code? Using global variables. Call graphs show no variables and no data at all. Using global variables for communication is almost always a bad idea, but for transferring information from an interrupt to the main context it is the only option.
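
Here is a minimal sketch of that pattern. It is not the actual firmware code: the names are hypothetical, and the interrupt handler is written as a plain function so that the example compiles on a PC. On AVR it would be declared with the ISR() macro from avr/interrupt.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared state between interrupt and main context.
   volatile makes the compiler re-read the variables on every access. */
static volatile bool    button_event_pending = false;
static volatile uint8_t button_event_id      = 0;

/* Stand-in for the interrupt handler: it stores the result and returns.
   It does not call any function in the main code. */
void button_isr_sketch(uint8_t button_id)
{
    button_event_id      = button_id;
    button_event_pending = true;   /* set the flag after the data */
}

/* Polled from the main loop. Returns the button number, or -1 if no
   event is pending. No call connects it to button_isr_sketch(). */
int button_poll_sketch(void)
{
    if (!button_event_pending) {
        return -1;
    }
    int id = button_event_id;
    button_event_pending = false;
    return id;
}
```

In a call graph these two functions would be completely unrelated nodes, because the only link between them is the pair of global variables. A real implementation would also have to think about atomic access, for example by briefly disabling interrupts while the main code reads and clears the shared state.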

Conclusion: global variables are hard to debug 🙂


cflow

The second graph generator I use is cflow. It is text-based and trivial to use. Just execute:

cflow *.c

and it will output a text representation of the call graph:

This output can be easier to read and search through than a big image. Another useful feature of cflow is the reverse call graph. It is generated by adding the -r option and shows all callers of each function. It can be useful when refactoring or removing unused code.