Listing stuck Python threads
More than once I have been in the situation where a multithreaded Python application (eventually 😁)
runs fine and all the threads cooperate nicely, but the application does not shut down cleanly.
This happens, for example, after closing GUI windows: the GUI goes away but some worker threads do not.
Ctrl+C might help to shut down the stuck application, but only if the main thread is the one that is stuck.
Still, every application should shut down cleanly so that it can be correctly monitored
and restarted by systemd, a wrapper script, etc.
It is not that easy to analyze a mostly shut down application.
For example, I can put a logging statement at the end of every thread function.
If the threads are mostly static and there are not too many of them, I can try to guess which
end message is missing from the logs. This approach does not scale well.
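To make the logging approach concrete, here is a minimal sketch. The worker function, the stop event and the logger name are made up for illustration; the only point is the end message in the finally block.

```python
import logging
import threading

# Include the thread name so a missing end message can be matched to a thread.
logging.basicConfig(level=logging.INFO, format="%(threadName)s: %(message)s")
log = logging.getLogger("shutdown")  # hypothetical logger name


def worker(stop_event):
    """Hypothetical thread function that runs until stop_event is set."""
    try:
        while not stop_event.wait(timeout=0.1):
            pass  # real work would go here
    finally:
        # The "end message": if it never shows up in the logs, this thread is stuck.
        log.info("worker finished")
```

Every thread function needs its own end message, and I still have to compare the log against the list of threads I expect, which is why this gets tedious quickly.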
What is always available in any running process? Signals! What if I simply list the threads from a signal handler? Will it work?
Unix signals are somewhat special, and many operations that are thread-safe are not signal-safe. This also applies to Python code. However, merely reading the list of active threads should do no harm (and the application is half shut down anyway). This is the lifehack:
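A minimal sketch of such a handler could look like the following. It is a reconstruction of the idea under my own assumptions, not a definitive implementation: the helper names describe_threads, list_threads and install_thread_lister are made up, and besides the thread name it also prints the function each thread is currently executing, taken from sys._current_frames().

```python
import signal
import sys
import threading


def describe_threads():
    """Return one line per live thread: its name and where it currently is."""
    frames = sys._current_frames()  # maps thread ident -> topmost stack frame
    lines = []
    for thread in threading.enumerate():
        frame = frames.get(thread.ident)
        if frame is None:
            where = "unknown"
        else:
            code = frame.f_code
            where = f"{code.co_name}() at {code.co_filename}:{frame.f_lineno}"
        lines.append(f"{thread.name} (daemon={thread.daemon}): {where}")
    return lines


def list_threads(signum, frame):
    # Python runs signal handlers in the main thread, so this only reads state.
    for line in describe_threads():
        print(line, file=sys.stderr)


def install_thread_lister():
    # SIGUSR1 does not exist on Windows, and signal.signal() may only be
    # called from the main thread, hence both checks.
    if hasattr(signal, "SIGUSR1") and threading.current_thread() is threading.main_thread():
        signal.signal(signal.SIGUSR1, list_threads)


install_thread_lister()
```

Giving threads meaningful names (the name argument of threading.Thread) makes the output easier to read. Since Python 3.10 the default thread name also includes the name of the target function.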
After telling my application to close and getting it stuck, I simply issue
kill -SIGUSR1 <pid> from another terminal
and I can see the names of all thread functions still running. This hack can easily be added at the last minute and
needs neither special instrumentation nor a debugger. 🙂