lockdep for pthreads
Linux has a great tool called lockdep for identifying locking dependency problems. Instead of waiting until an actual deadlock occurs (which can be extremely difficult to trigger when it depends on timing), lockdep keeps track of which locks are already held when any new lock is taken, and checks that the resulting dependency graph contains no cycles.
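To make the idea concrete, here is a minimal sketch of that kind of check in C++. It is not the kernel's lockdep and not the code that went into Ceph; the class name LockOrderChecker, the integer lock ids, and the will_lock/unlocked methods are all made up for illustration. It assumes every lock has a stable id and that the wrapper calls will_lock() just before acquiring and unlocked() just after releasing.

```cpp
#include <algorithm>
#include <map>
#include <mutex>
#include <set>
#include <vector>

// Hypothetical lock-order checker (illustrative names, not Ceph's API).
class LockOrderChecker {
public:
  using LockId = int;

  // Call just before taking lock `id`. Returns false if taking it while
  // holding the current locks would create a cycle in the order graph.
  // Re-taking a lock that is already held is also reported as a problem.
  bool will_lock(LockId id) {
    std::lock_guard<std::mutex> guard(graph_mutex_);
    std::vector<LockId>& held = held_by_this_thread();
    for (LockId h : held) {
      // We are about to add the edge h -> id. That closes a cycle
      // if h can already be reached by following edges from id.
      if (reachable(id, h)) return false;
    }
    for (LockId h : held) follows_[h].insert(id);
    held.push_back(id);
    return true;
  }

  // Call after releasing lock `id`.
  void unlocked(LockId id) {
    std::vector<LockId>& held = held_by_this_thread();
    auto it = std::find(held.begin(), held.end(), id);
    if (it != held.end()) held.erase(it);
  }

private:
  // Depth-first search over the recorded "taken after" edges.
  bool reachable(LockId from, LockId to) const {
    std::set<LockId> seen;
    std::vector<LockId> stack{from};
    while (!stack.empty()) {
      LockId cur = stack.back();
      stack.pop_back();
      if (cur == to) return true;
      if (!seen.insert(cur).second) continue;  // already visited
      auto it = follows_.find(cur);
      if (it == follows_.end()) continue;
      for (LockId next : it->second) stack.push_back(next);
    }
    return false;
  }

  // Locks held by the calling thread, in acquisition order.
  // Simplification: this list is shared by all checker instances.
  static std::vector<LockId>& held_by_this_thread() {
    thread_local std::vector<LockId> held;
    return held;
  }

  std::mutex graph_mutex_;                       // protects follows_
  std::map<LockId, std::set<LockId>> follows_;   // a -> locks taken while holding a
};
```

The point of recording the order rather than waiting for a hang is that a single thread is enough to expose the bug: if one code path takes lock 1 then lock 2, and some other path later takes lock 2 then lock 1, the second will_lock(1) returns false even though the two paths never actually raced.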
The other day I was sifting through gdb backtraces, decoding a deadlock bug in the OSD daemon, when it occurred to me that it would be nice to have a similar tool for user space applications using pthreads. A quick search didn’t turn up anything promising, so I put together a simple dependency checker and hooked it into Ceph’s existing Mutex and RWLock wrappers. It was surprisingly quick to put together, and it works! I was a little disappointed to find only two real dependency bugs. But the project also motivated me to disable recursive locking (since my lockdep doesn’t cope with that), and that turned up a half dozen other instances of laziness.