How do you unit test synchronization logic?
Local example: Each instance of my class may be in use by multiple threads. I protect data members using synchronization primitives. How do I test that I've done this correctly?
One notion is to drive an instance from two threads, with synchronization primitives stubbed out. The stubs interact to force various execution-order patterns in the two threads.
It would be even better to mess with execution order at the level of individual instructions. Perhaps a scriptable thread-aware debugger would work? On Linux, gdb isn't thread-aware, but can attach to a thread after-the-fact. Perhaps start the threads, start multiple gdb instances, and drive the whole mess from a script.
Remote example: I have implemented a distributed transaction mechanism. Locks are acquired before transactions are started. How do I test that I've acquired the correct locks?
This opens one more level of control over the local example. Rather than stepping at the instruction level, I can step at the message level, re-ordering messages according to the message-order rules of whatever protocol is in use.
The only way I know is to stop one thread dead in its tracks. That leaves any locks hung. The best approach is a nasty semi-infinite loop since sleeping threads give up their locks (at least in Java). Then bash the methods that are synchronized. If they succeed, then there's a problem. This also works for database locks. A thread spawner can drive each method. Any thread that comes back reports a problem. --RichardHenderson.
One general possibility is to do fine-grained tracing/logging of everything that happens, and then analyzing the logs afterwards. You could write a program to check the traces.
Another approach, which is not often possible, is to use non-preemptive, user-space threads when unit testing. This gives you absolute control over the scheduling of threads. Your tests can switch between threads at will and test the state of threads without worrying about race conditions. You can set up tests that switch threads to test each sequence of significant actions. I have used this approach to test synchronization very successfully in the Ruby language. Another advantage is that the scheduler can detect deadlock; for example, Ruby throws an exception when deadlock occurs that can be caught in a unit test.
The other approach I have used is to model the valid action sequences that threads can exhibit. You can test each thread for those sequences using mock objects.
Modelling with a process calculus can help you quickly find synchronization errors before coding, and you can easily translate the process calculus into threaded code to test as above.
I don't know if it is really possible. Most of the interesting problems happen at extreme load conditions and resource exhaustion which are very difficult to reliably produce. It would be nice to have an analytic solution. For example, many of our most obscure bugs are where people assume the next line will get executed. Even if that line is an i++ you can be preempted before it executes. It does get executed 99.999 percent of the time, but sometimes it doesn't. I'd have excellent asserts and load your program as much as possible in all ways, run it this way for days, and see if something bad happens.
It would be nice to have an analytic solution.
I think that there is an analytic solution. You need a static code analysis tool (any lint vendors listening ?) which supports the following technique:
See also ExtremeProgrammingChallengeFourteen