Newsflash: debugging parallel programs ain’t easy
I ran into a situation recently where I was asked to debug a legacy C# program that was crashing due to multiple threads trying to write to the same file at the same time. I was asked because I was the last guy to modify it, so I guess I had no room to complain. I focused on the changes that I had made, trying to figure out how the heck I could have introduced the bug – my changes weren’t anywhere close to the source of the crash!
Then it hit me – my changes were a bunch of refactoring to make the code faster. The bug had always been there; we were just more likely to hit it after my changes, since each thread now finished its work sooner and the timing of the file writes shifted. I probably should have guessed right away – there was a shared resource that was not being handled properly – but I was blinded by my assumption that my changes had to have introduced the bug. I guess that’s one moral of the story. (And now that I think about it, in the past *I* have been the guy who introduced a parallelism-related bug that someone else had to fix later.)
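For the curious, here is roughly what "handling the shared resource properly" can look like in C#. This is a minimal sketch, not the actual fix from the legacy program: the class name `SharedLogDemo`, the `AppendLine` and `Run` helpers, and the temp file name are all made up for illustration, and I'm assuming every writer lives in the same process so a plain `lock` is enough.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class SharedLogDemo
{
    // Hypothetical single lock object that guards every write to the shared file.
    private static readonly object FileLock = new object();

    // Appends one line to the file. Only one thread at a time can be inside
    // the lock, so the file is never opened by two writers at once.
    public static void AppendLine(string path, string line)
    {
        lock (FileLock)
        {
            File.AppendAllText(path, line + Environment.NewLine);
        }
    }

    // Starts from an empty file, writes one line per worker in parallel,
    // and returns how many lines ended up in the file.
    public static int Run(string path, int workers)
    {
        File.Delete(path); // does nothing if the file is not there
        Parallel.For(0, workers, i => AppendLine(path, "worker " + i));
        return File.ReadAllLines(path).Length;
    }

    public static void Main()
    {
        // Illustrative file name in the temp directory.
        string path = Path.Combine(Path.GetTempPath(), "shared-log-demo.txt");
        Console.WriteLine(Run(path, 100) + " lines written");
    }
}
```

Take the `lock` out and the parallel loop can have two threads trying to open the same file at once, which typically surfaces as an `IOException` on some runs and not on others. A `lock` only protects you within one process; if several processes write to the same file you would need something else, such as a named `Mutex`.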
Another lesson is perhaps that it’s unwise to screw around with multicore parallelism unless you know what you are doing. Say you’ve got 4 cores, and let’s say that you get a 3x speedup out of them (which is often pretty generous). Many times I would rather be 3x slower but completely reliable, with no random crashes. Microsoft’s Task Parallel Library is kind of cool, but kind of dangerous. I’m not sure how often it’s really helpful.