Pete
I'm having a problem with my Boost multithreaded application. My app runs fine as a Windows service, but as a Unix daemon it stops responding after some time. I noticed that the fewer threads I have available, the sooner the application stops responding: HP-UX (64) after 20 mins, RH8 (256) after >1 hr, Linux RH9 (1024) after >5 hrs. This led me to assume that perhaps the threads are not cleaned up properly.
That could well be right. Perhaps there is some kind of deadlock at thread exit?
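If the hang really does come from running out of threads, one thing worth checking is that every thread you start is eventually joined (or detached). Here is a minimal sketch of the joining pattern. It uses C++11 std::thread rather than your Boost code, and handle_request and run_batch are names I made up for illustration, so treat it as an assumption about the shape of your app, not a diagnosis:

```cpp
#include <atomic>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical worker: stands in for whatever the real service does per request.
void handle_request(std::atomic<int>& completed) {
    ++completed;
}

// Starts `count` workers and joins every one of them before returning.
// Returns the number of workers that ran to completion.
int run_batch(int count) {
    std::atomic<int> completed{0};
    std::vector<std::thread> workers;
    workers.reserve(count);
    for (int i = 0; i < count; ++i) {
        workers.emplace_back(handle_request, std::ref(completed));
    }
    for (std::thread& t : workers) {
        // A joinable POSIX thread that is never joined or detached keeps
        // holding its resources after it has finished running.
        t.join();
    }
    return completed.load();
}
```

Calling run_batch() repeatedly should not accumulate threads, because each batch is fully joined before the function returns.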
Therefore, I wrote a little test app which ought to demonstrate the problem. To my surprise, the test application showed an entirely different problem, but again, only under Unix (it works as expected under Windows). I'm totally lost as to what the reason is and would appreciate any help.
Expected:          Actual:
---------------    ---------------
Start running      Start running
Start swimming     Finish running
Start cycling      Start swimming
Finish swimming    Finish swimming
Finish cycling     Start cycling
Finish running     Finish cycling
Under Windows I get the expected output, but under Linux RH9 I get the actual output. Under RH9, the first five lines of the actual output appear immediately, but the last one takes approx. 4 secs. In other words, all threads but the last don't sleep for the number of secs specified.
According to the manual page: "sleep() makes the current process sleep until seconds seconds have elapsed or a signal arrives which is not ignored." (Note that each POSIX thread is/was implemented as a process in Linux.) If thread creation involves receiving a signal, then that would explain the above behaviour. I would expect it under RH8, because that has the old LinuxThreads implementation of pthreads, which uses signals heavily, but not under RH9, which has NPTL. Still, you would be better off using boost::thread::sleep() instead.
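To illustrate the idea of a sleep that cannot be cut short, here is a rough sketch that sleeps against a fixed deadline and goes back to sleep if it wakes up early. Note that it uses C++11 std::this_thread::sleep_until and std::chrono in place of boost::thread::sleep(), and that sleep_full and timed_sleep_ms are hypothetical helper names of mine, not part of any library:

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: blocks the calling thread until `duration` has elapsed
// on the steady clock, going back to sleep after any early wake-up.
void sleep_full(std::chrono::milliseconds duration) {
    const auto deadline = std::chrono::steady_clock::now() + duration;
    while (std::chrono::steady_clock::now() < deadline) {
        std::this_thread::sleep_until(deadline);
    }
}

// Returns how long sleep_full() actually blocked, in whole milliseconds.
long long timed_sleep_ms(std::chrono::milliseconds duration) {
    const auto start = std::chrono::steady_clock::now();
    sleep_full(duration);
    const auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}
```

Because the loop re-checks the deadline, a thread calling sleep_full() should never report its "Finish" line before the requested time has passed, whatever woke it up in between.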