Re: threads and signals question

4 Sep 2002

On 29 Aug 2002, at 20:12, "lparrab" wrote:
...
I'm starting a multi-threaded program using Boost.Thread library: 
it is a "server", in which a thread loops around a "select" to 
accept new connections and see when a socket has data to be read, in 
which case it puts the fds in a queue from which other threads take 
them out to read the data. 
The problem I have, is that when a "ReceiverThread" finishes reading 
the data from the socket, it puts the fd back in the "all 
connections list", but by this time the other ("main")thread already 
copied the fds from the list and got into the next "select" loop, 
which means the socket from which the "receiver" just finished 
reading, will not be "selected" (monitored) until the next select 
loop, which will not be until a) a new connection comes or b) one of 
the other connected sockets gets some data that makes the main 
thread return from select.

The cleanest way to wake the main thread from select is to make one 
of the file descriptors in its FDSET become readable.  What you need 
is a special file descriptor that is always in the FDSET that becomes 
readable when some other thread needs the main thread to come out of 
the select call.  For this you can use a pipe.

Here's a way that requires minimal changes to your existing design:

When a thread needs the main thread to wake up, it writes a byte into 
one end of the pipe.  It doesn't matter what the byte contains; its 
presence in the pipe is the "information" payload.  When the main 
thread comes out of the select, if its end of the pipe was among the 
readable fds, then it should pull one byte out of the pipe (and 
discard it).  If you use extreme care, you can instead flush the pipe 
of all bytes if you're sure you won't cause a race condition that 
way.  (Example of race condition: Main thread comes out of select, 
Main thread builds new FDSET for next select call, Receiver thread 
writes to pipe to inform it of newly available fd, Main thread 
flushes pipe and blocks in select.  To fix this race condition, you 
would reverse the order in which the Main thread flushes the pipe and 
builds the FDSET.)
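
A minimal sketch of the wakeup pipe described above, assuming POSIX
pipe(), select(), read() and write().  The helper names wake_main and
wait_for_activity are made up for illustration and are not from the
original program.  It pulls exactly one byte per wakeup, which is the
variant that avoids the flush race, and it leaves out error handling
such as retrying on EINTR.

```cpp
#include <cassert>
#include <sys/select.h>
#include <unistd.h>

// Hypothetical helper: called by a Receiver thread to wake the Main
// thread. The byte's value is irrelevant; its presence is the signal.
inline bool wake_main(int wake_write_fd) {
    const char byte = 0;
    return write(wake_write_fd, &byte, 1) == 1;
}

// Hypothetical helper: one blocking pass of the Main thread.
// `readable` already holds the connection fds to monitor and `max_fd`
// is the largest of them (-1 if there are none). The read end of the
// wakeup pipe is always added to the set.
// Returns true if the wakeup pipe was among the readable fds.
inline bool wait_for_activity(int wake_read_fd, fd_set& readable, int max_fd) {
    assert(wake_read_fd >= 0 && wake_read_fd < FD_SETSIZE);
    FD_SET(wake_read_fd, &readable);
    if (wake_read_fd > max_fd) max_fd = wake_read_fd;

    if (select(max_fd + 1, &readable, nullptr, nullptr, nullptr) < 0)
        return false;  // error, or interrupted by a signal
    if (!FD_ISSET(wake_read_fd, &readable))
        return false;  // woken by a connection fd only

    // Pull exactly one byte out of the pipe and discard it.
    char discard;
    return read(wake_read_fd, &discard, 1) == 1;
}
```

After wait_for_activity returns, the Main thread would still check
the remaining fds in `readable` with FD_ISSET as it does today.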

Here's the better design:

Consider: why do you need the Receiver threads to wake the Main 
thread?  To transfer responsibility for an fd from a Receiver thread 
to the Main thread.  So, this is similar to the way the Main thread 
transfers fds to Receiver threads through a queue, and should be 
implemented in a similar fashion.  Instead of using the pipe just to 
signal the Main thread to wake, actually use it to transfer the fds, 
just as you use the queue in the other direction.

So now, when Receiver threads are done with an fd, they write the 
actual fd into the pipe.  The Main thread wakes because the other end 
of the pipe was in its FDSET.  It should read any and all fds out of 
the pipe and add them to its "connection list" for its next pass 
through select.  (You can read just one fd from the pipe for each 
time that select indicates the pipe is readable, or you can set the 
pipe to be non-blocking and read until the pipe is empty.)  This 
approach is a better design because it implements the actual transfer 
of responsibility and because the connection list should be private 
to the Main thread.  This way doesn't require that the Receiver 
threads access the connection list, so the connection list doesn't 
have to be protected by a mutex.
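
A rough sketch of transferring the fds through the pipe, again
assuming POSIX calls.  The names make_nonblocking, return_fd and
collect_returned_fds are hypothetical.  It relies on the POSIX
guarantee that a write of at most PIPE_BUF bytes to a pipe is atomic,
so an int written by one Receiver thread is never split or mixed with
another thread's write.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: put the Main thread's end of the pipe into
// non-blocking mode so it can be read until empty.
inline bool make_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Receiver side: hand a finished fd back to the Main thread.
// sizeof(int) is far below PIPE_BUF, so the write is atomic.
inline bool return_fd(int pipe_write_fd, int fd) {
    assert(fd >= 0);
    return write(pipe_write_fd, &fd, sizeof fd)
           == static_cast<ssize_t>(sizeof fd);
}

// Main side: call when select reports the pipe readable. Reads every
// fd currently in the pipe; the loop ends when the non-blocking read
// finds the pipe empty (or the write end closed).
inline std::vector<int> collect_returned_fds(int pipe_read_fd) {
    std::vector<int> fds;
    int fd;
    while (read(pipe_read_fd, &fd, sizeof fd)
           == static_cast<ssize_t>(sizeof fd)) {
        fds.push_back(fd);
    }
    return fds;
}
```

The Main thread would append the returned fds to its private
connection list before building the FDSET for the next select call.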
...
I think having a thread waiting on each open connection is not the
right approach here, so I thought of going a kind of "production line"
approach in which a few threads (5-10) are reading the data from a
bunch of open connections (50-100), the only thing they do is read the
data, parse it and form a "Request" which they put in a queue for
further processing and then go on to service the other connections.
Then another group of threads is taking these requests and doing
something, and putting Response objects in yet another queue, which
another group of threads "Sender" is taking and sending through the
socket.

Just be aware that this design may mean that the client sees 
responses arrive out of order from their requests.  If the client 
only submits a request after any previous requests have been 
satisfied, you're OK.  If the client can have multiple requests 
outstanding at once, the Request threads or the Sender threads may 
finish out-of-order.  Also, if the Sender threads don't write 
atomically or with exclusive access, then the data from multiple 
responses might be interleaved.
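
One way to give the Sender threads exclusive access, sketched under
assumptions: the Connection struct and send_response are hypothetical,
and std::mutex stands in for whatever lock you already use (with
Boost.Thread that would be boost::mutex and its scoped_lock).  This
only prevents interleaving of bytes on one socket; it does nothing
about responses being sent out of order.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unistd.h>

// Hypothetical per-connection state: the socket plus a mutex that
// serializes all writers to that socket.
struct Connection {
    int fd;
    std::mutex send_mutex;
};

// Write the whole response while holding the connection's mutex, so
// no other Sender thread can interleave bytes on the same socket.
inline bool send_response(Connection& conn, const std::string& response) {
    assert(conn.fd >= 0);
    std::lock_guard<std::mutex> lock(conn.send_mutex);
    std::size_t sent = 0;
    while (sent < response.size()) {
        // write may accept fewer bytes than requested, so loop.
        ssize_t n = write(conn.fd, response.data() + sent,
                          response.size() - sent);
        if (n <= 0) return false;  // error or connection closed
        sent += static_cast<std::size_t>(n);
    }
    return true;
}
```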

This illustrates why it is cleaner to have a thread take full 
ownership of a connection for its lifetime as others have suggested.  
Of course, you're not being unreasonable in your approach, but you 
can marry the two somewhat.  You might have each worker thread 
combine the Receiver, Request, and Sender jobs in sequence and then 
relinquish the connection back to the main thread.  It's not clear to 
me that there is any advantage in terms of number of threads, thread 
workload, or throughput to splitting those jobs up.

Hope that helps,
Ken

Ken Thomases
