Hello,

Following a previous thread about how to parallelize a large number of calculations, I took away this summary.

I am thinking of choosing a simple model whereby:

- There are M computers (possibly heterogeneous).
- I will stick to 1 process per computer.
- Each process will have N threads, giving a total of M*N "execution units".
- M and N are determined at runtime but stay fixed for the duration of a run.

So there is this pool of M*N execution units, and I give them 100 000 tasks to do. Tasks scheduled in the same process/computer share their memory. Each of the computers gets a duplicate of the memory used as inputs to the tasks.

For the thread pool, TBB and boost::asio were suggested. I believe there is also threadpool.sourceforge.net.

For the cross-computer communication, there is boost::mpi. Is this something boost::mpi + MPI can help with?

Regards,
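
P.S. To make the model concrete, here is a minimal sketch of what I have in mind, using only the standard library. It is an illustration under assumptions, not a proposal for the final design: the names task_range and run_tasks are made up for this post, the split of tasks across the M processes is a plain static block partition, and rank / world_size are passed in as ordinary integers where an MPI program would presumably obtain them from the communicator (with boost::mpi, communicator::rank() and communicator::size()). Inside one process, the N threads simply claim the next task index from a shared atomic counter.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Half-open range [first, last) of task indices owned by one process.
// Assumption: `rank` and `world_size` are plain integers here; in an MPI
// program they would come from the communicator.
std::pair<std::size_t, std::size_t>
task_range(std::size_t total_tasks, std::size_t rank, std::size_t world_size) {
    assert(world_size > 0 && rank < world_size);
    const std::size_t base = total_tasks / world_size;
    const std::size_t extra = total_tasks % world_size;  // first `extra` ranks get one more task
    const std::size_t first = rank * base + std::min(rank, extra);
    const std::size_t last = first + base + (rank < extra ? 1 : 0);
    return {first, last};
}

// Run task(i) once for every i in [first, last) using n_threads threads.
// Each thread claims the next unclaimed index from a shared atomic counter,
// so a thread that finishes early just picks up more tasks.
void run_tasks(std::size_t first, std::size_t last, unsigned n_threads,
               const std::function<void(std::size_t)>& task) {
    assert(n_threads > 0);
    std::atomic<std::size_t> next{first};
    std::vector<std::thread> workers;
    workers.reserve(n_threads);
    for (unsigned t = 0; t < n_threads; ++t) {
        workers.emplace_back([&] {
            for (;;) {
                const std::size_t i = next.fetch_add(1, std::memory_order_relaxed);
                if (i >= last) break;  // nothing left for this process
                task(i);
            }
        });
    }
    for (auto& w : workers) w.join();
}
```

A process with a given rank would call task_range(100000, rank, M) and pass the resulting range to run_tasks together with N and the task function. Since the split across computers is static, a slow machine in a heterogeneous pool would hold up the whole run; handing out chunks of tasks on request from one coordinating process might suit that case better, but I have not tried it.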