boost-mpi - Serious performance degradation with OpenMPI 1.10.1
My project has been using Boost's MPI interface for a number of years, and we have recently encountered a significant performance obstacle moving to OpenMPI 1.10.1. We are seeing a factor of 7-10x slowdown in the communication portion of our calculation. Our communication patterns are mostly governed by all-to-one reductions and some all-to-all reductions (see the sketch below). Colleagues who call MPI directly have not seen this behavior. We would be willing to work with someone to find a solution.
Steve Nolen, LANL
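For reference, a minimal sketch of the kind of reduction pattern described above, written against boost::mpi collectives; the names and payload size are illustrative, not taken from the actual application:

// Minimal sketch of the reduction pattern; names and sizes are
// illustrative, not from the actual application.
#include <boost/mpi.hpp>
#include <functional>
#include <vector>

namespace mpi = boost::mpi;

int main(int argc, char* argv[]) {
    mpi::environment env(argc, argv);
    mpi::communicator world;

    const int n = 1 << 20;                       // illustrative payload size
    std::vector<double> local(n, world.rank());  // per-rank contribution
    std::vector<double> global(n, 0.0);

    // All-to-one reduction: every rank's contribution summed onto rank 0.
    mpi::reduce(world, local.data(), n, global.data(), std::plus<double>(), 0);

    // All-to-all reduction: every rank ends up with the summed result.
    mpi::all_reduce(world, local.data(), n, global.data(), std::plus<double>());
    return 0;
}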
Hello
I've seen similar slowdowns for point-to-point communication for a long time: http://lists.boost.org/boost-users/2015/02/83807.php including with other MPI implementations. If you can use git bisect on https://github.com/open-mpi/ompi to find the commit in which your slowdown starts, that might help a lot. Boost::mpi doesn't seem to be actively developed nowadays, even though there is git activity, so unless you're transferring very complicated data types I'd consider skipping boost::mpi and using MPI directly.
Ilja
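For comparison, the same reductions written against the C MPI API directly, which is the alternative suggested above; again only a minimal sketch with illustrative sizes:

// The same reductions through the C MPI API; sizes are illustrative.
#include <mpi.h>
#include <vector>

int main(int argc, char* argv[]) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;
    std::vector<double> local(n, rank), global(n, 0.0);

    // All-to-one reduction onto rank 0.
    MPI_Reduce(local.data(), global.data(), n, MPI_DOUBLE, MPI_SUM, 0,
               MPI_COMM_WORLD);

    // All-to-all reduction: result available on every rank.
    MPI_Allreduce(local.data(), global.data(), n, MPI_DOUBLE, MPI_SUM,
                  MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}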
I didn't check that specific version of open-mpi. I can't guarantee I'll find much time to look into this problem, but if you could provide a simple test case...
I used to spend some time on Boost.MPI, but having to deal with Boost specificities (bjam, git submodules, whatever is used for documentation these days) has just become overkill in my development environment. Bjam alone requires more brain power than I'm willing to dedicate to a single project.
FWIW, most "recent" changes in the chain were made in the serialization library. I don't know whether OpenMPI 1.10.1 changed anything with respect to multi-threading. (Boost.MPI now allows selecting multi-threaded (MT) versions of OpenMPI, which might disable some InfiniBand (IB) drivers.)
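A minimal sketch of that threading-level selection, assuming a Boost version whose mpi::environment constructor accepts a threading level; whether the granted level actually disables a given IB transport depends on how OpenMPI was built, not on this code:

// Sketch only: assumes an mpi::environment that accepts a threading
// level (mapped to MPI_Init_thread underneath).
#include <boost/mpi.hpp>
#include <iostream>

namespace mpi = boost::mpi;

int main(int argc, char* argv[]) {
    // Request fully multi-threaded MPI (MPI_THREAD_MULTIPLE).
    mpi::environment env(argc, argv, mpi::threading::multiple);

    // The implementation may grant less than was requested; worth
    // checking before blaming (or exonerating) the interconnect.
    std::cout << "granted threading level: "
              << static_cast<int>(mpi::environment::thread_level()) << "\n";
    return 0;
}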
Alain
I used http://lists.boost.org/boost-users/att-83807/test.cpp, which showed that boost::mpi was at least 10x slower than MPI for sending largeish vectors between two processes on a laptop with openmpi 1.6.5, but the problem didn't seem specific to openmpi.
Ilja
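The linked test.cpp is the authoritative reproducer; the following is only a sketch in its spirit, timing one transfer of a largeish vector from rank 0 to rank 1 through boost::mpi's serializing send/recv and then through raw MPI on the same buffer:

// Not the linked test.cpp; a minimal sketch in its spirit. A real
// measurement would repeat the transfers and average.
#include <boost/mpi.hpp>
#include <boost/mpi/timer.hpp>
#include <iostream>
#include <vector>

namespace mpi = boost::mpi;

int main(int argc, char* argv[]) {
    mpi::environment env(argc, argv);
    mpi::communicator world;
    if (world.size() < 2) return 1;

    std::vector<double> data(1 << 22, 1.0);   // "largeish" payload

    world.barrier();
    mpi::timer t;                             // wraps MPI_Wtime
    if (world.rank() == 0) {
        world.send(1, 0, data);               // goes through serialization
    } else if (world.rank() == 1) {
        world.recv(0, 0, data);
        std::cout << "boost::mpi: " << t.elapsed() << " s\n";
    }

    world.barrier();
    t.restart();
    if (world.rank() == 0) {
        MPI_Send(data.data(), static_cast<int>(data.size()), MPI_DOUBLE,
                 1, 1, MPI_COMM_WORLD);
    } else if (world.rank() == 1) {
        MPI_Recv(data.data(), static_cast<int>(data.size()), MPI_DOUBLE,
                 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::cout << "raw MPI:    " << t.elapsed() << " s\n";
    }
    return 0;
}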
participants (3)
- Ilja Honkonen
- Miniussi Alain
- Nolen, Steven Douglas