[Boost-users] [mpi] broadcast performance

5 Sep 2012

Hi all,

I need some advice on using the broadcast function of boost::mpi.  We
have a large buffer (sometimes gigabytes) that we need to get to all
child nodes.  We currently use boost::serialization with a binary
archive to write the data into a std::vector<char>, send those bytes
to every node with MPI_Bcast, and deserialize on the receiving side.
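
For concreteness, here is a minimal sketch of the "send the bytes" step
of that approach.  It is not our production code: broadcast_bytes and
its parameters are hypothetical names, the std::vector<char> is assumed
to already hold the output of the binary archive, and the collective is
passed in as a callable so the sketch compiles without MPI.  The
chunking is there because MPI_Bcast takes an int count, which a
multi-gigabyte buffer can exceed.  Sending the size as raw bytes assumes
all nodes share the same byte order.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: broadcast a byte buffer from the root to every rank.
// `bcast(ptr, count)` stands in for the collective.  With real MPI it could be
//   [&](void* p, int n) { MPI_Bcast(p, n, MPI_BYTE, root, MPI_COMM_WORLD); }
// `max_chunk` caps each call because MPI_Bcast takes an `int` count.
template <class Bcast>
void broadcast_bytes(std::vector<char>& buf, bool is_root, Bcast bcast,
                     std::size_t max_chunk = INT_MAX)
{
    assert(max_chunk > 0 && max_chunk <= static_cast<std::size_t>(INT_MAX));

    // Step 1: every rank learns the payload size (root sends, others receive).
    std::uint64_t size = is_root ? buf.size() : 0;
    bcast(&size, static_cast<int>(sizeof size));

    // Step 2: non-root ranks make room for the incoming bytes.
    if (!is_root) buf.resize(static_cast<std::size_t>(size));

    // Step 3: move the payload in pieces no larger than max_chunk.
    for (std::size_t off = 0; off < buf.size();) {
        const std::size_t n = std::min(max_chunk, buf.size() - off);
        bcast(buf.data() + off, static_cast<int>(n));
        off += n;
    }
}
```

Every rank calls broadcast_bytes with the same callable; the root passes
the serialized buffer, the others pass an empty vector and deserialize
from it afterwards.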

I've started testing similar functionality using
boost::mpi::broadcast to handle serialization and deserialization.
Tracing through the code, it seems that the data is sent to each child
node individually via isend.  Is there something I can do to ensure
that MPI_Bcast is used instead?  With only a couple of nodes, the
per-node sends are fine, but with more nodes, the MPI implementation's
Bcast may do a better job (time logarithmic in, or even independent
of, the number of nodes).
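
To put rough numbers on that, here is a small sketch comparing the two
patterns.  The function names are hypothetical, and the tree count
assumes a binomial-tree broadcast, which is only one of the algorithms
an MPI implementation might choose; it ignores message size and network
topology entirely.

```cpp
#include <cassert>

// Messages the root must issue when it sends to every other rank itself.
inline int linear_sends(int ranks)
{
    assert(ranks >= 1);
    return ranks - 1;
}

// Rounds needed by a binomial-tree broadcast: in each round every rank that
// already has the data forwards it to one rank that does not, so the number
// of ranks reached doubles.  The result is the smallest r with 2^r >= ranks.
inline int tree_rounds(int ranks)
{
    assert(ranks >= 1);
    int rounds = 0;
    for (long reached = 1; reached < ranks; reached *= 2) ++rounds;
    return rounds;
}
```

With 64 ranks the root issues 63 sends in the first pattern, while the
tree finishes in 6 rounds.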

What would you suggest for getting a fast broadcast in this case?  I
don't think that using skeletons will help, since each broadcast will
carry unique data with a potentially different layout.

Thanks,
  Brian

Brian Budge