[mpi] broadcast performance
Hi all,
I need some advice on using the broadcast function of boost::mpi. We have a large buffer (sometimes gigabytes) that we need to get to all child nodes. We currently use boost::serialization with a binary archive to write the data into a std::vector<char>, send that buffer across with MPI_Bcast, and deserialize it on the receiving side.
I've started testing similar functionality using boost::mpi::broadcast to handle serialization and deserialization. Tracing through the code, it seems that the data is sent to each child node individually via isend. Is there something I can do to ensure that MPI_Bcast is used instead? With only a couple of nodes the isend approach is fine, but with more nodes the MPI implementation of Bcast may do a better job (logarithmic or even constant time in the number of nodes).
What are the suggestions for getting a fast broadcast in this case? I don't think that using skeletons will help, since each instance of the broadcast will have unique data with potentially different layouts.
Thanks,
Brian
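To put rough numbers on the scaling concern above, here is a small sketch comparing how many messages the root issues when it sends to every other rank itself with how many rounds a binomial-tree broadcast needs. The binomial tree is only one algorithm an MPI library might pick for MPI_Bcast (an assumption, not something the thread states), and both function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Messages the root issues when it sends to each of the other ranks in turn.
// (With non-blocking sends these may overlap, so this is a message count,
// not a strict measure of elapsed time.)
std::size_t linear_send_count(std::size_t ranks) {
    assert(ranks >= 1);
    return ranks - 1;
}

// Rounds needed by a binomial-tree broadcast: in every round, each rank that
// already holds the data forwards it to one rank that does not, so the number
// of informed ranks doubles until everyone is covered.
std::size_t binomial_tree_rounds(std::size_t ranks) {
    assert(ranks >= 1);
    std::size_t rounds = 0;
    std::size_t informed = 1;  // only the root has the data at the start
    while (informed < ranks) {
        informed *= 2;
        ++rounds;
    }
    return rounds;
}
```

For 64 ranks the root issues 63 sends in the first scheme, while the tree finishes in 6 rounds, which is the kind of gap that grows with the node count.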
Hi Brian,
If I understood correctly, you're currently doing something like:
std::vector<char> gigaVec; MPI_Bcast(&gigaVec[0], blah, blah, ...)
and want to replace that with boost::mpi::broadcast. Is that correct?
Just do it the same way and pass the raw buffer: if the element type of the container is an MPI type, the underlying MPI implementation is guaranteed to be called.
Regards,
Júlio.
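A minimal sketch of the raw-buffer approach suggested above, assuming the serialized data already sits in a std::vector<char>. It uses the pointer-and-count overload of boost::mpi::broadcast and sends the size first so that receivers can allocate. The helpers chunk_sizes and broadcast_bytes and the macro USE_BOOST_MPI are hypothetical names. The chunking is an added precaution, not something discussed in the thread: the count parameter is an int, which a multi-gigabyte buffer can exceed.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical helper: split `total` bytes into pieces of at most `max_chunk`
// bytes so that each piece fits the int count of a broadcast call.
std::vector<std::size_t> chunk_sizes(std::size_t total,
                                     std::size_t max_chunk = static_cast<std::size_t>(INT_MAX)) {
    assert(max_chunk > 0);
    std::vector<std::size_t> sizes;
    while (total > 0) {
        const std::size_t n = total < max_chunk ? total : max_chunk;
        sizes.push_back(n);
        total -= n;
    }
    return sizes;
}

#ifdef USE_BOOST_MPI  // hypothetical macro: define it when building against Boost.MPI
#include <boost/mpi.hpp>

// Hypothetical helper, called by every rank. On the root, `buf` holds the
// serialized bytes; on the other ranks it is resized and filled in.
void broadcast_bytes(const boost::mpi::communicator& world,
                     std::vector<char>& buf, int root) {
    unsigned long size = static_cast<unsigned long>(buf.size());
    boost::mpi::broadcast(world, size, root);  // scalar of an MPI type
    buf.resize(size);                          // no-op on the root
    std::size_t offset = 0;
    for (std::size_t n : chunk_sizes(buf.size())) {
        // char is an MPI type, so this overload should map onto MPI_Bcast.
        boost::mpi::broadcast(world, buf.data() + offset, static_cast<int>(n), root);
        offset += n;
    }
}
#endif
```

Deserialization then happens on each rank from `buf`, exactly as in the hand-written MPI_Bcast version.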
Okay. I can do that. I was just wondering if there was a trick to
make it happen under the hood. I'm curious as to why Bcast doesn't
get called by boost::mpi::broadcast for non-trivial types.
Thanks,
Brian
In reply to Júlio Hoffimann's message of Wed, Sep 5, 2012 at 7:00 PM.
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
Brian,
You can think of Boost.MPI as a very well-designed wrapper. All it does is
call the underlying C implementation (OpenMPI, MPICH, or others) when the
types are covered by the MPI standard.
On the other hand, I agree with you: it might be possible to specialize a
template for std::vector<T> that handles it as a raw buffer. Does anyone have an
opinion about this?
When I have time, I'll think it through carefully and see whether I can contribute a patch.
Regards,
Júlio.
In reply to Brian Budge's message of 2012/9/6.
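The dispatch described above (native MPI call for covered types, serialization otherwise) can be modeled roughly as follows. This is a simplified sketch, not Boost.MPI's actual source: is_native_wire_type is a hypothetical stand-in for the boost::mpi::is_mpi_datatype trait, and the functions only report which path would be taken.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for boost::mpi::is_mpi_datatype<T>: here, only
// arithmetic types count as having a native MPI datatype.
template <class T>
struct is_native_wire_type : std::is_arithmetic<T> {};

enum class Path { NativeCollective, SerializeThenSend };

// Chosen when the trait is true: the value could go straight to MPI_Bcast.
template <class T>
Path broadcast_path_impl(const T&, std::true_type) {
    return Path::NativeCollective;
}

// Chosen when the trait is false: the value is packed into an archive first.
template <class T>
Path broadcast_path_impl(const T&, std::false_type) {
    return Path::SerializeThenSend;
}

// Tag dispatch on the trait, resolved at compile time.
template <class T>
Path broadcast_path(const T& value) {
    return broadcast_path_impl(value, typename is_native_wire_type<T>::type());
}
```

In this model a std::vector<char> object takes the serialization path even though its element type is native, which is why passing the raw pointer and a count makes a difference. A std::vector<T> specialization like the one proposed above would amount to adding a third overload that forwards the vector's contents to the native path.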
Hi Julio -
I may be completely wrong, but my understanding was that when
a send call happens, serialization magic occurs that builds an
MPI_Datatype, and that by then handing the data to MPI_Send etc.
we avoid an extra copy. Is that right?
But perhaps that won't work in my case. I doubt that MPI_Recv is
capable of building a complex hierarchy back up, including pointers,
calls to operator new, etc. Perhaps you have to have a fully
instantiated object of the same kind in order to use this
functionality with MPI_Recv?
I have a virtual message hierarchy, and the messages (or shared_ptrs
of messages) perform virtual dispatch upon being recv'd. Is there
anything performance-wise to be gained by using boost::mpi for
send/recv/broadcast? Or is the MPI_Datatype performance gain only
applicable to classes that have (perhaps complex, but) concrete layout
with object instantiation on the stack?
It seems that if I can't get the MPI_Datatype benefit for my types, I
may be better off maintaining my own buffers for serialization, so I
can potentially lower the number of memory allocations.
Thanks,
Brian
In reply to Júlio Hoffimann's message of Thu, Sep 6, 2012 at 3:29 AM.
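One way to read "maintaining my own buffers" from the message above is a byte buffer that is cleared between messages but keeps its allocation. The class below is a hypothetical sketch of that idea under that assumption; it is not part of Boost.MPI or Boost.Serialization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reusable serialization buffer: reset() drops the contents
// but keeps the capacity, so later messages of similar size do not allocate.
class ReusableBuffer {
public:
    // Forget the current contents; std::vector::clear keeps the capacity.
    void reset() { bytes_.clear(); }

    // Append n raw bytes starting at data.
    void append(const void* data, std::size_t n) {
        assert(data != nullptr || n == 0);
        const char* p = static_cast<const char*>(data);
        bytes_.insert(bytes_.end(), p, p + n);
    }

    char* data() { return bytes_.data(); }
    std::size_t size() const { return bytes_.size(); }
    std::size_t capacity() const { return bytes_.capacity(); }

private:
    std::vector<char> bytes_;
};
```

The pointer from data() together with size() is what would be handed to a raw broadcast or send call, and the same object can be reset and refilled for the next message.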
Hi Brian,
I don't remember the details, but what you said is completely right. When we pass an object to any of the Boost.MPI functions, it is either of an MPI type, in which case it is forwarded directly to the C implementation, or of a serializable type, in which case the serialization magic happens.
At the other end of the wire, Boost.MPI will deserialize the object automatically, with no additional work on your part. You keep working in high-level C++.
The main bottleneck here is serializing and deserializing repeatedly. As you already know, Boost.MPI solved this problem for some cases (those with a fixed layout) with the skeleton and content approach.
When that approach is not applicable, you have to live with raw C buffers and the &vec[0] trick. I'll take a closer look to see whether specializing the template with that trick is safe and covered by the C++ standard.
You're also free to investigate and produce patches. :-)
Regards,
Júlio.
The only idea I had was potentially to use MPI_Type_hindexed and
MPI_Address to describe the full memory layout, and then go through the
data calling placement new, etc.
Given the lack of documentation around how to actually do this with
MPI, I can't really think of anything better than what is currently
happening inside boost::mpi. If I need the better performance, I will
have to uglify my code :)
Thanks,
Brian
In reply to Júlio Hoffimann's message of Thu, Sep 6, 2012 at 10:49 AM.
Participants (2): Brian Budge, Júlio Hoffimann