Details of boost parallel graph implementation
On Wed, Jul 29, 2015 at 4:26 AM Mayank Bakshi
Hello,
I am working on the MPI implementation of Boost for breadth-first search. What are the algorithmic details of BFS in Boost.MPI? How is the graph processed by each rank that we provide in mpirun (load distribution among the different processes)?
The sequential BGL library is overloaded on distributed graphs: https://github.com/boostorg/graph/blob/master/include/boost/graph/breadth_fi... Look for the mpl::true_ and mpl::bool_ to see where distributed vs. not-distributed selection is made. graph_parallel provides some additional overloads of the helper, but it does not define a brand new BFS: https://github.com/boostorg/graph_parallel/blob/master/include/boost/graph/d... The whole magic happens in the queue, graph, and property maps data structures. The queue distributes work, and it also implements a global "empty" check, and the property maps allow accessing remote vertices. So, if you need MPI BFS, one is already there.
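The mpl::true_/mpl::bool_ selection described above is ordinary tag dispatch. Here is a minimal stand-alone sketch of the pattern, using std::true_type in place of mpl::true_; the trait `is_distributed` and the two graph types are illustrative stand-ins, not BGL's actual names:

```cpp
#include <string>
#include <type_traits>

// Hypothetical trait standing in for the distributed-graph detection
// that BGL does through graph_traits specializations.
template <class Graph>
struct is_distributed : std::false_type {};

struct local_graph {};
struct distributed_graph {};

template <>
struct is_distributed<distributed_graph> : std::true_type {};

// Overload chosen for non-distributed graphs (the mpl::false_ case).
template <class Graph>
std::string bfs_impl(const Graph&, std::false_type) {
    return "sequential BFS";
}

// Overload chosen for distributed graphs (the mpl::true_ case).
template <class Graph>
std::string bfs_impl(const Graph&, std::true_type) {
    return "distributed BFS";
}

// Single entry point; the tag type selects the overload at compile time.
template <class Graph>
std::string breadth_first_search(const Graph& g) {
    return bfs_impl(g, typename is_distributed<Graph>::type{});
}
```

Calling `breadth_first_search(local_graph{})` resolves to the sequential overload, while `breadth_first_search(distributed_graph{})` resolves to the distributed one, with no runtime branching.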
Hi Marcin,
Thanks for the reply.
1. Which are the files in the source tree where I can get the details of
the implementation of the queue, graph, and the property map for serial
as well as distributed (MPI) cases?
Also, I had the following query.
With reference to document
http://www.boost.org/doc/libs/1_58_0/libs/graph_parallel/doc/html/overview.h...
For parallel version of BFS, the steps that I inferred are as below. Could
you please correct me in this:
1. The root node process reads the input file, which is in the METIS
format.
2. The root node process creates the adjacency list representation of the
graph.
3. The root node process creates the property map and "ghost cells" of the
graph. The various entries are marked as white, one corresponding to each
of the nodes.
4. The root node process will distribute the adjacency list and property
map to each of the processes.
5. Each of the non-root nodes now has its own copy of the adjacency list
and property map, along with details of the source node where the BFS is
to start.
6. Now the process having the source node starts the traversal of the
graph.
7. The source node is put in a queue.
8. The process owning the source node will dequeue it and mark all its
neighbors as gray, putting them in the queue. It will also mark the
source node itself as black.
9. The property maps on each of the nodes will now be updated.
10. Each of the node processes will be monitoring its own property map.
If any of the processes finds a node marked as gray, it will
enqueue that particular node and start its own traversal locally.
11. Thus on every node the property map entries will be marked as gray and
then eventually black after the visit is complete. The property maps
will be synchronised.
12. Steps 8 to 11 are repeated until all the entries in every
property map are marked as black.
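Setting the distribution aside, the white/gray/black discipline in steps 7-12 above matches ordinary sequential BFS. A minimal stand-alone sketch in plain C++ (not the BGL code; the adjacency-list representation and the name `bfs` are illustrative):

```cpp
#include <queue>
#include <vector>

enum class Color { white, gray, black };

// Returns BFS distances from `source`; -1 marks unreachable vertices.
// `adj` is a plain adjacency list; the `color` vector plays the role
// of a BGL color property map.
std::vector<int> bfs(const std::vector<std::vector<int>>& adj, int source) {
    std::vector<Color> color(adj.size(), Color::white);
    std::vector<int> dist(adj.size(), -1);
    std::queue<int> q;
    color[source] = Color::gray;  // discovered, not yet finished
    dist[source] = 0;
    q.push(source);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (int v : adj[u]) {
            if (color[v] == Color::white) {  // first discovery
                color[v] = Color::gray;
                dist[v] = dist[u] + 1;
                q.push(v);
            }
        }
        color[u] = Color::black;  // all neighbors examined
    }
    return dist;
}
```

The distributed version keeps the same discipline but spreads the color map, distance map, and queue across ranks.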
Is my interpretation of the parallel implementation correct?
How often are the property maps on various nodes synchronized and under
what condition?
Is there any synchronisation with respect to the queues?
please help,
regards,
-Mahesh
From: Marcin Zalewski
On Thu, Jul 30, 2015 at 12:17 PM Mahesh Barve
Hi Marcin,
Thanks for the reply.
1. Which are the files in the source tree where I can get the details of the implementation of the queue, graph, and the property map for serial as well as distributed (MPI) cases?
BFS implementation is here: https://github.com/boostorg/graph_parallel/blob/master/include/boost/graph/d... You can call it with your own queue: https://github.com/boostorg/graph_parallel/blob/master/include/boost/graph/d... or without: https://github.com/boostorg/graph_parallel/blob/master/include/boost/graph/d... When you call it without the queue, one is created for you: https://github.com/boostorg/graph_parallel/blob/master/include/boost/graph/d... Here is the queue header file: https://github.com/boostorg/graph_parallel/blob/master/include/boost/graph/d... and implementation: https://github.com/boostorg/graph_parallel/blob/master/include/boost/graph/d... Any other implementation you want to see can be found by browsing the linked repository.
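The "when you call it without the queue, one is created for you" arrangement is a plain forwarding overload in C++. A generic sketch of the pattern (the name `traverse` and the std::queue default are illustrative, not the PBGL signatures):

```cpp
#include <queue>
#include <string>

// Caller-supplied queue: the algorithm just uses it.
template <class Queue>
std::string traverse(int source, Queue& q) {
    q.push(source);
    // ... the real algorithm would drain the queue here ...
    return "queue size " + std::to_string(q.size());
}

// No queue supplied: construct a default one and forward to the
// overload above. In PBGL the default is the distributed queue.
inline std::string traverse(int source) {
    std::queue<int> q;
    return traverse(source, q);
}
```

Supplying your own queue lets you change the work-distribution policy without touching the algorithm itself, which is exactly what the linked overloads allow.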
Also, I had the following query.
With reference to document
http://www.boost.org/doc/libs/1_58_0/libs/graph_parallel/doc/html/overview.h... For parallel version of BFS, the steps that I inferred are as below. Could you please correct me in this:
1. The root node process reads the input file, which is in the METIS format.
2. The root node process creates the adjacency list representation of the graph.
http://www.boost.org/doc/libs/1_58_0/libs/graph_parallel/doc/html/distribute... Use the documentation above. If you create the adjacency_list using the edge-list constructor, you should call the constructor on every node with an equivalent list of edges. So you should probably read in your graph on every node, or distribute it from the root node. While the documentation says that every node needs an equivalent list of edges, if you look at the implementation you can get away with just the list of edges for the particular node, but you have to be careful about it.
3. The root node process creates the property map and "ghost cells" of the graph. The various entries are marked as white, one corresponding to each of the nodes.
4. The root node process will distribute the adjacency list and property map to each of the processes.
Everything must happen on all nodes. Here is an example of how to run BFS: https://github.com/boostorg/graph_parallel/blob/master/example/breadth_first... The same program is run in every MPI process (SPMD). The example will probably answer most of your questions.
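The SPMD, owner-computes structure can be illustrated with a single-process simulation: every simulated rank runs the same per-level loop, newly discovered vertices are shipped to their owning rank's frontier, and the loop terminates on a global "all frontiers empty" check. This is a sketch of the idea only; the real PBGL queue and property maps do this over MPI, and the name `distributed_bfs` and the round-robin ownership rule are assumptions of this example:

```cpp
#include <vector>

// Simulated distributed BFS: vertices are assigned round-robin to
// `nranks` owners; each level, every rank relaxes its local frontier
// and sends newly discovered vertices to their owners' next frontier.
// Terminates when ALL frontiers are empty (the global "empty" check).
std::vector<int> distributed_bfs(const std::vector<std::vector<int>>& adj,
                                 int source, int nranks) {
    auto owner = [nranks](int v) { return v % nranks; };
    // In the real implementation `dist` would live in a distributed
    // property map with ghost cells; here it is shared for simplicity.
    std::vector<int> dist(adj.size(), -1);
    std::vector<std::vector<int>> frontier(nranks), next(nranks);
    dist[source] = 0;
    frontier[owner(source)].push_back(source);  // enqueued on its owner
    for (int level = 1;; ++level) {
        bool any = false;
        for (int r = 0; r < nranks; ++r)
            any = any || !frontier[r].empty();
        if (!any) break;                        // global termination check
        for (int r = 0; r < nranks; ++r) {      // every rank, same code
            for (int u : frontier[r])
                for (int v : adj[u])
                    if (dist[v] == -1) {
                        dist[v] = level;
                        next[owner(v)].push_back(v);  // ship to owner
                    }
        }
        for (int r = 0; r < nranks; ++r) {
            frontier[r] = std::move(next[r]);
            next[r].clear();
        }
    }
    return dist;
}
```

The level-by-level frontier exchange here plays the role of the distributed queue's synchronization, and the "any frontier non-empty" test plays the role of its global empty check.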
5. Each of the non-root nodes now has its own copy of the adjacency list and property map, along with details of the source node where the BFS is to start.
6. Now the process having the source node starts the traversal of the graph.
7. The source node is put in a queue.
8. The process owning the source node will dequeue it and mark all its neighbors as gray, putting them in the queue. It will also mark the source node itself as black.
9. The property maps on each of the nodes will now be updated.
10. Each of the node processes will be monitoring its own property map. If any of the processes finds a node marked as gray, it will enqueue that particular node and start its own traversal locally.
11. Thus on every node the property map entries will be marked as gray and then eventually black after the visit is complete. The property maps will be synchronised.
12. Steps 8 to 11 are repeated until all the entries in every property map are marked as black.
Is my interpretation of the parallel implementation correct?
Maybe I misunderstood your question. Are you implementing your own BFS or using the Boost one?
How often are the property maps on various nodes synchronized and under what condition?
Is there any synchronisation with respect to the queues?
Yes, check the linked implementation.
please help, regards, -Mahesh
From: Marcin Zalewski
To: boost-users@lists.boost.org
Date: 07/30/2015 10:03 AM
Subject: Re: [Boost-users] Details of boost parallel graph implementation
Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
Hi Marcin,
Thanks for the reply.
1. Which are the files in the source tree where I can get the details of
the implementation of the queue, graph, and the property map for serial
as well as distributed (MPI) cases?
Is there any document that gives details of the implementation?
2. How often are the property maps on various nodes synchronized and under
what condition?
3. Is there any synchronisation with respect to the queues? Where can I
get more details of these?
please help,
regards,
-Mahesh
From: Marcin Zalewski
participants (3)
- Mahesh Barve
- Marcin Zalewski
- Mayank Bakshi