Hi, I've been looking around the BGL for some time now, but I still can't decide which graph I should use. Here are my limitations : - Very large graph ( > 40 million vertices) - About 4 to 8 edge per vertex (Sparse) I want to associate to each edge a property P such that : edge (A,B) returns P edge (B,A) returns -P (Directed) BUT I do not want to duplicate the storage of P. If edge (A,B) exists then (B,A) also exists. (Bidirectional) I also want to be able to access all edges connected to vertex A as quickly as possible (so I guess using in_edges and out_edges in a directed adjacency list will be too slow). I think the compressed_sparse_row_graph might be the way to go by storing all duplicate edges (the only implementation for the moment is directedS), but I need a way to not duplicate the properties associated with each edge. All advice will be greatly appreciated !! Thanks Jacques --------------------------- This email contains information that is private and confidential and is intended only for the addressee. If you are not the intended recipient please delete it and notify us immediately by e-mailing the sender. Note: All email sent to or from this address may be accessed by someone other than the recipient, for system management and security reasons. Aircraft Research Association Ltd. Registered in England, Registration No 503668 Registered Office: Manton Lane, Bedford MK41 7PF England VAT No GB 196351245
Hi,
On 9/12/07, Jacques Papper
Hi,
I've been looking around the BGL for some time now, but I still can't decide which graph I should use.
Here are my limitations :
- Very large graph ( > 40 million vertices) - About 4 to 8 edge per vertex (Sparse)
I want to associate to each edge a property P such that : edge (A,B) returns P edge (B,A) returns -P (Directed) BUT I do not want to duplicate the storage of P.
If edge (A,B) exists then (B,A) also exists. (Bidirectional)
I also want to be able to access all edges connected to vertex A as quickly as possible (so I guess using in_edges and out_edges in a directed adjacency list will be too slow).
I think the compressed_sparse_row_graph might be the way to go by storing all duplicate edges (the only implementation for the moment is directedS), but I need a way to not duplicate the properties associated with each edge.
All advice will be greatly appreciated !!
I agree with you, the only graph type you can use in BGL with very large graphs is the compressed_sparse_row_graph. You may use the singleton pattern for properties associated with each edge, so you can avoid creating multiple instances of property object. Otherwise, I use the BCG graph (http://www.inrialpes.fr/vasy/cadp/man/bcg.html) to handle very large graphs, but this graph format is not documented yet (a technical report is on-going), so I have to use the BCG environment providing by the CADP toolbox. Regards, -- Johan
On Sep 12, 2007, at 10:19 AM, Johan Oudinet wrote:
pper@ara.co.uk> wrote:
Here are my limitations :
- Very large graph ( > 40 million vertices) - About 4 to 8 edge per vertex (Sparse)
That should be feasible with compressed_sparse_row_graph. 40M vertices * 8 bytes per vertex = 320MB for vertex storage 40M vertices * 8 edges per vertex * 4 bytes per edge = 1.28GB
I want to associate to each edge a property P such that : edge (A,B) returns P edge (B,A) returns -P (Directed) BUT I do not want to duplicate the storage of P.
If edge (A,B) exists then (B,A) also exists. (Bidirectional)
Well, you can have both (A, B) and (B, A) hold pointers to a P object on the heap, along with a flag that states which edge is the forward edge. If the compressed sparse row graph supported undirectedS or bidirectionalS, one of those would work well for you.
I also want to be able to access all edges connected to vertex A as quickly as possible (so I guess using in_edges and out_edges in a directed adjacency list will be too slow).
Why will in_edges and out_edges be too slow? You're just iterating over a vector for each, or moving through indices in an array.
I think the compressed_sparse_row_graph might be the way to go by storing all duplicate edges (the only implementation for the moment is directedS), but I need a way to not duplicate the properties associated with each edge.
All advice will be greatly appreciated !!
I agree with you, the only graph type you can use in BGL with very large graphs is the compressed_sparse_row_graph.
Right. adjacency_list has a bit too much overhead.
You may use the singleton pattern for properties associated with each edge, so you can avoid creating multiple instances of property object.
Otherwise, I use the BCG graph (http://www.inrialpes.fr/vasy/cadp/man/bcg.html) to handle very large graphs, but this graph format is not documented yet (a technical report is on-going), so I have to use the BCG environment providing by the CADP toolbox.
An undocumented format... That's unfortunate, since it precludes any interoperability with the rest of the world. There is a highly-compressed graph representation that is compatible with the BGL, here: http://homer.informatics.indiana.edu/~nan/webgraph/ It's targeted a web graphs, but it may be useful for this. - Doug
Hi, I want to solve a shortest path problem using the Dijkstra algorithm with Fibonacci heaps. The particular aspect of the problem is that the labels associated to the vertices other than the source are not infinity. The labels are different each time the Dijkstra procedure is called. I would be extremely grateful if I could get any help on how to implement this within a VS2003 and what are the libraries that are available to handle the issue. Thank you in advance. Alberto PS: I am not only new to Boost but also new to C++ programming. _________________________________________________________________ Enter to win a night a VIP night out at TIFF http://redcarpet.sympatico.msn.ca/
I want to solve a shortest path problem using the Dijkstra algorithm with Fibonacci heaps. The particular aspect of the problem is that the labels associated to the vertices other than the source are not infinity. The labels are different each time the Dijkstra procedure is called.
Hi, Dijkstra's algorithm is implemented in the boost graph library (BGL, http://www.boost.org/libs/graph/doc/). See http://www.boost.org/libs/graph/doc/dijkstra_shortest_paths.html for details. The default implementation uses an implementation of an Relaxed heap (similiar to Fibonnaci heaps) as priority queue. The algorithm takes as parameter (among others) a data structure for storing the distances. You can initialize this prior to execution and call dijkstra_shortest_paths_no_init to avoid setting all labels to infinity. hth Moritz -- Moritz Hilger Combinatorial Optimization & Graph Algorithms TU Berlin +49 30 314-25773
participants (5)
-
Amar Alabiari
-
Douglas Gregor
-
Jacques Papper
-
Johan Oudinet
-
moritz Hilger