[graph] dijkstra pull request ping

older
Boost.Log 1.59.0 incompatible with...

Piotr Wygocki

20 Oct 2015 20 Oct '15

2:24 p.m.

Dear all, this mail concerns this pull request: https://github.com/boostorg/graph/pull/13 which is currently over one year long. Maybe, the idea of the patch is unclear or needs some additional discussion. The patch solves the problem which arises when using no_init version on dijkstra algorithm and passing a custom distance_map. This algorithm is implemented as call to BFS, which is very unfortunate since it is not BFS. Maybe BFS could be seen as special case of dijkstra with custom priority queue but not the other way around. The current implementation introduces some inefficiencies, which are described in the pull request's comment. I've created very short and direct patch which makes the implementation straight forward. Additionally, the unit test is added to check mentioned efficiency gain. Are there any additional questions concerning this pull request? Could it be possibly be merged to master? Maybe, we can use a favour of the community maintenance team. Regards, Piotr Wygocki

Show replies by date

Tim Keitt

20 Oct 20 Oct

4:21 p.m.

I'm not a maintainer, but was trying to ascertain whether the behavior is changed at all. Does it return an identical result in all cases? THK http://www.keittlab.org/ On Tue, Oct 20, 2015 at 9:24 AM, Piotr Wygocki <wygos@mimuw.edu.pl> wrote:

...

Dear all, this mail concerns this pull request:

https://github.com/boostorg/graph/pull/13

which is currently over one year long. Maybe, the idea of the patch is unclear or needs some additional discussion.

The patch solves the problem which arises when using no_init version on dijkstra algorithm and passing a custom distance_map. This algorithm is implemented as call to BFS, which is very unfortunate since it is not BFS. Maybe BFS could be seen as special case of dijkstra with custom priority queue but not the other way around. The current implementation introduces some inefficiencies, which are described in the pull request's comment.

I've created very short and direct patch which makes the implementation straight forward. Additionally, the unit test is added to check mentioned efficiency gain.

Are there any additional questions concerning this pull request? Could it be possibly be merged to master? Maybe, we can use a favour of the community maintenance team.

Regards,

Piotr Wygocki

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Piotr Wygocki

21 Oct 21 Oct

6:58 a.m.

Dear Tim,

...

Does it return an identical result in all cases?

No, it does not. I provided the test, which fails without my patch.

Regards, Piotr

Tim Keitt

1:59 p.m.

http://www.keittlab.org/ On Wed, Oct 21, 2015 at 1:58 AM, Piotr Wygocki <wygos@mimuw.edu.pl> wrote:

...

Dear Tim,

...
Does it return an identical result in all cases?

No, it does not. I provided the test, which fails without my patch.

Sorry. I mis-typed. I meant would a user ever need to know that the algorithm code was different or are there cases where your change would break existing code? THK

...

Regards,

Piotr

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

alex

11:26 a.m.

...

Dear all, this mail concerns this pull request:

https://github.com/boostorg/graph/pull/13

which is currently over one year long. Maybe, the idea of the patch is unclear or needs some additional discussion.

I think it does. The idea of your patch is to be able to specify a maximum distance for each node. As long as the graph distance is above that maximum distance the node does not get added to the queue. I can see that is useful, but it is not Dijkstra's algorithm. Could you just make your own function dijkstra_with_max_node_distance() or something like that?

...

The patch solves the problem which arises when using no_init version on dijkstra algorithm and passing a custom distance_map. This algorithm is implemented as call to BFS, which is very unfortunate since it is not BFS. Maybe BFS could be seen as special case of dijkstra with custom priority queue but not the other way around.

This does not seem a real problem. The BFS function is well suited to Dijkstra's algorithm, perhaps the only problem is that its name is not fully appropriate.

...

The current implementation introduces some inefficiencies, which are described in the pull request's comment.

I've created very short and direct patch which makes the implementation straight forward.

It does not really seem short as it is duplicating most of the BFS code.

...

Additionally, the unit test is added to check mentioned efficiency gain.

It is efficient for another problem than Dijkstra's.

Piotr Wygocki

1:22 p.m.

Dear Alex, Thank you for your feedback! I can see that is useful, but it is not

...

Dijkstra's algorithm.

Could you give the reference to the "real" Dijkstra algorithm? We are talking about implementation detail which was not concerned at all in the original Dijkstra paper. Note, that standard dijkstra call with default distance map works exactly the same with or without my patch. IMHO, the current conception (without my patch) is caused only by using BFS abstraction. Try to modify my code to achieve previous functionality and you find it really awkward.

...

This does not seem a real problem. The BFS function is well suited to Dijkstra's algorithm, perhaps the only problem is that its name is not fully appropriate.

It is important to make proper abstractions. As I previously said BFS could be expressed as Dijkstra with custom priority queue implemented as FIFO. It does not work the other way around. Dijkstra is NOT a special kind of BFS. Implementation happens to be similar, which was used in this case (which is bad IMO). Regards, Piotr

alex

3:33 p.m.

...

I can see that is useful, but it is not

...
Dijkstra's algorithm.

Could you give the reference to the "real" Dijkstra algorithm? We are talking about implementation detail which was not concerned at all in the original Dijkstra paper.

Dijkstra (1959) A Note on Two Problems in Connexion with Graphs: "If the node R belongs to set C, it is added to set B " Boost: if (v_color == Color::white()) { put(color, v, Color::gray()); Patch: if(decreased) { if (v_color == Color::white()) { put(color, v, Color::gray()); Now, I know that in the classical usage 'decreased' is always true, so there is no change in behaviour. I also know that the way that Boost implements Dijkstra, it evaluates 'decreased' anyway despite knowing beforehand it must be true. So you are not proposing to change behaviour or to make the function less efficient. But, I think BGL shouldn't evaluate 'decreased' to begin with (https://svn.boost.org/trac/boost/ticket/7387).

...

Note, that standard dijkstra call with default distance map works exactly the same with or without my patch. IMHO, the current conception (without my patch) is caused only by using BFS abstraction. Try to modify my code to achieve previous functionality and you find it really awkward.

I think one solution would be to use a visitor with a reference to the queue and to a map indicating precalculated nodes. You could then use the edge_relaxed() callback with void edge_relaxed(G& , E& e) { Vertev t = target(e); if( get(is_precalculated, t ) { queue.push(t) put(is_precalculated, t, false); } To use this you would need to initialize the color_map such that the color for precalculated nodes is gray.

...

...
This does not seem a real problem. The BFS function is well suited to Dijkstra's algorithm, perhaps the only problem is that its name is not fully appropriate.

It is important to make proper abstractions. As I previously said BFS could be expressed as Dijkstra with custom priority queue implemented as FIFO. It does not work the other way around. Dijkstra is NOT a special kind of BFS. Implementation happens to be similar, which was used in this case (which is bad IMO).

To me it looks like a good abstraction with a bad name.

Marcin Zalewski

8:15 p.m.

On Wed, Oct 21, 2015 at 11:33 AM alex <alexhighviz@hotmail.com> wrote:

...

...
I can see that is useful, but it is not

...
Dijkstra's algorithm.

Could you give the reference to the "real" Dijkstra algorithm? We are talking about implementation detail which was not concerned at all in the original Dijkstra paper.

Dijkstra (1959) A Note on Two Problems in Connexion with Graphs:

"If the node R belongs to set C, it is added to set B "

Boost:

if (v_color == Color::white()) { put(color, v, Color::gray());

Patch:

if(decreased) { if (v_color == Color::white()) { put(color, v, Color::gray());

Now, I know that in the classical usage 'decreased' is always true, so there is no change in behaviour. I also know that the way that Boost implements Dijkstra, it evaluates 'decreased' anyway despite knowing beforehand it must be true. So you are not proposing to change behaviour or to make the function less efficient. But, I think BGL shouldn't evaluate 'decreased' to begin with (https://svn.boost.org/trac/boost/ticket/7387).

The edge has to be relaxed, so we get decreased for free anyway, right? What do you mean by not evaluating decreased? Then we use decreased to check which visitor function we should call, so I am not sure if one could get rid of that without having two versions of the code (one for the "classical" version, and one for an arbitrary distance map).

...

...
Note, that standard dijkstra call with default distance map works exactly the same with or without my patch. IMHO, the current conception (without my patch) is caused only by using BFS abstraction. Try to modify my code to achieve previous functionality and you find it really awkward.

I think one solution would be to use a visitor with a reference to the queue and to a map indicating precalculated nodes.

You could then use the edge_relaxed() callback with void edge_relaxed(G& , E& e) { Vertev t = target(e); if( get(is_precalculated, t ) { queue.push(t) put(is_precalculated, t, false); }

To use this you would need to initialize the color_map such that the color for precalculated nodes is gray.

This sounds a bit complicated overall.

alex

9:55 p.m.

...

-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Marcin Zalewski Sent: 21 October 2015 21:16 To: boost@lists.boost.org Subject: Re: [boost] [graph] dijkstra pull request ping

...
Now, I know that in the classical usage 'decreased' is always true, so there is no change in behaviour. I also know that the way that Boost implements Dijkstra, it evaluates 'decreased' anyway despite knowing beforehand it must be true. So you are not proposing to change behaviour or to make the function less efficient. But, I think BGL shouldn't evaluate 'decreased' to begin with (https://svn.boost.org/trac/boost/ticket/7387).

The edge has to be relaxed, so we get decreased for free anyway, right?

The edge has to be relaxed, but the current boost code still applies a comparison to see that it is relaxed, which is wasteful (that is ticket 7387),

...

What do you mean by not evaluating decreased? Then we use decreased to check which visitor function we should call, so I am not sure if one could get rid of that without having two versions of the code (one for the "classical" version, and one for an arbitrary distance map).

Sorry I am not clear. Yes, I do think that you should get rid of the redundant check. And, yes that would mean that if you want the arbitrary distance map, you should have two versions.

...

...
Snip complicated stuff

This sounds a bit complicated overall.

Yes it is. I think it would be better to have two versions.

Marcin Zalewski

22 Oct 22 Oct

1:11 p.m.

On Wed, Oct 21, 2015 at 5:55 PM alex <alexhighviz@hotmail.com> wrote:

...

...
What do you mean by not evaluating decreased? Then we use decreased to check which visitor function we should call, so I am not sure if one could get rid of that without having two versions of the code (one for the "classical" version, and one for an arbitrary distance map).

Sorry I am not clear. Yes, I do think that you should get rid of the redundant check. And, yes that would mean that if you want the arbitrary distance map, you should have two versions.

I would only agree with that if we could demonstrate that there is a statistically significant performance advantage of doing so. Looking at the code, I am skeptical that this check has much or any impact on performance on most modern platforms. Do you have any evidence that shows there would be a benefit to having two versions of the code?

alex

8:55 p.m.

...

...
...
What do you mean by not evaluating decreased? Then we use decreased to check which visitor function we should call, so I am not sure if one could get rid of that without having two versions of the code (one for the "classical" version, and one for an arbitrary distance map).

Sorry I am not clear. Yes, I do think that you should get rid of the redundant check. And, yes that would mean that if you want the arbitrary distance map, you should have two versions.

I would only agree with that if we could demonstrate that there is a statistically significant performance advantage of doing so. Looking at the code, I am skeptical that this check has much or any impact on performance on most modern platforms. Do you have any evidence that shows there would be a benefit to having two versions of the code?

There are the following advantages: 1. you avoid one unnecessary check for each vertex 2. you do not need to initialize values in the distance map 3. you do not need to specify an "inf" value for distance (1) and (2) should bring some performance gain, but it is so little that I did not notice it and did not feel the urge to measure it. I don't think performance is a good reason to take out the redundant check. (2) and (3) however make the algorithm simpler and cleaner to use. It becomes easier / possible to apply the method when there is no good value for "inf". I do think that is a good reason to take out the redundant check. If you take out the redundant checks in the relax function, you no longer need to use boost::closed_plus and can do with std::plus. Again, to me that makes the function simpler and cleaner. But that is a separate issue, because you can do that whether you make two separate functions or not.

Marcin Zalewski

23 Oct 23 Oct

1:32 p.m.

On Thu, Oct 22, 2015 at 4:56 PM alex <alexhighviz@hotmail.com> wrote:

...

...
...
...
What do you mean by not evaluating decreased? Then we use decreased to check which visitor function we should call, so I am not sure if one could get rid of that without having two versions of the code (one for the "classical" version, and one for an arbitrary distance map).

Sorry I am not clear. Yes, I do think that you should get rid of the redundant check. And, yes that would mean that if you want the arbitrary distance map, you should have two versions.

I would only agree with that if we could demonstrate that there is a statistically significant performance advantage of doing so. Looking at the code, I am skeptical that this check has much or any impact on performance on most modern platforms. Do you have any evidence that shows there would be a benefit to having two versions of the code?

There are the following advantages:

1. you avoid one unnecessary check for each vertex 2. you do not need to initialize values in the distance map 3. you do not need to specify an "inf" value for distance

(1) and (2) should bring some performance gain, but it is so little that I did not notice it and did not feel the urge to measure it. I don't think performance is a good reason to take out the redundant check.

The check only becomes redundant if there are two versions of the code, one for dijkstra_shortest_paths_no_init and one for dijkstra_shortest_paths. The code that covers dijkstra_shortest_paths_no_init automatically covers dijkstra_shortest_paths. There is an advantage to it, and if one would provide separate code for the two, there should be a compelling reason. In my opinion, in this case the reason should be performance. Otherwise, why do this?

...

(2) and (3) however make the algorithm simpler and cleaner to use. It becomes easier / possible to apply the method when there is no good value for "inf". I do think that is a good reason to take out the redundant check.

Yes, I agree with you that it makes *one version* of the algorithm cleaner, but we have to make a decision that we want to provide that version in the first place. I think that providing an elegant version at the cost of two different implementations is actually additional complexity.

...

If you take out the redundant checks in the relax function, you no longer need to use boost::closed_plus and can do with std::plus. Again, to me that makes the function simpler and cleaner. But that is a separate issue, because you can do that whether you make two separate functions or not.

True, agreed. Still, the same as above: do we want to have two implementations instead of one?

alex

2:47 p.m.

...

-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Marcin Zalewski Sent: 23 October 2015 14:33 To: boost@lists.boost.org Subject: Re: [boost] [graph] dijkstra pull request ping

On Thu, Oct 22, 2015 at 4:56 PM alex <alexhighviz@hotmail.com> wrote:

...
...
...
...
What do you mean by not evaluating decreased? Then we use decreased to check which visitor function we should call, so I am not sure if one could get rid of that without having two versions of the code (one for the "classical" version, and one for an arbitrary distance map).

Sorry I am not clear. Yes, I do think that you should get rid of the redundant check. And, yes that would mean that if you want the arbitrary distance map, you should have two versions.

I would only agree with that if we could demonstrate that there is a statistically significant performance advantage of doing so. Looking at the code, I am skeptical that this check has much or any impact on performance on most modern platforms. Do you have any evidence that shows there would be a benefit to having two versions of the code?

There are the following advantages:

1. you avoid one unnecessary check for each vertex 2. you do not need to initialize values in the distance map 3. you do not need to specify an "inf" value for distance

(1) and (2) should bring some performance gain, but it is so little that I did not notice it and did not feel the urge to measure it. I don't think performance is a good reason to take out the redundant check.

The check only becomes redundant if there are two versions of the code, one for dijkstra_shortest_paths_no_init and one for dijkstra_shortest_paths. The code that covers dijkstra_shortest_paths_no_init automatically covers dijkstra_shortest_paths.

Now you lost me. I thought the dilemma is to have either one version of dijkstra_shortest_paths_no_init based on the patch by Piotr, or two have two versions, one as in the patch by Piotr and one as currently implemented. In either case, dijkstra_shortest_paths can remain just as it is. The patch by Piotr stands in the way of some other optimizations that were proposed 3 years ago. The benefit of this optimizations in terms of performance is only marginal, but they do make the dijkstra_shortest_path simpler and cleaner. Also, to me it seems his patch is no longer really Dijkstra's but some extension of it. To me it is natural to make that a separate function and give it another name. However since Piotr's patch doesn't break anything, but just extends the function, I can see that it is really just an issue of preference.

...

There is an advantage to it, and if one would provide separate code for the two, there should be a compelling reason. In my opinion, in this case the reason should be performance. Otherwise, why do this?

...
(2) and (3) however make the algorithm simpler and cleaner to use. It becomes easier / possible to apply the method when there is no good value for "inf". I do think that is a good reason to take out the redundant check.

Yes, I agree with you that it makes *one version* of the algorithm cleaner, but we have to make a decision that we want to provide that version in the first place. I think that providing an elegant version at the cost of two different implementations is actually additional complexity.

In my opinion it would not really be two versions of the same algorithm, but two functions for two similar algorithms. You can see that they are two different algorithms because in the application that Piotr references (http://paal.mimuw.edu.pl/docs/do_vv_2kminus1.html) it is used as a means of computing approximate shortest paths rather than exact shortest paths (if I understand correctly, because the precalculated distances supplied to the algorithm are approximate rather than exact).

...

...
If you take out the redundant checks in the relax function, you no longer need to use boost::closed_plus and can do with std::plus. Again, to me

that

...

...
makes the function simpler and cleaner. But that is a separate issue, because you can do that whether you make two separate functions or not.

True, agreed. Still, the same as above: do we want to have two implementations instead of one?

In this case there is not really that dilemma. All that is required here is an additional variation of the relax function (relax_target_only), but that is an implementation detail. This works with and without Piotr's patch.

...

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Marcin Zalewski

3:05 p.m.

On Fri, Oct 23, 2015 at 10:47 AM alex <alexhighviz@hotmail.com> wrote:

...

...
-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Marcin Zalewski Sent: 23 October 2015 14:33 To: boost@lists.boost.org Subject: Re: [boost] [graph] dijkstra pull request ping

On Thu, Oct 22, 2015 at 4:56 PM alex <alexhighviz@hotmail.com> wrote:

...
...
...
...
What do you mean by not evaluating decreased? Then we use decreased to check which visitor function we should call, so I am not sure if one could get rid of that without having two versions of the code (one for the "classical" version, and one for an arbitrary distance map).

Sorry I am not clear. Yes, I do think that you should get rid of the redundant check. And, yes that would mean that if you want the arbitrary distance map, you should have two versions.

I would only agree with that if we could demonstrate that there is a statistically significant performance advantage of doing so. Looking at the code, I am skeptical that this check has much or any impact on performance on most modern platforms. Do you have any evidence that shows there would be a benefit to having two versions of the code?

There are the following advantages:

1. you avoid one unnecessary check for each vertex 2. you do not need to initialize values in the distance map 3. you do not need to specify an "inf" value for distance

(1) and (2) should bring some performance gain, but it is so little that I did not notice it and did not feel the urge to measure it. I don't think performance is a good reason to take out the redundant check.

The check only becomes redundant if there are two versions of the code, one for dijkstra_shortest_paths_no_init and one for dijkstra_shortest_paths. The code that covers dijkstra_shortest_paths_no_init automatically covers dijkstra_shortest_paths.

Now you lost me. I thought the dilemma is to have either one version of dijkstra_shortest_paths_no_init based on the patch by Piotr, or two have two versions, one as in the patch by Piotr and one as currently implemented. In either case, dijkstra_shortest_paths can remain just as it is.

The patch by Piotr stands in the way of some other optimizations that were proposed 3 years ago. The benefit of this optimizations in terms of performance is only marginal, but they do make the dijkstra_shortest_path simpler and cleaner. Also, to me it seems his patch is no longer really Dijkstra's but some extension of it. To me it is natural to make that a separate function and give it another name.

However since Piotr's patch doesn't break anything, but just extends the function, I can see that it is really just an issue of preference.

The current implementation works both for the case of no_init where the distance property map has values in it already and for the "plain" case where we initialize the values to inf. Piotr's patch does not change that. He removes the visitor code and adds a direct loop code, but there is still only one version of the actual traversal. His patch makes the code consistent with the pseudo code here: http://www.boost.org/doc/libs/1_59_0/libs/graph/doc/dijkstra_shortest_paths.... Note that in the pseudo code discovery of a vertex happens only if the new distance is less than the old distance. If we would go with the ideas you describe, the code would not work for arbitrary distance map. There are 2 cases: 1) All values in the distance map are inf (or are implicitly inf). 2) User provides a distance map with arbitrary values and they must be respected per pseudo code. As far as I understand, you want to do something specific for case 1) where we do not need to do the check in the pseudo code because we know it will always be true, right? If we did so, we need to have code that also works for case 2). Right now we just have the code for case 2) and it also works for case 1). Let me know if I misunderstood something.

...

...
There is an advantage to it, and if one would provide separate code for the two, there should be a compelling reason. In my opinion, in this case the reason should be performance. Otherwise, why do this?

...
(2) and (3) however make the algorithm simpler and cleaner to use. It becomes easier / possible to apply the method when there is no good value for "inf". I do think that is a good reason to take out the redundant check.

Yes, I agree with you that it makes *one version* of the algorithm cleaner, but we have to make a decision that we want to provide that version in the first place. I think that providing an elegant version at the cost of two different implementations is actually additional complexity.

In my opinion it would not really be two versions of the same algorithm, but two functions for two similar algorithms. You can see that they are two different algorithms because in the application that Piotr references (http://paal.mimuw.edu.pl/docs/do_vv_2kminus1.html) it is used as a means of computing approximate shortest paths rather than exact shortest paths (if I understand correctly, because the precalculated distances supplied to the algorithm are approximate rather than exact).

I think it is the same function. It implements pseudo code in: http://www.boost.org/doc/libs/1_59_0/libs/graph/doc/dijkstra_shortest_paths.... The only difference in the two algorithms you describe is whether the distance map is provided by the user or whether it is initialized to inf for all vertices. We could write different code for the user-initialized and the inf-initialized maps, but that would give us more code to maintain. Piotr's implementation actually implements the check condition that can be seen in pseudo code before checking for WHITE.

...

...
...
If you take out the redundant checks in the relax function, you no

...
...
need to use boost::closed_plus and can do with std::plus. Again, to me

longer that

...
...
makes the function simpler and cleaner. But that is a separate issue, because you can do that whether you make two separate functions or not.

True, agreed. Still, the same as above: do we want to have two implementations instead of one?

In this case there is not really that dilemma. All that is required here is an additional variation of the relax function (relax_target_only), but that is an implementation detail. This works with and without Piotr's patch.

I am not sure about that, but it could be the case. Perhaps a patch could demonstrate that.

...

...
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

alex

4:49 p.m.

...

Note that in the pseudo code discovery of a vertex happens only if the new distance is less than the old distance.

No, the pseudo code is based on the implied precondition that all distances are infinity. And on that condition the pseudo code and the actual code are equivalent. Note that this implied precondition is also reflected in the description of post-conditions on the same page: "Dijkstra's algorithm finds all the shortest paths from the source vertex to every other vertex" "Upon completion of the algorithm, the edges (p[u],u) for all u in V are in the tree" "If p[u] = u then u is either the source vertex or a vertex that is not reachable from the source" "The shortest path weight is the sum of the edge weights along the shortest path" "At the end of the algorithm, vertices reachable from the source vertex will have been colored black. All other vertices will still be white" None of these are upheld for a distance map with arbitrary values.

...

If we would go with the ideas you describe, the code would not work for arbitrary distance map. There are 2 cases:

1) All values in the distance map are inf (or are implicitly inf). 2) User provides a distance map with arbitrary values and they must be respected per pseudo code.

I disagree, I don't think case 2 is a valid case for Dijkstra's shortest paths. At the moment using it like that gives inconsistent results. I don't think the purpose of the no_init version is to allow arbitrary distance values.

...

As far as I understand, you want to do something specific for case 1) where we do not need to do the check in the pseudo code because we know it will always be true, right? If we did so, we need to have code that also works for case 2). Right now we just have the code for case 2) and it also works for case 1). Let me know if I misunderstood something.

If you mean Piotr's patch, yes you understood correct. If you mean the current Boost version: you misunderstood, this is just for case 1.

Marcin Zalewski

27 Oct 27 Oct

8:45 p.m.

On Fri, Oct 23, 2015 at 12:49 PM alex <alexhighviz@hotmail.com> wrote:

...

...
Note that in the pseudo code discovery of a vertex happens only if the new distance is less than the old distance.

No, the pseudo code is based on the implied precondition that all distances are infinity. And on that condition the pseudo code and the actual code are equivalent.

I don't believe in "implied" assumptions. A comment in the pseudo code says: " *This loop is not run in dijkstra_shortest_paths_no_init"* I think that there are strictly two cases, one in which the loop is run and the other in which it is not. When the loop is run, the algorithm can be optimized in the way you have described previously (i.e., the check for improvement of distance may be removed when a vertex is white).

...

Note that this implied precondition is also reflected in the description of post-conditions on the same page:

"Dijkstra's algorithm finds all the shortest paths from the source vertex to every other vertex"

That sentence continues with "by" and the rest of the paragraph which specifies what is actually supposed to hapen.

...

"Upon completion of the algorithm, the edges (p[u],u) for all u in V are in the tree"

That just says what the tree is.

...

"If p[u] = u then u is either the source vertex or a vertex that is not reachable from the source"

That is probably tough. With the no_init version one can have vertices without a parent that are still reachable from source. I would not take this as restricting the possible values of the distance map in the no_init version. I think that by specifying a custom distance map, one can basically modify what is reachable and what is not.

...

"The shortest path weight is the sum of the edge weights along the shortest path"

That is still true, modulo the meaning of the word reachable.

...

"At the end of the algorithm, vertices reachable from the source vertex will have been colored black. All other vertices will still be white"

Again, modulo reachable.

...

None of these are upheld for a distance map with arbitrary values.

They mostly are, except that one can make some vertices unreachable. I think that all the post-conditions you have listed here are not strong enough to forbid arbitrary values in the distance map for the no_init version of the algorithm. I myself would assume by reading the pseudo code and the documentation that there is nothing preventing me from using the no_init version of the algorithm along with a custom distance map to control traversal. I think that forbidding custom distance map at this point would not be backward compatible, but it would also be counterproductive, as there is really not a good reason to do so. If anything, the documentation should be made more unambiguous on the point of what are the possible inputs for the distance map.

...

...
If we would go with the ideas you describe, the code would not work for arbitrary distance map. There are 2 cases:

1) All values in the distance map are inf (or are implicitly inf). 2) User provides a distance map with arbitrary values and they must be respected per pseudo code.

I disagree, I don't think case 2 is a valid case for Dijkstra's shortest paths. At the moment using it like that gives inconsistent results. I don't think the purpose of the no_init version is to allow arbitrary distance values.

OK, so here we disagree. I think that this is precisely the reason for the no_init version. Why would we need it otherwise? Besides, I think that there is no reason to forbid such a usage, and this is similar to providing a preinitialized color map to BFS. The documentation is quite clear to what the algorithm actually does, and the pseudo code is not broken by allowing arbitrary distance maps as inputs. I don't see a compelling reason to forbid controlling the behavior of the algorithm by providing a custom distance map. On the other hand, if there was more voices from the users that would want to eliminate such a use, we could do that, but I am not sure how this improves the library. I do agree though that the documentation of the no_init features could be improved.

...

...
As far as I understand, you want to do something specific for case 1) where we do not need to do the check in the pseudo code because we know it will always be true, right? If we did so, we need to have code that also works for case 2). Right now we just have the code for case 2) and it also works for case 1). Let me know if I misunderstood something.

If you mean Piotr's patch, yes you understood correct. If you mean the current Boost version: you misunderstood, this is just for case 1.

This point would depend on what decision we make on the discussion above. If we allow arbitrary distance maps, then we would two versions of code to provide the optimization you propose.

...

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

alex

28 Oct 28 Oct

9:22 p.m.

...

I don't believe in "implied" assumptions. A comment in the pseudo code says:

Ok, I think it is then best to just accept Piotr's patch and forget about the other (marginal) optimisation that would not work alongside that patch.

...

OK, so here we disagree. I think that this is precisely the reason for the no_init version. Why would we need it otherwise?

I have used the no_init version of the BFS to supply the algorithm with a *consistent* set of precomputed values; i.e. a consistent priority queue, color map and distance map. This allowed me to interrupt and resume shortest path calculations and hence interweave several shortest path calculation. Allowing arbitrary and inconsistent input does not feel right to me. But it is just a feeling and based on "implied" assumption. So you can safely ignore it.

Marcin Zalewski

21 Oct 21 Oct

8:46 p.m.

On Wed, Oct 21, 2015 at 9:23 AM Piotr Wygocki <vwygos@gmail.com> wrote:

...

Dear Alex,

Thank you for your feedback!

I can see that is useful, but it is not

...
Dijkstra's algorithm.

Could you give the reference to the "real" Dijkstra algorithm? We are talking about implementation detail which was not concerned at all in the original Dijkstra paper. Note, that standard dijkstra call with default distance map works exactly the same with or without my patch. IMHO, the current conception (without my patch) is caused only by using BFS abstraction. Try to modify my code to achieve previous functionality and you find it really awkward.

We should probably use the pseudo code in the documentation: http://www.boost.org/doc/libs/1_59_0/libs/graph/doc/dijkstra_shortest_paths.... The pseudo code seems to support your assertion that a vertex should not be discovered if relaxation was not successful. So it seems that the actual implementation is not in agreement with the documentation. Do you agree with that characterization of the issue? If so, your patch does take care of the issue, but we should make sure that this is the best way to go.

...

...
This does not seem a real problem. The BFS function is well suited to Dijkstra's algorithm, perhaps the only problem is that its name is not fully appropriate.

It is important to make proper abstractions. As I previously said BFS could be expressed as Dijkstra with custom priority queue implemented as FIFO. It does not work the other way around. Dijkstra is NOT a special kind of BFS. Implementation happens to be similar, which was used in this case (which is bad IMO).

I think that the perspective here was to reuse breadth_first_visit as a traversal pattern rather than as a BFS. However, as you rightly noted, that fails for the reason that when a white vertex is uncovered, it is unconditionally inserted into the queue. I've been staring at the code trying to come up with a workaround while keeping breadth_first_visit, but it quickly becomes inelegant and defeats the purpose. I think that unless we make some modifications to breadth_first_search, we may have to go with your patch. The only thing I am not sure of is this: https://github.com/boostorg/graph/pull/13/files#diff-55abe8912b7c5c01b7be025... This branch seems to be for testing purposes only to avoid using a relaxed heap, no? In any case, commenting out this if statement should probably be a separate patch with a separate explanation anyway. Does anyone have any ideas on whether there are better solutions than Piotr's patch?

...

Regards,

Piotr

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Piotr Wygocki

22 Oct 22 Oct

7:24 a.m.

...

We should probably use the pseudo code in the documentation:

http://www.boost.org/doc/libs/1_59_0/libs/graph/doc/dijkstra_shortest_paths....

The pseudo code seems to support your assertion that a vertex should not be discovered if relaxation was not successful. So it seems that the actual implementation is not in agreement with the documentation. Do you agree with that characterization of the issue?

Agreed. Alex and Tim suggested two versions of algorithm. We have already 7 versions of Dijkstra's algorithm (one is not documented), having two versions of dijkstra_no_init would introduce another two overloads. IMO this should be done only in case in which we find sensible use cases for both versions. On the other hand, this change might be breaking for some users but I expect this to happen very rarely. Regards, Piotr

Marcin Zalewski

1:21 p.m.

On Thu, Oct 22, 2015 at 3:25 AM Piotr Wygocki <vwygos@gmail.com> wrote:

...

...
We should probably use the pseudo code in the documentation:

http://www.boost.org/doc/libs/1_59_0/libs/graph/doc/dijkstra_shortest_paths....

...
The pseudo code seems to support your assertion that a vertex should not

be

...
discovered if relaxation was not successful. So it seems that the actual implementation is not in agreement with the documentation. Do you agree with that characterization of the issue?

Agreed.

Alex and Tim suggested two versions of algorithm. We have already 7 versions of Dijkstra's algorithm (one is not documented), having two versions of dijkstra_no_init would introduce another two overloads. IMO this should be done only in case in which we find sensible use cases for both versions. On the other hand, this change might be breaking for some users but I expect this to happen very rarely.

I agree that two versions are the worst case scenario that should only happen if we know of benefits of such a solution.

Tim Keitt

21 Oct 21 Oct

1:57 p.m.

http://www.keittlab.org/ On Wed, Oct 21, 2015 at 6:26 AM, alex <alexhighviz@hotmail.com> wrote:

...

...
Dear all, this mail concerns this pull request:

https://github.com/boostorg/graph/pull/13

which is currently over one year long. Maybe, the idea of the patch is unclear or needs some additional discussion.

I think it does.

The idea of your patch is to be able to specify a maximum distance for each node. As long as the graph distance is above that maximum distance the node does not get added to the queue. I can see that is useful, but it is not Dijkstra's algorithm. Could you just make your own function dijkstra_with_max_node_distance() or something like that?

That would be my suggestion as well. THK

...

...
The patch solves the problem which arises when using no_init version on dijkstra algorithm and passing a custom distance_map. This algorithm is implemented as call to BFS, which is very unfortunate since it is not BFS. Maybe BFS could be seen as special case of dijkstra with custom priority queue but not the other way around.

This does not seem a real problem. The BFS function is well suited to Dijkstra's algorithm, perhaps the only problem is that its name is not fully appropriate.

...
The current implementation introduces some inefficiencies, which are described in the pull request's comment.

I've created very short and direct patch which makes the implementation straight forward.

It does not really seem short as it is duplicating most of the BFS code.

...
Additionally, the unit test is added to check mentioned efficiency gain.

It is efficient for another problem than Dijkstra's.

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

3545

Age (days ago)

3553

Last active (days ago)

List overview

Download

20 comments

5 participants

participants (5)

alex
Marcin Zalewski
Piotr Wygocki
Piotr Wygocki
Tim Keitt