Random permutations using boost::random_number_generator and std::random_shuffle.
Hi,
We need the ability to randomly permute some range, and we already
have this functionality available in the computer algebra system
Maxima. We use Maxima for numerous things, often implementing basic
algorithms in it, then later writing more efficient versions in C++.
The problem we have is that we'd like to be able to make sure that any
function we write to randomly shuffle a range, matches up with the
Maxima "random_permutation" function.
We believe it uses the Mersenne twister (boost::mt19937) and Knuth
shuffle algorithms, and so we thought to implement these in C++ using
boost::uniform_distribution, boost:variate_generator etc.
However, we have two questions (open problems):
1) How does boost::variate_generator interact with mt19937, and the
distribution? Is this documented somewhere?
In the attached code, it seems we can construct
boost::uniform_distribution with any maximum value, and still the
number generator returns values outside of that range? Is this really
right?
2) What algorithm does std::random_shuffle implement? (Not boost
related specifically but hopefully someone will know)
Is this defined by the C++ standard? It doesn't seem to be the simple
Knuth shuffle algorithm (starting from top of range to bottom or vice
versa) as we implement in the attached code. If anyone could point me
in the right direction, I'd very much appreciate it.
Thanks
Matthew Gwynne
http://cs.swan.ac.uk/~csmg/
Code:
#include <vector>
#include <iostream>
#include <algorithm>
#include <iterator>
#include <cassert>
#include
AMDG On 1/19/2011 10:45 AM, Matthew Gwynne wrote:
We need the ability to randomly permute some range, and we already have this functionality available in the computer algebra system Maxima. We use Maxima for numerous things, often implementing basic algorithms in it, then later writing more efficient versions in C++.
The problem we have is that we'd like to be able to make sure that any function we write to randomly shuffle a range, matches up with the Maxima "random_permutation" function.
We believe it uses the Mersenne twister (boost::mt19937) and Knuth shuffle algorithms, and so we thought to implement these in C++ using boost::uniform_distribution, boost:variate_generator etc.
If you want to duplicate the behavior exactly, you may have to write it yourself, except for mt19937 which has precisely specified behavior.
However, we have two questions (open problems):
1) How does boost::variate_generator interact with mt19937, and the distribution? Is this documented somewhere?
In the attached code, it seems we can construct boost::uniform_distribution with any maximum value, and still the number generator returns values outside of that range? Is this really right?
Yes. You don't need to use uniform_int with random_number_generator. You can plug mt19937 directly into random_number_generator.
2) What algorithm does std::random_shuffle implement? (Not boost related specifically but hopefully someone will know)
I suggest that you read the code. It's not very complex.
Is this defined by the C++ standard?
No it isn't.
It doesn't seem to be the simple Knuth shuffle algorithm (starting from top of range to bottom or vice versa) as we implement in the attached code. If anyone could point me in the right direction, I'd very much appreciate it.
In Christ, Steven Watanabe
Hi,
Thanks for the response!
On Wed, Jan 19, 2011 at 8:30 PM, Steven Watanabe
AMDG
On 1/19/2011 10:45 AM, Matthew Gwynne wrote:
We need the ability to randomly permute some range, and we already have this functionality available in the computer algebra system Maxima. We use Maxima for numerous things, often implementing basic algorithms in it, then later writing more efficient versions in C++.
The problem we have is that we'd like to be able to make sure that any function we write to randomly shuffle a range, matches up with the Maxima "random_permutation" function.
We believe it uses the Mersenne twister (boost::mt19937) and Knuth shuffle algorithms, and so we thought to implement these in C++ using boost::uniform_distribution, boost:variate_generator etc.
If you want to duplicate the behavior exactly, you may have to write it yourself, except for mt19937 which has precisely specified behavior.
The problem we now have is that we don't know how the result of mt19937 is being cast down into the number range we are asking for. As far as I am aware, MT19937 only defines how to get a 32 bit integer, not how to then scale that to say the range 1..10. Is it specified clearly how boost::random_number_generator or boost::uniform_distribution does this?
However, we have two questions (open problems):
1) How does boost::variate_generator interact with mt19937, and the distribution? Is this documented somewhere?
In the attached code, it seems we can construct boost::uniform_distribution with any maximum value, and still the number generator returns values outside of that range? Is this really right?
Yes. You don't need to use uniform_int with random_number_generator. You can plug mt19937 directly into random_number_generator.
Thanks! Is the behaviour exactly the same? It seems so, but the documentation seems to suggest using boost::uniform_distribution.
2) What algorithm does std::random_shuffle implement? (Not boost related specifically but hopefully someone will know)
I suggest that you read the code. It's not very complex.
Ah yes! I see what you mean :) Thanks! That was very helpful, I should've looked there first, but didn't have source immediately to hand.
Is this defined by the C++ standard?
No it isn't.
:( Thanks again! Matthew Gwynne http://cs.swan.ac.uk/~csmg/
AMDG On 1/21/2011 1:57 PM, Matthew Gwynne wrote:
The problem we now have is that we don't know how the result of mt19937 is being cast down into the number range we are asking for. As far as I am aware, MT19937 only defines how to get a 32 bit integer, not how to then scale that to say the range 1..10. Is it specified clearly how boost::random_number_generator or boost::uniform_distribution does this?
boost::random_number_generator is implemented using boost::uniform_int and no the algorithm isn't specified. It's considered an implementation detail. If you really want to know the details, the source is in boost/random/uniform_int.hpp. I don't expect the implementation to change again, but no guarantees.
However, we have two questions (open problems):
1) How does boost::variate_generator interact with mt19937, and the distribution? Is this documented somewhere?
In the attached code, it seems we can construct boost::uniform_distribution with any maximum value, and still the number generator returns values outside of that range? Is this really right?
Yes. You don't need to use uniform_int with random_number_generator. You can plug mt19937 directly into random_number_generator.
Thanks! Is the behaviour exactly the same?
It isn't going to be exactly the same. If you use mt19937 directly, then random_number_generator will transform the output of the generator to the desired range. If you use uniform_int, the generator's output will first be transformed to the range that you specified for uniform_int, then again by random_number_generator. The result will still have the correct distribution, but the exact values may be different.
It seems so, but the documentation seems to suggest using boost::uniform_distribution.
Where in the documentation? In Christ, Steven Watanabe
Hi,
On Fri, Jan 21, 2011 at 10:16 PM, Steven Watanabe
AMDG
On 1/21/2011 1:57 PM, Matthew Gwynne wrote:
The problem we now have is that we don't know how the result of mt19937 is being cast down into the number range we are asking for. As far as I am aware, MT19937 only defines how to get a 32 bit integer, not how to then scale that to say the range 1..10. Is it specified clearly how boost::random_number_generator or boost::uniform_distribution does this?
boost::random_number_generator is implemented using boost::uniform_int and no the algorithm isn't specified. It's considered an implementation detail. If you really want to know the details, the source is in boost/random/uniform_int.hpp. I don't expect the implementation to change again, but no guarantees.
Hmm... this seems odd. Surely the point behind having seeds and things like mt19937 precisely specified is that they should be completely reproducible and predictable. If there's no specification, then how can one use such things reliably. I guess we'll either have to use it and add lots of tests to ensure if it changes in future, that we can implement the old version of the code. Either that or we simply implement it ourselves. Sorry if it seems like I'm averse to looking into the code. It's not that, it's simply as you suggest, the code can change and I'm trying to find out the interface that is being provided which then shouldn't (or is less likely to change). Thanks for the clarification!
However, we have two questions (open problems):
1) How does boost::variate_generator interact with mt19937, and the distribution? Is this documented somewhere?
In the attached code, it seems we can construct boost::uniform_distribution with any maximum value, and still the number generator returns values outside of that range? Is this really right?
Yes. You don't need to use uniform_int with random_number_generator. You can plug mt19937 directly into random_number_generator.
Thanks! Is the behaviour exactly the same?
It isn't going to be exactly the same. If you use mt19937 directly, then random_number_generator will transform the output of the generator to the desired range. If you use uniform_int, the generator's output will first be transformed to the range that you specified for uniform_int, then again by random_number_generator. The result will still have the correct distribution, but the exact values may be different.
Ah, thanks! We will have to pick one and stick with it as, for us, the two aren't equivalent as they don't exhibit the same behaviour and so if we swap from one to the other at some point, we lose reproducibility of our experiments. Thanks for the explanation!
It seems so, but the documentation seems to suggest using boost::uniform_distribution.
Where in the documentation?
Apologies, it seems I remembered incorrectly, it was using boost::uniform_distribution with boost::variate_generator : http://www.boost.org/doc/libs/1_45_0/doc/html/boost_random.html
In Christ, Steven Watanabe _______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
Thanks again!! Matthew Gwynne http://cs.swan.ac.uk/~csmg/
participants (2)
-
Matthew Gwynne
-
Steven Watanabe