[math] & [random] distribution object compatibility
Boost [random] has probability distribution objects for drawing random samples from those distributions. Boost [math] also has probability distributions; those provide free functions for computing properties of the distributions, like mean, pdf, etc.

I like the free function design of the math variant, e.g. it uses pdf(distribution,x), and I was wondering why this wasn't adopted in C++11 for <random>. To me a free function syntax like random(distribution,engine) would make sense. It would align with the fact that C++11 has added the begin(container) and end(container) free functions.

I think it's a bit inconvenient to have two libs that both contain probability distributions, especially since I'm hoping that boost/math will some day end up in the standard. One way we could integrate the two libs is by adding a random(distribution,engine) free function to the math lib which would accept distribution objects from both the random lib and the math lib. This, however, could encourage users to write less portable C++11 code.

My questions are:

* Why wasn't the free function syntax adopted in C++11? Perhaps for performance reasons?
* Is there an effort to integrate the distributions in math and random within Boost?
* Are there concrete plans to add elements of math to the C++ standard? We already have <random> now; I would love more.
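For concreteness, here is a minimal side-by-side sketch of the two interfaces, using the normal distribution from each library (the engine and parameter values are arbitrary choices for illustration):

    #include <boost/math/distributions/normal.hpp>   // Boost.Math: free functions on a distribution object
    #include <boost/random/mersenne_twister.hpp>
    #include <boost/random/normal_distribution.hpp>  // Boost.Random: sampling via member operator()
    #include <iostream>

    int main() {
        // Boost.Math: the object just holds parameters; properties come from free functions.
        boost::math::normal_distribution<> m(0.0, 1.0);
        std::cout << mean(m) << " " << pdf(m, 0.5) << " " << cdf(m, 0.5) << "\n";

        // Boost.Random: the distribution object itself draws variates from an engine.
        boost::random::mt19937 eng;
        boost::random::normal_distribution<> r(0.0, 1.0);
        std::cout << r(eng) << "\n";
    }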
AMDG On 04/13/2014 07:36 AM, Thijs van den Berg wrote:
Boost [random] has probability distribution objects for drawing random samples from those distributions. Boost [math] also has probability distributions; those provide free functions for computing properties of the distributions, like mean, pdf, etc.
I like the free function design of the math variant, e.g. it uses pdf(distribution,x), and I was wondering why this wasn't adopted in C++11 for <random>. To me a free function syntax like random(distribution,engine) would make sense. It would align with the fact that C++11 has added the begin(container) and end(container) free functions.
I think it’s a bit inconvenient to have two libs that both contain probability distributions.
They need to be separate types, because efficient algorithms for generating random variates often require extra members that are useless for anything else. At one time I considered trying to make param_type a typedef for the corresponding Boost.Math distribution, but the required interface is a bit of a problem.
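For reference, a small usage sketch of the param_type interface as it exists in Boost.Random today (the parameter values are arbitrary):

    #include <boost/random/mersenne_twister.hpp>
    #include <boost/random/normal_distribution.hpp>
    #include <iostream>

    int main() {
        boost::random::mt19937 eng;
        boost::random::normal_distribution<> dist;                     // N(0, 1)
        boost::random::normal_distribution<>::param_type p(2.0, 3.0);  // mean 2, sd 3

        std::cout << dist(eng) << "\n";     // draw with the distribution's own parameters
        std::cout << dist(eng, p) << "\n";  // draw with ad-hoc parameters; dist's own parameters are unchanged
    }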
In Christ, Steven Watanabe
AMDG
On 04/13/2014 07:36 AM, Thijs van den Berg wrote:
Boost [random] has probability distribution objects for drawing random samples from those distributions. Boost [math] also has probability distributions; those provide free functions for computing properties of the distributions, like mean, pdf, etc.
I like the free function design of the math variant, e.g. it uses pdf(distribution,x), and I was wondering why this wasn't adopted in C++11 for <random>. To me a free function syntax like random(distribution,engine) would make sense. It would align with the fact that C++11 has added the begin(container) and end(container) free functions.
I think it’s a bit inconvenient to have two libs that both contain probability distributions.
They need to be separate types, because efficient algorithms for generating random variates often require extra members that are useless for anything else.
At one time I considered trying to make param_type a typedef for the corresponding Boost.Math distribution, but the required interface is a bit of a problem.
That's clear. In [random] the algorithms are inside the distribution class; in [math] the algorithms are outside the class, in non-member functions (like pdf). The distributions in [math] are hardly more than a param_type.

If in [random] the algorithms had been moved to a random non-member function (or functor), then both libs could use a shared, minimalistic distribution type that would be very much like param_type. The proposed random non-member function/functor would be very similar to "random_distribution::operator()(eng, param_type) const", with param_type being something shared by both libs.
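A hedged sketch of what such a non-member random(distribution, engine) could look like; the name random, the overload set, and the inverse-transform fallback through quantile() are assumptions of this sketch, not existing Boost interfaces:

    #include <boost/math/distributions/normal.hpp>
    #include <boost/random/mersenne_twister.hpp>
    #include <boost/random/normal_distribution.hpp>
    #include <boost/random/uniform_01.hpp>

    // Boost.Random distributions already know how to sample themselves.
    // (Relies on Dist::result_type existing only for Boost.Random-style
    // distributions, so this overload drops out for Boost.Math types.)
    template<class Dist, class Engine>
    typename Dist::result_type random(Dist& dist, Engine& eng) {
        return dist(eng);
    }

    // A Boost.Math distribution carries no sampling algorithm, so fall back to
    // inverse-transform sampling through its quantile() free function.
    // (Written for the normal distribution only; a real version would need to be generic.)
    template<class RealType, class Policy, class Engine>
    RealType random(const boost::math::normal_distribution<RealType, Policy>& dist, Engine& eng) {
        boost::random::uniform_01<RealType> u01;
        return quantile(dist, u01(eng));  // note: u01 can return exactly 0, where quantile() is -infinity
    }

    int main() {
        boost::random::mt19937 eng;
        boost::random::normal_distribution<> r;  // Boost.Random distribution
        boost::math::normal_distribution<> m;    // Boost.Math distribution
        double a = random(r, eng);               // member operator() path
        double b = random(m, eng);               // quantile() path
        (void)a; (void)b;
    }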
On 13 April 2014 19:46, Thijs van den Berg wrote:
AMDG
On 04/13/2014 07:36 AM, Thijs van den Berg wrote:
Boost [random] has probability distribution objects for drawing random samples from those distributions. Boost [math] also has probability distributions; those provide free functions for computing properties of the distributions, like mean, pdf, etc.
I like the free function design of the math variant, e.g. it uses pdf(distribution,x), and I was wondering why this wasn't adopted in C++11 for <random>. To me a free function syntax like random(distribution,engine) would make sense. It would align with the fact that C++11 has added the begin(container) and end(container) free functions.
I think it's a bit inconvenient to have two libs that both contain probability distributions.
Indeed - but the requirements are quite different.

Boost.Math aims to be accurate (and, with the extension to use Boost.Multiprecision and <cstdfloat>, very, very accurate). Boost.Random must be very fast, but need not be accurate - indeed it may be rather inaccurate?

So I doubt if changing either is a good idea. (And anyway they are by different authors, developed at different times - so NIH probably applies.)

Paul

---
Paul A. Bristow
Prizet Farmhouse
Kendal UK LA8 8AB
+44 01539 561830
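As an illustration of the accuracy point above, the same Boost.Math free-function interface can be instantiated on a Boost.Multiprecision type; this is only a sketch, and the 50-digit type and the tail probability chosen here are arbitrary:

    #include <boost/math/distributions/normal.hpp>
    #include <boost/multiprecision/cpp_dec_float.hpp>
    #include <iomanip>
    #include <iostream>

    int main() {
        using boost::multiprecision::cpp_dec_float_50;   // 50 decimal digits
        boost::math::normal_distribution<cpp_dec_float_50> n(0, 1);
        // Evaluate a far-tail probability to 50 decimal digits.
        std::cout << std::setprecision(50)
                  << cdf(n, cpp_dec_float_50(-10)) << "\n";
    }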
AMDG On 04/14/2014 11:46 AM, Paul A. Bristow wrote:
Indeed - but the requirements are quite different.
Boost.Math aims to be accurate (and, with the extension to use Boost.Multiprecision and <cstdfloat>, very, very accurate).
Boost.Random must be very fast, but need not be accurate - indeed it may be rather inaccurate?
I've considered ways to make Boost.Random more accurate for years. I think it's generally possible to achieve near perfect accuracy without a huge performance cost by iteratively tightening the squeeze steps. Since the expected number of iterations is only slightly greater than one, this won't materially affect performance. The primary increase in cost would come from generating the initial guess, as there's no way to get k bits of accuracy without getting at least k bits from the underlying PRNG.

The biggest problem is that it's nearly impossible to test, as it requires a ridiculous number of samples to detect the inaccuracy. If you have k bits of accuracy, you would need at least \Omega(2^k) and possibly as many as \Omega(2^{2k}) samples before the bias becomes apparent.
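A back-of-the-envelope reading of those two bounds, assuming the inaccuracy shows up as a probability error of size about 2^{-k}:

    \epsilon \approx 2^{-k}, \qquad
    n_{\text{see one event of size } \epsilon} = \Omega(1/\epsilon) = \Omega(2^{k}), \qquad
    \frac{1}{\sqrt{n}} \lesssim \epsilon \;\Rightarrow\;
    n_{\text{resolve against sampling noise}} = \Omega(\epsilon^{-2}) = \Omega(2^{2k}).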
In Christ, Steven Watanabe
On 16 Apr 2014, at 21:33, Steven Watanabe wrote:
AMDG
On 04/14/2014 11:46 AM, Paul A. Bristow wrote:
Indeed - but the requirements are quite different.
Boost.Math aims to be accurate (and, with the extension to use Boost.Multiprecision and <cstdfloat>, very, very accurate).
Boost.Random must be very fast, but need not be accurate - indeed it may be rather inaccurate?
I've considered ways to make Boost.Random more accurate for years. I think it's generally possible to achieve near perfect accuracy without a huge performance cost by iteratively tightening the squeeze steps. Since the expected number of iterations is only slightly greater than one, this won't materially affect performance. The primary increase in cost would come from generating the initial guess, as there's no way to get k bits of accuracy without getting at least k bits from the underlying PRNG.
The biggest problem is that it's nearly impossible to test as it requires a ridiculous number of samples to detect the inaccuracy. If you have k bits of accuracy, you would need at least \Omega(2^k) and possibly as many as \Omega(2^{2k}) samples before the bias becomes apparent.
Exactly. The bottleneck in random simulation is sample noise. If you throw a coin 3 times there is no good way to test for its bias; you won't be able to distinguish between a good 50-50 coin and a bad 40-60 coin. The outcome will mainly be driven by chance, and the bias will be second order.

Even the simple uniform [0,1) distribution has lots of weird finite-precision resolution issues. Samples close to 1 have a minimum distance between them (around 2^-53) that's much larger than for samples close to 0 (down to std::numeric_limits<double>::min(), which is roughly 10^-308). So if you compute some Monte Carlo measure that depends on the distribution of distances between samples, then you'll have lots of trouble coming purely from the float representation.

A friend and I have been thinking about quality measures that give you a grip on biases caused by:

* finite resolution in the random engine integers
* finite resolution in the distribution function and the float representation of the variate
* uncertainty due to the finite number of samples

Another important issue in finance is that we use low-discrepancy sequences to speed up the convergence of Monte Carlo simulation, but those techniques don't work well with rejection sampling methods. Rejection sampling also gives synchronisation issues on GPUs, e.g. when a single rejection in one core causes all the other cores to have to wait.
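To make the resolution point concrete, a quick check of the spacing of doubles just below 1 versus the smallest positive normal double (plain IEEE 754 double facts, independent of any Boost component):

    #include <cmath>
    #include <iostream>
    #include <limits>

    int main() {
        // Largest double strictly below 1.0; the gap is 2^-53, about 1.1e-16.
        double below_one = std::nextafter(1.0, 0.0);
        std::cout << "gap just below 1.0:       " << 1.0 - below_one << "\n";

        // Smallest positive normal double, about 2.2e-308: spacing near 0 is vastly finer.
        std::cout << "smallest positive normal: " << std::numeric_limits<double>::min() << "\n";
    }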
participants (3)

- Paul A. Bristow
- Steven Watanabe
- Thijs van den Berg