On Mon, Apr 21, 2014 at 7:51 PM, Steven Watanabe
AMDG
On 04/21/2014 02:28 PM, John Salmon wrote:
<snip> I would love to see a boost implementation that captured and exposed this capability for general use. Thus, I think a better approach is as follows:
- define a new "RandomFunction" concept that dscribes/enforces the essential, common features of the randomizing function in Threefry, Philox, ARS and other counter-based random generators.
I don't think this is a particularly useful abstraction. Most users should not care at all about this, since they should be using the predefined engine typedefs. If more similar generators are added later, then the implementation can be adjusted to factor out the common boilerplate, but I really don't see the point of creating a whole new concept for this.
I don't think this is a particularly useful abstraction. Most users should not care at all about this, since they should be using the predefined engine typedefs. If more similar generators are added later, then the implementation can be adjusted to factor out the common boilerplate, but I really don't see the point of creating a whole new concept for this.
It may well be true that most users should use a predefined generator because the Engine concept is well matched to the style of random number generator that has been in use for 65 years. But the question is not whether a RandomFunction abstraction should or would be used by *most* users. The question is whether it's useful to *enough* users. Allow me to try to motivate the RandomFunction abstraction. The traditional Engine is inherently stateful and sequential. Users have struggled with using random number generators in parallel applications for decades (long before Boost.Random). With sequential generators, a sufficiently motivated and careful user can use the discard() function to initialize multiple generators in parallel. For most generators, discard(N) is O(N), defeating its use for this purpose, but for a few (including Threefry), discard(N) is O(1) making this a practical, but hardly elegant or natural approach to parallelism. Over the years, the idea of using a stateless function has come up from time to time. The papers and articles that discuss them usually conclude with a sentiment along the lines of: "This would be great for parallel random number generation. It's a shame that it's so slow". They have always been considered curiousities rather than serious candidates for practical use. Our contribution was to describe some RandomFunctions that are statistically sound and extremely fast. I hope our paper is not the last word on the subject and that more high-quality, high-speed RandomFunctions follow. The usage model for RandomFunction is fundamentally different from a sequential generator. It really helps to put the idea of an Engine or a generator out of your mind. Imagine instead that all you have a pure function that takes 128-bit input and returns 128-bit output, and that the output is "random" as long as you follow the simple rule of never re-using an input value. How would you create random numbers for a parallel Monte Carlo simulation? Parallelism argues against "global state", so you're disinclined to encapsulate a single global counter into an Engine-like object that calls the function. Instead, you think about how and where you actually use random values. It's likely that you need to associate independent random sequences with the time-varying logical elements of your program, stock prices or atoms or finite elements or whatever. You could associate a 128-bit counter with each logical element that needs a stream of randoms, and increment the counter every time you need a new random. But that entails the overhead of storing and synchronizing counters. Counter state could easily dominate the program's memory footprint and frequent updates could thrash the cache. A better approach is to devise a trivial, application-specific mapping of time (iteration number) plus some stable aspect your logical elements (e.g., atom id) into unique 128-bit inputs. You can now obtain the random values you need, wherever and whenever you need them simply by generating a unique 128-bit input and calling the RandomFunction. There is no space overhead. If your parallel algorithm dictates that you need the same random value, e.g., to perform an "update" in two different threads or nodes, you just call the RandomFunction with the appropriate arguments wherever you need it and you get the results you need, where you need them. In parallel applications this is much more natural than trying to divvy up a single stream into multiple substreams by 'discarding' or 'jumping' an inherently sequential state. It is cheaper and easier than associating a stateful generator with each of your logical objects and worrying about the storage and bandwidth to keep them in-sync across the threads/nodes of a parallel computation. I believe that if Lehmer, who in 1949 invented the linear congruential generator, had instead glued together a few shifts and adds and muls and invented something like Threefry or Philox, then this is how we'd all think about random number generation today. The 'Engine' concept is an artifact of the sequential nature of the linear congruential generator, and its successors. I'm not suggesting that the Engine concept should go away - 65 years of history carries a lot of inertia. Plenty of people will and should continue to use it. But I don't think Engines are the only way or necessarily the best way to think about generating random values. I think there are many users who would use and benefit from an exposed RandomFunction concept. At this point, if I haven't swayed you, then I fear that my powers of persuasion aren't up to the task. Arguing in the abstract is difficult. Specifics might help. Maybe someone else who has read this far and thinks that they would benefit from RandomFunctions can offer some perspective. Or maybe I'm just wrong. If a RandomFunction concept isn't going to make it into boost, then I'd still be pleased to see Thijs' Threefry implementation move ahead. Cheers, John Salmon
In Christ, Steven Watanabe
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost