-----Original Message-----
From: boost-users-bounces@lists.boost.org [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Matthias Troyer

I would go with reading once and broadcasting, especially if, as was mentioned before, one aims at going to thousands of processes. No I/O system can scale, and implementing the broadcast is trivial: a single function call.
Matthias
_______________________________________________
The large calculation that I currently do serially and that I intend to
parallelize is the maximum of the return values of a large number of
evaluations of a given "function" in the mathematical sense.
The number of arguments of the function is only known at runtime.
Let's say it is determined at runtime that the number of arguments is 10, i.e.
we have 10 arguments x0, ..., x9.
Each argument can take a different number of values, e.g. x0 can be
x0_0, x0_1, ..., x0_n0,
x1 can be x1_0, x1_1, ..., x1_n1, and so on. n0, n1, ... are typically only
known at runtime and differ from one argument to the next.
So serially, I run
f(x0_0, x1_0, ..., x9_0)
f(x0_0, x1_0, ..., x9_1)
...
f(x0_0, x1_0, ..., x9_n9)
then the same again for all the x8 values, then all the x7 ... then all the x0.
There are n0*n1*...*n9 runs in total.
Then I take the maximum of the return values.
Suppose I have N MPI processes; ideally each process would run
n0*n1*...*n9/N function evaluations.
How do I split?
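
One possible way to split, sketched under assumptions: number the runs 0 .. n0*n1*...*n9 - 1 in the same order as the serial loop, give each process a contiguous block of those numbers, and decode each number back into one index per argument. The names block_range, unflatten and local_max are hypothetical helpers made up for this illustration, and f is assumed to accept the vector of per-argument indices and return a double. The sketch also assumes the total count fits in std::size_t.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Half-open range [first, last) of flat run numbers owned by `rank` when
// `total` evaluations are shared among `nprocs` processes. The first
// `total % nprocs` ranks get one extra run, so loads differ by at most one.
std::pair<std::size_t, std::size_t>
block_range(std::size_t total, std::size_t nprocs, std::size_t rank) {
    const std::size_t base = total / nprocs;
    const std::size_t extra = total % nprocs;
    const std::size_t first = rank * base + std::min(rank, extra);
    const std::size_t last = first + base + (rank < extra ? 1 : 0);
    return {first, last};
}

// Decode a flat run number into one index per argument (mixed-radix digits).
// counts[k] is the number of candidate values of argument k; the last
// argument varies fastest, matching the serial loop order.
std::vector<std::size_t>
unflatten(std::size_t flat, const std::vector<std::size_t>& counts) {
    std::vector<std::size_t> idx(counts.size());
    for (std::size_t k = counts.size(); k-- > 0;) {
        idx[k] = flat % counts[k];
        flat /= counts[k];
    }
    return idx;
}

// Maximum of f over the slice owned by one rank. F is any callable taking
// the vector of per-argument indices and returning a double (an assumption
// of this sketch, not the actual signature of the function in question).
template <class F>
double local_max(F f, const std::vector<std::size_t>& counts,
                 std::size_t nprocs, std::size_t rank) {
    std::size_t total = 1;
    for (std::size_t c : counts) total *= c;
    const auto range = block_range(total, nprocs, rank);
    double best = -std::numeric_limits<double>::infinity();
    for (std::size_t i = range.first; i < range.second; ++i)
        best = std::max(best, f(unflatten(i, counts)));
    return best;
}
```

With Boost.MPI, nprocs and rank would presumably come from world.size() and world.rank() of a boost::mpi::communicator, and the per-process results could then be combined with boost::mpi::all_reduce(world, local, boost::mpi::maximum<double>()). A contiguous block split only balances well if every evaluation costs roughly the same; if the cost varies a lot, a cyclic split (run i goes to rank i % N) may be a better fit.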
In the current implementation, each x is a boost::variant over 4
types:
a double, a