I have no experience with multithreaded programming other than with TBB, and there is a good chance that I am asking something very naive/not very smart.
I have used parallel_reduce to obtain various statistics of vector numerical expressions (like mean and variance).
All seems to work well when I have a single random variable (RV).
I tried to extend this to many RVs. However, I notice that two consecutive runs of the calculator give very different results (too different to be explained by the order of calculation).
The idea is to do a single sweep over all RVs, with each thread processing a range of the indices.
The attached code snippet is a slimmed down version of my actual app.
There is a buffer of all the random numbers (C format, mcSize x nbrRVs) and "Vector"s (in reality, views into this buffer).
Each thread has to have a local buffer with one entry per RV. In join, the "local" class adds the buffer of the call argument to that of the caller.
In the "global" class, everything is added to an external buffer.
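To make the "local" variant concrete, here is a minimal sketch of the pattern described above. It is an illustration under stated assumptions, not the original snippet: the names `ColumnSums`, `accumulate`, `reduce` and `columnMeans` are hypothetical, `std::vector` is used for brevity, and a small serial driver stands in for `tbb::parallel_reduce` so that the block compiles with the standard library alone. With TBB, `accumulate` would be `operator()(const tbb::blocked_range<size_t>&)` and the `int` tag would be `tbb::split`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Body in the shape parallel_reduce expects: splitting constructor,
// range operator, and join. All names here are hypothetical.
struct ColumnSums {
    const double* data;        // row-major buffer, mcSize x nbrRVs (read-only)
    std::size_t nbrRVs;
    std::vector<double> sums;  // one accumulator per RV, private to this body

    ColumnSums(const double* d, std::size_t n)
        : data(d), nbrRVs(n), sums(n, 0.0) {}

    // Splitting constructor: the new body starts from zero. Copying the
    // partial sums of `other` here would count them twice after join.
    ColumnSums(const ColumnSums& other, int /*stand-in for tbb::split*/)
        : data(other.data), nbrRVs(other.nbrRVs), sums(other.nbrRVs, 0.0) {}

    // Accumulate rows [begin, end). This uses += and never resets `sums`,
    // because the same body may be handed several sub-ranges in turn.
    void accumulate(std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i < end; ++i)
            for (std::size_t j = 0; j < nbrRVs; ++j)
                sums[j] += data[i * nbrRVs + j];
    }

    // Fold the partial sums of `rhs` into this body.
    void join(const ColumnSums& rhs) {
        assert(rhs.sums.size() == sums.size());
        for (std::size_t j = 0; j < sums.size(); ++j)
            sums[j] += rhs.sums[j];
    }
};

// Serial stand-in for parallel_reduce: split recursively down to `grain`
// rows, then join the right half into the left half.
void reduce(ColumnSums& body, std::size_t begin, std::size_t end,
            std::size_t grain) {
    if (end - begin <= grain) {
        body.accumulate(begin, end);
        return;
    }
    const std::size_t mid = begin + (end - begin) / 2;
    ColumnSums right(body, 0);
    reduce(body, begin, mid, grain);
    reduce(right, mid, end, grain);
    body.join(right);
}

// Mean of every RV (column) in one sweep over the rows.
std::vector<double> columnMeans(const std::vector<double>& buf,
                                std::size_t mcSize, std::size_t nbrRVs,
                                std::size_t grain = 1) {
    assert(mcSize > 0 && grain > 0 && buf.size() == mcSize * nbrRVs);
    ColumnSums body(buf.data(), nbrRVs);
    reduce(body, 0, mcSize, grain);
    for (double& s : body.sums) s /= static_cast<double>(mcSize);
    return body.sums;
}
```

Calling `columnMeans(buf, mcSize, nbrRVs)` on the row-major buffer returns one mean per RV. The point of the design is that each body writes only to its own `sums`, and buffers are combined only in `join`.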
Can someone help?
(The effort with the "global" buffer would be nice to understand: I did not expect this to be an issue, since there is no read/write mixing. Also, every STL structure has been stripped out in an effort to exclude problems that could come from STL thread safety.)
Thank you in advance,
PS: Unfortunately, the Add Files button does not seem to work for me.