I have a loop whose body calls a function with an array element (selected by the loop index) as its argument and stores the single value it returns. I can parallelize this loop by using cilk_for instead of a regular for; this is simple and works well. This is explicit parallelization.

Instead of an explicit loop I can use an Array Notation construct (shown below), which is an implicit loop. My routine is relatively long and complex and contains Array Notation constructs itself, so it cannot be declared as a vector (elemental) function. When I use the implicit loop, it is not parallelized and the run time increases substantially.

    float foo(float f_in)
    {
        float f_result;
        // LONG computation containing Cilk Plus Array Notation operations
        return f_result;
    }

    int main()
    {
        float af_in[n], af_out[n];

        // Explicit parallelized loop
        cilk_for (int i = 0; i < n; i++)
            af_out[i] = foo(af_in[i]);

        // Implicit non-parallelized loop
        af_out[:] = foo(af_in[:]);
    }

My question is: does anybody know whether there is a way to tell the compiler (a pragma or something else) that my implicit loop (the Array Notation assignment) has independent steps and should be parallelized?
Have you tried #pragma simd? Essentially, it tells the compiler that the loop should be vectorized even if auto-vectorization fails.
- Barry
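A minimal sketch of how the suggestion above might look on an explicit loop. Several things here are assumptions rather than anything from the thread: `#pragma omp simd` is used as the standard OpenMP counterpart of Intel's `#pragma simd` (on an Intel compiler with Cilk Plus support you would swap in `#pragma simd`), the body of `foo` is a hypothetical stand-in for the long routine, and `apply_foo` is an illustrative helper name. Whether the pragma has any effect on an Array Notation statement is not confirmed here, so only the explicit loop form is shown.

```c
/* Hypothetical stand-in for the long routine from the question. */
static float foo(float f_in)
{
    return 2.0f * f_in + 1.0f;
}

/* Apply foo to each element; every iteration is independent of the others. */
static void apply_foo(const float *af_in, float *af_out, int n)
{
    /* Assumption: OpenMP SIMD support is enabled (e.g. -fopenmp-simd).
       A compiler without it ignores the pragma and runs the loop serially. */
    #pragma omp simd
    for (int i = 0; i < n; i++)
        af_out[i] = foo(af_in[i]);
}
```

Note that either pragma asks for vectorization within one thread, not for the multi-threaded execution that cilk_for provides, and the call to `foo` is only likely to vectorize if the compiler can inline it.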