Software Archive
Read-only legacy content
17061 Discussions

Parallel Search With Cilk Plus

Ömer_Faruk_Kalkan
501 Views

Hi everyone , I have a program that generating random number if it does not exist in ( allocated custom size) array  then add array. But if custom size is very big   ( 1 million )  after a period search is very slowing down. I did learn  cilk_for and reducers.I want to paralleize but I could not decide what reducer is suitable for array. Is there someone who can help me ? 

(Sorry for my english if you do not understand my problem you can write my e-mail   " 03011241@st.meliksah.edu.tr " )  

 

0 Kudos
1 Solution
ARCH_R_Intel
Employee
501 Views

Variables j and currentPoint are what compiler people call "induction variables", because their value in each iteration depends on their value in the previous iteration.  Parallelization requires removing induction variables, typically by calculating their values directly as a function of the loop index variable.  In your example, j is 14-i and currentPoint is the original currentPoint+i.  If you need their values after the loop, add an explicit calculation after the loop that computes their values.  Here is what the code looks like after induction variables are replaced with direct calculations:

long number = 0;
int j = 14; // j will be the power of the 3 in loop

for (int i = 0; i < 15; i++, j--)
{
    number += (*(currentPoint+i)) * (pow (3 , j-i)); // convert to number-based 10
}
j -= 15;
currentPoint += 15;

Then the for can be replaced by cilk_for and number declared as a cilk::reducer_opadd<long>.

View solution in original post

0 Kudos
6 Replies
ARCH_R_Intel
Employee
501 Views

Before trying to parallelize the code, I'd like to understand if all practical serial improvements have been made.  For example, I'm wondering if open-addressing hashing might be used.  That could decrease the search time from O(N) to O(1) in the average case.

0 Kudos
Ömer_Faruk_Kalkan
501 Views

I did not use any hash function. linear search is being done with....

I have a question also ..This code snippet is entry of  search program.

 

    srand (time (NULL)); 

    char * first;
    first = ( char * ) malloc (14348907);
    char * currentPoint = first;

    for (int i = 0; i < 14348907; i++)
    {
        *currentPoint =rand () % 3;     //  it is simple assign  but assignment process will be complex
        currentPoint++;
    }

  I want to  use reducer for currentPoint and parallize this loop ...

0 Kudos
Jim_S_Intel
Employee
501 Views

Making parallel calls to rand() is generally not going to get good performance because there will be contention (and probably a race) on the shared state within rand().

One solution is to use the deterministic parallel random-number generator that is implemented in Cilkpub.  You can download that code from the Cilk Plus community website:

http://www.cilkplus.org/sites/default/files/contributions/cilkpub_v104.tar.gz

With that, the code looks something like:

#include <cilkpub/dotmix.h>


...

cilkpub::DotMix rng(seed);   // Seed RNG
cilk_for(int i = 0; i < n; ++i) {
    currentPoint = rng.get() % 3;
}


That being said, I'd first consider Arch's suggestion of investigating whether there are serial optimizations you can make to your program first.  I don't know the details of the problem you are trying to solve, but it sounds like hashing would be the better approach than linear search...

Cheers,

Jim

0 Kudos
Ömer_Faruk_Kalkan
501 Views

Thank you for suggestions...I decided to make hash function.And I will implement Jim`s sugesstion.( deterministic parallel random-number generator  )

I briefly tell you what do my program . Generator is creating guess 0 , 1 or 2 as random 15 steps. ( I will produce with an algorithm in next ) . Later , merging to be juxtaposed which produced it like this "011012212012211" . After , it is adding in array if the result is true.

Now in this step I am converting  to decimal which is a 15-digit number at the base 3 like this 011012212012211 = 2.241.103 . When search the element (2.241.103 ) if that element is empty, the result will be true. So this converting provides O(n) to O(1). But too many iterations are still experiencing slowdowns.Here we can do to parallel conversion stage.

    long number = 0;
    int j = 14; // j will be the power of the 3 in loop

    for (int i = 0; i < 15; i++, j--)
    {
        number += (*currentPoint) * (pow (3 , j)); // convert to number-based 10
        currentPoint++;
    }

This section parallels how should it be ?

Thanks...

0 Kudos
ARCH_R_Intel
Employee
502 Views

Variables j and currentPoint are what compiler people call "induction variables", because their value in each iteration depends on their value in the previous iteration.  Parallelization requires removing induction variables, typically by calculating their values directly as a function of the loop index variable.  In your example, j is 14-i and currentPoint is the original currentPoint+i.  If you need their values after the loop, add an explicit calculation after the loop that computes their values.  Here is what the code looks like after induction variables are replaced with direct calculations:

long number = 0;
int j = 14; // j will be the power of the 3 in loop

for (int i = 0; i < 15; i++, j--)
{
    number += (*(currentPoint+i)) * (pow (3 , j-i)); // convert to number-based 10
}
j -= 15;
currentPoint += 15;

Then the for can be replaced by cilk_for and number declared as a cilk::reducer_opadd<long>.

0 Kudos
Ömer_Faruk_Kalkan
501 Views

I try your code but results are not true, I`ve changed like this and began to work properly...

  cilk::reducer_opadd<long>number (0);       
.
.
.

 cilk_for (int i = 14; i >= 0; i--)
        {
            number += (*(currentPoint+(14-i))) * (pow (3 , i));// this code convert to number-based 10
        }

        currentPoint += 15;
        number.set_value (0); // needed for the next iteration
.
.
.

I can now generate and search quickly . thanks :)

0 Kudos
Reply