
Where should I put ANNOTATE_ITERATION_TASK?

I'm using Intel Advisor to analyze my parallel application. This is the main loop of my program, where most of the time is spent:

       ANNOTATE_SITE_BEGIN(solve);
       for(size_t i=0; i<wrapperIndexes.size(); i++){
           const int r = wrapperIndexes[i].r;
           const int c = wrapperIndexes[i].c;
           const float val = localWrappers[wrapperIndexes[i].i].cur.at<float>(wrapperIndexes[i].r,wrapperIndexes[i].c);
           if ( (val > positiveThreshold && (isMax(val, localWrappers[wrapperIndexes[i].i].cur, r, c) && isMax(val, localWrappers[wrapperIndexes[i].i].low, r, c) && isMax(val, localWrappers[wrapperIndexes[i].i].high, r, c))) ||
                (val < negativeThreshold && (isMin(val, localWrappers[wrapperIndexes[i].i].cur, r, c) && isMin(val, localWrappers[wrapperIndexes[i].i].low, r, c) && isMin(val, localWrappers[wrapperIndexes[i].i].high, r, c))) )
              // either positive -> local max. or negative -> local min.
                ANNOTATE_ITERATION_TASK(localizeKeypoint);
                localizeKeypoint(r, c, localCurSigma[wrapperIndexes[i].i], localPixelDistances[wrapperIndexes[i].i], localWrappers[wrapperIndexes[i].i]);
       }
       ANNOTATE_SITE_END();

As you can see, most of the loop's time is spent in `localizeKeypoint` (if you don't consider the `if` clause). I want to run a Suitability Report to estimate the gain from parallelizing the loop above, so I've written this:

       ANNOTATE_SITE_BEGIN(solve);
       for(size_t i=0; i<wrapperIndexes.size(); i++){
           const int r = wrapperIndexes[i].r;
           const int c = wrapperIndexes[i].c;
           const float val = localWrappers[wrapperIndexes[i].i].cur.at<float>(wrapperIndexes[i].r,wrapperIndexes[i].c);
           if ( (val > positiveThreshold && (isMax(val, localWrappers[wrapperIndexes[i].i].cur, r, c) && isMax(val, localWrappers[wrapperIndexes[i].i].low, r, c) && isMax(val, localWrappers[wrapperIndexes[i].i].high, r, c))) ||
                (val < negativeThreshold && (isMin(val, localWrappers[wrapperIndexes[i].i].cur, r, c) && isMin(val, localWrappers[wrapperIndexes[i].i].low, r, c) && isMin(val, localWrappers[wrapperIndexes[i].i].high, r, c))) )
              // either positive -> local max. or negative -> local min.
                ANNOTATE_ITERATION_TASK(localizeKeypoint);
                localizeKeypoint(r, c, localCurSigma[wrapperIndexes[i].i], localPixelDistances[wrapperIndexes[i].i], localWrappers[wrapperIndexes[i].i]);
       }
       ANNOTATE_SITE_END();

The Suitability Report gives an excellent 6.69x gain.

However, when I run the Dependencies check, I get a problem message. In particular, see "Missing start task".
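One possible reading of "Missing start task" (an assumption, not something confirmed in this thread) is that the `if` above has no braces, so it guards only the `ANNOTATE_ITERATION_TASK` line while `localizeKeypoint` runs on every iteration. The sketch below shows just that C++ behavior. The three macros are hypothetical stand-ins that record the order in which they are reached, so the example builds without `advisor-annotate.h`; `isExtremum` and the one-argument `localizeKeypoint` are likewise made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the Advisor macros (NOT the real definitions):
// they only log which marker was reached, and in what order.
static std::vector<std::string> g_events;
#define ANNOTATE_SITE_BEGIN(name)     g_events.push_back("site_begin")
#define ANNOTATE_SITE_END()           g_events.push_back("site_end")
#define ANNOTATE_ITERATION_TASK(name) g_events.push_back("task")

// Placeholder for the expensive call in the post.
static void localizeKeypoint(int) { g_events.push_back("work"); }

// Same shape as the loop in the post: an if without braces.
std::vector<std::string> runBracelessLoop(const std::vector<bool>& isExtremum) {
    g_events.clear();
    ANNOTATE_SITE_BEGIN(solve);
    for (std::size_t i = 0; i < isExtremum.size(); i++) {
        if (isExtremum[i])
            ANNOTATE_ITERATION_TASK(localizeKeypoint); // the only guarded statement
            localizeKeypoint(static_cast<int>(i));     // not guarded: runs every iteration
    }
    ANNOTATE_SITE_END();
    return g_events;
}
```

Calling `runBracelessLoop({false, true})` records work for the first iteration before any task marker has been reached inside the site, which may be the kind of situation the tool is complaining about.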

In addition, if I place `ANNOTATE_ITERATION_TASK` at the beginning of the loop, like this:

       ANNOTATE_SITE_BEGIN(solve);
       for(size_t i=0; i<wrapperIndexes.size(); i++){
           ANNOTATE_ITERATION_TASK(localizeKeypoint);
           const int r = wrapperIndexes[i].r;
           const int c = wrapperIndexes[i].c;
           const float val = localWrappers[wrapperIndexes[i].i].cur.at<float>(wrapperIndexes[i].r,wrapperIndexes[i].c);
           if ( (val > positiveThreshold && (isMax(val, localWrappers[wrapperIndexes[i].i].cur, r, c) && isMax(val, localWrappers[wrapperIndexes[i].i].low, r, c) && isMax(val, localWrappers[wrapperIndexes[i].i].high, r, c))) ||
                (val < negativeThreshold && (isMin(val, localWrappers[wrapperIndexes[i].i].cur, r, c) && isMin(val, localWrappers[wrapperIndexes[i].i].low, r, c) && isMin(val, localWrappers[wrapperIndexes[i].i].high, r, c))) )
              // either positive -> local max. or negative -> local min.
                localizeKeypoint(r, c, localCurSigma[wrapperIndexes[i].i], localPixelDistances[wrapperIndexes[i].i], localWrappers[wrapperIndexes[i].i]);
       }
       ANNOTATE_SITE_END();


    
The gain is horrible.
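A rough way to see what changes with this placement: with the annotation at the top of the loop body, every iteration is marked as a task, including the ones that fail the threshold test and never call `localizeKeypoint`. Whether that accounts for the lower estimate is an assumption on my part. The sketch below only counts marked iterations against iterations that do the heavy work; the macro is a hypothetical counter stub, not the real Advisor definition, and `TaskStats`, `annotateAtLoopTop`, and `isExtremum` are made-up names.

```cpp
#include <cstddef>
#include <vector>

struct TaskStats {
    std::size_t tasks = 0;      // iterations marked as a task
    std::size_t heavyTasks = 0; // marked iterations that reach the expensive call
};

// Hypothetical stub (NOT the real macro): counts how often the marker is hit.
static TaskStats g_stats;
#define ANNOTATE_ITERATION_TASK(name) (++g_stats.tasks)

// Annotation at the top of the loop body, as in the last variant of the post.
TaskStats annotateAtLoopTop(const std::vector<bool>& isExtremum) {
    g_stats = TaskStats{};
    for (std::size_t i = 0; i < isExtremum.size(); i++) {
        ANNOTATE_ITERATION_TASK(localizeKeypoint);
        if (isExtremum[i]) {
            ++g_stats.heavyTasks; // stands in for localizeKeypoint(...)
        }
    }
    return g_stats;
}
```

For an input where only one of four points passes the test, `annotateAtLoopTop({true, false, false, false})` marks four tasks of which one is heavy, so most modeled tasks would be very short.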

Am I doing something wrong?

1 Reply

Please do not create multiple threads on the same topic. It doesn't make you more likely to get a response, it just makes things disorganized on our end.

I'm going to close this thread and post a link to here from your other thread so conversation can continue in a single place.

The other thread is here: https://software.intel.com/en-us/forums/intel-advisor-xe/topic/731505
