Intel® Quartus® Prime Software
Intel® Quartus® Prime Design Software, Design Entry, Synthesis, Simulation, Verification, Timing Analysis, System Design (Platform Designer, formerly Qsys)

Kernel Vectorization: branching is thread ID dependent ... cannot vectorize

RJin1
Beginner

I have the following code. In the kernel, the norm is read from rowdata, and rowdata depends on the thread ID.

In the myfastSquaredDistance function, the if/else statement depends on the norm, so when I compile, the compiler warns that it cannot vectorize the kernel. Does anyone know how to get around this? Thanks!

 

double myfastSquaredDistance(__constant double *rowdata, __constant double *rowcenter, double norm1, double norm2, int n, int y)
{
    double precision = 1e-6;

    double sumSquaredNorm = norm1 * norm1 + norm2 * norm2;
    double normDiff = norm1 - norm2;
    double sqDist = 0.0;

    double precisionBound1 = 2.0 * epsilon() * sumSquaredNorm / (normDiff * normDiff + epsilon());
    if (precisionBound1 < precision) {
        sqDist = sumSquaredNorm - 2.0 * mydot(rowdata, rowcenter, n, y);
    } else {
        sqDist = sqdist(rowdata, rowcenter, n, y);
    }
    return sqDist;
}

 

__attribute__((num_simd_work_items(4)))
__attribute__((reqd_work_group_size(32,1,1)))
__kernel
void kmeansTest(
    int num_vectors,
    int vector_length,
    int num_clusters,
    __constant double* restrict data,
    __constant double* restrict center,
    __global double* restrict result
    )
{
    const uint y = get_global_id(0);

    int bestindex;
    double norm1, norm2;

    double min_distance = FLT_MAX;

    __constant double* rowdata = &data[y * (vector_length + 1)];
    norm1 = rowdata[vector_length];

    for (int i = 0; i < num_clusters; ++i) {
        double dotProduct = 0;
        double precision = 1e-6;

        __constant double *rowcenter = center + i * (vector_length + 1);
        norm2 = rowcenter[vector_length];

        dotProduct = myfastSquaredDistance(data, rowcenter, norm1, norm2, vector_length, y);
        if (dotProduct < min_distance) {
            min_distance = dotProduct;
            bestindex = i;
        }
    }
    result[y * 2] = bestindex;
    result[y * 2 + 1] = sqrt(min_distance);
}

6 Replies
Nooraini_Y_Intel
Employee

Hi,

 

I am currently reviewing the forum for open questions and found this thread. I apologize that no one has answered your question yet. Since it has been a while since you posted it, I am wondering whether you have found the answer. If not, please let me know and I will try to assign someone to assist you. Thank you.

 

Regards,

Nooraini

RJin1
Beginner

Hi,

I haven't figured out the answer yet. If you can find someone to help me, that would be great. Thanks!

Nooraini_Y_Intel
Employee

Hi RJin1,

 

Sure, I'm checking here to find someone who may be able to assist you on this thread.

 

Regards,

Nooraini

MuhammadAr_U_Intel

Hi @RJin1,

 

Which version of the OpenCL compiler are you using?

 

Thanks,

Arslan

RJin1
Beginner

16.0 SDK

HRZ
Valued Contributor II

Your code does not compile as posted because several functions it calls (epsilon(), mydot() and sqdist()) are not defined; I believe it requires an external header. Please provide a snippet that compiles (preferably using the forum's code insertion mechanism so that indentation is preserved) so that we can look at the report and the compiler messages. That said, I fail to see why this cannot be vectorized: the condition itself does not depend on the thread ID, only its value does. You might be able to get around the warning by using the (condition) ? (value1) : (value2) notation instead of if/else.
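
For readers who want to try the ternary suggestion, here is a minimal host-side sketch in plain C rather than OpenCL C. It is only an illustration under assumptions: EPS, dot_product and squared_distance are hypothetical stand-ins for the undefined epsilon(), mydot() and sqdist() from the original post, and the row/offset arguments (n, y) are simplified to two plain vectors. Both candidate values are computed unconditionally and the condition only selects between them. Whether the offline compiler then accepts num_simd_work_items for the real kernel is not confirmed here and has to be checked in the compiler report.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for the undefined epsilon() in the original post. */
#define EPS 2.220446049250313e-16

/* Assumption: stand-in for mydot(), the dot product of two length-n vectors. */
static double dot_product(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* Assumption: stand-in for sqdist(), the direct squared Euclidean distance. */
static double squared_distance(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/* Same arithmetic as myfastSquaredDistance, written without if/else. */
static double fast_squared_distance(const double *a, const double *b,
                                    double norm1, double norm2, size_t n)
{
    assert(a != NULL && b != NULL);

    const double precision = 1e-6;
    double sum_squared_norm = norm1 * norm1 + norm2 * norm2;
    double norm_diff = norm1 - norm2;
    double bound = 2.0 * EPS * sum_squared_norm
                   / (norm_diff * norm_diff + EPS);

    /* Compute both candidates up front so no work sits behind a branch. */
    double fast = sum_squared_norm - 2.0 * dot_product(a, b, n);
    double exact = squared_distance(a, b, n);

    /* The condition only selects a value; it does not guard any code. */
    return (bound < precision) ? fast : exact;
}
```

The trade-off is that every call now pays for both the dot product and the direct distance, so this only helps if removing the divergent branch is what allows the kernel to be vectorized.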
