OpenCL* for CPU
Ask questions and share information on Intel® SDK for OpenCL™ Applications and OpenCL™ implementations for Intel® CPU.
Announcements
This forum covers OpenCL* for CPU only. OpenCL* for GPU questions can be asked in the GPU Compute Software forum. Intel® FPGA SDK for OpenCL™ questions can be asked in the FPGA Intel® High Level Design forum.

mul_hi bug report

George_W_
Beginner
598 Views
On Windows 8.1 64-bit with Intel HD 4600 graphics (both the latest release and beta drivers), the following snippet from an OpenCL kernel produces an incorrect result:

    if (k_delta == 38443432 && jj==4620)
       printf((__constant char *)"cl_barrett32_87_gs:  jj=%x kdelta=%x  mulhi=%x\n",
         (uint)jj, (uint)k_delta, (uint)mul_hi((uint)jj,(uint)k_delta));

the output is:

      cl_barrett32_87_gs:  jj=120c kdelta=24a99a8  mulhi=0

As I understand it, mul_hi returns the upper 32 bits of the full 64-bit product, so it should not return zero here.
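
For reference, the expected value can be checked on the host. The sketch below is plain C, not the kernel code: `ref_mul_hi` is a hypothetical helper name that mimics what OpenCL's `mul_hi` is specified to return for `uint` operands, namely the upper 32 bits of the full 64-bit product.

```c
#include <stdint.h>

/* Hypothetical host-side reference for OpenCL mul_hi on uint operands:
   widen to 64 bits, multiply, keep the upper 32 bits. */
static uint32_t ref_mul_hi(uint32_t a, uint32_t b)
{
    uint64_t product = (uint64_t)a * (uint64_t)b;
    return (uint32_t)(product >> 32);
}
```

With the values from the printf above, `ref_mul_hi(0x120c, 0x24a99a8)` evaluates to 0x29: the full product is 177,608,655,840, which is a little more than 41 times 2^32.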

I also have a (likely) related multiplication bug:

    facdist = (ulong) (2 * NUM_CLASSES) * (ulong) exponent;

This fails: the upper 32 bits of the result are zero. NUM_CLASSES is a #define for 4620, and exponent is a value of around 50 million.
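
To show what the expression should yield, here is a host-side C sketch of the same arithmetic. The exact exponent is not given above, so 50000000 is used purely as an assumed stand-in, and `facdist_for` and `upper32` are hypothetical helper names.

```c
#include <stdint.h>

#define NUM_CLASSES 4620  /* value stated in the post */

/* Mirrors the kernel expression: both operands are widened to 64 bits
   before the multiply, so the product must not be truncated. */
static uint64_t facdist_for(uint32_t exponent)
{
    return (uint64_t)(2 * NUM_CLASSES) * (uint64_t)exponent;
}

/* Upper 32 bits of a 64-bit value. */
static uint32_t upper32(uint64_t v)
{
    return (uint32_t)(v >> 32);
}
```

For the assumed exponent of 50000000 the product is 462,000,000,000, whose upper 32 bits are 107 (0x6b), so a zero upper half cannot be correct for an exponent of that size.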

4 Replies
George_W_
Beginner
OK, now it gets weird. If I add one line that should do nothing, the code snippet works (mul_hi returns 0x29). That line is:

    jj = jj % k_delta;

Update: that line does actually change something. In the original code snippet, a smart optimizing compiler can determine that jj is the constant 4620. Adding the line above forces the compiler to place the jj value in a register or memory.
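
One way to probe this kind of constant-folding difference is to compute the same value twice, once with a literal operand the compiler can fold and once with an operand it must load at run time, and compare the two. The following is a host-side C analogy under that assumption, not the kernel code; `volatile` plays roughly the role the extra `%` line plays above, and the function names are hypothetical.

```c
#include <stdint.h>

static uint32_t mul_hi_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

/* Returns 1 when the foldable and the run-time paths agree, 0 otherwise. */
static int folded_matches_runtime(uint32_t k_delta)
{
    /* The volatile read stops the compiler from treating jj as a constant. */
    volatile uint32_t jj_runtime = 4620u;

    uint32_t folded  = mul_hi_u32(4620u, k_delta);       /* literal operand */
    uint32_t runtime = mul_hi_u32(jj_runtime, k_delta);  /* loaded operand  */
    return folded == runtime;
}
```

With a correct compiler, `folded_matches_runtime(38443432)` returns 1; a mismatch would point at the path that handles the constant operand.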

Raghupathi_M_Intel

Hi George,

Is it possible to add the full reproducer?

Thanks,
Raghu

George_W_
Beginner

Hi Raghu,

I could not create a tiny reproducible case, so I removed a ton of extraneous code from my program and zipped it up for you. The zip includes all the source, MSVC make files, and a prebuilt executable.

The buggy code is in src/gpusieve.cl, in function CalcModularInverses, above the "if (prime == 13)" printfs. It reproduces both the mul_hi bug and the ulong multiplication bug. It also shows the correct result when the constant 4620 is assigned to a variable.

Let me know if you need more.

Regards,

George

George_W_
Beginner

Hi,

Were you able to reproduce the bug with the code in my last post?  Anything else I can provide to help?

Regards,

George
