OpenCL* for CPU

The result of multiplying 64-bit numbers is wrong. Why?

jianzhong_wang
Beginner
2,360 Views

OpenCL1.2/OpenCL2.0
OS:Win10
Intel(R) HD Graphics 4600

 


__kernel void miniTest(__global unsigned long long * buff )
{
    const unsigned int M = 0x3FFFFFFUL;
    const unsigned int R = 0x3D10UL;

    unsigned long long u = 0;
    unsigned long long c = 0;
    unsigned long long d = 0;

    buff[0] = 0x32974b6;
    buff[1] = 0x6695f8a;
    buff[2] = 0x3cc840c;
    buff[3] = 0xc8998;
    buff[4] = 0x4f06406;
    buff[5] = 0x438d148;
    buff[6] = 0x29e81a2;
    buff[7] = 0x784c484;
    buff[8] = 0x6ad6038;
    buff[9] = 0x1383ee;

    unsigned long long aA[10] = { 0 };
    aA[0] = (unsigned long long)buff[0];
    aA[1] = (unsigned long long)buff[1];
    aA[2] = (unsigned long long)buff[2];
    aA[3] = (unsigned long long)buff[3];
    aA[4] = (unsigned long long)buff[4];
    aA[5] = (unsigned long long)buff[5];
    aA[6] = (unsigned long long)buff[6];
    aA[7] = (unsigned long long)buff[7];
    aA[8] = (unsigned long long)buff[8];
    aA[9] = (unsigned long long)buff[9];

    d = (aA[0] * 2) * aA[9]
      + (aA[1] * 2) * aA[8]
      + (aA[2] * 2) * aA[7]
      + (aA[3] * 2) * aA[6]
      + (aA[4] * 2) * aA[5];

    d >>= 26;

    d += (aA[1] * 2) * aA[9]
       + (aA[2] * 2) * aA[8]
       + (aA[3] * 2) * aA[7]
       + (aA[4] * 2) * aA[6]
       + (aA[5]    ) * aA[5];

    u = d & (unsigned long long)M;       // at this point u = 0x0a1db38
    c = u * (unsigned long long)R;       // the result (c=0x09b5a0b80) is always wrong. Why?

    unsigned int iCLow = (unsigned int)c;
    unsigned int iCHig = (unsigned int)(c >> 32);
    printf( "c=0x%x%x;\r\n" , iCHig , iCLow );

//===================================================

    aA[0] = 0x32974b6;
    aA[1] = 0x6695f8a;
    aA[2] = 0x3cc840c;
    aA[3] = 0xc8998;
    aA[4] = 0x4f06406;
    aA[5] = 0x438d148;
    aA[6] = 0x29e81a2;
    aA[7] = 0x784c484;
    aA[8] = 0x6ad6038;
    aA[9] = 0x1383ee;

    d = (aA[0] * 2) * aA[9]
      + (aA[1] * 2) * aA[8]
      + (aA[2] * 2) * aA[7]
      + (aA[3] * 2) * aA[6]
      + (aA[4] * 2) * aA[5];

    d >>= 26;
    d += (aA[1] * 2) * aA[9]
       + (aA[2] * 2) * aA[8]
       + (aA[3] * 2) * aA[7]
       + (aA[4] * 2) * aA[6]
       + (aA[5]    ) * aA[5];

    u = d & (unsigned long long)M;        // at this point u = 0x0a1db38
    c = u * (unsigned long long)R;        // the result (c=0x269b5a0b80) is OK.

    iCLow = (unsigned int)c;
    iCHig = (unsigned int)(c >> 32);
    printf( "c=0x%x%x;\r\n" , iCHig , iCLow );
}
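A side note on the printing in the kernel above: "%x%x" does not zero-pad the low word, so a value such as 0x100000001 would print as 0x11. The outputs quoted in this thread are not affected, because the low word 0x9b5a0b80 already uses all eight hex digits. Below is a minimal host-side C sketch of a padded variant. The name format_hex64 is a hypothetical helper for illustration, not part of the original program.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a 64-bit value from its two 32-bit halves.
   The low half is zero-padded to 8 hex digits so no digits are lost. */
int format_hex64(char *buf, size_t size, uint64_t v)
{
    unsigned int lo = (unsigned int)(v & 0xFFFFFFFFu);
    unsigned int hi = (unsigned int)(v >> 32);
    return snprintf(buf, size, "0x%x%08x", hi, lo);
}

/* Self-check: one value from the thread, one that needs the padding. */
void format_hex64_demo(void)
{
    char buf[32];

    format_hex64(buf, sizeof buf, 0x269b5a0b80ULL);
    assert(strcmp(buf, "0x269b5a0b80") == 0);

    format_hex64(buf, sizeof buf, 0x100000001ULL);
    assert(strcmp(buf, "0x100000001") == 0);
}
```

The same "%x%08x" format string could be used in the kernel's printf call, assuming the device printf supports the zero-padding flag.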
 

13 Replies
jianzhong_wang
Beginner

When I initialize the device as a GPU (iResult = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_GPU, 1, &device_id, &ret_num_devices);), the result is wrong (c=0x09b5a0b80).

When I initialize the device as a CPU (iResult = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_CPU, 1, &device_id, &ret_num_devices);), the result is right (c=0x269b5a0b80).

CPU: Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz, GenuineIntel

GPU: Intel(R) HD Graphics 4600

----------------------------------------

So I think it is a bug in the graphics chip.
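
The two values can be compared on the host. The sketch below is plain C (not OpenCL C), and the function names are hypothetical. It shows that the wrong value 0x9b5a0b80 is exactly the low 32 bits of the correct product 0x269b5a0b80. That is consistent with the upper half of the 64-bit multiply being lost on the GPU path, though it does not prove where the loss happens.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Full 64-bit product, as the kernel intends. */
uint64_t mul_full(uint64_t u, uint64_t r)
{
    return u * r;
}

/* The same product with the upper 32 bits discarded.
   This models one possible failure mode; it is an assumption,
   not a statement about what the driver actually does. */
uint64_t mul_low32_only(uint64_t u, uint64_t r)
{
    return (uint64_t)(uint32_t)(u * r);
}

/* Print both results for the values reported in the thread. */
void compare_products(void)
{
    const uint64_t u = 0xa1db38ULL; /* value of u reported in the thread */
    const uint64_t r = 0x3D10ULL;   /* constant R from the kernel */

    assert(mul_low32_only(u, r) == (mul_full(u, r) & 0xFFFFFFFFULL));
    printf("full      = 0x%llx\n", (unsigned long long)mul_full(u, r));
    printf("low 32bit = 0x%llx\n", (unsigned long long)mul_low32_only(u, r));
}
```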

 

Michael_C_Intel1
Moderator

Hi JianzhongW,

Thanks for sending the example and for your interest.

Program tweaks:

  • The program used unsigned long long. That's not a data type per the OpenCL C 2.0 spec, so this retest used unsigned long instead. We'll see about better diagnostics for unsigned long long at the kernel compilation step; ioc64 reported odd errors with the original unsigned long long kernel targeting iGFX.
  • The buffer size constant used 0x1024 instead of the 1024 hardcoded in the malloc for offload memory.
  • The kernel was compiled with -cl-std=CL2.0 and executed on Windows* 10, Intel® Core™ i5-6300U, with Graphics Driver Revision 25.20.100.6444.

 

With CL_DEVICE_TYPE_CPU, printing both u and c:

GetPlatformIDSTotal 2
u=0xa1db38;
c=0x269b5a0b80;
u=0xa1db38;
c=0x269b5a0b80;
Hello World!

With CL_DEVICE_TYPE_GPU, printing both u and c:

GetPlatformIDSTotal 2
u=0xa1db38;
c=0x269b5a0b80;
u=0xa1db38;
c=0x269b5a0b80;
Hello World!

 

Can you try the program with the standard-compliant tweaks applied?

-MichaelC

Michael_C_Intel1
Moderator

It may be that implementers interpret 'C99 as a base specification' differently. I'll look into the issue.

-MichaelC

Michael_C_Intel1
Moderator

Section 6.1.4 indicates that unsigned long long is a reserved type: https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf

Judging by the structure of the kernel program, it doesn't look like the reserved type is what's intended; the program expects C99-style 64-bit behavior (values up to 2^64).

 

-MichaelC

jianzhong_wang
Beginner

Michael C. (Intel),

Thanks for your answer.

1. The program used unsigned long long, which is not a data type per the OpenCL C 2.0 spec.

------ I retested the program after modifying the code.

2. The buffer size constant used 0x1024 instead of the hardcoded 1024 used in the malloc for offload memory.

------ OK.

3. The kernel was compiled with -cl-std=CL2.0.

------ I can't find this compile option in VS2017.

-----------------------------

The test result is still wrong:

GetPlatformIDSTotal 1
c=0x09b5a0b80;
c=0x269b5a0b80;
Hello World!

 

jianzhong_wang
Beginner

Something odd happens when I compile the .cl file.

The prompt tells me that it doesn't support 64-bit.

Michael_C_Intel1
Moderator

Jianzhong Wang,

 

There's nothing to be concerned about with the 64-bit message.

Thanks for seeking clarification. The target OS option comes from the IDE build interface provided by Code Builder: it passes -TARGETOS as a parameter when launching the 64-bit build with ioc64.exe, which emits the warning because the parameter isn't required there. ioc32 builds can use that parameter.

Kernel build options are specified through the Additional Build Options field, also known as the -bo=" " toggle of ioc64. Right-click the kernel file in the GUI and check its Properties; the field is under OpenCL Code Builder -> General -> Additional Build Options. You can supply the -cl-std= toggle of your choice there.

Also, by default ioc64 is effectively just an offline compile pass that runs when 'build' is executed in MSVS, and it should report any errors that exist. The ioc64 pass can be configured via the menu to perform other build actions.

The -bo="" / Additional Build Options field also maps to the char* options argument of the API call clBuildProgram(...). There you could use a macro, something like -DMY_VAR=5, to help with debugging if necessary. I recommend including a standard toggle that matches whichever OpenCL revision is desired and supported by the hardware, at least to be consistent across standard hardware.
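
To illustrate the mapping to clBuildProgram, here is a minimal host-side C sketch that only composes the options string. make_build_options is a hypothetical helper, and MY_VAR is the example macro from above. The actual API call is shown in a comment because it needs <CL/cl.h> plus a valid program and device.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: compose an options string such as
   "-cl-std=CL1.2 -DMY_VAR=5".
   Returns 0 on success, -1 if the buffer is too small. */
int make_build_options(char *buf, size_t size, const char *cl_std, int my_var)
{
    int n = snprintf(buf, size, "-cl-std=%s -DMY_VAR=%d", cl_std, my_var);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Self-check of the composed string. */
void make_build_options_demo(void)
{
    char opts[64];
    assert(make_build_options(opts, sizeof opts, "CL1.2", 5) == 0);
    assert(strcmp(opts, "-cl-std=CL1.2 -DMY_VAR=5") == 0);
}

/* In a real host program the string is passed as the 4th argument:
 *
 *   cl_int err = clBuildProgram(program, 1, &device_id, opts, NULL, NULL);
 *
 * Inside the kernel source, MY_VAR is then visible to the preprocessor,
 * e.g. within #if MY_VAR == 5 ... #endif.
 */
```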

Would you mind reposting your edited code?

Can you make sure your Windows Intel Graphics Driver is up to date? The driver package contains both the CPU and iGFX OpenCL runtimes. Go to the system vendor's support website first; if a driver package from there doesn't apply, try downloadcenter.intel.com.

 

What Intel® CPU is in use on the system? Can you share the system vendor and model?

Thanks,

 

-MichaelC

jianzhong_wang
Beginner

Michael C. (Intel),

1. The system environment:

OS: Win10

CPU: Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz

2. I updated my graphics driver today (win64_15.40.42.5063.exe).

3. My graphics hardware seems to support OpenCL 1.2 only.

Compiling the kernel with -cl-std=CL2.0 gives an error, but it is OK with 1.2.
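
One way to avoid guessing the -cl-std value is to derive it from the string the device reports for CL_DEVICE_OPENCL_C_VERSION (queried with clGetDeviceInfo), which has the form "OpenCL C <major>.<minor> <vendor-specific>". The sketch below is plain C and only parses such a string. cl_std_flag_from_version is a hypothetical helper, and the sample string in the check is an assumed example, not output captured from this machine.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse "OpenCL C <major>.<minor> ..." and write a
   matching "-cl-std=CL<major>.<minor>" flag into buf.
   Returns 0 on success, -1 on a parse failure or a too-small buffer. */
int cl_std_flag_from_version(const char *version, char *buf, size_t size)
{
    int major = 0, minor = 0;
    int n;

    if (sscanf(version, "OpenCL C %d.%d", &major, &minor) != 2)
        return -1;

    n = snprintf(buf, size, "-cl-std=CL%d.%d", major, minor);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Self-check with an assumed version string. */
void cl_std_flag_demo(void)
{
    char flag[32];
    assert(cl_std_flag_from_version("OpenCL C 1.2 ", flag, sizeof flag) == 0);
    assert(strcmp(flag, "-cl-std=CL1.2") == 0);
}
```

Which -cl-std values a given compiler accepts depends on the implementation, so treat the generated flag as a starting point.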

4. I saved the system info to the attached file using Intel SSU.

 

jianzhong_wang
Beginner

Here is the system information for my PC.

I saved it to sys-info.txt using Intel SSU.

Michael_C_Intel1
Moderator

Thanks JianzhongW,

Thanks for providing the meta information.

Would you mind posting your last edited issue reproducer? I want to confirm exactly the last thing you've checked.

I can't fully reproduce it because I don't have immediate access to the legacy hardware... But I'd like to ask the dev team to see if they can reproduce on the driver stack you deployed.

Thanks again,

 

-MichaelC

jianzhong_wang
Beginner

Hi Michael C.,

Sorry for the delay; it was a Chinese holiday over the past few days.

I didn't get the right result in the end, so I changed the code to use 32-bit arithmetic.
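
For readers wondering what a 32-bit workaround can look like, here is a minimal C sketch that builds a 64-bit product from 16-bit limbs using only 32-bit operations. It is one possible approach, not necessarily the change made here, and mul32_wide is a hypothetical name. OpenCL C also provides the built-in mul_hi, which returns the upper half of an integer product.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 32x32 -> 64-bit multiply using only 32-bit math.
   The result is returned as two 32-bit halves (hi, lo). */
void mul32_wide(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t p0 = a_lo * b_lo; /* weight 2^0  */
    uint32_t p1 = a_lo * b_hi; /* weight 2^16 */
    uint32_t p2 = a_hi * b_lo; /* weight 2^16 */
    uint32_t p3 = a_hi * b_hi; /* weight 2^32 */

    /* Sum of everything that lands in bits 16..31, plus its carry. */
    uint32_t mid = (p0 >> 16) + (p1 & 0xFFFFu) + (p2 & 0xFFFFu);

    *lo = (p0 & 0xFFFFu) | (mid << 16);
    *hi = p3 + (p1 >> 16) + (p2 >> 16) + (mid >> 16);
}

/* Self-check against the host's native 64-bit multiply, using the
   values of u and R from this thread. */
void mul32_wide_demo(void)
{
    uint32_t hi, lo;
    uint64_t ref = (uint64_t)0xa1db38u * 0x3D10u;

    mul32_wide(0xa1db38u, 0x3D10u, &hi, &lo);
    assert((((uint64_t)hi << 32) | lo) == ref);
}
```

With u = 0xa1db38 and R = 0x3D10 this gives hi = 0x26 and lo = 0x9b5a0b80, which matches the CPU result c=0x269b5a0b80.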

My system environment:

OS: Win10

CPU: Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz

Graphics driver: win64_15.40.42.5063.exe

 

 

 

jianzhong_wang
Beginner

Thank you very much for your help.
