Intel® oneAPI DPC++/C++ Compiler

parallel_for with range<3> iterates over wider range than specified

baranovsky

Hi,

I've been developing with DPC++ and recently tried to upgrade from version 2023 to the latest, 2024.0.1.

Since the upgrade, the output of my software is corrupted because of the following behavior, which may not be intended.

 

In the attached program,

  • parallel_for(range<1>(...), ...) works fine,
  • but parallel_for(range<3>(...), ...) iterates over a wider range than specified.

I'm wondering whether this is a bug or whether I'm missing something.

 

Attachments:

main.txt: originally 'main.cpp'

Makefile.txt: originally 'Makefile'

 

My development environment:

OS: Ubuntu 22.04 LTS

oneAPI: 2024.0.1 with oneapi-for-nvidia-gpus-2024.0.1-cuda-12.0-linux.sh

CPU: Intel Core i9-14900KF

GPU: NVIDIA GeForce RTX 4090

CUDA: 12.2 (driver version: 535.146.02)

 

Thank you in advance for your help.

baranovsky

I noticed that my post did not show the actual output.

 

In my environment, the console output looks like this:

> 32767 should be 32767
> 49151 should be 32767

 

With parallel_for(range<1>(x * y * z), ...), where x = y = z = 32, the maximum value returned by item.get_linear_id() is 32767, which equals 32 * 32 * 32 - 1.

This seems correct.
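As a sanity check on the expected value, the same maximum can be derived for the 3D case. The sketch below is plain C++, not SYCL, and assumes the row-major linearization that the SYCL 2020 specification describes for a three-dimensional item (dimension 0 slowest, dimension 2 fastest). The helper names linear_id and max_linear_id are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>

// Row-major linearization of the 3D index (i0, i1, i2) inside a range
// (r0, r1, r2). Assumption: this mirrors what item<3>::get_linear_id()
// is specified to return; r0 does not appear in the formula.
constexpr std::size_t linear_id(std::size_t i0, std::size_t i1, std::size_t i2,
                                std::size_t r1, std::size_t r2) {
    return i0 * r1 * r2 + i1 * r2 + i2;
}

// Largest linear id in the range: the last valid index in every dimension.
constexpr std::size_t max_linear_id(std::size_t r0, std::size_t r1,
                                    std::size_t r2) {
    return linear_id(r0 - 1, r1 - 1, r2 - 1, r1, r2);
}

// Compile-time check for x = y = z = 32: 31*1024 + 31*32 + 31 = 32767.
static_assert(max_linear_id(32, 32, 32) == 32 * 32 * 32 - 1,
              "3D and 1D maxima should agree");
```

Under that assumption, range<3>(32, 32, 32) should give the same maximum linear id as range<1>(32 * 32 * 32), namely 32767.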

 

On the other hand, with parallel_for(range<3>(x, y, z), ...), the maximum value is 49151, which exceeds 32 * 32 * 32 - 1 = 32767.

This seems to be too large.
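To see which 3D index the value 49151 would correspond to, the linear id can be decoded back into its components. This is again a plain C++ sketch, assuming row-major linearization over a range of (32, 32, 32). The helper delinearize is hypothetical, and the result only decodes the number; it does not show what the runtime actually does.

```cpp
#include <array>
#include <cstddef>

// Inverse of row-major linearization: recover (i0, i1, i2) from a linear id,
// assuming the two fastest dimensions have extents r1 and r2.
constexpr std::array<std::size_t, 3> delinearize(std::size_t linear,
                                                 std::size_t r1,
                                                 std::size_t r2) {
    return {linear / (r1 * r2), (linear / r2) % r1, linear % r2};
}

// The expected maximum decodes to the last valid index (31, 31, 31).
static_assert(delinearize(32767, 32, 32)[0] == 31, "expected maximum");

// The observed maximum decodes to (47, 31, 31): dimension 0 is out of range.
static_assert(delinearize(49151, 32, 32)[0] == 47, "observed maximum");
```

Decoded this way, 49151 corresponds to (47, 31, 31), which is the same as 48 * 32 * 32 - 1. If the assumption holds, the extra iterations lie entirely along dimension 0. Printing item.get_id(0) alongside item.get_linear_id() in the attached program would confirm or rule this out.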
