Intel® oneAPI Data Parallel C++

Problems with SYCL reduction variables using oneAPI

JamesFBM
Beginner

Hi, I'm new to SYCL and am trying to run an example about reduction variables from the SYCL Specification with oneAPI 2023.0.0. I revised the code slightly, and the final version is shown below:

 

#include <iostream>
#include <sycl/sycl.hpp>
#include <numeric>

using namespace sycl;

int main() {
    buffer<int> valuesBuf{ 1024 };
    {
        host_accessor a{ valuesBuf };
        // std::iota(a.begin(), a.end(), 0);
        std::iota(&(a[0]), &(a[1023]), 0);
    }

    int sumResult = 0;
    buffer<int> sumBuf{ &sumResult, 1 };
    int maxResult = 0;
    buffer<int> maxBuf{ &maxResult, 1 };

    queue myQueue;

    myQueue.submit([&](handler& cgh) {
        auto inputValues = valuesBuf.get_access<access_mode::read>(cgh);
        auto sumReduction = reduction(sumBuf, cgh, plus<>());
        auto maxReduction = reduction(maxBuf, cgh, maximum<>());

        /*
        cgh.parallel_for(range<1> {2048, 1024}, sumReduction, maxReduction,
            [=](nd_item<1> it, auto& sum, auto& max) {
                sum += inputValues[it.get_local_id()];
                max.combine(inputValues[it.get_global_id()]);
            });
        */

        cgh.parallel_for(nd_range<1> {2048, 1024}, sumReduction, maxReduction,
            [=](nd_item<1> it, auto& sum, auto& max) {
                sum += inputValues[it.get_local_id()];
                max.combine(inputValues[it.get_local_id()]);
            });
    });

    // assert(maxBuf.get_host_access()[0] == 1023 && sumBuf.get_host_access()[0] == 523776);
    std::cout << maxBuf.get_host_access()[0] << std::endl;
    std::cout << sumBuf.get_host_access()[0] << std::endl;
}

 

However, the program compiled with icpx from the command prompt does not work as expected. It typically prints nothing and exits with an error code that I cannot interpret. The result is shown below:

C:\Users\a1595\Desktop\dr\c>icpx -fsycl c.cpp -o c.exe

C:\Users\a1595\Desktop\dr\c>c.exe

C:\Users\a1595\Desktop\dr\c>echo %ERRORLEVEL%
-1073740791

I am wondering whether there is anything wrong with the code, compilation or anything else.

 

I am looking forward to any help. Thanks a lot.

1 Solution
SeshaP_Intel
Moderator

Hi,

 

Thank you for posting in Intel Communities.

 

Please find below the modified DPC++ source code, which produces the correct values when executed.

#include <iostream>
#include <CL/sycl.hpp>
#include <numeric>
using namespace sycl;

int main()
{
    // Fill all 1024 elements with 0, 1, ..., 1023.
    buffer<int> valuesBuf { 1024 };
    {
        host_accessor a { valuesBuf };
        std::iota(&a[0], &a[0] + 1024, 0);
    }

    int sumResult = 0;
    buffer<int> sumBuf { &sumResult, 1 };

    int maxResult = 0;
    buffer<int> maxBuf { &maxResult, 1 };

    queue myQueue;
    myQueue.submit([&](handler& cgh) {
        auto inputValues = valuesBuf.get_access<access_mode::read>(cgh);
        auto sumReduction = reduction(sumBuf, cgh, plus<>());
        auto maxReduction = reduction(maxBuf, cgh, maximum<>());

        // nd_range<1>{global size, work-group size}: 1024 work-items in total,
        // in work-groups of 256, so each work-item reads one distinct element.
        cgh.parallel_for(nd_range<1> {1024, 256}, sumReduction, maxReduction,
                         [=](nd_item<1> it, auto& sum, auto& max) {
                             sum += inputValues[it.get_global_id()];
                             max.combine(inputValues[it.get_global_id()]);
                         });
    });

    // Creating a host accessor waits for the kernel to finish before reading.
    std::cout << "maxReduction = " << maxBuf.get_host_access()[0] << std::endl;
    std::cout << "sumReduction = " << sumBuf.get_host_access()[0] << std::endl;
    return 0;
}

Please find the output screenshot below.

[Screenshot: SeshaP_Intel_0-1675919697364.png]

Hope this resolves your issue.

 

Thanks and Regards,

Pendyala Sesha Srinivas

 

4 Replies
PC-1
Beginner

Hi, SeshaP,

Unfortunately, I cannot compile your code in Visual Studio 2022.

The error messages are "error : SYCL kernel cannot call a variadic function"

and "error : SYCL kernel cannot call an undefined function without SYCL_EXTERNAL attribute".

 

Could you help me with that?

JamesFBM
Beginner

Hi, Sesha. Thanks a lot for your help. Your code works on my computer; I had simply misunderstood the usage of nd_range.

SeshaP_Intel
Moderator

Hi,

 

Thanks for accepting our solution. 

>> Unfortunately I cannot compile your code in Visual Studio 2022.

Please refer to the output screenshot below, which shows the code built in Visual Studio in Release mode.

[Screenshot: SeshaP_Intel_0-1676441264517.png]

 

If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.

 

Thanks and Regards,

Pendyala Sesha Srinivas

 
