Intel® oneAPI Data Parallel C++
Support for Intel® oneAPI DPC++ Compiler, Intel® oneAPI DPC++ Library, Intel ICX Compiler, Intel® DPC++ Compatibility Tool, and GDB*

std::ostream << vec<T,length> not supported in oneAPI (beta07)

CFR
New Contributor II

Lenovo T570, Ubuntu 18.04.4, oneAPI beta07

As best I can tell, it's perfectly acceptable to use the "vec" types (or the corresponding aliases) in host code (i.e. not in kernels).  The oneAPI software has operator<< overloads that print "vec" types to a sycl::stream, but there are no corresponding overloads for std::ostream (e.g. std::cout).  It's easy enough to hack something together, but it seems to me this capability should be supported by default.

GouthamK_Intel
Moderator

Hi,


According to the SYCL specification, sycl::stream is a buffered output stream class modeled on C++'s std::ostream. It is intended to replace the C-style printf of OpenCL C, and it enables output to stdout from within a SYCL kernel.

The stream class is designed for use only within kernels, to stream data from the kernel back to the host CPU for printing via stdout, and it is passed to a kernel in a similar fashion to accessors. When strings are streamed to the object within a kernel, the contents are stored in an internal cl_mem object until the kernel finishes execution. The contents are then written to stdout when the stream object is destroyed by the runtime.
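
The "collect now, print on destruction" behaviour described above can be illustrated with a small host-only analogy. This is only a sketch: `DeferredStream`, `Observation`, and `observe_deferred_output` are hypothetical names, the class is not sycl::stream, and a std::ostringstream stands in for stdout so the effect can be observed.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical host-only stand-in; it is NOT sycl::stream. It only mimics
// the idea of buffering output and releasing it when the object is destroyed.
class DeferredStream {
public:
  explicit DeferredStream(std::ostream &sink) : sink_(sink) {}
  DeferredStream(const DeferredStream &) = delete;
  DeferredStream &operator=(const DeferredStream &) = delete;

  // Everything collected so far reaches the sink only here.
  ~DeferredStream() { sink_ << buffer_.str(); }

  template <typename T>
  DeferredStream &operator<<(const T &value) {
    buffer_ << value;  // stored internally; nothing is written to the sink yet
    return *this;
  }

private:
  std::ostream &sink_;
  std::ostringstream buffer_;
};

// What the sink contained while the stream was alive and after it was destroyed.
struct Observation {
  std::string during;
  std::string after;
};

Observation observe_deferred_output() {
  std::ostringstream sink;  // stands in for stdout
  Observation seen;
  {
    DeferredStream out(sink);
    out << "idx " << 3 << " before\n";
    seen.during = sink.str();  // still empty: output is buffered
  }                            // destructor runs here and flushes
  seen.after = sink.str();
  return seen;
}
```

In this sketch `during` stays empty and `after` holds the full line, which mirrors why sycl::stream output appears only once the kernel has finished and the stream object has been destroyed.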


Apart from sycl::stream, Intel oneAPI provides an experimental printf that can be used inside a kernel.

The syntax is shown below.


Before main:

#ifdef __SYCL_DEVICE_ONLY__
  #define CONSTANT __attribute__((opencl_constant))
#else
  #define CONSTANT
#endif

Inside the kernel:

static const CONSTANT char FMT[] = "Im experimental printf %d \n";
sycl::intel::experimental::printf(FMT, value);


From the query you posted: "It's easy enough to hack something together, but it seems to me this capability should be supported by default." Could you please explain what exactly you mean? Are you making a feature request?



Best Regards

--Goutham


CFR
New Contributor II

Sorry, in my mind I was clearer. ;^)  I understand the use of sycl::stream and using it within kernels.  What I feel is missing is that for the "vec" types (and aliases) there are no corresponding operator<< overloads to print "vecs" to std::ostream outside of a kernel.  As far as I can tell, it's perfectly acceptable to write code using "vec" types on the host side (either for actual computation, à la AVX without intrinsics, or just to set up data that will eventually be passed to a SYCL kernel that uses "vec" types).

I guess I would ask for a feature request to be considered: defining overloads that print "vec" types to std::ostream and mirror those that print to sycl::stream.

By "easy to hack" I mean that it's easy enough to use something like...

 

 

template <typename T, int VectorLength>
std::ostream &operator<<(std::ostream &Out, const sycl::vec<T, VectorLength> &RHS) {
  for (int i=0; i<VectorLength; i++) Out << RHS[i] << (i==(VectorLength-1)?"":", ");
  return Out;
}

 

 

...to print my "vec" data but it seems natural that such capability be included in the software development tools.
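
For readers who want to try the idea without a SYCL toolchain, here is a minimal sketch of the same kind of operator. It is an illustration under stated assumptions, not a oneAPI facility: `Vec` is a hypothetical stand-in for sycl::vec, and `to_text` is a hypothetical helper. The one deliberate difference from the snippet above is the unary plus, which promotes 8-bit elements so they print as numbers rather than characters.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for sycl::vec so the sketch builds with any C++ compiler.
template <typename T, int N>
struct Vec {
  std::array<T, N> data;
  const T &operator[](int i) const { return data[i]; }
};

// Print the elements separated by ", " with no trailing separator.
template <typename T, int N>
std::ostream &operator<<(std::ostream &out, const Vec<T, N> &v) {
  for (int i = 0; i < N; ++i) {
    if (i != 0) out << ", ";
    out << +v[i];  // unary plus: char-sized elements print as integers
  }
  return out;
}

// Hypothetical helper: render a vector to a string, optionally in hex.
template <typename T, int N>
std::string to_text(const Vec<T, N> &v, bool hex = false) {
  std::ostringstream os;
  if (hex) os << std::hex;  // stream flags apply to every element
  os << v;
  return os.str();
}
```

With this sketch, `to_text(Vec<std::uint64_t, 4>{{10, 11, 12, 13}})` gives "10, 11, 12, 13", and passing `true` for `hex` gives "a, b, c, d". Because the operator only forwards to the element's own `operator<<`, manipulators such as std::hex behave the same way they do for scalars.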

 

CFR
New Contributor II

Sorry I wasn't more clear. My issue isn't with printing "vec" (and aliases) types from a kernel, but doing so from the host side. I can use the "vec" types in my host code (for computation or for data that eventually gets processed by a kernel) but there are no methods to support std::ostream operator<<. (At least I haven't found them.)

My feature request would be to add such methods so there is symmetrical support; i.e. there is a set on the host side that matches the set on the kernel side.

By "easy to hack" I mean that it's easy enough to make your own using something like...

template <typename T, int VectorLength>
std::ostream &operator<<(std::ostream &Out, const sycl::vec<T, VectorLength> &RHS) {
  for (int i=0; i<VectorLength; i++) Out << RHS[i] << (i==(VectorLength-1)?"":", ");
  return Out;
}

... but I just thought it was something that should be included in the software tools by default.

CFR
New Contributor II

When I try to reply my post is labeled as spam and/or deleted?

CFR
New Contributor II

I've tried posting this as a reply to Goutham's reply but the forum keeps deleting my messages.  I'll try posting it here and see what happens.

My issue isn't with printing "vec" (and aliases) types from a kernel, but doing so from the host side. I can use the "vec" types in my host code (for computation or for data that eventually gets processed by a kernel) but there are no methods to support std::ostream operator<<. (At least I haven't found them.)

My feature request would be to add such methods so there is symmetrical support; i.e. there is a set on the host side that matches the set on the kernel side.

By "easy to hack" I mean that it's easy enough to make your own using something like...

template <typename T, int VectorLength>
std::ostream &operator<<(std::ostream &Out, const sycl::vec<T, VectorLength> &RHS) {
  for (int i=0; i<VectorLength; i++) Out << RHS[i] << (i==(VectorLength-1)?"":", ");
  return Out;
}

... but I just thought it was something that should be included in the software tools by default.

CFR
New Contributor II

I give up.  I've tried for 3 days to follow up on this and my messages keep being called spam and/or being deleted. If necessary we can try to move the discussion to Stack Overflow.

GouthamK_Intel
Moderator

Hi,


We apologize for the delay and the inconvenience caused.

We have reported the deletion of your posts to the team concerned.

Regarding the feature request: you mentioned that you need symmetrical support across host and device code. Could you please share a particular case where the lack of this feature caused you inconvenience?

This will help us file a feature request with the team concerned.

Please share the required details if possible.



Thanks & Regards

Goutham


CFR
New Contributor II

(Sorry for all the posts. I kept trying different posts but they kept getting flagged as SPAM.  Someone eventually unblocked them all rather than just one of the attempts to respond).

As for printing "vec" on the host side....

I'm still at the point of just playing around with SYCL, so I don't have a compelling application to press the issue.  But I can see that any time I implement a kernel that uses a "vec" type, I'm going to want to do things with that data on the host as well, so I would think it's a pretty useful thing.  (I can say that I've written a number of SIMD classes for AVX usage, and the ability to print out data is always useful/important.)  It's also just aesthetically pleasing to have the host and device sides be symmetric.  In any case, here is a toy program pared down from the experiment that originally caused me to run into this:

#include <CL/sycl.hpp>
namespace sycl = cl::sycl;

#include <iostream>

const int Nproc = 6;
const int Niter = 1024;

/* kludge to output "vec" on the host side */
template <typename T, int VectorLength>
std::ostream &operator<<(std::ostream &Out, const sycl::vec<T, VectorLength> &RHS) {
  for (int i=0; i<VectorLength; i++) Out << RHS[i] << (i==(VectorLength-1)?"":", ");
  return Out;
}

/* kludge to get number of elements in a "vec" variable */
template <typename T, int VectorLength>
int numElements(sycl::vec<T,VectorLength> type) {return VectorLength; }

template <typename T>
T lcg48(T state, int n)
{
  for (int i=0; i<n; i++) {
    state = (state * 0x5DEECE66D + 0xB) % (1L<<48);
  }
  T retval = ((state >> 17) & 0x7FFF8000) + ((state >> 17) & 0x7FFF);
  return retval;
}

#define INIT()                                                          \
  std::cout << "Initial data" << std::endl;                             \
  for (int p=0; p<Nproc; p++) {                                         \
    dataT tmp;                                                          \
    tmp = (10*p);                                                       \
    for (int i=0; i<numElements(tmp); i++) tmp[i] = tmp[i]+i;           \
    ydata[p] = tmp;                                                     \
  }                                                                     \
  for (int p=0; p<Nproc; p++) std::cout << std::dec << p << " " << ydata[p] << std::endl; 

using dataT = sycl::ulong8;

int
main(int argc, char *argv[])
{
  dataT ydata[Nproc];

  /** C++ **/
  std::cout << "C++ dataT" << std::endl;
  INIT();
  for (int p=0; p<Nproc; p++) {
    ydata[p] = lcg48(ydata[p], Niter);
  }
  std::cout << "results" << std::endl;
  for (int p=0; p<Nproc; p++) std::cout << std::hex << p << " " << ydata[p] << std::endl;

  /** SYCL CPU **/
  {
    std::cout << "SYCL CPU" << std::endl;
    sycl::device dev = sycl::cpu_selector().select_device();
    std::cout << "Device: " 
        << "name: " << dev.get_info<sycl::info::device::name>() << std::endl
        << "vendor: " << dev.get_info<sycl::info::device::vendor>() << std::endl;
    INIT();
    sycl::queue q(dev);
    sycl::buffer<dataT, 1> ybuf((dataT *)ydata, sycl::range<1> {Nproc});
    q.submit([&](sycl::handler& cgh) {
      sycl::stream sout {1024, 1024, cgh};
      //sout << "Kernel" << sycl::endl;
      auto yacc = ybuf.get_access<sycl::access::mode::read_write, sycl::access::target::global_buffer>(cgh);
      cgh.parallel_for<class kernelCPU>(
          sycl::range<1> {Nproc}, 
          [=] (sycl::item<1> item) {
              int idx = item.get_linear_id();
              sout << idx << " before " << sycl::hex << yacc[idx] << sycl::dec << sycl::endl;
              yacc[idx] = lcg48(yacc[idx], Niter);
              sout << idx << " after " << sycl::hex << yacc[idx] << sycl::dec << sycl::endl;
              }
          );
      }
      ); 
  }
  std::cout << "results" << std::endl;
  for (int p=0; p<Nproc; p++) std::cout << std::hex << p << " " << ydata[p] << std::endl;
}

I'll note that in trying to tidy up the example I came across another (I think) deficiency in the "vec" class.  As far as I could tell there's no way to get the number of elements in a type.  The 'getNumElements' static method is private.  As illustrated in this example, sometimes it's necessary to parameterize the code based on the number of elements.  (see the "INIT()" macro).

Anyway, there's my suggestion. Being a newbie to SYCL though I'm willing to defer to the language lawyers if they have a reason this is a bad idea. I could be all wrong and maybe shouldn't be doing things this way.  I'm happy to learn the right/better way.

GouthamK_Intel
Moderator

Hi,


Regarding sycl::stream usage outside the kernel:

Thank you for your valuable input. We are discussing your feature request with the internal team concerned.


Regarding "deficiency in the 'vec' class" you have mentioned in your previous comment:

Could you please start a new thread for this, as we think it deviates a little from the current thread topic? A separate thread would also be easier to track and easier for other community members to find if they face a similar issue.



Thanks & Regards

Goutham


CFR
New Contributor II

I withdraw the issue about getting the number of elements in a "vec".  It turns out there is a method "get_count" that provides NumElements.  (That's what I get for reading the header file as opposed to reading the spec :^\)
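
As a rough illustration of how an element-count query can replace the numElements() kludge in the toy program, here is a sketch that parameterizes initialization on the vector width. It assumes only that the vector type exposes a get_count() returning the number of elements and an element_type alias; `Vec` is a hypothetical stand-in for sycl::vec and `make_ramp` is a hypothetical helper, so check the spec or headers of your SYCL version for the exact form of the real query.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for sycl::vec; only what the sketch needs is modeled.
template <typename T, int N>
struct Vec {
  using element_type = T;
  std::array<T, N> data;
  static constexpr std::size_t get_count() { return N; }
  T &operator[](int i) { return data[i]; }
  const T &operator[](int i) const { return data[i]; }
};

// Mirrors what the INIT() macro does for one vector: every lane starts at
// `base`, then lane i has i added to it. The loop bound comes from the type.
template <typename V>
V make_ramp(int base) {
  V v{};
  const int count = static_cast<int>(V::get_count());
  for (int i = 0; i < count; ++i) {
    v[i] = static_cast<typename V::element_type>(base + i);
  }
  return v;
}
```

For example, `make_ramp<Vec<unsigned long long, 8>>(20)` yields lanes 20 through 27, with no helper function needed to recover the width.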

GouthamK_Intel
Moderator

Hi,

Glad to know that you were able to find the appropriate function yourself.

Regarding the feature request:

We have forwarded your request to the internal team concerned. Depending on the number of similar requests received from other community members, this feature may be included in a future release.

We appreciate you bringing this feature request to us.

Could you please let us know whether all your issues have been resolved?


Thanks & Regards

Goutham


GouthamK_Intel
Moderator

Hi,

As this issue has been resolved, we will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.


Thanks & Regards

Goutham

