Intel® oneAPI Base Toolkit

sycl::queue::memcpy (sycl::handler::memcpy) with std::string_view and static storage duration

breyerml

OS: Ubuntu 20.04.2 LTS
DPCPP compiler version: Intel(R) oneAPI DPC++ Compiler 2021.1 (2020.10.0.1113)

I tried playing around with samples from the "Data Parallel C++" book and stumbled upon an issue with USM, memcpy and static storage duration.

 

#include <CL/sycl.hpp>

#include <cstring>
#include <iostream>
#include <string>
#include <string_view>
#include <type_traits>
using namespace sycl;


template <typename T>
std::size_t str_size(const T& str) {
    if constexpr (std::is_pointer_v<T>) {
        return std::strlen(str);
    } else {
        return std::size(str);
    }
}
template <typename T>
const char* str_data(const T& str) {
    if constexpr (std::is_pointer_v<T>) {
        return str;
    } else {
        return std::data(str);
    }
}


template <typename T>
void test() {
    T str = "12345678";
    const std::size_t size = str_size(str);
    
    // use std::memcpy
    std::cout << "  std::memcpy" << std::endl;
    {
        queue Q;
        char* result = malloc_shared<char>(size + 1, Q);
        // init
        for (std::size_t i = 0; i < size; ++i) {
            result[i] = '-';
        }
        result[size] = '\0';
        std::cout << "    " << result << std::endl;
        
        // perform memcpy
        std::memcpy(result, str_data(str), size + 1);
        std::cout << "    " << result << std::endl;
        
        // kernel
        Q.parallel_for(size, [=](id<1> i) { result[i] += 1; }).wait();
        std::cout << "    " << result << std::endl;
        std::cout << "    -> " 
                  << (std::string_view{result} == std::string_view{"23456789"})
                  << std::endl;
    }
    
    // use sycl::queue::memcpy
    std::cout << "  sycl::queue::memcpy" << std::endl;
    {
        queue Q;
        char* result = malloc_shared<char>(size + 1, Q);
        // init
        for (std::size_t i = 0; i < size; ++i) {
            result[i] = '-';
        }
        result[size] = '\0';
        std::cout << "    " << result << std::endl;
        
        // perform memcpy
        Q.memcpy(result, str_data(str), size + 1).wait();
        std::cout << "    " << result << std::endl;
        
        // kernel
        Q.parallel_for(size, [=](id<1> i) { result[i] += 1; }).wait();
        std::cout << "    " << result << std::endl;
        std::cout << "    -> " 
                  << (std::string_view{result} == std::string_view{"23456789"})
                  << std::endl;
    }
    
    // use sycl::handler::memcpy
    std::cout << "  sycl::handler::memcpy" << std::endl;
    {
        queue Q;
        char* result = malloc_shared<char>(size + 1, Q);
        // init
        for (std::size_t i = 0; i < size; ++i) {
            result[i] = '-';
        }
        result[size] = '\0';
        std::cout << "    " << result << std::endl;
        
        // perform memcpy
        Q.submit([&](handler& h) {
            h.memcpy(result, str_data(str), size + 1);
        }).wait();
        std::cout << "    " << result << std::endl;
        
        // kernel
        Q.parallel_for(size, [=](id<1> i) { result[i] += 1; }).wait();
        std::cout << "    " << result << std::endl;
        std::cout << "    -> " 
                  << (std::string_view{result} == std::string_view{"23456789"})
                  << std::endl;
    }
    
    std::cout << std::endl;
}


int main() {
    std::cout << std::boolalpha;
    
    // std::string
    std::cout << "std::string" << std::endl;
    test<std::string>();
    
    std::cout << "std::string_view" << std::endl;
    test<std::string_view>();
    
    std::cout << "const char*" << std::endl;
    test<const char*>();
    
    
    return 0;
}

 

The idea is the following:
- create a string
- allocate USM of the respective size (and initialize it with '-')
- copy string to USM pointer
- modify string inside kernel

This works fine for std::string, std::string_view, and const char* when using std::memcpy.
However, when using sycl::queue::memcpy or sycl::handler::memcpy, it only works for std::string. With std::string_view or const char*, the respective memcpy operation copies garbage.

Here is one sample output on my machine:

 

std::string
  std::memcpy
    --------
    12345678
    23456789
    -> true
  sycl::queue::memcpy
    --------
    12345678
    23456789
    -> true
  sycl::handler::memcpy
    --------
    12345678
    23456789
    -> true

std::string_view
  std::memcpy
    --------
    12345678
    23456789
    -> true
  sycl::queue::memcpy
    --------
    ��7
    ��8D@�O�
    -> false
  sycl::handler::memcpy
    --------
     �
h�! 
    !�
      i�"!
    -> false

const char*
  std::memcpy
    --------
    12345678
    23456789
    -> true
  sycl::queue::memcpy
    --------
     �
h�! �"
    !�
      i�"!�"
    -> false
  sycl::handler::memcpy
    --------
    ��7
    ��8D@��$
    -> false

 

The code has been compiled with

 

dpcpp -std=c++17 memcpy.cpp

 

 

I guess that the weird results have something to do with the static storage duration of the const char* c-string literal (and hence the std::string_view wrapping a const char*).

Is it explicitly disallowed by SYCL to memcpy the contents of a c-style string literal using the respective sycl memcpy functions?

 

EDIT:
According to https://intel.github.io/llvm-docs/doxygen/classcl_1_1sycl_1_1queue.html#a6bc6a510e5e9abcbf1ee904d4b8... sycl::queue::memcpy accepts only USM pointers for both the source and the destination. However, the SYCL standard explicitly states on page 301: "Copies numBytes of data from the pointer src to the pointer dest. Both dest and src may be either host or USM pointers. For more detail on USM, please see Section 4.8." If both pointers had to be USM pointers, memcpy would be rather useless...

AbhishekD_Intel

Hi,


Thanks for reaching out to us.

We are looking into your issue. We will update you as soon as we get any updates.


Warm Regards,

Abhishek


Sravani_K_Intel

Hi,


Thanks for reporting. This issue is due to a bug in the DPC++ runtime when copying from the .data section to the device. The issue has been escalated to development and will be fixed in a timely manner.


DPC++ is aligned with the SYCL spec for sycl::queue::memcpy(): both the source and destination pointers can be either USM or host pointers. The above link has a documentation glitch, which will be fixed shortly.



Thanks.
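
Until a fixed runtime is available, one possible workaround is to stage the literal's characters in memory that is not part of the program's static data before handing the pointer to the SYCL memcpy. This is only a sketch and has not been tested against the affected runtime. It rests on the observation in the original post that the std::string case works, and on the assumption that heap-backed storage behaves the same way. The helper name stage_on_heap is hypothetical and is not part of SYCL or DPC++.

```cpp
#include <algorithm>
#include <cassert>
#include <string_view>
#include <vector>

// Hypothetical helper (not part of SYCL or DPC++): copies the characters of
// `s` plus a terminating '\0' into heap-backed storage, so that the source
// pointer passed to a later copy does not point at a string literal.
inline std::vector<char> stage_on_heap(std::string_view s) {
    std::vector<char> buf(s.size() + 1, '\0');   // zero-filled, one extra byte
    std::copy(s.begin(), s.end(), buf.begin());  // terminator stays in place
    assert(buf.back() == '\0');
    return buf;
}

// Intended use with the queue and USM pointer from the example in the
// original post (shown as a comment so that this block compiles without a
// SYCL toolchain):
//
//   const std::vector<char> staged = stage_on_heap("12345678");
//   Q.memcpy(result, staged.data(), staged.size()).wait();
```

The staged buffer has to stay alive until the copy has completed, which is why the usage comment keeps it in a named variable and waits on the returned event.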

