Solved: dpct fails to create file with "invalid argument" on Windows 10

Jim · ‎04-26-2020

Hi all

I am trying to use dpct-beta05 on Windows 10 Version 1909 (OS build 18363.778).

The tested CUDA toolkit version is v10.2

The error looks like:

>dpct vector_add.cu
NOTE: Could not auto-detect compilation database for file 'vector_add.cu' in 'C:\Users\jim\Workspace\cudatest' or any parent directory.
The directory "dpct_output" is used as "out-root"
Processing: C:\Users\jim\Workspace\cudatest\vector_add.cu
[ERROR] Create file : d:\workspace\cudatest\dpct_outputc:\users\jim\workspace\cudatest\vector_add.dp.cpp fail: invalid argument
dpct exited with code: -2 (Error: Saving output file(s))

BTW, the same migration works fine on Ubuntu 18.04 with the same dpct and CUDA toolkit versions.

PS: dpct does not seem to have a verbose output mode, and I don't know of an strace equivalent on Windows.

Thanks

GouthamK_Intel · ‎04-27-2020

Hi Jim,

Thanks for reaching out to us!

Our team is working on your query and will get back to you.

Could you please share the source code, if possible? It would help us investigate the issue you are facing.

Regards

Goutham

Jim · ‎04-29-2020

Hi Goutham,

My code is really simple:

#include <cuda.h>
#include <stdio.h>
#define VECTOR_SIZE 256

__global__ void VectorAddKernel(float* A, float* B, float* C)
{
    A[threadIdx.x] = threadIdx.x + 1.0f;
    B[threadIdx.x] = threadIdx.x + 1.0f;
    C[threadIdx.x] = A[threadIdx.x] + B[threadIdx.x];
}

int main()
{
    float *d_A, *d_B, *d_C;
	
    cudaMalloc(&d_A, VECTOR_SIZE*sizeof(float));
    cudaMalloc(&d_B, VECTOR_SIZE*sizeof(float));
    cudaMalloc(&d_C, VECTOR_SIZE*sizeof(float));
    
    VectorAddKernel<<<1, VECTOR_SIZE>>>(d_A, d_B, d_C);
    
    float Result[VECTOR_SIZE] = { };
    cudaMemcpy(Result, d_C, VECTOR_SIZE*sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_A);
    cudaFree(d_B);
    cudaFree(d_C);

    for (int i = 0; i < VECTOR_SIZE; i++) {
        if (i % 16 == 0) {
            printf("\n");
        }
        printf("%f ", Result[i]);
    }
	
    return 0;
}

I suspect that some system call fails inside dpct on Windows 10, because dpct reports "fail: invalid argument".

Thanks

GouthamK_Intel · ‎04-30-2020

Hi Jim,

Thanks for providing the source code.

We were able to migrate the same source code successfully, without any errors. Below are my system environment details.

OS Version: Windows 10

oneAPI Basekit Version: 2021.1-beta05

CUDA Toolkit Version: 10.1

However, to investigate your issue further, we have escalated it to the relevant team.

Regards

Goutham

JenniferJ · ‎03-19-2021

Hello all,

This issue was fixed some time ago, and dpct now works on Windows. The generated code also compiles with dpcpp and runs correctly.

Each dpct release brings improvements and support for additional APIs. Please make sure to download and install the latest release of the oneAPI Base Toolkit.