Intel® oneAPI Base Toolkit
Support for the core tools and libraries within the base toolkit that are used to build and deploy high-performance data-centric applications.

BaseKit code samples fail to compile on devcloud

Skovhede__Kenneth

Following the guide: https://devcloud.intel.com/oneapi/get-started/base-toolkit/#fpga-vector-add-sample-walkthrough

 

After about an hour of waiting, the compile step fails with:

Error (213009): File name "output_files/afu_import.green_region.pmsf" does not exist or can't be read

 

Command used to submit the task:

qsub -l nodes=1:fpga_compile:ppn=2 -d . build-hw.sh

 

Contents of build-hw.sh:

#!/bin/bash
source /opt/intel/inteloneapi/setvars.sh
make -f Makefile.fpga clean
make -f Makefile.fpga hw

Working directory (from basekit code samples checkout):

BaseKit-code-samples/DPC++Compiler/vector-add

 

Is anyone else seeing this error, or does anyone know how to fix it?

(Debugging through the job queue is painful, and the long compile time makes it worse.)

 

RahulV_intel
Moderator

Hi,

Thanks for reaching out to us. We are working on this issue and we'll get back to you.

 

Rahul

RahulV_intel
Moderator

Hi,

I tried building the FPGA hardware sample in interactive mode, but I could not reproduce the issue. Please note that a hardware compile job can take a long time.

We suggest using this flag

-l walltime=hh:mm:ss

to increase the timeout for the batch job when running in non-interactive mode. Please also note that the maximum walltime available is 24 hours.

If you wish to use the interactive mode, kindly follow these steps:

qsub -I -l nodes=1:fpga:ppn=2 
source /opt/intel/inteloneapi/setvars.sh
chmod +x build_fpga_hw.sh
./build_fpga_hw.sh

Once the compilation is done (it might take several hours), run the other script file:

chmod +x run_fpga_hw.sh
./run_fpga_hw.sh

Also, kindly check and let us know whether you are able to build and run the vector-add application on other targets (GPU, FPGA emulator, etc.). If not, please delete the existing BaseKit samples directory and download it again from the GitHub repo.

 

Rahul

Skovhede__Kenneth

I tried explicitly setting the walltime to 6h (which is already the default), and the error still occurs after slightly less than 1h, regardless of the walltime argument:

qsub -l walltime=06:00:00 -l nodes=1:fpga_compile:ppn=2 -d . build-hw.sh

 

I did a clean checkout before attempting to build. I can build the emulated version and run it with no issues.

I can try the interactive mode, but I would be surprised if that has an effect on the compiler.

Skovhede__Kenneth

OK, I tried interactive mode and got the same error:

aoc: Compiling for FPGA. This process may take several hours to complete.  Prior to performing this compile, be sure to check the reports to ensure the design will meet your performance targets.  If the reports indicate performance targets are not being met, code edits may be required.  Please refer to the oneAPI FPGA Optimization Guide for information on performance tuning applications for FPGAs.
Error (213009): File name "output_files/afu_import.green_region.pmsf" does not exist or can't be read
Error: Quartus Prime Convert_programming_file was unsuccessful. 1 error, 0 warnings
Error (23031): Evaluation of Tcl script compile_script.tcl unsuccessful
Error: Quartus Prime Shell was unsuccessful. 1 error, 0 warnings
Error (23031): Evaluation of Tcl script build/entry.tcl unsuccessful
Error: Quartus Prime Shell was unsuccessful. 1 error, 0 warnings
Error: Compiler Error, not able to generate hardware

 

This happened after approximately 75 minutes in the interactive shell. I removed the entire BaseKit directory and ran `rm -rf ~/tmp/*` before doing a fresh git clone.

I'm not sure what to do next.

asenjo
Innovator

I'm in the same situation. Looking at the "Time Use" field of the qstat output, we've noticed that time is accounted faster than the wall clock indicates. For example, these two qstat queries were issued just 1 minute apart, but the reported elapsed time differs by almost 40 minutes:

u36780@login-1:~$ qstat
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
463775.v-qsvr-1            build_fpga_hw.sh u36780          06:03:49 R batch          

u36780@login-1:~/scrimp-oneAPI/src$ qstat
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
463775.v-qsvr-1            build_fpga_hw.sh u36780          06:42:55 R batch

A more visible example is the following one, where two jobs have been scheduled. The second qstat query shows that job 463774 has used ~30 minutes more (although it should be around 4-5 minutes). For job 463775, however, ~1h22min has been accounted over the same interval.

u36780@login-1:~/BaseKit-code-samples/DPC++Compiler/vector-add$ qstat
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
463774.v-qsvr-1            build_fpga_hw.sh u36780          02:17:59 R batch
463775.v-qsvr-1            build_fpga_hw.sh u36780          00:43:57 R batch

u36780@login-1:~/BaseKit-code-samples/DPC++Compiler/vector-add$ qstat
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
463774.v-qsvr-1            build_fpga_hw.sh u36780          02:49:13 R batch
463775.v-qsvr-1            build_fpga_hw.sh u36780          02:06:04 R batch

 

If time is accounted at this pace, 24h wouldn't be enough to compile even the vector-add example.

Thanks in advance for your help.
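
For anyone who wants to check the "Time Use" samples quoted in this post, here is a minimal bash sketch that subtracts two HH:MM:SS values. The function names hms_to_seconds and time_use_delta are hypothetical helpers, not part of qstat or any DevCloud tooling; the input values are copied from the qstat output above.

```shell
#!/bin/bash
# Convert an HH:MM:SS string (as shown in qstat's "Time Use" column) to seconds.
# The 10# prefix forces base 10 so that fields such as "08" are not read as octal.
hms_to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Print the difference between an earlier and a later HH:MM:SS sample,
# again formatted as HH:MM:SS. Assumes the second sample is the later one.
time_use_delta() {
    local d=$(( $(hms_to_seconds "$2") - $(hms_to_seconds "$1") ))
    printf '%02d:%02d:%02d\n' $(( d / 3600 )) $(( d % 3600 / 60 )) $(( d % 60 ))
}

time_use_delta 06:03:49 06:42:55   # job 463775, first pair:  prints 00:39:06
time_use_delta 02:17:59 02:49:13   # job 463774:              prints 00:31:14
time_use_delta 00:43:57 02:06:04   # job 463775, second pair: prints 01:22:07
```

The deltas only describe how much the "Time Use" counter advanced between two samples; they say nothing by themselves about how much wall-clock time passed.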

 

Skovhede__Kenneth

It could be a time-accounting issue, but I was not disconnected in interactive mode and still got the same error.

Dmitry_Savin
New Contributor I

I suppose it is the CPU time, not the wall time. The former can be up to the number of cores times the latter. Try running a single command under the "time" utility and compare the output to the qstat value.

UPD: Use "qstat -a" to get the elapsed wall time and "qstat -f" for more detailed output.
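
If "Time Use" really is CPU time summed over cores, then dividing its growth by the wall-clock interval gives a rough estimate of how many cores the job kept busy on average. This is a minimal sketch under that assumption. The helper name avg_busy_cores is hypothetical. The 2346 s figure is the 06:03:49 to 06:42:55 difference from the qstat samples earlier in the thread, and the 60 s wall interval is the approximate "1 minute" reported there, so the result is only indicative.

```shell
#!/bin/bash
# avg_busy_cores CPU_SECONDS WALL_SECONDS
# Hypothetical helper: average number of cores kept busy over an interval,
# assuming CPU_SECONDS is CPU time accumulated across all cores.
avg_busy_cores() {
    awk -v cpu="$1" -v wall="$2" 'BEGIN { printf "%.1f\n", cpu / wall }'
}

# 2346 s of accounted time over roughly 60 s of wall time
avg_busy_cores 2346 60   # prints 39.1
```

A ratio well above 1 is what you would expect from a multi-threaded compile when the counter is CPU time, and it would mean the walltime limit is not being consumed at that pace.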
asenjo
Innovator

OK, using "qstat -n -1" I can see the time passing as expected. Still, after 1h15min (~75 min) I keep getting the same error:

Error (213009): File name "output_files/afu_import.green_region.pmsf" does not exist or can't be read

This happens even with the provided, unmodified vector-add example. I have a saved bitstream (FPGA executable) from a compilation I did one month ago, so it used to compile, but it no longer does. Thanks.

agond2
Beginner

I keep getting the same error while compiling the provided vector-add example. I also compiled OpenCL kernels and got the same error, followed by "Quartus Prime Convert_programming_file was unsuccessful". I am attaching the Quartus quartus_sh_compile.log file.

Error (213009): File name "output_files/afu_import.green_region.pmsf" does not exist or can't be read
Error: Quartus Prime Convert_programming_file was unsuccessful. 1 error, 0 warnings
Error (23031): Evaluation of Tcl script compile_script.tcl unsuccessful
Error: Quartus Prime Shell was unsuccessful. 1 error, 0 warnings
Error (23031): Evaluation of Tcl script build/entry.tcl unsuccessful
Error: Quartus Prime Shell was unsuccessful. 1 error, 0 warnings
Error: Compiler Error, not able to generate hardware

PSath2
Beginner

I'm also having the same problem with a trivial vectorAdd OpenCL kernel, as well as with more significant ones that were compiling fine a few weeks ago.

It's pretty clearly a botched upgrade of some toolset. Is there a known workaround, such as using an older minor version of the SYCL/OpenCL toolchain that might still be available on the default queue nodes?

RahulV_intel
Moderator

Hi,

A new patch addressing this issue has been applied across the Devcloud FPGA nodes. You may now try submitting hardware compile jobs on nodes s001-147 to 156. Let us know if you face any issues.

Going forward, please post FPGA-related queries in the Intel® High Level Design forum.

 

Rahul
