Intel® Quartus® Prime Software

OpenCL memory error

Altera_Forum

Hello,  

 

This is my first thread. I'm new to OpenCL and I'm working on integrating it with some other software, so I'm trying to load an aocx from inside another program written in C++. When my code reaches this line

 

program = clCreateProgramWithBinary(context, 1, &device, (const size_t *)&binary_size, (const unsigned char **)&binary_buf, &binary_status, &status_ocl);

 

the program aborts with this error: *** Error in `../../bin/linux-arm/hpsfpga1dtest': munmap_chunk(): invalid pointer: 0x00294a98 ***
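A note on the call above: its `lengths` and `binaries` parameters are arrays with one entry per device (`const size_t *` and `const unsigned char **` in the OpenCL C API). Casts like the ones in the post compile even if `binary_size` or `binary_buf` has a different type, so a mismatch there would go unnoticed. The following is only a sketch of how the two arguments could be prepared without casts. `ProgramBinary` is a hypothetical helper, not part of OpenCL or AOCL_Utils, and nothing in the thread confirms that the casts are the cause of the crash.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not an OpenCL or AOCL_Utils type): owns the .aocx
// bytes and exposes them in the shapes expected by the `lengths` and
// `binaries` parameters when there is exactly one device.
struct ProgramBinary {
    std::vector<unsigned char> bytes;   // contents of the .aocx file
    std::size_t lengths[1];             // lengths[0] is the size in bytes
    const unsigned char* binaries[1];   // binaries[0] points at the first byte

    explicit ProgramBinary(std::vector<unsigned char> data)
        : bytes(std::move(data)) {
        lengths[0] = bytes.size();
        binaries[0] = bytes.data();
    }

    // Copying would leave binaries[0] pointing into another object's buffer.
    ProgramBinary(const ProgramBinary&) = delete;
    ProgramBinary& operator=(const ProgramBinary&) = delete;
};

// Intended use (requires the OpenCL headers, so shown as a comment only):
//   ProgramBinary bin(std::move(data));
//   cl_int binary_status = 0, status_ocl = 0;
//   program = clCreateProgramWithBinary(context, 1, &device,
//                                       bin.lengths, bin.binaries,
//                                       &binary_status, &status_ocl);
```

Because `lengths` and `binaries` already have the right types, the call needs no casts, and the compiler would flag any mismatch.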

 

I haven't found any help on Google and I'm a bit lost. My aocx file runs on an ARM/FPGA SoC system from Intel/Altera and was cross-compiled on another host. If I program it directly:

 

aocl program /dev/acl0 hello_world.aocx 

 

it returns this:  

aocl program: running reprogram from /root/opencl_arm32_rte/board/c5soc/arm32/bin 

reprogramming was successful! 

 

Can anyone tell me why I get this error? Maybe it has something to do with my own program, but I would need an explanation of this issue. Thanks!

 

Regards, Ricardo
Altera_Forum

Hello Again,  

 

Maybe I can give more info about the problem. I have cross-compiled a library on my host for the ARM device. I printed the address of every argument passed to clCreateProgramWithBinary:

 

context: 0x18dc0c 

size: 0x18dc34 

buffer: 0x18dc2c 

binary status: 0x18dc28 

status: 0x18dc20 

device: 0x18dc08 

program: 0x18dc18 

*** Error in `../../bin/linux-arm/hpsfpga1dtest': munmap_chunk(): invalid pointer: 0x00248eb8 *** 

 

Could the issue be related to how the host code is compiled? Do you know a way to find out?

I have also tried loading the binary another way, using AOCL_Utils from Altera:

 

 

scoped_array<cl_device_id> devices;
cl_uint num_devices;

devices.reset(getDevices(platform, CL_DEVICE_TYPE_ALL, &num_devices));

// We'll just use the first device.
device = devices[0];
std::string binary_file = getBoardBinaryFile("./hello_world", device);
printf("Using AOCX: %s\n", binary_file.c_str());
program = createProgramFromBinary(context, binary_file.c_str(), devices, num_devices);

 

However, I get the same result. I would be glad to receive any feedback. Thanks!

Ricardo
Altera_Forum

Hello Again,  

 

I have finally run valgrind to see what is happening. I have attached the valgrind output in case someone can help me understand what it is telling me. Thanks in advance!
Altera_Forum

Hi, 

If you're getting an invalid pointer error, it sounds like the problem is in the host code. I haven't dealt much with SoCs, so I'm not sure what differences may arise.

 

Judging by the valgrind report, the last call into AOCL_Utils is setCwdToExeDir(). I would remove that call from your main program; it is carried over from many of the examples (it's usually in the init() function in the OpenCL examples). To my understanding it just changes your working directory. I don't really know the purpose of this, but it may cause unexpected results later on. The code in the examples will probably look like this:

bool init() {
  cl_int status;
  if (!setCwdToExeDir()) {
    return false;
  }
  ...
}

 

To get a better understanding, it might be good to run gdb to see which pointer is invalid.

To get gdb running, first compile your host code with debug info (normally the -g flag). Then start the program as usual, but with the gdb command in front of it: gdb ./hpsfpga1dtest

Some signals need to be ignored, so enter handle SIG44 nostop and then run your code with r. When it hits the invalid pointer it should stop, and you can then enter bt to get a backtrace, which shows the last line of your code that was being executed.

If you don't ignore those signals, gdb stops at each one as if it were a breakpoint, and you need to press c to continue execution.

 

From there, what line does the program fail on? 

 

If I had to guess I'd say it almost looks like it can't find the aocx file?
Altera_Forum

Thanks a lot for your answer, I will try what you suggest. First I will compile without the directory change; it can cause problems for me because I don't run the program from the directory where the executable is. Second, I will get a backtrace with gdb. I have actually done that already, and the program fails in clCreateProgramWithBinary, but it also says CL_INVALID_PROGRAM, which is different from the result I get when it is executed normally. I will post the message as soon as I have tried your idea. Thanks a lot!

 

Regards, Ricardo
Altera_Forum

Hello Again,  

 

I have commented out this code:

 

if(!setCwdToExeDir()){ return false; } 

 

I have also explicitly opened the file by its absolute path:

 

fp = fopen("/opt/epics/support/hello_world.aocx", "r"); 
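For reference, here is a minimal sketch of reading the whole aocx into memory before handing it to OpenCL. It opens the file with "rb" rather than "r". On Linux the two modes behave the same, so this is unlikely to be the cause of the crash, but "rb" is the portable choice for binary data. `readBinaryFile` is a hypothetical name, not an AOCL_Utils function.

```cpp
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: read an entire file into memory in binary mode and
// fail loudly instead of returning a partial or empty buffer.
std::vector<unsigned char> readBinaryFile(const std::string& path) {
    std::FILE* fp = std::fopen(path.c_str(), "rb");
    if (!fp) {
        throw std::runtime_error("cannot open " + path);
    }

    std::vector<unsigned char> bytes;
    unsigned char chunk[4096];
    std::size_t n = 0;
    // fread returns the number of bytes actually read; 0 means EOF or error.
    while ((n = std::fread(chunk, 1, sizeof chunk, fp)) > 0) {
        bytes.insert(bytes.end(), chunk, chunk + n);
    }

    const bool failed = std::ferror(fp) != 0;
    std::fclose(fp);
    if (failed) {
        throw std::runtime_error("read error on " + path);
    }
    if (bytes.empty()) {
        throw std::runtime_error("empty file: " + path);
    }
    return bytes;
}
```

The returned vector keeps the size and the data together, so the length passed to OpenCL cannot drift from the buffer that was actually read.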

 

With gdb I can confirm that fp opens the file properly. Everything else is still the same. This is the error in gdb if I ignore SIGABRT:

*** Error in `/opt/epics/support/areaDetector/ADHPS_FPGA_1D/iocs/HPSFPGA1DIOC/bin/linux-arm/hpsfpga1dtest': munmap_chunk(): invalid pointer: 0x00247ed0 *** 

 

Program received signal SIGABRT, Aborted.  

 

If I stop on the signal, this is the output with the backtrace:

*** Error in `/opt/epics/support/areaDetector/ADHPS_FPGA_1D/iocs/HPSFPGA1DIOC/bin/linux-arm/hpsfpga1dtest': munmap_chunk(): invalid pointer: 0x00247ed0 *** 

 

Program received signal SIGABRT, Aborted.
0x75edda34 in raise () from /lib/libc.so.6
(gdb) bt
#0 0x75edda34 in raise () from /lib/libc.so.6
#1 0x75ededb0 in abort () from /lib/libc.so.6
#2 0x75f165f0 in ?? () from /lib/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) exit

 

Then I ran valgrind again and it gives me yet another result (I have attached the file).

This is the most significant part:

==529== Warning: whilst reading EXIDX: ExtabEntryDecode: failed with error code: -10
==529== Warning: whilst reading EXIDX: ExtabEntryDecode: failed with error code: -10
==529== Warning: whilst reading EXIDX: ExtabEntryDecode: failed with error code: -10
==529== Warning: whilst reading EXIDX: ExtabEntryDecode: failed with error code: -10
==529== Warning: whilst reading EXIDX: Implausible EXIDX last entry size 4294964079; using 1 instead.
--529-- Reading EXIDX entries: 564 attempted, 452 successful
--529-- REDIR: 0x58528c0 (libc.so.6:memset) redirected to 0x484b6bc (memset)
--529-- REDIR: 0x5859c40 (libc.so.6:memcpy) redirected to 0x484a0b4 (memcpy)
--529-- REDIR: 0x5851170 (libc.so.6:rindex) redirected to 0x48486e8 (rindex)
--529-- REDIR: 0x5850dc1 (libc.so.6:strlen) redirected to 0x4848cc8 (strlen)
--529-- REDIR: 0x585216c (libc.so.6:bcmp) redirected to 0x484b08c (bcmp)
--529-- REDIR: 0x58503f1 (libc.so.6:strcmp) redirected to 0x48499f8 (strcmp)
--529-- REDIR: 0x5850fe4 (libc.so.6:strncmp) redirected to 0x48492fc (strncmp)
--529-- REDIR: 0x584bb88 (libc.so.6:malloc) redirected to 0x4845320 (malloc)
--529-- REDIR: 0x58506f0 (libc.so.6:stpcpy) redirected to 0x484b164 (stpcpy)
--529-- REDIR: 0x5850700 (libc.so.6:strcpy) redirected to 0x4848dc4 (strcpy)
--529-- REDIR: 0x580aaa0 (libc.so.6:putenv) redirected to 0x484c930 (putenv)
--529-- REDIR: 0x5850310 (libc.so.6:index) redirected to 0x48488d8 (index)
--529-- REDIR: 0x5850e9c (libc.so.6:strnlen) redirected to 0x4848c68 (strnlen)
--529-- REDIR: 0x584c3fc (libc.so.6:realloc) redirected to 0x4847ab0 (realloc)
--529-- REDIR: 0x584c6e8 (libc.so.6:calloc) redirected to 0x48478a8 (calloc)
--529-- REDIR: 0x584c328 (libc.so.6:free) redirected to 0x4846894 (free)
--529-- REDIR: 0x58520d1 (libc.so.6:memchr) redirected to 0x4849bc0 (memchr)
--529-- REDIR: 0x5852580 (libc.so.6:memmove) redirected to 0x484b72c (memmove)
--529-- REDIR: 0x566299c (libstdc++.so.6:operator new(unsigned int)) redirected to 0x4845968 (operator new(unsigned int))
--529-- REDIR: 0x5851b48 (libc.so.6:strstr) redirected to 0x484c45c (strstr)
--529-- REDIR: 0x585484c (libc.so.6:strchrnul) redirected to 0x484bdc4 (strchrnul)
--529-- REDIR: 0x5851124 (libc.so.6:strncpy) redirected to 0x4848f3c (strncpy)
--529-- REDIR: 0x58547c0 (libc.so.6:rawmemchr) redirected to 0x484bdec (rawmemchr)
--529-- REDIR: 0x5660640 (libstdc++.so.6:operator delete(void*)) redirected to 0x4846df0 (operator delete(void*))
--529-- REDIR: 0x5851604 (libc.so.6:strspn) redirected to 0x484c69c (strspn)
--529-- REDIR: 0x585123c (libc.so.6:strpbrk) redirected to 0x484c5d0 (strpbrk)
==529== Syscall param rt_sigaction(act->sa_mask) points to uninitialised byte(s)
==529==    at 0x56F1A4C: __libc_sigaction (in /lib/libpthread-2.23.so)
==529==  Address 0x7dd8df8c is on thread 1's stack
==529==  Uninitialised value was created by a stack allocation
==529==    at 0x4E03AE6: aocl_mmd_open (in /root/opencl_arm32_rte/host/arm32/lib/libalterammdpcie.so)
==529==
==529== Syscall param write(buf) points to uninitialised byte(s)
==529==    at 0x56F0228: write (in /lib/libpthread-2.23.so)
==529==    by 0x4E024DB: ACL_PCIE_MM_IO_DEVICE::write_block(unsigned int, unsigned int, void*) (in /root/opencl_arm32_rte/host/arm32/lib/libalterammdpcie.so)
==529==  Address 0x7dd8d02c is on thread 1's stack
==529==  Uninitialised value was created by a stack allocation
==529==    at 0x4E0245E: ACL_PCIE_MM_IO_DEVICE::write_block(unsigned int, unsigned int, void*) (in /root/opencl_arm32_rte/host/arm32/lib/libalterammdpcie.so)
==529==
--529-- REDIR: 0x5662a88 (libstdc++.so.6:operator new(unsigned int)) redirected to 0x48461b8 (operator new(unsigned int))
--529-- REDIR: 0x5660648 (libstdc++.so.6:operator delete(void*)) redirected to 0x4847410 (operator delete(void*))
==529== Invalid read of size 4
==529==    at 0x56A405C: std::string::_Rep::_M_grab(std::allocator<char> const&, std::allocator<char> const&) (in /usr/lib/libstdc++.so.6.0.20)
==529==  Address 0x5aa7654 is 4 bytes before a block of size 20 alloc'd
==529==    at 0x484625C: operator new(unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
==529==
==529== Invalid read of size 4
==529==    at 0x56A3B68: std::string::_Rep::_M_refcopy() (in /usr/lib/libstdc++.so.6.0.20)
==529==  Address 0x5aa7654 is 4 bytes before a block of size 20 alloc'd
==529==    at 0x484625C: operator new(unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
==529==
==529== Invalid write of size 4
==529==    at 0x56A3B70: std::string::_Rep::_M_refcopy() (in /usr/lib/libstdc++.so.6.0.20)
==529==  Address 0x5aa7654 is 4 bytes before a block of size 20 alloc'd
==529==    at 0x484625C: operator new(unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
==529==
==529== Invalid read of size 4
==529==    at 0x56A3258: std::string::_Rep::_M_dispose(std::allocator<char> const&) (in /usr/lib/libstdc++.so.6.0.20)
==529==  Address 0x5aa7654 is 4 bytes before a block of size 20 alloc'd
==529==    at 0x484625C: operator new(unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
==529==
==529== Invalid write of size 4
==529==    at 0x56A3260: std::string::_Rep::_M_dispose(std::allocator<char> const&) (in /usr/lib/libstdc++.so.6.0.20)
==529==  Address 0x5aa7654 is 4 bytes before a block of size 20 alloc'd
==529==    at 0x484625C: operator new(unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
==529==
==529== Invalid read of size 4
==529==    at 0x56A2F94: std::string::compare(char const*) const (in /usr/lib/libstdc++.so.6.0.20)
==529==  Address 0x5aa764c is 12 bytes before a block of size 20 alloc'd
==529==    at 0x484625C: operator new(unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
==529==
==529== Conditional jump or move depends on uninitialised value(s)
==529==    at 0x48468E4: free (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
==529==  Uninitialised value was created by a heap allocation
==529==    at 0x48453C4: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
==529==
==529==

 

The result of the execution with valgrind:

ERROR: CL_INVALID_PROGRAM
Location: ../HPS_FPGA_1D.cpp:212
Failed to build program

 

However, I have tested my aocx with another program that loads it (in a very similar way) and it works properly. Do you know what else it could be? Thanks again for your help.
Altera_Forum

Hmm, it seems like it should work. It almost looks like the C/C++ and Altera libraries are having issues. When you say: "I have tested my aocx with another program that can load it (in a very similar way) and it works properly," is the other program using the same board with the SoC and are the host codes compiled on the ARM SoC? 

 

I've heard of a similar issue a while back where programming the board seemed to fail on the SoCs and I think the way they got around it was by flashing the FPGA with the aocx file you intend to run and then running the host code. I'm not sure of those details. What board and chip are you using?
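The invalid reads in the valgrind output all occur inside std::string::_Rep functions, which belong to the reference-counted std::string implementation in libstdc++. One possible, unconfirmed explanation is that the host program, its libraries, and the Altera runtime were built against different C++ runtimes or with different ABI settings. A rough way to compare builds is to compile a small report like the sketch below into both the working program and the failing one, using each one's own compiler and flags, and compare the output. `toolchainReport` is a hypothetical helper; the macros it prints are set by GCC and libstdc++ and are guarded in case another toolchain is used.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: describe the compiler and C++ runtime that this
// translation unit was built against, so two builds can be compared.
std::string toolchainReport() {
    std::ostringstream out;
#ifdef __VERSION__
    out << "compiler: " << __VERSION__ << '\n';
#endif
#ifdef __GLIBCXX__
    // Release date stamp of the libstdc++ headers used at compile time.
    out << "libstdc++ date stamp: " << __GLIBCXX__ << '\n';
#endif
#ifdef _GLIBCXX_USE_CXX11_ABI
    // 0 selects the old reference-counted std::string, 1 the newer layout.
    out << "_GLIBCXX_USE_CXX11_ABI: " << _GLIBCXX_USE_CXX11_ABI << '\n';
#endif
    out << "sizeof(std::string): " << sizeof(std::string) << '\n';
    out << "sizeof(void*): " << sizeof(void*) << '\n';
    return out.str();
}
```

If the two reports differ, especially in the ABI setting or in sizeof(std::string), passing std::string objects between those binaries is suspect. Identical reports would not rule out other causes.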
Altera_Forum

Hello Fand,  

 

Yes, the other program runs on the same SoC system with the same aocx. My program is just larger and calls more libraries. There is one more difference: in the application that works, init() is called from main(). In my program, init() is called from an object that belongs to a library.

Another important point is that the aocx must be loaded from code, as that gives enormous flexibility. I'm using a DE1-SoC (http://www.terasic.com.tw/cgi-bin/page/archive.pl?language=english&no=836).

 

Thanks for your help, I will try something else. So far I have included the init function without making it part of the class that is in the library. I will now try including it as a method of my class. I will tell you my results this evening.

 

Thanks for your help! 

Ricardo
Altera_Forum

I'm sorry to say that it didn't work either. I can't work out what is happening with this program. Could it be that it is too large? I have 1 GB of RAM, so I don't think so. I welcome any ideas for testing my program and making it work. Thanks in advance!

 

Regards, Ricardo