Valgrind, a well-known memory analysis tool, reports an invalid read inside the Oracle C API function OCIStmtPrepare. The same report can be observed in several other Oracle C API functions.
Please refer to the following stack trace.
As far as I can tell, the application creates a buffer of 317 bytes. When that buffer is passed to the Oracle library, the library copies it using the __intel_new_memcpy function. However, __intel_new_memcpy reads up to byte 320 (an 8-byte read starting at offset 312), while the memory actually allocated was only 317 bytes.
Could you please confirm whether this behaviour is correct? What is going wrong here?
==22195== Invalid read of size 8
==22195== at 0x68CD2D9: __intel_new_memcpy (in /x02/app/oracle/product/11.2.0/client_1/lib/libclntsh.so.11.1)
==22195== by 0x5D84158: kpurclientparse (in /x02/app/oracle/product/11.2.0/client_1/lib/libclntsh.so.11.1)
==22195== by 0x5D878DE: kpureq (in /x02/app/oracle/product/11.2.0/client_1/lib/libclntsh.so.11.1)
==22195== by 0x5D607FA: OCIStmtPrepare (in /x02/app/oracle/product/11.2.0/client_1/lib/libclntsh.so.11.1)
==22195== by 0x4099E0: DBCursor::Parse(char const*) (OCICPP.C:1020)
==22195== by 0x40CE29: DBCon::NewCursor(char const*, int) (OCICPP.C:753)
==22195== by 0x4047A6: main (main.cpp:59)
==22195== Address 0xa2e7e68 is 312 bytes inside a block of size 317 alloc'd
==22195== at 0x4C26E1C: operator new[](unsigned long) (vg_replace_malloc.c:305)
==22195== by 0x4EBD00F: String::Set(char const*, unsigned int) (String.cpp:544)
==22195== by 0x4EBD169: String::Set(char const*) (String.cpp:512)
==22195== by 0x4EBD188: String::operator=(char const*) (String.cpp:590)
==22195== by 0x404784: main (main.cpp:55)
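The numbers in the trace above can be checked with a little arithmetic. The following is only a sketch of that calculation; `bytes_past_end` is a hypothetical helper name and is not part of Valgrind, OCI, or the Intel runtime.

```cpp
#include <cassert>
#include <cstddef>

// Number of bytes that an access of `size` bytes, starting `offset` bytes
// into a block of `block_size` bytes, reads past the end of that block.
// Returns 0 when the access stays entirely inside the block.
constexpr std::size_t bytes_past_end(std::size_t offset,
                                     std::size_t size,
                                     std::size_t block_size) {
    const std::size_t end = offset + size;  // one past the last byte touched
    return end > block_size ? end - block_size : 0;
}
```

With the values from the trace, `bytes_past_end(312, 8, 317)` evaluates to 3: the read covers offsets 312 to 319, and the last three of those bytes lie outside the 317-byte block.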
This looks similar to this report:
https://sft.its.cern.ch/jira/browse/CORALCOOL-1191
I can't tell why that report was closed.
According to the JIRA link in issue https://software.intel.com/en-us/forums/intel-c-compiler/topic/698479#comment-1886701, this was reported as an Oracle SR and closed with no issue found.
I have also seen that Oracle SR. It was closed without a real explanation, saying only that there is no issue because Valgrind does not understand the advanced optimization techniques used in Oracle.
Can someone explain the actual implementation of __intel_new_memcpy? That would help us decide whether this needs to be resolved or can be ignored.
According to that Oracle SR,
********************************************************************************************************************************************************************
If you call malloc 317 bytes, it should actually allocate same number of bytes.
Since our memory management routines are very complex and are internal Valgrind tool is not able to determine what goes on.
You can safely ignore this warning, if you are not facing any error or memory leak.
If any error or leak, please upload the test code so that i can reproduce the issue and file a new bug.
********************************************************************************************************************************************************************
Actually, this is not a memory leak. The OCIStmtPrepare function tries to read memory beyond what was allocated. The 317-byte buffer is allocated by our application and passed to OCIStmtPrepare with that same length. The __intel_new_memcpy function tries to read 8 bytes starting at offset 312, so it reads up to byte 320. Could you please confirm whether the copying behaviour of __intel_new_memcpy is correct? Does it track the actual size allocated by the OS memory management system? Even if we call malloc for 317 bytes, does it actually allocate 320, with __intel_new_memcpy reading that same length (320)? If that is the actual implementation, we can simply ignore the Valgrind "Invalid Read". Please confirm.
********************************************************************************************************************************************************************
If there are no other symptoms these warnings can be safely ignored, and the valgrind documentation shows how this can be done automatically so that only messages pertaining to your own code will be displayed.
Valgrind is not able to determine what goes on in optimised code, and our memory management routines are complex.
If you believe you have encountered a memory leak or other error, a valgrind report on its own is not sufficient to raise a bug.
We will need a reproducing testcase that demonstrates the problem that valgrind is claiming will happen.
For a supposed memory leak this is fairly easy. Just identify the statement that valgrind says will leak memory, then call that statement repeatedly in an infinite loop, together with appropriate cleanup code.
For example if you are testing a connect statement, you must include in the loop a disconnect, or if you are testing createEnvironment then you must also include terminateEnvironment.
To demonstrate a memory leak this must show unbounded growth of memory usage until the process crashes due to a lack of memory.
Reference:
Note.1300407.1 Valgrind Throws Lots Of Errors For The OCCI Library
********************************************************************************************************************************************************************
Thanks, Sergey, for the clarification.
Please note that I do not have any issue with the stability of the OCI API. I'm just trying to understand what is going on when my application, which uses OCI, runs under Valgrind.
My understanding is:
The memory was allocated by the default allocation function (new) of the default GNU C++ runtime. When the program runs under Valgrind, Valgrind intercepts that allocation with its own function and records the requested length, so that it can check every later read of the block by any function.
My concern is this: can buffers that were allocated by a non-Intel function be passed to __intel_new_memcpy, or does that function assume the memory was allocated by _mm_malloc?
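To make the allocation side of this concrete, here is a minimal sketch that asks the allocator how many bytes it actually made usable for a given request. It assumes glibc, whose `malloc_usable_size` is a non-standard extension declared in `<malloc.h>`; `usable_size_for` is a hypothetical helper name. It says nothing about what `__intel_new_memcpy` itself assumes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <malloc.h>  // malloc_usable_size: glibc extension, not standard C++

// Returns how many bytes the allocator reports as usable for a request of
// `requested` bytes, or 0 if the allocation fails.
std::size_t usable_size_for(std::size_t requested) {
    void* p = std::malloc(requested);
    if (p == nullptr) {
        return 0;
    }
    const std::size_t usable = malloc_usable_size(p);
    std::free(p);
    return usable;
}
```

Outside Valgrind, a call such as `usable_size_for(317)` usually reports a somewhat larger value than 317, because the allocator rounds requests up to its own granularity. Memcheck, by contrast, replaces malloc and treats only the requested bytes as addressable, which is why a read a few bytes past the requested length is reported even when the native allocator would have had padding there.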
This memcpy function surely would take advantage of aligned memory, but should not require it. An aligned malloc would take additional bytes as required to get alignment, but a proper matching of malloc and free functions would free those extra unused bytes.
Rambling thoughts:
Can you look at the assembly code to see if the compiler generated a non-masked load followed by a masked store? Note, you may have several loads into different ymm/zmm registers, followed by several stores (the last in the sequence being masked).
Note, an aligned load cannot cross a page boundary, and thus would be safe. The masked aligned store would protect the memory following the allocated node. The use of the aligned load without mask (and past end of data) may be an optimization.
Jim Dempsey
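The page-boundary argument above can be checked numerically. This sketch assumes a 4096-byte page, which is typical on x86-64 Linux but is an assumption, not something stated in the thread; `crosses_page` and `kPageSize` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed page size; 4096 bytes is typical on x86-64 Linux but not guaranteed.
constexpr std::uintptr_t kPageSize = 4096;

// True if reading `size` bytes starting at address `addr` touches more than
// one page, i.e. the first and last byte of the access lie in different pages.
constexpr bool crosses_page(std::uintptr_t addr, std::size_t size) {
    return size != 0 &&
           (addr / kPageSize) != ((addr + size - 1) / kPageSize);
}
```

For the reported access, `crosses_page(0xa2e7e68, 8)` is false: the address is 8-byte aligned, so the 8-byte read stays inside one page. More generally, a load whose size is a power of two no larger than the page size, at an address aligned to that size, cannot straddle two pages, so such an over-read cannot fault on an unmapped page even though it extends past the end of the allocation.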
Thank you all for the clarifications.
I was able to re-create it even with the Oracle 12c libraries.
==26333== Invalid read of size 16
==26333== at 0x75FA410: __intel_ssse3_rep_memcpy (in /x02/app/oracle/product/12.1.0.2/client_1/lib/libclntsh.so.12.1)
==26333== by 0x75F3F25: _intel_fast_memcpy.P (in /x02/app/oracle/product/12.1.0.2/client_1/lib/libclntsh.so.12.1)
==26333== by 0x69FD17C: kpurclientparse (in /x02/app/oracle/product/12.1.0.2/client_1/lib/libclntsh.so.12.1)
==26333== by 0x69FE9DE: kpureq (in /x02/app/oracle/product/12.1.0.2/client_1/lib/libclntsh.so.12.1)
==26333== by 0x69D59CE: OCIStmtPrepare (in /x02/app/oracle/product/12.1.0.2/client_1/lib/libclntsh.so.12.1)
==26333== by 0x6273F63: soci::oracle_statement_backend::prepare(std::string const&, soci::details::statement_type) (statement.cpp:65)
==26333== by 0x5D8877: prepare (statement.h:163)
==26333== by 0x52B8ED: Loadup::Load() (Loadup.cpp:92)
==26333== by 0xB9027B5: start_thread (in /lib64/libpthread-2.11.3.so)
==26333== Address 0xfe655b0 is 224 bytes inside a block of size 237 alloc'd
==26333== at 0x4C2936F: operator new(unsigned long) (vg_replace_malloc.c:324)
==26333== by 0xAEA90C8: allocate (new_allocator.h:104)
==26333== by 0xAEA90C8: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (basic_string.tcc:607)
==26333== by 0x57DB574: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /x04/exreg/libs/libboost_filesystem.so.1.59.0)
==26333== by 0xAEAAE75: _S_construct_aux<char const*> (basic_string.h:1743)
==26333== by 0xAEAAE75: _S_construct<char const*> (basic_string.h:1764)
==26333== by 0xAEAAE75: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (basic_string.tcc:215)
==26333== by 0x52B8ED: Loadup::Load() (Loadup.cpp:90)
==26333== by 0xB9027B5: start_thread (in /lib64/libpthread-2.11.3.so)
Hi Sergey,
Is this the code in ICC's __intel_new_memcpy? Or any other compiler?
Sergey,
Can you examine the assembly code to see if the corresponding write of the memcpy extends beyond the allocation?
Note, if the generated code assures that the last vector to be read is short .AND. will not span a page boundary, then it is (CPU-wise) benign to read beyond allocation .PROVIDED. the write in the memcpy does not also go beyond the allocated memory (length of source). IOW it is safe to perform a load pd followed (potentially later) by store sd.
Jim Dempsey
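The "wide load, narrow store" pattern described above can be illustrated in portable C++. This is only a sketch of the idea, not the actual implementation of `_intel_fast_memcpy` or `__intel_ssse3_rep_memcpy`, which is not shown anywhere in this thread. `copy_tail` is a hypothetical helper, and it requires the caller to pad the source so that the example itself never reads unowned memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies the final `n` bytes (1..8) of a source whose last 8-byte word starts
// at `src_word`. The read always covers the full 8-byte word; the write covers
// only the `n` valid bytes.
// Precondition for this sketch: all 8 bytes at `src_word` are readable
// (the caller pads the source), so nothing outside owned memory is touched.
void copy_tail(unsigned char* dst, const unsigned char* src_word, std::size_t n) {
    assert(n >= 1 && n <= 8);
    std::uint64_t word;
    std::memcpy(&word, src_word, sizeof word);  // wide read: always 8 bytes
    std::memcpy(dst, &word, n);                 // narrow write: only n bytes
}
```

The destination bytes beyond `n` are left untouched, which is the property that matters for correctness. An optimized library routine may apply the same idea without the padding precondition, relying on alignment to keep the extra read inside a mapped page; a checker that tracks the exact allocation length, such as Memcheck, would still report that read.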
Please note that these Intel functions (__intel_ssse3_rep_memcpy and _intel_fast_memcpy) were called by the Oracle client libraries; my code does not interact with ICC directly. That is why I reported this as an Oracle SR, but their response was to ignore it, saying that Oracle's memory management is complex and cannot be examined with Valgrind. I was under the impression that an Intel Black Belt engineer might know the actual implementation of _intel_fast_memcpy and could answer. That is why I posted this in this forum.
OK. I can examine the assembly code of the Oracle C library, but it will be extremely hard. I'll try and get back to you.
>> However even I can examine the assembly code of Oracle C library but will be extremely harder.
This is relatively easy. Start the program in the debugger (step into), but do not run it. Instead (assuming you are running the same program that produced the report above), disassemble the instructions around the reported location (at 0x75FA410). It would be best to start several instructions earlier (at a lower address) and follow through to later ones. You should see a sequence of instructions moving memory into a group of SSE (xmm) or AVX (ymm) registers, followed by a series of moves from those registers back to memory, followed by an add or sub on a register and a branch back.
Jim Dempsey