Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

valgrind runs clean on 32-bit MKL test but not on 64-bit one

Rhys_Ulerich
I have an MKL-based test case that reports valgrind memcheck errors on 64-bit systems, but which runs cleanly on a 32-bit system. Ignoring valgrind's protests, the unit test behaves correctly on both systems. What are the odds that the problem is in MKL and not specifically in my program? The problem valgrind reports is below a DSCAL call:

Invalid read of size 8
at 0x4015F04: (within /lib/ld-2.7.so)
by 0x400ABB3: (within /lib/ld-2.7.so)
by 0x4006204: (within /lib/ld-2.7.so)
by 0x4008697: (within /lib/ld-2.7.so)
by 0x4012068: (within /lib/ld-2.7.so)
by 0x400DE15: (within /lib/ld-2.7.so)
by 0x401193A: (within /lib/ld-2.7.so)
by 0x75F0F8A: (within /lib/libdl-2.7.so)
by 0x400DE15: (within /lib/ld-2.7.so)
by 0x75F14EC: (within /lib/libdl-2.7.so)
by 0x75F0EF0: dlopen (in /lib/libdl-2.7.so)
by 0x5BD723F: mkl_serv_load_dll (in /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_core.so)
by 0x59BF24C: mkl_blas_dscal (in /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_lapack.so)
by 0x51E6944: DSCAL (in /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_intel_lp64.so)
by 0x5400A7: suzerain_blas_dscal (blas_et_al.c:370)
by 0x4EDC23: void suzerain::blas::scal(int, double, double*, int) (blas_et_al.hpp:292)
by 0x4EE656: suzerain::RealState::scale(double) (state.hpp:310)
by 0x47C883: RealState::scale::test_method() (test_state.cpp:138)
by 0x47E2EF: RealState::scale_invoker() (test_state.cpp:128)
by 0x4F3CBC: boost::unit_test::ut_detail::unused boost::unit_test::ut_detail::invoker<:UNIT_TEST::UT_DETAIL::UNUSED>::invoke(void (*)()&) (callback.hpp:56)
by 0x4F4888: boost::unit_test::ut_detail::callback0_impl_t<:UNIT_TEST::UT_DETAIL::UNUSED>::invoke() (callback.hpp:89)
by 0x4F4D91: boost::unit_test::callback0<:UNIT_TEST::UT_DETAIL::UNUSED>::operator()() const (callback.hpp:118)
by 0x5186A2: boost::unit_test::(anonymous namespace)::zero_return_wrapper_t<:UNIT_TEST::CALLBACK0><:UNIT_TEST::UT_DETAIL::UNUSED> >::operator()() (unit_test_monitor.ipp:41)
by 0x4F3C91: int boost::unit_test::ut_detail::invoker::invoke<:UNIT_TEST::> > >(boost::unit_test::(anonymous namespace)::zero_return_wrapper_t<:UNIT_TEST::CALLBACK0><:UNIT_TEST::UT_DETAIL::UNUSED> >&) (callback.hpp:42)
by 0x4F485C: boost::unit_test::ut_detail::callback0_impl_t > >::invoke() (callback.hpp:89)

Address 0x7ad77d0 is 48 bytes inside a block of size 50 alloc'd
at 0x4C22FAB: malloc (vg_replace_malloc.c:207)
by 0x4005F67: (within /lib/ld-2.7.so)
by 0x400887F: (within /lib/ld-2.7.so)
by 0x4012068: (within /lib/ld-2.7.so)
by 0x400DE15: (within /lib/ld-2.7.so)
by 0x401193A: (within /lib/ld-2.7.so)
by 0x75F0F8A: (within /lib/libdl-2.7.so)
by 0x400DE15: (within /lib/ld-2.7.so)
by 0x75F14EC: (within /lib/libdl-2.7.so)
by 0x75F0EF0: dlopen (in /lib/libdl-2.7.so)
by 0x5BD723F: mkl_serv_load_dll (in /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_core.so)
by 0x59BF24C: mkl_blas_dscal (in /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_lapack.so)
by 0x51E6944: DSCAL (in /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_intel_lp64.so)
by 0x5400A7: suzerain_blas_dscal (blas_et_al.c:370)
by 0x4EDC23: void suzerain::blas::scal(int, double, double*, int) (blas_et_al.hpp:292)
by 0x4EE656: suzerain::RealState::scale(double) (state.hpp:310)
by 0x47C883: RealState::scale::test_method() (test_state.cpp:138)
by 0x47E2EF: RealState::scale_invoker() (test_state.cpp:128)
by 0x4F3CBC: boost::unit_test::ut_detail::unused boost::unit_test::ut_detail::invoker<:UNIT_TEST::UT_DETAIL::UNUSED>::invoke(void (*)()&) (callback.hpp:56)
by 0x4F4888: boost::unit_test::ut_detail::callback0_impl_t<:UNIT_TEST::UT_DETAIL::UNUSED>::invoke() (callback.hpp:89)
by 0x4F4D91: boost::unit_test::callback0<:UNIT_TEST::UT_DETAIL::UNUSED>::operator()() const (callback.hpp:118)
by 0x5186A2: boost::unit_test::(anonymous namespace)::zero_return_wrapper_t<:UNIT_TEST::CALLBACK0><:UNIT_TEST::UT_DETAIL::UNUSED> >::operator()() (unit_test_monitor.ipp:41)
by 0x4F3C91: int boost::unit_test::ut_detail::invoker::invoke<:UNIT_TEST::> > >(boost::unit_test::(anonymous namespace)::zero_return_wrapper_t<:UNIT_TEST::CALLBACK0><:UNIT_TEST::UT_DETAIL::UNUSED> >&) (callback.hpp:42)
by 0x4F485C: boost::unit_test::ut_detail::callback0_impl_t > >::invoke() (callback.hpp:89)
by 0x4F4DE5: boost::unit_test::callback0::operator()() const (callback.hpp:118)

Moreover, on a 64-bit system, when I use valgrind's --db-attach=yes option and then enter gdb at the problematic point, I get a funky-looking backtrace:

(gdb) bt
#0 0x0000000004015f04 in ?? ()
#1 0x000000000400abb4 in ?? ()
#2 0x0000000290000001 in ?? ()
#3 0x0000000005d90600 in ?? ()
#4 0x0000000007ad77a0 in ?? ()
#5 0x000000000000000e in ?? ()
#6 0x000000000401f210 in ?? ()
#7 0x0000000000000000 in ?? ()


Loading symbols from my binary doesn't help. Any ideas for how to get a meaningful context from this backtrace? gdb's unaware of any other threads, and the reported backtrace from valgrind clearly indicates there's more than 7 frames of context available.

I'm happy to provide the source that caused the issue, though I've got to admit it's got a long dependency chain to make autoconf pleased enough to build it.

Thanks,
Rhys
4 Replies
barragan_villanueva_
Hi,

What version of valgrind are you using? For a 64-bit test you should use a 64-bit build of valgrind.

-- Victor
Rhys_Ulerich
I am using a 64-bit valgrind on the 64-bit system:

[282 rhys@gauss ~][255]$ uname -a
Linux gauss.HIDDEN 2.6.24-24-generic #1 SMP Tue Aug 18 16:22:17 UTC 2009 x86_64 GNU/Linux
[283 rhys@gauss ~]$ which valgrind
/usr/bin/valgrind
[284 rhys@gauss ~]$ which valgrind.bin
/usr/bin/valgrind.bin
[285 rhys@gauss ~]$ /usr/bin/valgrind.bin --version
valgrind-3.3.0-Debian
[286 rhys@gauss ~]$ file /usr/bin/valgrind.bin
/usr/bin/valgrind.bin: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), for GNU/Linux 2.6.8, dynamically linked (uses shared libs), not stripped
[287 rhys@gauss ~]$ ldd /usr/bin/valgrind.bin
linux-vdso.so.1 => (0x00007fffe4dfb000)
libc.so.6 => /lib/libc.so.6 (0x00002b72c5f21000)
/lib64/ld-linux-x86-64.so.2 (0x00002b72c5d02000)


barragan_villanueva_
Hi,

The valgrind warnings are possibly false positives caused by valgrind misinterpreting compiler-generated code, e.g. code produced by Intel's compiler.
I have seen similar false warnings on simple parallel code, such as the following:

==30321== Use of uninitialised value of size 8
==30321== at 0x44DFC3: exit (in ddot)

==30500== Conditional jump or move depends on uninitialised value(s)
==30500== at 0x40158E: __intel_new_proc_init.H (in ddot)
==30500== by 0x400F85: main (in ddot)

So I'd suggest using the latest valgrind, 3.5.0.

Thanks
-- Victor
Rhys_Ulerich
Hi Victor,

Valgrind 3.5.0 runs cleanly against my test on the 64-bit system where Valgrind 3.3.0 did not. Thank you for the suggestion to move up.

Appreciate it,
Rhys