I have an MKL-based test case that reports valgrind memcheck errors on 64-bit systems, but which runs cleanly on a 32-bit system. Ignoring valgrind's protests, the unit test behaves correctly on both systems. What are the odds that the problem is in MKL and not specifically in my program? The problem valgrind reports is below a DSCAL call:
Invalid read of size 8
at 0x4015F04: (within /lib/ld-2.7.so)
by 0x400ABB3: (within /lib/ld-2.7.so)
by 0x4006204: (within /lib/ld-2.7.so)
by 0x4008697: (within /lib/ld-2.7.so)
by 0x4012068: (within /lib/ld-2.7.so)
by 0x400DE15: (within /lib/ld-2.7.so)
by 0x401193A: (within /lib/ld-2.7.so)
by 0x75F0F8A: (within /lib/libdl-2.7.so)
by 0x400DE15: (within /lib/ld-2.7.so)
by 0x75F14EC: (within /lib/libdl-2.7.so)
by 0x75F0EF0: dlopen (in /lib/libdl-2.7.so)
by 0x5BD723F: mkl_serv_load_dll (in /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_core.so)
by 0x59BF24C: mkl_blas_dscal (in /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_lapack.so)
by 0x51E6944: DSCAL (in /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_intel_lp64.so)
by 0x5400A7: suzerain_blas_dscal (blas_et_al.c:370)
by 0x4EDC23: void suzerain::blas::scal(int, double, double*, int) (blas_et_al.hpp:292)
by 0x4EE656: suzerain::RealState::scale(double) (state.hpp:310)
by 0x47C883: RealState::scale::test_method() (test_state.cpp:138)
by 0x47E2EF: RealState::scale_invoker() (test_state.cpp:128)
by 0x4F3CBC: boost::unit_test::ut_detail::unused boost::unit_test::ut_detail::invoker<:UNIT_TEST::UT_DETAIL::UNUSED>::invoke(void (*)()&) (callback.hpp:56)
by 0x4F4888: boost::unit_test::ut_detail::callback0_impl_t<:UNIT_TEST::UT_DETAIL::UNUSED>::invoke() (callback.hpp:89)
by 0x4F4D91: boost::unit_test::callback0<:UNIT_TEST::UT_DETAIL::UNUSED>::operator()() const (callback.hpp:118)
by 0x5186A2: boost::unit_test::(anonymous namespace)::zero_return_wrapper_t<:UNIT_TEST::CALLBACK0><:UNIT_TEST::UT_DETAIL::UNUSED> >::operator()() (unit_test_monitor.ipp:41)
by 0x4F3C91: int boost::unit_test::ut_detail::invoker::invoke<:UNIT_TEST::> > >(boost::unit_test::(anonymous namespace)::zero_return_wrapper_t<:UNIT_TEST::CALLBACK0><:UNIT_TEST::UT_DETAIL::UNUSED> >&) (callback.hpp:42)
by 0x4F485C: boost::unit_test::ut_detail::callback0_impl_t> >::invoke() (callback.hpp:89)
Address 0x7ad77d0 is 48 bytes inside a block of size 50 alloc'd
at 0x4C22FAB: malloc (vg_replace_malloc.c:207)
by 0x4005F67: (within /lib/ld-2.7.so)
by 0x400887F: (within /lib/ld-2.7.so)
by 0x4012068: (within /lib/ld-2.7.so)
by 0x400DE15: (within /lib/ld-2.7.so)
by 0x401193A: (within /lib/ld-2.7.so)
by 0x75F0F8A: (within /lib/libdl-2.7.so)
by 0x400DE15: (within /lib/ld-2.7.so)
by 0x75F14EC: (within /lib/libdl-2.7.so)
by 0x75F0EF0: dlopen (in /lib/libdl-2.7.so)
by 0x5BD723F: mkl_serv_load_dll (in /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_core.so)
by 0x59BF24C: mkl_blas_dscal (in /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_lapack.so)
by 0x51E6944: DSCAL (in /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_intel_lp64.so)
by 0x5400A7: suzerain_blas_dscal (blas_et_al.c:370)
by 0x4EDC23: void suzerain::blas::scal(int, double, double*, int) (blas_et_al.hpp:292)
by 0x4EE656: suzerain::RealState::scale(double) (state.hpp:310)
by 0x47C883: RealState::scale::test_method() (test_state.cpp:138)
by 0x47E2EF: RealState::scale_invoker() (test_state.cpp:128)
by 0x4F3CBC: boost::unit_test::ut_detail::unused boost::unit_test::ut_detail::invoker<:UNIT_TEST::UT_DETAIL::UNUSED>::invoke(void (*)()&) (callback.hpp:56)
by 0x4F4888: boost::unit_test::ut_detail::callback0_impl_t<:UNIT_TEST::UT_DETAIL::UNUSED>::invoke() (callback.hpp:89)
by 0x4F4D91: boost::unit_test::callback0<:UNIT_TEST::UT_DETAIL::UNUSED>::operator()() const (callback.hpp:118)
by 0x5186A2: boost::unit_test::(anonymous namespace)::zero_return_wrapper_t<:UNIT_TEST::CALLBACK0><:UNIT_TEST::UT_DETAIL::UNUSED> >::operator()() (unit_test_monitor.ipp:41)
by 0x4F3C91: int boost::unit_test::ut_detail::invoker::invoke<:UNIT_TEST::> > >(boost::unit_test::(anonymous namespace)::zero_return_wrapper_t<:UNIT_TEST::CALLBACK0><:UNIT_TEST::UT_DETAIL::UNUSED> >&) (callback.hpp:42)
by 0x4F485C: boost::unit_test::ut_detail::callback0_impl_t> >::invoke() (callback.hpp:89)
by 0x4F4DE5: boost::unit_test::callback0::operator()() const (callback.hpp:118)
Moreover, on a 64-bit system, when I use valgrind's --db-attach=yes option and then enter gdb at the problematic point, I get a funky-looking backtrace:
(gdb) bt
#0 0x0000000004015f04 in ?? ()
#1 0x000000000400abb4 in ?? ()
#2 0x0000000290000001 in ?? ()
#3 0x0000000005d90600 in ?? ()
#4 0x0000000007ad77a0 in ?? ()
#5 0x000000000000000e in ?? ()
#6 0x000000000401f210 in ?? ()
#7 0x0000000000000000 in ?? ()
Loading symbols from my binary doesn't help. Any ideas for how to get a meaningful context from this backtrace? gdb's unaware of any other threads, and the reported backtrace from valgrind clearly indicates there's more than 7 frames of context available.
I'm happy to provide the source that caused the issue, though I've got to admit it's got a long dependency chain to make autoconf pleased enough to build it.
Thanks,
Rhys
1 Solution
Hi,
The Valgrind warnings are possibly caused by Valgrind misinterpreting compiler-generated code, e.g. code produced by Intel's compiler.
I was able to see some false warnings on simple parallel code, such as the following:
==30321== Use of uninitialised value of size 8
==30321== at 0x44DFC3: exit (in ddot)
==30500== Conditional jump or move depends on uninitialised value(s)
==30500== at 0x40158E: __intel_new_proc_init.H (in ddot)
==30500== by 0x400F85: main (in ddot)
So I'd suggest using the latest Valgrind, 3.5.0.
Thanks
-- Victor
4 Replies
Hi,
What version of valgrind do you use? And for a 64-bit test you should use a 64-bit version of valgrind.
-- Victor
Quoting - Victor Pasko (Intel)
Hi,
What version of valgrind do you use? And for a 64-bit test you should use a 64-bit version of valgrind.
-- Victor
I am using a 64-bit valgrind on the 64-bit system:
[282 rhys@gauss ~][255]$ uname -a
Linux gauss.HIDDEN 2.6.24-24-generic #1 SMP Tue Aug 18 16:22:17 UTC 2009 x86_64 GNU/Linux
[283 rhys@gauss ~]$ which valgrind
/usr/bin/valgrind
[284 rhys@gauss ~]$ which valgrind.bin
/usr/bin/valgrind.bin
[285 rhys@gauss ~]$ /usr/bin/valgrind.bin --version
valgrind-3.3.0-Debian
[286 rhys@gauss ~]$ file /usr/bin/valgrind.bin
/usr/bin/valgrind.bin: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), for GNU/Linux 2.6.8, dynamically linked (uses shared libs), not stripped
[287 rhys@gauss ~]$ ldd /usr/bin/valgrind.bin
linux-vdso.so.1 => (0x00007fffe4dfb000)
libc.so.6 => /lib/libc.so.6 (0x00002b72c5f21000)
/lib64/ld-linux-x86-64.so.2 (0x00002b72c5d02000)
Hi Victor,
Valgrind 3.5.0 runs cleanly against my test on the 64-bit system where Valgrind 3.3.0 did not. Thank you for the suggestion to move up.
Appreciate it,
Rhys
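For anyone who cannot upgrade Valgrind, a suppression file is a possible stopgap for the ld.so report shown at the top of the thread. The following is only a sketch under assumptions: the suppression name and the file name mkl_dlopen.supp are made up for illustration, and the single `obj:` frame is deliberately broad, so it would hide every 8-byte invalid read whose innermost frame is in /lib/ld-2.7.so, not only the one reached through mkl_serv_load_dll.

```
{
   hypothetical_ld_so_dlopen_invalid_read8
   Memcheck:Addr8
   obj:/lib/ld-2.7.so
}
```

It would be passed as `valgrind --suppressions=mkl_dlopen.supp ./your_test`. Running once with `--gen-suppressions=all` makes Valgrind print a suppression matching the exact call stack, which is a safer starting point than the broad entry above.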
