In my Fortran code, I have the line nxi = sqrt(nxi), with nxi = 0.000000000000000D+00. I got the floating-point invalid operation error shown below. What should I do to avoid this error? Thanks.
Unhandled exception at 0x00007FF7E9EFECBB in bsam20_2022_08_03_fe9fe6e.exe: 0xC0000090: Floating-point invalid operation (parameters: 0x0000000000000000, 0x0000000000009961).
BTW, this error only happens when my simulation model is big. When the model is small, there is no error.
Carly
We need more information regarding the circumstances in which the error occurs.
The following program fits the description that you gave, except for "model is big", and shows no error.
program cgao
double precision :: n = 0.0d0
n = sqrt(n)
print *,n
end program
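As a hedged illustration of the point above: under IEEE 754 arithmetic, sqrt(0) is exact and raises no exception, whereas sqrt of a negative (or NaN) argument raises "invalid". The sketch below assumes a compiler that supports the Fortran 2003 ieee_arithmetic module; the program name, variable names, and the value -1.0d-12 are made up for illustration. Halting is switched off so the flag can be inspected instead of trapping.

```fortran
program sqrt_invalid_demo
  use, intrinsic :: ieee_arithmetic
  implicit none
  ! volatile keeps the compiler from evaluating sqrt() at compile time
  real(8), volatile :: zero, slightly_negative
  real(8) :: r
  logical :: invalid_raised

  zero = 0.0d0
  slightly_negative = -1.0d-12   ! hypothetical round-off residue

  ! Do not halt on invalid; we only want to look at the flag.
  if (ieee_support_halting(ieee_invalid)) then
     call ieee_set_halting_mode(ieee_invalid, .false.)
  end if

  ! Case 1: exact zero
  call ieee_set_flag(ieee_invalid, .false.)
  r = sqrt(zero)
  call ieee_get_flag(ieee_invalid, invalid_raised)
  print *, 'sqrt(0.0d0)    =', r, ' invalid flag: ', invalid_raised

  ! Case 2: tiny negative value
  call ieee_set_flag(ieee_invalid, .false.)
  r = sqrt(slightly_negative)
  call ieee_get_flag(ieee_invalid, invalid_raised)
  print *, 'sqrt(-1.0d-12) =', r, ' invalid flag: ', invalid_raised
end program sqrt_invalid_demo
```

So a floating-point invalid trap reported on a sqrt line suggests either that the argument was negative or NaN at the moment of the call, or that the reported location is misleading.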
An error like this on an obviously valid statement is often a sign of stack corruption elsewhere in the program. The fact that it appears only when "the model is big" reinforces that suspicion.
Hi,
I think it is related to how I should set up some data options in Visual Studio 2019. I used the following lines to avoid the error or warning from sqrt:
if (nxi.lt.1e-10) nxi = 0.d0
nxi = sqrt(nxi)
However, the code then crashed in another place:
do i = 1,6
enormal(i) = xi(i)/nxi
end do
where nxi=6.429822199431394D-002,
Name | Value | Type
---|---|---
xi(1) | -9.269196949204623D-003 | REAL(8)
xi(2) | 8.046774271464946D-003 | REAL(8)
xi(3) | 1.222422677739678D-003 | REAL(8)
xi(4) | -4.427044871218580D-005 | REAL(8)
xi(5) | -2.973747192262531D-005 | REAL(8)
xi(6) | 4.462114273406856D-002 | REAL(8)
and enormal was declared as real(8) enormal at the beginning.
I think I need to set up some flags or options to avoid this. Do you know how to do that? Thanks.
Carly
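A minimal sketch of a guarded version of the two fragments above, using the values from the table. It assumes enormal is meant to be a 6-element array (declared here as enormal(6)) and uses an arbitrary threshold named eps; both are assumptions for illustration, not taken from the original code.

```fortran
program normalize_demo
  implicit none
  real(8), parameter :: eps = 1.0d-10   ! hypothetical threshold
  real(8) :: xi(6), enormal(6), nxi

  ! Values copied from the debugger table in the post
  xi = [ -9.269196949204623d-3,  8.046774271464946d-3, &
          1.222422677739678d-3, -4.427044871218580d-5, &
         -2.973747192262531d-5,  4.462114273406856d-2 ]
  nxi = 6.429822199431394d-2

  if (nxi > eps) then
     enormal = xi / nxi        ! whole-array form of the do loop
  else
     enormal = 0.0d0           ! avoid dividing by a (nearly) zero norm
  end if

  print '(6es14.6)', enormal
end program normalize_demo
```

With these inputs the division is well defined, which is consistent with the crash being caused somewhere other than this loop.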
Or how should I detect where the stack corruption is?
Is it possible that the stack is not big enough for my big model simulation? I can set the stack commit size and reserve size via Properties - Configuration Properties - Linker - System in Visual Studio 2019. Is there a recommended value to put there? I don't want to crash the computer if the number is too big.
Thanks.
Carly
The quickest way to ascertain if this issue is stack related is to enable heap-arrays.
Jim Dempsey
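To illustrate what this test changes: automatic arrays and compiler temporaries are typically placed on the stack, so their footprint grows with the model size, while allocatable arrays are placed on the heap. With Intel Fortran the option is /heap-arrays on Windows (-heap-arrays on Linux). The sketch below uses made-up module and routine names and a deliberately small n; it only shows the two kinds of local array side by side.

```fortran
module work_arrays
  implicit none
contains
  ! Automatic array: sized by the dummy argument n, typically on the stack
  function sum_automatic(n) result(s)
    integer, intent(in) :: n
    real(8) :: s
    real(8) :: work(n)
    work = 1.0d0
    s = sum(work)
  end function sum_automatic

  ! Allocatable array: placed on the heap, and allocation failure can be detected
  function sum_allocatable(n) result(s)
    integer, intent(in) :: n
    real(8) :: s
    real(8), allocatable :: work(:)
    integer :: stat
    allocate(work(n), stat=stat)
    if (stat /= 0) stop 'allocation failed'
    work = 1.0d0
    s = sum(work)
    deallocate(work)
  end function sum_allocatable
end module work_arrays

program heap_vs_stack
  use work_arrays
  implicit none
  integer, parameter :: n = 1000   ! small enough for any default stack
  print *, sum_automatic(n), sum_allocatable(n)
end program heap_vs_stack
```

If the crash disappears with heap arrays enabled, stack exhaustion becomes the likely cause; if it persists, out-of-bounds writes or mismatched arguments are the more likely suspects.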
Hi,
I set Optimization -> Heap Arrays to 0 and Floating Point -> Check Floating-Point Stack to Yes (/Qfp-stack-check), then reran the simulation. The error message is below.
Unhandled exception at 0x00007FF7AE35F30B in bsam20_2022_08_03_fe9fe6e.exe: 0xC000008E: Floating-point division by zero (parameters: 0x0000000000000000, 0x0000000000009964).
The crash location is the same as before.
do i = 1,6
enormal(i) = xi(i)/nxi
end do
However, nxi = 6.158640282302164D-002, so this is not where the error was triggered. How can I locate where the error actually originates? Thanks.
_________________________________________________________
I just realized I am using MPI, so the debugger might be showing me the wrong location. I then switched to a single processor. Now the code crashes at the end of another subroutine (the "end subroutine con_initial" line). This has happened a lot before, even for small models; I would just rerun several times until the simulation eventually ran through. The error message is just "bsam20_2022_08_03_fe9fe6e.exe has triggered a breakpoint.".
subroutine con_initial(cid, elem_in, qp, deltaT, cmat_tang)
----------
end subroutine con_initial
Does anybody have a clue about what caused this breakpoint? Thanks.
Carly
When you get the breakpoint message, look for the console window (it may be behind another window); it will have more information.
Hi,
I used Valgrind to look for the heap corruption in my code (a mix of C++ and Fortran). Here are the messages. It seems the memory loss comes from oneAPI? Does anyone know how to interpret this information? Thanks.
==22803==
==22803== HEAP SUMMARY:
==22803== in use at exit: 5,249 bytes in 20 blocks
==22803== total heap usage: 9,716 allocs, 9,696 frees, 14,371,548 bytes allocated
==22803==
==22803== Searching for pointers to 20 not-freed blocks
==22803== Checked 2,357,980,840 bytes
==22803==
==22803== 8 bytes in 1 blocks are still reachable in loss record 1 of 18
==22803== at 0x5487738: operator new(unsigned long) (vg_replace_malloc.c:417)
==22803== by 0xBBF83F: CryptoPP::NewObject<CryptoPP::OAEP<CryptoPP::SHA1, CryptoPP::P1363_MGF1> >::operator()() const (misc.h:258)
==22803== by 0xBBF954: CryptoPP::Singleton<CryptoPP::OAEP<CryptoPP::SHA1, CryptoPP::P1363_MGF1>, CryptoPP::NewObject<CryptoPP::OAEP<CryptoPP::SHA1, CryptoPP::P1363_MGF1> >, 0>::Ref() const (misc.h:346)
==22803== by 0xBBF0AC: CryptoPP::TF_ObjectImplBase<CryptoPP::TF_DecryptorBase, CryptoPP::TF_CryptoSchemeOptions<CryptoPP::TF_ES<CryptoPP::RSA, CryptoPP::OAEP<CryptoPP::SHA1, CryptoPP::P1363_MGF1>, int>, CryptoPP::RSA, CryptoPP::OAEP<CryptoPP::SHA1, CryptoPP::P1363_MGF1> >, CryptoPP::InvertibleRSAFunction>::GetMessageEncodingInterface() const (pubkey.h:594)
==22803== by 0xBBC103: CryptoPP::TF_CryptoSystemBase<CryptoPP::PK_Decryptor, CryptoPP::TF_Base<CryptoPP::TrapdoorFunctionInverse, CryptoPP::PK_EncryptionMessageEncodingMethod> >::FixedMaxPlaintextLength() const (pubkey.h:273)
==22803== by 0xBB61CF: license::decrypt(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (Encrypt.cpp:83)
==22803== by 0xBAEB6D: license::is_valid(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, int const&, double const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (License.cpp:140)
==22803== by 0x8ACA0B: license::check_license(double) (License.cpp:120)
==22803== by 0x891712: SHE_license_check_license (wrapsheff_license.cpp:17)
==22803== by 0x6CCFF6: varnam_ (varnam.f90:108)
==22803== by 0x410079: MAIN__ (mainf1.f:47)
==22803== by 0x410011: main (in /home/ga/gaoz/bin/bsam20_2022_10)
==22803==
==22803== 13 bytes in 1 blocks are definitely lost in loss record 2 of 18
==22803== at 0x5487017: malloc (vg_replace_malloc.c:380)
==22803== by 0x116950C9: strdup (in /usr/lib64/libc-2.17.so)
==22803== by 0x12DD9197: ???
==22803== by 0x486B8F2: _dl_init (in /usr/lib64/ld-2.17.so)
==22803== by 0x48704CD: dl_open_worker (in /usr/lib64/ld-2.17.so)
==22803== by 0x486B703: _dl_catch_error (in /usr/lib64/ld-2.17.so)
==22803== by 0x486FABA: _dl_open (in /usr/lib64/ld-2.17.so)
==22803== by 0xEF31EEA: dlopen_doit (in /usr/lib64/libdl-2.17.so)
==22803== by 0x486B703: _dl_catch_error (in /usr/lib64/ld-2.17.so)
==22803== by 0xEF324EC: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==22803== by 0xEF31F80: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==22803== by 0x11E6E051: ofi_reg_dl_prov (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803== 31 bytes in 1 blocks are definitely lost in loss record 5 of 18
==22803== at 0x5487017: malloc (vg_replace_malloc.c:380)
==22803== by 0x12DD96F5: ???
==22803== by 0x486B8F2: _dl_init (in /usr/lib64/ld-2.17.so)
==22803== by 0x48704CD: dl_open_worker (in /usr/lib64/ld-2.17.so)
==22803== by 0x486B703: _dl_catch_error (in /usr/lib64/ld-2.17.so)
==22803== by 0x486FABA: _dl_open (in /usr/lib64/ld-2.17.so)
==22803== by 0xEF31EEA: dlopen_doit (in /usr/lib64/libdl-2.17.so)
==22803== by 0x486B703: _dl_catch_error (in /usr/lib64/ld-2.17.so)
==22803== by 0xEF324EC: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==22803== by 0xEF31F80: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==22803== by 0x11E6E051: ofi_reg_dl_prov (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803== by 0x11E6E6B8: fi_ini (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803==
==22803== 32 bytes in 1 blocks are definitely lost in loss record 8 of 18
==22803== at 0x548B778: calloc (vg_replace_malloc.c:1117)
==22803== by 0x12B7905A: ???
==22803== by 0x12B6017A: ???
==22803== by 0x11E6E06F: ofi_reg_dl_prov (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803== by 0x11E6E6B8: fi_ini (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803== by 0x11E6F460: fi_getinfo@@FABRIC_1.3 (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803== by 0x11E73E65: fi_getinfo@FABRIC_1.1 (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803== by 0x106E1BB2: MPIDI_OFI_mpi_init_hook (ofi_init.c:1167)
==22803== by 0x1022FCE7: MPID_Init (ch4_init.c:1138)
==22803== by 0x104C624F: MPIR_Init_thread (initthread.c:137)
==22803== by 0x104C624F: PMPI_Init_thread (initthread.c:269)
==22803== by 0xFDC0F4B: MPI_INIT_THREAD (initthreadf.c:270)
==22803== by 0x6391A8: mpi_util_mp_mpi_util_start_ (mpi_util.f90:52)
==22803==
==22803==
==22803== LEAK SUMMARY:
==22803== definitely lost: 76 bytes in 3 blocks
==22803== indirectly lost: 0 bytes in 0 blocks
==22803== possibly lost: 0 bytes in 0 blocks
==22803== still reachable: 5,173 bytes in 17 blocks
==22803== suppressed: 0 bytes in 0 blocks