In my Fortran code, I have the line nxi = sqrt(nxi), where nxi = 0.000000000000000D+00. At run time, the program raised the floating-point invalid operation exception shown below. What should I do to avoid this error? Thanks.
Unhandled exception at 0x00007FF7E9EFECBB in bsam20_2022_08_03_fe9fe6e.exe: 0xC0000090: Floating-point invalid operation (parameters: 0x0000000000000000, 0x0000000000009961).
BTW, this error only occurs when my simulation model is big. With a small model, there is no error.
Carly
We need more information regarding the circumstances in which the error occurs.
The following program fits the description that you gave, except for "model is big", and shows no error.
program cgao
double precision :: n = 0.0d0
n = sqrt(n)
print *,n
end program
This kind of error, on an obviously valid operation, is often a sign of stack corruption elsewhere in the program. The fact that it appears only when "the model is big" reinforces that suspicion.
Hi,
I think it is related to how some data options are set in Visual Studio 2019. I used the following lines to avoid the error from sqrt:
if (nxi.lt.1e-10) nxi = 0.d0
nxi = sqrt(nxi)
However, the code then crashed in another place:
do i = 1,6
enormal(i) = xi(i)/nxi
end do !
where nxi=6.429822199431394D-002,
Name | Value | Type
---|---|---
xi(1) | -9.269196949204623D-003 | REAL(8)
xi(2) | 8.046774271464946D-003 | REAL(8)
xi(3) | 1.222422677739678D-003 | REAL(8)
xi(4) | -4.427044871218580D-005 | REAL(8)
xi(5) | -2.973747192262531D-005 | REAL(8)
xi(6) | 4.462114273406856D-002 | REAL(8)
and enormal was declared as real(8) enormal at the beginning.
I think I need to set up some flags or options to avoid this. Do you know how to do that? Thanks.
Carly
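For what it's worth, the operands shown in the debugger are all finite and nxi is nonzero, so the division itself should not trap. Below is a minimal sketch of a sanity check that could sit just before the loop, assuming xi and enormal are 6-element real(8) arrays (the program name and the error message are made up for illustration). If a check like this passes and the exception still fires, that points toward corrupted state rather than bad data.

```fortran
program check_operands
  use, intrinsic :: ieee_arithmetic, only: ieee_is_finite
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: xi(6), enormal(6), nxi
  integer :: i

  ! Values copied from the debugger table in the post above.
  xi = [ -9.269196949204623e-3_dp,  8.046774271464946e-3_dp, &
          1.222422677739678e-3_dp, -4.427044871218580e-5_dp, &
         -2.973747192262531e-5_dp,  4.462114273406856e-2_dp ]
  nxi = 6.429822199431394e-2_dp

  ! Hypothetical guard: stop unless every operand is finite
  ! and the divisor is nonzero.
  if (.not. all(ieee_is_finite(xi)) .or. &
      .not. ieee_is_finite(nxi) .or. nxi == 0.0_dp) then
     error stop 'bad operand before normalization'
  end if

  do i = 1, 6
     enormal(i) = xi(i) / nxi
  end do
  print *, enormal
end program check_operands
```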
Or how should I detect where the stack corruption is?
Is it possible that the stack is not big enough for my big model simulation? I can set the stack commit size and reserve size via Properties - Configuration Properties - Linker - System in Visual Studio 2019. Is there a recommended value to put there? I don't want to crash the computer if the number is too big.
Thanks.
Carly
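One frequent source of this kind of corruption is a loop that writes past the end of an array passed to a subroutine. Here is a sketch of a pattern that makes such overruns detectable, assuming the routine can live in a module (fill_mod, fill, and work are hypothetical names). With an assumed-shape dummy argument the callee knows the real extent, and a debug build with Intel Fortran's /check:bounds and /traceback options should then report the offending line instead of silently overwriting memory.

```fortran
module fill_mod
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  subroutine fill(a, n)
    ! Assumed shape: the actual extent travels with the argument,
    ! so size(a) is reliable and bounds checking can work.
    real(dp), intent(out) :: a(:)
    integer,  intent(in)  :: n
    integer :: i

    ! Explicit guard against writing past the end of a.
    if (n > size(a)) error stop 'fill: n exceeds array size'
    do i = 1, n
       a(i) = real(i, dp)
    end do
  end subroutine fill
end module fill_mod

program bounds_demo
  use fill_mod
  implicit none
  real(dp) :: work(6)

  call fill(work, 6)
  print *, work
end program bounds_demo
```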
The quickest way to ascertain if this issue is stack related is to enable heap-arrays.
Jim Dempsey
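Some background on why this helps: by default Intel Fortran places automatic arrays and array temporaries on the stack, and the heap-arrays option moves them to the heap. The same effect can be had for a single array, without the option, by making it allocatable. A sketch with a made-up routine and size:

```fortran
module work_mod
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  function sum_squares(n) result(s)
    integer, intent(in) :: n
    real(dp) :: s
    ! An automatic array "real(dp) :: tmp(n)" would typically live on
    ! the stack; an allocatable array is placed on the heap instead.
    real(dp), allocatable :: tmp(:)
    integer :: i, stat

    allocate(tmp(n), stat=stat)
    if (stat /= 0) error stop 'sum_squares: allocation failed'
    do i = 1, n
       tmp(i) = real(i, dp)
    end do
    s = dot_product(tmp, tmp)
    ! tmp is deallocated automatically on return.
  end function sum_squares
end module work_mod

program heap_demo
  use work_mod
  implicit none
  ! About 40 MB of work space, far more than a typical default stack.
  print *, sum_squares(5000000)
end program heap_demo
```

If the crash disappears with heap allocation, the stack size was the likely culprit; if it persists, an out-of-bounds write is the more probable cause.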
Hi,
I set Optimization -> Heap Arrays to 0 and Floating Point -> Check Floating-Point Stack to Yes (/Qfp-stack-check), then reran the simulation. The error message is below.
Unhandled exception at 0x00007FF7AE35F30B in bsam20_2022_08_03_fe9fe6e.exe: 0xC000008E: Floating-point division by zero (parameters: 0x0000000000000000, 0x0000000000009964).
The crash location is the same as before.
do i = 1,6
enormal(i) = xi(i)/nxi
end do
However, nxi = 6.158640282302164D-002, so this is not where the error was actually triggered. How can I locate the real source of the error? Thanks.
_________________________________________________________
I just realized I am using MPI, so the debugger might be showing the wrong location. I switched to a single processor, and now the code crashes at the end of another subroutine (the "end subroutine con_initial" line). This has happened a lot before, even with small models; I would just rerun several times until the simulation eventually ran through. The error message is just "bsam20_2022_08_03_fe9fe6e.exe has triggered a breakpoint.".
subroutine con_initial(cid, elem_in, qp, deltaT, cmat_tang)
----------
end subroutine con_initial
Does anybody have a clue about what caused this breakpoint? Thanks.
Carly
When you get the breakpoint message, look for the program's console window (it may be hidden behind other windows); it will have more information.
Hi,
I used Valgrind to look for heap corruption in my code (a mix of C++ and Fortran). Here are the messages. It seems the memory loss comes from oneAPI? Does anyone know how to interpret this information? Thanks.
==22803==
==22803== HEAP SUMMARY:
==22803== in use at exit: 5,249 bytes in 20 blocks
==22803== total heap usage: 9,716 allocs, 9,696 frees, 14,371,548 bytes allocated
==22803==
==22803== Searching for pointers to 20 not-freed blocks
==22803== Checked 2,357,980,840 bytes
==22803==
==22803== 8 bytes in 1 blocks are still reachable in loss record 1 of 18
==22803== at 0x5487738: operator new(unsigned long) (vg_replace_malloc.c:417)
==22803== by 0xBBF83F: CryptoPP::NewObject<CryptoPP::OAEP<CryptoPP::SHA1, CryptoPP::P1363_MGF1> >::operator()() const (misc.h:258)
==22803== by 0xBBF954: CryptoPP::Singleton<CryptoPP::OAEP<CryptoPP::SHA1, CryptoPP::P1363_MGF1>, CryptoPP::NewObject<CryptoPP::OAEP<CryptoPP::SHA1, CryptoPP::P1363_MGF1> >, 0>::Ref() const (misc.h:346)
==22803== by 0xBBF0AC: CryptoPP::TF_ObjectImplBase<CryptoPP::TF_DecryptorBase, CryptoPP::TF_CryptoSchemeOptions<CryptoPP::TF_ES<CryptoPP::RSA, CryptoPP::OAEP<CryptoPP::SHA1, CryptoPP::P1363_MGF1>, int>, CryptoPP::RSA, CryptoPP::OAEP<CryptoPP::SHA1, CryptoPP::P1363_MGF1> >, CryptoPP::InvertibleRSAFunction>::GetMessageEncodingInterface() const (pubkey.h:594)
==22803== by 0xBBC103: CryptoPP::TF_CryptoSystemBase<CryptoPP::PK_Decryptor, CryptoPP::TF_Base<CryptoPP::TrapdoorFunctionInverse, CryptoPP::PK_EncryptionMessageEncodingMethod> >::FixedMaxPlaintextLength() const (pubkey.h:273)
==22803== by 0xBB61CF: license::decrypt(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (Encrypt.cpp:83)
==22803== by 0xBAEB6D: license::is_valid(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, int const&, double const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (License.cpp:140)
==22803== by 0x8ACA0B: license::check_license(double) (License.cpp:120)
==22803== by 0x891712: SHE_license_check_license (wrapsheff_license.cpp:17)
==22803== by 0x6CCFF6: varnam_ (varnam.f90:108)
==22803== by 0x410079: MAIN__ (mainf1.f:47)
==22803== by 0x410011: main (in /home/ga/gaoz/bin/bsam20_2022_10)
==22803==
==22803== 13 bytes in 1 blocks are definitely lost in loss record 2 of 18
==22803== at 0x5487017: malloc (vg_replace_malloc.c:380)
==22803== by 0x116950C9: strdup (in /usr/lib64/libc-2.17.so)
==22803== by 0x12DD9197: ???
==22803== by 0x486B8F2: _dl_init (in /usr/lib64/ld-2.17.so)
==22803== by 0x48704CD: dl_open_worker (in /usr/lib64/ld-2.17.so)
==22803== by 0x486B703: _dl_catch_error (in /usr/lib64/ld-2.17.so)
==22803== by 0x486FABA: _dl_open (in /usr/lib64/ld-2.17.so)
==22803== by 0xEF31EEA: dlopen_doit (in /usr/lib64/libdl-2.17.so)
==22803== by 0x486B703: _dl_catch_error (in /usr/lib64/ld-2.17.so)
==22803== by 0xEF324EC: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==22803== by 0xEF31F80: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==22803== by 0x11E6E051: ofi_reg_dl_prov (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803== 31 bytes in 1 blocks are definitely lost in loss record 5 of 18
==22803== at 0x5487017: malloc (vg_replace_malloc.c:380)
==22803== by 0x12DD96F5: ???
==22803== by 0x486B8F2: _dl_init (in /usr/lib64/ld-2.17.so)
==22803== by 0x48704CD: dl_open_worker (in /usr/lib64/ld-2.17.so)
==22803== by 0x486B703: _dl_catch_error (in /usr/lib64/ld-2.17.so)
==22803== by 0x486FABA: _dl_open (in /usr/lib64/ld-2.17.so)
==22803== by 0xEF31EEA: dlopen_doit (in /usr/lib64/libdl-2.17.so)
==22803== by 0x486B703: _dl_catch_error (in /usr/lib64/ld-2.17.so)
==22803== by 0xEF324EC: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==22803== by 0xEF31F80: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==22803== by 0x11E6E051: ofi_reg_dl_prov (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803== by 0x11E6E6B8: fi_ini (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803==
==22803== 32 bytes in 1 blocks are definitely lost in loss record 8 of 18
==22803== at 0x548B778: calloc (vg_replace_malloc.c:1117)
==22803== by 0x12B7905A: ???
==22803== by 0x12B6017A: ???
==22803== by 0x11E6E06F: ofi_reg_dl_prov (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803== by 0x11E6E6B8: fi_ini (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803== by 0x11E6F460: fi_getinfo@@FABRIC_1.3 (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803== by 0x11E73E65: fi_getinfo@FABRIC_1.1 (in /home/ba/ballardmk/bin/intel/oneapi/mpi/2021.1.1/libfabric/lib/libfabric.so.1)
==22803== by 0x106E1BB2: MPIDI_OFI_mpi_init_hook (ofi_init.c:1167)
==22803== by 0x1022FCE7: MPID_Init (ch4_init.c:1138)
==22803== by 0x104C624F: MPIR_Init_thread (initthread.c:137)
==22803== by 0x104C624F: PMPI_Init_thread (initthread.c:269)
==22803== by 0xFDC0F4B: MPI_INIT_THREAD (initthreadf.c:270)
==22803== by 0x6391A8: mpi_util_mp_mpi_util_start_ (mpi_util.f90:52)
==22803==
==22803==
==22803== LEAK SUMMARY:
==22803== definitely lost: 76 bytes in 3 blocks
==22803== indirectly lost: 0 bytes in 0 blocks
==22803== possibly lost: 0 bytes in 0 blocks
==22803== still reachable: 5,173 bytes in 17 blocks
==22803== suppressed: 0 bytes in 0 blocks
