Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

djacobix aborts after some iterations

Thomas_H_2
Beginner

Hello,

I want to calculate the Jacobian of a complicated function. I successfully implemented djacobi for a simple test case using the reference page https://software.intel.com/en-us/mkl-developer-reference-fortran-2018-beta-jacobi

The actual function is complicated, so I'll start with a compressed form; if you need to see more snippets, just tell me.

module ContactForceTest_mod
    use MKL_RCI
    use MKL_RCI_type
    implicit none
contains
    subroutine TestContactForce
        type(ContactForce) :: Contact
        external :: wrapper
        ...
        if (djacobix(wrapper,12,12,Jacobian,Displacement,1e-6,%VAL(LOC(Contact))) /= TR_SUCCESS) then
            write(*,*) '| error in djacobix'
            call MKL_FREE_BUFFERS
            stop 1
        end if
    end subroutine
end module

subroutine wrapper(m,n,Displacement, Force, Contact)
    use ContactForceTest_mod
    implicit none
    integer, intent(in) :: m,n
    real, dimension(12) :: RelativeDisplacement
    real, dimension(12) :: Force
    type(ContactForce_class) :: Contact

    call Contact%NonlinearContactForce(12,12,RelativeDisplacement, ContactForce)
end subroutine wrapper

I'm using the wrapper subroutine because I need to call a type-bound procedure as the function whose Jacobian I want (this apparently works). All variables are properly declared, and if I comment out the Jacobian part, the program works flawlessly.
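One way to narrow this down is to exercise the callback without MKL. The following is only a minimal sketch under stated assumptions: callback_check_mod, central_jacobian and the stand-in function f(i) = x(i)**2 are hypothetical names that are not part of the code above, the user_data argument is left out to keep the example self-contained, and the plain central-difference loop is a substitute for djacobix, not MKL's implementation. As far as I understand the djacobix reference, the callback receives m, n and double precision arrays x and f, so the sketch declares everything with one double precision kind.

```fortran
module callback_check_mod
    implicit none
    integer, parameter :: dp = kind(1.0d0)   ! double precision kind
contains
    ! Hypothetical stand-in for the real force routine: f(i) = x(i)**2.
    ! Assumes m <= n. Dummy names match the declarations, all in dp.
    subroutine wrapper(m, n, x, f)
        integer, intent(in)   :: m, n
        real(dp), intent(in)  :: x(n)
        real(dp), intent(out) :: f(m)
        f = x(1:m)**2
    end subroutine wrapper

    ! Plain central differences, used here only as a substitute
    ! for djacobix so the callback can be tested on its own.
    subroutine central_jacobian(fcn, m, n, x, eps, fjac)
        interface
            subroutine fcn(m, n, x, f)
                import :: dp
                integer, intent(in)   :: m, n
                real(dp), intent(in)  :: x(n)
                real(dp), intent(out) :: f(m)
            end subroutine fcn
        end interface
        integer, intent(in)   :: m, n
        real(dp), intent(in)  :: x(n), eps
        real(dp), intent(out) :: fjac(m, n)
        real(dp) :: xp(n), fplus(m), fminus(m)
        integer :: j
        do j = 1, n
            xp = x
            xp(j) = x(j) + eps          ! perturb one variable upwards
            call fcn(m, n, xp, fplus)
            xp(j) = x(j) - eps          ! and downwards
            call fcn(m, n, xp, fminus)
            fjac(:, j) = (fplus - fminus) / (2.0_dp * eps)
        end do
    end subroutine central_jacobian
end module callback_check_mod

program check_callback
    use callback_check_mod
    implicit none
    integer, parameter :: n = 3
    real(dp) :: x(n), fjac(n, n)
    integer :: i
    x = [1.0_dp, 2.0_dp, 3.0_dp]
    ! Note the double precision literal for eps.
    call central_jacobian(wrapper, n, n, x, 1.0e-6_dp, fjac)
    do i = 1, n
        write(*, '(3f12.6)') fjac(i, :)
    end do
end program check_callback
```

If the real wrapper runs cleanly under such a driver, the abort is more likely related to how the arguments reach djacobix (kinds, dummy argument names, the user data pointer) than to the force routine itself.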

After some iterations (I can see RelativeDisplacement change its values by 1e-06, one after another), bash just prints "aborted" without any further message. My problem at the moment is locating the error. It apparently happens somewhere in djacobix. Is there a way to find out where exactly (the routine does at least perform some iterations)?

Or any suggestions?

 

Edit: Here's something from Valgrind:

==14058== Memcheck, a memory error detector
==14058== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==14058== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==14058== Command: ./datar_test
==14058== 
==14058== Conditional jump or move depends on uninitialised value(s)
==14058==    at 0x603C8D: __intel_sse2_strcpy (in /home/SERVER-hoffmann/sync/04_DATAR/datar_shared/Unittest/datar_test)
==14058==    by 0x57F277: for__add_to_lf_table (in /home/SERVER-hoffmann/sync/04_DATAR/datar_shared/Unittest/datar_test)
==14058==    by 0x5BC548: for__open_proc (in /home/SERVER-hoffmann/sync/04_DATAR/datar_shared/Unittest/datar_test)
==14058==    by 0x58620F: for__open_default (in /home/SERVER-hoffmann/sync/04_DATAR/datar_shared/Unittest/datar_test)
==14058==    by 0x5A9C23: for_write_seq_lis (in /home/SERVER-hoffmann/sync/04_DATAR/datar_shared/Unittest/datar_test)
==14058==    by 0x40E2CA: fruit_mp_init_fruit_ (fruit.f90:635)
==14058==    by 0x44B553: MAIN__ (datar_test.f90:18)
==14058==    by 0x4032DD: main (in /home/SERVER-hoffmann/sync/04_DATAR/datar_shared/Unittest/datar_test)
==14058== 
 
 Test module initialized
 
    . : successful assert,   F : failed assert 
 
 --------------
 Starting contact force spectrum test
==14058== Syscall param sched_setaffinity(mask) points to unaddressable byte(s)
==14058==    at 0x6B6EFB9: syscall (in /lib64/libc-2.19.so)
==14058==    by 0x40A0787: __kmp_affinity_determine_capable (in /global/linux/64_bit/opt/intel/mkl/10.2.2.025/lib/em64t/libiomp5.so)
==14058==    by 0x4089F97: __kmp_env_initialize(char const*) (in /global/linux/64_bit/opt/intel/mkl/10.2.2.025/lib/em64t/libiomp5.so)
==14058==    by 0x4081175: ??? (in /global/linux/64_bit/opt/intel/mkl/10.2.2.025/lib/em64t/libiomp5.so)
==14058==    by 0x4074428: __kmp_get_global_thread_id_reg (in /global/linux/64_bit/opt/intel/mkl/10.2.2.025/lib/em64t/libiomp5.so)
==14058==    by 0x407FF98: __kmp_parallel_initialize (in /global/linux/64_bit/opt/intel/mkl/10.2.2.025/lib/em64t/libiomp5.so)
==14058==    by 0x406EC7D: omp_get_num_procs (in /global/linux/64_bit/opt/intel/mkl/10.2.2.025/lib/em64t/libiomp5.so)
==14058==    by 0x54211FA: MKL_get_N_Cores (in /global/linux/64_bit/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_intel_thread.so)
==14058==    by 0x5420CA8: mkl_read_threads_env (in /global/linux/64_bit/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_intel_thread.so)
==14058==    by 0x5420694: mkl_serv_mkl_domain_get_max_threads (in /global/linux/64_bit/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_intel_thread.so)
==14058==    by 0x57BB632: mkl_dft_commit_descriptor_d_c2c_1d_omp (in /global/linux/64_bit/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_intel_thread.so)
==14058==    by 0x4F4279C: mkl_dft_dfti_commit_descriptor_external (in /global/linux/64_bit/opt/intel/mkl/10.2.2.025/lib/em64t/libmkl_intel_lp64.so)
==14058==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==14058== 
OMP: Warning #2: Cannot open message catalog "libiomp5.cat":
OMP: System error #2: No such file or directory
OMP: Hint: Check NLSPATH environment variable, its value is "/global/linux/64_bit/opt/intel/mkl/10.2.2.025/lib/em64t/locale/%l_%t/%N".
OMP: Info #3: Default messages will be used.
==14058== Conditional jump or move depends on uninitialised value(s)
==14058==    at 0x5888D5: for_realloc_lhs (in /home/SERVER-hoffmann/sync/04_DATAR/datar_shared/Unittest/datar_test)
==14058==    by 0x46584B: contactforceparameters_mod_mp_getnormalizedequilibrium_ (ContactForceParameters_mod.f90:196)
==14058==    by 0x50CE29: contactforce_mod_mp_init_ (ContactForce_mod.f90:53)
==14058==    by 0x406269: contactforcetest_mod_mp_testcontactforce_ (ContactForceTest.f90:71)
==14058==    by 0x44B586: MAIN__ (datar_test.f90:26)
==14058==    by 0x4032DD: main (in /home/SERVER-hoffmann/sync/04_DATAR/datar_shared/Unittest/datar_test)
==14058== 
==14058== Conditional jump or move depends on uninitialised value(s)
==14058==    at 0x5888D5: for_realloc_lhs (in /home/SERVER-hoffmann/sync/04_DATAR/datar_shared/Unittest/datar_test)
==14058==    by 0x4666FA: contactforceparameters_mod_mp_getcontactforcecorrection_ (ContactForceParameters_mod.f90:241)
==14058==    by 0x50E076: contactforce_mod_mp_init_ (ContactForce_mod.f90:55)
==14058==    by 0x406269: contactforcetest_mod_mp_testcontactforce_ (ContactForceTest.f90:71)
==14058==    by 0x44B586: MAIN__ (datar_test.f90:26)
==14058==    by 0x4032DD: main (in /home/SERVER-hoffmann/sync/04_DATAR/datar_shared/Unittest/datar_test)
==14058== 
.........
 
     Start of FRUIT summary: 
 
 SUCCESSFUL!
 
   No messages 
 Total asserts :              9
 Successful    :              9
 Failed        :              0
Successful rate:   100.00%
 
 Successful asserts / total asserts : [            9 /           9  ]
 Successful cases   / total cases   : [            0 /           0  ]
   -- end of FRUIT summary
==14058== 
==14058== HEAP SUMMARY:
==14058==     in use at exit: 2,594 bytes in 8 blocks
==14058==   total heap usage: 2,462 allocs, 2,454 frees, 7,276,564 bytes allocated
==14058== 
==14058== LEAK SUMMARY:
==14058==    definitely lost: 0 bytes in 0 blocks
==14058==    indirectly lost: 0 bytes in 0 blocks
==14058==      possibly lost: 0 bytes in 0 blocks
==14058==    still reachable: 2,594 bytes in 8 blocks
==14058==         suppressed: 0 bytes in 0 blocks
==14058== Reachable blocks (those to which a pointer was found) are not shown.
==14058== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==14058== 
==14058== For counts of detected and suppressed errors, rerun with: -v
==14058== Use --track-origins=yes to see where uninitialised values come from
==14058== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)

 

1 Reply
Khang_N_Intel
Employee

Hi Thomas,

It would be helpful if you could provide us with the following information:

1) Testing system configuration: hardware and software (OS and software used, including version numbers)

2) Your whole test harness, not just some code snippets, so that we can reproduce the errors.

 

Thanks,

Khang
