I've seen a core dump in ifort 11.0.074 (fortcom) on our SGI Ice, compiling NEMO, an ocean model.
This code compiles successfully with ifort 9.
I have the core file from fortcom, but idbc shows only this:
stokes2:~/ifort-11-core$ idbc /ichec/packages/intel/fce/11.0.074/bin/intel64/fortcom core
Intel Debugger for applications running on Intel 64, Version 11.0, Build [1.1510.2.97]
------------------
object file name: /ichec/packages/intel/fce/11.0.074/bin/intel64/fortcom
core file name: core
Reading symbols from /panfs/panasas/packages/intel/fce/11.0.074/bin/intel64/fortcom...(no debugging symbols found)...done.
Core file produced from executable fortcom
Initial part of arglist: /ichec/packages/intel/fce/11.0.074/bin/intel64/fortcom @/ichec/work/staff/amcki
Thread terminated at PC 0x00000000004f8f6b by signal SIGSEGV
(idb) where
#0 0x00000000004f8f6b in enter_derived_type_sym () in /panfs/panasas/packages/intel/fce/11.0.074/bin/intel64/fortcom
Failed to read from target memory 0x7fff496925f0
(idb)
How should I proceed?
Notes: We have an SGI ICE 8200 with quad-core processors and 16 GB of RAM on the compilation nodes.
cat /proc/cpuinfo gives
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel Xeon CPU X5355 @ 2.66GHz
stepping : 11
cpu MHz : 2660.000
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr dca lahf_lm
bogomips : 5324.58
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:
2 Replies
Is it possible to determine whether the compiler died from a stack limit violation (e.g. by repeating the compilation with fortcom running under a debugger, or simply by raising the limit)? If the compiler reported an internal error to the screen, it would definitely be a reportable bug. A few bugs were fixed in more recent updates of the compiler.
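The "simply raising the limit" check can be sketched as a small shell helper. This is only an illustration: the function name `retry_with_bigger_stack` and the file name in the usage comment are hypothetical, and the real compile line should be taken from the NEMO build output.

```shell
# Re-run a command with the soft stack limit raised as far as the hard limit
# allows. The subshell keeps the change from leaking into the calling shell.
retry_with_bigger_stack() {
    (
        # Raise the soft limit to the hard limit; carry on if that is refused.
        ulimit -S -s "$(ulimit -H -s)" 2>/dev/null || true
        echo "stack limit for this run: $(ulimit -s)" >&2
        "$@"
    )
}

# Usage (hypothetical file name and flags):
#   retry_with_bigger_stack ifort -c problem_file.f90
```

If the compile succeeds only with the larger limit, that points toward stack exhaustion inside fortcom rather than a problem in the source.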
Bug reporting is done with an account on premier.intel.com. Your sysadmin may have opened such an account, or might agree for you to open one yourself, using the license key.
As for workarounds, possibilities include lowering the level of optimization, for example by removing -ipo, setting -fno-inline-functions, or reducing to -O1. It's possible that some optimization which 9.1 didn't attempt exposes a problem.
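One way to try these workarounds systematically is to loop over option sets until one compiles. The sketch below assumes the compiler command is held in `FC` and that a baseline of -O2 is in use; both are assumptions, as is the function name `find_working_flags`, since the thread does not show the actual build flags.

```shell
# Compiler command; override in the environment if needed (assumption).
FC=${FC:-ifort}

# Try progressively less aggressive option sets on one source file and
# print the first set that compiles. Returns non-zero if none of them do.
find_working_flags() {
    src=$1
    for flags in "-O2 -fno-inline-functions" "-O1" "-O0"; do
        # $flags is deliberately unquoted so it splits into separate options.
        if $FC -c $flags "$src" >/dev/null 2>&1; then
            echo "$flags"
            return 0
        fi
    done
    return 1
}

# Usage (hypothetical file name):
#   find_working_flags problem_file.f90
```

Removing -ipo is not shown because it means dropping a flag from the existing build line rather than adding one.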
I notice you have ifort installed on a Panfs filesystem (Panasas). In the past, I had an issue with odd compiler behavior on Lustre, but because the customer was on a classified system we could not reproduce it.
Any chance you could try a local install of 11.0.074 on a local filesystem and try to reproduce? There are evaluation copies available here:
http://software.intel.com/en-us/articles/intel-software-evaluation-center/
You can do a non-root install into your home directory, assuming it's local or NFS and not Panfs. Then you'd need to source ~/intel/Compiler/11.0/074/bin/ifortvars.sh intel64 and make sure the build script is not hardcoding the compiler path.
But in the meantime, is NEMO open source? Or if not, can you find the one file causing the error, plus its associated modules, to get a simple reproducer without the entire application?
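After sourcing ifortvars.sh, it may be worth confirming which ifort the shell actually picks up. A minimal sketch follows; the function name `check_compiler_location` is hypothetical, and `readlink -f` is assumed to be available (GNU coreutils) so that symlinked install paths are resolved to their real location.

```shell
# Print where a compiler resolves from and return 1 if it lives under a
# filesystem prefix that should be avoided, 2 if it is not found at all.
check_compiler_location() {
    compiler=$1
    avoid_prefix=$2
    path=$(command -v "$compiler") || {
        echo "$compiler not found in PATH" >&2
        return 2
    }
    # Resolve symlinks where possible; fall back to the unresolved path.
    resolved=$(readlink -f "$path" 2>/dev/null || echo "$path")
    echo "$compiler resolves to $resolved"
    case "$resolved" in
        "$avoid_prefix"/*) return 1 ;;
    esac
    return 0
}

# Usage, after sourcing ifortvars.sh:
#   check_compiler_location ifort /panfs || echo "still using the Panfs install"
```

This only checks the compiler found through PATH; a build script that hardcodes an absolute compiler path still has to be inspected by hand.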
ron