We have a very large structure (~10 GB). GCC does not seem to have a problem with it, but ICC segfaults. I've boiled the failure down to simple assignment statements. Files attached. If we declare the structure instead of allocating it, it works fine. Does the Intel compiler have any known problems with allocating large amounts of memory? We've tried moving things around in the structure definition, but it continues to segfault. Any ideas?
icc -mcmodel=medium -shared-intel -g test.c -o test
gcc -mcmodel=medium -g test.c -o test
icc: Version 12.1.5.339 Build 20120612
gcc: gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4)
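For readers without the attachment, here is a minimal sketch of the pattern being tested: allocate the structure on the heap, then assign to individual members. It is not the attached test.c. The member names follow the program output below, but the types, the array dimensions (`NX`, `NY`) and the function name `grid_write_test` are assumptions made for illustration, and the structure is scaled down to a few hundred bytes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, scaled-down stand-in for the real grid structure.
   Member names follow the program output; dimensions are invented. */
#define NX 4
#define NY 4

typedef struct {
    int   count[NX][NY];
    float mean[NX][NY];
    float stdev[NX][NY];
} Stat;

typedef struct {
    Stat rain, d0, Nt, dBZm;
} Grid;

typedef struct {
    Grid G2;
} Grids;

/* Allocate the structure on the heap, write one element of each d0 array,
   and read the values back. Returns 0 on success, nonzero on failure. */
int grid_write_test(void)
{
    Grids *grids = malloc(sizeof *grids);
    if (grids == NULL) {
        fprintf(stderr, "malloc of %zu bytes failed\n", sizeof *grids);
        return 1;
    }
    printf("Address of grids->G2.d0.count %p\n", (void *)grids->G2.d0.count);

    grids->G2.d0.count[0][0] = 6;
    grids->G2.d0.mean[0][0]  = 1.5f;
    grids->G2.d0.stdev[0][0] = 2.001f;

    int ok = grids->G2.d0.count[0][0] == 6
          && grids->G2.d0.mean[0][0]  == 1.5f
          && grids->G2.d0.stdev[0][0] == 2.001f;
    free(grids);
    return ok ? 0 : 2;
}
```

Calling `grid_write_test()` from `main` and raising the dimensions until `sizeof(Grids)` passes a few GB would be one way to approach the failing configuration. Checking the `malloc` result first rules out a failed allocation as the source of the fault.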
GCC output:
Allocate the grid struct
sizeof of grids 9.393871GB
Address of grids->G2.rain.count 0x7f5d05143190
Address of grids->G2.d0.count 0x7f5d61d35190
Address of grids->G2.Nt.count 0x7f5d96d2d190
Address of grids->G2.dBZm.count 0x7f5f079a0190
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids->G2.dBZm.count 0x7f5f079a0190
assigned G2.dBZm.count
assigend G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 0x7f5d61d35190
Going to assign d0.count
assigned G2.d0.count
Going to assign d0.mean
assigned G2.d0.mean
Going to assign d0.stdev
assigned G2.d0.stdev
Successful write
Intel ICC:
Allocate the grid struct
sizeof of grids 9.393871GB
Address of grids->G2.rain.count 0x7fc9e070f190
Address of grids->G2.d0.count 0x7fca3d301190
Address of grids->G2.Nt.count 0x7fca722f9190
Address of grids->G2.dBZm.count 0x7fcbe2f6c190
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids->G2.dBZm.count 0x7fcbe2f6c190
assigned G2.dBZm.count
assigend G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 0x7fca3d301190
Going to assign d0.count
Segmentation fault
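The printed addresses also give the distances between members, which can be compared across the two compilers. The sketch below is just that arithmetic; the helper names `byte_distance` and `exceeds_int32` are made up for this example. It does not establish the cause of the fault, only how far apart the members are.

```c
#include <stdint.h>

/* Byte distance between two addresses printed by the test program. */
uint64_t byte_distance(uint64_t lo, uint64_t hi)
{
    return hi - lo;
}

/* Nonzero if the distance does not fit in a signed 32-bit offset. */
int exceeds_int32(uint64_t distance)
{
    return distance > (uint64_t)INT32_MAX;
}
```

With the addresses above, `rain.count` to `d0.count` is 0x5CBF2000 bytes in both the GCC and the ICC run, so the two compilers agree on the layout. `rain.count` to `dBZm.count` is 0x20285D000 bytes (about 8.6e9) in both runs, which is more than a signed 32-bit offset can hold.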
Hi Sergey,
I debugged the sample (the icc executable) with gdb and verified it with valgrind. The segmentation fault occurs at:
0x400d85 <main+1649>; movl $0x6,(%rax)
I made a dump with objdump and attached the listing. I'm not a PC assembler specialist (I'm better with mainframe assembler), but I hope the listing helps a little.
Best regards,
Franz Bernasek
FYI. Here is a similar test program using the same structure but in Fortran. Seems to work as expected.
I know this Fortran is using extensions and not strict F95/2003 but we are wedded to some legacy code for now.
ifort: Version 11.1 Build 20100806 Package ID: l_cprof_p_11.1.073
ifort -mcmodel=medium -shared-intel -fpp test.f90
Output:
Address of grids 601820
Size of grids 10086592896
Address of grids.G2.rain.count 2DD6F9A0
Address of grids.G2.d0.count B255B9A0
Address of grids.G2.Nt.count E75539A0
Address of grids.G2.dBZm.count 8A9619A0
write some values to grid
grids.G2.dBZ.stdev(ii,jj,kk,ll,mm) = 2.010000
Going to assign d0.count
Address of d0.count B255B9A0
Assigned d0.count
grids.G2.d0.count = 6
Successful write
Thank you for the test case. I reproduced the problem and filed a report on this issue. I will let you know as soon as I get an update from the compiler development team.
Mark and Sergey,
Thank you very much for confirming the issue. I've passed this on to our Admins and they have gone through Premier Support as well.
We have other test codes that use upwards of 20GB arrays with no issues, reading and writing to all array elements.
Is the issue related to the size of the structure or the complexity of the structure with many arrays of mixed data types?
Please give us any indication if there is a short vs. long term fix.
We have machines with up to 128GB/256GB of physical memory and for us using these large structures is the most efficient way to consolidate our data. The data in the structure ends up in an internally compressed HDF5 file. In addition, we have applications and libraries written in C, Fortran and mixed language that need to access these large structures.
Thanks.
Using: icc --version
icc (ICC) 12.1.3 20120212
A program similar to jkwi's, with the structure allocated on the stack and my ulimit set to unlimited, works. The program is attached; the output is:
./test_noloop_static
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids.G2.dBZm.count 0x7fffb7f81c40
assigned G2.dBZm.count
assigned G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 0x7ffe12316c40
Going to assign d0.count
assigned G2.d0.count
Going to assign d0.mean
assigned G2.d0.mean
Going to assign d0.stdev
assigned G2.d0.stdev
Successful write
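Since the stack variant depends on the shell's stack limit, a program can check that limit before relying on a very large automatic variable. Below is a small sketch using the POSIX `getrlimit` call with `RLIMIT_STACK`; the function name `stack_limit_allows` is an assumption for this example and is not part of the attached program.

```c
#include <stddef.h>
#include <sys/resource.h>

/* Returns 1 if the current soft stack limit is unlimited or at least
   `bytes`, 0 if it is smaller, and -1 if the limit cannot be queried. */
int stack_limit_allows(size_t bytes)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    return rl.rlim_cur >= (rlim_t)bytes ? 1 : 0;
}
```

Passing the size of the structure, e.g. `stack_limit_allows(sizeof grids)`, should report 1 when `ulimit -s unlimited` is in effect. A passing check is necessary but not sufficient, since the process still needs that much memory to be available.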
Sergey,
>>"Mark, please try to do a very simple test ( for 32-bit and 64-bit platforms ) and here are source codes:"
Your test case compiles and runs fine.
Thanks,
--mark
Hi Sergey,
I compiled your sample with the Parallel Studio XE 2013 for Linux compiler under openSUSE 12.2 64-bit (Linux kernel 3.4.30):
icc (ICC) 13.1.0 20130121
Copyright (C) 1985-2013 Intel Corporation. All rights reserved.
linux-cuda:~ # icpc --version
icpc (ICC) 13.1.0 20130121
Copyright (C) 1985-2013 Intel Corporation. All rights reserved.
No problem: it works, with no compile error and no link error. The executable runs and displays the result 77.
#include <stdio.h>

typedef struct tagDataSet
{
    // 2^26 = 67108864 - Default limit for 32-bit Intel C++ compiler
    // 26 dimensions of extent 2 each; __int8 is a compiler extension
    __int8 iData[2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2];
} DataSet;

DataSet ds = { 0x0 };

int main( void )
{
    // Write and read back the first element of the array
    ds.iData[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] = 77;
    printf( "%d\n", ds.iData[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] );
    return 0;
}
best regards
Franz