Beginner

Very large structs GCC ok, ICC seg fault, simple example.

We have a very large structure, ~10GB. GCC does not seem to have a problem with it, but ICC segfaults. I've boiled it down to simple assignment statements. Any ideas? Files attached. If we declare the structure directly instead of allocating it, it works fine. Are there any known problems with Intel allocating large amounts of memory? We've tried moving things around in the structure definition, but it continues to segfault.

icc -mcmodel=medium -shared-intel -g  test.c -o test
gcc -mcmodel=medium -g test.c -o test

icc: Version 12.1.5.339 Build 20120612
gcc: gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4)

GCC output:
Allocate the grid struct
sizeof of grids 9.393871GB
Address of grids->G2.rain.count 0x7f5d05143190
Address of grids->G2.d0.count 0x7f5d61d35190
Address of grids->G2.Nt.count 0x7f5d96d2d190
Address of grids->G2.dBZm.count 0x7f5f079a0190
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids->G2.dBZm.count 0x7f5f079a0190
assigned G2.dBZm.count
assigned G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 0x7f5d61d35190
Going to assign d0.count
assigned G2.d0.count
Going to assign d0.mean
assigned G2.d0.mean
Going to assign d0.stdev
assigned G2.d0.stdev
Successful write

Intel ICC:
Allocate the grid struct
sizeof of grids 9.393871GB
Address of grids->G2.rain.count 0x7fc9e070f190
Address of grids->G2.d0.count 0x7fca3d301190
Address of grids->G2.Nt.count 0x7fca722f9190
Address of grids->G2.dBZm.count 0x7fcbe2f6c190
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids->G2.dBZm.count 0x7fcbe2f6c190
assigned G2.dBZm.count
assigned G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 0x7fca3d301190
Going to assign d0.count
Segmentation fault




Valued Contributor II

This is a short follow-up regarding the problem with the Microsoft C++ compiler:

>>------ Build started: Project: MemTestApp, Configuration: Release x64 ------
>>Compiling...
>>cl : Command line warning D9035 : option 'Wp64' has been deprecated and will be removed in a future release
>>Stdafx.cpp
>>c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(484) : error C2089: '' : 'struct' too large

I reported the problem to Microsoft and this is their response:

...Visual C++ does not handle structures larger than 2GB. This appears to be a known issue...
New Contributor I

Hi Sergey,

I debugged the sample ( the icc executable ) with gdb and confirmed it with valgrind. The segmentation fault is at:

0x400d85 <main+1649>:   movl $0x6,(%rax)

I made a dump with objdump and attached the listing. I'm not a PC assembler specialist ( I'm better with mainframe assembler ), but I hope the listing helps a little.

best regards

Franz Bernasek

New Contributor I

Sergey,

here is the zip file now.

best regards

Franz Bernasek

Valued Contributor II

Thanks.

>>...the segmentation error is on:
>>0x400d85 <main+1649>:   movl $0x6,(%rax)

and in the code this is:

grids->G2.d0.count[ii][jj][kk][ll][mm] = ( __int32 )6;

Let's wait for a response from Mark-sabahi (Intel) since he is already investigating the problem.
Valued Contributor II

I detected another problem with the Intel C++ compiler when the struct was declared for automatic allocation. Please take a look at the compilation outputs:

------ Build started: Project: MemTestApp, Configuration: Release x64 ------
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [Intel(R) 64]... (Intel C++ Environment)
Stdafx.cpp
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [Intel(R) 64]... (Intel C++ Environment)
MemTestApp.cpp
Linking... (Intel C++ Environment)
catastrophic error: Local variable size exceeds supported maximum
xilink: error #10014: problem during multi-file optimization compilation (code 1)
xilink: error #10014: problem during multi-file optimization compilation (code 1)
MemTestApp - 3 error(s), 0 warning(s), 0 remark(s)

------ Build started: Project: MemTestApp, Configuration: Debug x64 ------
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [Intel(R) 64]... (Intel C++ Environment)
Stdafx.cpp
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [Intel(R) 64]... (Intel C++ Environment)
MemTestApp.cpp
catastrophic error: Local variable size exceeds supported maximum
compilation aborted for .\MemTestApp.cpp (code 1)
MemTestApp - 1 error(s), 0 warning(s), 0 remark(s)

Note: An updated test project ( update 2 ) is attached.
Beginner

FYI: here is a similar test program using the same structure, but in Fortran. It seems to work as expected.
I know this Fortran uses extensions and is not strict F95/F2003, but we are wedded to some legacy code for now.


ifort: Version 11.1    Build 20100806 Package ID: l_cprof_p_11.1.073

ifort -mcmodel=medium -shared-intel -fpp test.f90

Output:

Address of grids                  601820
 Size of grids            10086592896
Address of grids.G2.rain.count                2DD6F9A0
Address of grids.G2.d0.count                B255B9A0
Address of grids.G2.Nt.count                E75539A0
Address of grids.G2.dBZm.count                8A9619A0
 write some values to grid
 grids.G2.dBZ.stdev(ii,jj,kk,ll,mm) =    2.010000    
 Going to assign d0.count
Address of d0.count               B255B9A0
 Assigned d0.count
 grids.G2.d0.count =            6
 Successful write



Valued Contributor II

>>...Here is a similar test program using the same structure but in Fortran. Seems to work as expected...

Do you want me to verify the new test under Windows 7 Professional 64-bit?
Valued Contributor II

>>c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(484) : error C2089: '' : 'struct' too large
>>...Visual C++ does not handle structures larger than 2GB. This appears to be a known issue...

By the way, the problem has not been fixed for three years ( it was first reported in February 2010 ).
Employee

Thank you for the test case. I reproduced the problem and filed a report on this issue. I will let you know as soon as I get an update from the compiler development team.

Valued Contributor II

>>... I reproduced the problem and filed a report on this issue. I will let you know as soon as I get an update from the compiler development team...

That is good news, Mark. Thanks. I think limitations on the size of structures ( especially for automatic allocation ) should be removed when an application is built for a 64-bit platform. I would also suggest making the limit dynamic, dependent on the total amount of memory ( physical + virtual ) available on the system.
Beginner

  Mark and Sergey,

  Thank you very much for confirming the issue. I've passed this on to our Admins and they have gone through Premier Support as well.

  We have other test codes that use upwards of 20GB arrays with no issues, reading and writing to all array elements.
  Is the issue related to the size of the structure or the complexity of the structure with many arrays of mixed data types?

  Please give us any indication if there is a short vs. long term fix.

  We have machines with up to 128GB/256GB of physical memory and for us using these large structures is the most efficient way to consolidate our data. The data in the structure ends up in an internally compressed HDF5 file.  In addition, we have applications and libraries written in C, Fortran and mixed language that need to access these large structures.

  Thanks.

Valued Contributor II

>>Is the issue related to the size of the structure or the complexity of the structure with many arrays of mixed data types?

As we can see, there is an issue with the Intel C++ compiler related to the size of the parent structure when it is greater than ~4GB; it is not related to the complexity of the structure declarations.

>>We have machines with up to 128GB/256GB of physical memory and for us using these large structures is the most efficient way to consolidate our data.

I absolutely agree, since that approach is very simple. I hope your comments about computers with 128GB/256GB of physical memory will be taken into account, because we live in times of gigabytes and terabytes, not megabytes. As a workaround for the Intel C++ compiler, I would suggest splitting the primary structure into a couple of smaller structures; a simple test could verify whether that works.
Beginner

Using: icc --version
icc (ICC) 12.1.3 20120212

A similar program to jkwi's, with the structure allocated on the stack and my ulimit set to unlimited, works. The program is attached and the output is:

./test_noloop_static
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids.G2.dBZm.count 0x7fffb7f81c40
assigned G2.dBZm.count
assigned G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 0x7ffe12316c40
Going to assign d0.count
assigned G2.d0.count
Going to assign d0.mean
assigned G2.d0.mean
Going to assign d0.stdev
assigned G2.d0.stdev
Successful write

Valued Contributor II

Mark, please try a very simple test ( for 32-bit and 64-bit platforms ). Here is the source code:

#include "stdio.h"

typedef struct tagSmallDataSet
{
    int iData[4][4][4][4][4][4][4][4][4][4];
} SmallDataSet;

void main( void )
{
    SmallDataSet sds;
    sds.iData[0][0][0][0][0][0][0][0][0][0] = 777;
    printf( "%d\n", ( int )sds.iData[0][0][0][0][0][0][0][0][0][0] );
}

Note: Added '.iData' to the printf argument, so it now reads 'sds.iData[0]...'. Sorry about that.
Valued Contributor II

>>Using: icc --version
>>icc (ICC) 12.1.3 20120212
>>A similar program to jkwi's with the structure allocated on the stack and my ulimit set to unlimited works.

Unfortunately, I don't have a chance to verify it with a 64-bit Intel C++ compiler version 12.1.3. In my tests on a 64-bit Windows platform I used version 13.0.0.089.
Employee

Sergey,

>>"Mark, please try to do a very simple test ( for 32-bit and 64-bit platforms ) and here are source codes:"

Your test case compiles and runs fine.

Thanks,
--mark

Valued Contributor II

>>Your test case compiles and runs fine.

Mark, what version of the Intel C++ compiler do you use? I received a private email from another IDZ user who confirms the problem with the latest test case. In essence, there is no need for GBs of memory in order to reproduce the problem for an N-dimensional structure ( where N is greater than 5 ). I will also run that simplest test with Intel C++ compiler version 8.1.038 and all the rest of the C/C++ compilers I have. Attached is a new test project for Visual Studio 2008 Professional Edition. Intel C++ compiler XE 2013.0.0.089 compiles it, but the executable fails ( an exception is thrown ); please take a look at the sources.
Valued Contributor II

Hi everybody,

The following compilation error happens when a 26-D struct ( size 67108864 bytes ) is declared as static, and I would consider it an expected error ( a test with a 25-D struct worked ):

...>icl Test.cpp
Intel(R) C++ Compiler XE for applications running on IA-32, Version 12.1.3.300 Build 20120130
Copyright (C) 1985-2012 Intel Corporation. All rights reserved.
Test.cpp
Test.cpp(8) (col. 9): catastrophic error: out of memory
compilation aborted for Test.cpp (code 4)

You need to use switches that control the heap or stack commit and reserve values used by the linker (!).

#include "stdio.h"

typedef struct tagDataSet
{
    // 2^26 = 67108864 - Default limit for 32-bit Intel C++ compiler
    //           1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
    __int8 iData[2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2];
} DataSet;

DataSet ds = { 0x0 };

int main( void )
{
    ds.iData[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] = 77;
    printf( "%d\n", ds.iData[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] );
    return ( int )0;
}

Note 1: In the case of automatic allocation use the /F compiler switch:
/F - set the stack reserve amount specified to the linker
Note 2: In the case of static allocation of a large struct, a heap value ( as large as possible ) has to be set for the linker; please do your own verifications and tests.
Valued Contributor II

I will do one more test ( with a test case from the previous post ) on a 64-bit Windows 7 Professional with Intel C++ compiler XE 2013 version 2013.0.0.089 and post results.
New Contributor I

Hi Sergey,

I compiled your sample with the Parallel Studio XE 2013 for Linux compiler, under openSUSE 12.2 64-bit ( Linux kernel 3.4.30 ):

linux-cuda:~ # icpc --version
icpc (ICC) 13.1.0 20130121
Copyright (C) 1985-2013 Intel Corporation.  All rights reserved.

No problem: no compile error, no link error, and the executable runs; the result 77 is displayed.




#include "stdio.h"

typedef struct tagDataSet
{
    // 2^26 = 67108864 - Default limit for 32-bit Intel C++ compiler
    //           1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
    __int8 iData[2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2];
} DataSet;

DataSet ds = { 0x0 };

int main( void )
{
    ds.iData[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] = 77;
    printf( "%d\n", ds.iData[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] );
    return ( int )0;
}

best regards

Franz
