Intel® C++ Compiler

out of memory asking for XXX bytes, backend signals (0)

Marián__VooDooMan__M
New Contributor II

Greetings,

I have Intel Parallel Studio (Composer et al.) 2011, update 6.

The compiler version string is "Intel C++ Compiler 11.1.082 [Intel 64]", i.e. I am using the x64 build.

I am building a HUGE library (wxWidgets), and when I enable every optimization under the sun, the link phase (link-time code generation) fails with the error "out of memory asking for XXX bytes" or "backend signals (0)".

Although I have 8 GiB of RAM, an 8 GiB swap/paging file, and Vista x64, this ICL version sadly ships only a 32-bit binary, so as a 32-bit process it cannot allocate more than about 3.8 GiB. I examined this with Process Explorer from Sysinternals (the company is now owned by Microsoft, and the maintainer is Mark Russinovich, a kernel developer at Microsoft).

I evaluated an older ICL compiler a year or two ago (I can't recall the exact name/version); for x64 builds it used an x64 binary of icc.exe, and everything worked just fine.

My application is for my own non-commercial purposes (sound generation/music creation software; I release music under a Creative Commons copyleft license, downloadable from my site). Since it is extremely performance-dependent and heavily threaded, and it uses the wxWidgets library, I just want to enable every optimization under the sun.

My questions are:

1. Is there a plan to install an x64 version of the .exe (for x64 Windows) alongside the x86 version for those who have an x86 OS?

2. What would you recommend? Which product should I evaluate to resolve my issues? My goal is to enable every single optimization under the sun, without having to disable a few of them just to build the library successfully.

Best,
Marian

levicki
Valued Contributor I
If I remember correctly, the wxWidgets library is cross-platform user interface code. I sincerely doubt your application will benefit from optimizing it by enabling "every optimization under the sun".

Instead, you should focus on optimizing your own numerically intensive code, and you should look into the DSP primitives offered by the Intel Performance Primitives library (signal processing part).

Finally, enabling optimizations blindly without knowing what each of them does may produce slower code than if you carefully choose optimization options to suit your needs.

Code optimization is an art, not brute force.

Marián__VooDooMan__M
New Contributor II
Quoting Igor Levicki
If I remember correctly, the wxWidgets library is cross-platform user interface code. I sincerely doubt your application will benefit from optimizing it by enabling "every optimization under the sun".

Instead, you should focus on optimizing your own numerically intensive code, and you should look into the DSP primitives offered by the Intel Performance Primitives library (signal processing part).

Finally, enabling optimizations blindly without knowing what each of them does may produce slower code than if you carefully choose optimization options to suit your needs.

Code optimization is an art, not brute force.

Of course, in my own code I don't enable every optimization under the sun; I fine-tune. But that library is huge, and a few GUI-related things in my application are a bit slow without optimization. Unfortunately, I don't have spare time to investigate the library's code and all of its internals, but I found with previous versions that enabling every single optimization produces really good results. And because the GUI is not too performance-critical in my case, that is just fine.

levicki
Valued Contributor I
Well, if you need a cross-platform UI, then you are bound to use wxWidgets or a similar library. If you are doing Windows-only development, native code and the Win32 API are much more efficient.

There is also an option to "roll your own", however that requires in-depth knowledge of all platforms you intend to cover.

Regarding the compiler, you can try using -Qip instead of -Qipo to see if the problem goes away.


Marián__VooDooMan__M
New Contributor II
Quoting Igor Levicki
Well, if you need a cross-platform UI, then you are bound to use wxWidgets or a similar library. If you are doing Windows-only development, native code and the Win32 API are much more efficient.

There is also an option to "roll your own", however that requires in-depth knowledge of all platforms you intend to cover.

Regarding the compiler, you can try using -Qip instead of -Qipo to see if the problem goes away.

Yes, -Qip instead of -Qipo indeed worked well. I was already thinking about this in the past.


Marián__VooDooMan__M
New Contributor II
Recently I have been playing with Intel Parallel Studio XE 2011. Since I have Vista x64, it has "mcpcom.exe" (the link-time IL byte code compiler/link-time code generator) in a 64-bit version! So my problem disappeared, and I can use -Qipo and take full advantage of multi-file optimizations (instead of single-file).

I have 8 GiB of RAM and an 8 GiB swap/paging file, and the 64-bit "mcpcom.exe" process is eating over 11 GiB of memory(!) at link time. So you can imagine how my computer slows down from swapping pages of memory to the HDD and back (the HDD is thrashing EXTREMELY)... It has been generating code for over 9 hours now, and I wonder when it will finish. But I am patient.

But anyway, it is worth it to me to use -Qipo over -Qip. So I must be patient, and luckily I will need to build the wxWidgets library only once (as opposed to my DSP sound software/real-time sound generator/visual programming software, which uses that GUI library).

  1. Conclusion: If you get errors like "out of memory asking for XXX bytes" while compiling a HUGE project with -Qipo and you don't want to use -Qip, evaluate and then maybe buy Intel Parallel Studio XE 2011 for your x64 Windows OS.
  2. NB: On x86, i.e. a 32-bit Windows OS, it will NOT work, since the Intel compiler uses the 32-bit version of "mcpcom.exe" in that case, and the memory limit for 32-bit processes is approximately 3.8 GiB.
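
The 32-bit limit in the conclusion above comes down to pointer-width arithmetic, which can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the 11 GiB figure is the link-time usage reported earlier in this post, not a measured requirement of the compiler.

```python
GIB = 2**30  # bytes per GiB


def address_space_gib(pointer_bits: int) -> float:
    """Upper bound on addressable memory for a given pointer width, in GiB."""
    return 2**pointer_bits / GIB


def fits_in_address_space(required_gib: float, pointer_bits: int) -> bool:
    """True if a request of `required_gib` is within the pointer-width ceiling."""
    return required_gib <= address_space_gib(pointer_bits)


print(address_space_gib(32))          # 4.0
print(fits_in_address_space(11, 32))  # False
print(fits_in_address_space(11, 64))  # True
```

The 4 GiB value is a hard ceiling for 32-bit pointers; the amount a 32-bit process can actually use is somewhat lower, which is consistent with the roughly 3.8 GiB observed above.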

Om_S_Intel
Employee
Sometimes it may help to use the /Qipo[n] compiler option. This option generates n object files when using IPO and reduces the memory requirement.

Marián__VooDooMan__M
New Contributor II
Igor Levicki wrote:

Well, if you need a cross-platform UI, then you are bound to use wxWidgets or a similar library. If you are doing Windows-only development, native code and the Win32 API are much more efficient.

There is also an option to "roll your own", however that requires in-depth knowledge of all platforms you intend to cover.

Regarding the compiler, you can try using -Qip instead of -Qipo to see if the problem goes away.

Alright, but I want to use wxWidgets so I can port to other platforms later. wxWidgets is not only a GUI library; it also contains its own STL-like containers (used by its API, though they are general-purpose, not GUI-bound). With "every optimization under the sun" enabled, my benchmarks show a great speed improvement when working with these containers, though where the code is not GUI-related I prefer using the STL, of course. "Optimization is an art, not 'enabling everything under the sun'", but I don't have much time to investigate the library's internals and fine-tune it, so I have enabled everything (with a bit of sanity), and the result is indeed very fast. I don't need even faster code, so I am not forced to do the "art" on wxWidgets; I do the "art" on my application that uses this library (i.e. the non-GUI code).

Marián__VooDooMan__M
New Contributor II
om-sachan (Intel) wrote:

Sometimes it may help to use the /Qipo[n] compiler option. This option generates n object files when using IPO and reduces the memory requirement.

I am using a newer version of ICC, so this is a little bit outdated; the newer version is a 64-bit executable, so there is (almost) no memory limit. Still, for 2 sub-libraries it eats about 7 GiB of RAM, and physically I have only 8 GiB plus lots of swap. It swaps excessively, so I appreciate your /Qipo suggestion. Thank you for the great idea. I will try it.

Marián__VooDooMan__M
New Contributor II
VooDooMan wrote:

Quote:

om-sachan (Intel) wrote:

Sometimes it may help to use the /Qipo[n] compiler option. This option generates n object files when using IPO and reduces the memory requirement.

I am using a newer version of ICC, so this is a little bit outdated; the newer version is a 64-bit executable, so there is (almost) no memory limit. Still, for 2 sub-libraries it eats about 7 GiB of RAM, and physically I have only 8 GiB plus lots of swap. It swaps excessively, so I appreciate your /Qipo suggestion. Thank you for the great idea. I will try it.

I am using ICC x64 13.0 (non-beta) under MSVC 2010 (the package strings are "Intel® C++ Composer XE 2013 Package ID: w_ccompxe_2013.0.089" and "Intel® C++ Composer XE 2013 Integration for Microsoft Visual Studio* 2010, Version 13.0.1179.2010"). The MSVC command line is:
/I"..\..\lib\vc_dll\mswud" /I"..\..\include" /I"..\..\src\tiff\libtiff" /I"..\..\src\jpeg" /I"..\..\src\png" /I"..\..\src\zlib" /I"..\..\src\regex" /I"..\..\src\expat\lib" /Zi /nologo /W4 /MP /debug:expr-source-pos /O3 /Ob2 /Oi /Ot /Qipo /Qftz /Qopt-matmul /Quse-intel-optimized-headers /D "_SECURE_SCL=0" /D "WIN32" /D "_USRDLL" /D "DLL_EXPORTS" /D "_DEBUG" /D "__WXMSW__" /D "WXBUILDING" /D "WXUSINGDLL" /D "WXMAKINGDLL_CORE" /D "wxUSE_BASE=0" /D "_VC80_UPGRADE=0x0600" /D "_WINDLL" /D "_UNICODE" /D "UNICODE" /GF /EHsc /MDd /GS- /fp:fast /QxHost /Zc:wchar_t /Zc:forScope /GR /Yu"wx/wxprec.h" /Fp"vc_mswuddll\wxprec_coredll.pch" /Fa".\vc_mswuddll\core\" /Fo".\vc_mswuddll\core\" /Fd".\vc_mswuddll\core\vc100.pdb" /Qvec-report1 /Qlong_double /Qopt-class-analysis /Qopt-mem-bandwidth2 /Qopt-streaming-stores:auto /Qipo9
It produces a warning saying that /Qipo9 is overriding /Qipo, but it still generates 3 IPO object files (the same as without /Qipo9, i.e. with only /Qipo), and all three spawned processes consume about 7 GiB of RAM. IMHO this is a bug, not a feature. Any suggestions? This is a bug report.

Mark_S_Intel1
Employee
The /Qipo9 appears later in the command line, after /Qipo, and overrides it, hence the warning you are getting. Please see the section "IPO-Related Performance Issues", especially the reference to "IPO for Large Programs", in the user's guide for additional information on using IPO with large projects. With /QipoN, the compiler should generate N object files unless N exceeds the number of source files.
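
The two rules in this reply (the last /Qipo option on the command line wins, and the object count is capped by the number of source files) can be modeled with a short Python sketch. This is a toy model of the behavior as described here, not the compiler's actual logic; the function names and the source-file counts are hypothetical, and treating plain /Qipo as "let the compiler decide" is an assumption.

```python
import re


def effective_ipo_n(args):
    """Return N from the last /Qipo[N] option; 0 for plain /Qipo; None if absent."""
    result = None
    for arg in args:
        match = re.fullmatch(r"[/-]Qipo(\d*)", arg)
        if match:
            # A later option overrides an earlier one.
            result = int(match.group(1) or 0)
    return result


def expected_ipo_objects(n, num_sources):
    """N object files unless N exceeds the number of source files.

    Returns None when no explicit N is given (assumed: the compiler decides).
    """
    if not n:
        return None
    return min(n, num_sources)


args = "/O3 /Qipo /QxHost /Qipo9".split()
n = effective_ipo_n(args)
print(n)                             # 9
print(expected_ipo_objects(n, 3))    # 3
print(expected_ipo_objects(n, 200))  # 9
```

Under this model, a build with more than nine source files and /Qipo9 would be expected to produce nine object files, so a smaller count is worth comparing against the user's guide sections mentioned above.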