Hi there,
I'm still using the 11.0.075 compiler. We moved from XP to Windows 7 and got new hardware. I now have an Intel Core i5-2540M processor; before that I had a Core 2 Duo CPU.
But now our programs run on only one CPU (25% load with 4 cores). I have changed nearly every compiler setting, but they won't use all 4 cores the way they used both cores of my old machine.
My command line is (and so it was on the old machine):
/nologo /O3 /Og /QaxSSE2 /QxHost /Qparallel /Qipo /D_Release /Qopenmp /warn:none /Qfp-stack-check /module:"Release\\\\" /object:"Release\\\\" /check:none /libs:qwin /c
Do I really have to buy the new compiler to get parallelism back?
Markus
8 Replies
Did you check whether the old compiler is recognizing the new CPU as genuine Intel and choosing a suitable option? If not, do you really need to depend on it doing so? Is /QxSSE2 or /arch:SSE2 (or no such option) any better? I suppose those tests are included in "nearly every compiler setting." To me, it always raises doubts when you specify conflicting options (hoping for the last one to take effect?).
Which version of Visual Studio do you use? If VS2005, did you apply both the general service pack and the Vista/Win7 one?
Did you install at least as much RAM on the new machine as on the old?
I have Visual Studio 2008 (Version 9.0.21022.8 RTM) installed.
How can I find out whether the compiler identifies the CPU as genuine Intel or not?
Maybe I misunderstand the alternative code path. I thought that with /QxHost and /QaxSSE2 the executable consists of "two programs": one optimized for my CPU and one optimized for Netburst (SSE2) CPUs, so that older computers get some optimizations as well. But removing the /QaxSSE2 option didn't help.
By "nearly every compiler setting" I mean changing the options under Fortran - Optimization: "Use Intel Processor Extension", "Optimization", "Global Optimizations", "Interprocedural Optimization", "Parallelization" and so on.
I have 8GB RAM on my new laptop; before that I had 4GB with the 32-bit XP system.
Markus
Here is some information that may be useful.
IFORT /nologo /O3 /QxHost /QxSSE /Fa /c daxpy.f
(note QxSSE, not QaxSSE) produces the helpful warning
IFORT: command line warning #10121: overriding '/QxHost' with '/QxSSE'
However, a command closer to your usage,
IFORT /nologo /O3 /QxHost /QaxSSE /Fa /c daxpy.f
produces no warning, but the .asm file it produces contains plenty of three-operand FPU instructions (such as 'vaddsd xmm2, xmm1, xmm1') that would not be available on older processors. As you guessed, the /QaxSSE option had no effect other than placing a string into the output file(s).
We would need to know more about the potential for parallelism, cache usage and such aspects of your application. The Intel compiler is so good at optimization that trying to help it by placing parallelization directives in just a few places in one's source code can actually slow the program.
I doubt the SSE settings have any relevance to the question, since he is talking about parallel CPU usage, not vector maths.
Are you using OpenMP?
If so then call omp_get_max_threads() to find out how many threads are available.
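A minimal sketch of that check, assuming free-form source and a build with /Qopenmp. The program name and variable names are illustrative only; `omp_get_max_threads` and `omp_get_num_procs` are standard OpenMP runtime routines. The `!$` lines are OpenMP conditional compilation, so the program still builds (and reports 1) when OpenMP is switched off.

```fortran
program check_threads
  !$ use omp_lib                 ! compiled only when OpenMP is enabled
  implicit none
  integer :: max_threads, num_procs

  ! Fallback values for a build without OpenMP
  max_threads = 1
  num_procs   = 1

  ! These two lines are active only in an OpenMP build
  !$ max_threads = omp_get_max_threads()
  !$ num_procs   = omp_get_num_procs()

  print '(a,i0)', 'Max OpenMP threads:          ', max_threads
  print '(a,i0)', 'Processors seen by OpenMP:   ', num_procs
end program check_threads
```

If this prints 1 for the thread count on a multi-core machine, the likely suspects are the OpenMP runtime setup or an environment setting such as OMP_NUM_THREADS rather than the parallel loops themselves.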
You have /Qopenmp there but do you have OpenMP directives in your code? I can't think of anything that would prevent OpenMP from using the available cores.
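For reference, here is a small sketch of what an OpenMP directive looks like in Fortran source. The daxpy-style loop, the array size and all names are made up for illustration and are not taken from Markus's program.

```fortran
program omp_directive_demo
  implicit none
  integer, parameter :: n = 1000000
  double precision, allocatable :: x(:), y(:)
  double precision :: a, total
  integer :: i

  allocate(x(n), y(n))
  a = 2.0d0
  x = 1.0d0
  y = 3.0d0
  total = 0.0d0

  ! /Qopenmp only activates directives like the one below; it does not
  ! add any by itself. Built without OpenMP, the directive lines are
  ! plain comments and the loop runs serially.
  !$omp parallel do reduction(+:total)
  do i = 1, n
     y(i) = a * x(i) + y(i)      ! daxpy-style update
     total = total + y(i)        ! combined across threads by the reduction
  end do
  !$omp end parallel do

  print '(a,f12.1)', 'Sum = ', total
end program omp_directive_demo
```

A program with no such directives would rely entirely on /Qparallel (auto-parallelization) to use more than one core.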
I'm concerned about whether /QxHost is useful on a CPU model which wasn't tested during development of your compiler. As mecej4 pointed out, the a in /QaxSSE2 ought to be redundant in the absence of other options, but a possible reading of /QaxSSE2 /QxHost might be "if genuine Intel, run SSE2 code, otherwise run code for the CPU which is present at compile time, if that CPU is recognized by the compiler." There have been several cases in the past where one of those choices failed with an old compiler. Your compiler would not generate AVX code for /QxHost, as a recent compiler would.
I have seen cases where the compiler failed to optimize when too many options were present. I hope that your choice between OpenMP and auto-parallel is not affected directly by your change of CPU, but auto-parallel might be affected by additional vectorizations which might be attempted via SSE4, if your old CPU was a Core 2 Duo without SSE4.
I've found my mistake... Maybe someone can tell me why it prevented parallelism:
/libs:static /Qprec-div /Qprec-sqrt /assume:protect_parens /Qopenmp-link:static
were my additional command line settings; I should have posted them too.
Removing the /Qopenmp-link:static entry brought back my 99% CPU usage. But why would a statically linked OpenMP library show such behaviour?
Thanks for the help!
Markus
Quoting TimP (Intel)
I'm concerned about whether /QxHost is useful on a CPU model which wasn't tested during development of your compiler.
I would not be concerned about that. The /QxHost feature will query the CPUID capability bits if the CPU type is not recognized.