I'm sure this has been covered already, but I'm seeing a marked decrease in performance for some functions in ifx compared to code generated by ifort. I was running a test to see which is faster, cmplx( cos(x), sin(x) ) or exp( cmplx(0.0, x) ), and got the following data (numbers are CPU times relative to the first test, so lower is faster):
COS,SIN (ifx): 1.0
EXP (ifx): 3.919
COS,SIN (ifort): 0.275
EXP (ifort): 0.280
These results are curious because I was under the impression that it's cheaper to compute the COS and SIN of an angle together than separately, and EXP( CMPLX(0.0, X) ) makes it explicit that we want both values. So it was a bit surprising that the EXP form is slower with both ifx and ifort. But the bigger shock was that ifort was 3.6x faster (COS,SIN) and 14x faster (EXP) than the same code compiled with ifx, using the same compile arguments. We are preparing to transition our scientific numerical package from ifort to ifx, so a slowdown of this size is a serious concern.
ifort version 2021.11.1
ifx version 2024.0.2
Compile flags for both tools are: "-O3 -assume nounderscore -warn all"
Test routine:
      PROGRAM CIS_TEST
      COMPLEX*8 VAR1, OUT
      REAL*4 ARG
      INTEGER*4 I
C     Test 1: build the result from separate COS and SIN calls
      OUT = 0.0
      DO I = 1, 100000000
        ARG = 2.0 * 3.14159 * 150000000.0 * SNGL(I) / 1000000.0
        VAR1 = CMPLX( COS(ARG), SIN(ARG) )
        OUT = OUT + VAR1
      END DO
      CALL PRINT_USAGE
      WRITE (*,*) OUT ! Prevent code elimination
C     Test 2: same result via the complex exponential
      OUT = 0.0
      DO I = 1, 100000000
        ARG = 2.0 * 3.14159 * 150000000.0 * SNGL(I) / 1000000.0
        VAR1 = EXP( CMPLX(0.0, ARG) )
        OUT = OUT + VAR1
      END DO
      CALL PRINT_USAGE
      WRITE (*,*) OUT ! Prevent code elimination
      END
/* Timing helper, called from Fortran as PRINT_USAGE.
   Prints user and system CPU time used since the previous call. */
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

static double u_last, s_last = 0.0;

void print_usage() {
    struct rusage usage;
    double secs;

    getrusage( RUSAGE_SELF, &usage );

    secs = usage.ru_utime.tv_sec + usage.ru_utime.tv_usec * 0.000001;
    printf("User time: %.3f\n", secs - u_last);
    u_last = secs;

    secs = usage.ru_stime.tv_sec + usage.ru_stime.tv_usec * 0.000001;
    printf("System time: %.3f\n", secs - s_last);
    s_last = secs;
}
Digging into this a bit more, it appears that ifx is not recognizing which underlying math routine a given expression can be lowered to (i.e., what does a call to EXP() on a purely imaginary argument really need to compute?). Example: ifort recognizes that CMPLX( COS(ARG), SIN(ARG) ) and EXP( CMPLX(0.0, ARG) ) are mathematically equivalent, and if you examine the generated assembly, both produce calls to __libm_sse2_sincosf. ifx, however, calls cexpf for the EXP case, which is slower here. Despite trying different combinations of the "-march=", "-arch", etc. flags, I can't get ifx to replace the cexpf call with a sincosf call, let alone a LIBM SSE2-optimized version of either one. So that's probably why it's so much slower: ifort was using an Intel-optimized SSE2 math library and calling sincosf, whereas the link map suggests ifx is using an SVML library and calling expf, cosf, and sinf separately (i.e., it isn't even calling sincosf for the combined CMPLX( COS(ARG), SIN(ARG) ) test case).
Significant differences in performance with ifx vs ifort
The largest ARG in that program is about 9.42477E+10, far beyond where COS(ARG), SIN(ARG) and EXP(CMPLX(0,ARG)) can be meaningfully calculated in single precision. Did ifort and ifx choose different workarounds?
All of those functions are periodic, so they'll just be computed on MOD(ARG, 2*PI). The difference in speed between EXP and SIN/COS arises because ifx is not decomposing the former into the latter; computing the exponential of a complex number is expensive compared to computing the SIN and COS of a single real number.