Intel® Fortran Compiler

In this program a LAPACK routine named "dsyevr" is called, and ifort 10.0 gives a wrong result.

f2003
Beginner
1,077 Views
Is it a bug?
Thanks!

=====================================
apple@localhost ~ $ ifort test.f90 -llapack_if -lblas_if
apple@localhost ~ $ ./a.out
eval
0.174584482225417 0.174584482225417 0.174584482225417

apple@localhost ~ $ gfortran test.f90 -llapack -lblas
apple@localhost ~ $ ./a.out
eval
-0.48943056644789246 -0.18635420119952728 0.74131197099260637


test.f90 :
=====================================
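program main
  implicit none
  integer(4) :: n, il, iu, m, isuppz(6), iwork(30), lwork, liwork, info
  real(8)    :: a(3,3), z(3,3), w(3), work(80), vl, vu, abstol

  n      = 3        ! order of the matrix
  il     = 1        ! index of the smallest eigenvalue wanted
  iu     = 3        ! index of the largest eigenvalue wanted
  abstol = 0.0_8
  lwork  = 80       ! dsyevr requires lwork >= 26*n
  liwork = 30       ! dsyevr requires liwork >= 10*n

  ! symmetric 3x3 matrix, stored column by column
  a = reshape([0.694444444444445_8, 0.208135487069874_8, 1.069274406890994E-002_8, &
               0.208135487069874_8, -0.211332634659698_8, 0.133515358818456_8, &
               1.069274406890994E-002_8, 0.133515358818456_8, -0.417584606439560_8], [3,3])

  ! 'V': eigenvalues and eigenvectors, 'I': select by index il..iu,
  ! 'U': use the upper triangle of a. vl and vu are not referenced for 'I'.
  call dsyevr('V', 'I', 'U', n, a, n, vl, vu, il, iu, abstol, &
              m, w, z, n, isuppz, work, lwork, iwork, liwork, info)
  print *, "eval"
  print *, w
end program main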
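
A quick way to judge which of the two outputs above is plausible, without linking any LAPACK at all: for a symmetric matrix the eigenvalues must sum to the trace. The sketch below applies that check to the matrix from test.f90 and to the two printed eigenvalue sets. The program name `check_trace`, the helper `report`, and the tolerance `tol` are illustrative choices, not part of the original test. Under this check the gfortran values agree with the trace and the ifort values do not.

```fortran
program check_trace
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: tol = 1.0e-12_dp   ! illustrative tolerance (assumption)
  real(dp) :: a(3,3), w_ifort(3), w_gfortran(3)

  ! Same symmetric matrix as in test.f90
  a = reshape([ 0.694444444444445_dp, 0.208135487069874_dp, &
                1.069274406890994e-002_dp, &
                0.208135487069874_dp, -0.211332634659698_dp, &
                0.133515358818456_dp, &
                1.069274406890994e-002_dp, 0.133515358818456_dp, &
                -0.417584606439560_dp ], [3,3])

  ! Eigenvalues as printed by the two builds in the post above
  w_ifort    = [ 0.174584482225417_dp, 0.174584482225417_dp, &
                 0.174584482225417_dp ]
  w_gfortran = [ -0.48943056644789246_dp, -0.18635420119952728_dp, &
                 0.74131197099260637_dp ]

  call report("ifort build   ", w_ifort)
  call report("gfortran build", w_gfortran)

contains

  ! Compare the sum of the eigenvalues with the trace of a
  subroutine report(label, w)
    character(*), intent(in) :: label
    real(dp),     intent(in) :: w(:)
    real(dp) :: tr
    integer  :: i

    tr = 0.0_dp
    do i = 1, size(a, 1)
      tr = tr + a(i,i)
    end do

    if (abs(sum(w) - tr) <= tol) then
      print *, label, ": sum(w) matches trace(a)"
    else
      print *, label, ": sum(w) does NOT match trace(a), diff =", sum(w) - tr
    end if
  end subroutine report

end program check_trace
```

The trace test is only a necessary condition, so a passing result does not prove the eigenvalues are right, but a failing one shows that something in the build or the call is broken.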
11 Replies
Steven_L_Intel1
Employee
And where did the lapack library come from?
f2003
Beginner
Of course, I downloaded lapack from netlib and compiled it with ifort and gfortran respectively.

Just edit make.inc, then make...
Ron_Green
Moderator
I am in the process of trying to reproduce this. 3 questions:
1) Was this on Linux or Mac OS X?
2) What did you set in make.inc for OPTS for ifort and gfortran respectively?
3) Did you use the system blas, the source blas that ships with lapack, or some other blas?

thanks

ron
Ron_Green
Moderator
I have tried to reproduce this using very aggressive optimizations, thinking that perhaps overly aggressive optimizations might push the algorithm over a stability tipping point. No such luck. I can achieve differences in the last digit of precision, but one expects this at very high optimization levels.

Here is the largest difference I've achieved, using -O3 -xT -ip on both the blas and lapack libs:

./resp_opt
eval
-0.489430566447892 -0.186354201199528 0.741311970992606


Now I'd be quite curious to see your settings in make.inc. Also, are you 100% certain you cleaned up the lapack objs and libs between switching between ifort and gfortran? I do this sequence:

make cleanall
make blaslib
make lib

The 'cleanall' is there to ensure that every bit of lapack/blas is cleaned between builds.

I am using the 10.0.026 compiler on Linux. I still await hearing what you have as a build platform. I just cannot replicate what you are seeing.

ron
Ron_Green
Moderator
I forgot to mention: if you want a mix of performance and accuracy, try this. It gave me results identical to unoptimized gfortran and ifort:

ifort -O3 -x[your arch] -fp-model source

that is, use those options for OPTS

ron
f2003
Beginner
Thank you very much, ron.

1) My OS is Linux (glibc 2.5, gcc/gfortran 3.4.0pre-release, binutils 2.18, kernel 2.6.23.1).

2) I always compile blas together with lapack.
I modified the Makefile,
# lib: lapacklib tmglib
to
lib: blaslib lapacklib tmglib

3) The result is correct when calling MKL's lapack routine.

4) Can you tell me how to set the fp-model flag for ifort?

5) My make.inc:
=============================
SHELL = /bin/sh
VERSION = 3.1.1

# FORTRAN = gfortran
# FORTRAN_VERSION = 4.3.0pre-release
# OPTS = -msse2 -mfpmath=sse -O3 -ftree-vectorize -funroll-all-loops -fbounds-check
FORTRAN = ifort
FORTRAN_VERSION = 10.0.025
OPTS =
DRVOPTS = $(OPTS)
NOOPT =
LOADER = $(FORTRAN)
LOADOPTS =

PRE = lib
PLAT = linux

TIMER = NONE

ARCH = ar
ARCHFLAGS= cr
RANLIB = ranlib

BLASLIB = ../../$(PRE)blas_$(VERSION)_$(PLAT)_$(FORTRAN)-$(FORTRAN_VERSION).a
LAPACKLIB = $(PRE)lapack_$(VERSION)_$(PLAT)_$(FORTRAN)-$(FORTRAN_VERSION).a
TMGLIB = $(PRE)tmglib_$(VERSION)_$(PLAT)_$(FORTRAN)-$(FORTRAN_VERSION).a
EIGSRCLIB = $(PRE)eigsrc_$(VERSION)_$(PLAT)_$(FORTRAN)-$(FORTRAN_VERSION).a
LINSRCLIB = $(PRE)linsrc_$(VERSION)_$(PLAT)_$(FORTRAN)-$(FORTRAN_VERSION).a

Ron_Green
Moderator
For your OPTS variable in make.inc, you can keep it very simple and add:

OPTS = -fp-model source

The default optimization for the compiler, since you are not specifying -O[ 0 | 1 | 2 | 3 ], is -O2. With the 10.0 compiler, this default optimization level provides very good performance for most codes on Intel architecture.

You'll note that I stepped it up a notch with -O3 and -x[architecture]. There is another thread on this forum regarding compiler options. Take a look at this article for the -x architecture-specific vectorization settings: http://www.intel.com/support/performancetools/sb/CS-009787.htm

As for the -fp-model option, I recommend reading the documentation on the various settings. This option controls how strictly the floating-point operations follow the IEEE 754 standard. This is always a tradeoff between performance and accuracy, so there are no general guidelines I can offer. Iterative solvers tend to be more sensitive to numerical precision, direct solvers not so much. For any given code, I usually try the default first, which is -fp-model fast=1. If the results are not within 'acceptable' limits, I gradually increase the compliance, starting with -fp-model source. If this is not sufficient, the next step is 'precise', and finally, for those extreme cases where every bit must match, -fp-model strict. Note that you give up performance with each step towards complete IEEE compliance.
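
To illustrate why the floating-point model matters at all: floating-point addition is not associative, so a compiler that is allowed to reorder operations can change the last digits of a result. The following is a minimal sketch of that effect, not something taken from LAPACK. The program name `fp_reassociation` and the values of `x`, `y`, and `z` are illustrative assumptions.

```fortran
program fp_reassociation
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: x, y, z, left, right

  ! Illustrative values only; they do not come from the LAPACK test case.
  x = 0.1_dp
  y = 0.2_dp
  z = 0.3_dp

  ! Mathematically identical sums, evaluated in two different orders
  left  = (x + y) + z
  right = x + (y + z)

  print '(a, es25.17)', '(x + y) + z = ', left
  print '(a, es25.17)', 'x + (y + z) = ', right
  print '(a, es25.17)', 'difference  = ', left - right
  print '(a, es25.17)', 'epsilon     = ', epsilon(1.0_dp)
end program fp_reassociation
```

With IEEE double precision arithmetic the two sums typically differ by about one unit in the last place. Whether the compiler honors the written order depends on the floating-point model and optimization options in effect, which is the tradeoff described above.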

I hope this helps.

ron
f2003
Beginner
Thanks for your great suggestion.

As a test, today I changed the flags in lapack's make.inc file to "-O3 -xN -ipo", and I also modified the Makefile to build lapack as .so shared libraries. That succeeded, but:

When using the -O3 flag, a.out runs (but the result is still wrong).

When using the "-O3 -xN -ipo" flags, the link step fails:
===================================================
apple@localhost ~/linux/lapack-lite-3.1.1 $ ifort tes.f90 -L. -llapack -lblas
./liblapack.so: undefined reference to `__svml_cosf4'
./liblapack.so: undefined reference to `__svml_log2'
./liblapack.so: undefined reference to `__svml_roundf4'
./liblapack.so: undefined reference to `__svml_logf4'
./liblapack.so: undefined reference to `__svml_cos2'

What are those symbols?
Maybe I must use the MKL library with ifort.
Steven_L_Intel1
Employee
If you build lapack with -xN you need to also specify this when compiling/linking your source. Or you can add -lsvml.
f2003
Beginner
Another boring question: how can I recompile glibc's libm.so library to use the SSE family of instruction sets instead of the x87 FPU, to gain more efficiency?
f2003
Beginner
It passed. Thank you.

apple@localhost ~/linux/lapack-lite-3.1.1 $ ifort tes.f90 -L. -llapack -lblas -xN -O3 -lsvml
apple@localhost ~/linux/lapack-lite-3.1.1 $ ./a.out
eval
0.174584482225417 0.174584482225417 0.174584482225417
