Re: in this program a lapack routine named "dsyevr" is called and ifort 10.0 gave a wrong result

f2003 · 10-14-2007

is it a bug?
thanks!

=====================================
apple@localhost ~ $ ifort test.f90 -llapack_if -lblas_if
apple@localhost ~ $ ./a.out
eval
0.174584482225417 0.174584482225417 0.174584482225417

apple@localhost ~ $ gfortran test.f90 -llapack -lblas
apple@localhost ~ $ ./a.out
eval
-0.48943056644789246 -0.18635420119952728 0.74131197099260637

test.f90 :
=====================================
program main
implicit none
integer(4)::n,il,iu,m,isuppz(6),iwork(30),lwork,liwork,info
real(8) :: a(3,3),z(3,3),w(3),work(80),vl,vu,abstol

n = 3
il = 1
iu = 3
abstol = 0.0_8
lwork = 80
liwork = 30

a=reshape([0.694444444444445_8,0.208135487069874_8,1.069274406890994E-002_8,0.208135487069874_8,-0.211332634659698_8,&
0.133515358818456_8,1.069274406890994E-002_8,0.133515358818456_8,-0.417584606439560_8],[3,3])

call dsyevr('V','I','U',n,a,n,vl,vu,il,iu,abstol,&
m, w, z, n, isuppz, work, lwork,iwork,liwork,info)
print *, "eval"
print *, w
end program main
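
A quick sanity check on the two outputs above, offered only as a sketch: the eigenvalues of a symmetric matrix must sum to its trace, so the trace of a can be compared with the sum of each reported w. The program name check_trace and the arrays w_ifort and w_gfortran are made up for this illustration; the numbers are copied from test.f90 and from the printed results, and no LAPACK call is needed.

```fortran
program check_trace
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: a(3,3), w_ifort(3), w_gfortran(3), tr
  integer :: i

  ! Same matrix as in test.f90.
  a = reshape([ 0.694444444444445_dp, 0.208135487069874_dp, 1.069274406890994e-002_dp, &
                0.208135487069874_dp, -0.211332634659698_dp, 0.133515358818456_dp, &
                1.069274406890994e-002_dp, 0.133515358818456_dp, -0.417584606439560_dp ], [3,3])

  ! Eigenvalues exactly as printed by the two builds (hypothetical array names).
  w_ifort    = [ 0.174584482225417_dp, 0.174584482225417_dp, 0.174584482225417_dp ]
  w_gfortran = [ -0.48943056644789246_dp, -0.18635420119952728_dp, 0.74131197099260637_dp ]

  ! Trace of a: sum of the diagonal entries.
  tr = 0.0_dp
  do i = 1, 3
     tr = tr + a(i,i)
  end do

  print *, "trace(a)               =", tr
  print *, "sum(w), ifort build    =", sum(w_ifort)
  print *, "sum(w), gfortran build =", sum(w_gfortran)
  print *, "|trace - sum| ifort    =", abs(tr - sum(w_ifort))
  print *, "|trace - sum| gfortran =", abs(tr - sum(w_gfortran))
end program check_trace
```

The trace is about 0.0655. The gfortran eigenvalues sum to the same value up to rounding, while the ifort-build eigenvalues sum to about 0.5238, so only the gfortran output is consistent with the matrix. Three identical eigenvalues would also require a to be a multiple of the identity, which it is not.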

Steven_L_Intel1 · 10-15-2007

And where did the lapack library come from?

f2003 · 10-16-2007

of course I have downloaded lapack from netlib and compiled it with ifort and gfortran respectively.

just edit make.inc, then make...

Ron_Green · 10-16-2007

I am in the process of trying to reproduce this. 3 questions:
1) Was this on Linux or Mac OS X?
2) What did you set in make.inc for OPTS for ifort and gfortran respectively?
3) Did you use the system blas, the source blas with lapack, or ?? other blas ??

thanks

ron

Ron_Green · 10-16-2007

I have tried to reproduce this using very aggressive optimizations, thinking that perhaps overly aggressive optimizations might push the algorithm over a stability tipping point. No such luck. I can achieve differences in the last digit of precision, but one expects this at very high optimization levels.

Here is the largest difference I've achieved, using -O3 -xT -ip on both the blas and lapack libs:

./resp_opt
eval
-0.489430566447892 -0.186354201199528 0.741311970992606

Now I'd be quite curious to see your settings in make.inc. Also, are you 100% certain you cleaned up the lapack objs and libs when switching between ifort and gfortran? I do this sequence:

make cleanall
make blaslib
make lib

The 'cleanall' is there to ensure that every bit of lapack/blas is cleaned between builds.

I am using the 10.0.026 compiler on Linux. I still await hearing what you have as a build platform. I just cannot replicate what you are seeing.

ron

Ron_Green · 10-16-2007

I forgot to mention: if you want a mix of performance and accuracy, try this. It gave me results identical to those of unoptimized gfortran and ifort:

ifort -O3 -x[your arch] -fp-model source

that is, use those options for OPTS

ron

f2003 · 10-16-2007

thank you very much, ron.

1) my OS is linux (glibc 2.5, gcc/gfortran 4.3.0pre-release, binutils 2.18, kernel 2.6.23.1).

2) I always compile blas together with lapack.
I modified the Makefile,
# lib: lapacklib tmglib
to
lib: blaslib lapacklib tmglib

3) the result is correct when calling MKL's lapack routine.

4) can you tell me how to set the fp-model flag for ifort?

5) my make.inc :
=============================
SHELL = /bin/sh
VERSION = 3.1.1

# FORTRAN = gfortran
# FORTRAN_VERSION = 4.3.0pre-release
# OPTS = -msse2 -mfpmath=sse -O3 -ftree-vectorize -funroll-all-loops -fbounds-check
FORTRAN = ifort
FORTRAN_VERSION = 10.0.025
OPTS =
DRVOPTS = $(OPTS)
NOOPT =
LOADER = $(FORTRAN)
LOADOPTS =

PRE = lib
PLAT = linux

TIMER = NONE

ARCH = ar
ARCHFLAGS= cr
RANLIB = ranlib

BLASLIB = ../../$(PRE)blas_$(VERSION)_$(PLAT)_$(FORTRAN)-$(FORTRAN_VERSION).a
LAPACKLIB = $(PRE)lapack_$(VERSION)_$(PLAT)_$(FORTRAN)-$(FORTRAN_VERSION).a
TMGLIB = $(PRE)tmglib_$(VERSION)_$(PLAT)_$(FORTRAN)-$(FORTRAN_VERSION).a
EIGSRCLIB = $(PRE)eigsrc_$(VERSION)_$(PLAT)_$(FORTRAN)-$(FORTRAN_VERSION).a
LINSRCLIB = $(PRE)linsrc_$(VERSION)_$(PLAT)_$(FORTRAN)-$(FORTRAN_VERSION).a

Ron_Green · 10-17-2007

For your OPTS variable in make.inc, you can keep it very simple and set:

OPTS = -fp-model source

The default optimization for the compiler, since you are not specifying -O[ 0 | 1 | 2 | 3 ], is -O2. With the 10.0 compiler, this default optimization level provides very good performance for most codes on Intel architecture.

You'll note that I stepped it up a notch with -O3 and -x[architecture]. There is another thread on this forum regarding compiler options. Take a look at this article for the -x architecture-specific vectorization settings: http://www.intel.com/support/performancetools/sb/CS-009787.htm

As for the fp-model option, I recommend reading the documentation on the various settings. This option controls how strictly the floating point operations follow the IEEE 754 standard. This is always a tradeoff between performance and accuracy, so there are no general guidelines I can offer. Iterative solvers tend to be more sensitive to numerical precision, direct solvers not so much. For any given code, I usually try the default first, which is fp-model fast=1. If the results are not within 'acceptable', I increase the compliance step by step, starting with fp-model source. If that is not sufficient, the next step is 'precise', and finally, for those extreme cases where every bit must match, fp-model strict. Note that you give up performance with each step towards complete IEEE compliance.
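
To make the tradeoff concrete, here is a minimal sketch of why an option that lets the compiler reorder floating point operations can change results at all: floating point addition is not associative. The program name assoc_demo and the values are chosen only for this illustration, and it does not claim to show what any particular fp-model setting does to LAPACK.

```fortran
program assoc_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! volatile discourages the compiler from folding the sums at compile time
  real(dp), volatile :: a, b, c
  real(dp) :: left, right

  a =  1.0e17_dp
  b = -1.0e17_dp
  c =  1.0_dp

  left  = (a + b) + c   ! the large terms cancel first, then c is added
  right = a + (b + c)   ! c is added to a large term first and may be rounded away

  print *, "(a + b) + c =", left
  print *, "a + (b + c) =", right
end program assoc_demo
```

In IEEE double precision evaluated exactly as written, the first sum is 1 and the second is 0, because 1 is smaller than the spacing between doubles near 1e17. With extended-precision x87 arithmetic, or when the compiler is allowed to reassociate, the two can come out the same. Differences of this kind are normally confined to the last digits, as in the -O3 run earlier in this thread.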

I hope this helps.

ron

f2003 · 10-18-2007

thanks for your great suggestion.

As a test, today I set the flags in lapack's make.inc file to "-O3 -xN -ipo", and I also modified the Makefile to build lapack as .so shared libraries. I succeeded, but:

when using only the -O3 flag, a.out runs (but the result is still wrong).

when using the "-O3 -xN -ipo" flags, linking my test program fails:
===================================================
apple@localhost ~/linux/lapack-lite-3.1.1 $ ifort tes.f90 -L. -llapack -lblas
./liblapack.so: undefined reference to `__svml_cosf4'
./liblapack.so: undefined reference to `__svml_log2'
./liblapack.so: undefined reference to `__svml_roundf4'
./liblapack.so: undefined reference to `__svml_logf4'
./liblapack.so: undefined reference to `__svml_cos2'

what are those symbols?
Maybe I must use the MKL library with ifort.

Steven_L_Intel1 · 10-18-2007

If you build lapack with -xN you need to also specify this when compiling/linking your source. Or you can add -lsvml.

f2003 · 10-18-2007

another boring question: how can I recompile glibc's libm.so library to use the SSE family of instruction sets instead of the x87 FPU to gain more efficiency?

f2003 · 10-18-2007

it passed. thank you.

apple@localhost ~/linux/lapack-lite-3.1.1 $ ifort tes.f90 -L. -llapack -lblas -xN -O3 -lsvml
apple@localhost ~/linux/lapack-lite-3.1.1 $ ./a.out
eval
0.174584482225417 0.174584482225417 0.174584482225417

in this program a lapack routine named "dsyevr" is called and ifort 10.0 gave a wrong result.