Intel® Fortran Compiler

Efficient cube root calculation.

Intel_C_Intel
Employee
In version 0035 of the Intel Visual Fortran compiler, a vectorized version of the cube root function cbrt does not exist. It would be very nice if this function were vectorized in a loop just like other math functions such as sqrt. The following loop:
do i=1,10
a(i) = sqrt(b(i))
end do
gives very efficient square root calculations, i.e. sqrtpd or sqrtps, but the loop
do i=1,10
a(i) = cbrt(b(i))
end do
only results in a call to _cbrt, which is not vectorized (that is, the 10 evaluations of cbrt are not calculated concurrently, although the routine itself uses SSE/SSE2 instructions).
Do you think that Intel has any plans to make a vectorizable version of cbrt soon?
Best wishes
Lars Petter Endresen
TimP
Honored Contributor III
In my installations of the current Linux 7.1 and 8.x compilers, libsvml.a includes the vml_scbrt4 and vml_dcbrt2 functions. I assume these are intended as SSE/SSE2 short vector implementations, which should help you. I suppose they can be called directly from C.
I see that I can get a report of successful Fortran vectorization, yet the svml implementation is not employed. That looks worth filing a report on premier.intel.com. The C compilers don't appear to attempt auto-vectorization of cbrtf() on analogous source.
Intel_C_Intel
Employee

Thanks a lot for the information Tim!!

It seems that I have to write a C++ program to take full advantage of the vectorized cbrt functions in the svml library. Do you think that the Intel Fortran Compiler will at some point in the future "become aware of" these truly useful functions? Calling cbrt from Fortran does not work without a proper interface, and so far I have only managed to write an interface to the cbrt in the libmmt library.

c:...compiler80/IA32/LIB>nm SVML_DISP.LIB | grep cbrt
./obj/nt/std/d/vml_dinvcbrt2.obj:
./obj/nt/std/d/vml_dcbrt2.obj:
./obj/nt/std/s/w7/vml_sinvcbrt4.obj:
./obj/nt/std/s/w7/vml_scbrt4.obj:
./obj/nt/std/s/a6/vml_sinvcbrt4.obj:
./obj/nt/std/s/a6/vml_scbrt4.obj:
./obj/nt/std/s/disp/vml_sinvcbrt4.obj:
./obj/nt/std/s/disp/vml_scbrt4.obj:

Best wishes,

Lars Petter Endresen

TimP
Honored Contributor III

Lars:

I suppose, as I hinted above, that filing your request as an issue on premier.intel.com could make auto-vectorization of cbrt() more likely to happen. I don't know whether a future compiler may include a (possibly non-portable) way to call svml functions directly from Fortran. The compilers are tending to increase the number of svml functions employed by auto-vectorization.

Tim

Intel_C_Intel
Employee

Hello.

I will send a request to premier support regarding cbrt from Fortran.

However, when I tried to use the cbrt in the svml library from C++, I found that the vectorized cbrt (_vmldCbrt2) could not be enabled by the C++ compiler either. The following code generated a call to _cbrt and not to _vmldCbrt2:

double e[16], f[16];
int i;

for (i = 0; i <= 15; i++)
    e[i] = cbrt(f[i]);

Intel_C_Intel
Employee
When I compiled the code in the previous message using icl with all warnings enabled, the compiler complained that the loop could not be vectorized because it contained an unrecognized statement at the line with the cbrt call. Inspecting the assembly dump confirms that the code was indeed not vectorized.
Do you get the same results? Is the cbrt function in the svml library unavailable to both compilers (Fortran and C++)?
TimP
Honored Contributor III
I just installed Windows Fortran 8.0.047. The situation with the cbrt() functions is similar to that described in the preceding posts. In addition, svml_disp.lib includes inverse versions of all the cbrt() functions, presumably to optimize calculations like a(:)/cbrt(b(:)). Expressions of interest from enough customers, such as feature requests on premier.intel.com, could lead to incorporation of these functions in the auto-vectorization scheme.
Intel_C_Intel
Employee

A nice workaround is to compile with the /S option to generate an assembly file that can be modified manually. For example, one can write sinh() instead of cbrt() in a loop (sinh() is recognized by the auto-vectorization scheme, cbrt() is not), and then manually replace each occurrence of _vmldSinh2 in the .asm file with _vmldCbrt2. The .asm file can then be assembled with the ml command in the standard way. I have checked that this works and produces the correct numerical results. Impressed?

TimP
Honored Contributor III
ifort 8.1 auto-vectorizer should generate svml cbrt() calls automatically, from source code like x**(1./3.) and x**(1d0/3d0). Apparently, this will be the recommended approach. There is a recent posting about this on the C forum, but I don't think it addresses the C counterpart of the Fortran syntax. Unfortunately, general availability of the 8.1 compilers is some months away.
Intel_C_Intel
Employee

Hello.

This feature is particularly useful for software engineers who are concerned with both software efficiency and reliability, as the equations written in the Fortran source code will then strongly resemble the equations written in a report or a scientific paper. In Fortran:

x = a**(1./3.)

will produce the most efficient run-time code. Look at the C++ equivalent, x=cbrt(a): it does not exactly look like the equations published in a paper, and the potential for typing errors increases dramatically. With features like this, Intel Fortran (Linux and Win32) will be the preferred programming language for the implementation of computationally expensive mathematical models.

Lars Petter Endresen
