Intel® Fortran Compiler

jni.h

veitner
Beginner
Hello Forum,

Although it is possible to build a mixed-language library to make a library callable from Java, I would prefer to use Fortran only. I would therefore like to start translating the jni.h header into a Fortran module and host that translation on SourceForge (perhaps under the MIT License), even though a commercial tool called JNIWrapper is available (http://www.teamdev.com/jniwrapper/downloads.jsf).

Is anybody out there interested in such a module and perhaps willing to help keep the translation up to date? Or has somebody already done this?

Looking forward to your replies,

Veit

ArturGuzik
Valued Contributor I

Veit,

I don't want to discourage you, but before starting the (considerable) effort of translating the jni.h header, read this excellent article by Lorri Menard. The risk is that your translation will be valid for one specific version only, as every new Java release comes with a new header.

A.

veitner
Beginner

Hello Artur,

I know that article, and also the one it is based on by Chris Andersson, with the nice transient heat transfer sample.
My hope is to save an additional function call (the app I'm working on makes at least 30000 calls). I'm pretty sure that combining the Intel C++ and Fortran compilers would inline the Fortran functions, but I'm not really happy about licensing the C++ compiler just for some wrapper functions.
And I doubt that the GNU C++ compiler is able to inline the Fortran code (I did not play around much, but on the first try I got two DLLs; the makefile I used for the Anderson sample is attached).
But of course the effort is not negligible if it only saves a function call. I have to run some tests first to see whether it is worth the work.


Makefile:
CCFLAGS = -O -shared -Wl,--kill-at,--add-stdcall-alias
CPPDEFINES = -D __FORTRAN_BUILD__ -D __cplusplus
FFLAGS1 = -O -dll
FFLAGS2 = -O -shared -fno-underscoring
FC = ifort
FFLAGS = $(FFLAGS1)

IDIRS = -I/include/java -I/lib/gcc/mingw32/3.4.5/include/gcj
CCLIBS =

OFILES = tempCalcRoutines.dll
OBJ = tempCalcRoutines.obj
OLIB = tempCalcRoutines.lib

libTempCalcJava.obj : $(OFILES)
	g++ $(CCFLAGS) -o TempCalcJava.dll $(OFILES) $(CPPDEFINES) $(CCLIBS) $(IDIRS) TempCalcJava.cpp

tempCalcRoutines.dll : tempCalcRoutines.f
	$(FC) $(FFLAGS) tempCalcRoutines.f -o tempCalcRoutines.dll

clean:
	rm -f *.dll
	rm -f *.obj
	rm -f *.so
	rm -f *.lib
	rm -f *.exp

ArturGuzik
Valued Contributor I
Quoting - veitner
Pretty sure that combining Intel C++ and Fortran Compilers will inline the fortran functions but I'm not really happy with licensing the C++ Compiler only for some wrapping functions.
And I doubt that the GNU C++ compiler is able to inline the fortran code


Veit,

OK, I see you know all the traps ahead of you.

What about Microsoft Visual C++ Express Edition (free)?

In any case, having a Fortran version would be very, very nice, but the amount of work is considerable, with (at a few points at least) questionable gains.

A.

veitner
Beginner

Hi Artur,

It took me some time to run some tests (most of it spent setting up the compiler and tools on Windows).

Short explanation of the test (Visual Studio 2008, Intel Fortran 11.1.038, jdk1.6.0_11):
A Fortran function calculating the dot product of two vectors using BLAS (MKL)
A Fortran function calculating the dot product in a loop
A Fortran function doing almost nothing
A C function calculating the dot product in a loop
A C function doing almost nothing

I have no idea about inlining. I had to load the Fortran DLL dynamically; linking it statically did not work, probably because of different calling conventions, but I could not find the setting for that (I don't feel very comfortable with that Visual Studio thing). So the test shows the worst case.
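
One possible explanation for the failed static link (an assumption, not something verified in this thread): on 32-bit Windows, MSVC decorates a C function declared __stdcall as an underscore, the name, '@' and the number of argument bytes, while the DLL is looked up here under the plain alias "dp". A minimal C sketch of that decoration rule follows; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the thread): build the decorated name that
   32-bit MSVC gives a C function declared __stdcall:
   '_' + name + '@' + argument bytes in decimal.
   Returns 1 on success, 0 if the output buffer is too small. */
static int stdcall_decorated_name(const char *name, unsigned arg_bytes,
                                  char *out, size_t out_len)
{
    int written;
    assert(name != NULL && out != NULL);
    written = snprintf(out, out_len, "_%s@%u", name, arg_bytes);
    return written > 0 && (size_t)written < out_len;
}

/* Argument bytes for double f(int, double *, double *) on a 32-bit target:
   4 (int) + 4 (pointer) + 4 (pointer). */
static unsigned dot_product_arg_bytes_32bit(void)
{
    return 4u + 4u + 4u;
}
```

For double dp(int, double *, double *) this gives _dp@12, which would not match an export named plain dp unless an import library or a .def file maps one name to the other.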

The JNI binding was created using GlueGen from the JOGL project.

Results (I repeated several times - tendency was the same):

Timings calling from java:
$ java -jar ../jdp/dist/jdp.jar
field length=20000, number of calls/function=300000
calling Fortran functions:
ms for 300000 calls to dpmkl (dot product using blas): 825.2
ms for 300000 calls to dp (dot product via loop): 802.1
ms for 300000 calls to e1 ((n+1)*n/2.): 5.0
calling C functions:
ms for 300000 calls to e2 (dot product in c): 870.3
ms for 300000 calls to e3 ((n+1)*n/2.): 5.0
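
The e1 and e3 rows do almost no work, so they give a rough estimate of the cost of one call through JNI. The sketch below assumes the printed values are in 1/100 s, because the Java program divides the currentTimeMillis difference by 10 (see the correction later in this thread); the function names are hypothetical.

```c
#include <assert.h>

/* The Java program prints (stop - start) / 10, so a printed value has to be
   multiplied by 10 to get true milliseconds. */
static double printed_to_true_ms(double printed_value)
{
    return printed_value * 10.0;
}

/* Average cost of one call in microseconds, given the total time in
   milliseconds for `calls` calls. */
static double per_call_microseconds(double total_ms, long calls)
{
    assert(calls > 0);
    return total_ms * 1000.0 / (double)calls;
}
```

With the printed 5.0 for 300000 calls this gives 50 ms in total, or about 0.17 microseconds per call, so the 30000 calls mentioned earlier would cost roughly 5 ms. This is based on a single coarse measurement inside a virtual machine, so treat it as an order of magnitude only.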

Timings calling from c (field length was 20000):
E:\MinGW\home\veitner\vs\dp\build>c.exe
timing of c program calling fortran routines in dll
field length=20000, number of calls/function=300000
dpmkl took 1462.500000 ms
dp took 840.200000 ms
e1 took 0.000000 ms
e2 took 846.200000 ms
e3 took 0.000000 ms

The funny thing is that the BLAS routine "dot" took almost twice as long when the calling C program was built in release mode as when it was built in debug mode, but that does not bother me for now.
In debug mode (with the Fortran DLL still built with release flags) the following results were obtained:
E:\MinGW\home\veitner\vs\dp\build>c.exe
timing of c program calling fortran routines in dll
field length=20000, number of calls/function=300000
dpmkl took 777.100000 ms
dp took 823.600000 ms
e1 took 2.000000 ms
e2 took 3047.300000 ms
e3 took 1.000000 ms


And here are the files (I did not create a makefile; too much effort):

Test program in C/C++

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#include <conio.h>
#include "fext.h"


int main(int argc, char* argv[])
{
    static int n=20000;     /* field length */
    static int cm=300000;   /* number of calls per function */

    double *d1 = (double *)calloc(n,sizeof(double));
    double *d2 = (double *)calloc(n,sizeof(double));
    for (int i=0; i<n; i++) {
        d1[i]=1.;
        d2[i]=i+1;
    }
    /* expected dot product: sum(i, i=1..n) = (n+1)*n/2 */
    double r = (n+1)*n/2.;
    static double d=1E-8;

    clock_t c0, c1;

    printf("timing of c program calling fortran routines in dll\n");
    printf("field length=%d, number of calls/function=%d\n",n,cm);
    double res;

    c0 = clock();
    for (int i=0; i<cm; i++) {
        res = dpmkl(n,d1,d2);
        if (fabs(r-res)>d) {
            printf("%s\n","unexpected result");
            break;
        }
    }
    c1 = clock();

    /* factor 100 kept as posted: it yields 1/100 s, not ms
       (see the correction later in this thread) */
    printf("dpmkl took %f ms\n",100.*(c1-c0)/CLOCKS_PER_SEC);


    c0 = clock();
    for (int i=0; i<cm; i++) {
        res = dp(n,d1,d2);
        if (fabs(r-res)>d) {
            printf("%s\n","unexpected result");
            break;
        }
    }
    c1 = clock();

    printf("dp took %f ms\n",100.*(c1-c0)/CLOCKS_PER_SEC);

    c0 = clock();
    for (int i=0; i<cm; i++) {
        res = e1(n,d1,d2);
        if (fabs(r-res)>d) {
            printf("%s\n","unexpected result");
            break;
        }
    }
    c1 = clock();
    printf("e1 took %f ms\n",100.*(c1-c0)/CLOCKS_PER_SEC);

    c0 = clock();
    for (int i=0; i<cm; i++) {
        res = e2(n,d1,d2);
        if (fabs(r-res)>d) {
            printf("%s\n","unexpected result");
            break;
        }
    }
    c1 = clock();
    printf("e2 took %f ms\n",100.*(c1-c0)/CLOCKS_PER_SEC);

    c0 = clock();
    for (int i=0; i<cm; i++) {
        res = e3(n,d1,d2);
        if (fabs(r-res)>d) {
            printf("%s\n","unexpected result in e3");
            break;
        }
    }
    c1 = clock();
    printf("e3 took %f ms\n",100.*(c1-c0)/CLOCKS_PER_SEC);

    free(d1);
    free(d2);
    printf("\npress ENTER to exit\n");
    getch();
    return 0;
}


Header file that loads the Fortran DLL dynamically

#ifndef _FEXT_H
#define _FEXT_H

#include <windows.h>

#define WIMPORT __declspec(dllimport)
#define WCALL1 __cdecl
#define WCALL __stdcall

#ifdef __cplusplus
extern "C" {
#endif
static HINSTANCE getInst() {
HINSTANCE dll = LoadLibraryA("dp.dll");
return dll;
}
// WIMPORT double WCALL dp(int n, double *d1, double *d2);

HINSTANCE dll = 0;

typedef double (WCALL dpfp) (int n, double *, double *);
dpfp* dpmklf = 0;
dpfp* dpf = 0;
dpfp* e1f = 0;

double dpmkl(int n, double *d1, double *d2) {
if (dll==0) dll = getInst();
if (dpmklf==0) dpmklf = (dpfp*) GetProcAddress((HMODULE) dll, "dpmkl");
if (dpmklf) {
return (*dpmklf)(n,d1,d2);
} else {
return -1;
}
}

double dp(int n, double *d1, double *d2) {
if (dll==0) dll = getInst();
if (dpf==0) dpf=(dpfp*) GetProcAddress((HMODULE) dll, "dp");
if (dpf) {
return (*dpf)(n,d1,d2);
} else {
return -1;
}
}

double e1(int n, double *d1, double *d2) {
if (dll==0) dll = getInst();
if (e1f==0) e1f=(dpfp*) GetProcAddress((HMODULE) dll, "e1");
if (e1f) {
return (*e1f)(n,d1,d2);
} else {
return -1;
}
}

double e2(int n, double *d1, double *d2) {
double r=0.;
int i;
for (i=0; i<n; i++) {
r+=(d1[i]*d2[i]);
}
return r;
}

double e3(int n, double *d1, double *d2) {
return (n+1)*n/2.;
}

#ifdef __cplusplus
}
#endif

#endif /* _FEXT_H */


JNI Binding

/* !---- DO NOT EDIT: This file autogenerated by com\sun\gluegen\JavaEmitter.java on Wed Jul 29 04:26:24 PDT 2009 ----! */

#include <jni.h>
#include "fext.h"


/* Java->C glue code:
* Java package: ext.DotProd
* Java method: double dp(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
* C function: double dp(int n, double * d1, double * d2);
*/
JNIEXPORT jdouble JNICALL
Java_ext_DotProd_dp0__ILjava_lang_Object_2ILjava_lang_Object_2I(JNIEnv *env, jclass _unused, jint n, jobject d1, jint d1_byte_offset, jobject d2, jint d2_byte_offset) {
double * _ptr1 = NULL;
double * _ptr2 = NULL;
double _res;
if (d1 != NULL) {
_ptr1 = (double *) (((char*) (*env)->GetDirectBufferAddress(env, d1)) + d1_byte_offset);
}
if (d2 != NULL) {
_ptr2 = (double *) (((char*) (*env)->GetDirectBufferAddress(env, d2)) + d2_byte_offset);
}
_res = dp((int) n, (double *) _ptr1, (double *) _ptr2);
return _res;
}


/* Java->C glue code:
* Java package: ext.DotProd
* Java method: double dp(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
* C function: double dp(int n, double * d1, double * d2);
*/
JNIEXPORT jdouble JNICALL
Java_ext_DotProd_dp1__ILjava_lang_Object_2ILjava_lang_Object_2I(JNIEnv *env, jclass _unused, jint n, jobject d1, jint d1_byte_offset, jobject d2, jint d2_byte_offset) {
double * _ptr1 = NULL;
double * _ptr2 = NULL;
double _res;
if (d1 != NULL) {
_ptr1 = (double *) (((char*) (*env)->GetPrimitiveArrayCritical(env, d1, NULL)) + d1_byte_offset);
}
if (d2 != NULL) {
_ptr2 = (double *) (((char*) (*env)->GetPrimitiveArrayCritical(env, d2, NULL)) + d2_byte_offset);
}
_res = dp((int) n, (double *) _ptr1, (double *) _ptr2);
if (d1 != NULL) {
(*env)->ReleasePrimitiveArrayCritical(env, d1, _ptr1, 0);
}
if (d2 != NULL) {
(*env)->ReleasePrimitiveArrayCritical(env, d2, _ptr2, 0);
}
return _res;
}


/* Java->C glue code:
* Java package: ext.DotProd
* Java method: double dpmkl(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
* C function: double dpmkl(int n, double * d1, double * d2);
*/
JNIEXPORT jdouble JNICALL
Java_ext_DotProd_dpmkl0__ILjava_lang_Object_2ILjava_lang_Object_2I(JNIEnv *env, jclass _unused, jint n, jobject d1, jint d1_byte_offset, jobject d2, jint d2_byte_offset) {
double * _ptr1 = NULL;
double * _ptr2 = NULL;
double _res;
if (d1 != NULL) {
_ptr1 = (double *) (((char*) (*env)->GetDirectBufferAddress(env, d1)) + d1_byte_offset);
}
if (d2 != NULL) {
_ptr2 = (double *) (((char*) (*env)->GetDirectBufferAddress(env, d2)) + d2_byte_offset);
}
_res = dpmkl((int) n, (double *) _ptr1, (double *) _ptr2);
return _res;
}


/* Java->C glue code:
* Java package: ext.DotProd
* Java method: double dpmkl(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
* C function: double dpmkl(int n, double * d1, double * d2);
*/
JNIEXPORT jdouble JNICALL
Java_ext_DotProd_dpmkl1__ILjava_lang_Object_2ILjava_lang_Object_2I(JNIEnv *env, jclass _unused, jint n, jobject d1, jint d1_byte_offset, jobject d2, jint d2_byte_offset) {
double * _ptr1 = NULL;
double * _ptr2 = NULL;
double _res;
if (d1 != NULL) {
_ptr1 = (double *) (((char*) (*env)->GetPrimitiveArrayCritical(env, d1, NULL)) + d1_byte_offset);
}
if (d2 != NULL) {
_ptr2 = (double *) (((char*) (*env)->GetPrimitiveArrayCritical(env, d2, NULL)) + d2_byte_offset);
}
_res = dpmkl((int) n, (double *) _ptr1, (double *) _ptr2);
if (d1 != NULL) {
(*env)->ReleasePrimitiveArrayCritical(env, d1, _ptr1, 0);
}
if (d2 != NULL) {
(*env)->ReleasePrimitiveArrayCritical(env, d2, _ptr2, 0);
}
return _res;
}


/* Java->C glue code:
* Java package: ext.DotProd
* Java method: double e1(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
* C function: double e1(int n, double * d1, double * d2);
*/
JNIEXPORT jdouble JNICALL
Java_ext_DotProd_e10__ILjava_lang_Object_2ILjava_lang_Object_2I(JNIEnv *env, jclass _unused, jint n, jobject d1, jint d1_byte_offset, jobject d2, jint d2_byte_offset) {
double * _ptr1 = NULL;
double * _ptr2 = NULL;
double _res;
if (d1 != NULL) {
_ptr1 = (double *) (((char*) (*env)->GetDirectBufferAddress(env, d1)) + d1_byte_offset);
}
if (d2 != NULL) {
_ptr2 = (double *) (((char*) (*env)->GetDirectBufferAddress(env, d2)) + d2_byte_offset);
}
_res = e1((int) n, (double *) _ptr1, (double *) _ptr2);
return _res;
}


/* Java->C glue code:
* Java package: ext.DotProd
* Java method: double e1(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
* C function: double e1(int n, double * d1, double * d2);
*/
JNIEXPORT jdouble JNICALL
Java_ext_DotProd_e11__ILjava_lang_Object_2ILjava_lang_Object_2I(JNIEnv *env, jclass _unused, jint n, jobject d1, jint d1_byte_offset, jobject d2, jint d2_byte_offset) {
double * _ptr1 = NULL;
double * _ptr2 = NULL;
double _res;
if (d1 != NULL) {
_ptr1 = (double *) (((char*) (*env)->GetPrimitiveArrayCritical(env, d1, NULL)) + d1_byte_offset);
}
if (d2 != NULL) {
_ptr2 = (double *) (((char*) (*env)->GetPrimitiveArrayCritical(env, d2, NULL)) + d2_byte_offset);
}
_res = e1((int) n, (double *) _ptr1, (double *) _ptr2);
if (d1 != NULL) {
(*env)->ReleasePrimitiveArrayCritical(env, d1, _ptr1, 0);
}
if (d2 != NULL) {
(*env)->ReleasePrimitiveArrayCritical(env, d2, _ptr2, 0);
}
return _res;
}


/* Java->C glue code:
* Java package: ext.DotProd
* Java method: double e2(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
* C function: double e2(int n, double * d1, double * d2);
*/
JNIEXPORT jdouble JNICALL
Java_ext_DotProd_e20__ILjava_lang_Object_2ILjava_lang_Object_2I(JNIEnv *env, jclass _unused, jint n, jobject d1, jint d1_byte_offset, jobject d2, jint d2_byte_offset) {
double * _ptr1 = NULL;
double * _ptr2 = NULL;
double _res;
if (d1 != NULL) {
_ptr1 = (double *) (((char*) (*env)->GetDirectBufferAddress(env, d1)) + d1_byte_offset);
}
if (d2 != NULL) {
_ptr2 = (double *) (((char*) (*env)->GetDirectBufferAddress(env, d2)) + d2_byte_offset);
}
_res = e2((int) n, (double *) _ptr1, (double *) _ptr2);
return _res;
}


/* Java->C glue code:
* Java package: ext.DotProd
* Java method: double e2(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
* C function: double e2(int n, double * d1, double * d2);
*/
JNIEXPORT jdouble JNICALL
Java_ext_DotProd_e21__ILjava_lang_Object_2ILjava_lang_Object_2I(JNIEnv *env, jclass _unused, jint n, jobject d1, jint d1_byte_offset, jobject d2, jint d2_byte_offset) {
double * _ptr1 = NULL;
double * _ptr2 = NULL;
double _res;
if (d1 != NULL) {
_ptr1 = (double *) (((char*) (*env)->GetPrimitiveArrayCritical(env, d1, NULL)) + d1_byte_offset);
}
if (d2 != NULL) {
_ptr2 = (double *) (((char*) (*env)->GetPrimitiveArrayCritical(env, d2, NULL)) + d2_byte_offset);
}
_res = e2((int) n, (double *) _ptr1, (double *) _ptr2);
if (d1 != NULL) {
(*env)->ReleasePrimitiveArrayCritical(env, d1, _ptr1, 0);
}
if (d2 != NULL) {
(*env)->ReleasePrimitiveArrayCritical(env, d2, _ptr2, 0);
}
return _res;
}


/* Java->C glue code:
* Java package: ext.DotProd
* Java method: double e3(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
* C function: double e3(int n, double * d1, double * d2);
*/
JNIEXPORT jdouble JNICALL
Java_ext_DotProd_e30__ILjava_lang_Object_2ILjava_lang_Object_2I(JNIEnv *env, jclass _unused, jint n, jobject d1, jint d1_byte_offset, jobject d2, jint d2_byte_offset) {
double * _ptr1 = NULL;
double * _ptr2 = NULL;
double _res;
if (d1 != NULL) {
_ptr1 = (double *) (((char*) (*env)->GetDirectBufferAddress(env, d1)) + d1_byte_offset);
}
if (d2 != NULL) {
_ptr2 = (double *) (((char*) (*env)->GetDirectBufferAddress(env, d2)) + d2_byte_offset);
}
_res = e3((int) n, (double *) _ptr1, (double *) _ptr2);
return _res;
}


/* Java->C glue code:
* Java package: ext.DotProd
* Java method: double e3(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
* C function: double e3(int n, double * d1, double * d2);
*/
JNIEXPORT jdouble JNICALL
Java_ext_DotProd_e31__ILjava_lang_Object_2ILjava_lang_Object_2I(JNIEnv *env, jclass _unused, jint n, jobject d1, jint d1_byte_offset, jobject d2, jint d2_byte_offset) {
double * _ptr1 = NULL;
double * _ptr2 = NULL;
double _res;
if (d1 != NULL) {
_ptr1 = (double *) (((char*) (*env)->GetPrimitiveArrayCritical(env, d1, NULL)) + d1_byte_offset);
}
if (d2 != NULL) {
_ptr2 = (double *) (((char*) (*env)->GetPrimitiveArrayCritical(env, d2, NULL)) + d2_byte_offset);
}
_res = e3((int) n, (double *) _ptr1, (double *) _ptr2);
if (d1 != NULL) {
(*env)->ReleasePrimitiveArrayCritical(env, d1, _ptr1, 0);
}
if (d2 != NULL) {
(*env)->ReleasePrimitiveArrayCritical(env, d2, _ptr2, 0);
}
return _res;
}



Fortran file


function dpmkl(n,d1,d2) result(r)
!DEC$ ATTRIBUTES DLLEXPORT, STDCALL, ALIAS:"dpmkl" :: dpmkl
use mkl95_precision, only: wp => dp
use mkl95_blas
implicit none
integer :: n
real(wp), dimension(n) :: d1,d2
real(wp) :: r
r = dot(d1,d2)
end function dpmkl

function dp(n,d1,d2) result(r)
!DEC$ ATTRIBUTES DLLEXPORT, STDCALL, ALIAS:"dp" :: dp
use mkl95_precision, only: wp => dp
implicit none
integer :: n
real(wp), dimension(n) :: d1,d2
real(wp) :: r
integer :: i
r = 0.
do i=1,n
r=r+d1(i)*d2(i)
end do
end function dp

function e1(n,d1,d2) result(r)
!DEC$ ATTRIBUTES DLLEXPORT, STDCALL, ALIAS:"e1" :: e1
use mkl95_precision, only: wp => dp
implicit none
integer :: n
real(wp), dimension(n) :: d1,d2
real(wp) :: r
r = (n+1)*n/2.
end function e1


Java binding

/* !---- DO NOT EDIT: This file autogenerated by com\sun\gluegen\JavaEmitter.java on Wed Jul 29 04:26:24 PDT 2009 ----! */

package ext;

import com.sun.gluegen.runtime.*;

public class DotProd
{


/** Interface to C language function:
double dp(int n, double * d1, double * d2); */
public static double dp(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
{
boolean _direct = BufferFactory.isDirect(d1);
if (d2 != null && _direct != BufferFactory.isDirect(d2))
throw new RuntimeException("Argument \"d2\" : Buffers passed to this method must all be either direct or indirect");
if (_direct) {
return dp0(n, d1, BufferFactory.getDirectBufferByteOffset(d1), d2, BufferFactory.getDirectBufferByteOffset(d2));
} else {
return dp1(n, BufferFactory.getArray(d1), BufferFactory.getIndirectBufferByteOffset(d1), BufferFactory.getArray(d2), BufferFactory.getIndirectBufferByteOffset(d2));
}
}

/** Entry point to C language function:
double dp(int n, double * d1, double * d2); */
private static native double dp0(int n, Object d1, int d1_byte_offset, Object d2, int d2_byte_offset);

/** Entry point to C language function:
double dp(int n, double * d1, double * d2); */
private static native double dp1(int n, Object d1, int d1_byte_offset, Object d2, int d2_byte_offset);

/** Interface to C language function:
double dp(int n, double * d1, double * d2); */
public static double dp(int n, double[] d1, int d1_offset, double[] d2, int d2_offset)
{
if(d1 != null && d1.length <= d1_offset)
throw new RuntimeException("array offset argument \"d1_offset\" (" + d1_offset + ") equals or exceeds array length (" + d1.length + ")");
if(d2 != null && d2.length <= d2_offset)
throw new RuntimeException("array offset argument \"d2_offset\" (" + d2_offset + ") equals or exceeds array length (" + d2.length + ")");
return dp1(n, d1, BufferFactory.SIZEOF_DOUBLE * d1_offset, d2, BufferFactory.SIZEOF_DOUBLE * d2_offset);

}

/** Interface to C language function:
double dpmkl(int n, double * d1, double * d2); */
public static double dpmkl(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
{
boolean _direct = BufferFactory.isDirect(d1);
if (d2 != null && _direct != BufferFactory.isDirect(d2))
throw new RuntimeException("Argument \"d2\" : Buffers passed to this method must all be either direct or indirect");
if (_direct) {
return dpmkl0(n, d1, BufferFactory.getDirectBufferByteOffset(d1), d2, BufferFactory.getDirectBufferByteOffset(d2));
} else {
return dpmkl1(n, BufferFactory.getArray(d1), BufferFactory.getIndirectBufferByteOffset(d1), BufferFactory.getArray(d2), BufferFactory.getIndirectBufferByteOffset(d2));
}
}

/** Entry point to C language function:
double dpmkl(int n, double * d1, double * d2); */
private static native double dpmkl0(int n, Object d1, int d1_byte_offset, Object d2, int d2_byte_offset);

/** Entry point to C language function:
double dpmkl(int n, double * d1, double * d2); */
private static native double dpmkl1(int n, Object d1, int d1_byte_offset, Object d2, int d2_byte_offset);

/** Interface to C language function:
double dpmkl(int n, double * d1, double * d2); */
public static double dpmkl(int n, double[] d1, int d1_offset, double[] d2, int d2_offset)
{
if(d1 != null && d1.length <= d1_offset)
throw new RuntimeException("array offset argument \"d1_offset\" (" + d1_offset + ") equals or exceeds array length (" + d1.length + ")");
if(d2 != null && d2.length <= d2_offset)
throw new RuntimeException("array offset argument \"d2_offset\" (" + d2_offset + ") equals or exceeds array length (" + d2.length + ")");
return dpmkl1(n, d1, BufferFactory.SIZEOF_DOUBLE * d1_offset, d2, BufferFactory.SIZEOF_DOUBLE * d2_offset);

}

/** Interface to C language function:
double e1(int n, double * d1, double * d2); */
public static double e1(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
{
boolean _direct = BufferFactory.isDirect(d1);
if (d2 != null && _direct != BufferFactory.isDirect(d2))
throw new RuntimeException("Argument \"d2\" : Buffers passed to this method must all be either direct or indirect");
if (_direct) {
return e10(n, d1, BufferFactory.getDirectBufferByteOffset(d1), d2, BufferFactory.getDirectBufferByteOffset(d2));
} else {
return e11(n, BufferFactory.getArray(d1), BufferFactory.getIndirectBufferByteOffset(d1), BufferFactory.getArray(d2), BufferFactory.getIndirectBufferByteOffset(d2));
}
}

/** Entry point to C language function:
double e1(int n, double * d1, double * d2); */
private static native double e10(int n, Object d1, int d1_byte_offset, Object d2, int d2_byte_offset);

/** Entry point to C language function:
double e1(int n, double * d1, double * d2); */
private static native double e11(int n, Object d1, int d1_byte_offset, Object d2, int d2_byte_offset);

/** Interface to C language function:
double e1(int n, double * d1, double * d2); */
public static double e1(int n, double[] d1, int d1_offset, double[] d2, int d2_offset)
{
if(d1 != null && d1.length <= d1_offset)
throw new RuntimeException("array offset argument \"d1_offset\" (" + d1_offset + ") equals or exceeds array length (" + d1.length + ")");
if(d2 != null && d2.length <= d2_offset)
throw new RuntimeException("array offset argument \"d2_offset\" (" + d2_offset + ") equals or exceeds array length (" + d2.length + ")");
return e11(n, d1, BufferFactory.SIZEOF_DOUBLE * d1_offset, d2, BufferFactory.SIZEOF_DOUBLE * d2_offset);

}

/** Interface to C language function:
double e2(int n, double * d1, double * d2); */
public static double e2(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
{
boolean _direct = BufferFactory.isDirect(d1);
if (d2 != null && _direct != BufferFactory.isDirect(d2))
throw new RuntimeException("Argument \"d2\" : Buffers passed to this method must all be either direct or indirect");
if (_direct) {
return e20(n, d1, BufferFactory.getDirectBufferByteOffset(d1), d2, BufferFactory.getDirectBufferByteOffset(d2));
} else {
return e21(n, BufferFactory.getArray(d1), BufferFactory.getIndirectBufferByteOffset(d1), BufferFactory.getArray(d2), BufferFactory.getIndirectBufferByteOffset(d2));
}
}

/** Entry point to C language function:
double e2(int n, double * d1, double * d2); */
private static native double e20(int n, Object d1, int d1_byte_offset, Object d2, int d2_byte_offset);

/** Entry point to C language function:
double e2(int n, double * d1, double * d2); */
private static native double e21(int n, Object d1, int d1_byte_offset, Object d2, int d2_byte_offset);

/** Interface to C language function:
double e2(int n, double * d1, double * d2); */
public static double e2(int n, double[] d1, int d1_offset, double[] d2, int d2_offset)
{
if(d1 != null && d1.length <= d1_offset)
throw new RuntimeException("array offset argument \"d1_offset\" (" + d1_offset + ") equals or exceeds array length (" + d1.length + ")");
if(d2 != null && d2.length <= d2_offset)
throw new RuntimeException("array offset argument \"d2_offset\" (" + d2_offset + ") equals or exceeds array length (" + d2.length + ")");
return e21(n, d1, BufferFactory.SIZEOF_DOUBLE * d1_offset, d2, BufferFactory.SIZEOF_DOUBLE * d2_offset);

}

/** Interface to C language function:
double e3(int n, double * d1, double * d2); */
public static double e3(int n, java.nio.DoubleBuffer d1, java.nio.DoubleBuffer d2)
{
boolean _direct = BufferFactory.isDirect(d1);
if (d2 != null && _direct != BufferFactory.isDirect(d2))
throw new RuntimeException("Argument \"d2\" : Buffers passed to this method must all be either direct or indirect");
if (_direct) {
return e30(n, d1, BufferFactory.getDirectBufferByteOffset(d1), d2, BufferFactory.getDirectBufferByteOffset(d2));
} else {
return e31(n, BufferFactory.getArray(d1), BufferFactory.getIndirectBufferByteOffset(d1), BufferFactory.getArray(d2), BufferFactory.getIndirectBufferByteOffset(d2));
}
}

/** Entry point to C language function:
double e3(int n, double * d1, double * d2); */
private static native double e30(int n, Object d1, int d1_byte_offset, Object d2, int d2_byte_offset);

/** Entry point to C language function:
double e3(int n, double * d1, double * d2); */
private static native double e31(int n, Object d1, int d1_byte_offset, Object d2, int d2_byte_offset);

/** Interface to C language function:
double e3(int n, double * d1, double * d2); */
public static double e3(int n, double[] d1, int d1_offset, double[] d2, int d2_offset)
{
if(d1 != null && d1.length <= d1_offset)
throw new RuntimeException("array offset argument \"d1_offset\" (" + d1_offset + ") equals or exceeds array length (" + d1.length + ")");
if(d2 != null && d2.length <= d2_offset)
throw new RuntimeException("array offset argument \"d2_offset\" (" + d2_offset + ") equals or exceeds array length (" + d2.length + ")");
return e31(n, d1, BufferFactory.SIZEOF_DOUBLE * d1_offset, d2, BufferFactory.SIZEOF_DOUBLE * d2_offset);

}

} // end of class DotProd


Java Main file


import ext.DotProd;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.ByteBuffer;
import java.nio.DoubleBuffer;
import java.util.Properties;
import java.util.logging.Level;
import java.util.logging.Logger;

/**
*
* @author veitner
*/
public class Main {

/**
* @param args the command line arguments
*/

static int sizeofdouble = Double.SIZE / 8;
public static void main(String[] args) {
// System.out.println(System.getProperty("java.library.path"));
int n = 20000;
int nfc = 300000;
Properties p = new Properties();
try {
p.load(new FileInputStream("jdp.cfg"));
n = Integer.parseInt(p.getProperty("n"));
nfc = Integer.parseInt(p.getProperty("nfc"));
} catch (Exception ex) {
p.put("dp.dll", "E:\\MinGW\\home\\veitner\\vs\\dp\\build\\dp.dll");
p.put("jni.dll", "E:\\MinGW\\home\\veitner\\vs\\dp\\build\\jni.dll");
try {
p.save(new FileOutputStream("jdp.cfg"), "auto-generated");
} catch (Exception ex1) {
Logger.getLogger(Main.class.getName()).log(Level.SEVERE, null, ex1);
}
}
try {
System.loadLibrary("dp");
} catch (UnsatisfiedLinkError e) {
System.out.println("Unable to load lib via System.loadLibrary. Trying System.load("+p.getProperty("dp.dll")+").");
System.load(p.getProperty("dp.dll"));
}
try {
System.loadLibrary("jni");
} catch (UnsatisfiedLinkError e) {
System.out.println("Unable to load lib via System.loadLibrary. Trying System.load("+p.getProperty("jni.dll")+").");
System.load(p.getProperty("jni.dll"));
}
System.out.println("field length="+n+", number of calls/function="+nfc);
DoubleBuffer b1, b2;
ByteBuffer b = ByteBuffer.allocateDirect(n * sizeofdouble);
b1 = b.asDoubleBuffer();
b = ByteBuffer.allocateDirect(n * sizeofdouble);
b2 = b.asDoubleBuffer();
double[] d1 = new double[n];
double[] d2 = new double[n];
for (int i = 0; i < n; i++) {
b1.put(1.);
b2.put(i + 1);
d1[i] = 1.;
d2[i] = (i + 1);

}
//the result of the functions has to be sum(i,i=1,n)=(n+1)*n/2.
double res = (n + 1) * n / 2.;

long start, stop;
double elapsed;
System.out.println("calling Fortran functions:");
start = System.currentTimeMillis();
double dp = 0.;
for (int i = 0; i < nfc; i++) {
//todo: fix call using buffers - it does not work currently
dp = DotProd.dpmkl(n, d1, 0, d2, 0);
if (dp != res) {
System.out.println("error in dpmkl");
break;
}
}
stop = System.currentTimeMillis();
//obviously the time is in 1/1000 seconds - factor 10 to convert to 1/100 seconds
elapsed = (stop - start)/10.;
System.out.println(new StringBuffer().append("ms for ").append(nfc).append(" calls to dpmkl (dot product using blas): ").append(elapsed).toString());

start = System.currentTimeMillis();
dp = 0.;
for (int i = 0; i < nfc; i++) {
dp = DotProd.dp(n, d1, 0, d2, 0);
if (dp != res) {
System.out.println("error in dp");
break;
}
}
stop = System.currentTimeMillis();
elapsed = (stop - start)/10.;
System.out.println(new StringBuffer().append("ms for ").append(nfc).append(" calls to dp (dot product via loop): ").append(elapsed).toString());

start = System.currentTimeMillis();
dp = 0.;
for (int i = 0; i < nfc; i++) {
dp = DotProd.e1(n, d1, 0, d2, 0);
if (dp != res) {
System.out.println("error in e1");
break;
}
}
stop = System.currentTimeMillis();
elapsed = (stop - start)/10.;
System.out.println(new StringBuffer().append("ms for ").append(nfc).append(" calls to e1 ((n+1)*n/2.): ").append(elapsed).toString());

System.out.println("calling C functions:");
start = System.currentTimeMillis();
dp = 0.;
for (int i = 0; i < nfc; i++) {
dp = DotProd.e2(n, d1, 0, d2, 0);
if (dp != res) {
System.out.println("error in e2");
break;
}
}
stop = System.currentTimeMillis();
elapsed = (stop - start)/10.;
System.out.println(new StringBuffer().append("ms for ").append(nfc).append(" calls to e2 (dot product in c): ").append(elapsed).toString());

start = System.currentTimeMillis();
dp = 0.;
for (int i = 0; i < nfc; i++) {
dp = DotProd.e3(n, d1, 0, d2, 0);
if (dp != res) {
System.out.println("error in e3");
break;
}
}
stop = System.currentTimeMillis();
elapsed = (stop - start)/10.;
System.out.println(new StringBuffer().append("ms for ").append(nfc).append(" calls to e3 ((n+1)*n/2.): ").append(elapsed).toString());
}
}




Finally, I would say the mentioned IDE is worth a look, especially if you don't mind spending some breaks having a coffee or joining a meeting while waiting for the IDE.

And I learned for the future that it makes much more sense to read some docs on compiler settings instead of dreaming of performance gains from code changes alone.

Thank you for making me rethink the idea. That saved me a lot of time that I would have spent chasing an illusion.

Some questions are still open:
What about MKL in release mode (I did not change any of the debug/release flags in VS2008)?
Will the Intel C Compiler increase performance (maybe I'll download a trial edition to test)?
Why is the VS2008-generated code slower than the JNI method invocations, except for the inlined C functions?
Why doesn't passing DoubleBuffers work (I remember this working some time ago, at least on Solaris)?
Are the timings correct (one cannot switch off garbage collection in Java)?

Oh, and last but not least: my test system is WinXP in VirtualBox running on Solaris x86, so results may differ on a Windows-only host (MKL parallel).

0 Kudos
veitner
Beginner
1,570 Views
My mistake...

During the night I had a moment of enlightenment (a partial one, at least).
The timings posted yesterday are wrong by a factor of 10.
Milli means 10^-3, not 10^-2 as in the code I posted last: the C program scales the clock difference with 100 and CLOCKS_PER_SEC instead of 1000 and CLOCKS_PER_SEC, and the Java program divides the currentTimeMillis difference by 10. I relied on the C program, which I wrote first, and adapted the Java code accordingly. When I checked the numbers with a stopwatch I made the same error again: I multiplied the seconds by 100 instead of 1000 and compared them to the program output.
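A minimal sketch of the unit conversion described above, assuming the C timing is based on the standard clock() function. The helper names ticks_to_ms and ms_per_call are hypothetical and do not appear in the attached sources. In Java, a System.currentTimeMillis() difference is already in milliseconds, so no extra factor is needed there.

```c
#include <time.h>

/* Convert a clock() tick difference to milliseconds.
   clock() counts ticks; CLOCKS_PER_SEC ticks make one second,
   and one second is 1000 ms (milli = 10^-3). */
double ticks_to_ms(clock_t start, clock_t stop) {
    return (double)(stop - start) * 1000.0 / (double)CLOCKS_PER_SEC;
}

/* Average milliseconds per call when the timed loop ran `calls` times. */
double ms_per_call(clock_t start, clock_t stop, int calls) {
    return ticks_to_ms(start, stop) / (double)calls;
}
```

Typical use would be to call clock() before and after the loop and pass both values to ticks_to_ms, which keeps the factor 1000 in one place instead of repeating it at every print statement.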

Sorry for the error

Veit
0 Kudos
ArturGuzik
Valued Contributor I
1,570 Views
Quoting - veitner
Some questions are still open:
What about MKL in release mode (I did not change any of the debug/release flags in VS2008)?
Will the Intel C Compiler increase performance (maybe I'll download a trial edition to test)?


Thanks for the update.

(1) MKL is always in release mode; no switches are required (in fact, you are never allowed to distribute debug libs anyway).
(2) I think so. It's usually much faster than the MS compiler, and you have many optimization options available.

A.
0 Kudos
veitner
Beginner
1,570 Views
Quoting - ArturGuzik

Thanks for the update.

(1) MKL is always in release mode; no switches are required (in fact, you are never allowed to distribute debug libs anyway).
(2) I think so. It's usually much faster than the MS compiler, and you have many optimization options available.

A.
Hi Artur,

The funny thing about the call to the BLAS routine "dot" is:
C executable in debug mode: fast
C executable in release mode: slow
It's probably some MS-specific behavior.
I have attached all the files, including a makefile. Before running it you have to adjust the JAVAHOME variable to point to your JDK, and you have to supply the path to MKL on the command line.
Issue a command like
nmake MKLROOT="c:intelcompilerlatestmkl" lib32
Somehow the settings for jni.dll are not correct, but I can't spend more time figuring that out. So the DLL created by the makefile is not usable from Java (it produces wrong results, so probably the calling convention, integer size, real size, or something similar). Building the DLL inside VS gives a working DLL, which is also attached.
The call to gluegen is also a manual step:
java -cp gluegen.jar;antlr-2.7.5.jar com.sun.gluegen.GlueGen -I. -Ecom.sun.gluegen.JavaEmitter -Cfext.cfg fext-gg.h
I did not include a build script for the Java part, but there is a ready-made jar file:
java -jar jdp.jar


At some point I have to go through the compiler switches manual.

Regards
Veit

0 Kudos
veitner
Beginner
1,570 Views
Ah, I think I know the problem with the non-working jni.dll.
Because I had no idea how to add the include statement #include "fext.h" to the generated DotProd_JNI.c from the nmake makefile, I renamed fext.h to fext.c and linked the two files together to build the DLL. Otherwise the compiler complained about an unknown suffix .h.

Veit
0 Kudos
veitner
Beginner
1,570 Views
Quoting - veitner
Ah, I think I know the problem with the non-working jni.dll.
Because I had no idea how to add the include statement #include "fext.h" to the generated DotProd_JNI.c from the nmake makefile, I renamed fext.h to fext.c and linked the two files together to build the DLL. Otherwise the compiler complained about an unknown suffix .h.

Veit

To all the C gurus:
What is the difference between a simple header file like

double d(int n, double* a1, double* a2) {
    int i;
    double r = 0.;
    /* accumulate the dot product of a1 and a2 */
    for (i = 0; i < n; i++) {
        r += (a1[i] * a2[i]);
    }
    return r;
}

and a C file with the same content?

When I link the C file against another file f.c, I get different results than when I include the header file in f.c.
The header file and the C file are identical except for the extension.
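One likely explanation, assuming the calling file (f.c or the generated DotProd_JNI.c) has no declaration of the function in scope: a C89 compiler then assumes the function returns int, so the double result is misread, whereas including the header makes the real signature visible. The sketch below shows the usual split into a declaration and a single definition. The name call_d and the file layout given in the comments are illustrative assumptions, not the contents of the attached files.

```c
/* fext.h (assumed layout): declaration only. Every file that calls d()
   includes this, so the compiler knows the result is a double. */
double d(int n, const double *a1, const double *a2);

/* f.c side: with the prototype above in scope the call is compiled
   correctly. Without it, C89 would implicitly treat d() as returning int. */
double call_d(void) {
    static const double x[3] = {1.0, 2.0, 3.0};
    static const double y[3] = {4.0, 5.0, 6.0};
    return d(3, x, y);
}

/* fext.c: the one and only definition, compiled once and linked in. */
double d(int n, const double *a1, const double *a2) {
    double r = 0.0;
    int i;
    for (i = 0; i < n; i++) {
        r += a1[i] * a2[i];
    }
    return r;
}
```

With this split, linking fext.c separately and including fext.h in the caller should give the same results; the Microsoft compiler normally reports a missing prototype as warning C4013, which is worth checking in the build log.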

Attached are a new jdp.jar (loading libiomp5.dll is not needed) and a makefile with a target that generates the JNI binding (and adds the include of fext.h to the C source), which yields a working jni.dll.

The tests showed that the initial idea was nonsense.

As a spin-off, the result is a working sample that enables anybody to call Fortran code from Java via gluegen without writing a single line of binding code. Thanks to the developers of the JOGL project, who created that tool.

Veit.

0 Kudos