- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
First of all hello to everyone. I am having some problems with the cblas_zgemm function. I am trying to multiply two matrices, and all the input parameters are correct and in the right order. For some reason, i dont know why if the matrices are smaller than 15x15 the calculations are correct, and if they are bigger than that the 15-th column is filled only with zeros, and those are not the correct values, because i checked them in matlab. All other values are correct, only the 15-th column is wrong. I am using mkl 8.0.1. This gives me great trouble because the error than stackes, because there are more multiplications further. I would appreciate any help.
Just ot add that no matter what the matrix size iz, if it is greater than 15x15 the 15-th column gives bad results, then the 30-th column... The arguments ae as follows:
cblas_zgemm(CblasRowMajor, CblasNoTrans, CblasTrans, N, N, N1, α, A, N1, A, N1, β, C, N) where
alpha=(1,0) and beta=(0,0). Thans in advance.
Just ot add that no matter what the matrix size iz, if it is greater than 15x15 the 15-th column gives bad results, then the 30-th column... The arguments ae as follows:
cblas_zgemm(CblasRowMajor, CblasNoTrans, CblasTrans, N, N, N1, α, A, N1, A, N1, β, C, N) where
alpha=(1,0) and beta=(0,0). Thans in advance.
Link Copied
8 Replies
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Actually this version is no longer support.
Please check this probelm with the latest 10.3 beta and let us know if the probelm is still there.
Here is the linkwhichinvites you to participate in the Intel MKL 10.3 Beta program.
--Gennady
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Ok, will try. I would realy appreciate if anyone would go to the trouble to repeat my results, even with a different version of the mkl library. I have a relatively old computer, a 32 bit architecture, so i dont know if the new beta is realy for me. I made an experiment, and an arbitrary complex matrix works fine, but a mode matching matrix i am working with gives those fanthom zeros that drive me crazy. I mean there should be zeros in the matrix but not in those columns, or at least not in most of it. Here is the code, i forgot to copy the header files with the declarations but you can get a good picture from the code. Just call the function and see in the error . txt what you get. Any and all sugestions are welcome. Thank you all in advance.
#include
#include
#include
extern "C" {
#include"mkl_cblas.h"
#include"mkl_lapack.h"
}
using namespace std;
void Alfa::hstep(char* file, double a, double c, double x1, const int N, double f1, double f2, int k) {
const double c1 = 300000000;//299792458
const double PI = 3.141592653589793;
f1*=1000000000;
f2*=1000000000;
double st = (f2-f1)/k;
const int N1 = ((c/a*N)<(floor(c/a*N)+0.5)) ? (int)floor(c/a*N) : (int)ceil(c/a*N);
FILE *fs11 = fopen("Step.txt", "w");
FILE *fs12 = fopen("error.txt", "w");
complex *Bna, *Bnc;
Bna = new complex;
Bnc = new complex[N1];
MKL_Complex16 *Le = new MKL_Complex16[N*N1];
MKL_Complex16 *L = new MKL_Complex16[N*N];
MKL_Complex16 *Li = new MKL_Complex16[N*N], *Ld = new MKL_Complex16[N*N];
int *NLi = new int(N), *MLi = new int(N), *ldaLi = new int(N), *ipivLi = new int, *infoLi = new int(0);
int lworkLi = -1;
MKL_Complex16 *s11 = new MKL_Complex16[N*N];
MKL_Complex16 *s12 = new MKL_Complex16[N*N1];
MKL_Complex16 *s21 = new MKL_Complex16[N1*N];
MKL_Complex16 *s22 = new MKL_Complex16[N1*N1];
MKL_Complex16 *neka = new MKL_Complex16[N*N];
MKL_Complex16 *neka1 = new MKL_Complex16[N*N];
//
MKL_Complex16 *dummywork=new MKL_Complex16[1];
zgetri(NLi, Li, ldaLi, ipivLi, dummywork, &lworkLi, infoLi);
lworkLi=(int)(dummywork[0].real);
MKL_Complex16 *work=new MKL_Complex16[lworkLi];
//
for (double temp=f1; temp<=f2; temp+=st) {
double k0=2*PI*temp/c1;
for (int i=0; i = conj(sqrt(complex(k0*k0-((i+1)*PI/a)*((i+1)*PI/a),0)));
for (int i=0; i = conj(sqrt(complex(k0*k0-((i+1)*PI/c)*((i+1)*PI/c),0)));
for (int i=0; i for (int j=0; j complex prvi(c*a*Bnc.real(), c*a*Bnc.imag());
complex ctemp = sqrt(Bna/prvi);
double itemp = int1(a,c,i,j,x1);
complex ftemp(2*itemp*ctemp.real(), 2*itemp*ctemp.imag());
Le[i*N1+j].real = ftemp.real();
Le[i*N1+j].imag = ftemp.imag();
}
complex alpha(1,0), beta(0,0);
cblas_zgemm(CblasRowMajor, CblasNoTrans, CblasTrans, N, N, N1, α, Le, N1, Le, N1, β, L, N); <<<<<------------------------------------This is where the problem appeares
for (int i=0; i fprintf(fs12, "\n");
for (int j=0; j fprintf(fs12, "%lf\t %1d %.18lf %.18lf\n", temp/1000000000, i*N+j, L[i*N+j].real, L[i*N+j].imag);
}
}
////////////////////////////
int1.cpp
///////////////////////////
#include
using namespace std;
double Alfa::int1(double a, double c, int n, int m, double x1) {
const double PI = 3.141592653589793;
double ret = 0;
n++;
m++;
if (n*c!=a*m) {
double e = a*c*sin(PI*n*(c+x1)/a-PI*m)/(2*PI*(c*n-a*m));
double e1 = a*c*sin(PI*n*(c+x1)/a+PI*m)/(2*PI*(c*n+a*m));
double e2 = a*a*c*m*sin(PI*n*x1/a)/(PI*(a*m+c*n)*(a*m-c*n));
ret = e-e1+e2;
} else {
ret = c/2*cos(m*PI*x1/c);
}
return ret;
}
#include
#include
#include
extern "C" {
#include"mkl_cblas.h"
#include"mkl_lapack.h"
}
using namespace std;
void Alfa::hstep(char* file, double a, double c, double x1, const int N, double f1, double f2, int k) {
const double c1 = 300000000;//299792458
const double PI = 3.141592653589793;
f1*=1000000000;
f2*=1000000000;
double st = (f2-f1)/k;
const int N1 = ((c/a*N)<(floor(c/a*N)+0.5)) ? (int)floor(c/a*N) : (int)ceil(c/a*N);
FILE *fs11 = fopen("Step.txt", "w");
FILE *fs12 = fopen("error.txt", "w");
complex
Bna = new complex
Bnc = new complex
MKL_Complex16 *Le = new MKL_Complex16[N*N1];
MKL_Complex16 *L = new MKL_Complex16[N*N];
MKL_Complex16 *Li = new MKL_Complex16[N*N], *Ld = new MKL_Complex16[N*N];
int *NLi = new int(N), *MLi = new int(N), *ldaLi = new int(N), *ipivLi = new int
int lworkLi = -1;
MKL_Complex16 *s11 = new MKL_Complex16[N*N];
MKL_Complex16 *s12 = new MKL_Complex16[N*N1];
MKL_Complex16 *s21 = new MKL_Complex16[N1*N];
MKL_Complex16 *s22 = new MKL_Complex16[N1*N1];
MKL_Complex16 *neka = new MKL_Complex16[N*N];
MKL_Complex16 *neka1 = new MKL_Complex16[N*N];
//
MKL_Complex16 *dummywork=new MKL_Complex16[1];
zgetri(NLi, Li, ldaLi, ipivLi, dummywork, &lworkLi, infoLi);
lworkLi=(int)(dummywork[0].real);
MKL_Complex16 *work=new MKL_Complex16[lworkLi];
//
for (double temp=f1; temp<=f2; temp+=st) {
double k0=2*PI*temp/c1;
for (int i=0; i
for (int i=0; i
for (int i=0; i
complex
double itemp = int1(a,c,i,j,x1);
complex
Le[i*N1+j].real = ftemp.real();
Le[i*N1+j].imag = ftemp.imag();
}
complex
cblas_zgemm(CblasRowMajor, CblasNoTrans, CblasTrans, N, N, N1, α, Le, N1, Le, N1, β, L, N); <<<<<------------------------------------This is where the problem appeares
for (int i=0; i
for (int j=0; j
}
}
////////////////////////////
int1.cpp
///////////////////////////
#include
using namespace std;
double Alfa::int1(double a, double c, int n, int m, double x1) {
const double PI = 3.141592653589793;
double ret = 0;
n++;
m++;
if (n*c!=a*m) {
double e = a*c*sin(PI*n*(c+x1)/a-PI*m)/(2*PI*(c*n-a*m));
double e1 = a*c*sin(PI*n*(c+x1)/a+PI*m)/(2*PI*(c*n+a*m));
double e2 = a*a*c*m*sin(PI*n*x1/a)/(PI*(a*m+c*n)*(a*m-c*n));
ret = e-e1+e2;
} else {
ret = c/2*cos(m*PI*x1/c);
}
return ret;
}
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
don't doubt - the latest version supports all IA-32 Architecture systems generally compatible with the Intel Pentium processors, (for example, Intel Pentium 4 processor or Intel Xeon processor), or processors from other manufacturers supporting the same instruction set and running a 32-bit operating system.
--Gennady
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I've tried the same problem in mkl 10.2.2.025, the result was the same. Sorry about the bad news. Maybe i did something wrong in the definitions, or some parameter was wrong, or something else, but the fanthom zeros are there. If anybody is willing to repeat my results i would be gratefull. Any other idea is welcome. Thanks in advance.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Please give a specific example where the claimed error occurs, and state which compiler you used.
Please give source code that is complete, so that someone interested can compile and run the program to reproduce the errors that you saw. The code that you have given is incomplete (no main(), missing definition of Alfa), and you have not specified the values of variables (a, c, n, m, etc.) that are needed to enable running the program.
Please give source code that is complete, so that someone interested can compile and run the program to reproduce the errors that you saw. The code that you have given is incomplete (no main(), missing definition of Alfa), and you have not specified the values of variables (a, c, n, m, etc.) that are needed to enable running the program.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Will do. Im working in visual studio 2005. I will put all of the components, and the values that i used are the ones in the main program.
///////////////////////////
int1.h
//////////////////////////
#ifndef _int1_h_
#define _int1_h_
namespace Alfa {
double int1(double a, double c, int n, int m, double x1);
}
#endif
///////////////////////////
hstep.h
//////////////////////////
#ifndef _hstep_h_
#define _hstep_h_
namespace Alfa {
void hstep(char* file, double a, double c, double x1, int N, double f1, double f2, int k);
}
#endif
///////////////////////////
int1.cpp
//////////////////////////
#include "int1.h"
#include
using namespace std;
double Alfa::int1(double a, double c, int n, int m, double x1) {
const double PI = 3.141592653589793;
double ret = 0;
n++;
m++;
if (n*c!=a*m) {
double e = a*c*sin(PI*n*(c+x1)/a-PI*m)/(2*PI*(c*n-a*m));
double e1 = a*c*sin(PI*n*(c+x1)/a+PI*m)/(2*PI*(c*n+a*m));
double e2 = a*a*c*m*sin(PI*n*x1/a)/(PI*(a*m+c*n)*(a*m-c*n));
ret = e-e1+e2;
} else {
ret = c/2*cos(m*PI*x1/c);
}
return ret;
}
///////////////////////////
hstep.cpp
//////////////////////////
#include "hstep.h"
#include "int1.h"
#include
#include
#include
extern "C" {
#include"mkl_cblas.h"
#include"mkl_lapack.h"
}
using namespace std;
void Alfa::hstep(char* file, double a, double c, double x1, const int N, double f1, double f2, int k) {
const double c1 = 300000000;//299792458
const double PI = 3.141592653589793;
f1*=1000000000;
f2*=1000000000;
double st = (f2-f1)/k;
const int N1 = ((c/a*N)<(floor(c/a*N)+0.5)) ? (int)floor(c/a*N) : (int)ceil(c/a*N);
FILE *fs11 = fopen("Hstep.txt", "w");
FILE *fs12 = fopen("error.txt", "w");
complex *Bna, *Bnc;
Bna = new complex;
Bnc = new complex[N1];
MKL_Complex16 *Le = new MKL_Complex16[N*N1];
MKL_Complex16 *L = new MKL_Complex16[N*N];
MKL_Complex16 *Li = new MKL_Complex16[N*N], *Ld = new MKL_Complex16[N*N];
int *NLi = new int(N), *MLi = new int(N), *ldaLi = new int(N), *ipivLi = new int, *infoLi = new int(0);//*ipivLi = new int(0)
int lworkLi = -1;
MKL_Complex16 *s11 = new MKL_Complex16[N*N];
MKL_Complex16 *s12 = new MKL_Complex16[N*N1];
MKL_Complex16 *s21 = new MKL_Complex16[N1*N];
MKL_Complex16 *s22 = new MKL_Complex16[N1*N1];
MKL_Complex16 *neka = new MKL_Complex16[N*N];
MKL_Complex16 *neka1 = new MKL_Complex16[N*N];
//
MKL_Complex16 *dummywork=new MKL_Complex16[1];
zgetri(NLi, Li, ldaLi, ipivLi, dummywork, &lworkLi, infoLi);
lworkLi=(int)(dummywork[0].real);
MKL_Complex16 *work=new MKL_Complex16[lworkLi];
//
for (double temp=f1; temp<=f2; temp+=st) {
double k0=2*PI*temp/c1;
for (int i=0; i = conj(sqrt(complex(k0*k0-((i+1)*PI/a)*((i+1)*PI/a),0)));
for (int i=0; i = conj(sqrt(complex(k0*k0-((i+1)*PI/c)*((i+1)*PI/c),0)));
for (int i=0; i for (int j=0; j complex prvi(c*a*Bnc.real(), c*a*Bnc.imag());
complex ctemp = sqrt(Bna/prvi);
double itemp = int1(a,c,i,j,x1);
complex ftemp(2*itemp*ctemp.real(), 2*itemp*ctemp.imag());
Le[i*N1+j].real = ftemp.real();
Le[i*N1+j].imag = ftemp.imag();
}
complex alpha(1,0), beta(0,0);
cblas_zgemm(CblasRowMajor, CblasNoTrans, CblasTrans, N, N, N1, α, Le, N1, Le, N1, β, L, N); /*<<<<<<<<<<<-------------------This is the place where the problem begins, it happens in all the other places with cblas_zgemm*/
for (int i=0; i fprintf(fs12, "\n");
for (int j=0; j fprintf(fs12, "%lf\t %1d %.18lf %.18lf\n", temp/1000000000, i*N+j, L[i*N+j].real, L[i*N+j].imag);
}
}
for (int i=0; i for (int j=0; j Li[i*N+j]=L[i*N+j];
Ld[i*N+j]=L[i*N+j];
if (i==j) {
Li[i*N+j].real++;
Ld[i*N+j].real--;
}
}
zgetrf(MLi, NLi, Li, ldaLi, ipivLi, infoLi);
zgetri(NLi, Li, ldaLi, ipivLi, work, &lworkLi, infoLi);
cblas_zgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, N, N, N, α, Li, N, Ld, N, β, s11, N);
/*for (int i=0; i fprintf(fs12, "\n");
for (int j=0; j fprintf(fs12, "%lf\t %1d %.15lf %.15lf\n", temp/1000000000, i*N+j, s11[i*N+j].real, s11[i*N+j].imag);
}
}*/
alpha+=complex(1,0);
cblas_zgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, N, N1, N, α, Li, N, Le, N1, β, s12, N1);
for (int i=0; i for (int j=0; j neka[i*N+j].real = -s11[i*N+j].real;
neka[i*N+j].imag = -s11[i*N+j].imag;
if (i==j) neka[i*N+j].real++;
}
alpha-=complex(1,0);
cblas_zgemm(CblasRowMajor, CblasTrans, CblasNoTrans, N1, N, N, α, Le, N1, neka, N, β, s21, N);
cblas_zgemm(CblasRowMajor, CblasTrans, CblasNoTrans, N1, N1, N, α, Le, N1, s12, N1, β, neka1, N1);
for (int i=0; i for (int j=0; j s22[i*N1+j].real = -neka1[i*N1+j].real;
s22[i*N1+j].imag = -neka1[i*N1+j].imag;
if (i==j) s22[i*N1+j].real++;
}
if (temp==f1)
fprintf(fs11," > GHz %1d\n ", lworkLi);
//else if (temp>=f1) {
fprintf(fs11, "%lf\t %1d %1d 0 0 0 0 %.15lf %.15lf\n", temp/1000000000, 1, 1, s11[0].real, s11[0].imag);
fprintf(fs11, "%lf\t %1d %1d 0 0 0 0 %.15lf %.15lf\n", temp/1000000000, 1, 2, s12[0].real, s12[0].imag);
fprintf(fs11, "%lf\t %1d %1d 0 0 0 0 %.15lf %.15lf\n", temp/1000000000, 2, 1, s21[0].real, s21[0].imag);
fprintf(fs11, "%lf\t %1d %1d 0 0 0 0 %.15lf %.15lf\n", temp/1000000000, 2, 2, s22[0].real, s22[0].imag);
//}
}
fclose(fs11);
fclose(fs12);
delete[] Bna;
delete[] Bnc;
delete[] Le;
delete[] L;
delete[] Li;
delete NLi;
delete MLi;
delete ldaLi;
delete[] Ld;
delete[] work;
delete[] s11;
delete[] s12;
delete[] s21;
delete[] s22;
delete[] neka;
delete[] neka1;
delete[] ipivLi;
delete infoLi;
}
///////////////////////////
prog.cpp
//////////////////////////
#include
#include
#include
#include "int1.h"
#include "hstep.h"
extern "C" {
#include"mkl_cblas.h"
#include"mkl_lapack.h"
}
using namespace std;
int main() {
clock_t t1 = clock();
Alfa::hstep("", 0.1, 0.08, 0, 20, 1.8, 2.4, 3);
clock_t t2 = clock();
cout << (double)(t2-t1)/CLOCKS_PER_SEC;
}
Thanks to all, i hope that i didnt make any stupid errors.
///////////////////////////
int1.h
//////////////////////////
#ifndef _int1_h_
#define _int1_h_
namespace Alfa {
double int1(double a, double c, int n, int m, double x1);
}
#endif
///////////////////////////
hstep.h
//////////////////////////
#ifndef _hstep_h_
#define _hstep_h_
namespace Alfa {
void hstep(char* file, double a, double c, double x1, int N, double f1, double f2, int k);
}
#endif
///////////////////////////
int1.cpp
//////////////////////////
#include "int1.h"
#include
using namespace std;
double Alfa::int1(double a, double c, int n, int m, double x1) {
const double PI = 3.141592653589793;
double ret = 0;
n++;
m++;
if (n*c!=a*m) {
double e = a*c*sin(PI*n*(c+x1)/a-PI*m)/(2*PI*(c*n-a*m));
double e1 = a*c*sin(PI*n*(c+x1)/a+PI*m)/(2*PI*(c*n+a*m));
double e2 = a*a*c*m*sin(PI*n*x1/a)/(PI*(a*m+c*n)*(a*m-c*n));
ret = e-e1+e2;
} else {
ret = c/2*cos(m*PI*x1/c);
}
return ret;
}
///////////////////////////
hstep.cpp
//////////////////////////
#include "hstep.h"
#include "int1.h"
#include
#include
#include
extern "C" {
#include"mkl_cblas.h"
#include"mkl_lapack.h"
}
using namespace std;
void Alfa::hstep(char* file, double a, double c, double x1, const int N, double f1, double f2, int k) {
const double c1 = 300000000;//299792458
const double PI = 3.141592653589793;
f1*=1000000000;
f2*=1000000000;
double st = (f2-f1)/k;
const int N1 = ((c/a*N)<(floor(c/a*N)+0.5)) ? (int)floor(c/a*N) : (int)ceil(c/a*N);
FILE *fs11 = fopen("Hstep.txt", "w");
FILE *fs12 = fopen("error.txt", "w");
complex
Bna = new complex
Bnc = new complex
MKL_Complex16 *Le = new MKL_Complex16[N*N1];
MKL_Complex16 *L = new MKL_Complex16[N*N];
MKL_Complex16 *Li = new MKL_Complex16[N*N], *Ld = new MKL_Complex16[N*N];
int *NLi = new int(N), *MLi = new int(N), *ldaLi = new int(N), *ipivLi = new int
int lworkLi = -1;
MKL_Complex16 *s11 = new MKL_Complex16[N*N];
MKL_Complex16 *s12 = new MKL_Complex16[N*N1];
MKL_Complex16 *s21 = new MKL_Complex16[N1*N];
MKL_Complex16 *s22 = new MKL_Complex16[N1*N1];
MKL_Complex16 *neka = new MKL_Complex16[N*N];
MKL_Complex16 *neka1 = new MKL_Complex16[N*N];
//
MKL_Complex16 *dummywork=new MKL_Complex16[1];
zgetri(NLi, Li, ldaLi, ipivLi, dummywork, &lworkLi, infoLi);
lworkLi=(int)(dummywork[0].real);
MKL_Complex16 *work=new MKL_Complex16[lworkLi];
//
for (double temp=f1; temp<=f2; temp+=st) {
double k0=2*PI*temp/c1;
for (int i=0; i
for (int i=0; i
for (int i=0; i
complex
double itemp = int1(a,c,i,j,x1);
complex
Le[i*N1+j].real = ftemp.real();
Le[i*N1+j].imag = ftemp.imag();
}
complex
cblas_zgemm(CblasRowMajor, CblasNoTrans, CblasTrans, N, N, N1, α, Le, N1, Le, N1, β, L, N); /*<<<<<<<<<<<-------------------This is the place where the problem begins, it happens in all the other places with cblas_zgemm*/
for (int i=0; i
for (int j=0; j
}
}
for (int i=0; i
Ld[i*N+j]=L[i*N+j];
if (i==j) {
Li[i*N+j].real++;
Ld[i*N+j].real--;
}
}
zgetrf(MLi, NLi, Li, ldaLi, ipivLi, infoLi);
zgetri(NLi, Li, ldaLi, ipivLi, work, &lworkLi, infoLi);
cblas_zgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, N, N, N, α, Li, N, Ld, N, β, s11, N);
/*for (int i=0; i
for (int j=0; j
}
}*/
alpha+=complex
cblas_zgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, N, N1, N, α, Li, N, Le, N1, β, s12, N1);
for (int i=0; i
neka[i*N+j].imag = -s11[i*N+j].imag;
if (i==j) neka[i*N+j].real++;
}
alpha-=complex
cblas_zgemm(CblasRowMajor, CblasTrans, CblasNoTrans, N1, N, N, α, Le, N1, neka, N, β, s21, N);
cblas_zgemm(CblasRowMajor, CblasTrans, CblasNoTrans, N1, N1, N, α, Le, N1, s12, N1, β, neka1, N1);
for (int i=0; i
s22[i*N1+j].imag = -neka1[i*N1+j].imag;
if (i==j) s22[i*N1+j].real++;
}
if (temp==f1)
fprintf(fs11," > GHz %1d\n ", lworkLi);
//else if (temp>=f1) {
fprintf(fs11, "%lf\t %1d %1d 0 0 0 0 %.15lf %.15lf\n", temp/1000000000, 1, 1, s11[0].real, s11[0].imag);
fprintf(fs11, "%lf\t %1d %1d 0 0 0 0 %.15lf %.15lf\n", temp/1000000000, 1, 2, s12[0].real, s12[0].imag);
fprintf(fs11, "%lf\t %1d %1d 0 0 0 0 %.15lf %.15lf\n", temp/1000000000, 2, 1, s21[0].real, s21[0].imag);
fprintf(fs11, "%lf\t %1d %1d 0 0 0 0 %.15lf %.15lf\n", temp/1000000000, 2, 2, s22[0].real, s22[0].imag);
//}
}
fclose(fs11);
fclose(fs12);
delete[] Bna;
delete[] Bnc;
delete[] Le;
delete[] L;
delete[] Li;
delete NLi;
delete MLi;
delete ldaLi;
delete[] Ld;
delete[] work;
delete[] s11;
delete[] s12;
delete[] s21;
delete[] s22;
delete[] neka;
delete[] neka1;
delete[] ipivLi;
delete infoLi;
}
///////////////////////////
prog.cpp
//////////////////////////
#include
#include
#include
#include "int1.h"
#include "hstep.h"
extern "C" {
#include"mkl_cblas.h"
#include"mkl_lapack.h"
}
using namespace std;
int main() {
clock_t t1 = clock();
Alfa::hstep("", 0.1, 0.08, 0, 20, 1.8, 2.4, 3);
clock_t t2 = clock();
cout << (double)(t2-t1)/CLOCKS_PER_SEC;
}
Thanks to all, i hope that i didnt make any stupid errors.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The error is not caused by MKL, but by a misuse on your part of a floating point expression. When you call function int1 with a=0.1, c=0.08, n=14 and m=11, the test if (n*c!=a*m) can fail to perform as you intended it to. The test succeeds only if all bits of n*c and a*m coincide. However, because of the inability to represent numbers exactly, 15*0.08 need not equal 12*0.1. To see this, add the following statement prior to the test and look at the output produced:
For helpful discussions on this and related topics, see Goldberg's article.
There is a second lesson to be drawn from this example: it would have been impossible to make a diagnosis based on your first post in this thread. With the full code available, on the other hand, one could have given the Intel compiler the -Wall option to receive the helpful warning:
C = alpha op(A) op(B) + beta C
and, therefore, C has to be initialized before calling the routine, unless beta is zero and the elements of C are well-defined.
[cpp] if(n==15 && m==12)printf("n=%d, m=%d, n*c==a*m? %sn",n,m,n*c==a*m ? "yes":"no");To avoid this error, replace the test by something similar to
[/cpp]
[bash]if(fabs(n*c/(a*m)-1) > 1e-8)in whatever programming language you use (C++, Fortran, Matlab, etc.).
[/bash]
For helpful discussions on this and related topics, see Goldberg's article.
There is a second lesson to be drawn from this example: it would have been impossible to make a diagnosis based on your first post in this thread. With the full code available, on the other hand, one could have given the Intel compiler the -Wall option to receive the helpful warning:
[bash]int1.cpp(12): remark #1572: floating-point equality and inequality comparisons are unreliableAfter you have corrected for this point, you will need to fix another potential source of error. The GEMM routines perform the operation
if (n*c!=a*m) {
[/bash]
C = alpha op(A) op(B) + beta C
and, therefore, C has to be initialized before calling the routine, unless beta is zero and the elements of C are well-defined.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks man, you have helped me alot. Sorry for the stupid error, i am kind of a begginer. Live and learn, right. Thanks again.
Reply
Topic Options
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page