Convolution using vslzConvNewTask, vslzConvExec, etc .. crashes

Paul_Margosian · ‎01-19-2011

Trying to do a convolution, following directions in published reference material. Code compiles but crashes on vslzConvNewTask with

"Unhandled exception at 0x000000013ffcacff in GrappaTesting.exe: 0xC0000005: Access violation writing location 0x0000000000000004"

Code setting things up is below. Any ideas on what might be the problem?

Paul Margosian

int aux1 = sizeof(int);
VSLConvTaskPtr* task = (VSLConvTaskPtr*)aux1;
int xshape[2];// = {2,3};
int yshape[2];// = {256,512};
int zshape[2]; //= {256,512};
xshape[0] = kRows; //2
xshape[1] = kCols; //3
yshape[0] = numRows; //256
yshape[1] = numCols; //512
zshape[0] = numRows+kRows-1;
zshape[1] = numCols+kCols-1;

int xstride[2];
int ystride[2];
int zstride[2];
xstride[0] = 1;
xstride[1] = kCols;
ystride[0] = 1;
ystride[1] = numCols;
zstride[0] = 1;
zstride[1] = numCols;

int status;
const int dims = 2;
status = vslzConvNewTask(task, VSL_CONV_MODE_FFT, dims,xshape, yshape, zshape);

Vladimir_Petrov__Int · ‎01-19-2011

Paul,

Please change the task declaration to:
VSLConvTaskPtr task;
and the last line to:
status = vslzConvNewTask(&task, VSL_CONV_MODE_FFT, dims,xshape, yshape, zshape);

Best regards,
-Vladimir

Paul_Margosian · ‎01-20-2011

Vladimir,

Thanks. These fixes work. It builds, but crashes on the next key instruction:

status = vslzConvExec(&task, x, xstride, y, ystride, z, zstride);

Details below. Any insight on this one?

Paul Margosian

Another crash for unknown reasons. Being a new user is a struggle. Message:

Unhandled exception at 0x094626a0 in GrappaTesting.exe: 0xC0000005: Access violation at location 0x00000000094626a0.

Chain of code is what you see above, updated by your fixes:

VSLConvTaskPtr task = (VSLConvTaskPtr)aux1; ...... (your fix .. OK)
....
.... (load some parameters as above)
....
status = vslzConvNewTask(&task, VSL_CONV_MODE_FFT, dims,xshape, yshape, zshape); // (your fix .. OK)

// define x, y, z
MKL_Complex16 * x = new MKL_Complex16 [xshape[0]*xshape[1]];
MKL_Complex16 * y = new MKL_Complex16 [yshape[0]*yshape[1]];
MKL_Complex16 * z = new MKL_Complex16 [zshape[0]*zshape[1]];

.....
..... (load complex numbers into x, y)
.....
status = vslzConvExec(&task, x, xstride, y, ystride, z, zstride); /// crashes here

Vladimir_Petrov__Int · ‎01-20-2011

Paul,

As you have already learned, you need to initialize the task first (hence you must pass the address of the task variable). After that you just use it, i.e. you pass the task variable itself (without the ampersand), like this:

status = vslzConvExec(task, x, xstride, y, ystride, z, zstride); /// does not crash here

Please also feel free to enjoy the examples in \mkl\examples\vslc\source folder.

Best regards,
-Vladimir
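
To illustrate the two calling conventions discussed so far: the constructor-style call writes the new handle through its first argument, so it needs the address of a real handle variable, while the execution call only reads the handle and takes it by value. Casting the integer 4 (`sizeof(int)`) to a `VSLConvTaskPtr*` makes the call write the handle to address 4, which is consistent with the "writing location 0x0000000000000004" in the first error message. The sketch below only mimics the pattern; `FakeTask`, `FakeTaskPtr`, `fakeNewTask`, `fakeExec` and `fakeDeleteTask` are hypothetical stand-ins, not MKL declarations, so it compiles without the library.

```cpp
#include <cassert>

// Hypothetical stand-ins for an opaque library handle; these are NOT MKL types.
struct FakeTask { int dims; };
typedef FakeTask* FakeTaskPtr;   // the handle type is already a pointer

// Constructor-style call: writes the new handle through an out-parameter,
// so the caller must pass the address of a real handle variable.
int fakeNewTask(FakeTaskPtr* task, int dims) {
    assert(task != nullptr);     // a bogus address would crash on the write below
    *task = new FakeTask;
    (*task)->dims = dims;
    return 0;
}

// Execution-style call: only reads the handle, so it takes it by value.
int fakeExec(FakeTaskPtr task) {
    assert(task != nullptr);
    return task->dims;
}

// Destructor-style call: frees the task and resets the caller's handle.
int fakeDeleteTask(FakeTaskPtr* task) {
    assert(task != nullptr);
    delete *task;
    *task = nullptr;
    return 0;
}
```

Usage follows the same shape as the fix: declare `FakeTaskPtr task;`, call `fakeNewTask(&task, 2)`, then `fakeExec(task)`, and finally `fakeDeleteTask(&task)`.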

Paul_Margosian · ‎01-20-2011

Vladimir,

Thanks for the effort. Still crashes.

Trying to keep this simple, this is how the code looks now

int aux1 = sizeof(int);
VSLConvTaskPtr task = (VSLConvTaskPtr)aux1; // your fix
....
.... (define xstride, xshape, dims, etc) ...
....
status = vslzConvNewTask(&task, VSL_CONV_MODE_FFT, dims,xshape, yshape, zshape); // your fix
...
... (define x, y as before, load complex numbers) ..
...
status = vslzConvExec(task, x, xstride, y, ystride, z, zstride); // with fix, still crashes here

Have tried the example ... unfortunately it is quite far from this one .. 1D, single precision, etc .. got into the first troubles by following their pointer formats.

Is there some obscure problem with the contents of xshape, xstride, etc., or maybe with using "new" to define the x, y, z vectors?

Using VS 2008 for C++, Windows 7 Pro 64 bit, Dell T5500, in case any of these have an effect.

Still Stuck.

Paul Margosian

Vladimir_Petrov__Int · ‎01-21-2011

Paul,

Now please try one of the following:
- use NULLs (since you seem to expect the default data layout) instead of xstride, ystride, zstride in vslzConvExec() like this:
status = vslzConvExec(task, x, NULL, y, NULL, z, NULL);

- or set the strides properly like this:
xstride[0] = 1;
xstride[1] = kRows;
ystride[0] = 1;
ystride[1] = numRows;
zstride[0] = 1;
zstride[1] = numRows;

since C/C++ assumes row-major data layout (as opposed to FORTRAN).

Best regards,
-Vladimir

Paul_Margosian · ‎01-21-2011

Vladimir,

Thanks very much. Both of your proposed solutions work (build and run OK), BUT ...
answer is a matrix full of "NAN". Still, it's progress of sorts.

Will try experimenting with various orders of loading, etc .. this thing seems very sensitive and documentation is sketchy.

Have already:
* Done the convolution explicitly in C++ .. works fine, but is very slow.
* Tried own FFT approach using FFTW .. faster but have leftover phase problems that need debugging

vslzConvExec is quite fast, but produces bad answers so far.

If you have any further hints, I'd appreciate them.

Thanks again,

Paul Margosian
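
For checking the library output on small inputs, an explicit (direct) full 2-D convolution of the kind mentioned above can serve as a reference. The following is only a sketch and not Paul's code: the name `directConv2D` and the row-major layout (column index varies fastest) are assumptions made here.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::complex<double> cplx;

// Direct full 2-D linear convolution, O(kRows*kCols*numRows*numCols).
// All arrays are row-major: element (r, c) lives at index r*cols + c.
// The result has (numRows + kRows - 1) x (numCols + kCols - 1) elements.
std::vector<cplx> directConv2D(const std::vector<cplx>& x, std::size_t kRows, std::size_t kCols,
                               const std::vector<cplx>& y, std::size_t numRows, std::size_t numCols) {
    assert(kRows > 0 && kCols > 0 && numRows > 0 && numCols > 0);
    assert(x.size() == kRows * kCols);
    assert(y.size() == numRows * numCols);
    const std::size_t resRows = numRows + kRows - 1;
    const std::size_t resCols = numCols + kCols - 1;
    std::vector<cplx> z(resRows * resCols, cplx(0.0, 0.0));
    for (std::size_t kr = 0; kr < kRows; ++kr)
        for (std::size_t kc = 0; kc < kCols; ++kc)
            for (std::size_t r = 0; r < numRows; ++r)
                for (std::size_t c = 0; c < numCols; ++c)
                    // Each kernel tap adds a shifted, scaled copy of y into z.
                    z[(r + kr) * resCols + (c + kc)] += x[kr * kCols + kc] * y[r * numCols + c];
    return z;
}
```

Comparing this element by element against the library result for a tiny kernel and image is a quick way to spot a layout or stride mix-up before moving to 256x512 data.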

Vladimir_Petrov__Int · ‎01-21-2011

Paul,

I apologize for mixing the row/col-majornesses myself. At last here is a piece of code that actually works

#include "mkl_vsl.h"

#include 

// define LETS_USE_STRIDES if you want to use non-default strides
//#define LETS_USE_STRIDES

int main()
{
  int kRows = 2;
  int kCols = 3;
  int numRows = 16;
  int numCols = 8;
  int resRows = numRows+kRows-1;
  int resCols = numCols+kCols-1;

  VSLConvTaskPtr task;
  MKL_INT xshape[2];
  MKL_INT yshape[2];
  MKL_INT zshape[2];
  xshape[0] = kCols;
  xshape[1] = kRows;
  yshape[0] = numCols;
  yshape[1] = numRows;
  zshape[0] = resCols;
  zshape[1] = resRows;

#if defined(LETS_USE_STRIDES)
  MKL_INT xstride[2];
  MKL_INT ystride[2];
  MKL_INT zstride[2];

  xstride[0] = 1;
  xstride[1] = kCols;
  ystride[0] = 1;
  ystride[1] = numCols;
  zstride[0] = 1;
  zstride[1] = resCols;
#endif

  int status;
  const MKL_INT dims = 2;
  status = vslzConvNewTask(&task, VSL_CONV_MODE_FFT, dims,xshape, yshape, zshape);

  // define x, y, z
  MKL_Complex16 * x = new MKL_Complex16 [xshape[0]*xshape[1]];
  MKL_Complex16 * y = new MKL_Complex16 [yshape[0]*yshape[1]];
  MKL_Complex16 * z = new MKL_Complex16 [zshape[0]*zshape[1]];

  for (int row=0; row

Paul_Margosian · ‎01-21-2011

Vladimir,

I much appreciate your sample code.

By now I had figured out that the row-col thing was flipped .. went from a matrix full of "NAN" to an image that was very much shaded .. been struggling with that. Been trying all sorts of code permutations with no luck.

Will study your code and let you know if one of the steps proves crucial.

Thanks again.

Paul Margosian

Vladimir_Petrov__Int · ‎01-21-2011

Paul,

Besides the row-col flip-flop (introduced by me) the most crucial thing is the mistaken stride of the output (feel the difference between your code and line #38 in mine :))

Best regards,
-Vladimir
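
The point about the output stride can be checked with plain index arithmetic. With unit stride along the first dimension, element (i0, i1) sits at offset i0*stride[0] + i1*stride[1], so stride[1] must be at least the extent of the first dimension. The sketch below (the helper names `offsetOf` and `linesOverlap` are made up here) shows that reusing the input's `numCols` as the stride of a result that is `numCols+kCols-1` wide makes neighbouring lines of the output overlap.

```cpp
#include <cassert>
#include <cstddef>

// Offset of element (i0, i1) in a 2-D array described by strides,
// where index i0 runs along the first (fastest-varying) dimension.
std::size_t offsetOf(std::size_t i0, std::size_t i1, const std::size_t stride[2]) {
    return i0 * stride[0] + i1 * stride[1];
}

// True if the last element of line i1 lands at or beyond the first element
// of line i1 + 1, i.e. the layout makes neighbouring lines overlap.
bool linesOverlap(const std::size_t shape[2], const std::size_t stride[2]) {
    assert(shape[0] > 0 && shape[1] > 0);
    if (shape[1] < 2) return false;   // a single line cannot overlap anything
    return offsetOf(shape[0] - 1, 0, stride) >= offsetOf(0, 1, stride);
}
```

With the sizes from the listing (numCols = 8, kCols = 3, so the result is 10 wide and 17 high), strides {1, 8} overlap while {1, 10} do not.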

Paul_Margosian · ‎01-21-2011

Vladimir,

Update: only difference between formats in your code and mine (now) was usage of MKL_INT instead of int for some key constants. Results were the same.

My remaining mystery is that I have done the convolutions (really many of them) explicitly (kind of "by hand") and gotten a good answer. What I get using vslzConv... is kind of in the ballpark but with heavy shading. Got the same thing when trying to "do it myself" using fftw twice, multiply, etc. This is probably a mess of my own making .. possibly connected with the scrolling of data caused by most "in-place" FFT methods.

Whatever it turns out to be, I think you've done everything you can to help.

The application is quite an interesting one from MRI called "parallel imaging". For the sake of speed, one can leave out some of the standard measurements. Using multiple receivers, one can construct interpolation coefficients to use for calculating the missing data points. One can turn the small arrays of interpolation coefficients into true convolution kernels. So here I am, trying to do the appropriate convolutions as quickly as possible... and not doing so well.

Anyway, unless you know something very special about location or orientation of data in the input matrices or some other semi-magic trick, I probably have a debugging task to do and that's about it.

Thanks again.

Paul Margosian

Paul_Margosian · ‎01-21-2011

Vladimir,

Oops, just noticed the larger size for the output z matrix. Tried that and got very different output. The game now will be to guess just which 256x512 set of points to extract from z for the final output. Just pulling out the central ones didn't do the right thing, but there are many combinations yet to try.

You've put your finger on something important. Will let you know how it plays out.

What I'm getting now, if done up in color, might be considered a peculiar kind of art. Since it's really supposed to be a human brain, that idea might not be very popular.

Thanks again.

Paul Margosian
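
One way to take some of the guesswork out of the extraction: if the kernel's origin is taken to be its centre element, the same-size result is the block of the full output that starts at row (kRows-1)/2 and column (kCols-1)/2. That centring convention is an assumption made for this sketch, not something the thread establishes for Paul's kernels; a kernel whose origin sits elsewhere needs a different offset. The helper name `cropSame` is hypothetical.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::complex<double> cplx;

// Copy a numRows x numCols block out of a row-major full-convolution result
// of size (numRows + kRows - 1) x (numCols + kCols - 1).
// ASSUMPTION: the kernel origin is its centre, so the block starts at
// ((kRows - 1) / 2, (kCols - 1) / 2); other origins need other offsets.
std::vector<cplx> cropSame(const std::vector<cplx>& zFull,
                           std::size_t numRows, std::size_t numCols,
                           std::size_t kRows, std::size_t kCols) {
    assert(kRows > 0 && kCols > 0);
    const std::size_t resCols = numCols + kCols - 1;
    assert(zFull.size() == (numRows + kRows - 1) * resCols);
    const std::size_t r0 = (kRows - 1) / 2;   // first row to keep
    const std::size_t c0 = (kCols - 1) / 2;   // first column to keep
    std::vector<cplx> out(numRows * numCols);
    for (std::size_t r = 0; r < numRows; ++r)
        for (std::size_t c = 0; c < numCols; ++c)
            out[r * numCols + c] = zFull[(r + r0) * resCols + (c + c0)];
    return out;
}
```

Note that the source index uses `resCols`, not `numCols`, as the row length of the full result; mixing the two is exactly the kind of indexing slip that produces a sheared image.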

Paul_Margosian · ‎01-21-2011

Vladimir,

Double checked this after your note. Got all excited for a moment, but the big changes I got were just an indexing mistake. Making Z big enough for true convolution to work all the way to the edge is a good thing, but did not change the appearance of my output images.

I've got some other problem and will just have to struggle with it.

Thanks for your help.

Paul Margosian

Paul_Margosian · ‎01-22-2011

Vladimir,

Found my own mistakes, so all is well. Building some images by a linear combination of these convolved pieces .. and I was doing my sums incorrectly.

Thanks for your help to get this function operating. Just wish it were a little faster.

Paul Margosian

Vladimir_Petrov__Int · ‎01-22-2011

Paul,

I'm glad your app is finally running as expected.
Is it the MKL perf or my responsiveness you are referring to when you say "faster" :)?

Best regards,
-Vladimir

Paul_Margosian · ‎01-24-2011

Vladimir,

You're funny. Without your help would still be trying to psychoanalyze the documentation.

Just put in a new thread seeking ideas for making vslzConvExec faster when dealing with a lot of convolutions.

Paul Margosian

apolo74 · ‎07-25-2011

Quoting Vladimir Petrov (Intel)

Paul,

Please change the task declaration to:
VSLConvTaskPtr task;
and the last line to:
status = vslzConvNewTask(&task, VSL_CONV_MODE_FFT, dims,xshape, yshape, zshape);

Best regards,
-Vladimir

Why is it not possible to use 'VSLConvTaskPtr* task'? In my application I'm forced to use a pointer, since that variable will be used by several functions of my class. The documentation says:

vslsConvNewTaskX1D( VSLConvTaskPtr* task, ...
vslConvSetStart( VSLConvTaskPtr task, ...
vslsConvExecX1D( VSLConvTaskPtr task, ...
vslConvDeleteTask( VSLConvTaskPtr* task );

The only thing that works is:

VSLConvTaskPtr task;
vslsConvNewTask1D( &task, ...
vslConvSetStart( task, ...
vslsConvExec1D( task, ...
vslConvDeleteTask( &task );

How can I translate this into a pointer-based convolution?

Boris

PS: [ Solved ] I see now that 'VSLConvTaskPtr' is already a '*' so I can use it across my functions. Sorry for the waste of space and time.
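
Since the handle type is itself a pointer, a class can store it as an ordinary member and hand it to each method. A minimal sketch of that idea follows; `DemoTask`, `DemoTaskPtr`, `demoNewTask`, `demoExec`, `demoDeleteTask` and `Convolver` are hypothetical stand-ins rather than MKL declarations, so the block compiles without the library.

```cpp
#include <cassert>

// Hypothetical stand-ins for an opaque library handle; NOT MKL declarations.
struct DemoTask { int runs; };
typedef DemoTask* DemoTaskPtr;

int demoNewTask(DemoTaskPtr* task)    { *task = new DemoTask; (*task)->runs = 0; return 0; }
int demoExec(DemoTaskPtr task)        { return ++task->runs; }
int demoDeleteTask(DemoTaskPtr* task) { delete *task; *task = nullptr; return 0; }

// Owns one handle for its whole lifetime and shares it between methods.
class Convolver {
public:
    Convolver() : task_(nullptr) {
        const int status = demoNewTask(&task_);   // address of the member
        assert(status == 0);
        (void)status;                             // silence unused warning under NDEBUG
    }
    ~Convolver() { if (task_ != nullptr) demoDeleteTask(&task_); }

    // Copying would make two objects delete the same task, so forbid it.
    Convolver(const Convolver&) = delete;
    Convolver& operator=(const Convolver&) = delete;

    int run() { return demoExec(task_); }         // handle passed by value

private:
    DemoTaskPtr task_;
};
```

Every method sees the same task through `task_`, and the destructor releases it once, which mirrors the create / use / delete sequence that worked above.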
