Arrigoni__Viviana

Beginner

06-20-2019
02:11 AM

scalapack psgemm fails

Hi.

I want to use psgemm to multiply a matrix A on the left by its transpose, namely: C = A^T * A
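For checking a distributed result on a small case, a serial reference for the same product can help. The following is a minimal sketch, not taken from the original post: `ata_reference` is a hypothetical helper name, and it assumes column-major storage, which is the layout ScaLAPACK uses for local arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Serial reference for C = A^T * A (hypothetical helper, for checking only).
 * A is m x n, C is n x n, both column-major with leading dimensions m and n. */
void ata_reference(const float *A, float *C, int m, int n)
{
    assert(A != NULL && C != NULL && m > 0 && n > 0);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            float sum = 0.0f;
            /* C(i,j) is the dot product of columns i and j of A */
            for (int k = 0; k < m; ++k)
                sum += A[k + (size_t)i * m] * A[k + (size_t)j * m];
            C[i + (size_t)j * n] = sum;
        }
    }
}
```

Gathering the distributed C onto one process and comparing it entry by entry (within a floating-point tolerance) against this reference is one way to confirm that a psgemm call computes what was intended.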

I am trying to run a simple program with 9 processors placed in a 3x3 grid. Each processor generates a random square matrix, Ablock, of size block_dim x block_dim (choosing here block_dim = 10), that is a submatrix of A. Hence the global matrix A is a 30x30 matrix.

myrow and mycol are the row and column indexes of each process in the grid.

idesca and idescc are the descriptors of matrices A and C.

C is initialized as a block_dim x block_dim array.

I call psgemm as follows, but I get a segmentation fault:

    psgemm_('T', 'N', block_dim, block_dim, block_dim, &one,
            Ablock, myrow * block_dim, mycol * block_dim, idesca,
            Ablock, myrow * block_dim, mycol * block_dim, idescal,
            &zero, C, myrow * block_dim, mycol * block_dim, idescc);

What am I doing wrong?
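A few things in the call above are worth checking against the PBLAS conventions. psgemm_ is a Fortran-style interface, so from C every argument is passed by address, including the 'T'/'N' flags, the dimensions and the ia/ja offsets. The m, n and k arguments are dimensions of the global operands rather than of the local block, and ia/ja are 1-based global indices of the first row and column of the submatrix. Also, `idescal` does not match the declared `idesca` and may be a typo. The sketch below only illustrates the index convention; `PgemmDims`, `full_matrix_dims` and `block_start_1based` are hypothetical names, not part of MKL.

```c
#include <assert.h>

/* Hypothetical helper type: the integer arguments psgemm would take for
 * C = A^T * A when the whole n x n distributed matrices are used. */
typedef struct {
    int m, n, k;   /* global dimensions of the operation */
    int ia, ja;    /* 1-based global start of sub(A) */
    int ic, jc;    /* 1-based global start of sub(C) */
} PgemmDims;

PgemmDims full_matrix_dims(int n_global)
{
    assert(n_global > 0);
    PgemmDims d;
    d.m = d.n = d.k = n_global;  /* not the per-process block size */
    d.ia = d.ja = 1;             /* Fortran-style: first element is (1,1) */
    d.ic = d.jc = 1;
    return d;
}

/* 1-based global index of the first row (or column) of the block owned by
 * grid coordinate `coord`, assuming one block_dim-wide block per process. */
int block_start_1based(int coord, int block_dim)
{
    assert(coord >= 0 && block_dim > 0);
    return coord * block_dim + 1;
}
```

In an actual call these values would be passed as `&d.m`, `&d.ia` and so on. Note that `myrow * block_dim` evaluates to 0 on the first grid row, which is not a valid 1-based index.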

4 Replies

Gennady_F_Intel

Moderator

06-23-2019
10:15 PM

Arrigoni__Viviana

Beginner

06-24-2019
01:20 AM

Here is what I do before calling psgemm:

    int n_procs, id;
    int nprow, npcol, myrow, mycol, np, info;
    int idesca[9], idescc[9];
    int block_dim, n;

    Cblacs_pinfo(&id, &n_procs);
    np = (ceil)(sqrtf((double)n_procs));
    n = 27;
    block_dim = n / np;

    Cblacs_get( -1, 0, &icon );
    Cblacs_gridinit( &icon,"r", np, np );
    Cblacs_gridinfo( icon, &nprow, &npcol, &myrow, &mycol);

    descinit_(idesca, &n, &n , &block_dim, &block_dim , &izero, &izero, &icon, &block_dim, &info);
    descinit_(idescc, &n, &n, &block_dim, &block_dim, &izero, &izero, &icon, &block_dim, &info);

    float *Ablock = (float*)calloc(block_dim * block_dim, sizeof(float));
    float *C = (float*)calloc(block_dim * block_dim, sizeof(float));

    for (int k = 0; k < n_procs; ++k){
        generate_rand_mtx(Ablock, block_dim, block_dim);
    }
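The setup above sizes each local buffer as block_dim * block_dim, which matches the descriptors only when n divides evenly over the grid (27 / 3 = 9 with 9 processes). In general ScaLAPACK's numroc_ routine gives the local row or column count for a block-cyclic distribution. The sketch below mirrors that arithmetic in plain C so it can be checked without MPI; `local_extent` is a hypothetical name, and the code is an illustration rather than the library source.

```c
#include <assert.h>

/* Number of rows (or columns) of an n-element global dimension, split into
 * blocks of nb and dealt round-robin over nprocs processes, that land on
 * process iproc. isrc is the process that holds the first block. */
int local_extent(int n, int nb, int iproc, int isrc, int nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0);
    assert(iproc >= 0 && iproc < nprocs && isrc >= 0 && isrc < nprocs);
    int mydist  = (nprocs + iproc - isrc) % nprocs; /* distance from source */
    int nblocks = n / nb;                           /* full blocks overall */
    int extent  = (nblocks / nprocs) * nb;          /* full rounds of blocks */
    int extra   = nblocks % nprocs;                 /* leftover full blocks */
    if (mydist < extra)
        extent += nb;        /* this process gets one more full block */
    else if (mydist == extra)
        extent += n % nb;    /* this process gets the trailing partial block */
    return extent;
}
```

With n = 27, nb = 9 and a 3-wide grid this returns 9 for every process, in line with the buffers allocated above. If the process count is not a perfect square, ceil(sqrt(n_procs)) asks for a grid larger than the number of available processes, which is another thing to guard against.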

Where generate_rand_mtx is the following function:

    void generate_rand_mtx(float *A, int m , int n){
        int seed = time(NULL);
        srand(seed);
        for (int i = 0; i < m; ++i){
            for (int j = 0; j < n; ++j)
                A[i * n + j] = (float)rand()/(float)(RAND_MAX);
        }
        return;
    }
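One side effect of seeding with time(NULL) inside the function is that processes started within the same second will likely draw the same sequence, so every block of A may come out identical. This is unrelated to the segmentation fault but can make results confusing. A minimal sketch of a variant that takes the seed from the caller follows; `generate_rand_mtx_seeded` is a hypothetical name, and deriving the seed from the process id is an assumption about what is wanted here.

```c
#include <assert.h>
#include <stdlib.h>

/* Fill an m x n matrix with pseudo-random values in [0, 1].
 * The caller supplies the seed, e.g. a base seed plus the BLACS process id,
 * so that each process can draw a different sequence. */
void generate_rand_mtx_seeded(float *A, int m, int n, unsigned int seed)
{
    assert(A != NULL && m > 0 && n > 0);
    srand(seed);
    for (int i = 0; i < m * n; ++i)
        A[i] = (float)rand() / (float)RAND_MAX;
}
```

A call such as `generate_rand_mtx_seeded(Ablock, block_dim, block_dim, (unsigned int)time(NULL) + (unsigned int)id);` would then give each process its own block, and a fixed base seed makes runs reproducible.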

I compile the code in this way:

mpiicc -std=c99 -DMKL_LP64 -I${MKLROOT}/include -o atasc AtA_scalapack.c -L${MKLROOT}/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm -ldl

and I run it with 9 processors:

mpirun -np 9 ./atasc

Gennady_F_Intel

Moderator

06-24-2019
01:58 AM

**-DMKL_LP64**: what is that? If you want to link against the ILP64 API, please set -DMKL_ILP64 and link with the ILP64 libraries.

Arrigoni__Viviana

Beginner

06-24-2019
02:10 AM
