Hi all, I rewrote a program to exploit SIMD.
However, I am running into a segmentation fault. How do I find what's wrong?
Attached are the files: the original program and the test program with SIMD that fails.
test.F90 is just loop.F90 with minor modifications, but for some reason it fails, and I don't know why.
I use the Intel 15 compilers.
Any help?
To use SIMD, your arrays must be aligned to the vector width (64 bytes on MIC).
Insert alignment tests (prior to the loop). Find out what is in error and fix it.
If I were to make a guess, TLT is not aligned, which leaves the components inside it unaligned as well. (Though it may be any of the other arrays, like LMASK and/or KAPPA_THIC.)
Note, after aligning TLT, be careful about changing the dimensions of the components. As currently expressed, the array extents are multiples of the vector width. If it is necessary to change the extents, then you may require some pad cells.
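To illustrate the pad-cell remark, here is a minimal sketch of rounding a leading extent up to a whole number of vector lanes. The extent of 10 and the 8-byte element size are assumptions chosen for illustration, not values from the attached files. With an unpadded leading extent of 10 eight-byte elements, the second column would start 80 bytes into the array, which is not a multiple of 64.

```fortran
program pad_extent
  implicit none
  integer, parameter :: vec_bytes  = 64   ! vector width on MIC, as stated in the thread
  integer, parameter :: elem_bytes = 8    ! assumption: 8-byte (double precision) elements
  integer, parameter :: lanes = vec_bytes / elem_bytes
  integer :: n, padded

  n = 10                                  ! hypothetical new leading extent
  ! Round up to the next multiple of the lane count so every column
  ! starts on a vector boundary when the array itself is aligned.
  padded = ((n + lanes - 1) / lanes) * lanes

  print '(a,i0,a,i0,a,i0,a)', 'extent ', n, ' padded to ', padded, &
        ' (', padded - n, ' pad cells)'
end program pad_extent
```

The array would then be declared with `padded` as its leading extent while the loops still run over `1..n`.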
The questions for you to figure out are: a) Will the compiler optimization make use of LMASK in masked stores? And b) Is your hit ratio on LMASK==.true. high enough to warrant use of masked stores?
Jim Dempsey
I guess it's my mistake. I didn't mean that the error was with the addition.
What I meant is that I "modernized" the code to make better use of MIC.
The original code (loop.F90) compiles, and running it (i.e. ./a.out) gives no errors.
But after making a few changes (the result is test.F90) I get a SIGSEGV.
Even if I remove the !dir$ SIMD, I get the same SIGSEGV in test.F90. The errors are related to the calculation of WORK1 and WORK3.
What could be causing it in the WORK1 and WORK3 calculation?
Add diagnostic printouts of the addresses of all the arrays you expect/require to be vectorized. Use the LOC function on the first element of each:
LMASK(1,1)
TLT%K_LEVEL(1,1,1)
KMT(1,1,1)
TLT%ZTW(1,1,1)
WORK1(1,1,1)
KAPPA_THIC(1,1,1,1,1)
SLX(1,1,1,1,1,1)
WORK3(1,1,1)
SLY(1,1,1,1,1,1)
If any of those addresses are not vector aligned (64 bytes on MIC), then you will have SIGSEGV issues. Use the alignment attributes on your variable declarations to ensure alignment.
You can insert the tests or printouts inside a conditional section. Essentially you are inserting what a C/C++/C# programmer calls an assert.
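A minimal sketch of such an address check might look like the following. The array shapes and the `report` helper are hypothetical stand-ins, not taken from loop.F90 or test.F90. LOC is a compiler extension (available in Intel Fortran and gfortran), and the `!dir$ attributes align` line is an Intel directive that other compilers treat as a plain comment, so the result will depend on the compiler used.

```fortran
program check_alignment
  use iso_fortran_env, only: int64, real64
  implicit none
  integer(int64), parameter :: vec_bytes = 64_int64  ! vector width on MIC, per the thread

  ! Hypothetical stand-ins for two of the arrays named in the thread.
  real(real64) :: work1(8,8,2)
  logical      :: lmask(8,8)
  ! Intel-specific alignment request; ignored as a comment elsewhere.
  !dir$ attributes align: 64 :: work1, lmask

  work1 = 0.0_real64
  lmask = .false.

  ! LOC returns the address of its argument; test the first element.
  call report('WORK1', int(loc(work1(1,1,1)), int64))
  call report('LMASK', int(loc(lmask(1,1)), int64))

contains

  ! Hypothetical helper: prints whether addr is a multiple of vec_bytes.
  subroutine report(name, addr)
    character(len=*), intent(in) :: name
    integer(int64),   intent(in) :: addr
    if (mod(addr, vec_bytes) == 0_int64) then
      print '(a,a,z0)', name, ' is aligned, address 0x', addr
    else
      print '(a,a,z0,a,i0)', name, ' is NOT aligned, address 0x', addr, &
            ', offset ', mod(addr, vec_bytes)
    end if
  end subroutine report

end program check_alignment
```

In the real code the same call would be repeated for each array in the list, placed just before the loop and wrapped in a preprocessor conditional so it can be compiled out.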
After you fix the alignment issue, you may find that the compiler optimizes the following form better:
do j=1,8
  do i=1,8
    LMASK(i,j) = TLT%K_LEVEL(i,j,bid) == k .and. &
                 TLT%K_LEVEL(i,j,bid) < KMT(i,j,bid) .and. &
                 TLT%ZTW(i,j,bid) == 1
  enddo
enddo
do j=1,8
  do i=1,8
    if ( LMASK(i,j) ) WORK1(i,j,kk) = KAPPA_THIC(i,j,kbt,k,bid) &
                                    * SLX(i,j,kk,kbt,k,bid) * dz(k)
    if ( LMASK(i,j) ) WORK3(i,j,kk) = KAPPA_THIC(i,j,kbt,k,bid) &
                                    * SLY(i,j,kk,kbt,k,bid) * dz(k)
  enddo
enddo
While the compiler optimization may already have figured out the single nested loop structure, it should definitely be able to figure out the structure with two nested loops and use the vector masked move instructions.
Note, vectorizing the above is efficient only if a high fraction of the LMASK(i,j) entries are .true.
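One way to gauge that fraction is to count the .true. entries after the mask loop has run. The sketch below fills LMASK with an arbitrary made-up pattern purely so that it runs on its own; in the real code the mask would come from the TLT%K_LEVEL, KMT and TLT%ZTW comparison, and what counts as "high enough" would have to be found by timing.

```fortran
program mask_hit_ratio
  implicit none
  logical :: lmask(8,8)
  integer :: i, j
  real    :: ratio

  ! Hypothetical mask pattern, for illustration only.
  do j = 1, 8
    do i = 1, 8
      lmask(i,j) = mod(i + j, 4) == 0
    end do
  end do

  ! Fraction of entries for which the masked store would actually write.
  ratio = real(count(lmask)) / real(size(lmask))
  print '(a,f6.3)', 'fraction of .true. in LMASK: ', ratio
end program mask_hit_ratio
```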
Jim Dempsey