Hi, Intel technician,
I revised our Fortran application code according to the instructions given in the following paper:
However, the performance of our Fortran application did not improve. On the contrary, it got worse.
For example, if I add !dir$ vector aligned before a do loop:
!dir$ vector aligned
do itt=1, jgm
  hmat(itt,mdnorp)=cs(itt) * hmat(itt,mdnorp)-sn(itt) * hmat(itt,mdnorp)
end do
Could you tell me what might have caused the performance to get worse? Do you have any standard Fortran OpenMP programs available to the public that confirm the results given in the above paper?
I look forward to hearing from you.
For the example you show, vector aligned could improve performance only if you can guarantee alignment for every value of mdnorp that is used, for example by making the leading dimension of hmat a multiple of 16 for 32-bit data, so that every column starts on an aligned boundary. If the loop is long enough to measure performance accurately, any gain from the directive may be negligible.
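As a rough illustration of the alignment condition described above, here is a minimal sketch. The sizes jgm = 100 and ncol = 8, the 64-byte alignment target, and the program name are assumptions made for the example, not values from the original application. The leading dimension is padded up to a multiple of 16 so that each column of 32-bit reals starts a whole number of 64-byte blocks after the previous one. The !dir$ attributes align line is an Intel Fortran directive; other compilers treat it as a comment, which is why the sketch also checks the column addresses at run time.

```fortran
program aligned_columns
  use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t
  implicit none
  integer, parameter :: jgm  = 100   ! assumed loop length (hypothetical)
  integer, parameter :: ncol = 8     ! assumed number of columns (hypothetical)
  ! Round the leading dimension up to a multiple of 16 reals (16 * 4 = 64 bytes);
  ! this gives 112 for jgm = 100.
  integer, parameter :: ld = ((jgm + 15) / 16) * 16
  real, target :: hmat(ld, ncol)
  real :: cs(ld), sn(ld)
  ! Intel-specific request for 64-byte alignment of the first element of each array.
  !dir$ attributes align : 64 :: hmat, cs, sn
  integer :: itt, mdnorp
  integer(c_intptr_t) :: addr
  logical :: all_aligned

  hmat = 1.0
  cs   = 0.75
  sn   = 0.25

  ! Check that the first element of every column sits on a 64-byte boundary.
  all_aligned = .true.
  do mdnorp = 1, ncol
    addr = transfer(c_loc(hmat(1, mdnorp)), addr)
    if (mod(addr, 64_c_intptr_t) /= 0) all_aligned = .false.
  end do
  print *, 'leading dimension:', ld
  print *, 'every column 64-byte aligned:', all_aligned

  ! The directive asserts that all array accesses in the loop are aligned;
  ! it is only safe when the check above holds for every mdnorp used.
  do mdnorp = 1, ncol
    !dir$ vector aligned
    do itt = 1, jgm
      hmat(itt, mdnorp) = cs(itt) * hmat(itt, mdnorp) - sn(itt) * hmat(itt, mdnorp)
    end do
  end do
  print *, 'hmat(1,1) =', hmat(1, 1)
end program aligned_columns
```

If the check reports that some column is not aligned, the directive should be removed for that loop, since asserting alignment that does not hold can produce a fault or wrong results with compilers that honor it.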