Hello!
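I've tried to remove the reduction(+: ...) clause in order to reduce OpenMP stack memory usage. After some experimenting, I arrived at the following code, which accumulates into thread-local arrays and merges them in critical sections: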
!$omp parallel default(none) &
!$omp shared(mol, itbl, qvec, gExp, hardness, nat, djdL, djdr, djdtr) &
!$omp private(iat, jat, ish, jsh, ii, jj, iid, jid, r1, g1, gij, vec, dG, dS, djdL_omp, djdr_omp, djdtr_omp)
! Thread-local accumulators (only compiled when OpenMP is enabled)
!$ allocate(djdL_omp(size(djdL, dim=1), size(djdL, dim=2), size(djdL, dim=3)), source = 0.0_wp)
!$ allocate(djdr_omp(size(djdr, dim=1), size(djdr, dim=2), size(djdr, dim=3)), source = 0.0_wp)
!$ allocate(djdtr_omp(size(djdtr, dim=1), size(djdtr, dim=2)), source = 0.0_wp)
! Inside the associate block the shared names refer to the thread-local copies
!$ associate (djdL => djdL_omp, djdr => djdr_omp, djdtr => djdtr_omp)
!$omp do collapse(2) schedule(dynamic,32)
do iat = 1, nat
   do jat = 1, nat
      if (jat >= iat) cycle
      ii = itbl(1, iat)
      iid = mol%id(iat)
      jj = itbl(1, jat)
      jid = mol%id(jat)
      vec(:) = mol%xyz(:, jat) - mol%xyz(:, iat)
      r1 = sqrt(sum(vec**2))
      do ish = 1, itbl(2, iat)
         do jsh = 1, itbl(2, jat)
            gij = gamAverage(hardness(ish, iid), hardness(jsh, jid))
            g1 = 1.0_wp / (r1**gExp + gij**(-gExp))
            dG(:) = -vec*r1**(gExp-2.0_wp) * g1 * g1**(1.0_wp/gExp)
            dS(:, 1) = 0.5_wp * dG(1) * vec
            dS(:, 2) = 0.5_wp * dG(2) * vec
            dS(:, 3) = 0.5_wp * dG(3) * vec
            djdr(:, iat, jj+jsh) = djdr(:, iat, jj+jsh) - dG*qvec(ii+ish)
            djdr(:, jat, ii+ish) = djdr(:, jat, ii+ish) + dG*qvec(jj+jsh)
            djdtr(:, jj+jsh) = djdtr(:, jj+jsh) + dG*qvec(ii+ish)
            djdtr(:, ii+ish) = djdtr(:, ii+ish) - dG*qvec(jj+jsh)
            djdL(:, :, jj+jsh) = djdL(:, :, jj+jsh) + dS*qvec(ii+ish)
            djdL(:, :, ii+ish) = djdL(:, :, ii+ish) + dS*qvec(jj+jsh)
         end do
      end do
   end do
end do
!$omp end do nowait
!$ end associate
! Merge the thread-local results into the shared arrays, one array per named critical section
!$omp critical (djdr_crt)
!$ djdr(:,:,:) = djdr + djdr_omp
!$omp end critical (djdr_crt)
!$omp critical (djdL_crt)
!$ djdL(:,:,:) = djdL + djdL_omp
!$omp end critical (djdL_crt)
!$omp critical (djdtr_crt)
!$ djdtr(:,:) = djdtr + djdtr_omp
!$omp end critical (djdtr_crt)
!$omp end parallel
See https://github.com/grimme-lab/xtb/pull/1350
However, while this code works fine with gfortran + OpenMP, it does not behave well with ifx + OpenMP.
Possible causes:
- the associate construct inside the !$omp parallel region
- shadowing already existing (shared) variables with the associate names
- something else...
The tested ifx version is 2025.2.0.
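To narrow down which of the suspected causes matters, a reduced test case might help. The following is only a sketch under assumptions: the module accumulate_demo, the routines accumulate_serial, accumulate_associate and accumulate_explicit, and the arrays acc, acc_omp and acc_loc are made-up names for illustration and are not taken from xtb. The associate variant mirrors the pattern above on a toy accumulation, and the explicit variant does the same work without associate and without shadowing. The contributions are whole numbers, so all three results should compare exactly equal regardless of summation order. Whether the associate variant actually misbehaves with ifx is not verified here.

```fortran
module accumulate_demo
   implicit none
   integer, parameter :: wp = kind(1.0d0)
contains

   ! Serial reference: scatter pair contributions into acc.
   subroutine accumulate_serial(n, acc)
      integer, intent(in) :: n
      real(wp), intent(inout) :: acc(:, :)
      integer :: i, j
      do i = 1, n
         do j = 1, n
            if (j >= i) cycle
            acc(:, i) = acc(:, i) + real(j, wp)
            acc(:, j) = acc(:, j) - real(i, wp)
         end do
      end do
   end subroutine accumulate_serial

   ! Pattern from the post: the associate name shadows the shared array.
   subroutine accumulate_associate(n, acc)
      integer, intent(in) :: n
      real(wp), intent(inout) :: acc(:, :)
      real(wp), allocatable :: acc_omp(:, :)
      integer :: i, j
      !$omp parallel default(none) shared(n, acc) private(i, j, acc_omp)
      !$ allocate(acc_omp(size(acc, dim=1), size(acc, dim=2)), source=0.0_wp)
      !$ associate (acc => acc_omp)
      !$omp do collapse(2) schedule(dynamic,32)
      do i = 1, n
         do j = 1, n
            if (j >= i) cycle
            acc(:, i) = acc(:, i) + real(j, wp)
            acc(:, j) = acc(:, j) - real(i, wp)
         end do
      end do
      !$omp end do nowait
      !$ end associate
      !$omp critical (acc_crt)
      !$ acc(:, :) = acc + acc_omp
      !$omp end critical (acc_crt)
      !$omp end parallel
   end subroutine accumulate_associate

   ! Alternative without associate: write to the thread-local array by name.
   subroutine accumulate_explicit(n, acc)
      integer, intent(in) :: n
      real(wp), intent(inout) :: acc(:, :)
      real(wp), allocatable :: acc_loc(:, :)
      integer :: i, j
      !$omp parallel default(none) shared(n, acc) private(i, j, acc_loc)
      allocate(acc_loc(size(acc, dim=1), size(acc, dim=2)), source=0.0_wp)
      !$omp do collapse(2) schedule(dynamic,32)
      do i = 1, n
         do j = 1, n
            if (j >= i) cycle
            acc_loc(:, i) = acc_loc(:, i) + real(j, wp)
            acc_loc(:, j) = acc_loc(:, j) - real(i, wp)
         end do
      end do
      !$omp end do nowait
      !$omp critical (acc_loc_crt)
      acc(:, :) = acc + acc_loc
      !$omp end critical (acc_loc_crt)
      deallocate(acc_loc)
      !$omp end parallel
   end subroutine accumulate_explicit

end module accumulate_demo

program check_variants
   use accumulate_demo
   implicit none
   integer, parameter :: n = 200
   real(wp) :: ref(3, n), a(3, n), b(3, n)
   ref = 0.0_wp
   a = 0.0_wp
   b = 0.0_wp
   call accumulate_serial(n, ref)
   call accumulate_associate(n, a)
   call accumulate_explicit(n, b)
   print '(a, l1)', 'associate variant matches serial: ', all(a == ref)
   print '(a, l1)', 'explicit variant matches serial:  ', all(b == ref)
end program check_variants
```

Building this once with gfortran -fopenmp and once with ifx -qopenmp and comparing the two printed lines should show whether the associate construct alone is enough to trigger the problem. If only the associate variant differs, switching to explicit thread-local names is a possible workaround, at the cost of the code no longer falling back to the plain serial arrays when OpenMP is disabled.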