Software Archive
Read-only legacy content
17061 Discussions

Help with vectorization

conor_p_
Beginner
328 Views

Hello everyone, 

I have a subroutine that definitely has the computational load that could benefit from a MIC. The subroutine is being used in a molecular simulation. The idea behind the subroutine is to determine which particles are within a certain distance, and for each particle create an array that says which particles are within that distance. However because I am only updating the array inside an if statement, the compiler sees a vector dependence and won't vectorize. I am wondering if anyone here sees a way to vectorize this subroutine.

 

program neighbor
  use ifport

  implicit none
  double precision, allocatable :: x(:),y(:),z(:)
  double precision :: dx,dy,dz,dr2
  double precision :: dens,vol,box,rcut2
  double precision :: x1,y1,z1,x2,y2,z2
  integer :: np
  integer :: i,j
  integer,allocatable :: numNeigh(:),vlist(:,:)

  np = 600
  allocate(numNeigh(np))
  allocate(vlist(1000,np))
  allocate(x(np),y(np),z(np))

  dens  = 1.0d0
  vol   = real(np)/dens
  box   = vol**(1.0d0/3.0)
  rcut2 = 2.50d0*2.50d0
  call seed(10)

  numNeigh(:) = 0
  vlist(:,:)  = 0

  !--- create arbitrary initial coordinates. Not important
  do i = 1,np
     x(i) = box*rand(); y(i) = box*rand(); z(i) = box*rand()
  enddo

  do i = 1,np
     x1 = x(i); y1 = y(i); z1 = z(i)
     !---would like to vectorize this inner loop
     do j=1,np

        if(i.eq.j)then
           cycle
        endif
        
        !---calcuate distance between particles i and j
        x2 = x(j); y2 = y(j); z2 = z(j)
        dx = x2-x1; dy = y2-y1; dz = z2-z1
        dr2 = dx**2 + dy**2 + dz**2

        if(dr2.lt.rcut2)then
           !---if distance is less than cutoff distance, update number of neighbors and store particle j in vlist
           numNeigh(i) = numNeigh(i) + 1
           vlist(numNeigh(i),i) = j
        endif

     enddo
  enddo

  stop
end program neighbor

 

0 Kudos
1 Reply
James_C_Intel2
Employee
328 Views

A fairly standard approach to something like this is to treat it in two phases. 

  1. Build a boolean vector showing which elements are near,
  2. Compress that into your neighbour list as a separate operation.

Something like this.

program neighbor
  use ifport

  implicit none
  double precision, allocatable :: x(:),y(:),z(:)
  double precision :: dx,dy,dz,dr2
  double precision :: dens,vol,box,rcut2
  double precision :: x1,y1,z1,x2,y2,z2
  integer :: np
  integer :: i,j
  integer,allocatable :: numNeigh(:),vlist(:,:)
  logical,allocatable :: nearEnough(:)

  np = 600

  allocate(numNeigh(np), nearEnough(np))                        !vlist can't need more than np items for each element.
  allocate(vlist(np,np))
  allocate(x(np),y(np),z(np))

  dens  = 1.0d0
  vol   = real(np)/dens
  box   = vol**(1.0d0/3.0)
  rcut2 = 2.50d0*2.50d0
  call seed(10)

  numNeigh(:) = 0
  vlist(:,:)  = 0

  !--- create arbitrary initial coordinates. Not important
  do i = 1,np
     x(i) = box*rand(); y(i) = box*rand(); z(i) = box*rand()
  enddo

! For each particle
  do i = 1,np
     x1 = x(i); y1 = y(i); z1 = z(i)
     !---would like to vectorize this inner loop
     do j=1,np                                  
        if (i .eq. j) then
           nearEnough(j) = .false.
           cycle
        endif
        !---calculate distance between particles i and j
        x2 = x(j); y2 = y(j); z2 = z(j)
        dx = x2-x1; dy = y2-y1; dz = z2-z1
        dr2 = dx**2 + dy**2 + dz**2

        nearEnough(j) = dr2 .lt. rcut2
     enddo

     ! Now reduce 
     do j=1,np
        if (nearEnough(j)) then
           numNeigh(i) = numNeigh(i) + 1
           vlist(numNeigh(i),i) = j
        endif
     enddo
  enddo

  stop
end program neighbor

The first loop now vectorizes. There are a number of other issues here, though:-

  1. Since distance(a,b) .eq. distance(b,a) you should be able to halve the amount of work (and remove the conditional from the interesting loop) by running it from i+1 to np, though you'll then need some more code to build the full neighbour list.
  2. If you parallelize the outer loop and do the halving trick the work will be imbalanced, so play with either schedule(static,1), or schedule(dynamic)

HTH

0 Kudos
Reply