Interesting Fortran Algorithm

JohnNichols · ‎05-19-2021

I have been looking at some spike detection code, I use some that I developed, but I was wondering if there were better ideas. So I have been trying a few. This code - from yesterday - intrigued me until I looked at it and decided the basic algorithm is not really good,

PeakDetect = 0
    filteredY = 0.0
    filteredY(1:lag+1) = y(1:lag+1)
    avgFilter = 0.0
    avgFilter(lag+1) = mean(y(1:2*lag+1))
    stdFilter = 0.0
    stdFilter(lag+1) = std(y(1:2*lag+1))
    write(*,100)1,lag + 1,2*lag+1, avgFilter(lag+1), stdFilter(lag+1)
100 Format(3(i6,"  "),2(F10.5," "))    
    if (stdFilter(lag+1)/avgFilter(lag+1)>0.1) then ! If the coefficient of variation exceeds 10%, the signal is too uneven at the start, possibly because of a peak.
        write(unit=*,fmt=1001)
1001    format(1X,'Warning: Peak detection might have failed, as there may be a peak at the edge of the frequency range.',/)
    end if

as it assumes that if we determine the average over a length and the standard deviation - then if the coefficient of variation is more than 10% you have a spike at the beginning.

Whereas a set of numbers with mean close to zero will trip this every time.

Interesting problem.

JohnNichols · ‎05-19-2021

The concept for the code is ok, somewhat, it is the implementation that is suspect. The coder does a look ahead and determines an average and stdev about a set of numbers centered on the value of interest, but in the code, the coder sets the forward values to zero.

Interesting.

JohnNichols · ‎05-19-2021

function PeakDetect(y,lag,threshold, influence)
    implicit none
    ! Declaring part
    real, dimension(:), intent(in) :: y
    real, dimension(size(y)) :: filteredY, avgFilter, stdFilter, PD, PeakDetect
    integer :: lag, ii
    real :: threshold, influence
    !real mean
    !real std

    ! Executing part
    PD = 0.0
    filteredY = Y
    filteredY(1:lag+1) = y(1:lag+1)
    avgFilter = 0.0
    avgFilter(lag+1) = mean(y(1:2*lag+1))
    stdFilter = 0.0
    stdFilter(lag+1) = std(y(1:2*lag+1))
    write(*,100)1,y(1)*1000.0,1,2*lag+1, avgFilter(lag+1), stdFilter(lag+1),0, 1
100 Format((i6,"  "),F10.5,"  ", 2(i6,"  "),4(F10.5," "))    
    if (stdFilter(lag+1)/avgFilter(lag+1)>0.1) then ! If the coefficient of variation exceeds 10%, the signal is too uneven at the start, possibly because of a peak.
        write(unit=*,fmt=1001)
1001    format(1X,'Warning: Peak detection might have failed, as there may be a peak at the edge of the frequency range.',/)
    end if
    do ii = lag+2, (size(y)-1)
        if (abs(y(ii) - avgFilter(ii-1)) > threshold * stdFilter(ii-1)) then
            ! Find only the largest outstanding value which is only the one greater than its predecessor and its successor
        write(*,*)ii,(y(ii)-avgFilter(ii-1)),y(ii)-y(ii-1),y(ii)-y(ii+1)
            if (y(ii) > avgFilter(ii-1) .AND. y(ii) > y(ii-1) .AND. y(ii) > y(ii+1)) then
                PD(ii) = 1
                write(*,*)"here"
            end if
            filteredY(ii) = influence * y(ii) + (1 - influence) * filteredY(ii-1)
        else
            filteredY(ii) = y(ii)
        end if
        ! Modified with respect to the original code. Mean and standard deviation are calculted symmetrically around the current point
        avgFilter(ii) = mean(filteredY(ii-lag:ii+lag))
        stdFilter(ii) = std(filteredY(ii-lag:ii+lag))
        write(*,100)ii,y(ii)*1000.0,ii-lag,ii+lag, avgFilter(ii)*1000.0, stdFilter(ii)*1000.0,PD(ii),(y(ii)-avgFilter(ii))/stdFilter(ii)
    end do
    PeakDetect = PD
    end function PeakDetect

So the function is called PeakDetect, in the original code, the coder set values of PeakDetect(i) = 1.0, but when you do a debug, the PeakDetect array does not turn up in the autos, and the values array returned are all zero, but if I create an array PD, and do PeakDetect = PD just before the return it works normally and I can see PD in the autos.

Is this normal behaviour. At least now I am getting peaks, just to many.

JohnNichols · ‎05-19-2021

No Gaussian Data

The interesting challenge with modern bulk data is you start to pick up the things that are non-Gaussian, this picture shows a typical sample. The pattern repeats at 512 -- binary system, and you can see the non- Gaussian at about 100 into the 512 segment. Often the problem with non-Gaussian is the distribution is not symmetric and has a much broader range than the Gaussian stuff. But the Gaussian stuff gives you a nice smooth middle set.

The trick is determining the Gaussian from non-Gaussian. And now fixing the rather common filter above for more difficult data.

The challenge is not having to much of the big data, you can use extreme techniques the problem is people want the benefits of big data without the waiting time. ie how long does it take paint to dry, no do not stick your finger in the wet paint to test it.

Interesting fun.

Arjen_Markus · ‎05-19-2021

This is a topic for which a lot of algorithms have been thought of. I have myself implemented a few, based on the available literature, though not (yet) in Fortran.

JohnNichols · ‎05-20-2021

This is a problem in that each of the 512 points in a segment is drawn from an individual probability distribution that is being driven by a distinct and separate linearly independent equation and so the results are not at all Gaussian and so using means and std is a real challenge, not over the long term, if you have 5 million data sets it is not hard, but people want an answer in 2 to 4 minutes, each 512 is 8 seconds, so we have from 14 to 28 data sets.

The interesting problem is to confirm that the current method is the best.

Fortran is missing many things, unfortunately, but that is its beauty it is stable.

JohnNichols · ‎05-20-2021

This algorithm has some potential, but I am tired of having to plot the results in Excel, so I am fixing a BMP generator program in Fortran to draw simple plots of the results.

JohnNichols · ‎05-20-2021

      subroutine circle(rgb, iA, jA, iB,jB,m1, m2, m3)
      implicit none
      integer iA, iB, m1,m2,m3, jA, jB, i, j, iC, jC,m4,m5,m6,iD,jD
      character*1 rgb(3,ihpixf,jvpixf) !      RGB pixel data array
      real iAA, iBB, jAA, jBB, iCC, jCC,iA1, jA1, delI, delJ, lenG, lenH

      iAA = real(iA)
      iBB = real(iB)
      jAA = real(jA)
      jBB = real(jB)
      lenH = (abs(jBB - jAA))/2.0

      iCC = (iAA+iBB)/2.0
      jCC = (jAA+jBB)/2.0
      iC = int(iCC)
      jC = int(jCC)
      iD = iB - iA
      jD = jB - jA

      if(id .ne. jD) then
          stop "Not a circle"
      endif

      do 110 j = jA, jB
          do 120 i = iA, iB
              if(j .eq. jA .and. i .eq. iC) then
                  write(*,*)i,j
                  rgb(1,i,j) = char(m1)
                  rgb(2,i,j) = char(m2)
                  rgb(3,i,j) = char(m3)


              else if(j .gt. jA .and. j .lt. jC .and. i .eq. iC) then
                  write(*,*)i,j
                  rgb(1,i,j) = char(m1)
                  rgb(2,i,j) = char(m2)
                  rgb(3,i,j) = char(m3)

              elseif(j .eq. jC) then
                  write(*,*)"here I am ", i,j,iC,jC
                  rgb(1,i,j) = char(m1)
                  rgb(2,i,j) = char(m2)
                  rgb(3,i,j) = char(m3)

              else if(j .gt. jC .and. j .lt. jB .and. i .eq. iC) then
                  write(*,*)i,j
                  rgb(1,i,j) = char(m1)
                  rgb(2,i,j) = char(m2)
                  rgb(3,i,j) = char(m3)

              elseif(j .eq. jb.and. i .eq. iC) then
                  write(*,*)i,j
                  rgb(1,i,j) = char(m1)
                  rgb(2,i,j) = char(m2)
                  rgb(3,i,j) = char(m3)
              else
                  write(*,*)i,j
                  iA1 = real(i)
                  jA1 = real(j)
                  delI = abs(iA1 - iC)

                  delJ = abs(jA1 - jC)
                  lenG = sqrt(delI*delI + delJ*delJ)
                  if(lenG .gt. lenH) then
                      
                      if(lenH  .gt. 0.9 * lenG) then
                          rgb(1,i,j) = char(m1)
                          rgb(2,i,j) = char(155)
                          rgb(3,i,j) = char(155)
                          write(*,*)"Drawn A"
                          else
                          write(*,*)"Not drawn A"
                      endif

                  else
                      write(*,*)"Drawn"

                      rgb(1,i,j) = char(m1)
                      rgb(2,i,j) = char(m2)
                      rgb(3,i,j) = char(m3)
                  endif




              endif
120   end do
110   end do


      return
      end subroutine

The interesting part is drawing a pixelated circle, do you think it would be better to create a flag that is set in the if statements and then have a small routine for the rgb setting or how I have got it? Not sure which would be faster.

JohnNichols · ‎05-20-2021

pixel circle --