Andrew,

JohnNichols · ‎03-31-2016

do k = 1 , 3
        do j = 1 , 3
            d(1) = c(1,1)*qm(1,j,k) + c(1,2)*qm(2,j,k) + c(1,3)*qm(3,j,k)
            d(2) = c(2,1)*qm(1,j,k) + c(2,2)*qm(2,j,k) + c(2,3)*qm(3,j,k)
            d(3) = c(3,1)*qm(1,j,k) + c(3,2)*qm(2,j,k) + c(3,3)*qm(3,j,k)
            do i =1,j
                kth(i,j)=kth(i,j)+qm(1,i,k)*d(1)+qm(2,i,k)*d(2)+qm(3,i,k)*d(3)
                kth(j,i)=kth(i,j)
            end do
        end do
    end do

This has got to be a matrix operation - doe it not?

andrew_4619 · ‎04-01-2016

do k = 1 , 3
        do j = 1 , 3
            d(1) = c(1,1)*qm(1,j,k) + c(1,2)*qm(2,j,k) + c(1,3)*qm(3,j,k)
            d(2) = c(2,1)*qm(1,j,k) + c(2,2)*qm(2,j,k) + c(2,3)*qm(3,j,k)
            d(3) = c(3,1)*qm(1,j,k) + c(3,2)*qm(2,j,k) + c(3,3)*qm(3,j,k)
        end do
    end do
   
!The above is below BUT!    
! the "kth" array gets overwriten on each K loop...
! I guess there is some other code missing?    
do k = 1 , 3    
    d3X3 = matmul( C, qm(:,:,k) ) ! d3X3(3,3)
enddo

JohnNichols · ‎04-01-2016

Andrew:

Original code is on my other computer at home I will put it up tonight.

I am just getting to the printout the answers stage on this subroutine so I will try your thoughts. I am slowly tracking through it to understand the math --

Tahnks

John

jimdempseyatthecove · ‎04-01-2016

Andrew,

Your rework is missing the do i= loop.

Given the original code, and assuming this is a hot spot, I would experiment with:

cTranspose = transpose(c)
do k = 1 , 3
  do j = 1 , 3
    d(1) = dot_product(cTranspose(1:3,1),qm(1:3,j,k))
    d(2) = dot_product(cTranspose(1:3,2),qm(1:3,j,k))
    d(3) = dot_product(cTranspose(1:3,3),qm(1:3,j,k))
    select case(j)
    case(1)
      kth(1,j)=kth(1,j)+qm(1,1,k)*d(1)+qm(2,1,k)*d(2)+qm(3,1,k)*d(3)
      kth(j,1)=kth(1,j)
    case(2)
      kth(1:2,j)=kth(1:2,j)+qm(1,1:2,k)*d(1)+qm(2,1:2,k)*d(2)+qm(3,1:2,k)*d(3)
      kth(j,1:2)=kth(1:2,j)
    case(3)
      kth(1:3,j)=kth(1:3,j)+qm(1,1:3,k)*d(1)+qm(2,1:3,k)*d(2)+qm(3,1:3,k)*d(3)
      kth(j,1:3)=kth(1:3,j)
    end select
  end do
end do

Jim Dempsey

John_Campbell · ‎04-01-2016

I would try array syntax or some basic vector routines, such as the following and see if they have any effect. The chances are that ifort would already do as well as has been suggested, but sometimes we still can try. I find daxpy, when vectorised should work well for matrix operations, but not sure of small vector performance. Does AVX 256 work well with a real(8) d(3) ?

! initial code
    do k = 1 , 3
        do j = 1 , 3
            d(1) = c(1,1)*qm(1,j,k) + c(1,2)*qm(2,j,k) + c(1,3)*qm(3,j,k)
            d(2) = c(2,1)*qm(1,j,k) + c(2,2)*qm(2,j,k) + c(2,3)*qm(3,j,k)
            d(3) = c(3,1)*qm(1,j,k) + c(3,2)*qm(2,j,k) + c(3,3)*qm(3,j,k)
            do i =1,j
                kth(i,j)=kth(i,j)+qm(1,i,k)*d(1)+qm(2,i,k)*d(2)+qm(3,i,k)*d(3)
                kth(j,i)=kth(i,j)
            end do
        end do
    end do

! array syntax, assuming d(3) and c(3,3)
    do k = 1 , 3
        do j = 1 , 3
            d(:) = c(:,1)*qm(1,j,k) + c(:,2)*qm(2,j,k) + c(:,3)*qm(3,j,k)
            do i = 1,j
                kth(i,j) = kth(i,j) + dot_product ( qm(:,i,k), d )
                kth(j,i) = kth(i,j)
            end do
        end do
    end do

! or vector routines
    do k = 1 , 3
        do j = 1 , 3
            d = 0
            do i = 1,3
               call daxpy ( d, c(1,i), qm(i,j,k), 3 )
            end do
            do i = 1,j
                kth(i,j) = kth(i,j) + dot_product ( qm(:,i,k), d )
                kth(j,i) = kth(i,j)
            end do
        end do
    end do

! or use matmul
    do k = 1 , 3
        do j = 1 , 3
            d = matmul ( c, qm(:,j,k) )
            do i = 1,j
                kth(i,j) = kth(i,j) + dot_product ( qm(:,i,k), d )
                kth(j,i) = kth(i,j)
            end do
        end do
    end do

JohnNichols · ‎04-04-2016

Thank you for the replies.

The code is from an 18 degree of freedom triangular FEM element developed by Carlos Felippa.

He originally developed the code in Fortran 77, which he ported to Mathematica in the '90s so his students would not have to learn Fortran programming -- give me a break.

His TA/RA in the early 2000's ported it to Mathematica for the same reason - one has access to a PDF file with the Mathematic Code but not the Fortran code.

JohnNichols · ‎04-04-2016

Sorry hit the wrong button.

I enclose a copy of the Golmon version of the program.

I got the model coded over the weekend and started to run some tests on it. I have two interesting problems, the first is to make sure global answers are somewhat correct and the second is to make the code a bit more succinct. I started playing with matmul and dot product, but a lot of this stuff could be made more abstract with some subroutines instead of repeating the same commands many times. I also have to check that the changes do not change the answers. Which is not to hard as you can mirror the results before you make the change.

It is so many years since I sat in a Linear Algebra lecture and now I am back in the middle of it -- in those days in Pure Math we used to ask what was the purpose - now I See

John

jimdempseyatthecove · ‎04-04-2016

John,

Interesting article. I think your best approach is to perform the conversion to Fortran in three steps.

1) Without regard to implications on vectorization or parallelization make the conversion such that the results agree. Then
2) rework the Fortran code to be amenable to vectoization This generally means converting from AOS to SOA, assure results equivalent. Then
3) If necessary, rework the code to be amenable to parallelization. This generally means partitioning the data, handling special cases for interior and boundary conditions, and handling force propagations across boundaries. This is usually not as easy as it looks.

Performing the work in phases will easing debugging should (when) errors occur. Consistency of results after step 1 will prove correctness in conversion. Step 2 may produce equivalent results though not necessarily exactly the same results. It may take some analysis to assert the equivalent results are correct. Step 3 will permit you to model larger numbers of elements. Though this too may result in equivalent but inexact results.

Jim Dempsey

JohnNichols · ‎04-04-2016

Jim:

I agree - step 1 is complete, and Professor Felippa - code author just sent me the original Fortran - so I can move to Step 2.

No-one every believes FEM code.

John

JohnNichols · ‎04-04-2016

INTEGER, PARAMETER :: dp = selected_real_kind(15, 307)
    INTEGER, PARAMETER :: mn_3 = 3,mn_9 = 9, mn_2 = 2, mn_10 = 10
    Real (kind = dp) em,nu,h, c, db(mn_3,mn_3)

    c = em*h**3/(12.d0*(1.-nu**2.0))

Slowly working my way through my code and Felippa's original not many differences:

Equation for c I got wrong - was out by 144 - so I had the 12 in the wrong spot.

Question : Is dp the same as calling double precision?

He has a neat way of comparing two arrays - he adds them up and compares the difference - not exact but a good start.

JohnNichols · ‎04-04-2016

db(1,1) 3.00000015646220 REAL(8)

Next question - on comparing the values for Db(1,1) I get this value from Felippa's code and exactly the same value from my code. Why are the errors at the end of 15646220 not random? I would expect some form of round off error by now - given the calculation range, but the numbers are exact to 13 places --

John

JohnNichols · ‎04-04-2016

      call   SM3SHMEMBB (xlp, ylp, dm, 1.5d0,1.0d0, le, sm, nd, status)
      write(*,*)sm
      call   SM3SHMEMBH (xlp, ylp, dm, betab, le, sm, nd, status)
      call   SM3SHBENDB (xlp, ylp, db,1.0d0,clr,cqr, lb, sm, nd, status)
      call   SM3SHBENDH (xlp, ylp, db,1.0d+0, lb, sm, nd, status)
C

Thinking ahead - there is a lot of time spent in these routines - each one generates an independent section of the the stiffness matrix sm. Is this is the sort of thing we send parallel?

John

jimdempseyatthecove · ‎04-05-2016

John,

The first level of optimization you should consider is vectorization. This may require a complete rework of your array layouts. And as a consequence may significantly change how you parallelize the code.

Jim Dempsey

jimdempseyatthecove · ‎04-05-2016

Note, the article you posted concerned itself with one node of your surface. The real art in optimizing this code vectorization as well as parallelization, is in performing the (correct) calculations of the entire surface. The vectorization should be incorporated when you consider the entire surface.

Jim Dempsey

JohnNichols · ‎04-05-2016

Jim:

I have a lot to learn. Wonder if there is enough time left.

I am loath to put up Professor Felippa's original code without his permission, but once I have mine correct I will put up a correct version, which is derived from a published version.

I have finished two of the 4 routines - the first one was exact to 15 significant figures - but the second started to disagree at the 8 and 9th place.

Looking at the code, Felippa determined the length of the sides using sqrt whereas I left them as length squared and used the L squared in the next calculation and he resquared the result -- do you think this will make a difference at the 8th figure?

John

jimdempseyatthecove · ‎04-05-2016

Try adding:

/Qprec-sqrt /Qprec-div

or try

/fp:precise

Unfortunately (my opinion), the default is set to sacrifice precision over speed. My position is for the default to provide the precision, then have the programmer (incrementally) request the additional optimizations.

Jim Dempsey

JohnNichols · ‎04-05-2016

Jim:

It took about 6 hours to debug my code. Not many mistakes, just annoyances, interestingly - 1 was mine I left out a block of code and 2 equations incorrect in the Mathematica Copy of the original Fortran X23 instead of X32 things - so the number was negative instead of positive, One was hard to track down as it only affected one number of the 324 and then only in a small way

EXAM DIFF is useful for looking at two printouts of matrices and telling the differences.

Weirdly at the finish the two files were identical to 15 places on all numbers - the Fortran God is strange or maybe Steve's weekly sacrifice worked this week - that is a joke.

Fun

Prof. Felippa included an Eigen solver - with another element program - this comes from a 1975 book and appears to have been written in Algol - I think from a note in the files. I used it to look at the eigenvalues for this element.

Now the good part making it work with multiple elements, Pardiso and FEAST.

The version I created from the published Mathematica Code is attached -- Prof. Felippa actually made less use of variables and more use of arrays in his code - the Mathematica was altered code in interesting ways.

Thought you might like a look - before I start butchering it further.

Quite nifty actually the code and the idea -- humans are surprisingly inventive.

John

JohnNichols · ‎04-05-2016

No - the Mathematica code was correct - it was my copying that was wrong.

It took a while to decipher the Transpose(I)AI multiplication on the 3rd page, but now I have it - nifty way to create a thicker element with different properties. Quite smart.

John

jimdempseyatthecove · ‎04-05-2016

What are the inputs to the program?

John_Campbell · ‎04-05-2016

John,

"He originally developed the code in Fortran 77, which he ported to Mathematica in the '90s "

He originally developed the element in 1960's so it predates F77. There are certainly earlier versions available than a Mathematica version.

I am fairly sure that the QTSHEL thin shell routine in SAP4 from UCB is based on Felippa's shell element and so is an early publication and use of his shell element. It is a quadrilateral element using 4 triangles with a central node that is reduced out during the formulation. It basically combines a 3-node membrane triangle element and a 3-node bending triangle element to produce a 5 dof per node performance. Mid side rotation freedoms could also be included, but were not very practical.

I have used this element for many years. While it does not have the quadratic displacement field of the 8-node shells, this element is a very robust element and manages many of the areas where higher order elements can fail. It provides a predictable accuracy for reasonable mesh sizes.

The code is interesting in the use of many local variables, which avoids the matrix notation solution and minimises operations count. It is quite complex and a code I have never attempted to clean-up. However for most finite element calculations, the generation of element stiffness, stress and load matrices is a small proportion of (< 5%) of the total run time so a more concise matrix notation is preferred and much easier to audit.

Try and find the error in this almost original routine, obtained in 1978. ( I think I used a screen editor to improve the layout, which was more difficult with a card deck!)

C
C     CALLED BY:  QTSHEL
C
C     THIS SUBROUTINE FORMS THE PLATE BENDING STIFFNESS AND/OR THE
C     CONSISTENT LOAD VECTOR OF A LINEAR CURVATURE COMPATIBLE TRIANGLE
C     (LCCT) WITH 6, 5, 4 OR 3 NODAL POINTS
C
C
C * * * * * * * * * * * * * * * INPUTS  * * * * * * * * * * * * * * * *
C
C      M       NUMBER OF MIDPOINT DEGREES OF FREEDOM (M =3,2,1,0).
C              NOTE.. MIDPOINTS 4-5-6 (IF INCLUDED) ARE LOCATED ON
C              SIDES 2-3, 3-1 AND 1-2, RESPECTIVELY.
C
C     KKK      OPERATION FLAG
C                KKK LE 0 = FORM STIFFNESS MATRIX AND LOAD VECTOR.
C                KKK GT 0 = FORM LOAD VECTOR ONLY.
C
C  A(I),B(I)   I=1...3   PROJECTIONS OF SIDES 2-3, 3-1 AND 1-2 ONTO
C              X AND -Y, RESPECTIVELY.
C
C   C(I,J)     I=1...3, J=1...3   PLANE STRESS MATERIAL MATRIX.
C
C    H(I)      I=1...3   CORNER THICKNESSES (LINEAR VARIATION ASSUMED).
C
C    PT(I)     I=1...3   CORNER VALUES OF LATERAL DISTRIBUTED LOAD
C              (LINEAR VARIATION ASSUMED).
C
C   BMT(I,J)   I=1...3, J=1...3   INITIAL BENDING MOMENT COMPONENTS
C              MOM-XX (J=1), MOM-YY (J=2) AND MOM-XY (J=3) AT THE
C              CORNERS I=1...3  (LINEAR VARIATION ASSUMED).
C
C
C * * * * * * * * * * * * * * * OUTPUTS * * * * * * * * * * * * * * * *
C
C   ST(I,J)    I=1..NDF, J=1..NDF  WITH NDF (NUMBER OF DOF) = 9+M, IS
C              THE ELEMENT STIFFNESS MATRIX ASSOCIATED WITH THE NODAL
C              DISPLACEMENT ORDERING
C                W(1),RX(1),RY(1),W(2), .... RY(3),RM(1), ... RM(M)
C              WHERE RM(1), ... RM(M), IF M GT 0, ARE MIDPOINT
C              DEVIATIONS FROM NORMAL SLOPE LINEARITY
C
C    FT(I)     I=1..NDF   CONSISTENT NODAL FORCE VECTOR ASSOCIATED
C              WITH THE NODAL DISPLACEMENT ORDERING DESCRIBED ABOVE.
C
C
      SUBROUTINE SLCCT (M,KKK)
      IMPLICIT REAL*8(A-H,O-Z)
      COMMON /TRIARG/ A(3),B(3), HMT(3), H(3),  C(3,3), SMT(3,3),
     1 BMT(3,3), FT(12),    PX(3),PY(3),PT(3),RM(3), ST(12,12)
      DIMENSION  P(21,12), G(21), Q(3,6), QB(3,6), T(3), U(3), HT(3),
     1 TX(3), TY(3), IPERM(3), XM(3,3), XM0(3)
      EQUIVALENCE (CM11,C(1)),(CM12,C(2)),(CM13,C(3)),(CM22,C(5)),
     1 (CM23,C(6)),(CM33,C(9))
      LOGICAL NOS, FLAT
      DATA IPERM/2,3,1/
      H0 = (H(1)+H(2)+H(3))/3.
      IF (H0.LE.0.)  GO TO 1000
      NDF = 9 + M
      NOS = KKK.GT.0
      FLAT = (H(1).EQ.H(2)).AND.(H(2).EQ.H(3))
      AREA = A(3)*B(2)-A(2)*B(3)
      FAC = H0**3*AREA/864.
      PTF = AREA/6480.
      T(3) = 1.
      DO 150  I = 1,3
      J = IPERM(I)
      K = IPERM(J)
      X = A(I)**2+B(I)**2
      U(I) = -(A(I)*A(J)+B(I)*B(J))/X
      X =DSQRT(X)
      Y = 2.*AREA/X
      HT(I) =  2.*Y
      TX(I) =  Y*A(I)/X
      TY(I) = -Y*B(I)/X
      A1 = A(I)/AREA
      A2 = A(J)/AREA
      B1 = B(I)/AREA
      B2 = B(J)/AREA
      Q(1,I)   = B1*B1
      Q(2,I)   = A1*A1
      Q(3,I)   = 2.*A1*B1
      Q(1,I+3) = 2.*B1*B2
      Q(2,I+3) = 2.*A1*A2
      Q(3,I+3) = 2.*(A1*B2+A2*B1)
      DO 120  N = 1,3
  120 XM(N,I) = BMT(N,I)*AREA/72.
      IF (FLAT)  GO TO 150
      DO 140  N = 1,3
      L = IPERM(N)
      T(1) = H(N)/H0
      T(2) = H(L)/H0
      IF (T(1).GT.0.)  XM(N,I) = XM(N,I)/T(1)**3
      C1 = T(I)
      C2 = T(J)
      C3 = T(K)
      C4 = C2+C3
      C11 = C1*C1
      C23 = C2*C3
      C5 = C4*(3.*C1+C4) + 6.*C11 - 2.*C23
      C6 = C5 + 3.*C4*C4 - 4.*(C11+C23)
      QB(N,I  ) = (C1*(10.*C11-3.*C23)+C4*C5)/17.5 - 2.0
  140 QB(N,I+3) = (C1*(C11-2.*C23)+C4*C6)/35.0 - 1.0
  150 CONTINUE
      DO 200  I = 1,3
      J = IPERM(I)
      K = IPERM(J)
      II = 3*I
      JJ = 3*J
      KK = 3*K
      A1 = A(I)
      A2 = A(J)
      A3 = A(K)
      B1 = B(I)
      B2 = B(J)
      B3 = B(K)
      U1 = U(I)
      U2 = U(J)
      U3 = U(K)
      W1 = 1.-U1
      W2 = 1.-U2
      W3 = 1.-U3
      B1D = B1 + B1
      B2D = B2 + B2
      B3D = B3 + B3
      A1D = A1 + A1
      A2D = A2 + A2
      A3D = A3 + A3
      C21 = B1-B3*U3         + TX(K)
      C22 = -B1D+B2*W2+B3*U3 + TX(J)-TX(K)
      C31 = A1-A3*U3         + TY(K)
      C22 = -B1D+B2*W2+B3*U3 + TX(J)-TX(K)
      C31 = A1-A3*U3         + TY(K)
      C32 = -A1D+A2*W2+A3*U3 + TY(J)-TY(K)
      C51 = B3*W3-B2         + TX(K)
      C52 = B2D-B3*W3-B1*U1  + TX(I)-TX(K)
      C61 = A3*W3-A2         + TY(K)
      C62 = A2D-A3*W3-A1*U1  + TY(I)-TY(K)
      C81 = B3-B2D-B2*U2     + TX(J)
      C82 = B1D-B3+B1*W1     + TX(I)
      C91 = A3-A2D-A2*U2     + TY(J)
      C92 = A1D-A3+A1*W1     + TY(I)
      P1 = PT(I)*PTF
      P2 = PT(J)*PTF
      P3 = PT(K)*PTF
      U37 = 7.*U3
      W27 = 7.*W2
      W24 = 4.*W2
      U34 = 4.*U3
      C1 = 54.+W27
      C2 = 54.+U37
      C3 = 15.+W24
      C4 = 39.+U37
      C5 = 39.+W27
      C6 = 15.+U34
      TXS = TX(J)+TX(K)
      TYS = TY(J)+TY(K)
      FT(II-2) = 6.*((90.+U37+W27)*P1+(36.+U37+W24)*P2+(36.+U34+W27)*P3)
      FT(II-1) = (C1*B2-C2*B3+7.*TXS)*P1 + (C3*B2-C4*B3+4.*TXS+
     1 3.*TX(K))*P2 + (C5*B2-C6*B3+4.*TXS+3.*TX(J))*P3
      FT(II)   = (C1*A2-C2*A3+7.*TYS)*P1 + (C3*A2-C4*A3+4.*TYS+
     1 3.*TY(K))*P2 + (C5*A2-C6*A3+4.*TYS+3.*TY(J))*P3
      FT(K+9)  = (7.*(P1+P2)+4.*P3)*HT(K)
      XM0(I) = (XM(1,I)+XM(2,I)+XM(3,I))/3.
      DO 200  N = 1,3
      L = 6*(I-1) + N
      Q11 = Q(N,I)
      Q22 = Q(N,J)
      Q33 = Q(N,K)
      Q12 = Q(N,I+3)
      Q23 = Q(N,J+3)
      Q31 = Q(N,K+3)
      Q2333 = Q23-Q33
      Q3133 = Q31-Q33
      P(L   ,II-2) = 6.*(-Q11+W2*Q33+U3*Q2333)
      P(L   ,II-1) = C21*Q23+C22*Q33-B3D*Q12+B2D*Q31
      P(L   ,II  ) = C31*Q23+C32*Q33-A3D*Q12+A2D*Q31
      P(L   ,JJ-2) = 6.*(Q22+W3*Q2333)
      P(L   ,JJ-1) = C51*Q2333+B3D*Q22
      P(L   ,JJ  ) = C61*Q2333+A3D*Q22
      P(L   ,KK-2) = 6.*(1.+U2)*Q33
      P(L   ,KK-1) = C81*Q33
      P(L   ,KK  ) = C91*Q33
      P(L   ,I+9 ) = 0.
      P(L   ,J+9 ) = HT(J)*Q33
      P(L   ,K+9 ) = HT(K)*Q2333
      P(L+3 ,II-2) = 6.*(Q11+U3*Q3133)
      P(L+3 ,II-1) = C21*Q3133-B3D*Q11
      P(L+3 ,II  ) = C31*Q3133-A3D*Q11
      P(L+3 ,JJ-2) = 6.*(-Q22+U1*Q33+W3*Q3133)
      P(L+3 ,JJ-1) = C51*Q31+C52*Q33+B3D*Q12-B1D*Q23
      P(L+3 ,JJ  ) = C61*Q31+C62*Q33+A3D*Q12-A1D*Q23
      P(L+3 ,KK-2) = 6.*(1.+W1)*Q33
      P(L+3 ,KK-1) = C82*Q33
      P(L+3 ,KK  ) = C92*Q33
      P(L+3 ,I+9 ) = HT(I)*Q33
      P(L+3 ,J+9 ) = 0.
      P(L+3 ,K+9 ) = HT(K)*Q3133
      P(N+18,II-2) = 2.*(Q11+U3*Q12+W2*Q31)
      P(N+18,KK-1) = ((B1D-B2D)*Q33+C82*Q23+C81*Q31)/3.
      P(N+18,KK  ) = ((A1D-A2D)*Q33+C92*Q23+C91*Q31)/3.
  200 P(N+18,K+9 ) = HT(K)*Q12/3.
  300 DO 400  J = 1,NDF
      DO 340  L = 1,3
      II = L
      KK = L + 18
      P3 = P(KK,J)
      G(KK) = 0.
      DO 340  N = 1,3
      I = IPERM(N)
      JJ = II + 3
      P1 = P(II,J)
      P2 = P(JJ,J)
      SUM = P1 + P2 + P3
      G1 = SUM + P1
      G2 = SUM + P2
      G3 = SUM + P3
      IF (FLAT)  GO TO 320
      G1 = G1 + QB(N,1)*P1 + QB(N,6)*P2 + QB(N,5)*P3
      G2 = G2 + QB(N,6)*P1 + QB(N,2)*P2 + QB(N,4)*P3
      G3 = G3 + QB(N,5)*P1 + QB(N,4)*P2 + QB(N,3)*P3
  320 G(II) = G1
      G(JJ) = G2
      G(KK) = G3 + G(KK)
      II = II + 6
  340 FT(J) = FT(J) - XM(N,L)*G1 - XM(I,L)*G2 - XM0(L)*G3
      IF (NOS)  GO TO 400
      DO 360 N = 1,19,3
      G1 = G(N)
      G2 = G(N+1)
      G3 = G(N+2)
      G(N)   = CM11*G1 + CM12*G2 + CM13*G3
      G(N+1) = CM12*G1 + CM22*G2 + CM23*G3
  360 G(N+2) = CM13*G1 + CM23*G2 + CM33*G3
      DO 390  I = 1,J
      X = 0.
      DO 380  N = 1,21
  380 X = X + G(N)*P(N,I)
      X = X*FAC
      ST(I,J) = X
  390 ST(J,I) = X
  400 CONTINUE
 1000 RETURN
      END

Very little code developed at UCB in 70's adopted Fortran 77 syntax. The following is the title block for the version of SAP4 I have. Even today, Feap shows this history and still uses the pre F90 memory management approaches. I remain very impressed by what these codes have achieved and still use an adaptation of the skyline solver from these codes and Wilson's shifted subspace eigen-solver that produces reliable results for the larger problems attempted today.

C  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  *
C
C                                 SAP4
C                     A STRUCTURAL ANALYSIS PROGRAM
C           FOR STATIC AND DYNAMIC RESPONSE OF LINEAR SYSTEMS
C
C                K.J. BATHE , E.L. WILSON , F.E. PETERSON
C                  UNIVERSITY OF CALIFORNIA , BERKELEY
C
C          IBM CONVERSION BY UNIVERSITY OF SOUTHERN CALIFORNIA
C                            AUGUST, 1973
C                         REVISED JULY, 1974
C
C                     PRIME CONVERSION FROM IBM
C                                 BY
C                LARRY J. FEESER AND WILLIAM E. BOND
C                  DEPARTMENT OF CIVIL ENGINEERING
C                  RENSSELAER POLYTECHNIC INSTITUTE
C                       TROY, NEW YORK  12181
C                              JULY, 1978
C
C  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  *

JohnNichols · ‎04-06-2016

jimdempseyatthecove wrote:

What are the inputs to the program?

I have started to put in an input file, but it has nothing in it at the moment, you could comment it out, or just make sure the file exists.

Because I am comparing files between different programs I use the default open file name of FORT.?? without a name - so I know what I am comparing quickly.

In terms of the variables in the program,

typea is for another element that I first got running from Felippa.

x(3) is the x locations of the corners, Y and Z similar

1,2,3 node order is counter clockwise from 1.

Nu is Poisson's ratio - Felippa set it to 0.3333 for this sample -- a bit high but ok

H(1:3) is the corner heights - in this case I am working in metric

em is Young's Modulus -- 4000 assumes a concrete and units in MPa is my guess unless it is a weird PSI unit

T is the transform matrix to allow different plates in a layer member. Not used at the moment.

lb and le are order indices for the final array esm assembly.

betab - means you have a pure material with a nu close to 0.5 - possible but less than ideal for us

f and a few others I have just adopted Felippa's until I understand the program better, but it appears to pull in different stiffness elements.

John

Matrix Operation