Community
cancel
Showing results for
Did you mean:
Highlighted
New Contributor II
73 Views

## Matrix Operation

```do k = 1 , 3
do j = 1 , 3
d(1) = c(1,1)*qm(1,j,k) + c(1,2)*qm(2,j,k) + c(1,3)*qm(3,j,k)
d(2) = c(2,1)*qm(1,j,k) + c(2,2)*qm(2,j,k) + c(2,3)*qm(3,j,k)
d(3) = c(3,1)*qm(1,j,k) + c(3,2)*qm(2,j,k) + c(3,3)*qm(3,j,k)
do i =1,j
kth(i,j)=kth(i,j)+qm(1,i,k)*d(1)+qm(2,i,k)*d(2)+qm(3,i,k)*d(3)
kth(j,i)=kth(i,j)
end do
end do
end do```

This has got to be a matrix operation - doe it not?

Tags (1)
52 Replies
Highlighted
Valued Contributor III
57 Views
```do k = 1 , 3
do j = 1 , 3
d(1) = c(1,1)*qm(1,j,k) + c(1,2)*qm(2,j,k) + c(1,3)*qm(3,j,k)
d(2) = c(2,1)*qm(1,j,k) + c(2,2)*qm(2,j,k) + c(2,3)*qm(3,j,k)
d(3) = c(3,1)*qm(1,j,k) + c(3,2)*qm(2,j,k) + c(3,3)*qm(3,j,k)
end do
end do

!The above is below BUT!
! the "kth" array gets overwriten on each K loop...
! I guess there is some other code missing?
do k = 1 , 3
d3X3 = matmul( C, qm(:,:,k) ) ! d3X3(3,3)
enddo ```

Highlighted
New Contributor II
57 Views

Andrew:

Original code is on my other computer at home I will put it up tonight.

I am just getting to the printout the answers stage on this subroutine so I will try your thoughts.  I am slowly tracking through it to understand the math --

Tahnks

John

Highlighted
Black Belt
57 Views

Andrew,

Your rework is missing the do i= loop.

Given the original code, and assuming this is a hot spot, I would experiment with:

```cTranspose = transpose(c)
do k = 1 , 3
do j = 1 , 3
d(1) = dot_product(cTranspose(1:3,1),qm(1:3,j,k))
d(2) = dot_product(cTranspose(1:3,2),qm(1:3,j,k))
d(3) = dot_product(cTranspose(1:3,3),qm(1:3,j,k))
select case(j)
case(1)
kth(1,j)=kth(1,j)+qm(1,1,k)*d(1)+qm(2,1,k)*d(2)+qm(3,1,k)*d(3)
kth(j,1)=kth(1,j)
case(2)
kth(1:2,j)=kth(1:2,j)+qm(1,1:2,k)*d(1)+qm(2,1:2,k)*d(2)+qm(3,1:2,k)*d(3)
kth(j,1:2)=kth(1:2,j)
case(3)
kth(1:3,j)=kth(1:3,j)+qm(1,1:3,k)*d(1)+qm(2,1:3,k)*d(2)+qm(3,1:3,k)*d(3)
kth(j,1:3)=kth(1:3,j)
end select
end do
end do
```

Jim Dempsey

Highlighted
New Contributor II
57 Views

I would try array syntax or some basic vector routines, such as the following and see if they have any effect. The chances are that ifort would already do as well as has been suggested, but sometimes we still can try. I find daxpy, when vectorised should work well for matrix operations, but not sure of small vector performance. Does AVX 256 work well with a real(8) d(3) ?

```! initial code
do k = 1 , 3
do j = 1 , 3
d(1) = c(1,1)*qm(1,j,k) + c(1,2)*qm(2,j,k) + c(1,3)*qm(3,j,k)
d(2) = c(2,1)*qm(1,j,k) + c(2,2)*qm(2,j,k) + c(2,3)*qm(3,j,k)
d(3) = c(3,1)*qm(1,j,k) + c(3,2)*qm(2,j,k) + c(3,3)*qm(3,j,k)
do i =1,j
kth(i,j)=kth(i,j)+qm(1,i,k)*d(1)+qm(2,i,k)*d(2)+qm(3,i,k)*d(3)
kth(j,i)=kth(i,j)
end do
end do
end do

! array syntax, assuming d(3) and c(3,3)
do k = 1 , 3
do j = 1 , 3
d(:) = c(:,1)*qm(1,j,k) + c(:,2)*qm(2,j,k) + c(:,3)*qm(3,j,k)
do i = 1,j
kth(i,j) = kth(i,j) + dot_product ( qm(:,i,k), d )
kth(j,i) = kth(i,j)
end do
end do
end do

! or vector routines
do k = 1 , 3
do j = 1 , 3
d = 0
do i = 1,3
call daxpy ( d, c(1,i), qm(i,j,k), 3 )
end do
do i = 1,j
kth(i,j) = kth(i,j) + dot_product ( qm(:,i,k), d )
kth(j,i) = kth(i,j)
end do
end do
end do

! or use matmul
do k = 1 , 3
do j = 1 , 3
d = matmul ( c, qm(:,j,k) )
do i = 1,j
kth(i,j) = kth(i,j) + dot_product ( qm(:,i,k), d )
kth(j,i) = kth(i,j)
end do
end do
end do

```

Highlighted
New Contributor II
57 Views

Thank you for the replies.

The code is from an 18 degree of freedom triangular FEM element developed by Carlos Felippa.

He originally developed the code in Fortran 77, which he ported to Mathematica in the '90s so his students would not have to learn Fortran programming -- give me a break.

His TA/RA in the early 2000's ported it to Mathematica for the same reason - one has access to a PDF file with the Mathematic Code but not the Fortran code.

Highlighted
New Contributor II
57 Views

Sorry hit the wrong button.

I enclose a copy of the Golmon version of the program.

I got the model coded over the weekend and started to run some tests on it.  I have two interesting problems, the first is to make sure global answers are somewhat correct and the second is to make the code a bit more succinct. I started playing with matmul and dot product, but a lot of this stuff could be made more abstract with some subroutines instead of repeating the same commands many times.  I also have to check that the changes do not change the answers.  Which is not to hard as you can mirror the results before you make the change.

It is so many years since I sat in a Linear Algebra lecture and now I am back in the middle of it -- in those days in Pure Math we used to ask what was the purpose - now I See

John

Highlighted
Black Belt
57 Views

John,

Interesting article. I think your best approach is to perform the conversion to Fortran in three steps.

1) Without regard to implications on vectorization or parallelization make the conversion such that the results agree. Then
2) rework the Fortran code to be amenable to vectoization  This generally means converting from AOS to SOA, assure results equivalent. Then
3) If necessary, rework the code to be amenable to parallelization. This generally means partitioning the data, handling special cases for interior and boundary conditions, and handling force propagations across boundaries. This is usually not as easy as it looks.

Performing the work in phases will easing debugging should (when) errors occur. Consistency of results after step 1 will prove correctness in conversion. Step 2 may produce equivalent results though not necessarily exactly the same results. It may take some analysis to assert the equivalent results are correct. Step 3 will permit you to model larger numbers of elements. Though this too may result in equivalent but inexact results.

Jim Dempsey

Highlighted
New Contributor II
57 Views

Jim:

I agree - step 1 is complete, and Professor Felippa - code author just sent me the original Fortran - so I can move to Step 2.

No-one every believes FEM code.

John

Highlighted
New Contributor II
57 Views
```INTEGER, PARAMETER :: dp = selected_real_kind(15, 307)
INTEGER, PARAMETER :: mn_3 = 3,mn_9 = 9, mn_2 = 2, mn_10 = 10
Real (kind = dp) em,nu,h, c, db(mn_3,mn_3)

c = em*h**3/(12.d0*(1.-nu**2.0))```

Slowly working my way through my code and Felippa's original not many differences:

Equation for c I got wrong - was out by 144 - so I had the 12 in the wrong spot.

Question : Is dp the same as calling double precision?

He has a neat way of comparing two arrays - he adds them up and compares the difference - not exact but a good start.

Highlighted
New Contributor II
57 Views

db(1,1) 3.00000015646220  REAL(8)

Next question - on comparing the values for Db(1,1) I get this value from Felippa's code and exactly the same value from my code.  Why are the errors at the end of 15646220 not random?  I would expect some form of round off error by now - given the calculation range, but the numbers are exact to 13 places --

John

Highlighted
New Contributor II
57 Views
```      call   SM3SHMEMBB (xlp, ylp, dm, 1.5d0,1.0d0, le, sm, nd, status)
write(*,*)sm
call   SM3SHMEMBH (xlp, ylp, dm, betab, le, sm, nd, status)
call   SM3SHBENDB (xlp, ylp, db,1.0d0,clr,cqr, lb, sm, nd, status)
call   SM3SHBENDH (xlp, ylp, db,1.0d+0, lb, sm, nd, status)
C```

Thinking ahead - there is a lot of time spent in these routines - each one generates an independent section of the the stiffness matrix sm. Is this is the sort of thing we send parallel?

John

Highlighted
Black Belt
57 Views

John,

The first level of optimization you should consider is vectorization. This may require a complete rework of your array layouts. And as a consequence may significantly change how you parallelize the code.

Jim Dempsey

Highlighted
Black Belt
57 Views

Note, the article you posted concerned itself with one node of your surface. The real art in optimizing this code vectorization as well as parallelization, is in performing the (correct) calculations of the entire surface. The vectorization should be incorporated when you consider the entire surface.

Jim Dempsey

Highlighted
New Contributor II
57 Views

Jim:

I have a lot to learn.  Wonder if there is enough time left.

I am loath to put up Professor Felippa's original code without his permission, but once I have mine correct I will put up a correct version, which is derived from a published version.

I have finished two of the 4 routines - the first one was exact to 15 significant figures - but the second started to disagree at the 8 and 9th place.

Looking at the code, Felippa determined the length of the sides using sqrt whereas I left them as length squared and used the L squared in the next calculation and he resquared the result -- do you think this will make a difference at the 8th figure?

John

Highlighted
Black Belt
57 Views

/Qprec-sqrt /Qprec-div

or try

/fp:precise

Unfortunately (my opinion), the default is set to sacrifice precision over speed. My position is for the default to provide the precision, then have the programmer (incrementally) request the additional optimizations.

Jim Dempsey

Highlighted
New Contributor II
57 Views

Jim:

It took about 6 hours to debug my code.  Not many mistakes, just annoyances, interestingly - 1 was mine I left out a block of code and 2 equations incorrect in the Mathematica Copy of the original Fortran X23 instead of X32 things - so the number was negative instead of positive,  One was hard to track down as it only affected one number of the 324 and then only in a small way

EXAM DIFF is useful for looking at two printouts of matrices and telling the differences.

Weirdly at the finish the two files were identical to 15 places on all numbers - the Fortran God is strange or maybe Steve's weekly sacrifice worked this week - that is a joke.

Fun

Prof. Felippa included an Eigen solver - with another element program - this comes from a 1975 book and appears to have been written in Algol - I think from a note in the files.  I used it to look at the eigenvalues for this element.

Now the good part making it work with multiple elements, Pardiso and FEAST.

The version I created from the published Mathematica Code is attached -- Prof. Felippa actually made less use of variables and more use of arrays in his code - the Mathematica was altered code in interesting ways.

Thought you might like a look - before I start butchering it further.

Quite nifty actually the code and the idea -- humans are surprisingly inventive.

John

Highlighted
New Contributor II
57 Views

No - the Mathematica code was correct - it was my copying that was wrong.

It took a while to decipher the Transpose(I)AI multiplication on the 3rd page, but now I have it - nifty way to create a thicker element with different properties.  Quite smart.

John

Highlighted
Black Belt
57 Views

What are the inputs to the program?

Highlighted
New Contributor II
57 Views

John,

"He originally developed the code in Fortran 77, which he ported to Mathematica in the '90s "

He originally developed the element in 1960's so it predates F77. There are certainly earlier versions available than a Mathematica version.

I am fairly sure that the QTSHEL thin shell routine in SAP4 from UCB is based on Felippa's shell element and so is an early publication and use of his shell element. It is a quadrilateral element using 4 triangles with a central node that is reduced out during the formulation. It basically combines a 3-node membrane triangle element and a 3-node bending triangle element to produce a 5 dof per node performance. Mid side rotation freedoms could also be included, but were not very practical.

I have used this element for many years. While it does not have the quadratic displacement field of the 8-node shells, this element is a very robust element and manages many of the areas where higher order elements can fail. It provides a predictable accuracy for reasonable mesh sizes.

The code is interesting in the use of many local variables, which avoids the matrix notation solution and minimises operations count. It is quite complex and a code I have never attempted to clean-up. However for most finite element calculations, the generation of element stiffness, stress and load matrices is a small proportion of  (< 5%) of the total run time so a more concise matrix notation is preferred and much easier to audit.

Try and find the error in this almost original routine, obtained in 1978. ( I think I used a screen editor to improve the layout, which was more difficult with a card deck!)

```C
C     CALLED BY:  QTSHEL
C
C     THIS SUBROUTINE FORMS THE PLATE BENDING STIFFNESS AND/OR THE
C     CONSISTENT LOAD VECTOR OF A LINEAR CURVATURE COMPATIBLE TRIANGLE
C     (LCCT) WITH 6, 5, 4 OR 3 NODAL POINTS
C
C
C * * * * * * * * * * * * * * * INPUTS  * * * * * * * * * * * * * * * *
C
C      M       NUMBER OF MIDPOINT DEGREES OF FREEDOM (M =3,2,1,0).
C              NOTE.. MIDPOINTS 4-5-6 (IF INCLUDED) ARE LOCATED ON
C              SIDES 2-3, 3-1 AND 1-2, RESPECTIVELY.
C
C     KKK      OPERATION FLAG
C                KKK LE 0 = FORM STIFFNESS MATRIX AND LOAD VECTOR.
C                KKK GT 0 = FORM LOAD VECTOR ONLY.
C
C  A(I),B(I)   I=1...3   PROJECTIONS OF SIDES 2-3, 3-1 AND 1-2 ONTO
C              X AND -Y, RESPECTIVELY.
C
C   C(I,J)     I=1...3, J=1...3   PLANE STRESS MATERIAL MATRIX.
C
C    H(I)      I=1...3   CORNER THICKNESSES (LINEAR VARIATION ASSUMED).
C
C    PT(I)     I=1...3   CORNER VALUES OF LATERAL DISTRIBUTED LOAD
C              (LINEAR VARIATION ASSUMED).
C
C   BMT(I,J)   I=1...3, J=1...3   INITIAL BENDING MOMENT COMPONENTS
C              MOM-XX (J=1), MOM-YY (J=2) AND MOM-XY (J=3) AT THE
C              CORNERS I=1...3  (LINEAR VARIATION ASSUMED).
C
C
C * * * * * * * * * * * * * * * OUTPUTS * * * * * * * * * * * * * * * *
C
C   ST(I,J)    I=1..NDF, J=1..NDF  WITH NDF (NUMBER OF DOF) = 9+M, IS
C              THE ELEMENT STIFFNESS MATRIX ASSOCIATED WITH THE NODAL
C              DISPLACEMENT ORDERING
C                W(1),RX(1),RY(1),W(2), .... RY(3),RM(1), ... RM(M)
C              WHERE RM(1), ... RM(M), IF M GT 0, ARE MIDPOINT
C              DEVIATIONS FROM NORMAL SLOPE LINEARITY
C
C    FT(I)     I=1..NDF   CONSISTENT NODAL FORCE VECTOR ASSOCIATED
C              WITH THE NODAL DISPLACEMENT ORDERING DESCRIBED ABOVE.
C
C
SUBROUTINE SLCCT (M,KKK)
IMPLICIT REAL*8(A-H,O-Z)
COMMON /TRIARG/ A(3),B(3), HMT(3), H(3),  C(3,3), SMT(3,3),
1 BMT(3,3), FT(12),    PX(3),PY(3),PT(3),RM(3), ST(12,12)
DIMENSION  P(21,12), G(21), Q(3,6), QB(3,6), T(3), U(3), HT(3),
1 TX(3), TY(3), IPERM(3), XM(3,3), XM0(3)
EQUIVALENCE (CM11,C(1)),(CM12,C(2)),(CM13,C(3)),(CM22,C(5)),
1 (CM23,C(6)),(CM33,C(9))
LOGICAL NOS, FLAT
DATA IPERM/2,3,1/
H0 = (H(1)+H(2)+H(3))/3.
IF (H0.LE.0.)  GO TO 1000
NDF = 9 + M
NOS = KKK.GT.0
FLAT = (H(1).EQ.H(2)).AND.(H(2).EQ.H(3))
AREA = A(3)*B(2)-A(2)*B(3)
FAC = H0**3*AREA/864.
PTF = AREA/6480.
T(3) = 1.
DO 150  I = 1,3
J = IPERM(I)
K = IPERM(J)
X = A(I)**2+B(I)**2
U(I) = -(A(I)*A(J)+B(I)*B(J))/X
X =DSQRT(X)
Y = 2.*AREA/X
HT(I) =  2.*Y
TX(I) =  Y*A(I)/X
TY(I) = -Y*B(I)/X
A1 = A(I)/AREA
A2 = A(J)/AREA
B1 = B(I)/AREA
B2 = B(J)/AREA
Q(1,I)   = B1*B1
Q(2,I)   = A1*A1
Q(3,I)   = 2.*A1*B1
Q(1,I+3) = 2.*B1*B2
Q(2,I+3) = 2.*A1*A2
Q(3,I+3) = 2.*(A1*B2+A2*B1)
DO 120  N = 1,3
120 XM(N,I) = BMT(N,I)*AREA/72.
IF (FLAT)  GO TO 150
DO 140  N = 1,3
L = IPERM(N)
T(1) = H(N)/H0
T(2) = H(L)/H0
IF (T(1).GT.0.)  XM(N,I) = XM(N,I)/T(1)**3
C1 = T(I)
C2 = T(J)
C3 = T(K)
C4 = C2+C3
C11 = C1*C1
C23 = C2*C3
C5 = C4*(3.*C1+C4) + 6.*C11 - 2.*C23
C6 = C5 + 3.*C4*C4 - 4.*(C11+C23)
QB(N,I  ) = (C1*(10.*C11-3.*C23)+C4*C5)/17.5 - 2.0
140 QB(N,I+3) = (C1*(C11-2.*C23)+C4*C6)/35.0 - 1.0
150 CONTINUE
DO 200  I = 1,3
J = IPERM(I)
K = IPERM(J)
II = 3*I
JJ = 3*J
KK = 3*K
A1 = A(I)
A2 = A(J)
A3 = A(K)
B1 = B(I)
B2 = B(J)
B3 = B(K)
U1 = U(I)
U2 = U(J)
U3 = U(K)
W1 = 1.-U1
W2 = 1.-U2
W3 = 1.-U3
B1D = B1 + B1
B2D = B2 + B2
B3D = B3 + B3
A1D = A1 + A1
A2D = A2 + A2
A3D = A3 + A3
C21 = B1-B3*U3         + TX(K)
C22 = -B1D+B2*W2+B3*U3 + TX(J)-TX(K)
C31 = A1-A3*U3         + TY(K)
C22 = -B1D+B2*W2+B3*U3 + TX(J)-TX(K)
C31 = A1-A3*U3         + TY(K)
C32 = -A1D+A2*W2+A3*U3 + TY(J)-TY(K)
C51 = B3*W3-B2         + TX(K)
C52 = B2D-B3*W3-B1*U1  + TX(I)-TX(K)
C61 = A3*W3-A2         + TY(K)
C62 = A2D-A3*W3-A1*U1  + TY(I)-TY(K)
C81 = B3-B2D-B2*U2     + TX(J)
C82 = B1D-B3+B1*W1     + TX(I)
C91 = A3-A2D-A2*U2     + TY(J)
C92 = A1D-A3+A1*W1     + TY(I)
P1 = PT(I)*PTF
P2 = PT(J)*PTF
P3 = PT(K)*PTF
U37 = 7.*U3
W27 = 7.*W2
W24 = 4.*W2
U34 = 4.*U3
C1 = 54.+W27
C2 = 54.+U37
C3 = 15.+W24
C4 = 39.+U37
C5 = 39.+W27
C6 = 15.+U34
TXS = TX(J)+TX(K)
TYS = TY(J)+TY(K)
FT(II-2) = 6.*((90.+U37+W27)*P1+(36.+U37+W24)*P2+(36.+U34+W27)*P3)
FT(II-1) = (C1*B2-C2*B3+7.*TXS)*P1 + (C3*B2-C4*B3+4.*TXS+
1 3.*TX(K))*P2 + (C5*B2-C6*B3+4.*TXS+3.*TX(J))*P3
FT(II)   = (C1*A2-C2*A3+7.*TYS)*P1 + (C3*A2-C4*A3+4.*TYS+
1 3.*TY(K))*P2 + (C5*A2-C6*A3+4.*TYS+3.*TY(J))*P3
FT(K+9)  = (7.*(P1+P2)+4.*P3)*HT(K)
XM0(I) = (XM(1,I)+XM(2,I)+XM(3,I))/3.
DO 200  N = 1,3
L = 6*(I-1) + N
Q11 = Q(N,I)
Q22 = Q(N,J)
Q33 = Q(N,K)
Q12 = Q(N,I+3)
Q23 = Q(N,J+3)
Q31 = Q(N,K+3)
Q2333 = Q23-Q33
Q3133 = Q31-Q33
P(L   ,II-2) = 6.*(-Q11+W2*Q33+U3*Q2333)
P(L   ,II-1) = C21*Q23+C22*Q33-B3D*Q12+B2D*Q31
P(L   ,II  ) = C31*Q23+C32*Q33-A3D*Q12+A2D*Q31
P(L   ,JJ-2) = 6.*(Q22+W3*Q2333)
P(L   ,JJ-1) = C51*Q2333+B3D*Q22
P(L   ,JJ  ) = C61*Q2333+A3D*Q22
P(L   ,KK-2) = 6.*(1.+U2)*Q33
P(L   ,KK-1) = C81*Q33
P(L   ,KK  ) = C91*Q33
P(L   ,I+9 ) = 0.
P(L   ,J+9 ) = HT(J)*Q33
P(L   ,K+9 ) = HT(K)*Q2333
P(L+3 ,II-2) = 6.*(Q11+U3*Q3133)
P(L+3 ,II-1) = C21*Q3133-B3D*Q11
P(L+3 ,II  ) = C31*Q3133-A3D*Q11
P(L+3 ,JJ-2) = 6.*(-Q22+U1*Q33+W3*Q3133)
P(L+3 ,JJ-1) = C51*Q31+C52*Q33+B3D*Q12-B1D*Q23
P(L+3 ,JJ  ) = C61*Q31+C62*Q33+A3D*Q12-A1D*Q23
P(L+3 ,KK-2) = 6.*(1.+W1)*Q33
P(L+3 ,KK-1) = C82*Q33
P(L+3 ,KK  ) = C92*Q33
P(L+3 ,I+9 ) = HT(I)*Q33
P(L+3 ,J+9 ) = 0.
P(L+3 ,K+9 ) = HT(K)*Q3133
P(N+18,II-2) = 2.*(Q11+U3*Q12+W2*Q31)
P(N+18,KK-1) = ((B1D-B2D)*Q33+C82*Q23+C81*Q31)/3.
P(N+18,KK  ) = ((A1D-A2D)*Q33+C92*Q23+C91*Q31)/3.
200 P(N+18,K+9 ) = HT(K)*Q12/3.
300 DO 400  J = 1,NDF
DO 340  L = 1,3
II = L
KK = L + 18
P3 = P(KK,J)
G(KK) = 0.
DO 340  N = 1,3
I = IPERM(N)
JJ = II + 3
P1 = P(II,J)
P2 = P(JJ,J)
SUM = P1 + P2 + P3
G1 = SUM + P1
G2 = SUM + P2
G3 = SUM + P3
IF (FLAT)  GO TO 320
G1 = G1 + QB(N,1)*P1 + QB(N,6)*P2 + QB(N,5)*P3
G2 = G2 + QB(N,6)*P1 + QB(N,2)*P2 + QB(N,4)*P3
G3 = G3 + QB(N,5)*P1 + QB(N,4)*P2 + QB(N,3)*P3
320 G(II) = G1
G(JJ) = G2
G(KK) = G3 + G(KK)
II = II + 6
340 FT(J) = FT(J) - XM(N,L)*G1 - XM(I,L)*G2 - XM0(L)*G3
IF (NOS)  GO TO 400
DO 360 N = 1,19,3
G1 = G(N)
G2 = G(N+1)
G3 = G(N+2)
G(N)   = CM11*G1 + CM12*G2 + CM13*G3
G(N+1) = CM12*G1 + CM22*G2 + CM23*G3
360 G(N+2) = CM13*G1 + CM23*G2 + CM33*G3
DO 390  I = 1,J
X = 0.
DO 380  N = 1,21
380 X = X + G(N)*P(N,I)
X = X*FAC
ST(I,J) = X
390 ST(J,I) = X
400 CONTINUE
1000 RETURN
END
```

Very little code developed at UCB in 70's adopted Fortran 77 syntax. The following is the title block for the version of SAP4 I have. Even today, Feap shows this history and still uses the pre F90 memory management approaches. I remain very impressed by what these codes have achieved and still use an adaptation of the skyline solver from these codes and Wilson's shifted subspace eigen-solver that produces reliable results for the larger problems attempted today.

```C  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  *
C
C                                 SAP4
C                     A STRUCTURAL ANALYSIS PROGRAM
C           FOR STATIC AND DYNAMIC RESPONSE OF LINEAR SYSTEMS
C
C                K.J. BATHE , E.L. WILSON , F.E. PETERSON
C                  UNIVERSITY OF CALIFORNIA , BERKELEY
C
C          IBM CONVERSION BY UNIVERSITY OF SOUTHERN CALIFORNIA
C                            AUGUST, 1973
C                         REVISED JULY, 1974
C
C                     PRIME CONVERSION FROM IBM
C                                 BY
C                LARRY J. FEESER AND WILLIAM E. BOND
C                  DEPARTMENT OF CIVIL ENGINEERING
C                  RENSSELAER POLYTECHNIC INSTITUTE
C                       TROY, NEW YORK  12181
C                              JULY, 1978
C
C  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  *```