
Polynomial line fitting

Valued Contributor III

Has anyone tried FITPACK with Windows?

Honored Contributor I

What is the problem exactly? I did find the source code via Netlib (Google was no help here, as it has a very different idea of what "fit" means).

Honored Contributor III

There are two packages called "Fitpack" at Netlib: one by Alan Cline and the other by Paul Dierckx. The word "fitting" may imply that the resulting curve/expression should interpolate the provided points exactly, or that the curve should "fit" the data in a least-squares or similar sense, i.e., that the data points should be close to the curve. An interpolating curve may have unwelcome wiggles, and a smooth fit tries to avoid these wiggles. Sometimes the curve is subjected to "tension", as in a taut string.

Honored Contributor III

I have used some of the Dierckx routines, but I did some mild refactoring to make them compliant with the Fortran 2018 standard. If I recall correctly, I was fitting spline curves.

Honored Contributor I

I found only one FITPACK on Netlib, thanks for the second link. The first few experiments with Cline's package look fine, though some results were puzzling. Anyway, I will continue experimenting, as I am interested in the general problem these packages solve. (My experiments involve creating an object-oriented interface for them and then seeing whether the splines thus produced give understandable answers.)

Valued Contributor III

The accelerometer, tilt meter and GPS data that one collects, whether at 8 second intervals or 2000 times per second, quickly grows into huge data series.

The first problem with these series is the thermal impact. This is not something one can ignore.

So a sample from a tilt meter:



Clearly there are two influences: a loss line and a thermal effect. The loss line is measured over decades and the temperature cycle over a day, so temperature dominates for a little while, but then the loss line kicks in.


We can do a linear regression in many ways; Fortran, EXCEL and C# are three examples. EXCEL gives the best graphs but takes a lot of time, which is useful while you work out techniques. C# gives better graphs than Fortran for the same work, but Fortran is faster. C# does not handle non-linear regression easily. If you are doing a lot of this, use Fortran or C#; it is a personal choice.

If we do the linear regression in EXCEL we get 



                Coefficients    Standard Error  t Stat          P-value
X Variable 1    -1.73446E-07    8.35188E-10     -207.6728992    0


The P-value is not really zero, but t-statistics this high test the ability of real numbers to complete the analysis. A t-stat over 2 in absolute value means the regression is solid, and at 200 there is no argument.
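
For readers who want to reproduce the Coefficients, Standard Error and t Stat columns outside EXCEL, here is a minimal sketch in Python (chosen only for brevity; the arithmetic ports directly to Fortran). The function name `linear_regression` and the synthetic data are illustrative assumptions, not the tilt meter series from this thread.

```python
import math

def linear_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (intercept, slope, standard error of slope, t statistic of slope).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares with n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return intercept, slope, se_slope, slope / se_slope

# Synthetic stand-in data: a small negative trend plus a deterministic wiggle.
steps = list(range(1000))
tilt = [0.5 - 2.0e-3 * i + 0.01 * math.sin(i) for i in steps]

intercept, slope, se_slope, t_stat = linear_regression(steps, tilt)
print(f"slope={slope:.6e}  se={se_slope:.6e}  t={t_stat:.1f}")
```

The t statistic is simply the slope divided by its standard error, which is why a long series with a steady trend produces values in the hundreds.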

The -1.7e-7 looks unimportant until you realize there are 10000 steps per day, 3.65 million in a year and 365 million in a century. A bridge should last a century, and we do not want the piers to tilt that far.
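
The scaling argument can be checked with a few lines of arithmetic. This sketch assumes the regression's X variable is the step index and that the coefficient is in the same units as the tilt axis; both are assumptions made for illustration. It uses the round step counts quoted above.

```python
# Regression coefficient from the table above (tilt change per step).
slope_per_step = -1.73446e-07

# Round step counts quoted in the post.
steps_per_period = {
    "day": 10_000,
    "year": 3_650_000,
    "century": 365_000_000,
}

# Accumulated change if the fitted slope simply persisted unchanged.
drift = {period: slope_per_step * n for period, n in steps_per_period.items()}

for period, value in drift.items():
    print(f"{period:8s} {value: .6f}")
```

Extrapolated over a century, the tiny per-step slope accumulates to roughly -63 in tilt units, which is why a coefficient of order 1e-7 cannot be dismissed on size alone.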


Take a residual plot and the trend is mostly gone, and we get

The secondary analysis is to look at the residuals. Here one can use an FFT, but the answer is trivial at a one-day cycle, and the better method is to use polynomial or similar regression. Here, though, I use a linear regression to take out the temperature dependence.




                Coefficients    Standard Error  t Stat          P-value
X Variable 1    -0.001623634    5.61804E-06     -289.004        0


and the residual plot is 



Now I am stuck. The only real solution is Fortran (maybe C++ if you are a masochist), and we are now looking at lots of records, where the difference is what drives the measured acceleration: thermal, traffic or construction. If we tease apart the data we will find the polynomial elements that relate to traffic and construction, and the base one is the one left over: minor thermal. In this case, because we did a few months without traffic, we know the lowest curve is no loads other than natural ones: a bit of wind, river force, etc.

So we use Fortran to subset the data into nice groups and then apply polynomial-type regression; i.e., I found FITPACK and now I have to see if it will work.
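
As a point of comparison before trying FITPACK, the "subset, then fit a polynomial per group" idea can be sketched with plain polynomial least squares. This is not FITPACK (which fits splines); it solves the normal equations directly, which is only reasonable for low degrees and well-scaled x values. The group labels and data below are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def polyfit(x, y, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... + c[degree]*x**degree."""
    size = degree + 1
    # Normal equations (V^T V) c = V^T y, with V the Vandermonde matrix.
    power_sums = [sum(xi ** k for xi in x) for k in range(2 * degree + 1)]
    ata = [[power_sums[i + j] for j in range(size)] for i in range(size)]
    aty = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(size)]
    return solve_linear_system(ata, aty)

# Hypothetical groups, each with its own quadratic response on x in [-1, 1].
xs = [-1.0 + 0.1 * i for i in range(21)]
groups = {
    "no_traffic": [1.0 + 2.0 * x - 3.0 * x * x for x in xs],
    "traffic": [0.5 - 1.0 * x + 0.25 * x * x for x in xs],
}

fits = {label: polyfit(xs, ys, 2) for label, ys in groups.items()}
for label, coeffs in fits.items():
    print(label, [round(c, 6) for c in coeffs])
```

For real series it is worth rescaling x to a range such as [-1, 1] first, as done here, because raw step counts in the millions make the normal equations badly conditioned.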

One tests out the methods with EXCEL and then codes them.

The interesting issue is the use of the Central Limit Theorem on very large data sets. 

Honored Contributor III

Can you make two new charts, TiltX and X Variable 1 Residual, but this time plot only the first 20000 points and replace the "large" dot and diamond markers with a 1-pixel-wide line?

This might help to visualize the activity.


The 3rd and 4th charts are somewhat interesting. Can you explain them a little bit?


The data presented is similar except the slopes are switched (3rd chart has a somewhat negative slope and the 4th chart has a somewhat positive slope).


Both charts appear to have four major harmonics causing four major aggregations (point densities about lines).

Charts 1 and 2 could have a single curve fitted (somewhat like abs(sin(X))), while charts 3 and 4 might be best served by fitting multiple lines. As to how to do this, it may require some creativity.

A guess at what to do (iteratively):

a) Filter out what appear to be stragglers from potential lines (density clusters).

b) Check for the potential to curve fit multiple lines.

c) If that fails, tighten the filter and go to a).
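
One possible reading of steps a) to c), sketched in Python for a single line rather than several: fit a line, drop the points whose residual exceeds a multiple of the RMS residual, and refit until nothing more is dropped. The names `trim_stragglers` and `threshold` and the synthetic points are assumptions for illustration; separating several overlapping lines would need a clustering step on top of this.

```python
import math

def fit_line(points):
    """Least-squares line through (x, y) pairs; returns (intercept, slope)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def trim_stragglers(points, threshold=2.5, max_passes=20):
    """Repeatedly refit and discard points far from the current line."""
    kept = list(points)
    for _ in range(max_passes):
        intercept, slope = fit_line(kept)
        residuals = [y - (intercept + slope * x) for x, y in kept]
        rms = math.sqrt(sum(r * r for r in residuals) / len(residuals))
        if rms == 0.0:
            break  # perfect fit, nothing left to filter
        survivors = [p for p, r in zip(kept, residuals)
                     if abs(r) <= threshold * rms]
        if len(survivors) == len(kept) or len(survivors) < 3:
            break  # no point died this pass, or too few would remain
        kept = survivors
    return kept, fit_line(kept)

# Synthetic data: one clean line with a small wiggle, plus three stragglers.
line_points = [(float(i), 2.0 + 0.5 * i + 0.01 * math.sin(i)) for i in range(50)]
stragglers = [(10.0, 30.0), (25.0, -20.0), (40.0, 50.0)]

kept, (intercept, slope) = trim_stragglers(line_points + stragglers)
print(len(kept), round(intercept, 3), round(slope, 4))
```

Because the cut-off is tied to the RMS of the surviving points, the filter tightens on its own from one pass to the next, and points can only be removed, never added back.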


First pass might filter out these:



Somewhat like a "Game of Life" where a point can die, but points cannot propagate.



Honored Contributor I

Such datasets are always intriguing :), but I cannot give any advice on this particular one. What I did want to mention is that I continued working on a more modern interface to Cline's FITPACK and did some more experimenting, in particular with the smoothing options. The package allows you to calculate a smoothing spline "object" where one of the parameters is the weight to be assigned to the data points. In this particular package that weight is roughly the standard deviation (according to the comments). Using a "large" value makes the curve much smoother and less inclined to follow the data points.

Just some observations.

Valued Contributor III


Jim: These are the first 20000 points, as you asked. I have used the thinnest of lines. Each point represents "average" data for an 8.192 second interval, which is based on a 16384-point FFT at 2000 Hz.
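
The 8.192 second interval follows from the block length and sample rate. The sketch below checks that arithmetic and shows one simple way to reduce raw samples to one value per block. The plain arithmetic mean is an assumption: the post only says "average" in quotes and does not describe the actual reduction.

```python
SAMPLE_RATE_HZ = 2000   # sampling rate quoted in the post
BLOCK_SIZE = 16384      # FFT length quoted in the post

block_seconds = BLOCK_SIZE / SAMPLE_RATE_HZ        # duration of one block
frequency_resolution = SAMPLE_RATE_HZ / BLOCK_SIZE  # FFT bin width in Hz
blocks_per_day = 86400 / block_seconds              # points per day

def block_means(samples, block_size):
    """Arithmetic mean of each complete block; a trailing partial block is dropped."""
    n_blocks = len(samples) // block_size
    return [
        sum(samples[i * block_size:(i + 1) * block_size]) / block_size
        for i in range(n_blocks)
    ]

# Tiny demonstration with a block size of 4 instead of 16384.
demo = block_means(list(range(10)), 4)

print(block_seconds, frequency_resolution, blocks_per_day, demo)
```

At about 10547 blocks per day, the "10000 steps per day" figure used earlier in the thread is a convenient round number for the same series.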

The vertical axis is tilt in degrees.  

The object is an old structure made of steel with a concrete substructure. The tilt meter sits on top of the substructure 80 feet in the air. 

The tilt meter reads at 200 Hz.  

There are three things that can disturb the structure: thermal effects, traffic and adjacent construction work. If you record continuously, you have the advantage that some periods are just thermal, some are traffic and thermal, etc.

Whenever I do this, I am asked what the impact of either traffic or construction is. The simple answer is to code it properly and then just get all the answers and put them in a MySQL database in the cloud.

With this structure at the first meeting, a comment was made that thermal did not impact the response.  



The correlation is obvious to even an old human brain.  


If I plot the temperature against the tilt, I get the following, and it shows several lines through the points:


The question is whether they exist or are random, so turn on the connecting lines:


So the result shows the tilt is related to temp + some other factors.  Now we look for the other factors. 






This is traffic, thermal and construction; now it is just set theory.

Arjen: I live inside these data sets, with clients asking what it all means.

Honored Contributor III

The temperature versus tilt chart looks like a hysteresis chart. I do not think the lines are random.

Rather, one is when the temperature is ascending, and the other is when the temperature is descending.
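
The ascending/descending idea can be tested by splitting the time-ordered readings on the sign of the temperature change and fitting each branch separately. This is a minimal sketch in Python with made-up numbers; the function names and the size of the offset between branches are illustrative assumptions, not values from the bridge data.

```python
def fit_line(points):
    """Least-squares line through (x, y) pairs; returns (intercept, slope)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def split_by_temperature_trend(temps, tilts):
    """Split time-ordered readings into warming and cooling branches."""
    rising, falling = [], []
    for i in range(1, len(temps)):
        delta = temps[i] - temps[i - 1]
        if delta > 0:
            rising.append((temps[i], tilts[i]))
        elif delta < 0:
            falling.append((temps[i], tilts[i]))
        # Readings with no temperature change are left unassigned.
    return rising, falling

# Synthetic day: warm from 15 to 25 degrees, then cool back to 15.
warming = [15.0 + 0.5 * i for i in range(21)]
cooling = [25.0 - 0.5 * i for i in range(1, 21)]
temps = warming + cooling
# Same slope on both branches, offset by a made-up hysteresis gap.
tilts = ([0.001 * t + 0.002 for t in warming]
         + [0.001 * t - 0.002 for t in cooling])

rising, falling = split_by_temperature_trend(temps, tilts)
rise_intercept, rise_slope = fit_line(rising)
fall_intercept, fall_slope = fit_line(falling)
print(rise_slope, fall_slope, rise_intercept - fall_intercept)
```

If the two branches of the real data give similar slopes but clearly different intercepts, that supports the hysteresis reading; a third cluster that falls in neither branch would need a separate explanation.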

I am not sure about the third line in between the top and bottom. That line seems to be somewhat stable (i.e. not noisy).

The vertical blip at ~22.5 may be a wind gust or something heavy moving into/out of the structure (or something rotating on the structure).

Also, the "gap" between 20 and 21.5 in the top line is interesting; something happened at that time.

Does that correlate with some event related to the structure?



Valued Contributor III


Thanks for the message. I want to send you a private message, but I am snookered as to how to do it; the new dashboard does not appear to have a way to create messages.

In essence the trucks driving across the bridge change the mass of the steel bridge and impact on the tilt.  

Any idea how to create a private message would be appreciated.


Honored Contributor III

We'd connected by email before.

I emailed you a message a moment ago.


