Basic nth order curve fit

Shawn_Gibson · ‎08-25-2014

My goal: Using a small data set (3 - 40 points), calculate the coefficients of an n-th order polynomial, using either the MKL library, or IPP library that fits a data set using c++. Then from that equation, calculate its derivative and solve for any roots, using constraints. I suspect this is something easy, yet I have been searching and searching for just a basic code example and haven't really found what I'm looking for.

Here is a very basic example of such a problem that I solved using Excel trend lines. Input data points x1, y1 (13, 14153.6), x2, y2 (14, 14155.79), x3, y3, (15, 13736.97). First derive a 2nd order equation that roughly fits this data. Excel spits back the trendline equation y = -210.5x^2 + 5685.7x - 24186. So I would be looking for a code example (MKL, or IPP) that gives me the coefficients -210.5 and 5685.7 for this basic example. From there I have a model of my data and I'm looking to interpolate a peak from these 3 points (which will probably be somewhere between 13 - 14). This is a simple enough equation that I can calculate the derivative, set it equal to 0 and solve for x, which turns out to be 13.5052. So I might be looking for a function that takes for its input the 3 data points (array), and what order of polynomial to fit for and then spits back the coefficients.

I can extend my example to 24 points (x = 1 - 24), y = {1700.347656, 2154.535156, 2760.722656, 3649.910156, 4886.097656, 6507.285156, 8244.472656, 10070.66016, 11762.84766, 13218.03516, 13872.22266, 14135.41016, 14153.59766, 14155.78516, 13736.97266, 12577.16016, 10917.34766, 9318.535156, 7739.722656, 6247.910156, 4968.097656, 3924.640625, 3086.183594, 2482.726563}. Again using Excel, I can get a 6th order curve fit (or 2nd, or whatever I would like). y = -0.0031x6 + 0.2293x5 - 5.6862x4 + 46.001x3 + 27.263x2 - 58.167x + 1766. This is a bit trickier to calculate the roots of the derivative. Using the Excel solver, i hand calculated an equation for the derivative, then used a constraint that my root had to be bigger than 10 (as I expect my peak to be around 13 somewhere). The solution was 12.664.

Hopefully this gives a flavor of what I would like to do. Right now I'm lost in the sea of functions/examples and I'm just trying to get a basic example working. I'm trying with the MKL data fit functions, but maybe those are more complex than I need.

mecej4 · ‎08-25-2014

There are two parts to the answer. One is regarding how to do what you asked. The other is to raise doubts about whether it should be done at all, and raise awareness of the unreliability of the answers.

You should use MKL along with a procedural programming language such as Fortran, C (and C++, C#) if you are going to repeat evaluating the peaks of your data thousands of times with different input data. For casual calculation, however, you can be more productive by using a front end such as Matlab (and its clones). The Matlab code to perform your 6-th degree fit and find its peak, for example, is this:

n=6;    % polynomial degree
x=1:24; % data
y=[1700.347656, 2154.535156, 2760.722656, 3649.910156, 4886.097656, ...
   6507.285156, 8244.472656, 10070.66016, 11762.84766, 13218.03516, ...
   13872.22266, 14135.41016, 14153.59766, 14155.78516, 13736.97266, ...
   12577.16016, 10917.34766, 9318.535156, 7739.722656, 6247.910156, ...
   4968.097656, 3924.640625, 3086.183594, 2482.726563];
c=polyfit(x,y,n);     % polynomial coefficients
cd=.*c(1:n);  % coeffs. of derivative polynomial
rcd=roots(cd)         % roots of derivative polynomial

Running this in Matlab gives the result

rcd =

 24.796955220817100 + 1.851754740317976i
 24.796955220817100 - 1.851754740317976i
 12.744533617404986                     
 -0.806296543053428                     
  0.495302762706506

of which the third is the one you expected, but there are two other real roots, and you need to ask why they are not equally acceptable.

Suppose, however, that I change the degree from 6 to 5. The resulting peak locations are drastically different, although the third one is close:

rcd =

  1.0e+002 *

   3.921807689372086
   0.236854696358038
   0.127785399241093
   0.016938110116897

Thus, you see that the "answer" is quite sensitive to the polynomial degree selected. There are now four peaks instead of the three seen earlier.

Likewise, investigation will show you that the result is also sensitive to errors in the input data for x and/or y. Do you know what the probable errors in those are? You can then obtain "confidence intervals" for the peaks. A course in "data analysis" or "regression analysis" will provide you with the details.

Shawn_Gibson · ‎08-25-2014

mecej4,

Thank you for your response. I understand the need to know what one is doing before jumping into a lake full of alligators ;). What you have presented above is a very excellent example of how I can solve this problem using Matlab. Now I have two excellent examples of how to solve the problem. One using Excel, and your excellent example of doing the same thing with Matlab. Incidentally with your follow on example of a 5th degree polynomial, based upon the nature of my problem and knowing the inputs, I would have chosen 12.778.

What I still don't have however is a clear cut code example of how to achieve the same results using either the MKL or IPP libraries, as that is what I want to implement using c++ code. I have presented a very basic example mainly to learn the code mechanics to get the coefficients and roots.

mecej4 · ‎08-25-2014

You can use Lapack-e routine LAPACKE_dgels() to solve the overdetermined linear equations for the polynomial coefficients. The input matrix to dgels() will have 1 in the first column, the vector x in the second column, x² in the third, ..., and xⁿ in the last column. MKL comes with an example file for calling dgels, in .../mkl/examples/lapacke/source/lapacke_dgels_row.c.

Following the fitting, you need to call a routine that can find a root of a nonlinear equation. I am not aware of a routine in MKL or IPP to do this, but you should be able to put one together yourself or find a suitable routine on the Web.