Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.
6956 Discussions

How to use Nonlinear Least Squares Problem without Constraints

jeremy_74
Beginner
1,867 Views
Hi together,

I'm quite new with using the MKL and also not a big mathematic freak. At the moment I try to solve a Function from the type:

ai + bi * x1 = ci * x2 / (di + x3)

x1, x2, x3 are unknown, I have initial estimations for x2 and x3.

Now the questions:
1.) Can I even use the MKL for that, espacially the strnlsp* functiopns?
2.) in the init function how can I match these function to the transfer parameters and what the... is the length of an function m ?

I found some examples, but those are not really good documented - so I really need your help.

Addition: You can not offend me my critizies my knowledge at mathematic.... ;-)

Thanks for your best effort.
0 Kudos
29 Replies
Alexander_K_Intel2
1,450 Views
Hi,
Basically speaking the problem is not clear for me. Is ai,bi,ci,di,x1,x2 and x3 scalar variables? If yes, why you can't calculate x1 asx1=(ci*x2/(di+x3)-ai)/bi?
With best regards,
Alexander Kalinkin
0 Kudos
Alexander_K_Intel2
1,450 Views
And one more question: how many equations you have in your system?
With best regards,
Alexander Kalinkin
0 Kudos
Gennady_F_Intel
Moderator
1,450 Views
one more question: "I found some examples, but those are not really good documented - so I really need your help."
What examples are You talking about and how we should improve the documentation?
0 Kudos
mecej4
Honored Contributor III
1,450 Views
The following comments are based on what I think is the purpose of the calculation:

Given data vectors a, b, c, d of size n >= 3, scalar coefficients x_1, x_2 and x_3 are to be found such that the n equations

a_i + b_i x_1 = c_i x_2 / (d_i + x_3), i = 1 .. n,

are satisfied in the least squares sense.

Being curious myself about the capabilities of MKL in this task, I decided to write up a short test program, adapted from the example in the MKL documentation ("Example. dtrnlsp Usage in Fortran"), with fake data made up for a, b, c, d, and reasonable termination criteria for the iterative solution. Note that the documentation contains a typo, as flagged in the source below. (Edit 2/8/2010: the typo has been corrected).

The result is

strnlsp j74 ... PASS 3.3930975E-03 3.3930975E-03

[fortran]program nlfitmkl
implicit none

INCLUDE 'mkl_rci.fi'

external j74
real, dimension(3) :: x = (/ 3.7, 1.9, 0.17 /)
integer :: i,j,st_cr,res,n=3,m=4,iter,iter1=10,iter2=5,rci_req
real :: eps(6) = (/0.01, 0.01, 0.01, 0.01, 0.01, 0.01 /),rs=90.0
real :: r1,r2,jac_eps=1e-3,fvec(4), fjac(4,3)
integer*8 hndl
logical :: success = .FALSE.

res = strnlsp_init(hndl,n,m,x,eps,iter1,iter2,rs)
if (res /= TR_SUCCESS) then
print *, ' Error in strnlsp_init',res
call mkl_free_buffers()
stop 1
endif

RCI_req = 0
do i=1,m
fvec(i) = 0
do j=1,n
fjac(i,j) = 0
end do
end do

do while(.not. success)
res = strnlsp_solve(hndl,fvec,fjac,rci_req)
if (res /= TR_SUCCESS) then
print *, ' Error in strnlsp_solve', res
call mkl_free_buffers()
stop 2
endif
select case (rci_req)
case (-1,-2,-3,-4,-5,-6)
success = .true.
case (1)
call j74(m,n,x,fvec)
case (2)
res = sjacobi(j74,n,m,fjac,x,jac_eps)
if(res /= TR_SUCCESS) then
print *, ' Error in sjacobi'
call mkl_free_buffers()
stop 3
endif
end select
end do
res=strnlsp_get(hndl,iter,st_cr,r1,r2) ! <---- MKL doc erroneously shows r1_r2
if (res /= TR_SUCCESS) then
print *,' Error in strnlsp_get'
call mkl_free_buffers
stop 4
endif
res = strnlsp_delete(hndl)
if (res /= TR_SUCCESS) then
print *, ' Error in strnlsp_delete'
call mkl_free_buffers()
stop 5
endif
call mkl_free_buffers()
if(r2 < 4e-3) then
print *,' strnlsp j74 ... PASS',r1,r2
else
print *,' strnlsp j74 ... FAIL',r1,r2
endif
end program nlfitmkl

subroutine j74(m,n,x,f)
implicit none
real, dimension(4) :: &
a= (/ -20.394, -21.862, -16.822, -21.395 /), &
b = (/ 5.137, 5.365, 3.418, 5.383 /), &
c = (/ -1.752, -2.554, -2.282, -1.88 /), &
d = (/ 2.228, 2.240, 0.868, 2.25 /)
integer, intent(in) :: m,n
integer :: i
real, dimension(n), intent(in out) :: x
real, dimension(m), intent(in out) :: f
do i=1,m
f(i)=a(i)+x(1)*b(i)-x(2)*c(i)/(d(i)+x(3))
end do
return
end subroutine j74
[/fortran]
0 Kudos
Gennady_F_Intel
Moderator
1,450 Views
  1. res=strnlsp_get(hndl,iter,st_cr,r1,r2)!<----MKLdocerroneouslyshowsr1_r2
What version are talking about?
0 Kudos
mecej4
Honored Contributor III
1,450 Views
Gennady,

I am not sure which version this link pertains to. I had searched the Intel site, located the documentation, and bookmarked the HTML version of the reference manual, assuming that it would be updated whenever a new release of MKL was issued.

http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/win/mkl/refman/appendices/mkl_appC_osr_dtrnlsp.html#appC-exC-43
0 Kudos
Gennady_F_Intel
Moderator
1,450 Views
aahh, thanks a lot!I didn't check these examples -:). We will remove the typo asap.
0 Kudos
Gennady_F_Intel
Moderator
1,450 Views
fyi - the typo was removed..
0 Kudos
jeremy_74
Beginner
1,450 Views
Thanks a lot for all this help!

Many things are now clear and understood. I translated your example to my enviroment and to C and I can observe the single steps working.

But there occures an new problem. After the second Iteration (the Jacobi Matrix is been calculated the first time in 5 step, with success) - the strnlsp_solve crash with an segmentation fault or better with the output " unknown machine code instruction".

Do you have any idear what is going wrong here?

0 Kudos
mecej4
Honored Contributor III
1,450 Views
My guess is that the MKL routines are not being called with the proper argument lists. Note that in this particular instance a function pointer is among the arguments.

If you post the C source, it may be possible to locate the bug.
0 Kudos
jeremy_74
Beginner
1,450 Views
[cpp]    //Optimierung
    //Here I copy my local parameters to global variables for using them in the global function
    CopyCamParaToGlobalVar();  
    _TRNSP_HANDLE_t handle;
    int n = 3; // Count of unknowns
    int m = m_imagePointX.size()*2;  // Count of Values
    float *x;
    x = (float*) malloc (sizeof (float)*n);
    x[0] = m_k1; // Initial values
    x[1] = m_f;
    x[2] = m_Tz;
    Ipp32f eps[6];
    eps[1] = 0.005;
    eps[2] = 0.005; 
    eps[3] = 0.005; 
    eps[4] = 0.005; 
    eps[5] = 0.005; 
    eps[6] = 0.005; 
    int iter1 = 1000;
    int iter2 = 100;
    Ipp32f rs = 90;

    //Init
    int res = strnlsp_init(&handle, &n, &m, x, eps, &iter1, &iter2, &rs);
    if (res != TR_SUCCESS)
    {
        std::cerr<<"Fehler bei Least Square Init ="<<<:ENDL> = 0.0; 
    for (int i = 0; i = 0.0; 
    while(success == false)
    {
        std::cerr<<"Neue Runde = "<<<:ENDL> 

and here the Output on the screen:

[bash]Erfolg bei Least Square Init =1501 n3 m140
Neue Runde = 0
Erfolg bei Least Square Solve Iteration=0 RCI_Request=1
CalculateFunctionTsai - Start
CalculateFunctionTsai - End mit x0=0 x1=0.394556 x2=-0.0816883
Neue Runde = 1
Erfolg bei Least Square Solve Iteration=1 RCI_Request=2
CalculateFunctionTsai - Start
CalculateFunctionTsai - End mit x0=27 x1=0.394556 x2=-0.0816883
CalculateFunctionTsai - Start
CalculateFunctionTsai - End mit x0=-27 x1=0.394556 x2=-0.0816883
CalculateFunctionTsai - Start
CalculateFunctionTsai - End mit x0=0 x1=27.3946 x2=-0.0816883
CalculateFunctionTsai - Start
CalculateFunctionTsai - End mit x0=0 x1=-26.6054 x2=-0.0816883
CalculateFunctionTsai - Start
CalculateFunctionTsai - End mit x0=0 x1=0.394556 x2=26.9183
CalculateFunctionTsai - Start
CalculateFunctionTsai - End mit x0=0 x1=0.394556 x2=-27.0817
Erfolg bei Jacobi Iteration=1 n=3 m=140
Neue Runde = 2
Ungltiger Maschinenbefehl

[/bash]
The last means SIGILL
0 Kudos
jeremy_74
Beginner
1,450 Views
I checked this example also and have some questions about:

1.) in Line 36 (C-Code Example) MKL)INT - I think this should be MKL_INT
2.) I'm confused about the access to the elemts of x:
[cpp]    for (i = 0; i < n/4; i++)

    {

        x [4*i]     = 3.0;

        x [4*i + 1] = -1.0;

        x [4*i + 2] = 0.0;

        x [4*i + 3] = 1.0;

    }[/cpp]

This will always steps through the loop once, right? Is there a deeper sense-which I do not understand?

And at the end the same again:
[cpp]void extended_powell (MKL_INT *m, MKL_INT *n, double *x, double *f)

{

    MKL_INT i;

 

    for (i = 0; i < (*n)/4; i++)

    {

       f [4*i] = x [4*i] + 10.0*x [4*i + 1];

       f [4*i + 1] = 2.2360679774998*(x [4*i + 2] - x [4*i + 3]);

       f [4*i + 2] = (x [4*i + 1] - 2.0*x [4*i + 2])*(x [4*i + 1] - 2.0*x [4*i + 2]);

       f [4*i + 3] = 3.1622776601684*(x [4*i] - x [4*i + 3])*(x [4*i] - x [4*i + 3]);

    }

    return;

}[/cpp]
0 Kudos
Alexander_K_Intel2
1,450 Views
Gentlemen,
First of all thank you for so deep interest in Trust Region functionality! Nevertheless Jeremy could you provide us your example fully (subroutine CalculateFunctionTsai and other) to reproduce your problem.
With best regards,
Alexander Kalinkin
0 Kudos
Alexander_K_Intel2
1,450 Views
Aboutextendet_powell function: you can set m=n=8 for example than your example will step through loop twice.
With best regards,
Alexander Kalinkin
0 Kudos
jeremy_74
Beginner
1,450 Views
Sorry that I can not share the whole code, but for a better understanding I can provide all the inputs and the code parts which are interessting for this error:

1.) The CalculateFunctionTsai Subroutine:
[cpp]void CalculateFunctionTsai(int *m, int *n, double *x, double *FValues)
{
	std::cerr<<"CalculateFunctionTsai - Start m="<< *m<<:ENDL> - g_cx) / g_sx;      // [mm]
		yd = g_dy * (g_imagePointY - g_cy);              // [mm]
		r_squared = pow(xd , 2) + pow(yd, 2);
		ay = yd;
		ax = xd;
		by = yd * r_squared;
		bx = xd * r_squared;
		cy = (g_R4 * g_refPointX + g_R5 * g_refPointY + g_R6 * g_refPointZ + g_Ty );
		cx = (g_R1 * g_refPointX + g_R2 * g_refPointY + g_R3 * g_refPointZ + g_Tx );
		d = (g_R7 * g_refPointX + g_R8 * g_refPointY + g_R9 * g_refPointZ);
		*FValues = ax + bx * x[0] - cx * x[1] / (d + x[2]);
		++FValues;
		*FValues = ay + by * x[0] - cy * x[1] / (d + x[2]);
		++FValues;
	}
	std::cerr<<"CalculateFunctionTsai - End mit x0="< 
2.) The Start Parameters:

[cpp]IS::Error CameraCalib::ReadCalib()
{

	//Kalibrierwerte ToDo: Wird bergeben
	int countSpan = 7;
	int countRow = 10;
	int countTotal = 70;
	Ipp32f xRefOffset = 100.0;
	Ipp32f yRefOffset = 100.0;
	Ipp32f xRefStepSize = 40.0;
	Ipp32f yRefStepSize = 80.0;

	Ipp32f xImg[70] ={ 42.0, 127.0, 214.0, 305.0, 399.0, 490.5, 576.0,
							   36.0, 121.5, 210.0, 304.5, 399.0, 492.0, 580.5,
							   31.5, 115.5, 207.0, 301.5, 397.5, 492.0, 580.5,
							   27.0, 114.0, 204.0, 300.0, 396.0, 492.0, 580.5,
							   22.5, 111.0, 202.5, 298.5, 394.5, 490.5, 580.5,
							   21.0, 108.0, 201.0, 295.5, 393.0, 389.0, 577.5,
							   18.0, 106.5, 199.5, 294.0, 391.5, 487.5, 576.0,
							   16.5, 105.0, 198.0, 292.5, 388.5, 484.5, 573.0,
							   16.5, 103.5, 196.5, 292.5, 387.0, 481.5, 570.0,
							   18.0, 103.5, 196.5, 291.0, 384.0, 477.0, 565.5};

	Ipp32f yImg[70] ={ 34.0,  33.0,  29.0,  31.0,  35.0,  40.5,  49.5,
							   75.0,  73.5,  73.5,  73.5,  78.0,  82.5,  90.0,
							  115.5, 117.0, 117.0, 117.5, 121.5, 126.0, 135.0,
							  162.0, 162.0, 163.5, 166.5, 165.5, 174.0, 178.5,
							  207.0, 208.5, 211.5, 214.5, 219.0, 222.0, 226.5,
							  253.5, 255.0, 259.5, 264.0, 267.0, 271.5, 273.0,
							  300.0, 303.0, 307.5, 312.0, 319.0, 318.0, 319.5,
							  346.5, 351.0, 357.0, 360.0, 364.5, 366.0, 367.5,
							  391.5, 399.0, 402.0, 408.0, 412.5, 412.5, 411.5,
							  436.5, 445.5, 451.5, 456.0, 459.0, 459.0, 456.0};

	m_refPointX.resize(countTotal);
	m_refPointY.resize(countTotal);
	m_refPointZ.resize(countTotal);
	m_imagePointX.resize(countTotal);
	m_imagePointY.resize(countTotal);

	int t=0;

	for(int y=0; y = xImg;
			m_imagePointY = yImg;

			m_refPointX = xRefOffset + x * xRefStepSize;
			m_refPointY = yRefOffset + y * yRefStepSize;
			m_refPointZ = 10;
			std::cerr<<"X: "<<<" Y: "<<<" ; ";
			++t;
		}
		std::cerr<<:ENDL> 

3.) The already calculated parameters:
m_Ty =-781.346
m_Tx =-0.772715
m_sx =1.5971
m_R1=-0.874132 m_R2=0.025665 m_R3=0.48501
m_R4=-0.407516 m_R5=-0.582054 m_R6=-0.703665
m_R7=0.264242 m_R8=-0.812745 m_R9=0.519251

I guess this is all.

There is are some informations I want to share with you:
  • I found out, that when I overwrite the results of fjac again width 0.0 after calling sjacobi(CalculateFunctionTsai,&n,&m,fjac,x,eps); (line 67)-> the error don't occures! It seems like it have to do width the content of fjac. -> I'm really confused
  • It doesn't make any difference if I use the double functions instead of the Real Functions.
Last but not least all variables called g_* are only copies of the corresponding m_* variable


0 Kudos
jeremy_74
Beginner
1,450 Views
There is no difference, when I limit m to n.
0 Kudos
Alexander_K_Intel2
1,450 Views
Hi,
Its really difficult to reproduce your testcase from parts of program. Could you combine really small example that I could compile and execute on my side? You dont need to share whole code but without testcase it is really hard to find out rootcase of your problem.
With best regards,
Alexander Kalinkin
0 Kudos
jeremy_74
Beginner
1,450 Views
Hi,

I transfer all to a small C-Programm. This should help to reproduce the error:

[cpp]/*
 * test.cpp
 *
 *  Created on: 10.02.2011
 *      Author: jeremy
 */

#include 
#include 
#include 
#include 


	//! Innere Kalibrimain.o:(.bss+0x88): first defined here
	float m_f, m_cx, m_cy, m_k1, m_k2;

	//! Rotationsmatrix der Kamera
	float m_R1, m_R2, m_R3, m_R4, m_R5, m_R6, m_R7, m_R8, m_R9;

	//!Translationsmatrix
	float m_Tx, m_Ty, m_Tz;

	//! Sensorwerte
	float m_dxs, m_dy, m_Ncx, m_Nfx, m_sx;

	//! Vector Referenzpunkte
	std::vector m_refPointX, m_refPointY, m_refPointZ;

	//! Vector Bildpunkte
	std::vector m_imagePointX, m_imagePointY;

void CalculateFunctionTsai(int *m, int *n, double *x, double *FValues)
{
	std::cerr<<"CalculateFunctionTsai - Start m="<< *m<<:ENDL> - m_cx) / m_sx;      // [mm]
		yd = m_dy * (m_imagePointY - m_cy);              // [mm]
		r_squared = pow(xd , 2) + pow(yd, 2);
		ay = yd;
		ax = xd;
		by = yd * r_squared;
		bx = xd * r_squared;
		cy = (m_R4 * m_refPointX + m_R5 * m_refPointY + m_R6 * m_refPointZ + m_Ty );
		cx = (m_R1 * m_refPointX + m_R2 * m_refPointY + m_R3 * m_refPointZ + m_Tx );
		d = (m_R7 * m_refPointX + m_R8 * m_refPointY + m_R9 * m_refPointZ);
		*FValues = ax + bx * x[0] - cx * x[1] / (d + x[2]);
		++FValues;
		*FValues = ay + by * x[0] - cy * x[1] / (d + x[2]);
		++FValues;
	}
	std::cerr<<"CalculateFunctionTsai - End mit x0="< = xImg;
			m_imagePointY = yImg;

			m_refPointX = xRefOffset + x * xRefStepSize;
			m_refPointY = yRefOffset + y * yRefStepSize;
			m_refPointZ = 10;
			std::cerr<<"X: "<<<" Y: "<<<" ; ";
			++t;
		}
		std::cerr<<:ENDL> = 0.0;
	for (int i = 0; i< (m * n); ++i)
		fjac = 0.0;

	while(success == false)
	{
		std::cerr<<"Neue Runde = "<<<:ENDL> best regards Jeremy
0 Kudos
Gennady_F_Intel
Moderator
1,450 Views
Jeremy, may be I missed something the part of discussions,
but you mentioned above that some crash happened: "the strnlsp_solve crash with an segmentation fault or better with the output " unknown machine code instruction"."
I quickly checked your example - the test finished without run-time problems..
0 Kudos
jeremy_74
Beginner
1,303 Views
what a surprise - now I'm really confused the same code crash with the german output on the screen: "Ungltiger Maschinenbefehl" what is equal with SIGILL. Could it be that it have something to do with SIMD? I'm using a Pentium M 1500MHz, which can only provide MMX, SSE and SSE2 and SIGILL can have some reason in this...- just an idear..
0 Kudos
Reply