documentation on Optimization Solver Routines

llevrel · ‎07-09-2009

Hi,

First, thanks for the great MKL software.

I'm having a hard time trying to understand sections 14 and F of the Reference Manual...
Here are the many inconsistencies or imprecisions that puzzle me:
- in dtrnlsp_init and _solve: "F(x) is the value of the functional", but the functional is a positive scalar so (i) it has no norm to be computed when comparing to eps(2); (ii) it can't be added to A(x)s which is a vector (comparison with eps(5))
- in dtrnlsp_init and _solve: what does the condition on eps(3) mean? max(abs(A_ij) for all i,j)- in dtrnlsp_init: what does rs serve for exactly? It is "used in determining (...)", but how?
- in dtrnlsp_solve: first equation of section Description: F(x) seems to be the vector of m f(x) values. Shouldn't that be (y-f(x)) ?
- in dtrnlsp_solve: last equation of section Description is problematic: J is a matrix, so is J^T.J; but J^T.F(x).s is... is what? What is the product of a matrix times a vector times another vector?
- in dtrnlsp_solve: is fjac the Jacobian of fvec=y-f(x) or of f(x)? I bet it's fvec, but "the function" is ambiguous.
- in the code example in C: rs=0.0 ???
- in appendix F, third equation, there should be no 1/2; also, r(x) and R(x) are introduced but never used afterwards
- appendix F, "Trust-Region Algorithm" section: second equation again mixes F with a vector; k is not defined (guessed it's the iteration number)
- appendix F, "Trust-Region Algorithm" section, third equation: now f is the functional, F the vector??? (plus this 1/2 again)
- appendix F, last equation: what are functions rho and m?

I'm all the more surprised that the rest of the manual is rather precise. Maybe this chapter is a late addition that couldn't be fully reviewed? Anyway I'm confident that you will soon clarify things.

Thanks in advance,
LL

ArturGuzik · ‎07-09-2009

Quoting - llevrel

- in dtrnlsp_init and _solve: "F(x) is the value of the functional", but the functional is a positive scalar so (i) it has no norm to be computed when comparing to eps(2); (ii) it can't be added to A(x)s which is a vector (comparison with eps(5))
- in dtrnlsp_init and _solve: what does the condition on eps(3) mean? max(abs(A_ij) for all i,j)- in dtrnlsp_init: what does rs serve for exactly? It is "used in determining (...)", but how?
- in dtrnlsp_solve: first equation of section Description: F(x) seems to be the vector of m f(x) values. Shouldn't that be (y-f(x)) ?
- in dtrnlsp_solve: last equation of section Description is problematic: J is a matrix, so is J^T.J; but J^T.F(x).s is... is what? What is the product of a matrix times a vector times another vector?
- in dtrnlsp_solve: is fjac the Jacobian of fvec=y-f(x) or of f(x)? I bet it's fvec, but "the function" is ambiguous.
- in the code example in C: rs=0.0 ???
- in appendix F, third equation, there should be no 1/2; also, r(x) and R(x) are introduced but never used afterwards
- appendix F, "Trust-Region Algorithm" section: second equation again mixes F with a vector; k is not defined (guessed it's the iteration number)
- appendix F, "Trust-Region Algorithm" section, third equation: now f is the functional, F the vector??? (plus this 1/2 again)
- appendix F, last equation: what are functions rho and m?

I'm all the more surprised that the rest of the manual is rather precise. Maybe this chapter is a late addition that couldn't be fully reviewed? Anyway I'm confident that you will soon clarify things.

Thanks in advance,
LL

Hi,

you nicely listed all the problems with the docs. I can help with just few of them:

(1) in dtrnlsp_solve: is fjac the Jacobian of fvec=y-f(x) or of f(x)? I bet it's fvec, but "the function" is ambiguous.

In calculations always fvec (a residual here) is used, rest is just to make people confused.

(2) Appendix F.

All of that is copied (without update) from the book Trust-region methods By Andrew R. Conn, Nicholas I. M. Gould, Philippe L. Toint. See yourself.
So rho is just a ratio of decrease in function and model (approximation of f in neighbourhood of x) used to accept/reject the trial step. Model is mentioned somewhere on top of the page (in the docs) but no details given.

All these small things with the docs make Users confused, annoyed, and discouraged. Good and clear docs are essential. Take the linking problem as an example. Once somebody finally (!!!) came up with the linking assistance tool, linking problems almost dissapear from Forum.

A.

llevrel · ‎07-10-2009

Quoting - ArturGuzik

rest is just to make people confused.

:-D

(2) Appendix F.

All of that is copied (without update) from the book Trust-region methods By Andrew R. Conn, Nicholas I. M. Gould, Philippe L. Toint. See yourself.

That's what I feared. Sigh.

All these small things with the docs make User's confused, annoyed and discourage them. Good and clear docs are essential. Take the linking problem as an example. Once somebody finally (!!!) came up with the linking assistance tool, linking problems almost dissapear from Forum.

About docs, oh yes, fully agreed! Regarding linking, I strove to understand how to make it work.

Thanks for your help!

Nikita_S_Intel · ‎07-13-2009

Hi Lucas,

Thank you for your comments. You found some misprints in our documentation. Well fix it as soon as possible.

- in dtrnlsp_init and _solve: "F(x) is the value of the functional", but the functional is a positive scalar so
(i) it has no norm to be computed when comparing to eps(2);
(ii) it can't be added to A(x)s which is a vector (comparison with eps(5))
The statement of non-linear least square problem is min||F(x)|| = min||y f(x)|| , where F(x) : Rⁿ --> R^m is a twice differentiable function in Rⁿ. Solving of the nonlinear least squares problem is searching for the best approximation to vector y with model function fi(x), which has nonlinear dependence on variables . The best approximation means that the sum of squares of residuals yi - fi(x) is the lowest possible.
Well fix the introduction of "Nonlinear Least Squares Problem without Constraints" section, as soon as possible.

- in dtrnlsp_init and _solve: what does the condition on eps(3) mean? max(abs(A_ij) for all i,j)The Jacobi matrix should be non-singular matrix. Here we checked this.

- in dtrnlsp_init: what does rs serve for exactly? It is "used in determining (...)", but how?
rs is positive input variable used in determining the initial step bound. In most cases the factor should lie within the interval (0.1, 100.0). The generally recommended value is 100. If youre interested in details, please you find information in (first reference exists in user manual):
(1) Andrew R. Conn, Nicholas I.M. Gould, and Philippe L. Toint: Trust-region Methods. SIAM Society for Industrial & Applied Mathematics, New Jersey, MPS-SIAM series on optimization edition, 2000.
(2) J.J. More. The LevenbergMarquardt algorithm: implementation and theory. G.A. Watson, ed, Lectures Notes in Mathematics 630: Numerical Analysis (SpringerVerlag, Berlin, 1978) pp 105116.
(3) J.J. More, D.C. Sorensen. Computing a trust region step. SIAM journal on scientific and statistical Computing, 4(3):553572, 1983.

- in dtrnlsp_solve: first equation of section Description: F(x) seems to be the vector of m f(x) values. Shouldn't that be (y-f(x)) ?
You are right, but here is just a classic statement of nonlinear least square problem.

- in dtrnlsp_solve: last equation of section Description is problematic: J is a matrix, so is J^T.J; but J^T.F(x).s is... is what? What is the product of a matrix times a vector times another vector?
There is a misprint, third equation should be: min || J^T(x)J(x)s + J^T(x)F(x)||. You can find full description of algorithms in (1), (2) and (3).

- in dtrnlsp_solve: is fjac the Jacobian of fvec=y-f(x) or of f(x)? I bet it's fvec, but "the function" is ambiguous.
Yes, you are right, its fvec

- in the code example in C: rs=0.0 ???
Its misprint, thank you!

- appendix F, "Trust-Region Algorithm" section: second equation again mixes F with a vector; k is not defined (guessed it's the iteration number)
Yes, you are right; of course its iteration number

- appendix F, "Trust-Region Algorithm" section, third equation: now f is the functional, F the vector??? (plus this 1/2 again)
We'll rewrite this section.

- appendix F, last equation: what are functions rho and m?
Artur answered on this question. Artur, thank you!

BTW, the nonlinear optimizations are heavy algorithms. We included only main points in users manual, pls find full description of algorithm in (1), (2) and (3)

Thank you!
--Nikita

llevrel · ‎07-13-2009

Hi Nikita,

Thanks for you reply. I still have a couple of questions.

Quoting - Nikita Shustrov (Intel)

The Jacobi matrix should be non-singular matrix. Here we checked this.

So, you mean the test is on the determinant of J? |det(J)| < eps(3) ?

BTW, the nonlinear optimizations are heavy algorithms. We included only main points in users manual, pls find full description of algorithm in (1), (2) and (3)

Yes, I understand that. I guess it will be rather OK when the typos and ambiguities are corrected.

Another question/remark: I've started looking at the references, and it seems the TR algorithm uses some parameters to accept/reject trials and to evolve the TR radius. The authors give some hints about reasonable values, but you should make clear what values the routines use actually. (Or you could even let the user tweak them!)

Nikita_S_Intel · ‎07-14-2009

Hi Lucas,

So, you mean the test is on the determinant of J? |det(J)| < eps(3) ?
No we don't calculate the determinant of Jacobi matrix because this operation is very "expensive". We check following: the norm of each column of Jacobi matrixshould be larger the epsilon. In general, it's enough for our algorithm.

Another question/remark: I've started looking at the references, and it seems the TR algorithm uses some parameters to accept/reject trials and to evolve the TR radius. The authors give some hints about reasonable values, but you should make clear what values the routines use actually. (Or you could even let the user tweak them!)
I understand your question, but unfortenalty these parameters and handling of parameters are internal :)

--Nikita