I'm not sure what the best way to communicate requests for new functionality in MKL is, so I'll post here, hoping that the post will get to the right person at Intel. I know that with limited resources, it may not be possible to create this functionality. But I'm also sure that I'm not the only one who would take advantage of it. So here are some things that I really wish MKL could do.
1. A digamma and polygamma function. These functions come up when working with the log densities of probability distributions that include beta and gamma functions. The digamma function is the first derivative of the log gamma function, and knowing that first derivative is helpful when computing the mode of certain distributions. These functions are available in GSL, but not MKL.
2. A symmetric rank-1 and rank-k update for the Cholesky decomposition of a matrix. These functions would be equivalent to the _syr and _syrk BLAS functions, but would take Cholesky decompositions of the A matrix as the input, and return the Cholesky of A as the output. Why is this useful? The fastest way to compute the density of, or draw from, a multivariate normal distribution is by using the Cholesky of the precision matrix to convert to/from a standard MVN. When this matrix changes repeatedly, I need to do the update, and then take the Cholesky decomposition over and over. This seems inefficient to me. I know that there are some algorithms out there that can do this, but it would be quite nice if MKL included that functionality.
3. Trust region algorithms that are more amenable to general nonlinear optimization, as opposed to solving a nonlinear least squares algorithm. All that would be necessary here would be some efficient code to solve the trust region subproblem, given a current estimate of a gradient, Hessian and trust radius. An example would be the Steihaug conjugate gradient algorithm.
Please let me know if requests like this should go elsewhere. Or, please let me know if there are existing solutions to these problems that I don't know about. But it seems to me that these three things would be of general interest, and could take advantage of optimizations of the Intel architecture in a way that most users would not be able to implement on their own.
My understanding is that the trust region solver is meant to solve nonlinear least squares problems. While one could think of general nonlinear optimization as a problem that minimizes the distance between the gradient and a zero vector, much of the nonlinear optimization literature is based on minimizing a scalar-valued function. Usually, a user needs to supply a function that evaluates a gradient as well (either analytically or numerically). Supplying the Hessian is often optimal, but not required. The way (I think) the MKL solver works is that the user must supply an analytical function for the gradient AND the Jacobian (which would be the Hessian in this case). But when Hessians are very expensive to compute, quasi-Newton methods (using BFGS or SR1 updates) are commonly used. The MKL trust region solver does not seem to support this paradigm.
In my ideal dream world, MKL would include a nonlinear optimizer that works similar to the ones in R or MATLAB: the user supplies a pointer to a function that evaluates the function and the gradient, and the MKL routine does the rest (including quasi-Newton Hessian approximations) A more flexible approach might be how MKL does NLS now: supplying separate functions for initialization, iteration, convergence checking, etc.
Why is there a need for this functionality to be in MKL? Many available nonlinear optimizers use line search methods (as in R and GSL). However, trust region methods tend to be more robust when Hessians are nearly singular, and in my problems, line search methods always seem to crash. The problem is that solving the trust region subproblem can be hard to implement efficiently. Also, most existing quasi-Newton methods use BFGS methods, which always give a positive-definite Hessian, even when the true Hessian isn't. SR1 updating (which is why I wanted the rank-1 update to the Cholesky of a matrix) does not have this restriction. Unfortunately, I have not been able to find an existing implementation of a trust region optimizer with SR1 updates. Therefore, I coded my own: the Steihaug conjugate-gradient trust region method with preconditioning in Ch. 7 of the Nocedal and Wright (2006) Nonlinear Optimization book.
So what routines might MKL want to supply? for starters, solving the separate trust region subproblem efficiently (using dogleg, Steihaug, Lanczos, or other methods), and computing BFGS, SR1, and L-BFGS updates for Hessians, inverse Hessians, and Choleskys of both. It would also me nice to have a straightforward routine for vectorized or parallel numerical differentiation (user supplies a pointer to a scalar function pointer, and MKL estimates a gradient vector at a particular point).
I could go on and on about this, but I hope this adds some more clarity.