Could someone please clarify whether the DAAL Coordinate Descent implementation works with a non-smooth function that is not necessarily additively separable?
The example below uses the L1 norm as the non-smooth part of the objective function.
I would like to use the Coordinate Descent algorithm for more general non-smooth functions that are not necessarily separable. Any thoughts?
The current implementation of the Intel DAAL Coordinate Descent optimization solver requires the following results from the objective function:
- the component of the gradient vector computed for the smooth part of the objective function
- the component of the Hessian matrix diagonal computed for the smooth part of the objective function
- the result of the proximal projection operator, which handles the non-smooth part
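For the L1 case, here is roughly how those three quantities fit together in one coordinate update. This is a minimal sketch in plain Python, not the DAAL API: the function names, the MSE smooth part with 1/(2n) scaling, and the cyclic sweep order are all assumptions made for illustration.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |x| for a single coordinate."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def coordinate_descent_lasso(X, y, lam, n_sweeps=100):
    """Hypothetical helper: minimize (1/(2n)) * ||X theta - y||^2 + lam * ||theta||_1."""
    n, p = len(X), len(X[0])
    theta = [0.0] * p
    residual = [-yi for yi in y]  # X theta - y, with theta starting at zero
    for _ in range(n_sweeps):
        for j in range(p):
            # Gradient component of the smooth part for coordinate j.
            grad_j = sum(X[i][j] * residual[i] for i in range(n)) / n
            # Diagonal Hessian component of the smooth part for coordinate j.
            hess_jj = sum(X[i][j] ** 2 for i in range(n)) / n
            if hess_jj == 0.0:
                continue  # an all-zero column cannot change the objective
            # Newton-like step on the smooth part, then the prox of the L1 term.
            new_theta_j = soft_threshold(theta[j] - grad_j / hess_jj, lam / hess_jj)
            delta = new_theta_j - theta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] += X[i][j] * delta
                theta[j] = new_theta_j
    return theta
```

Note that the prox step only ever sees the single coordinate being updated, which is exactly what a coordinate-separable non-smooth term allows.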
Coordinate descent (like the other solvers) requires the objective function in the composite form described in the common section on iterative solvers.
Also, I'm not sure that the proximal projection operator itself is applicable to a non-separable representation of the objective function.
Sorry for the long response.
Thank you Kirill,
- Could you also take a look at the following topic?
It looks to me like there could be a serious problem with the MSE. I really need your advice on whether this can be fixed or whether there is a workaround for
using SAGA with regularized MSE.
- As for the current topic: the non-smooth part of the objective function, M(theta), generally does not have to be separable by coordinates (the way the L1 norm or coordinate-wise box constraints are). The SAGA algorithm works perfectly fine with non-separable non-smooth functions. However, it looks like DAAL coordinate descent cannot handle this "non-separable" scenario because it is not implemented as a block coordinate descent algorithm. It is probably worth reflecting that in the documentation.
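To make the separable versus non-separable distinction concrete, here is a small sketch in plain Python (not DAAL code; `prox_l1` and `prox_l2` are hypothetical names). It compares the proximal operator of the L1 norm, which acts on each coordinate independently, with that of the Euclidean (L2) norm, which is a standard non-separable example and needs the whole vector at once.

```python
import math


def prox_l1(v, t):
    """Prox of t * ||v||_1: soft-thresholding applied to each coordinate on its own."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def prox_l2(v, t):
    """Prox of t * ||v||_2: shrinks the whole vector, so every output depends on every input."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= t:
        return [0.0] * len(v)
    scale = 1.0 - t / norm
    return [scale * x for x in v]


# Change only the second coordinate of the input and watch the first output.
a_l1, b_l1 = prox_l1([3.0, 4.0], 1.0), prox_l1([3.0, 0.0], 1.0)
a_l2, b_l2 = prox_l2([3.0, 4.0], 1.0), prox_l2([3.0, 0.0], 1.0)
print(a_l1[0], b_l1[0])  # first coordinate is unaffected for L1
print(a_l2[0], b_l2[0])  # first coordinate changes for L2
```

With the L1 norm the first output coordinate stays the same when the second input changes, so a one-coordinate-at-a-time update is well defined. With the L2 norm it does not, which is why a non-separable term would need a block update or a full-vector prox as used in SAGA.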