Coordinate Descent algorithm with non-separable non-smooth term

Gusev__Dmitry · ‎06-11-2020

Hello,

Could someone please clarify whether the DAAL Coordinate Descent algorithm implementation works with a non-smooth function that is not necessarily additively separable?

The documentation linked below uses the L1 norm as the non-smooth part of the objective function.

https://software.intel.com/content/www/us/en/develop/documentation/daal-programming-guide/top/algorithms/analysis/optimization-solvers/iterative-solver/coordinate-descent-algorithm.html

I would like to use the Coordinate Descent algorithm for more general non-smooth functions that are not necessarily separable. Any thoughts?

Thanks.

Regards,

Dmitry

AthiraM_Intel · ‎06-12-2020

Hi,

We have forwarded this case to the subject-matter expert (SME), who will get back to you soon.

Thanks

Kirill_S_Intel · ‎06-16-2020

Hello Dmitry,

The current implementation of the Intel DAAL Coordinate Descent optimization solver requires the following results from the objective function:

a component of the gradient vector, computed for the smooth part of the objective function
a component of the Hessian matrix diagonal, computed for the smooth part of the objective function
the result of the proximal projection operator, to handle the non-smooth part

Coordinate descent (like the other solvers) requires the objective function in composite form, as described in the common section on iterative solvers:

https://software.intel.com/content/www/us/en/develop/documentation/daal-programming-guide/top/algorithms/analysis/optimization-solvers/iterative-solver.html
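To make the three ingredients above concrete, here is a minimal sketch in plain Python of a proximal coordinate descent loop for an L1-regularized least-squares objective. It is not the DAAL implementation or API: the function names, the choice of objective, and the smooth-part label F are assumptions made for illustration only. Each coordinate update uses exactly one gradient component, one Hessian diagonal component, and a scalar proximal step.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def coordinate_descent_lasso(X, y, lam, n_sweeps=100):
    """Minimize F(theta) + M(theta), where (illustrative choice)
    F(theta) = (1 / (2n)) * ||X theta - y||^2   (smooth part)
    M(theta) = lam * ||theta||_1                (separable non-smooth part).

    X is a list of n rows, each a list of p floats. Hypothetical helper,
    not part of any library.
    """
    n, p = len(X), len(X[0])
    theta = [0.0] * p
    residual = [-yi for yi in y]  # X theta - y at theta = 0
    for _ in range(n_sweeps):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            # j-th component of the gradient of the smooth part
            grad_j = sum(c * r for c, r in zip(col, residual)) / n
            # j-th diagonal entry of the Hessian of the smooth part
            hess_jj = sum(c * c for c in col) / n
            if hess_jj == 0.0:
                continue  # all-zero column: nothing to update
            # Newton-like step on coordinate j, then scalar proximal step
            new_tj = soft_threshold(theta[j] - grad_j / hess_jj, lam / hess_jj)
            delta = new_tj - theta[j]
            if delta != 0.0:
                residual = [r + c * delta for r, c in zip(residual, col)]
                theta[j] = new_tj
    return theta


theta = coordinate_descent_lasso([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=0.5)
```

The scalar proximal step is valid here only because the L1 norm is a sum of per-coordinate terms, so each coordinate can be handled on its own.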

Also, I'm not sure that the proximal projection operator itself is applicable to a non-separable representation of the objective function.

Sorry for the delayed response.

Best regards,

Kirill

Gusev__Dmitry · ‎06-17-2020

Thank you Kirill,

Could you also take a look at the following topic:

https://software.intel.com/en-us/forums/intel-data-analytics-acceleration-library/topic/856818

It looks to me like there could be a serious problem with the MSE. I really need your advice on whether this can be fixed, or whether there is a workaround for using SAGA with regularized MSE.

As for the current topic: the non-smooth part of the objective function, M(theta), does not in general have to be separable by coordinates (as the L1 norm or coordinate-wise box constraints are). The SAGA algorithm works perfectly well with non-separable non-smooth functions. However, it looks like DAAL coordinate descent cannot handle this "non-separable" scenario because it is not implemented as a block coordinate descent algorithm. It is probably worth reflecting that in the documentation.
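A small illustration of why separability matters for coordinate-wise updates, as a sketch in plain Python that does not use DAAL. The objective, the constant c, and all function names are made up for this example. The non-smooth term c * |x1 - x2| couples the two coordinates, and exact single-coordinate minimization started at (0, 0) never moves, even though the minimum is at (1, 1).

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def objective(x1, x2, c=3.0):
    """Smooth quadratic plus a non-separable non-smooth term c * |x1 - x2|."""
    return 0.5 * ((x1 - 1.0) ** 2 + (x2 - 1.0) ** 2) + c * abs(x1 - x2)


def exact_coordinate_step(other, c=3.0):
    """Return argmin over z of 0.5 * (z - 1)^2 + c * |z - other|.

    Substituting u = z - other reduces this to scalar soft-thresholding.
    """
    return other + soft_threshold(1.0 - other, c)


# Cyclic exact coordinate minimization, started at the origin.
x1, x2 = 0.0, 0.0
for _ in range(50):
    x1 = exact_coordinate_step(x2)
    x2 = exact_coordinate_step(x1)

stuck_value = objective(x1, x2)     # value where coordinate descent stalls
best_value = objective(1.0, 1.0)    # objective is >= 0, and 0 is attained here
print((x1, x2), stuck_value, best_value)
```

Moving both coordinates together along x1 = x2 does decrease the objective, which is why a block update (or a solver that applies the proximal operator to the full vector) can make progress where single-coordinate updates cannot.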

Best regards,

Dmitry

Shailen_Sobhee · ‎06-19-2020

Hi Dmitry,

We will get back to you on the other thread you linked.

Kind regards,

Shailen