Can anyone please help verify the Lipschitz constant formula at the bottom of the page https://software.intel.com/en-us/daal-programming-guide-logistic-loss ?
Currently, the Developer Guide specifies the upper bound for the logistic loss Lipschitz constant as max_i ||x_i||_2 + λ_2/n.
However, it can be shown that a (rough) upper bound on the spectral norm of the logistic loss Hessian is rather max_i ||x_i||_2^2 + λ_2.
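To make the two expressions concrete, here is a small NumPy sketch (toy data; all names are illustrative, not DAAL API) that evaluates both the bound as currently printed in the guide and the rough Hessian-based bound:

```python
import numpy as np

# Toy data: n samples, p features (illustrative only).
rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.standard_normal((n, p))
lambda2 = 0.1  # L2 regularization strength

row_norms_sq = np.sum(X * X, axis=1)  # ||x_i||_2^2 for each sample

# Bound as currently printed in the guide: max ||x_i||_2 + lambda2 / n
bound_guide = np.sqrt(row_norms_sq.max()) + lambda2 / n

# Rough Hessian spectral-norm bound discussed above: max ||x_i||_2^2 + lambda2
bound_rough = row_norms_sq.max() + lambda2

print(bound_guide, bound_rough)
```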
Your clarification is much appreciated.
Thanks for reporting this! It seems that instead of max_i ||x_i||_2 we should use max_i ||x_i||_2^2; it was a misprint.
The scaling of λ_2 by n was done to align with the scikit-learn regularization penalty value. (https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/_sag.py#L78)
Note: the Lipschitz constant is used only by the SAGA algorithm, to compute the optimal step size.
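For illustration, one common way such a constant feeds into a step size is step = 1/L. A hedged sketch (the exact expression DAAL or sklearn uses may differ, e.g. with an extra 1/4 factor from the logistic loss curvature; the function name here is made up):

```python
import numpy as np

def saga_step_size(X, lambda2):
    """Rough step size 1/L, where L = max_i ||x_i||_2^2 + lambda2 is the
    (rough) Lipschitz bound discussed in this thread. Illustrative only;
    the exact formula used by DAAL/sklearn may include extra factors."""
    L = np.max(np.sum(X * X, axis=1)) + lambda2
    return 1.0 / L

X = np.array([[1.0, 2.0], [3.0, 0.0]])
step = saga_step_size(X, lambda2=0.5)
print(step)  # 1 / (9 + 0.5) ≈ 0.105
```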
Thanks for getting back to me.
A couple of thoughts about the Lipschitz constant:
I would like to know your opinion, since I am implementing a custom objective function for the SAGA algorithm and am looking for high performance with large data sets.
Currently, the algorithm supports the 'learningRateSequence' parameter (https://software.intel.com/en-us/daal-programming-guide-computation-13). If you see performance improvements with another step size, the DAAL library provides the option to take advantage of this. For conformance with the sklearn library, we can't change the automatic step selection; however, the 'learningRateSequence' parameter is provided for that purpose.
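For intuition about where a step-size sequence plugs in, here is a minimal, self-contained SAGA loop for L2-regularized logistic regression in plain NumPy. This is not the DAAL implementation and does not use its API; it is a sketch in which 'step' may be a constant or a per-epoch callable, mimicking the idea behind 'learningRateSequence':

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def saga_logistic(X, y, lambda2, n_epochs=50, step=None, seed=0):
    """Minimal SAGA for L2-regularized logistic regression (y in {0, 1}).
    'step' may be a float or a callable epoch -> float. Sketch only."""
    n, p = X.shape
    w = np.zeros(p)
    grads = np.zeros((n, p))  # table of last-seen per-sample gradients
    avg = np.zeros(p)         # running average of the gradient table
    if step is None:
        # Default step 1/L with the rough bound from this thread.
        step = 1.0 / (np.max(np.sum(X * X, axis=1)) + lambda2)
    rng = np.random.default_rng(seed)
    for epoch in range(n_epochs):
        gamma = step(epoch) if callable(step) else step
        for i in rng.permutation(n):
            g_new = (sigmoid(X[i] @ w) - y[i]) * X[i] + lambda2 * w
            # SAGA update uses the old stored gradient and the old average.
            w -= gamma * (g_new - grads[i] + avg)
            avg += (g_new - grads[i]) / n
            grads[i] = g_new
    return w

# Linearly separable toy problem.
X = np.array([[1.0, 1.0], [2.0, 1.5], [-1.0, -1.0], [-2.0, -0.5]])
y = np.array([1.0, 1.0, 0.0, 0.0])
w = saga_logistic(X, y, lambda2=0.1)
print(np.round(sigmoid(X @ w)))  # predicted labels for the four points
```

Passing a callable such as `step=lambda epoch: 0.2 / (1 + epoch)` would emulate a decaying learning-rate sequence.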
As for the source of the scaling of the λ term by n, I think it makes sense.