Beginner

Constrained optimization with DAAL


Hello all,

Can someone help me answer my question: is DAAL suitable for convex constrained optimization?

As stated in the article (https://software.intel.com/en-us/daal-programming-guide-objective-function), the proximal operator there can be used for the non-smooth part of the objective function, and the example (https://software.intel.com/en-us/daal-programming-guide-logistic-loss) shows this for L1 regularization. On the other hand, if the non-smooth part M(theta) is just an indicator (characteristic) function of some convex set (the constraints), the proximal operator is exactly the projection operator.

Is it possible to pass this projection operator to the objective function object to handle convex constraints in that way?
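For reference (my own summary of the standard definitions, not taken from the DAAL guide), the proximal operator of a function M and its specialization to an indicator function are

```latex
\operatorname{prox}_{\eta M}(x) = \arg\min_{u} \Big( M(u) + \tfrac{1}{2\eta}\,\lVert u - x\rVert_2^2 \Big),
\qquad
\operatorname{prox}_{\eta I_C}(x) = \arg\min_{u \in C} \lVert u - x\rVert_2^2 = \Pi_C(x),
```

so for an indicator function the prox step is exactly the Euclidean projection Π_C onto C.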

Thanks! Your help is much appreciated,

Dmitry.

Accepted Solutions
Employee

Hello Dmitry,

Please see some comments regarding #25: you should modify your custom_proj.h (from #23) with only one line of code, sumOfFunctionsParameter = &par;, in the initialize() method.

If you delete the field "Parameter par;" in the Batch class, you have to set the _par and sumOfFunctionsParameter pointers to newly allocated memory for logistic_prox_function::Parameter:

sumOfFunctionsParameter = new logistic_prox_function::Parameter(numberOfTerms);
_par = sumOfFunctionsParameter;

#26 

1) Inheritance from logistic_loss::Batch was advised as the simplest way to re-use the kernel of the logistic loss function. But you can also follow the existing example (optimization_solvers/custom_obj_func.h) to create your own objective function and implement compute() in alignment with the logistic loss function without inheritance (though it would then be harder to use the same internal functions, like xgemv).

2) Modifying the SAGA optimization solver flow is not recommended. The DAAL optimization solvers use the existing sum_of_functions API. The Lipschitz constant is already computed via resultsToCompute. Also, the change of the argument and its reversion are performed on the solver side, and you have to take this into account when implementing your proximal projection (all components of the argument are divided by 'stepLength' before the proximal projection computation is called for this modified argument).

 

Best regards,

Kirill

 

31 Replies

While DAAL provides some set of objective functions out of the box, it also supports user-provided objective functions. So, if you need to use an optimization solver with a custom objective function, you should implement that objective function and pass it into the appropriate optimization solver.

If your objective function is similar to one that DAAL provides, you can use inheritance to simplify the implementation.

Please also note that not every optimization solver supports objective functions with a non-smooth part.

Could you please provide some details about your specific projection operator, including the kind of constraints you use?

Beginner

Hi Mikhail,

Thanks for the reply. 

My optimization problem is as follows:

f(x) -> min 

with convex constraints C = { x | A*x >= 0 }, where A is a matrix and x ∈ R^n.

The target function f is one of DAAL's out-of-the-box functions (logistic regression).

The tricky part is the constraints. My idea was to introduce the non-smooth part as an indicator function of the set C, specifically

I_C(x) = { 0 if x ∈ C, +infinity otherwise },

so that the original problem reduces to the unconstrained optimization problem

f(x) + I_C(x) -> min

The proximal operator prox_η (https://software.intel.com/en-us/daal-programming-guide-objective-function) of I_C(x) is just the Euclidean projection onto the convex set C. I can implement that projection as a prox : R^n -> R^n map.

The question is: is it possible to pass or override the proximal operator (the projection, in my case) within an optimization solver?

If this approach does not work, what would be another way to solve the constrained problem f(x) -> min, x ∈ C, with DAAL?

Thanks and regards,

Dmitry

Employee

Hi, Dmitry.

With the current DAAL API it is impossible to pass or override the proximal projection within a DAAL solver (or within any DAAL-implemented objective function).

As Mikhail pointed out above, you should create your own custom objective function with the proximal projection you need (an example of creating a custom objective function can be found in src/examples/cpp/source/optimization_solvers/custom_obj_func.h).

You can also simplify this task by inheriting from the daal::algorithms::optimization_solver::logistic_loss::BatchContainer and daal::algorithms::optimization_solver::logistic_loss::Batch classes. You need to override the Batch::initialize() method to create your inherited BatchContainer, and implement BatchContainer::compute() similarly to the LogLossKernel<algorithmFPType, method, cpu>::doCompute method (see src/algorithms/kernel/objective_function/logistic_loss/logistic_loss_dense_default_batch_impl.i).

Your own implementation of the 'compute' method can stay the same for value/hessian/gradient/lipschitzConstant, but the proximal projection computation can be implemented exactly as you need (the DAAL proximal projection can be found at src/algorithms/kernel/objective_function/logistic_loss/logistic_loss_dense_default_batch_impl.i:145).

Beginner

Hi Kirill,

Thanks for the information.

Could you please clarify how to change the implementation of the compute() function in custom_obj_func.h to add the projection? I tried to do it exactly as for the gradient computation, but it did not work.

Thanks, I really appreciate your help.

Dmitry.

 

Employee

Please have a look at src/algorithms/kernel/objective_function/logistic_loss/logistic_loss_dense_default_batch_impl.i:145.

You should retrieve the 'parameter->resultsToCompute' flag, as is done for the gradient and value flags, and get the allocated result NumericTable with proximalProjectionIdx exactly as in custom_obj_func.h:340.

Also, please be aware that the proximal operator does not take the 'step' η parameter. The optimization solver modifies (by division) the input argument and then reverts it (src/algorithms/kernel/optimization_solver/saga/saga_dense_default_impl.i:295).
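A side note (my own observation, not from the DAAL docs): for an indicator function the step parameter cancels out of the definition, so the missing η is harmless for a pure projection:

```latex
\operatorname{prox}_{\eta I_C}(x)
  = \arg\min_{u \in C} \tfrac{1}{2\eta}\,\lVert u - x\rVert_2^2
  = \Pi_C(x)
  \quad \text{for every } \eta > 0.
```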

Best regards,

Kirill

Beginner

Thank you Kirill,

Following the suggested logic, I have added the following code at the end of the compute() function in the custom_obj_func.h file:

const bool projFlag = ((parameter->resultsToCompute &
    daal::algorithms::optimization_solver::objective_function::proximalProjection) != 0);

if (projFlag)
{
    daal::data_management::NumericTable* projTable =
        result->get(daal::algorithms::optimization_solver::objective_function::proximalProjectionIdx).get();

    daal::data_management::BlockDescriptor<algorithmFPType> projBlock;
    projTable->getBlockOfRows(0, p, writeOnly, projBlock);
    algorithmFPType* proj = projBlock.getBlockPtr();

    // Projection onto the parameter set {theta | theta[0] <= 0}
    for (size_t j = 0; j < p; j++)
    {
        proj[j] = theta[j];
    }
    if (theta[0] > 0)
    {
        proj[0] = 0;
    }
    projTable->releaseBlockOfRows(projBlock);
}

It is just a simple projection onto the half-space {theta | theta[0] <= 0}.

In the file sgd_custom_obj_func_dense_batch.cpp I added the line

customObjectiveFunction->parameter.resultsToCompute =
    optimization_solver::objective_function::gradient | optimization_solver::objective_function::proximalProjection;

I ran the updated program sgd_custom_obj_func_dense_batch and still got the same result despite the constraints.

Specifically,

Minimum:
0.970
0.728
0.828
0.944
1.016

Do you have any other suggestions? Does it mean that the sum_of_functions type of objective function just ignores projections?

Thank you very much for your time and support,

Best regards,

Dmitry

Employee

The same result was obtained because the SGD solver does not support the proximal projection, so only the smooth part of the function is taken into account.

Currently, only the SAGA solver supports the proximal projection.

 

Best regards,

Kirill

Beginner

Thank you Kirill,

With the SAGA solver I can see that the constraints are taken into account; however, I cannot verify the numerical results, because even without my changes the DAAL SGD optimization examples for logistic regression return unexpected results. I created a new topic for that:

https://software.intel.com/en-us/forums/intel-data-analytics-acceleration-library/topic/816872

If you could take a look at that, it would be very helpful and highly appreciated.

Regards,

Dmitry.

 

Employee

Hello Dmitry,

An explanation and recommendations were provided in the new topic.

 

Best regards,

Kirill

Beginner

Hi Kirill,

Thank you very much for the explanation.

The SAGA algorithm works much faster than fixed-step-size SGD and, as you mentioned, should support the proximal projection.

However, I still have a problem with the inheritance from the daal::algorithms::optimization_solver::logistic_loss::BatchContainer and Batch classes.

My implementation of the derived classes is attached (custom_proj.h).

The problem is that the sagaAlgorithm.compute() call in the saga_custom_proj.cpp file does not invoke my dummy implementation of the compute() function (from custom_proj.h) below:

template<typename algorithmFPType, logistic_loss::Method method, CpuType cpu>
daal::services::Status BatchContainer<algorithmFPType, method, cpu>::compute()
{
    return daal::services::Status();
}

It looks like it calls the base class compute() method instead, since the algorithm returns correct values. That prevents me from customizing the compute method.

I am also not sure about my implementation of the initialize() function below (should the CPU type be a template parameter?):

void initialize()
{
    Analysis<batch>::_ac = new BatchContainer<algorithmFPType, method, CpuType::sse2>(&_env);
    _in = &input;
    _par = sumOfFunctionsParameter;
}

Can you please advise what is wrong with the attached implementation?

Thank you very much. I really appreciate your time and support.

Best regards,

Dmitry.

Employee

Hi Dmitry,

Since DAAL optimization solvers 'clone' the objective functions inside the kernels, it looks like logistic_loss::Batch::cloneImpl() is called instead of logistic_prox_function::Batch::cloneImpl(), because the redefinition was not provided. As far as I can see, your 'compute' method is called only for the gradient computation. Please provide a definition for 'cloneImpl' first.

 

Best regards,

Kirill

Beginner

Thank you, Kirill !

You are right: after implementing logistic_prox_function::Batch::cloneImpl(), the compute() function gets called.

My implementation of the compute function is attached. After fixing the majority of the compilation errors, I still have eight of them, related to some basic functions in the Math namespace, which is not accessible to me ('Math': is not a member of 'daal::internal'). Those functions are slightly different from the similar ones in the mkl namespace (vexp vs vExp, etc.).

I have attached the list of the errors.

Can you please advise what I should do to fix those compilation errors?

 

Thanks,

Dmitry.

 

Beginner

Hi Kirill,

I was able to fix the compilation errors and started testing the implemented logistic_prox_function with the SAGA algorithm.

Since I just mimic the doCompute method from logistic_loss_dense_default_batch_impl.i, I would expect the same results as for the out-of-the-box logistic_loss function. However, the results I am getting are different and therefore wrong.

The problem is that result->get(objective_function::valueIdx).get() and all other similar calls (except result->get(objective_function::lipschitzConstantIdx).get()) return null, despite this statement in the main function:

customObjectiveFunction->parameter().resultsToCompute = objective_function::gradient | objective_function::proximalProjection
    | objective_function::value | objective_function::lipschitzConstant;

I attached my implementation of logistic_prox_function in custom_proj.h.

The saga_custom_proj.cpp file contains the main function.

Can you please advise what is wrong with my implementation of the compute() function?

 

Thanks a lot!

Dmitry.

Beginner

Hi Kirill,

Slowly making progress: I was able to resolve the issue with the population of value, gradient, nonSmoothTermValue, proximalProjection, and lipschitzConstant (see attached).

The problem is that I follow your instructions on the implementation and am still not getting valid test results.

Specifically, the batchIndices variable is not being populated.

Could you or any of your colleagues please get back to me with recommendations on how to customize an objective function for the SAGA algorithm using the DAAL library?

I have attached my implementation of the custom function along with the test driver calling saga algorithm for the custom objective function.

I highly appreciate your feedback.

Thanks a lot!

Regards,

Dmitry.

Employee

Hello, Dmitry

The code was simplified and reworked, and now the result obtained via the custom logistic_prox_function::Batch is exactly the same as with the default logistic loss function.

 

Best regards,

Kirill

 

Employee

Also, I'd like to remind you that batch indexes are not that useful for the SAGA solver: only one training sample is used on each iteration (batchSize = 1).

Best regards,

Kirill

Beginner

Hi Kirill !

Thank you - I really appreciate your help and support.

I took the new implementation you provided.

I have two problems with this implementation that I need to address:

1. I am not sure how to define the defaultKernel variable in the fragment below; I cannot compile it otherwise.

if (proximalProjection)
{
    /*
     * Do the proximal projection here, taking into account that the SAGA solver
     * divides the argument to handle the step size in the proximal projection
     * (see the implementation in the logistic loss kernel as an example!)
     */
}
else
{
    return defaultKernel->compute(input->get(logistic_loss::data).get(), input->get(logistic_loss::dependentVariables).get(),
                                  input->get(logistic_loss::argument).get(), value, hessian, gradient, nonSmoothTermValue,
                                  proximalProjection, lipschitzConstant, parameter);
}

I tried to use

_DAAL_CALL_KERNEL(env, internal::LogLossKernel, __DAAL_KERNEL_ARGUMENTS(algorithmFPType, method),
    compute, input->get(logistic_loss::data).get(), input->get(logistic_loss::dependentVariables).get(), input->get(logistic_loss::argument).get(),
    value, hessian, gradient, nonSmoothTermValue, proximalProjection, lipschitzConstant, parameter);

instead of the return statement, and received a linking error.

2. What should I return in the fragment below after the projection implementation?

if (proximalProjection)
{
    /*
     * Do the proximal projection here, taking into account that the SAGA solver
     * divides the argument to handle the step size in the proximal projection
     * (see the implementation in the logistic loss kernel as an example!)
     */

    return ???;
}

Can you please advise?

Thanks!

Regards,

Dmitry.

 

 

Beginner

Hi Kirill !

I found a possible solution to the problem I mentioned in the previous post; specifically:

#include "logistic_loss_dense_default_batch_impl.i"

if (proximalProjection)
{
    /*
     * Do the proximal projection here, taking into account that the SAGA solver
     * divides the argument to handle the step size in the proximal projection
     * (see the implementation in the logistic loss kernel as an example!)
     */

    algorithmFPType* b;
    HomogenNumericTable<algorithmFPType>* hmgBeta = dynamic_cast<HomogenNumericTable<algorithmFPType>*>(input->get(logistic_loss::argument).get());
    b = hmgBeta->getArray();

    algorithmFPType* prox;
    HomogenNumericTable<algorithmFPType>* hmgProx = dynamic_cast<HomogenNumericTable<algorithmFPType>*>(proximalProjection);
    prox = hmgProx->getArray();
    int nBeta = proximalProjection->getNumberOfRows();

    for (int i = 0; i < nBeta; i++)
    {
        prox[i] = b[i];
    }
    if (b[0] <= b[1])
    {
        prox[0] = (b[0] + b[1]) / 2.0;     // Simple projection onto the half-plane y >= x
        prox[1] = (b[0] + b[1]) / 2.0;
    }

    return daal::services::Status();
}
else
{
    daal::services::Environment::env& env = *_env;
    __DAAL_CALL_KERNEL(env, logistic_loss::internal::LogLossKernel, __DAAL_KERNEL_ARGUMENTS(algorithmFPType, method),
        compute, input->get(logistic_loss::data).get(), input->get(logistic_loss::dependentVariables).get(), input->get(logistic_loss::argument).get(),
        value, hessian, gradient, nonSmoothTermValue, proximalProjection, lipschitzConstant, parameter);
}

The code generates correct results, at least for some simple cases.

Does this code sound right to you? Any suggestions?

Thanks,

Dmitry.

 

Employee

Hello, Dmitry

In the provided example, the "if (proximalProjection)" branch was added only as an example; to get the same results as the default function it should, of course, be commented out. The default kernel can be created without including "logistic_loss_dense_default_batch_impl.i" (see the attached code).

Best regards,

Kirill
