Intel® oneAPI Data Analytics Library

Unexpected results for logistic regression coefficients in the training stage

Gusev__Dmitry
Beginner
3,589 Views

Hello,

I have a couple of questions regarding the results of the DAAL logistic regression algorithm in the training stage.

1. For the data set below, the DAAL algorithm generates the coefficients -1.389 and 0.235, while the expected results are -4.3578 and 0.6622.

Could someone please clarify why the beta coefficients for logistic regression in the training stage are not even close to the expected results?

Here is the data set:

1,0
2,0
3,0
4,0
5,1
6,0
7,1
8,0
9,1
10,1

Here is the corresponding DAAL code:

        training::Batch<> trainAlgorithm(2);

        trainAlgorithm.input.set(classifier::training::data, data);
        trainAlgorithm.input.set(classifier::training::labels, dependentVariables);

        trainAlgorithm.parameter().penaltyL1 = 0.0f;
        trainAlgorithm.parameter().penaltyL2 = 0.0f;
        trainAlgorithm.parameter().interceptFlag = true;
        trainAlgorithm.parameter().nClasses = 2;
        trainAlgorithm.compute();

        training::ResultPtr trainingResult = trainAlgorithm.getResult();
        logistic_regression::ModelPtr modelptr = trainingResult->get(classifier::training::model);

        NumericTablePtr beta = modelptr->getBeta();

2. The second question is: how do I specify the optimizationSolver parameter to switch from the default SGD solver to the LBFGS solver?

Thank you, your help is highly appreciated.

Regards,

Dmitry

 

1 Solution
Kirill_S_Intel
Employee
3,589 Views

Hello Dmitry,

Thanks for your question, I'm glad to help.

The solution is inaccurate because of the coarse accuracyThreshold, which is 1e-4 by default. In your case the SGD momentum algorithm reaches the maximum number of iterations, which is why the result is not the exact minimum point.

So if we keep the same accuracyThreshold, we should increase the number of iterations:

    auto solver = optimization_solver::sgd::Batch<float, optimization_solver::sgd::momentum>::create();
    const size_t nIterations      = 40000;
    const float learningRate      = 1e-3;
    const float accuracyThreshold = 1e-4; // reduce it further to get a closer solution
    solver->parameter.learningRateSequence = HomogenNumericTable<float>::create(1, 1, NumericTable::doAllocate, learningRate);
    solver->parameter.accuracyThreshold    = accuracyThreshold;
    solver->parameter.nIterations          = nIterations;
    solver->parameter.batchSize            = nRows; // 10

    trainAlgorithm.parameter().optimizationSolver = solver; // set optimization solver

    ...

    trainAlgorithm.compute();
    printNumericTable(solver->getResult()->get(optimization_solver::iterative_solver::nIterations), "Number of iterations performed:");

    printNumericTable(modelptr->getBeta(), "Logistic Regression coefficients:");
 

We get the following results:

Number of iterations performed:
30024.000

Logistic Regression coefficients:
-4.327    0.658

To get a more exact minimum point, we can use the 'double' FPType for the input table, for the created solver, and for the created algorithm.
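To illustrate why so many iterations are needed at this learning rate, here is a minimal sketch in plain C++ (standard library only, not DAAL code). It runs full-batch gradient descent with a heavy-ball momentum update on the mean log-loss. The names GdResult, meanLogLoss and runMomentumGd are hypothetical, and the momentum value 0.9 and the exact update rule are assumptions rather than a description of what DAAL does internally.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct GdResult
{
    double b0;   // intercept
    double b1;   // slope
    double loss; // mean log-loss at (b0, b1)
};

// Mean log-loss of the model p = sigmoid(b0 + b1 * x).
double meanLogLoss(const std::vector<double>& x, const std::vector<double>& y, double b0, double b1)
{
    assert(x.size() == y.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        const double z = b0 + b1 * x[i];
        sum += std::log1p(std::exp(z)) - y[i] * z; // equals -[y log p + (1 - y) log(1 - p)]
    }
    return sum / static_cast<double>(x.size());
}

// Full-batch gradient descent with heavy-ball momentum (assumed update rule):
//   v <- momentum * v + learningRate * gradient;  b <- b - v
GdResult runMomentumGd(const std::vector<double>& x, const std::vector<double>& y,
                       std::size_t nIterations, double learningRate = 1e-3, double momentum = 0.9)
{
    assert(x.size() == y.size());
    const double n = static_cast<double>(x.size());
    double b0 = 0.0, b1 = 0.0, v0 = 0.0, v1 = 0.0;
    for (std::size_t it = 0; it < nIterations; ++it)
    {
        double g0 = 0.0, g1 = 0.0; // gradient of the mean log-loss
        for (std::size_t i = 0; i < x.size(); ++i)
        {
            const double p = 1.0 / (1.0 + std::exp(-(b0 + b1 * x[i])));
            g0 += (p - y[i]) / n;
            g1 += (p - y[i]) * x[i] / n;
        }
        v0 = momentum * v0 + learningRate * g0;
        v1 = momentum * v1 + learningRate * g1;
        b0 -= v0;
        b1 -= v1;
    }
    return {b0, b1, meanLogLoss(x, y, b0, b1)};
}
```

Calling runMomentumGd on the toy data set (x = 1..10 with the labels from the first post) with a small iteration budget leaves the intercept far from -4.3578, while a very large budget brings both coefficients close to the expected values. The intercept direction is poorly conditioned for this data, so a small fixed step makes slow progress along it.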

 

Here is an example of how to use the L-BFGS solver:

    auto solver = optimization_solver::lbfgs::Batch<double>::create();
    const size_t nIterations       = 20;
    const double accuracyThreshold = 1e-9;
    solver->parameter.accuracyThreshold = accuracyThreshold;
    solver->parameter.nIterations       = nIterations;
    solver->parameter.batchSize         = nRow; // 10

    solver->parameter.L = 1;
    solver->parameter.correctionPairBatchSize = nRow; // 10

    /* Create an algorithm object to train the logistic regression model */
    training::Batch<double> algorithm(nClasses);

    /* Pass a training data set and dependent values to the algorithm */
    algorithm.input.set(classifier::training::data, trainData);
    algorithm.input.set(classifier::training::labels, trainDependentVariable);

    /* Set the optimization solver */
    algorithm.parameter().optimizationSolver = solver;

This gives the following results:

Number of iterations performed:
15

Logistic Regression coefficients:
-4.35787  0.66223
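As an independent cross-check of these coefficients, the toy problem is small enough to solve directly. The following is a minimal sketch in plain C++ (standard library only, not DAAL code) that fits the intercept and slope by Newton's method on the unpenalized log-likelihood. The function names fitLogisticNewton and fitToyData are hypothetical, and Newton's method is swapped in here only as a reference solver; it is not what DAAL runs.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Fits y ~ sigmoid(b0 + b1 * x) by Newton's method (no regularization).
// Returns {b0, b1}. Reference sketch only, unrelated to DAAL internals.
std::array<double, 2> fitLogisticNewton(const std::vector<double>& x,
                                        const std::vector<double>& y,
                                        int maxIter = 50, double tol = 1e-12)
{
    assert(x.size() == y.size());
    double b0 = 0.0, b1 = 0.0;
    for (int it = 0; it < maxIter; ++it)
    {
        // Gradient (g) and Hessian (h) of the negative log-likelihood.
        double g0 = 0.0, g1 = 0.0, h00 = 0.0, h01 = 0.0, h11 = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i)
        {
            const double p = 1.0 / (1.0 + std::exp(-(b0 + b1 * x[i])));
            const double w = p * (1.0 - p);
            g0 += p - y[i];
            g1 += (p - y[i]) * x[i];
            h00 += w;
            h01 += w * x[i];
            h11 += w * x[i] * x[i];
        }
        // Solve the 2x2 system H * d = g with the closed-form inverse.
        const double det = h00 * h11 - h01 * h01;
        const double d0 = (h11 * g0 - h01 * g1) / det;
        const double d1 = (h00 * g1 - h01 * g0) / det;
        b0 -= d0;
        b1 -= d1;
        if (std::fabs(d0) + std::fabs(d1) < tol) break; // step is negligible
    }
    return {b0, b1};
}

// Toy data set from the first post: x = 1..10 with binary labels.
std::array<double, 2> fitToyData()
{
    const std::vector<double> x = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    const std::vector<double> y = {0, 0, 0, 0, 1, 0, 1, 0, 1, 1};
    return fitLogisticNewton(x, y);
}
```

fitToyData() should agree with the expected values quoted in the question (about -4.3578 and 0.6622), which supports reading the L-BFGS output above as the true minimum rather than another approximation.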

Because we set batchSize and correctionPairBatchSize equal to nRow and the L parameter to 1, the optimization is not stochastic and uses automatic step selection, so it needs fewer iterations.
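The difference between the full-batch and the stochastic setting can be seen by evaluating the gradient at the solution itself. The sketch below is plain C++ (standard library only, not DAAL code); meanGradient and gradientNorm are hypothetical helper names. It computes the mean log-loss gradient over a chosen range of rows, so the same point can be tested against all 10 rows or against a 5-row mini-batch.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean gradient of the log-loss for p = sigmoid(b0 + b1 * x),
// computed over the rows [begin, end) only.
std::array<double, 2> meanGradient(const std::vector<double>& x, const std::vector<double>& y,
                                   double b0, double b1, std::size_t begin, std::size_t end)
{
    assert(x.size() == y.size());
    assert(begin < end && end <= x.size());
    double g0 = 0.0, g1 = 0.0;
    for (std::size_t i = begin; i < end; ++i)
    {
        const double p = 1.0 / (1.0 + std::exp(-(b0 + b1 * x[i])));
        g0 += p - y[i];
        g1 += (p - y[i]) * x[i];
    }
    const double n = static_cast<double>(end - begin);
    return {g0 / n, g1 / n};
}

// Euclidean norm of a two-component gradient.
double gradientNorm(const std::array<double, 2>& g)
{
    return std::sqrt(g[0] * g[0] + g[1] * g[1]);
}
```

At the coefficients reported above (-4.35787, 0.66223), the gradient over all 10 rows is essentially zero, while the gradient over only the first 5 rows is not. A solver that sees only mini-batches therefore keeps being pushed away from the minimum and relies on a small or decaying step length and more iterations to settle.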

 

Best regards,

Kirill

Gusev__Dmitry
Beginner
3,597 Views

Hi Kirill,

Thanks for the clarification. I have some more cases that I would like to clarify:

1) I am getting the wrong result below with the toy data set from the previous post and with trainAlgorithm.parameter().penaltyL2 = 0.000125;

Logistic Regression coefficients column vector:
-73838.070
10535.673 

lbfgs optimal value: 29.492

However, with trainAlgorithm.parameter().penaltyL2 = 0.0005; the result is correct.

My code is below:

        auto solver = optimization_solver::lbfgs::Batch<float>::create();

        solver->parameter.accuracyThreshold       = accuracyThreshold;
        solver->parameter.nIterations             = nIterations;
        solver->parameter.batchSize               = 10;
        solver->parameter.L                       = 1;
        solver->parameter.correctionPairBatchSize = 100;
        //const float learningRate      = 1e-3;
        //solver->parameter.stepLengthSequence  = HomogenNumericTable<float>::create(1, 1, NumericTable::doAllocate, learningRate);

        training::Batch<> trainAlgorithm(2);

        trainAlgorithm.input.set(classifier::training::data, data);
        trainAlgorithm.input.set(classifier::training::labels, dependentVariables);
        trainAlgorithm.parameter().optimizationSolver = solver;

        trainAlgorithm.parameter().penaltyL1 = 0.0f;
        trainAlgorithm.parameter().penaltyL2 = 0.000125;
        trainAlgorithm.parameter().interceptFlag = true;

        trainAlgorithm.compute();

2) I am also getting a wrong result with a more realistic data set (the 1372x5 data csv file is attached) for both zero and positive L2 penalties. Specifically, for a zero L2 penalty the result is below. The penalty does not help to stabilize the coefficients:

Logistic Regression coefficients column vector:
2269896543625417850880.000
-993017962874825342976.000
-1080473856619942248448.000
1924941782054602801152.000
-2224122660700265906176.000

lbfgs optimal value: 32.745  //wrong

while correct Optimal Value = 0.01823630 

Please advise, I appreciate your help,

Thanks,

Dmitry.

 

Kirill_S_Intel
Employee
3,597 Views

Hi Dmitry,

It seems I have found the reason why the coefficients are not stabilized.

As noted with the previous code, it is important to set batchSize and correctionPairBatchSize to the number of rows in the training data set (1372 here), and L = 1, to get deterministic L-BFGS.

With the correct parameters I managed to get the right results for your data:

Number of iterations performed:
35.00000

Logistic Regression coefficients:
7.32236   -7.86140  -4.19118  -5.28824  -0.60444

Log Loss value:
0.01818

 

Please just align the parameters:

        const size_t nRows = trainData->getNumberOfRows();

        solver->parameter.batchSize               = nRows;
        solver->parameter.L                       = 1;
        solver->parameter.correctionPairBatchSize = nRows;
 

Best regards,

Kirill

Gusev__Dmitry
Beginner
3,597 Views

Hi Kirill,

Thanks for the information. I have a couple more questions:

1) Does your recommendation to use deterministic lbfgs mean that the lbfgs solver works only in full-batch mode (not in a stochastic or mini-batch manner)?

2) Does it mean that the defaults batchSize = 10 and correctionPairBatchSize = 100 will most likely generate a wrong result? (See https://software.intel.com/en-us/daal-programming-guide-computation-8)

3) Which optimization solver would you recommend for the logistic loss optimization problem with the attached data set? Deterministic lbfgs generates an unreasonable result (optimal value = 1892504043520.00000000), and the saga solver does better (optimal value = 0.58294517), but the correct optimal value should be 0.4783 (penaltyL2 = 0, interceptFlag = false).

Thank you for your help,

Regards

Dmitry.

Kirill_S_Intel
Employee
3,597 Views

Hi Dmitry,

1) The recommendation was given because deterministic lbfgs performs far fewer iterations (this also applies to the sgd and adagrad solvers). If you want to use stochastic or mini-batch mode, please set more iterations and a more accurate step length.

2) It depends on the data set and on the provided step length, accuracy threshold, and number of iterations. The default values can also produce good results.

3) As far as I can see, if we use the saga solver with a tighter accuracy threshold and a larger number of iterations, we can obtain a closer solution:

Logistic Regression coefficients:
0.00000   0.00105   0.00264   0.00099   0.00409   0.01316   0.00001   -0.00668  -0.00507  0.02397   0.00058   -0.00006

Log Loss value:
0.48798

 

And with the lbfgs solver we can obtain even the exact minimum point (with a small step length provided):

    const double learningRate      = 1e-7;
    const double accuracyThreshold = 1e-10;
    const size_t nRows             = trainData->getNumberOfRows();

    // commented-out alternative: optimization_solver::sgd::Batch<double, optimization_solver::sgd::momentum>::create();
    auto solver = optimization_solver::lbfgs::Batch<double>::create();
    solver->parameter.stepLengthSequence = HomogenNumericTable<double>::create(1, 1, NumericTable::doAllocate, learningRate);
    solver->parameter.accuracyThreshold  = accuracyThreshold;
    solver->parameter.nIterations        = nIterations; // maximum iteration count, defined elsewhere (value not shown in this post)
    solver->parameter.batchSize          = nRows;
    solver->parameter.L                  = 2; // to disable automatic step selection
    solver->parameter.correctionPairBatchSize = nRows;

Number of iterations performed:
59269.00000

Logistic Regression coefficients:
0.00000   -0.21268  0.00183   -0.00042  0.15487   0.00198   0.00000   -0.17484  -0.07494  0.03599   0.00022   -0.61153

Log Loss value:
0.47835

Best regards,

Kirill

Adweidh_Intel
Moderator
3,597 Views

Dear Dmitry,

Could you please confirm whether your issue is fixed?

Gusev__Dmitry
Beginner
3,597 Views

I have received a very professional explanation of why the lbfgs algorithm generated wrong results for the data set. I expected, however, that the lbfgs execution time would be better and that the algorithm implementation would be more robust in stochastic mode.

If there is any idea on how to make the lbfgs (or any other DAAL solver) computation faster for the data set from the previous post (the data is not normalized/scaled), please share it with me.

Your help is really appreciated.

Thanks,

Dmitry 

James_S
Employee
3,596 Views

Thanks, Dmitry, for the detailed explanation.

The DAAL implementation is aligned with the article, but we could check and investigate possible improvements. Thank you!
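Since Dmitry notes above that the data set is not normalized or scaled, one common preprocessing step for gradient-based solvers is to standardize each feature column before training. The thread does not confirm whether this speeds up DAAL's solvers on this data set, so the following is only a minimal sketch in plain C++ (standard library only, not DAAL code). The function name standardizeColumns and the row-major layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standardizes each column of a row-major nRows x nCols matrix in place:
// subtracts the column mean and divides by the population standard deviation.
// A constant column (zero deviation) is only centered, which leaves it at zero.
void standardizeColumns(std::vector<double>& data, std::size_t nRows, std::size_t nCols)
{
    assert(nRows > 0 && data.size() == nRows * nCols);
    const double n = static_cast<double>(nRows);
    for (std::size_t c = 0; c < nCols; ++c)
    {
        double mean = 0.0;
        for (std::size_t r = 0; r < nRows; ++r) mean += data[r * nCols + c];
        mean /= n;

        double var = 0.0;
        for (std::size_t r = 0; r < nRows; ++r)
        {
            const double d = data[r * nCols + c] - mean;
            var += d * d;
        }
        const double sd = std::sqrt(var / n);

        for (std::size_t r = 0; r < nRows; ++r)
        {
            const double centered = data[r * nCols + c] - mean;
            data[r * nCols + c] = (sd > 0.0) ? centered / sd : centered;
        }
    }
}
```

Coefficients fitted on standardized features refer to the scaled columns, so they need to be mapped back (divide each by its column's standard deviation and adjust the intercept) before comparing them with values obtained on the raw data.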
