Hello,
I have a couple of questions regarding the results of the DAAL logistic regression algorithm in the training stage.
1. For the data set below the DAAL algorithm generates the coefficients -1.389 0.235, while the expected results are -4.3578 0.6622.
Could someone please clarify why the beta coefficients for logistic regression in the training stage are not even close to the expected results?
Here is the data set:
1,0
2,0
3,0
4,0
5,1
6,0
7,1
8,0
9,1
10,1
Here is the corresponding DAAL code:
training::Batch<> trainAlgorithm(2);
trainAlgorithm.input.set(classifier::training::data, data);
trainAlgorithm.input.set(classifier::training::labels, dependentVariables);
trainAlgorithm.parameter().penaltyL1 = 0.0f;
trainAlgorithm.parameter().penaltyL2 = 0.0f;
trainAlgorithm.parameter().interceptFlag = true;
trainAlgorithm.parameter().nClasses = 2;
trainAlgorithm.compute();
training::ResultPtr trainingResult = trainAlgorithm.getResult();
logistic_regression::ModelPtr modelptr = trainingResult->get(classifier::training::model);
NumericTablePtr beta = modelptr->getBeta();
2. How do I specify the optimizationSolver parameter to switch from the default SGD solver to the L-BFGS solver?
Thank you, your help is highly appreciated.
Regards,
Dmitry
Hello Dmitry,
Thanks for reporting your question, I'm glad to help you.
The inaccurate solution is due to the accuracyThreshold, which is 1e-4 by default: in your case the SGD momentum algorithm reaches the maximum number of iterations, so the result is not the exact minimum point.
If we keep the same accuracyThreshold, we should increase the number of iterations:
auto solver = optimization_solver::sgd::Batch<float, optimization_solver::sgd::momentum>::create();
const size_t nIterations = 40000;
const float learningRate = 1e-3;
const float accuracyThreshold = 1e-4; // to get closer solution we could even reduce it
solver->parameter.learningRateSequence = HomogenNumericTable<float>::create(1, 1, NumericTable::doAllocate, learningRate);
solver->parameter.accuracyThreshold = accuracyThreshold;
solver->parameter.nIterations = nIterations;
solver->parameter.batchSize = nRows; // 10
trainAlgorithm.parameter().optimizationSolver = solver; // set optimization solver
...
algorithm.compute();
printNumericTable(solver->getResult()->get(optimization_solver::iterative_solver::nIterations), "Number of iterations performed:");
printNumericTable(modelptr->getBeta(), "Logistic Regression coefficients:");
This gives the following results:
Number of iterations performed:
30024.000
Logistic Regression coefficients:
-4.327 0.658
To get a more exact minimum point we can use the 'double' FPType for the input table, the solver, and the algorithm.
Here is an example of how to use the L-BFGS solver:
auto solver = optimization_solver::lbfgs::Batch<double>::create();
const size_t nIterations = 20;
const double accuracyThreshold = 1e-9;
solver->parameter.accuracyThreshold = accuracyThreshold;
solver->parameter.nIterations = nIterations;
solver->parameter.batchSize = nRows; // 10
solver->parameter.L = 1;
solver->parameter.correctionPairBatchSize = nRows; // 10
/* Create an algorithm object to train the logistic regression model */
training::Batch<double> algorithm(nClasses);
/* Pass a training data set and dependent values to the algorithm */
algorithm.input.set(classifier::training::data, trainData);
algorithm.input.set(classifier::training::labels, trainDependentVariable);
/* set optimization solver*/
algorithm.parameter().optimizationSolver = solver;
We obtain the following results:
Number of iterations performed:
15
Logistic Regression coefficients:
-4.35787 0.66223
Since we set batchSize and correctionPairBatchSize equal to the number of rows and the L parameter to 1, the optimization is deterministic rather than stochastic, uses automatic step selection, and needs fewer iterations.
Best regards,
Kirill
Hi Kirill,
Thanks for the clarification. I have some more cases which I would like to clarify:
1) I am getting the wrong result below with the toy data set from the previous post when trainAlgorithm.parameter().penaltyL2 = 0.000125;
Logistic Regression coefficients column vector:
-73838.070
10535.673
lbfgs optimal value: 29.492
However, when trainAlgorithm.parameter().penaltyL2 = 0.0005, the result is correct.
My code is below:
auto solver = optimization_solver::lbfgs::Batch<float>::create();
solver->parameter.accuracyThreshold = accuracyThreshold;
solver->parameter.nIterations = nIterations;
solver->parameter.batchSize = 10;
solver->parameter.L = 1;
solver->parameter.correctionPairBatchSize = 100;
//const float learningRate = 1e-3;
//solver->parameter.stepLengthSequence = HomogenNumericTable<float>::create(1, 1, NumericTable::doAllocate, learningRate);
training::Batch<> trainAlgorithm(2);
trainAlgorithm.input.set(classifier::training::data, data);
trainAlgorithm.input.set(classifier::training::labels, dependentVariables);
trainAlgorithm.parameter().optimizationSolver = solver;
trainAlgorithm.parameter().penaltyL1 = 0.0f;
trainAlgorithm.parameter().penaltyL2 = 0.000125;
trainAlgorithm.parameter().interceptFlag = true;
trainAlgorithm.compute();
2) I am also getting a wrong result with a more realistic data set (the 1372x5 CSV data file is attached) for both zero and positive L2 penalties. Specifically, for a zero L2 penalty the result is below; the penalty does not help to stabilize the coefficients:
Logistic Regression coefficients column vector:
2269896543625417850880.000
-993017962874825342976.000
-1080473856619942248448.000
1924941782054602801152.000
-2224122660700265906176.000
lbfgs optimal value: 32.745 //wrong
while correct Optimal Value = 0.01823630
Please advise, I appreciate your help,
Thanks,
Dmitry.
Hi Dmitry,
It seems I have found the reason why the coefficients are not stabilized.
As noted in the previous code: it is important to set batchSize and correctionPairBatchSize to the number of rows (1372) of the training data set (and L = 1) to get the deterministic L-BFGS.
With the correct parameters I have managed to get the right results for your data:
Number of iterations performed:
35.00000
Logistic Regression coefficients:
7.32236 -7.86140 -4.19118 -5.28824 -0.60444
Log Loss value:
0.01818
Please just align the parameters:
const size_t nRows = trainData->getNumberOfRows();
solver->parameter.batchSize = nRows;
solver->parameter.L = 1;
solver->parameter.correctionPairBatchSize = nRows;
Best regards,
Kirill
Hi Kirill,
Thanks for the information. I have a couple more questions:
1) Does your recommendation to use deterministic L-BFGS mean that the lbfgs solver works only in full-batch mode (not in a stochastic or mini-batch manner)?
2) Does it mean that the default batchSize = 10, correctionPairBatchSize = 100 will most likely generate wrong result? ( see https://software.intel.com/en-us/daal-programming-guide-computation-8)
3) What optimization solver would you recommend to solve the logistic loss optimization problem with the data set attached? The deterministic lbfgs generates an unreasonable result (optimal value = 1892504043520.00000000); the saga solver does better (optimal value = 0.58294517), but the correct optimal value should be 0.4783. (penaltyL2 = 0, interceptFlag = false)
Thank you for your help,
Regards
Dmitry.
Hi Dmitry,
1) The recommendation was given because the deterministic L-BFGS performs far fewer iterations (the same applies to the sgd and adagrad solvers). If you want to use the stochastic or mini-batch mode, please set more iterations and a more carefully chosen step length.
2) It depends on the data set and on the provided step length, accuracy threshold, and number of iterations. The default values can also produce good results.
3) As far as I can see, if we use the saga solver with a tighter accuracy threshold and a bigger number of iterations, we obtain a closer solution:
Logistic Regression coefficients:
0.00000 0.00105 0.00264 0.00099 0.00409 0.01316 0.00001 -0.00668 -0.00507 0.02397 0.00058 -0.00006
Log Loss value:
0.48798
And with the L-BFGS solver we can obtain an even more exact minimum point (with the provided small step length):
const double learningRate = 1e-7;
const double accuracyThreshold = 1e-10;
const size_t nRows = trainData->getNumberOfRows();
auto solver = optimization_solver::lbfgs::Batch<double>::create();
solver->parameter.stepLengthSequence = HomogenNumericTable<double>::create(1, 1, NumericTable::doAllocate, learningRate);
solver->parameter.accuracyThreshold = accuracyThreshold;
solver->parameter.nIterations = nIterations;
solver->parameter.batchSize = nRows;
solver->parameter.L = 2; // to disable automatic step selection
solver->parameter.correctionPairBatchSize = nRows;
Number of iterations performed:
59269.00000
Logistic Regression coefficients:
0.00000 -0.21268 0.00183 -0.00042 0.15487 0.00198 0.00000 -0.17484 -0.07494 0.03599 0.00022 -0.61153
Log Loss value:
0.47835
Best regards,
Kirill
Dear Dmitry,
Could you please confirm whether your issue is fixed or not?
I have received a very professional explanation of why the lbfgs algorithm generated wrong results for the data set. I expected, however, that the lbfgs execution time would be better and that the algorithm implementation would be more robust in the stochastic mode.
If there is any idea on how to make the lbfgs (or any other DAAL solver) computation faster for the data set from the previous post (the data is not normalized/scaled), please share this information with me.
Your help is really appreciated.
Thanks,
Dmitry
Thanks Dmitry for the detailed explanation.
Hi Dmitry, our DAAL implementation is aligned with the original article, but we can check and investigate the possibility of improvements. Thank you!