Gusev__Dmitry
Beginner

Unexpected results from SGD optimization for logistic regression

Hello,

Can someone help me understand why I am getting numerical results from DAAL that I would not expect?

I am running the sgd_log_loss_dense_batch example. I changed datasetFileName to point to a very simple logistic regression input:

1,0
2,0
3,0
4,0
5,1
6,0
7,1
8,0
9,1
10,1

Expected coefficients are

 (Intercept)           V1  
    -4.3578       0.6622
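
For reference, these coefficients can be cross-checked independently of DAAL. The sketch below fits the same one-feature logistic model by Newton's method on the log-likelihood. It is a minimal illustration, not DAAL code: the function name fitLogisticNewton and its parameters are hypothetical, and it assumes the two columns of the input above are the feature and the 0/1 label.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of DAAL): fits y ~ sigmoid(b0 + b1 * x)
// by Newton's method on the log-likelihood. Returns {intercept, slope}.
std::array<double, 2> fitLogisticNewton(const std::vector<double>& x,
                                        const std::vector<int>& y,
                                        int maxIter = 50, double tol = 1e-10)
{
    assert(x.size() == y.size() && !x.empty());
    double b0 = 0.0, b1 = 0.0;
    for (int it = 0; it < maxIter; ++it) {
        double g0 = 0.0, g1 = 0.0;              // gradient of the log-likelihood
        double h00 = 0.0, h01 = 0.0, h11 = 0.0; // negated Hessian, X^T W X
        for (std::size_t i = 0; i < x.size(); ++i) {
            const double p = 1.0 / (1.0 + std::exp(-(b0 + b1 * x[i])));
            const double w = p * (1.0 - p);
            g0 += y[i] - p;
            g1 += (y[i] - p) * x[i];
            h00 += w;
            h01 += w * x[i];
            h11 += w * x[i] * x[i];
        }
        // Solve the 2x2 system (X^T W X) d = g with the closed-form inverse.
        const double det = h00 * h11 - h01 * h01;
        assert(det > 0.0); // fails only if all x values are identical
        const double d0 = (h11 * g0 - h01 * g1) / det;
        const double d1 = (h00 * g1 - h01 * g0) / det;
        b0 += d0;
        b1 += d1;
        if (std::fabs(d0) + std::fabs(d1) < tol) break; // step is negligible
    }
    return {b0, b1};
}
```

Called with x = 1..10 and the labels listed above, this should land close to the intercept and slope quoted here, which gives a target for judging the SGD output.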

However, the output of sgd_log_loss_dense_batch is not even close to these values and depends significantly on the initial approximation. For instance, with initialPoint[nFeatures + 1] = {0, 0}; it hits the maximum iteration limit (1000 iterations) and the output is

Minimum:
-1.111
0.127

Number of iterations performed:
1000.000

With initialPoint[nFeatures + 1] = {1, 1}; it stops after the first iteration and the output is

Minimum:
0.990
0.971

Number of iterations performed:
1.000

Your help is much appreciated,

Regards,

Dmitry.

 

Kirill_S_Intel
Employee

Hello Dmitry,

As you pointed out, SGD (the default dense method) reached the maximum number of iterations (nIterations = 1000). If you increase this value (nIterations = 10000), you get a much closer solution.

Also, do not forget to decrease 'accuracyThreshold' to get an even better solution, one that does not depend on the initial point.

Try these values:

const size_t nIterations = 10000000;
const size_t nFeatures = 1;
const float  learningRate = 0.001f;
const double accuracyThreshold = 0.00002;

float initialPoint[nFeatures + 1] = {0, 0}; // {1, 1} makes no difference

Result:

Minimum:
-4.383
0.652

Number of iterations performed:
10000000.000
 

This behavior has two causes: the step size is not selected automatically (you have to choose it yourself), and the SGD defaultDense method considers only one sample per iteration (batchSize = 1). To reduce the number of iterations, use the 'minibatch' or 'momentum' methods of the SGD solver (examples: sgd_moment_dense_batch.cpp, sgd_mini_dense_batch.cpp).
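
To illustrate why so many iterations are needed, here is a minimal plain C++ sketch of SGD on the logistic loss with one sample per update and a fixed learning rate. It is not DAAL's implementation: the name sgdLogistic is hypothetical, and the samples are visited cyclically only to keep the run deterministic (how DAAL picks samples may differ).

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of DAAL): SGD on the logistic loss with
// batchSize = 1 and a constant learning rate. theta = {intercept, slope}.
// Assumption: samples are taken in cyclic order rather than at random.
std::array<double, 2> sgdLogistic(const std::vector<double>& x,
                                  const std::vector<int>& y,
                                  std::size_t nIterations,
                                  double learningRate,
                                  std::array<double, 2> theta = {{0.0, 0.0}})
{
    assert(x.size() == y.size() && !x.empty());
    for (std::size_t t = 0; t < nIterations; ++t) {
        const std::size_t i = t % x.size(); // one sample per iteration
        const double p =
            1.0 / (1.0 + std::exp(-(theta[0] + theta[1] * x[i])));
        const double err = p - y[i];        // d(loss_i) / d(linear term)
        theta[0] -= learningRate * err;        // intercept update
        theta[1] -= learningRate * err * x[i]; // slope update
    }
    return theta;
}
```

With learningRate = 0.001, each update moves the intercept by at most 0.001, so 1000 iterations cannot carry it from 0 to roughly -4.36 no matter how the samples are ordered. Running the same function for a million iterations on the ten rows from the question should bring both coefficients close to the expected values.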

To summarize: use a smaller value for 'accuracyThreshold' and choose the step size for the SGD solver carefully.

 

Best regards,

Kirill

 
