Gusev__Dmitry

Beginner


08-31-2019
04:20 PM


Not expected results of sgd optimization for logistic regression

Hello,

Can someone help me understand why DAAL gives me numerical results that I would not expect?

I am running the sgd_log_loss_dense_batch project. I changed datasetFileName to point to a very simple logistic regression input:

1,0
2,0
3,0
4,0
5,1
6,0
7,1
8,0
9,1
10,1

Expected coefficients are

(Intercept) V1

-4.3578 0.6622
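The reference values can be cross-checked without DAAL. Below is a minimal sketch, assuming the expected coefficients are the maximum-likelihood estimates for the ten samples above. It runs Newton's method on the log-likelihood for a single feature plus intercept. The names Fit and fitLogisticNewton are hypothetical and are not part of DAAL.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper types and names, for illustration only (not DAAL API).
struct Fit { double intercept; double slope; };

// Newton's method for logistic regression with one feature and an intercept.
// Maximizes the log-likelihood, starting from (0, 0).
Fit fitLogisticNewton(const std::vector<double>& x,
                      const std::vector<double>& y,
                      int maxSteps = 50)
{
    assert(!x.empty() && x.size() == y.size());
    double b0 = 0.0, b1 = 0.0;
    for (int step = 0; step < maxSteps; ++step) {
        // Gradient g of the log-likelihood and matrix h = X^T W X.
        double g0 = 0.0, g1 = 0.0, h00 = 0.0, h01 = 0.0, h11 = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) {
            const double p = 1.0 / (1.0 + std::exp(-(b0 + b1 * x[i])));
            const double w = p * (1.0 - p);
            g0 += y[i] - p;
            g1 += (y[i] - p) * x[i];
            h00 += w;
            h01 += w * x[i];
            h11 += w * x[i] * x[i];
        }
        // Solve the 2x2 system h * d = g in closed form.
        const double det = h00 * h11 - h01 * h01;
        const double d0 = (h11 * g0 - h01 * g1) / det;
        const double d1 = (h00 * g1 - h01 * g0) / det;
        b0 += d0;
        b1 += d1;
        if (std::fabs(d0) + std::fabs(d1) < 1e-12) break;  // converged
    }
    return {b0, b1};
}
```

Calling fitLogisticNewton with x = 1..10 and the labels listed above should land close to the expected intercept and slope, which gives an independent target to compare the sgd output against.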

However, the output of sgd_log_loss_dense_batch is not even close to this and depends significantly on the initial approximation. For instance, with initialPoint[nFeatures + 1] = {0, 0}; it hits the maximum iteration limit (1000 iterations) and the output is

Minimum:

-1.111

0.127

Number of iterations performed:

1000.000

With initialPoint[nFeatures + 1] = {1, 1}; it stops after the first iteration and the output is

Minimum:

0.990

0.971

Number of iterations performed:

1.000

Your help is much appreciated,

Regards,

Dmitry.


1 Reply

Kirill_S_Intel

Employee


09-01-2019
10:54 PM


Hello Dmitry,

As you pointed out, sgd (default dense method) reached the maximum number of iterations (nIterations = 1000). If you increase this value (nIterations = 10000), you get a much closer solution.

Also, do not forget to decrease 'accuracyThreshold' to get an even better solution (one that *does not depend on the initial point*).

Try these values:

const size_t nIterations = 10000000;
const size_t nFeatures = 1;
const float learningRate = 0.001f;
const double accuracyThreshold = 0.00002;
float initialPoint[nFeatures + 1] = {0,0}; // {1,1} *no difference*

result:

Minimum:

-4.383

0.652

Number of iterations performed:

10000000.000

This behavior arises because the step size is not selected automatically (you have to choose it appropriately) and because the sgd defaultDense method considers only one sample per iteration (batchSize = 1). To reduce the number of iterations, use the 'minibatch' or 'momentum' methods of the sgd solver (examples: sgd_moment_dense_batch.cpp, sgd_mini_dense_batch.cpp).
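To see why a fixed step with one sample per iteration needs so many iterations, here is a minimal self-contained sketch of fixed-step SGD on the log loss. It is not DAAL's implementation: the names Coefs and sgdLogLoss are hypothetical, and samples are visited cyclically (an assumption made to keep the run deterministic). With batchSize = 1, each iteration changes the intercept by learningRate * (p - y), which is smaller than learningRate in magnitude, so 1000 iterations at 0.001 move the intercept by less than 1.0, far short of the roughly -4.36 needed here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper types and names, for illustration only (not DAAL API).
struct Coefs { double intercept; double slope; };

// Fixed-step SGD for logistic regression with one feature and an intercept.
// Each iteration averages the log-loss gradient over `batchSize` samples,
// taken in cyclic order (assumption: deterministic order instead of random).
Coefs sgdLogLoss(const std::vector<double>& x,
                 const std::vector<double>& y,
                 double learningRate,
                 std::size_t nIterations,
                 std::size_t batchSize,
                 Coefs start = {0.0, 0.0})
{
    assert(!x.empty() && x.size() == y.size() && batchSize >= 1);
    Coefs b = start;
    std::size_t next = 0;  // index of the next sample to visit
    for (std::size_t it = 0; it < nIterations; ++it) {
        double g0 = 0.0, g1 = 0.0;
        for (std::size_t k = 0; k < batchSize; ++k) {
            const std::size_t i = next;
            next = (next + 1) % x.size();
            const double p =
                1.0 / (1.0 + std::exp(-(b.intercept + b.slope * x[i])));
            g0 += p - y[i];            // d(loss)/d(intercept)
            g1 += (p - y[i]) * x[i];   // d(loss)/d(slope)
        }
        // No step-size adaptation: the same learningRate on every iteration.
        b.intercept -= learningRate * g0 / static_cast<double>(batchSize);
        b.slope     -= learningRate * g1 / static_cast<double>(batchSize);
    }
    return b;
}
```

On the ten samples from the question, sgdLogLoss(x, y, 0.001, 1000, 1) therefore cannot get near the expected intercept regardless of the data. Averaging over all ten samples with a larger fixed step, for example sgdLogLoss(x, y, 0.1, 200000, 10), should approach the expected coefficients from either {0, 0} or {1, 1}, since the log loss is convex. These particular step sizes and iteration counts are illustrative choices, not recommendations from the DAAL documentation.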

To summarize: use a smaller value for 'accuracyThreshold' and choose the step size carefully for the sgd solver.

Best regards,

Kirill
