I am trying to use Intel DAAL C++ for linear regression. It always returns the regression coefficients including the intercept coefficient. I have tried setting the algorithm parameter interceptFlag = false, but it didn't work at all. I have even changed the default value in linear_regression_model.h to false, but I still get the intercept coefficient in the coefficients list.
How can I exclude the intercept coefficient?
Many thanks in advance.
The linear_regression::Model::getBeta() method returns a numeric table of regression coefficients in the following format:
β00, β01, ..., β0p,
...,
βk0, βk1, ..., βkp,
where p is the number of features in the data set and k is the number of responses (usually k = 1).
If the algorithm parameter interceptFlag = false is provided, the returned intercept coefficients β00, ..., βk0 are equal to zero, but the size of the table of coefficients stays the same.
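Rather than editing linear_regression_model.h, the flag can be set on the algorithm object at run time. A sketch against the DAAL 2017 C++ API, assuming trainData and trainResponses are already-populated NumericTablePtr objects:

```cpp
#include "daal.h"
using namespace daal::algorithms;
using namespace daal::data_management;

// Train with the default (Normal Equations) method,
// requesting a model without an intercept term.
linear_regression::training::Batch<> algorithm;
algorithm.parameter.interceptFlag = false;  // per-run setting; no header edits needed
algorithm.input.set(linear_regression::training::data, trainData);
algorithm.input.set(linear_regression::training::dependentVariables, trainResponses);
algorithm.compute();

daal::services::SharedPtr<linear_regression::Model> model =
    algorithm.getResult()->get(linear_regression::training::model);
NumericTablePtr beta = model->getBeta();  // k x (p + 1); column 0 holds the intercepts
```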
Could you please check the values of the beta coefficients with interceptFlag = false and with interceptFlag = true? You should normally get two different sets of coefficients.
Thanks Victoriya! Actually, my problem is that when I set interceptFlag = false, I get the same results as with interceptFlag = true. I set the flag to false in the header file, but it didn't help: none of the intercept coefficients are zero. I have compared the result with another regression implementation; the predictions are the same, but none of the coefficients match.
To reproduce this behavior on our side, could you please provide additional details: the size of your dataset, the version of Intel DAAL you use, and the method of linear regression training (Normal Equations or QR)?
It would also be great if you could share the code that reproduces this behavior.
Unfortunately, I cannot share my data. It is a rather small data set: 54 examples with 15 features in the training set and 16 examples in the test set. I use the 2017 version of the library for the Intel 64 architecture, with the linear regression model trained by the Normal Equations method. I slightly modified the linear regression header file to set the intercept flag to false by default.
To reproduce the issue on our side, I created and ran a test case using an artificial dataset; both are attached. The test case trains the linear regression model by means of the Normal Equations method.
Using this test case, I was not able to reproduce the behavior you described with Intel DAAL 2017 Update 2 on an Intel(R) Xeon(R) CPU E5-2680 running Linux* OS (I built the test using the dynamic, parallel, 64-bit version of the library).
Could you please build the example in the same way, run it, and let us know the results you get on your side? Additional information about the CPU you use would also be helpful.
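For reference, a build command for linking against the dynamic parallel 64-bit libraries on Linux might look like the following. This is a sketch: it assumes the DAALROOT environment variable has been set (e.g. by sourcing daalvars.sh for intel64), and the source file name is hypothetical:

```shell
# Assumes DAALROOT is set, e.g. via: source <daal_install_dir>/bin/daalvars.sh intel64
g++ -std=c++11 lin_reg_test.cpp -o lin_reg_test \
    -I"${DAALROOT}/include" \
    -L"${DAALROOT}/lib/intel64_lin" \
    -ldaal_core -ldaal_thread -ltbb -lpthread -ldl
```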
Meanwhile, I detected another erroneous behavior of the example on my side: the library throws the exception "Failed to solve the system of normal equations" when interceptFlag is set to false.
We will analyze and fix it in one of the next releases of the library.
At the same time, the QR method for training the linear regression model runs fine, including the case when interceptFlag = false. This method can be used as a workaround on your side.
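Switching to the QR method only requires changing the method template argument when constructing the training algorithm. A sketch against the DAAL 2017 C++ API, again assuming trainData and trainResponses are already-populated NumericTablePtr objects:

```cpp
#include "daal.h"
using namespace daal::algorithms;

// Use the QR decomposition method (qrDense) instead of
// the default Normal Equations method (normEqDense).
linear_regression::training::Batch<float, linear_regression::training::qrDense> algorithm;
algorithm.parameter.interceptFlag = false;
algorithm.input.set(linear_regression::training::data, trainData);
algorithm.input.set(linear_regression::training::dependentVariables, trainResponses);
algorithm.compute();
```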
Here are the details of my CPU: Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz.
I have tested again with interceptFlag = false. As you also mentioned, it doesn't work for linear regression with the Normal Equations method; that's why I tried to fix it in the header file by setting the flag to false by default.