Intel® Distribution for Python*

Converted XGBoost Regressor does not match predictions on the same test input

mik700
Beginner

 

Hi, I am trying to use daal4py to speed up inference for an XGBoost regression model. I have successfully converted the XGBoost model and run predictions with the converted model. However, when I compared these predictions with those from the original XGBoost model, they were different, so most likely I am doing something wrong.

 

Here is my code context: 

 

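import pickle
import daal4py as d4p

# Model is an sklearn pipeline (dataprep -> XGBoost regression)
model = pickle.loads(my_model)  # my_model holds the pickled pipeline bytes

# Convert the pipeline's XGBoost regressor to a daal4py GBT model
regressor = model['regression']
daal_model = d4p.get_gbt_model_from_xgboost(regressor.get_booster())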


# prepped_data is of <class 'scipy.sparse._csr.csr_matrix'> type of (5,245) shape
prepped_data = model['dataprep'].transform(X_test)


xgb_prediction = model['regression'].predict(prepped_data)
daal_prediction = d4p.gbt_regression_prediction(fptype='float').compute(prepped_data, daal_model)

print(daal_prediction.prediction.reshape(-1))
print(xgb_prediction)

 

 And these are the printed outputs:

 

>>> print(daal_prediction.prediction.reshape(-1))
[36.1748 36.1748 36.1748 36.1748 36.1748]
>>> print(xgb_prediction)
[ 0.00406926 -0.00053832  0.02176214 -0.00546156 -0.20432618]

 

I tried a couple of different input types, for example converting the scipy sparse matrix to a numpy matrix or to XGBoost's DMatrix, etc. However, the type of the input data did not influence the prediction.

Do you have any idea what might be wrong? Thanks in advance for any information or suggestions.
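
One thing that may be worth checking here, offered as a sketch rather than a diagnosis: XGBoost, as far as I know, treats entries that are not stored in a scipy sparse matrix as missing values, whereas a plain `toarray()` turns them into literal 0.0, so the sparse and dense forms of the same data are not necessarily equivalent to the model. The helper below (`csr_to_dense` is a hypothetical name, not part of daal4py or XGBoost) builds a contiguous float32 array in which unstored entries are filled with a chosen value, so both conventions can be tried. Whether the converted daal4py model treats NaN as missing depends on the installed version and is an assumption to verify.

```python
import numpy as np
from scipy import sparse


def csr_to_dense(matrix, fill_value=0.0, dtype=np.float32):
    """Return a C-contiguous dense copy of a sparse matrix.

    Entries that are not stored in the sparse matrix become `fill_value`
    (0.0 mimics toarray(); np.nan marks them as missing instead).
    Hypothetical helper for illustration only.
    """
    coo = sparse.csr_matrix(matrix).tocoo()
    dense = np.full(coo.shape, fill_value, dtype=dtype)
    # Copy every stored entry, including explicitly stored zeros
    dense[coo.row, coo.col] = coo.data
    return np.ascontiguousarray(dense)


# Tiny stand-in for prepped_data (the real one has shape (5, 245))
example = sparse.csr_matrix(np.array([[1.0, 0.0], [0.0, 2.0]]))
dense_zero = csr_to_dense(example)                    # unstored -> 0.0
dense_nan = csr_to_dense(example, fill_value=np.nan)  # unstored -> NaN
print(dense_zero)
print(dense_nan)
```

Passing both `csr_to_dense(prepped_data)` and `csr_to_dense(prepped_data, fill_value=np.nan)` to each model and comparing the outputs would show whether missing-value handling plays any role in the mismatch.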

JyothisV_Intel
Employee

Hi,

 

Good day to you.

 

Thanks for posting in Intel Communities.

 

We tried to replicate the reported issue on our side using the code below, which is similar to the code you provided, but were unable to reproduce it.

 

Code:

# Importing Libraries
import numpy as np
import xgboost as xgb
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import daal4py as d4p

# Load the California Housing dataset
california_housing = fetch_california_housing()
X, y = california_housing.data, california_housing.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Create an XGBoost Regressor model
model = xgb.XGBRegressor()

# Train the model on the training data
model.fit(X_train, y_train)

# Make predictions on the testing data using XGBoost Regressor Model
xgb_predictions = model.predict(X_test)
print("XGBoost Predictions:", xgb_predictions)

# Mean Squared Error of XGBoost Model
print("Mean squared error regression loss of XGBoost Model:", mean_squared_error(y_test, xgb_predictions))

# Convert XGBoost model to daal4py GBT model
daal4py_gbt_model = d4p.get_gbt_model_from_xgboost(model.get_booster())

# Make predictions using daal4py's gbt_regression_prediction
d4p_predictions = d4p.gbt_regression_prediction(fptype='float').compute(X_test, daal4py_gbt_model)

# Reshaping the predictions
d4p_xgb_predictions = d4p_predictions.prediction.reshape(-1)
print("daal4py XGBoost Predictions:", d4p_xgb_predictions)

# Mean Squared Error of daal4py XGBoost Model
print("Mean squared error regression loss of daal4py XGBoost Model:", mean_squared_error(y_test, d4p_xgb_predictions))

# Comparison of XGBoost and daal4py XGBoost models
print("Are both of the predictions same? : ", np.array_equal(xgb_predictions, d4p_xgb_predictions))

 

Output:

XGBoost Predictions: [0.7013145 2.8957915 0.8142091 ... 2.5384514 1.9320047 2.043333 ]
Mean squared error regression loss of XGBoost Model: 0.2192855720832607

daal4py XGBoost Predictions: [0.7013145 2.8957915 0.8142091 ... 2.5384514 1.9320047 2.043333 ]
Mean squared error regression loss of daal4py XGBoost Model: 0.2192855720832607

Are both of the predictions same? : True
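
Note that `np.array_equal` demands exact element-wise equality, which float32 predictions from two different inference engines will not always satisfy. A tolerance-based comparison is usually more informative. The sketch below uses only NumPy; `compare_predictions` and its default tolerances are illustrative assumptions, not part of daal4py. It also flags the symptom from the original question, where every converted prediction has the same value.

```python
import numpy as np


def compare_predictions(reference, candidate, rtol=1e-5, atol=1e-6):
    """Summarize how closely `candidate` predictions track `reference`.

    Hypothetical helper: the name and tolerances are chosen for illustration.
    """
    reference = np.asarray(reference, dtype=np.float64).reshape(-1)
    candidate = np.asarray(candidate, dtype=np.float64).reshape(-1)
    if reference.shape != candidate.shape:
        raise ValueError("prediction arrays differ in length")
    abs_diff = np.abs(reference - candidate)
    return {
        # True when all elements agree within the given tolerances
        "allclose": bool(np.allclose(candidate, reference, rtol=rtol, atol=atol)),
        # Largest absolute gap between the two prediction vectors
        "max_abs_diff": float(abs_diff.max()) if abs_diff.size else 0.0,
        # True when the candidate returns one identical value for every row
        "candidate_is_constant": bool(
            candidate.size > 1 and np.all(candidate == candidate[0])
        ),
    }


# Values copied from the question in this thread
xgb_prediction = np.array(
    [0.00406926, -0.00053832, 0.02176214, -0.00546156, -0.20432618]
)
daal_prediction = np.full(5, 36.1748)

report = compare_predictions(xgb_prediction, daal_prediction)
print(report)
```

A constant candidate output combined with a large maximum difference suggests the converted model is not reacting to the input features at all, which is a different failure from small floating-point drift.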

 

Although we cannot provide code-level debugging or correction, please get back to us with a complete reproducer (sample code, model, and dataset) if you are still facing the issue, so that we can assist you better.
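
For anyone preparing such a reproducer, one possible way to bundle the inputs and both sets of predictions into a single file is sketched below. It uses only NumPy and the standard library; the function names, the file name `repro.npz`, and the array keys are all hypothetical choices for illustration. The model itself would need to be shared separately, for example as a file written by the booster's `save_model` method.

```python
import os
import tempfile

import numpy as np


def save_reproducer(path, X, xgb_pred, daal_pred):
    """Store dense inputs and both prediction vectors in one .npz archive."""
    np.savez_compressed(
        path,
        X=np.asarray(X),
        xgb_pred=np.asarray(xgb_pred),
        daal_pred=np.asarray(daal_pred),
    )


def load_reproducer(path):
    """Load the archive back into a plain dict of arrays."""
    with np.load(path) as data:
        return {key: data[key] for key in data.files}


# Round-trip demo with placeholder data shaped like the question's input
with tempfile.TemporaryDirectory() as tmp_dir:
    repro_path = os.path.join(tmp_dir, "repro.npz")
    save_reproducer(
        repro_path,
        X=np.zeros((5, 245), dtype=np.float32),
        xgb_pred=np.zeros(5),
        daal_pred=np.full(5, 36.1748),
    )
    loaded = load_reproducer(repro_path)

print(sorted(loaded))
```

A sparse input such as `prepped_data` would have to be densified (or saved with scipy's own sparse save routine) before being stored this way.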

 

Regards,

Jyothis V James

 

JyothisV_Intel
Employee

Hi,


Good day to you.


We have not received any response from you. Is your issue resolved?


Thanks and Regards,

Jyothis V James


JyothisV_Intel
Employee

Hi,


Good day to you.


We have not received any update from you. Intel will no longer monitor this thread. Kindly post a new question if you need any assistance with Intel products and services.


Thanks and Regards,

Jyothis V James

