Solved: Error while using Intel Arc GPU for neural networks

Neelesh · 09-05-2023

I am getting this error:

  File "D:\gputest\ann.py", line 50, in <module>
    y_pred=model.forward(X_train)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "D:\gputest\ann.py", line 26, in forward
    x=F.relu(self.f_connected1(x))
             ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\neelesh\.conda\envs\venv\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\neelesh\.conda\envs\venv\Lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and xpu:0! (when checking argument for argument self in method wrapper_XPU_out_addmm_out)

It occurs when I try to run an example from https://github.com/krishnaik06/Pytorch-Tutorial

The code I am using is as follows:

import pandas as pd
import torch
import torch.nn as nn
import torch.nn.functional as F
import intel_extension_for_pytorch as ipex
from sklearn.model_selection import train_test_split
df=pd.read_csv('diabetes.csv')
X=df.drop('Outcome',axis=1).values### independent features
y=df['Outcome'].values###dependent features
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=0)

X_train=torch.FloatTensor(X_train)
X_test=torch.FloatTensor(X_test)
y_train=torch.LongTensor(y_train)
y_test=torch.LongTensor(y_test)

class ANN_Model(nn.Module):
    def __init__(self,input_features=8,hidden1=20,hidden2=20,out_features=2):
        super().__init__()
        self.f_connected1=nn.Linear(input_features,hidden1)
        self.f_connected2=nn.Linear(hidden1,hidden2)
        self.out=nn.Linear(hidden2,out_features)
    def forward(self,x):
        x=F.relu(self.f_connected1(x))
        x=F.relu(self.f_connected2(x))
        x=self.out(x)
        return x

####instantiate my ANN_model
torch.manual_seed(20)
model=ANN_Model()
##transferring model and data to GPU
X_train.to('xpu')
y_train.to('xpu')
model=model.to('xpu')

###Backward Propagation-- Define the loss_function, define the optimizer
loss_function=nn.CrossEntropyLoss()
optimizer=torch.optim.Adam(model.parameters(),lr=0.01)


model,optimizer=ipex.optimize(model,optimizer=optimizer)

epochs=500
final_losses=[]
for i in range(epochs):
    i=i+1
    y_pred=model.forward(X_train)
    loss=loss_function(y_pred,y_train)
    final_losses.append(loss)
    if i%10==1:
        print("Epoch number: {} and the loss : {}".format(i,loss.item()))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

I am using an Intel Arc A770 GPU, and I have verified that the installation is correct as per the Intel installation guide.

The error says there are at least two devices, even though I have moved my model and input data to 'xpu'.

AthiraM_Intel · 09-08-2023

Hi,

Thank you for posting in Intel Communities.

Please use the following order to run the code on an Intel XPU (GPU):

...
# Instantiate ANN_model
torch.manual_seed(20)
model=ANN_Model()

# Backward propagation: define the criterion (loss_function) and the optimizer
loss_function=nn.CrossEntropyLoss()
optimizer=torch.optim.Adam(model.parameters(),lr=0.01)
model.train()

# IPEX Optimization
# IPEX optimization needs to be done after defining model and optimizer
model,optimizer=ipex.optimize(model,optimizer=optimizer)

# Move the model and criterion(loss function) to XPU
model=model.to("xpu")
loss_function=loss_function.to("xpu")

epochs=500
final_losses=[]
for i in range(epochs):
    i=i+1

    # Move the data to XPU for training across the epochs
    X_train=X_train.to("xpu")
    y_train=y_train.to("xpu")

    y_pred=model.forward(X_train)
    loss=loss_function(y_pred,y_train)
    final_losses.append(loss)
    if i%10==1:
        print("Epoch number: {} and the loss : {}".format(i,loss.item()))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
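
A note on why the original script appears to have failed: `X_train.to('xpu')` does not change `X_train`. `Tensor.to()` returns a converted tensor, so the result has to be assigned back, as in `X_train=X_train.to("xpu")` above. `nn.Module.to()` behaves differently and updates the module's parameters in place. A minimal sketch of the difference, using a dtype conversion as a stand-in for the device move (an assumption made so that it runs without an XPU):

```python
import torch
import torch.nn as nn

x = torch.zeros(4, 8)            # float32 tensor on the CPU

# Tensor.to() returns a converted tensor and leaves `x` untouched.
# A dtype conversion is used here only so the effect is visible without
# a GPU; x.to("xpu") follows the same return-a-new-tensor rule.
x.to(torch.float64)              # result discarded, like X_train.to('xpu')
print(x.dtype)                   # torch.float32

x = x.to(torch.float64)          # reassigning keeps the converted tensor
print(x.dtype)                   # torch.float64

# nn.Module.to() converts the module's parameters in place and returns
# the same module, so model.to('xpu') works with or without reassignment.
layer = nn.Linear(8, 2)
layer.to(torch.float64)
print(layer.weight.dtype)        # torch.float64
```

Printing `X_train.device` and `next(model.parameters()).device` just before the forward pass is a quick way to confirm that both are on the same device.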

Refer to the official examples on the "Examples" page of the intel_extension_for_pytorch 2.0.110+xpu documentation.

If this resolves your issue, make sure to accept this as a solution. This would help others with similar issues.

Thanks

Neelesh · 09-08-2023

Thanks for the solution; it works.

I have a follow-up question, though:

The program takes a significant amount of time to start when running on the GPU: about 20 seconds for the first epoch to load and another 20 seconds for the second. After that, the program runs smoothly. Is there a way to reduce this time?

AthiraM_Intel · 09-15-2023

Hi,

Glad to know that your issue is resolved. Thank you for accepting our solution.

To answer your follow-up question: a warm-up time is expected. There is a set of optimizations and best practices that can accelerate training and inference of deep learning models in PyTorch.

You can refer to the links below to learn more about performance tuning:

https://pytorch.org/tutorials/recipes/recipes/tuning_guide.html#performance-tuning-guide

https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/performance_tuning/tuning_guide.html
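
To check how much of the start-up delay is one-time warm-up, the first iterations can be timed separately from the rest. A minimal sketch, assuming a hypothetical helper `time_steps` (not part of PyTorch or IPEX) and a stand-in workload `fake_epoch` in place of a real training epoch:

```python
import time

def time_steps(step, n_steps, sync=None):
    """Run step() n_steps times and return each call's duration in seconds.

    `sync` is an optional callable that blocks until queued device work
    has finished; it is called before each timer is stopped.
    """
    durations = []
    for _ in range(n_steps):
        start = time.perf_counter()
        step()
        if sync is not None:
            sync()
        durations.append(time.perf_counter() - start)
    return durations

# Stand-in workload so the sketch runs anywhere (hypothetical);
# replace it with a function that runs one training epoch.
def fake_epoch():
    sum(i * i for i in range(10_000))

times = time_steps(fake_epoch, 5)
rest = times[1:]
print("first step: {:.4f}s, mean of the rest: {:.4f}s".format(
    times[0], sum(rest) / len(rest)))
```

On an XPU, device work may be queued asynchronously, so passing a blocking call as `sync` (for example `torch.xpu.synchronize`, assuming your installation exposes it) should give more accurate per-step timings.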

Since your initial query is resolved and we have addressed your follow-up query as well, please post a new question for any further assistance, as this thread will no longer be monitored by Intel.

Thanks