Intel® oneAPI Data Analytics Library

Intel Server Unavailable after executing this code

Adarsh2
Beginner

I am on Intel DevCloud, using Intel oneAPI. This is my code so far:

# first block of jupyter notebook
import modin.pandas as pd

# second block of jupyter notebook
df = pd.read_csv('dataset/dataset.csv')
df.head()

Output of the second block:

UserWarning: Ray execution environment not yet initialized. Initializing...
To remove this warning, run the following python code before doing dataframe operations:

    import ray
    ray.init()

2023-09-01 12:00:16,471 INFO worker.py:1636 -- Started a local Ray instance.

 

The first block runs properly, but when I read my dataset, I get this warning followed by a "server unavailable" error.

 

If I use `import pandas as pd`, the code runs fine, but `modin.pandas` does not work. My dataset is a CSV file of about 2 GB. Why is this happening?

 
AthiraM_Intel
Moderator

Hi,


Thanks for posting in Intel Communities.


We were able to reproduce the issue on Intel DevCloud for oneAPI and are checking on this internally.



Thanks


yehudaorel
Employee

Hi Adarsh2,

 

When using Modin on DevCloud, the following lines must be run before any dataframe operations:

import ray
ray.shutdown()
ray.init(_memory=16000 * 1024 * 1024, object_store_memory=500 * 1024 * 1024, _driver_object_store_memory=500 * 1024 * 1024)

Can you please give this modification a try and see whether the issue is resolved?

That is, the updated code:

# first block of jupyter notebook
import ray
ray.shutdown()
ray.init(_memory=16000 * 1024 * 1024, object_store_memory=500 * 1024 * 1024, _driver_object_store_memory=500 * 1024 * 1024)

# second block of jupyter notebook
import modin.pandas as pd

# third block of jupyter notebook
df = pd.read_csv('dataset/dataset.csv')
df.head()
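
For reference, the sizes passed to `ray.init` in the code above can be worked out in bytes. This is only a small arithmetic sketch; the helper `mib_to_bytes` and the dictionary name are illustrative and not part of Ray or Modin.

```python
# Sizes from the ray.init call in this post, converted to bytes.
MIB = 1024 * 1024  # one mebibyte in bytes


def mib_to_bytes(mib: int) -> int:
    """Convert a size given in MiB to bytes (hypothetical helper)."""
    return mib * MIB


# Keys mirror the keyword arguments used in the post; values are in bytes.
ray_memory_settings = {
    "_memory": mib_to_bytes(16000),
    "object_store_memory": mib_to_bytes(500),
    "_driver_object_store_memory": mib_to_bytes(500),
}

for name, size in ray_memory_settings.items():
    print(f"{name}: {size:,} bytes ({size / 1024**3:.2f} GiB)")
```

So the call asks for roughly 16 GB of general memory and about 0.5 GB each for the two object store settings. The underscore-prefixed arguments appear to be private Ray options, so their behavior may differ between Ray releases.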

 

The Intel oneAPI samples include a helpful getting-started sample for Modin:

https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/Getting-Started-Samples/IntelModin_GettingStarted

 

Thanks

 

Adarsh2
Beginner

Thanks for your reply!

However, with this change the code runs indefinitely and never finishes.

 

[Attached image: Screenshot 2023-09-07 212224.png]

yehudaorel
Employee

Could you run the following to check the memory size limit on your DevCloud account:

ulimit -m


To ensure there are no issues in the environment itself, I would also recommend following the environment setup steps listed here: https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/Getting-Started-Samples/IntelModin_GettingStarted#configure-environment
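
The same limit that `ulimit -m` reports can also be read from inside the notebook. The following is a minimal sketch using Python's standard `resource` module (Unix only); `describe_limit` is a hypothetical helper name. Note that the shell typically reports the value in kilobytes, while `getrlimit` returns bytes.

```python
import resource


def describe_limit(value: int) -> str:
    """Render a resource limit as text (hypothetical helper)."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value:,} bytes"


# RLIMIT_RSS is the resident set size limit, which is what `ulimit -m` shows.
soft, hard = resource.getrlimit(resource.RLIMIT_RSS)
print("RSS soft limit:", describe_limit(soft))
print("RSS hard limit:", describe_limit(hard))
```

Running this in the same kernel as the Modin code shows the limits that apply to that process, which may differ from those of a separate terminal session.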




yehudaorel
Employee

Hi Adarsh2, were you able to give the oneAPI sample a try? If there is no response in the next couple of days, the ticket will be closed due to inactivity.


Igor_Z_Intel
Employee

I tried an excerpt from the oneAPI samples:

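import modin.pandas as pd
import numpy as np

# Generate a large random integer array and write it out as a CSV file
array = np.random.randint(low=100, high=100000, size=(2**18, 2**8))
np.savetxt("foo.csv", array, delimiter=",")

# Read the CSV back with Modin and show the first rows
df = pd.read_csv("foo.csv")
print(df.head())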

 

It works on my local machine but not on DevCloud.
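
For a sense of scale, the size of the test file generated above can be estimated with plain arithmetic. This is a rough sketch, not a diagnosis: it assumes `np.savetxt` uses its default `'%.18e'` format and that each value takes 8 bytes in memory. All variable names are illustrative.

```python
# Shape used in the excerpt above: 2**18 rows by 2**8 columns.
rows, cols = 2**18, 2**8
elements = rows * cols

# Assumption: np.savetxt writes each value with its default '%.18e' format.
chars_per_value = len("%.18e" % 100)

# Each value is followed by one delimiter or newline character.
csv_bytes = elements * (chars_per_value + 1)

# Assumption: 8 bytes per value once loaded (64-bit integers or floats).
in_memory_bytes = elements * 8

print(f"elements:          {elements:,}")
print(f"estimated CSV:     {csv_bytes / 1024**3:.2f} GiB")
print(f"estimated in RAM:  {in_memory_bytes / 1024**2:.0f} MiB")
```

Under these assumptions the file is on the order of 1.5 GiB, which is comparable to the roughly 2 GB dataset in the original question.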

yehudaorel
Employee

I believe the issue you are seeing stems from a bug in the Ray library. On DevCloud, you need to add the following lines, as I mentioned in my earlier comment:

import ray
ray.shutdown()
ray.init(_memory=16000 * 1024 * 1024, object_store_memory=500 * 1024 * 1024, _driver_object_store_memory=500 * 1024 * 1024)

After following the configuration and setup instructions here: https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/Getting-Started-Samples/IntelModin_GettingStarted#configure-environment, I was able to run the code with no issues.

[Attached image: modin_devcloud_run.png]

Please make sure your environment is set up properly and add the API calls to the Ray library.

 

Igor_Z_Intel
Employee

[Attached image: Igor_Z_Intel_0-1696520660906.png]

At least with Ray 2.7, it doesn't seem to work for me.

yehudaorel
Employee

Can you try downgrading the ray package to 2.6.1:


pip uninstall ray

pip install ray==2.6.1


Please give this a try and re-export the ipykernel to run the notebook.
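
After reinstalling, it can help to confirm which Ray version the notebook kernel actually sees, since the kernel may use a different environment than the shell where `pip` was run. A minimal sketch using only the standard library follows; `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


ray_version = installed_version("ray")
if ray_version is None:
    print("ray is not installed in this kernel's environment")
elif ray_version == "2.6.1":
    print("ray 2.6.1 is installed, matching the version suggested in this thread")
else:
    print(f"ray {ray_version} is installed; this thread suggests trying 2.6.1")
```

If the reported version is not the one just installed, the kernel is probably pointing at a different environment and needs to be re-exported.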


Igor_Z_Intel
Employee

After installing Ray 2.6.1 and using
 

import ray
ray.shutdown()
ray.init(_memory=16000 * 1024 * 1024, object_store_memory=500 * 1024 * 1024, _driver_object_store_memory=500 * 1024 * 1024)

I was able to read dataset.csv. @Adarsh2, can you please try it this way?
