<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Errors when attempting multi-processing with ITEX XPUs. in AI Tools from Intel</title>
    <link>https://community.intel.com/t5/AI-Tools-from-Intel/Errors-when-attempting-mutli-processing-with-Itex-XPUs/m-p/1630326#M797</link>
    <description>&lt;P&gt;Hi, as discussed on Slack, let me update here as well.&lt;/P&gt;&lt;P&gt;The following error is caused by missing metadata in the Dask DataFrame transformation that uses a lambda function:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;ValueError('Metadata inference failed in lambda.\n\nYou have supplied a custom function and Dask is unable to \ndetermine the type of output that that function returns. \n\nTo resolve this please provide a meta= keyword.\nThe docstring of the Dask function you ran should have more information.\n\nOriginal error is below:\n------------------------\nTypeError("'NAType' object is not iterable")&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;If we add metadata as shown below, the error is no longer thrown:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;meta = ('SubBoard', 'object')&lt;BR /&gt;df["SubBoard"] = df["SubBoard"].map_partitions(&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;lambda part: part.apply(&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;lambda x: [re.sub(r"[\n\t\r\v]", "", value) if pd.notna(value) else value for value in x]&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;),&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;meta=meta,&lt;BR /&gt;)&lt;/P&gt;</description>
    <pubDate>Wed, 11 Sep 2024 07:45:28 GMT</pubDate>
    <dc:creator>Aditya18</dc:creator>
    <dc:date>2024-09-11T07:45:28Z</dc:date>
    <item>
      <title>Errors when attempting multi-processing with ITEX XPUs.</title>
      <link>https://community.intel.com/t5/AI-Tools-from-Intel/Errors-when-attempting-mutli-processing-with-Itex-XPUs/m-p/1619594#M752</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am the CIO at Aleph Innovations, and we use Intel's Developer Cloud to train and test our novel machine learning models. We work on 16 Intel Datacenter GPU Max 1550s, which have a total FP32 performance of ~1000 TFLOPS. We have recently been trying to get our ITEX XPUs to run in parallel, distributing the load from one trial in our Optuna model training function to get quick results. We have run into the following errors and would like to solve this problem, as it has been going on for a while and is bottlenecking our whole product development pipeline. From what I can tell, we initialize the GPUs and get as far as setting memory growth for each of them. Then nothing happens after this point. I have plenty of telemetry in my code that should be showing output if it were running. Attached below are some screenshots of where the code stops and an error that comes up in our initial sdp platform for connecting to the Intel cloud. I am also going to include some code snippets. Please help if possible. If you think you can assist but require further information, please let me know and I will be happy to provide whatever is necessary.&lt;BR /&gt;&lt;BR /&gt;Code Snippets:&lt;BR /&gt;&lt;BR /&gt;## Dependencies, initializing xpus, and defining our xpu strategy&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;import os&lt;BR /&gt;import logging&lt;/P&gt;&lt;P&gt;os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"&lt;BR /&gt;os.environ["ITEX_TILE_AS_DEVICE"] = "0"&lt;BR /&gt;os.environ["ITEX_OMP_THREADPOOL"] = "0"&lt;BR /&gt;os.environ["ZE_FLAT_DEVICE_HIERARCHY"] = "FLAT"&lt;BR /&gt;os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"&lt;BR /&gt;os.environ["ITEX_FP32_MATH_MODE"] = "BF32"&lt;BR /&gt;# os.environ["ITEX_VERBOSE"] = "2" #more intel telemetry&lt;/P&gt;&lt;P&gt;tf_logger = logging.getLogger("tensorflow")&lt;BR /&gt;tf_logger.setLevel(logging.ERROR)&lt;/P&gt;&lt;P&gt;import keras&lt;BR /&gt;import tensorflow as tf&lt;BR /&gt;import pandas as pd&lt;BR /&gt;import re&lt;BR /&gt;import numpy as np&lt;BR /&gt;from tensorflow.keras.models import Sequential&lt;BR /&gt;from tensorflow.keras.layers import (&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;BatchNormalization,&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Conv2D,&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Flatten,&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Dense,&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Activation,&lt;BR /&gt;)&lt;BR /&gt;from sklearn.model_selection import train_test_split&lt;BR /&gt;from tensorflow.keras.optimizers import Adam&lt;BR /&gt;import sklearn&lt;BR /&gt;import sys&lt;BR /&gt;from sklearn.preprocessing import MultiLabelBinarizer&lt;BR /&gt;import optuna&lt;BR /&gt;from tensorflow.keras.callbacks import EarlyStopping&lt;BR /&gt;from tensorflow.keras import layers&lt;BR /&gt;import intel_extension_for_tensorflow as itex&lt;BR /&gt;from optuna.pruners import HyperbandPruner&lt;BR /&gt;from optuna.samplers import TPESampler&lt;BR /&gt;from optuna.integration import DaskStorage&lt;BR /&gt;from dask.distributed import Client, LocalCluster&lt;BR /&gt;import dask.array as da&lt;BR /&gt;import dask.dataframe as dd&lt;BR /&gt;import tqdm&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;tf.get_logger().setLevel("ERROR")&lt;BR /&gt;tf.config.threading.set_intra_op_parallelism_threads(96)&lt;BR /&gt;tf.config.threading.set_inter_op_parallelism_threads(8)&lt;/P&gt;&lt;P&gt;# Set up logging&lt;BR /&gt;logging.basicConfig(&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s"&lt;BR /&gt;)&lt;BR /&gt;logger = logging.getLogger(__name__)&lt;/P&gt;&lt;P&gt;# Configure TensorFlow to use Intel GPUs&lt;BR /&gt;physical_devices = tf.config.list_physical_devices("XPU")&lt;/P&gt;&lt;P&gt;num_XPUs = len(physical_devices)&lt;BR /&gt;print(f"Available xpu devices : {num_XPUs}")&lt;/P&gt;&lt;P&gt;# attempt to reduce memory pressure (unrahul)&lt;BR /&gt;for xpu in physical_devices:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;try:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;tf.config.experimental.set_memory_growth(xpu, True)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print(f"Memory growth set for XPU: {xpu}")&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;except RuntimeError as e:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print(e)&lt;/P&gt;&lt;P&gt;if num_XPUs &amp;gt; 1:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;strategy = tf.distribute.MirroredStrategy(&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;devices=[f"/XPU:{i}" for i in range(num_XPUs)]&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;) # rahul: nitfix&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;logger.info(f"Using MirroredStrategy with {num_XPUs} XPUs") # rahul nit fix&lt;BR /&gt;else:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;strategy = tf.distribute.OneDeviceStrategy(device="/XPU:0")&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;logger.info("Using OneDeviceStrategy with XPU:0")&lt;BR /&gt;&lt;BR /&gt;## using our strategy for computation speed-up&lt;BR /&gt;&lt;BR /&gt;with strategy.scope():&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;## whatever needs a speed-up&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jul 2024 03:46:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/AI-Tools-from-Intel/Errors-when-attempting-mutli-processing-with-Itex-XPUs/m-p/1619594#M752</guid>
      <dc:creator>CMCAlephInnv</dc:creator>
      <dc:date>2024-07-31T03:46:41Z</dc:date>
    </item>
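    <!--
    The device setup in the post above can be sketched with its indentation restored. This is a minimal illustration and not the poster's exact code: `xpu_device_names` and `build_strategy` are hypothetical helper names, and the fallback to TensorFlow's default strategy when no XPU is visible is an added assumption. As in the post, the ITEX_* environment variables are assumed to be set before tensorflow is imported.

```python
def xpu_device_names(num_xpus):
    """Return TensorFlow device strings such as "/XPU:0" for num_xpus devices."""
    return [f"/XPU:{i}" for i in range(num_xpus)]


def build_strategy(tf):
    """Pick a tf.distribute strategy from the visible XPUs.

    `tf` is the already imported tensorflow module (hypothetical calling
    convention, chosen so the helper itself has no import side effects).
    """
    physical_devices = tf.config.list_physical_devices("XPU")
    for xpu in physical_devices:
        try:
            # Memory growth has to be configured before the device is initialized.
            tf.config.experimental.set_memory_growth(xpu, True)
        except RuntimeError as err:
            print(err)

    names = xpu_device_names(len(physical_devices))
    if len(names) > 1:
        return tf.distribute.MirroredStrategy(devices=names)
    if len(names) == 1:
        return tf.distribute.OneDeviceStrategy(device=names[0])
    # Assumption: with no XPU visible, fall back to the default strategy.
    return tf.distribute.get_strategy()
```

    Model construction and compilation would then go inside `with build_strategy(tf).scope():`, as in the last lines of the original snippet.
    -->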
    <item>
      <title>Re: Errors when attempting multi-processing with ITEX XPUs.</title>
      <link>https://community.intel.com/t5/AI-Tools-from-Intel/Errors-when-attempting-mutli-processing-with-Itex-XPUs/m-p/1619814#M754</link>
      <description>&lt;P&gt;Hi Christopher, we would like to inform you that we are routing your query to the dedicated team for further assistance.&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 31 Jul 2024 17:33:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/AI-Tools-from-Intel/Errors-when-attempting-mutli-processing-with-Itex-XPUs/m-p/1619814#M754</guid>
      <dc:creator>Vipin_S_Intel</dc:creator>
      <dc:date>2024-07-31T17:33:22Z</dc:date>
    </item>
    <item>
      <title>Re: Errors when attempting multi-processing with ITEX XPUs.</title>
      <link>https://community.intel.com/t5/AI-Tools-from-Intel/Errors-when-attempting-mutli-processing-with-Itex-XPUs/m-p/1623213#M769</link>
      <description>&lt;P&gt;We have identified that the issue appears to be caused by pandas rather than by ITEX or TensorFlow.&lt;/P&gt;&lt;P&gt;Intel Python experts will help look into this pandas OOM issue on PVC.&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 14 Aug 2024 17:03:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/AI-Tools-from-Intel/Errors-when-attempting-mutli-processing-with-Itex-XPUs/m-p/1623213#M769</guid>
      <dc:creator>Louie_T_Intel</dc:creator>
      <dc:date>2024-08-14T17:03:48Z</dc:date>
    </item>
    <item>
      <title>Re: Errors when attempting multi-processing with ITEX XPUs.</title>
      <link>https://community.intel.com/t5/AI-Tools-from-Intel/Errors-when-attempting-mutli-processing-with-Itex-XPUs/m-p/1630326#M797</link>
      <description>&lt;P&gt;Hi, as discussed on Slack, let me update here as well.&lt;/P&gt;&lt;P&gt;The following error is caused by missing metadata in the Dask DataFrame transformation that uses a lambda function:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;ValueError('Metadata inference failed in lambda.\n\nYou have supplied a custom function and Dask is unable to \ndetermine the type of output that that function returns. \n\nTo resolve this please provide a meta= keyword.\nThe docstring of the Dask function you ran should have more information.\n\nOriginal error is below:\n------------------------\nTypeError("'NAType' object is not iterable")&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;If we add metadata as shown below, the error is no longer thrown:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;meta = ('SubBoard', 'object')&lt;BR /&gt;df["SubBoard"] = df["SubBoard"].map_partitions(&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;lambda part: part.apply(&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;lambda x: [re.sub(r"[\n\t\r\v]", "", value) if pd.notna(value) else value for value in x]&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;),&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;meta=meta,&lt;BR /&gt;)&lt;/P&gt;</description>
      <pubDate>Wed, 11 Sep 2024 07:45:28 GMT</pubDate>
      <guid>https://community.intel.com/t5/AI-Tools-from-Intel/Errors-when-attempting-mutli-processing-with-Itex-XPUs/m-p/1630326#M797</guid>
      <dc:creator>Aditya18</dc:creator>
      <dc:date>2024-09-11T07:45:28Z</dc:date>
    </item>
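    <!--
    The meta= fix in the post above declares the output type so that Dask skips metadata inference. If the real column contains missing cells, the inner lambda may still try to iterate over one at compute time. A minimal sketch of a variant that passes such cells through, assuming each non-missing cell of SubBoard is a list or tuple of strings (an assumption, since the thread does not show the data); `clean_values`, `clean_partition`, and `clean_column` are hypothetical names:

```python
import re

# Same character class as the regex in the thread.
_CONTROL_CHARS = re.compile(r"[\n\t\r\v]")


def clean_values(values):
    """Strip newline, tab, carriage-return and vertical-tab characters from each string in a cell."""
    if not isinstance(values, (list, tuple)):
        # Missing cells (None, NaN, pd.NA) and other scalars pass through unchanged.
        return values
    # Non-string entries inside the list are also left as they are.
    return [_CONTROL_CHARS.sub("", v) if isinstance(v, str) else v for v in values]


def clean_partition(part):
    """Apply clean_values to every cell of one pandas Series partition."""
    return part.apply(clean_values)


def clean_column(df, column="SubBoard"):
    """Return df[column] cleaned partition by partition (df is a Dask DataFrame)."""
    # meta=(name, dtype) describes the output Series, so Dask does not have to infer it.
    return df[column].map_partitions(clean_partition, meta=(column, "object"))
```

    With a Dask DataFrame `df`, usage would be `df["SubBoard"] = clean_column(df)`. Named functions are used instead of nested lambdas so that any traceback points at a specific helper.
    -->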
    <item>
      <title>Re: Errors when attempting multi-processing with ITEX XPUs.</title>
      <link>https://community.intel.com/t5/AI-Tools-from-Intel/Errors-when-attempting-mutli-processing-with-Itex-XPUs/m-p/1630327#M798</link>
      <description>&lt;P&gt;Based on the previous comments, I infer that you are preprocessing the data with a DataFrame before converting the NumPy array to .npz.&lt;/P&gt;&lt;P&gt;Pandas/Dask data preprocessing is not supported on Intel GPUs.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, it can run very well on CPUs with Modin (Ray or Dask backend):&amp;nbsp;&lt;A href="https://modin.readthedocs.io/en/stable/getting_started/using_modin/using_modin_cluster.html" rel="noopener noreferrer" target="_blank"&gt;https://modin.readthedocs.io/en/stable/getting_started/using_modin/using_modin_cluster.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://arunjose696.github.io/modin_perf_examples/gh_page_4.html" rel="noopener noreferrer" target="_blank"&gt;https://arunjose696.github.io/modin_perf_examples/gh_page_4.html&lt;/A&gt;&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 11 Sep 2024 07:46:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/AI-Tools-from-Intel/Errors-when-attempting-mutli-processing-with-Itex-XPUs/m-p/1630327#M798</guid>
      <dc:creator>Aditya18</dc:creator>
      <dc:date>2024-09-11T07:46:30Z</dc:date>
    </item>
    <item>
      <title>Re: Errors when attempting multi-processing with ITEX XPUs.</title>
      <link>https://community.intel.com/t5/AI-Tools-from-Intel/Errors-when-attempting-mutli-processing-with-Itex-XPUs/m-p/1630328#M799</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;This particular Dask issue was resolved by passing meta to the Dask DataFrame operation.&lt;/P&gt;&lt;P&gt;Pandas does not support Intel GPUs. However, pandas workloads can be optimized on a multi-node CPU cluster using Modin.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Closing the ticket. Feel free to raise a separate ticket for further issues.&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 11 Sep 2024 07:47:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/AI-Tools-from-Intel/Errors-when-attempting-mutli-processing-with-Itex-XPUs/m-p/1630328#M799</guid>
      <dc:creator>Aditya18</dc:creator>
      <dc:date>2024-09-11T07:47:11Z</dc:date>
    </item>
  </channel>
</rss>

