I am running the code below to use stabilityai/stable-diffusion-xl-base-1.0 from Hugging Face, optimized with Optimum Intel for OpenVINO. While the inference runs, I watch it in htop and see that it uses only half of the logical cores on the Xeon server. I am on an m7i.8xlarge EC2 instance in AWS, which has 16 physical cores (32 logical cores) on a single-socket Sapphire Rapids Xeon server, running Ubuntu 24.04.
To launch JupyterLab so that it can use all 32 logical cores, I used the command below:
taskset -c 0-31 jupyter-lab --ip 0.0.0.0 --no-browser --allow-root
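To confirm inside the notebook that the pinning took effect, a quick check with Python's os.sched_getaffinity (Linux only) can be run first:
import os
# The set of logical CPUs this process is allowed to run on; with the
# taskset above it should contain all 32 cores (0-31).
print(sorted(os.sched_getaffinity(0)))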
This is the code I am running in the jupyter notebook:
!pip install --upgrade-strategy eager "optimum[openvino]"
!pip install diffusers
!pip install huggingface_hub
from huggingface_hub import notebook_login
notebook_login()
from optimum.intel import OVStableDiffusionXLPipeline
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
pipeline = OVStableDiffusionXLPipeline.from_pretrained(model_id, export=True)
# Don't forget to save the exported model
pipeline.save_pretrained("openvino-sd-xl-base-1.0")
# Inference code: load the OpenVINO-converted model and run text-to-image generation.
import os
# Set the threading environment variables before importing the inference
# libraries so they are picked up when the runtimes initialize.
os.environ["OMP_NUM_THREADS"] = "32"
os.environ["MKL_NUM_THREADS"] = "32"
os.environ["OPENBLAS_NUM_THREADS"] = "32"
os.environ["NUMEXPR_NUM_THREADS"] = "32"
from optimum.intel import OVStableDiffusionXLPipeline
from openvino.runtime import Core
# Initialize OpenVINO's Core object
core = Core()
# Set the number of threads to the total number of logical processors (vCPUs)
core.set_property("CPU", {
    "INFERENCE_NUM_THREADS": "32",
    "NUM_STREAMS": "1",
    "CPU_BIND_THREAD": "YES"  # Bind threads to specific cores
})
# Load the OpenVINO IR format model using the custom Core object
pipeline = OVStableDiffusionXLPipeline.from_pretrained("openvino-sd-xl-base-1.0", ov_core=core)
# Run inference for text-to-image generation
prompt = "boat in an ocean"
image = pipeline(prompt, num_inference_steps=50).images[0]
# Display the generated image
from IPython.display import display
display(image)
Hi Rajiv Mandal,
Thanks for reaching out.
Could you enable all of the CPU cores through multi-threading optimization? The properties below control which CPU resources are made available for model inference; the OpenVINO Runtime performs its multi-threading scheduling based on the CPUs made available, provided the platform and operating system support this behavior.
- ov::inference_num_threads
- ov::hint::scheduling_core_type
- ov::hint::enable_hyper_threading
Please take a look at Multi Threading Optimization for more information; a sketch of how these hints can be set from Python follows below.
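For example (a minimal sketch, assuming OpenVINO 2023.0 or newer for these property names, and reusing the ov_core argument from your snippet), the hints can be set on the Core object before the pipeline is loaded:
from openvino.runtime import Core
from optimum.intel import OVStableDiffusionXLPipeline

core = Core()
core.set_property("CPU", {
    "INFERENCE_NUM_THREADS": 32,         # ov::inference_num_threads
    "SCHEDULING_CORE_TYPE": "ANY_CORE",  # ov::hint::scheduling_core_type
    "ENABLE_HYPER_THREADING": True,      # ov::hint::enable_hyper_threading
})
pipeline = OVStableDiffusionXLPipeline.from_pretrained(
    "openvino-sd-xl-base-1.0", ov_core=core
)
With ENABLE_HYPER_THREADING left at its default, the CPU plugin typically schedules one thread per physical core for latency-oriented inference, which would match the roughly 50% logical-core utilization you are seeing in htop.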
Regards,
Aznie
Hi Rajiv Mandal,
This thread will no longer be monitored since we have provided the information. If you need any additional information from Intel, please submit a new question.
Regards,
Aznie
