Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Generative or LLM models inferencing for higher batches

Shravanthi
Beginner

Hi,

 

How can I collect inference results for Stable Diffusion and Llama 2 models at higher batch sizes, and how can I run these models on an Intel GPU?

2 Replies
Peh_Intel
Moderator

Hi Shravanthi,


OpenVINO™ offers two main paths for Generative AI use cases:

  • Using OpenVINO as a backend for Hugging Face frameworks (transformers, diffusers) through the Optimum Intel extension.
  • Using OpenVINO native APIs (Python and C++) with custom pipeline code.


For more information, you can refer to the Optimize and Deploy Generative AI Models.


In addition, there are a few Jupyter notebook tutorials for OpenVINO™ on running Generative AI models.



Regards,

Peh


Peh_Intel
Moderator

Hi Shravanthi,


This thread will no longer be monitored since we have provided a suggestion and an answer. If you need any additional information from Intel, please submit a new question.



Regards,

Peh

