Intel® Distribution of OpenVINO™ Toolkit
Community assistance for the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

NEW RELEASE: OpenVINO toolkit 2023.2 now available!

Luis_at_Intel
Moderator

We are proud to announce our newest release, OpenVINO 2023.2. This version focuses on improving LLM performance for generative AI workloads.

 

Here is what’s new in this release:

 

Improved User Experience — We’ve made model conversion easier and increased the availability of OpenVINO on package managers and through the Hugging Face ecosystem.

  • Expanded model support for direct PyTorch model conversion – automatically convert additional models directly from PyTorch, or execute them via torch.compile with OpenVINO as the backend.
  • Easier optimization and conversion of Hugging Face models – compress LLMs to int8 with the Hugging Face Optimum command-line interface and export models to OpenVINO format.
  • OpenVINO is now available on the Conan package manager, enabling more seamless package management in large-scale projects for C and C++ developers.

Gen AI and LLM enhancements — We’ve expanded model support and accelerated inference with the addition of int4-optimized model weight compression.

  • New and noteworthy models supported - We’ve enabled models used for chatbots, instruction following, code generation, and more, including prominent models such as LLaVA, ChatGLM, Bark (text to audio), and Latent Consistency Models (LCM, an optimized version of Stable Diffusion).
  • Accelerated inference for LLMs on Intel® Core™ CPUs and iGPUs using int8 and int4 model weight compression.
  • Expanded model support for dynamic shapes for improved performance on GPU.
  • Preview support for the int4 model format is now included. Int4-optimized model weights are available to try on Intel® Core™ CPUs and iGPUs to accelerate models such as Llama 2 and ChatGLM2.
  • The following int4 model compression formats are supported for inferencing in runtime:
    • Generative Pre-trained Transformer Quantization (GPTQ); GPTQ-compressed models can be accessed through Hugging Face repositories.
    • Native int4 compression through NNCF.

More portability and performance — Develop once, deploy anywhere. OpenVINO enables developers to run AI at the edge, in the cloud, or locally.

  • In 2023.1 we announced full support for the Arm architecture; in this release we have improved performance by enabling FP16 model formats for LLMs and integrating additional acceleration libraries to reduce latency.

 

Download the 2023.2 Release 
Download Latest Release Now

 

Get all the details 
See 2023.2 release notes

 

Helpful Links

