
NEW RELEASE: OpenVINO toolkit 2023.3 LTS is now available!

Luis_at_Intel
Moderator

We are proud to announce the newest release of the OpenVINO toolkit: 2023.3 LTS is now available. This version focuses on further LLM performance improvements to enable your generative AI workloads with OpenVINO.


What’s new in this release:


More Gen AI coverage and framework integrations to minimize code changes.

  • Introducing the OpenVINO Gen AI repository on GitHub, which demonstrates native C and C++ pipeline samples for LLMs. String tensors as inputs and native tokenizer support are now available, reducing overhead and easing production deployment.
  • New and noteworthy models validated: Mistral, Zephyr, Qwen, ChatGLM3, and Baichuan2.
  • New Jupyter Notebooks for Latent Consistency Models (LCM) and Distil-Whisper. Updated the LLM Chatbot notebook to include the Neural Chat, TinyLlama, ChatGLM, Qwen, Notus, and Youri models.
  • torch.compile is now fully integrated with OpenVINO and includes a hardware 'options' parameter, allowing seamless selection of inference hardware by leveraging the OpenVINO plugin architecture; a sketch follows below.
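
For illustration, here is a minimal sketch of the torch.compile integration, assuming OpenVINO 2023.3 and PyTorch 2.x are installed; the torchvision ResNet-18 model stands in for your own model:

```python
import torch
import openvino.torch  # registers the "openvino" backend for torch.compile
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # placeholder model

# The 'options' dict selects the inference device through the OpenVINO
# plugin architecture, e.g. "CPU" or "GPU".
compiled_model = torch.compile(model, backend="openvino", options={"device": "CPU"})

with torch.no_grad():
    output = compiled_model(torch.randn(1, 3, 224, 224))
```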

 

Broader LLM model support and more model compression techniques.

  • As part of the Neural Network Compression Framework (NNCF), INT4 weight compression model formats are now fully supported on Intel® Xeon® CPUs in addition to Intel® Core™ and iGPU, unlocking higher performance, lower memory usage, and accuracy trade-off options when using LLMs (see the sketch after this list).
  • Improved performance of transformer-based LLMs on CPU using a stateful model technique, which increases memory efficiency by sharing internal states across multiple inference iterations.
  • Tokenizer and torchvision transform support is now available in the OpenVINO runtime (via a new API), requiring less preprocessing code and enhancing performance by handling this model setup automatically.
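
As a rough sketch of the INT4 weight compression flow, assuming a model already converted to OpenVINO IR (the file names are illustrative):

```python
import openvino as ov
import nncf

core = ov.Core()
model = core.read_model("llm.xml")  # hypothetical path to your converted LLM

# Compress weights to symmetric INT4; `ratio` is the share of weights kept
# at INT4, with the remainder falling back to INT8 to preserve accuracy.
compressed_model = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.INT4_SYM,
    ratio=0.8,
    group_size=128,
)

ov.save_model(compressed_model, "llm_int4.xml")
```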

 

More portability and performance to run AI at the edge, in the cloud, or locally.

  • Full support for 5th Generation Intel® Xeon® processors (codename Emerald Rapids), delivering on the AI-everywhere promise.
  • Further optimized performance on Intel® Core™ Ultra (codename Meteor Lake) CPUs with the latency hint, by leveraging both P-cores and E-cores (see the sketch after this list).
  • Improved performance on ARM platforms with the throughput hint, by using CPU cores and memory bandwidth more efficiently.
  • Preview of the JavaScript API to enable Node.js development, with the JavaScript bindings accessible via source code.
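
The latency and throughput hints mentioned above are selected at compile time; a minimal sketch (the model path is illustrative):

```python
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")  # hypothetical IR path

# LATENCY favors single-stream response time (leveraging P-cores/E-cores on
# hybrid CPUs); THROUGHPUT favors aggregate use of cores and memory bandwidth.
compiled_latency = core.compile_model(model, "CPU", {"PERFORMANCE_HINT": "LATENCY"})
compiled_throughput = core.compile_model(model, "CPU", {"PERFORMANCE_HINT": "THROUGHPUT"})
```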

Improved model serving of LLMs through OpenVINO Model Server. This not only enables LLM serving over the KServe v2 gRPC and REST APIs for greater flexibility but also improves throughput by running processing such as tokenization on the server side; a sketch of a REST request follows below.
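
For a sense of the KServe v2 REST flow, here is a hedged sketch; the server address, model name ("my_llm"), and input tensor name ("prompt") are assumptions that depend on your deployment:

```python
import requests

# KServe v2 inference request; string data is sent as a BYTES tensor.
payload = {
    "inputs": [
        {"name": "prompt", "shape": [1], "datatype": "BYTES",
         "data": ["What is OpenVINO?"]}
    ]
}
resp = requests.post("http://localhost:8000/v2/models/my_llm/infer", json=payload)
print(resp.json())
```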

 

Download the 2023.3 LTS Release 
Download Latest Release Now

 

Get all the details 
See 2023.3 LTS release notes

 

