Empowering Developers with the full potential of generative AI
Here is What’s New in This Release:
Improved User Experience — We’ve improved framework integrations, giving developers more choice, and enhanced the experience and performance of Hugging Face transformer models.
- NEW: PyTorch Model Support – Developers can now use their API of choice, PyTorch or OpenVINO, for added performance benefits. PyTorch models can be automatically imported and converted with the convert_model() API for use with native OpenVINO toolkit APIs (see the first sketch after this list).
- PREVIEW: torch.compile – Developers can now use OpenVINO as a backend for PyTorch’s torch.compile, running the OpenVINO toolkit through familiar PyTorch APIs (see the second sketch after this list). This feature has also been integrated into the Automatic1111 Stable Diffusion Web UI, accelerating Stable Diffusion 1.5 and 2.1 on Intel CPUs and GPUs on both Linux and Windows.
- Improvements with Optimum Intel. Hugging Face and Intel have continued to enhance top generative AI models by optimizing execution, so your models run faster and more efficiently on both CPU and GPU, with OpenVINO serving as the inference runtime. We’ve also enabled automatic PyTorch import and conversion, and added support for weight compression for further performance gains (see the third sketch after this list).
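To illustrate the new PyTorch conversion path, here is a minimal sketch; the torchvision ResNet-50 is just a placeholder model, and any torch.nn.Module should convert the same way:

```python
import numpy as np
import torch
import torchvision
import openvino as ov

# Placeholder model; any torch.nn.Module works the same way.
torch_model = torchvision.models.resnet50(weights="DEFAULT").eval()

# Convert the in-memory PyTorch model directly; example_input lets
# OpenVINO trace the model and infer input shapes.
ov_model = ov.convert_model(torch_model, example_input=torch.randn(1, 3, 224, 224))

# Compile for a device and run through native OpenVINO APIs.
compiled = ov.compile_model(ov_model, "CPU")
result = compiled(np.random.rand(1, 3, 224, 224).astype(np.float32))

# Optionally save to OpenVINO IR for reuse.
ov.save_model(ov_model, "resnet50.xml")
```

The torch.compile preview looks roughly like this; registering the backend via `import openvino.torch` follows the OpenVINO documentation, but treat the exact import as an assumption for your installed version:

```python
import torch
import openvino.torch  # noqa: F401 -- assumed to register the "openvino" backend

# A toy model stands in for a real workload.
model = torch.nn.Sequential(
    torch.nn.Linear(16, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 8),
).eval()

# The first call triggers graph capture and OpenVINO compilation;
# subsequent calls reuse the compiled graph.
compiled_model = torch.compile(model, backend="openvino")
output = compiled_model(torch.randn(1, 16))
```

And a sketch of the Optimum Intel path; the "gpt2" checkpoint is a placeholder, and `export=True` converts the PyTorch weights to OpenVINO on the fly:

```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "gpt2"  # placeholder; any supported causal-LM checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_id)

# export=True triggers on-the-fly conversion to OpenVINO.
model = OVModelForCausalLM.from_pretrained(model_id, export=True)

inputs = tokenizer("OpenVINO is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```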
Generative AI and LLM support — Paired with additional model compression techniques, OpenVINO gives developers more options when exploring LLMs, including the most prominent models.
- OpenVINO is now faster and more accessible for generative AI. We have made significant strides in improving runtime performance and optimizing memory usage, particularly for LLMs.
- We’ve enabled models used for chatbots, instruction following, code generation, and many more, including prominent models like BLOOM, Dolly, Llama 2, GPT-J, GPT-NeoX, ChatGLM, and Open-LLaMA, as well as LoRA fine-tuned variants and the Automatic1111 Stable Diffusion Web UI.
- LLMs are now faster on GPU. We’ve expanded model coverage for dynamic shapes, further improving the performance of generative AI workloads on both integrated and discrete GPUs, and we’ve improved memory reuse and reduced weight memory consumption for dynamic shapes.
- Neural Network Compression Framework (NNCF) now includes an 8-bit weight compression method, making it easier to compress and optimize LLMs (see the sketch after this list). We’ve also added SmoothQuant for more accurate and efficient post-training quantization of large language models.
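As a sketch of the 8-bit weight compression workflow, assuming an OpenVINO IR model already on disk (the file name is hypothetical):

```python
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("llm.xml")  # hypothetical path to an OpenVINO IR model

# Compress floating-point weights to 8-bit; activations stay in floating
# point, so no calibration dataset is required.
compressed = nncf.compress_weights(model)

ov.save_model(compressed, "llm_int8.xml")
```

Because only the weights are quantized, this trades a small accuracy risk for a large reduction in model size and memory bandwidth, which is where LLM inference is usually bound.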
More portability and performance — Develop once, deploy anywhere. OpenVINO enables developers to run AI at the edge, in the cloud, or locally.
- OpenVINO is now integrated with MediaPipe, giving developers direct access to this framework for building multipurpose AI pipelines. Integrate easily with OpenVINO Runtime and OpenVINO Model Server for faster AI model execution, benefit from seamless model management and version control, add custom logic with additional calculators and graphs for tailored AI solutions, and scale faster by delegating deployment to remote hosts via gRPC/REST interfaces for distributed processing (see the serving sketch after this list).
- NEW: Full support for Intel® Core™ Ultra (code name Meteor Lake). This new generation of Intel CPUs is tailored to excel at AI workloads with Intel AI Boost, a built-in inference accelerator exposed through the NPU plugin (Neural Processing Unit). And you aren’t limited to a single AI accelerator: workloads can also run on the integrated GPU and the CPU (see the device-selection sketch after this list).
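To illustrate remote deployment, here is a sketch of calling an OpenVINO Model Server instance over REST; the host, port, and model name are placeholders, and the request format follows the TensorFlow Serving-style API that OVMS exposes:

```python
import numpy as np
import requests

# Placeholders: an OVMS instance serving a model named "resnet" on port 8000.
url = "http://localhost:8000/v1/models/resnet:predict"

# TensorFlow Serving-style payload: a batch of input tensors as nested lists.
payload = {"instances": np.random.rand(1, 3, 224, 224).astype(np.float32).tolist()}

response = requests.post(url, json=payload)
predictions = response.json()["predictions"]
```

And a device-selection sketch for Intel® Core™ Ultra; the "NPU" device name follows the OpenVINO plugin docs, but availability depends on your drivers and OpenVINO build, so treat the device strings as assumptions:

```python
import openvino as ov

core = ov.Core()
print(core.available_devices)  # e.g. ['CPU', 'GPU', 'NPU'] on Core Ultra

model = core.read_model("model.xml")  # hypothetical IR path

# Target the NPU explicitly, or let AUTO pick among the accelerators.
compiled_on_npu = core.compile_model(model, "NPU")
compiled_auto = core.compile_model(model, "AUTO:NPU,GPU,CPU")
```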
Download the 2023.1 Release
Download Latest Release Now
Get all the details
See 2023.1 release notes