Enhance Automatic Speech Recognition with Intel AI Solutions: Developer Spotlight

Ramya_Ravi · ‎09-24-2024

Automatic Speech Recognition (ASR) uses AI to convert spoken language into readable text. The technology has advanced rapidly over the last decade, and ASR systems are now common in voice assistants such as Siri and Alexa, as well as in transcription services.

In her blog, Usha Rengaraju proposes a project for building accurate conversational AI systems that banks can use to improve their voice- and text-based customer interactions. The blog demonstrates how to use a pretrained ASR model, Whisper-tiny, with the MINDS14 dataset.

The blog explains the various steps involved in the project:

Install and import all the necessary libraries
Download the MINDS14 dataset using Hugging Face CLI
Perform exploratory data analysis and preprocessing
Model training
Model inferencing
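
As a rough illustration of the loading, preprocessing, and inference steps, here is a minimal sketch that uses the Hugging Face datasets and transformers libraries. It is not the code from the blog. The Hub IDs "PolyAI/minds14" and "openai/whisper-tiny", the "en-US" subset, and all helper names are assumptions made for this example. Fine-tuning is left out, and dataset loading may need adjusting for the version of datasets you have installed.

```python
import re

# Assumed identifiers for this sketch; the blog may use different ones.
MODEL_ID = "openai/whisper-tiny"   # pretrained Whisper checkpoint
DATASET_ID = "PolyAI/minds14"      # assumed Hugging Face Hub ID for MINDS14
DATASET_CONFIG = "en-US"           # assumed language subset
TARGET_SAMPLING_RATE = 16_000      # Whisper models expect 16 kHz audio


def normalize_transcript(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s']", " ", text.lower())
    return " ".join(text.split())


def load_minds14(split="train"):
    """Download MINDS14 and resample its audio to the rate Whisper expects."""
    # Imported lazily so the text helper above works without these packages.
    from datasets import Audio, load_dataset

    dataset = load_dataset(DATASET_ID, name=DATASET_CONFIG, split=split)
    return dataset.cast_column("audio", Audio(sampling_rate=TARGET_SAMPLING_RATE))


def build_asr_pipeline():
    """Create a speech recognition pipeline around the pretrained model."""
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=MODEL_ID)


def transcribe(asr, example):
    """Run inference on one dataset example and return normalized text."""
    audio = example["audio"]
    result = asr({"raw": audio["array"], "sampling_rate": audio["sampling_rate"]})
    return normalize_transcript(result["text"])
```

With the libraries installed, calling transcribe(build_asr_pipeline(), load_minds14()[0]) would transcribe the first recording. The resampling step matters because MINDS14 audio is recorded at 8 kHz, while Whisper was trained on 16 kHz input.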

Read more about the project on Medium and GitHub.

Intel’s AI Tools and Technologies Used

The project is developed using Intel® Extension for PyTorch* and Intel® Neural Compressor.

Intel Extension for PyTorch: The Intel extension expands PyTorch with up-to-date features and optimizations for an extra performance boost on Intel hardware. Check out how to install Intel Extension for PyTorch. The extension can be loaded as a Python module or linked as a C++ library. Python users can enable it dynamically by importing intel_extension_for_pytorch.

The CPU tutorial gives detailed information about Intel Extension for PyTorch for Intel CPUs. Source code is available at the main branch.
The GPU tutorial gives detailed information about Intel Extension for PyTorch for Intel GPUs. Source code is available at the xpu-main branch.
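
A minimal sketch of how the extension is typically enabled for inference from Python is shown below, assuming the model is a standard torch.nn.Module. The helper name optimize_for_intel is hypothetical, and the fallback to the unmodified model when the extension is not installed is a choice made for this example, not behavior of the library.

```python
def optimize_for_intel(model):
    """Return an IPEX-optimized model, or the original if IPEX is missing."""
    try:
        # Importing the module is what enables the extension in Python.
        import intel_extension_for_pytorch as ipex
    except ImportError:
        # Assumption for this sketch: silently fall back to stock PyTorch.
        return model

    model.eval()                 # inference mode, required before optimizing
    return ipex.optimize(model)  # apply Intel-specific optimizations
```

The returned model is called exactly like the original one, so the rest of the inference code does not need to change.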

Intel Neural Compressor: This is an open-source Python library that runs on CPUs or GPUs, which:

Performs model quantization to reduce the model size and increase the speed of deep learning inference for deployment.
Automates popular methods such as quantization, compression, pruning, and knowledge distillation across multiple deep-learning frameworks.
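
To show why quantization reduces model size, here is a small pure-Python sketch of symmetric 8-bit quantization. It illustrates the general idea only and is not Intel Neural Compressor's API or algorithm. The function names and sample weights are made up for this example.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in quantized]


# Hypothetical weights standing in for one layer of a model.
weights = [0.82, -1.27, 0.05, 0.0, 0.431, -0.9]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("quantized values:", quantized)
print("scale:", scale)
print("largest round-trip error:", max_error)
```

Each value is stored in one byte instead of the four bytes of a 32-bit float, and the round-trip error stays within half of the scale factor. Production tools add calibration and accuracy-aware tuning on top of this basic mapping.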

What’s Next?

We encourage you to check out Intel’s other AI/ML framework optimizations and tools and incorporate them into your AI workflow. You can also learn about the unified, open, standards-based oneAPI programming model, which forms the foundation of Intel’s AI Software Portfolio and helps you prepare, build, deploy, and scale your AI solutions.

About the Author:
Usha Rengaraju is an AI consultant and the world’s first woman triple Kaggle Grandmaster. She specializes in deep learning and generative AI. She was ranked among the top ten data scientists in India for 2020 by Analytics India Magazine and among the top ten women data scientists for 2021 by Analytics Insight magazine.