
Fine-Tuning LLMs with LoRA on Intel GPUs – for Predictions, E-commerce, and Personal Assistants

Eugenie_Wirz
Employee

How startups from the Intel® Liftoff program leveraged Intel® Data Center GPU Max Series and 4th Gen Intel® Xeon® Scalable processors to unleash the potential of LLM-powered applications

Low-Rank Adaptation of Large Language Models (LoRA) is a parameter-efficient fine-tuning approach developed by Microsoft Research*, which has gained recent attention with the upswing in interest in large language models (LLMs). It has become a standard way to scale LLM fine-tuning and customization.
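To see why LoRA is parameter-efficient: instead of updating a full d×k weight matrix, LoRA freezes it and trains two small low-rank factors, B (d×r) and A (r×k), whose product is added to the frozen weights. The plain-Python sketch below compares trainable-parameter counts; the dimensions and rank are illustrative values, not taken from this article:

```python
# Sketch: trainable-parameter savings from LoRA on a single weight matrix.
# The 4096x4096 size and rank r=8 are hypothetical, illustrative choices.

def full_finetune_params(d: int, k: int) -> int:
    """Updating the full weight matrix trains d*k parameters."""
    return d * k

def lora_params(d: int, k: int, r: int) -> int:
    """LoRA freezes W and trains B (d x r) and A (r x k) instead."""
    return d * r + r * k

d = k = 4096
r = 8
full = full_finetune_params(d, k)   # 16,777,216 parameters
low_rank = lora_params(d, k, r)     # 65,536 parameters
print(f"reduction: {full // low_rank}x fewer trainable parameters")
```

At rank 8 on a 4096×4096 matrix this is a 256× reduction in trainable parameters for that layer, which is why LoRA fine-tuning fits comfortably on a single GPU.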

Until recently, this work has been executed on Nvidia* GPUs with CUDA. In May 2023, however, three Intel® Liftoff program startups achieved a significant milestone by fine-tuning LLMs with LoRA for the first time on Intel® Data Center GPU Max Series hardware in the Intel® Developer Cloud during a five-day AI hackathon.

The virtual hackathon gave six AI startups access to 4th Gen Intel® Xeon® processors and Intel Max Series GPUs on the Intel Developer Cloud. They were tasked with exploring the potential of next-generation LLM-based applications. The startups fine-tuned and deployed models ranging from 3 to 7 billion parameters on the Intel Data Center GPU Max 1100, hosted on 4th Gen Intel Xeon processor systems, using Intel AI and oneAPI tools.

Out of the six teams participating, three startups were selected for the final showcase to demonstrate their innovative applications built around customized LLMs. The models (Dolly 7b and OpenLLama 3b) for all three of these applications were fine-tuned on a single Intel Max Series GPU using the Intel® oneAPI software stack, including the Intel® oneAPI Base Toolkit, Intel® oneAPI Deep Neural Network Library (oneDNN), the Intel® oneAPI DPC++/C++ Compiler with SYCL* runtime, Intel® Extension for PyTorch, and the Intel® Extension for Transformers.
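The article does not include the teams' training code, but a minimal LoRA setup on an Intel GPU, assuming the Hugging Face transformers/peft libraries and Intel Extension for PyTorch, might look roughly like this (the checkpoint ID and hyperparameters are illustrative, not the teams' actual configuration):

```python
# Hedged sketch only: assumes an Intel XPU device is available along with
# intel_extension_for_pytorch, transformers, and peft; this is not the
# hackathon teams' actual code.
import torch
import intel_extension_for_pytorch as ipex  # enables the "xpu" device in PyTorch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "openlm-research/open_llama_3b"  # one of the model families named above
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Attach low-rank adapters; target modules vary by model architecture.
lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"],
                      task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)

model = model.to("xpu")  # Intel Data Center GPU Max
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
model, optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)

# ...a standard training loop over the fine-tuning dataset goes here...
```

With adapters attached, only the LoRA matrices receive gradients, which is what makes single-GPU fine-tuning of multi-billion-parameter models practical.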

The innovative prowess of the three startups manifested in the applications they built around customized LLMs. Moonshot AI* is breaking new ground in leveraging LLMs to make predictions. Moonshot AI co-founder Daniel Han-Chen described their experience: "We had a hectic week, it was extremely useful, and we learned a lot… the Intel GPUs were fabulous… The Intel optimizations allow the models to train much faster."

SiteMana* is harnessing LLMs to automate e-commerce marketing. As founder Peter Ma puts it: "With the stellar performance of Intel's GPU, Dolly 2.0, and OpenLlama at our disposal during the hackathon, we at SiteMana were able to build an LLM model inspired by state-of-the-art chatbots. The model was fine-tuned to write personalized emails, and the deployment and testing phase was surprisingly seamless, exceeding our expectations."

Additionally, Selecton Technologies* is carving a niche by developing an AI personal assistant for gamers utilizing LLMs. Yevgen Lopatin, Selecton's co-founder, described their experience during the LLM Hackathon sprint: "The sprint provided us with valuable GPU resources to validate our solution. During a one-day training session, we fine-tuned the Dolly LLM model using the LoRA training script on an Intel Data Center GPU Max 1100 with 48 GB of VRAM. With the availability of the LLM platform, we explored uncharted territories in computational capacity, achieving exceptional results."

Not only did these three startups fine-tune models, they also built applications around them that can be commercialized. This hackathon demonstrated that startups and larger enterprises can access everything needed to build business applications with virtual hardware on the Intel Developer Cloud. The Intel® Developer Cloud, currently in beta, provides early access to new Intel hardware and future technologies to prequalified customers and Intel Liftoff program members.

Intel Liftoff is a free virtual program designed to help early-stage tech startups scale innovation faster. The Intel Liftoff program is centered on three key pillars that leverage Intel’s reach, scale, and resources.

  • Stellar technical expertise and support to accelerate time to market
  • Access to cutting-edge technology to innovate ahead of the curve
  • Impactful go-to-market support to stand out from the competition

So, if you’re an AI startup and want access to Intel Developer Cloud and other resources, join the Intel Liftoff program. Come build with us!

Intel Liftoff is on a mission to partner with the world's most promising AI startups, encouraging them to create and innovate on Intel platforms. We are not only part of the present AI landscape but are actively shaping its future. Check this space next week for a technical blog about how we achieved this and how you can do it too!

Read more about the Intel® Liftoff for Startups program: https://developer.intel.com/liftoff
Here is our blog article about the second edition of the Intel® Liftoff LLM Hackathon.
Contact the program team: Ralph de Wargny, Rahul Unnikrishnan Nair, and Ryan Metz

Learn more about Intel AI and oneAPI tools.  

*Other names and brands may be claimed as the property of others.  SYCL is a trademark of the Khronos Group Inc.

About the Author
I'm a proud team member of the Intel® Liftoff for Startups, an innovative, free virtual program dedicated to accelerating the growth of early-stage AI startups.