AI Tools from Intel
Find answers to your toolkit installation, configuration, and get-started questions.

Developing on top of the Intel Extension for Transformers

AlignmentLabAI
Beginner
1,754 Views

Is there a well-known or recommended way to scale out on the Sapphire Rapids processors it's optimized for, once we've finished building the most efficient stack we can on them? I'm trying to get my organization comfortable with Intel CPU frameworks in general, in preparation for Gaudi representing more of the available ecosystem going forward. So I'm trying to optimize our experiment spend and understand what scaling a CPU-focused stack looks like.

0 Kudos
1 Reply
kta_intel
Moderator
1,731 Views

Hey, thanks for the question. Glad to hear your org is interested in Intel CPUs and Gaudi.

 

Can you clarify what you mean by scaling in this context?

 

Overall, it will probably depend on your specific use case. But if you're interested in keeping Intel Extension for Transformers performant on Sapphire Rapids as you expand your user base, and/or in optimizing it to stay performant as you increase your server size, the general recommendation is to start by leveraging low-precision data types (i.e., INT4/INT8) and data parallelism (e.g., DDP): https://github.com/intel/intel-extension-for-transformers
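To make the low-precision recommendation above concrete, here is a minimal concept sketch of symmetric INT8 weight quantization in plain NumPy. This is only an illustration of what an INT8 representation does to model weights; it is not the intel-extension-for-transformers API, which configures quantization through its own interfaces.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float weights to int8 values plus a single scale factor
    (symmetric per-tensor quantization, a common INT8 scheme)."""
    scale = np.abs(w).max() / 127.0                # largest magnitude maps to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 codes."""
    return q.astype(np.float32) * scale

# Toy weight matrix standing in for a transformer layer's weights.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# Rounding to the nearest quantization step bounds the error by scale / 2.
max_err = float(np.abs(w - w_hat).max())
print(q.dtype, max_err <= scale / 2 + 1e-6)
```

The payoff in practice is that INT8 (or INT4) weights take 4x (or 8x) less memory bandwidth than FP32, which is typically the bottleneck for transformer inference on CPUs such as Sapphire Rapids.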

0 Kudos
Reply