Analyzers
Talk to fellow users of Intel Analyzer tools (Intel VTune™ Profiler, Intel Advisor)

Many stalled stages detected on ARC770 dGPU when running an LLM model

WeiSeng
Employee
1,533 Views

Hi Team,

 

Recently we have been using VTune to analyze LLM model inference on the ARC770 dGPU.

 

The VTune trace shows many stalled stages instead of active stages.

 

I have attached a picture captured from VTune.

 

Any suggestions?

 

0 Kudos
5 Replies
NormanS_Intel
Moderator
1,421 Views

Hello WeiSeng,


Thank you for posting in the community!


To further investigate this issue, could you please confirm if you are using Intel® VTune™ Profiler? If not, could you provide the exact name of the software you are using?


Best regards,

Norman S.

Intel Customer Support Engineer


0 Kudos
WeiSeng
Employee
1,393 Views

Hello,

 

Yes, we are using the Intel VTune Profiler to capture the ARC770 GPU metrics.

 

Thanks!

0 Kudos
JedG_Intel
Moderator
1,363 Views

Hello WeiSeng,

 

Thank you for sharing this information.

 

To ensure you receive the most specialized assistance, we have a dedicated forum that addresses these specific concerns. Therefore, I will be moving this discussion to our Developer Software Forum. This will allow our knowledgeable community and experts to provide you with timely and accurate solutions.

 

Have a good one!

 

 

Best regards,

Jed G.

Intel Customer Support Technician


0 Kudos
yuzhang3_intel
Moderator
1,325 Views

In general, stall issues are related to memory footprint. Model optimizations, such as reducing model compilation time and fusing graph operations, can also reduce memory usage and inference time. You can also use shared local memory (SLM) to improve memory access latency for some kernels. Using oneDNN is also helpful for LLM optimization. Here are some documents you can refer to:

https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html

https://www.intel.com/content/www/us/en/docs/oneapi/optimization-guide-gpu/2024-2/kernels.html
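
To make the SLM suggestion more concrete, here is a minimal SYCL sketch (not from this thread; the kernel, sizes, and reuse pattern are illustrative assumptions) showing how a work-group can stage data in shared local memory once and then reuse it, instead of re-reading global memory, which is one way to reduce memory-related stalls:

#include <sycl/sycl.hpp>
#include <vector>

int main() {
  constexpr size_t N = 1 << 16;  // total elements (illustrative size)
  constexpr size_t WG = 256;     // work-group size == SLM tile size

  sycl::queue q{sycl::gpu_selector_v};
  std::vector<float> in(N, 1.0f), out(N, 0.0f);

  {
    sycl::buffer<float, 1> in_buf(in.data(), sycl::range<1>(N));
    sycl::buffer<float, 1> out_buf(out.data(), sycl::range<1>(N));

    q.submit([&](sycl::handler& h) {
      sycl::accessor src(in_buf, h, sycl::read_only);
      sycl::accessor dst(out_buf, h, sycl::write_only, sycl::no_init);
      // Per-work-group tile placed in shared local memory (SLM).
      sycl::local_accessor<float, 1> tile(sycl::range<1>(WG), h);

      h.parallel_for(
          sycl::nd_range<1>(sycl::range<1>(N), sycl::range<1>(WG)),
          [=](sycl::nd_item<1> it) {
            const size_t gid = it.get_global_id(0);
            const size_t lid = it.get_local_id(0);

            // One global-memory load per work-item, staged into SLM.
            tile[lid] = src[gid];
            sycl::group_barrier(it.get_group());

            // Further reads hit low-latency SLM instead of global memory;
            // a trivial neighbour sum stands in for a real reuse pattern.
            const float left = (lid > 0) ? tile[lid - 1] : tile[lid];
            dst[gid] = tile[lid] + left;
          });
    });
  }  // buffer destructors copy results back to the host here
  return 0;
}

Whether this helps in practice depends on your actual kernels; frameworks such as OpenVINO and oneDNN (linked above) apply these kinds of optimizations for you.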


0 Kudos
clevels
Employee
1,115 Views

@WeiSeng Please see @yuzhang3_intel's recommendations above.

0 Kudos