Solved: About TPC-C kernel implementation (ASO) instruction

ryanpark · ‎01-13-2025

https://docs.habana.ai/en/latest/TPC/TPC_User_Guide/TPC_Coherency.html

According to the TPC User Guide, it seems we should use the ASO instruction to ensure coherency between kernels when several TPC kernels access the same global memory.

For instance, if kernels need to read and update values at the same coordinates of matrix C, those accesses must be synchronized to preserve coherency; otherwise a TPC core might read stale values that have not been updated yet.

Are there any examples of the ASO instruction?

I could not find any references. Right now, the only approach I have requires accessing the same global memory, so coherency has to be guaranteed.

James_Edwards · ‎01-15-2025

Currently, there are no examples of the ASO instruction beyond the description in the base documentation and the TPC tutorial (Use TPC Kernels on Intel® Gaudi® Technology). I will contact R&D and try to get a code snippet for this.

James_Edwards · ‎01-15-2025

R&D pointed me at this sample: kernels/include/pscan_combined.h on the dev/zhzhang/pscan_gaudi2 branch of HabanaAI/Habana_Custom_Kernel.

Is this sufficient?

ryanpark · ‎01-19-2025

Thanks for the reply. This sample works as a solution for me, and I really appreciate that it solved my problem.
