Intel® High Level Design
Support for Intel® High Level Synthesis Compiler, DSP Builder, OneAPI for Intel® FPGAs, Intel® FPGA SDK for OpenCL™

Memory error when trying to simulate matrix multiplication using HLS math lib

Gopikrishnan
Novice
component void test(ihc::stream_in<int> &matrixData,ihc::stream_in<int> &matrixData2,ihc::stream_out<int> &matrixout,bool ld,
        int dim_in,int dim_out){

     ////////////input and output matrix definition/////////  
     hls_memory hls_singlepump hls_max_replicates(1) hls_bankwidth(sizeof(int))
     int feat_matrix[N_FEAT][N_COLS_1];

     hls_memory hls_singlepump hls_max_replicates(1) hls_bankwidth(sizeof(int))
     static int weight_matrix[N_COLS_1][N_COLS_2];
        
     int out_feat[N_FEAT][N_COLS_2];       
     ///////////////////////////////////////////
     
     //////////populate the feature and weight matrix//////////
     for (int i=0; i <N_FEAT ; i++){
       for(int j=0 ; j<N_COLS_1 ; j++){
          feat_matrix[i][j] = matrixData.read();
          
       }
      }
     if (ld){
     for (int i=0; i <N_COLS_1 ; i++){
       for(int j=0 ; j<N_COLS_2 ; j++){
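          // Non-blocking read: 'success' reports whether data was available.
          // NOTE: the flag is not checked here, so a failed read still stores a value.
          bool success = false;
          weight_matrix[i][j] = matrixData2.tryRead(success);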
         
        }
       }
     }
    ///////////////////////////////////////////////////////

   ////////MAT MUL AND WRITE BACK TO STREAM/////////////

     matrix_multiply<int,N_FEAT,N_COLS_1,N_COLS_2,N_DSP,N_DSP>(feat_matrix,weight_matrix,out_feat);
     for (int i=0; i <N_FEAT ; i++){
       for(int j=0 ; j<dim_out ; j++){
         matrixout.write(out_feat[i][j]); 
       }
     }
   ////////////////////////////////////////////////////
   ////////////////////////////////////////////////////
}
 
 
When I try to simulate this matrix multiplication for large dimensions, it gives me a memory error:
Fatal: (vsim-4) Memory allocation failure.
 
I figured out that the loop unrolling is causing the issue: when I simulate the code using simple for loops with no unrolling, it works fine. Is there any workaround for this, or is there a better way of coding it?
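For reference, the "simple for loops with no unrolling" variant mentioned above could look roughly like the sketch below. It is plain C++ rather than the poster's actual code: the dimensions `ROWS`, `INNER`, `COLS` and the function name `matmul_rolled` are hypothetical stand-ins for `N_FEAT`, `N_COLS_1`, `N_COLS_2` and the library call, and the HLS memory attributes and streams are left out.

```cpp
#include <cstddef>

// Hypothetical small dimensions, chosen only for illustration.
constexpr std::size_t ROWS = 4;   // stands in for N_FEAT
constexpr std::size_t INNER = 3;  // stands in for N_COLS_1
constexpr std::size_t COLS = 2;   // stands in for N_COLS_2

// Plain triple-loop matrix multiply: out = a * b.
// No unroll pragmas are used, so each loop stays rolled.
void matmul_rolled(const int a[ROWS][INNER],
                   const int b[INNER][COLS],
                   int out[ROWS][COLS]) {
  for (std::size_t i = 0; i < ROWS; ++i) {
    for (std::size_t j = 0; j < COLS; ++j) {
      int acc = 0;  // running dot product of row i of a and column j of b
      for (std::size_t k = 0; k < INNER; ++k) {
        acc += a[i][k] * b[k][j];
      }
      out[i][j] = acc;
    }
  }
}
```

A rolled version like this trades throughput for a much smaller design, which is presumably why it simulates without exhausting memory.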
2 Replies
BoonBengT_Intel
Moderator

Hi @Gopikrishnan,

 

Thank you for posting in the Intel community forum. I hope all is well, and apologies for the delayed response.
I would recommend trying the loop unroll pragma, which should reduce the latency.
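As an illustration of the unroll pragma, here is a minimal sketch that unrolls only the innermost loop by a fixed factor instead of fully. The names `DIM_M`, `DIM_K`, `DIM_N` and `matmul_partial_unroll`, and the factor of 4, are assumptions made for this example and do not come from the original code. `#pragma unroll <N>` is the Intel HLS form; a standard C++ compiler that does not recognise it will simply ignore it, so the function behaves the same either way.

```cpp
#include <cstddef>

// Hypothetical dimensions for illustration only.
constexpr std::size_t DIM_M = 4;
constexpr std::size_t DIM_K = 8;
constexpr std::size_t DIM_N = 2;

// Matrix multiply with a partially unrolled inner loop: out = a * b.
void matmul_partial_unroll(const int a[DIM_M][DIM_K],
                           const int b[DIM_K][DIM_N],
                           int out[DIM_M][DIM_N]) {
  for (std::size_t i = 0; i < DIM_M; ++i) {
    for (std::size_t j = 0; j < DIM_N; ++j) {
      int acc = 0;
      // Request an unroll factor of 4 (an assumed value) rather than a
      // full unroll, so only four multiply-accumulates are replicated.
      #pragma unroll 4
      for (std::size_t k = 0; k < DIM_K; ++k) {
        acc += a[i][k] * b[k][j];
      }
      out[i][j] = acc;
    }
  }
}
```

A smaller unroll factor generally means less replicated hardware, which may also keep the simulation model smaller; whether that is enough to avoid the memory allocation failure for large dimensions would need to be checked by experiment.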


More information can be found here on the code snippet and implementation.
Please do let us know if that helps.

Note: there are also detailed steps on loop best practices here, which are recommended reading; hopefully they will give some insight.

 

Best Wishes
BB

BoonBengT_Intel
Moderator

Hi @Gopikrishnan,

 

Greetings. As we have not received any further clarification on what was provided, this thread will now be transitioned to community support and we will no longer monitor it. For new queries, please feel free to open a new thread and we will be right with you. It was a pleasure having you here.

 

Best Wishes
BB
