Intel® Quartus® Prime Software

Pipelining: what is the best way?

Altera_Forum

Hi, 

 

I have an algorithm that takes input data, performs several sequential steps, and delivers results. Since my input data arrives very fast, I need to pipeline my computations. I have created a test example similar to my problem and would appreciate some help. The test example is attached. It contains two different always blocks, controlled by Clk1 and Clk2. I am sure that the Clk2 version is OK and does what I want. My question is: is the "always" block clocked by Clk1 correct, or can it be modified to achieve better timing?

 

Thank you! 

 

Sincerely, 

 

Ilghiz 

 

module My_Second_Project (A1, A2, A3, Clk1, Clk2, X1, X2, X3, Y1, Y2, Y3);
    input A1, A2, A3;
    input Clk1, Clk2;
    output X1, X2, X3;
    reg X1, X2, X3;
    output Y1, Y2, Y3;
    reg Y1, Y2, Y3;
    reg B1, B2, B3, C1, C2, C3;
    reg P1, P2, P3, Q1, Q2, Q3, R1, R2, R3, S1, S2, S3;

    always @(posedge Clk1)
    // can I ensure that I read C1,C2,C3 first for computations of X1,X2,X3,
    // and the same for other variables? When I change all "=" to "<="
    // I have no difference in RTL!!!
    begin
        X1=C1*C2; X2=C1*C3; X3=C2*C3;
        C1=B1*B2; C2=B1*B3; C3=B2*B3;
        B1=A1*A2; B2=A1*A3; B3=A2*A3;
    end

    always @(posedge Clk2)
    // is Ok, have 3 pipe-line blocks
    begin
        Y1=S1*S2; Y2=S1*S3; Y3=S2*S3;
        R1=Q1*Q2; R2=Q1*Q3; R3=Q2*Q3;
        P1=A1*A2; P2=A1*A3; P3=A2*A3;
    end

    always @(negedge Clk2)
    begin
        Q1=P1; Q2=P2; Q3=P3;
        S1=R1; S2=R2; S3=R3;
    end
endmodule
Altera_Forum

Your design has two clocks and Ax is being used in both. This means that Ax is crossing clock domains at least once and you're not taking care of that. 

You should use synchronizer chains to synchronize Ax to the proper clock domains. 

You may also need synchronizer chains for the outputs. 

 

http://www.altera.com/literature/wp/wp-01082-quartus-ii-metastability.pdf 
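For illustration, a synchronizer chain for a single-bit signal is usually sketched as two back-to-back registers in the destination clock domain. The module and port names below are made up for this example and do not come from the posted design. Multi-bit values would generally need a handshake or a dual-clock FIFO rather than independent per-bit synchronizers.

```verilog
// Two-flop synchronizer sketch (module and port names are hypothetical).
// Intended for a single-bit signal that arrives from another clock domain.
module sync_2ff (
    input  wire clk,       // destination-domain clock
    input  wire async_in,  // signal launched from a different clock domain
    output wire sync_out   // copy of async_in that is safe to use on clk
);
    reg meta;    // first stage: may briefly go metastable
    reg stable;  // second stage: gives the first stage a full cycle to settle

    always @(posedge clk) begin
        meta   <= async_in;
        stable <= meta;
    end

    assign sync_out = stable;
endmodule
```

The output lags the input by up to two destination clock cycles, which is the price paid for the extra settling time.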

 

Otherwise, you're only doing a 16x16 bit multiplication every clock cycle. That's as fast as it can get. 

 

Regarding the "<=" vs "=" issue, it's simple. 

With "<=" you always read the value that was present before the event (clock edge, in this case). 

With "=", you read the last value you assigned to the signal in the processing block.  

 

If you don't read a signal after doing the assignment, then the behavior is the same. 

 

Thus, the following three code snippets have the same behavior. 

 

always @ (posedge clk) begin
    a = b;
    b = c;
end

 

always @ (posedge clk) begin
    a <= b;
    b <= c;
end

 

always @ (posedge clk) begin
    b <= c;
    a <= b;
end

 

On the other hand, the next code snippet DOES NOT have the same behavior as the other three. 

 

always @ (posedge clk) begin
    b = c;
    a = b;
end
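
The difference between these orderings can be checked in simulation. Below is a minimal self-checking testbench sketch. All names in it are hypothetical, and it drives a single rising clock edge with c = 1 while every register starts at 0.

```verilog
// Simulation-only sketch; every name in this testbench is hypothetical.
module assign_order_tb;
    reg clk = 1'b0;
    reg c   = 1'b1;             // value shifted in on the clock edge
    reg a1 = 1'b0, b1 = 1'b0;   // blocking:    a = b;  b = c;
    reg a2 = 1'b0, b2 = 1'b0;   // nonblocking: a <= b; b <= c;
    reg a3 = 1'b0, b3 = 1'b0;   // blocking:    b = c;  a = b;

    always @(posedge clk) begin a1 = b1;  b1 = c;  end
    always @(posedge clk) begin a2 <= b2; b2 <= c; end
    always @(posedge clk) begin b3 = c;   a3 = b3; end

    initial begin
        #1 clk = 1'b1;          // one rising edge
        #1;                     // let all updates settle
        // First two variants: a takes the OLD b (0), b takes c (1).
        if ({a1, b1} !== 2'b01 || {a2, b2} !== 2'b01)
            $display("FAIL: variants 1/2");
        // Last variant: a sees the NEW b, so both become 1 on the same edge.
        else if ({a3, b3} !== 2'b11)
            $display("FAIL: variant 3");
        else
            $display("PASS");
        $finish;
    end
endmodule
```

In the last variant the value of c reaches a in one clock instead of two, so one pipeline register is effectively lost.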
Altera_Forum

Dear Rbugalho, 

 

Thank you for your kind answer! It seems that my newbie question was not posed very precisely. Let me restate what I need.

 

 

--- Quote Start ---  

 

You may also need synchronizer chains for the outputs. 

 

--- Quote End ---  

 

 

I mean, I have several blocks, let's say like 

B1=A1*A2; B2=A1*A3; B3=A2*A3;  

 

and I want to put them in a pipeline. What is the correct way to do it? Is it correct to do it as follows:

 

always @(posedge Clk1)
begin
    X1=C1*C2; X2=C1*C3; X3=C2*C3;
    C1=B1*B2; C2=B1*B3; C3=B2*B3;
    B1=A1*A2; B2=A1*A3; B3=A2*A3;
end

 

Sorry for my newbie questions. I really want to understand the situation because I need to implement a very complicated pipelined algorithm.

 

Sincerely, 

 

Ilghiz
Altera_Forum

Sorry, but with all the side issues I think you missed my reply to what you wanted to know: 

As far as I can see, your code is already fully pipelined and it's correct. I don't think there's anything you need/can do to improve it.
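
For reference, here is a sketch of the same three-stage Clk1 pipeline written with nonblocking assignments, which is the more common convention for clocked blocks because the result no longer depends on the order of the statements. The module name and the width parameter W are assumptions for illustration (the posted code declares 1-bit signals), and each product is truncated to W bits.

```verilog
// Three-stage pipeline sketch with nonblocking assignments.
// Module name and parameter W are hypothetical, not from the posted design.
module pipeline3 #(
    parameter W = 16  // assumed data width; products are truncated to W bits
) (
    input  wire         clk,
    input  wire [W-1:0] A1, A2, A3,
    output reg  [W-1:0] X1, X2, X3
);
    reg [W-1:0] B1, B2, B3;  // stage 1 registers
    reg [W-1:0] C1, C2, C3;  // stage 2 registers

    always @(posedge clk) begin
        // Stage 1: combine the inputs
        B1 <= A1 * A2;  B2 <= A1 * A3;  B3 <= A2 * A3;
        // Stage 2: uses the B values registered on the previous edge
        C1 <= B1 * B2;  C2 <= B1 * B3;  C3 <= B2 * B3;
        // Stage 3: uses the C values registered on the previous edge
        X1 <= C1 * C2;  X2 <= C1 * C3;  X3 <= C2 * C3;
    end
endmodule
```

A new set of inputs is accepted on every clock, and the corresponding X values appear three rising edges after the inputs were sampled. The three stages can be listed in any order here without changing the behavior.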