
What I did Wrong in my Verilog Project? (Quartus cannot compile it)

Altera_Forum

Hi, 

 

I am trying to implement a data conversion algorithm. It has 8 inputs clocked at 400 MHz and a 1-bit output clocked at about 1 MHz.

In this project I need to detect when my input starts, then collect 192*1024 bits of continuous input into "RingData", together with a time stamp in "TimeStep". RingData is written in only one place and read in only one place; the same is true for TimeStep.

When I start compilation in Quartus (on XP Professional 32-bit), I wait about 5 minutes and it is still at 9% of synthesis. The last messages I receive from Quartus are:

 

Info: Found 2 instances of uninferred RAM logic 

Info: RAM logic "RingData" is uninferred due to asynchronous read logic 

Info: RAM logic "TimeStep" is uninferred due to inappropriate RAM size 

Warning: Cannot convert all sets of registers into RAM megafunctions when creating nodes; therefore the resulting number of registers remaining in design can cause longer compilation time or result in insufficient memory to complete Analysis and Synthesis 


 

It seems I did something wrong, but I cannot figure out what. Please advise me what to do; my complete Verilog file is attached.

 

Sincerely, 

 

Ilghiz 

 

PS: I simplified the source by removing unnecessary mathematics.

 

module My_First_Project (InData, InReady, OutClock, OutData);
  parameter MaxK=8;
  parameter MaxN=MaxK*1024;
  input InData;
  input InReady, OutClock;
  output OutData;
  reg OutData;
  reg signed LocData1, aLD0, aLD1, aLD2, aLD3, aLD4;
  reg signed LD0, LD1, LD2, LD3, LD4;
  reg DeMuxCounter;
  reg Temp;
  reg H00, H01, H02, H03, H04, H05, H06, H07, H08, H09;
  reg H10, H11, H12, H13, H14, H15, H16, H17, H18, H19;
  reg H20, H21, H22, H23, H24, H25, H26, H27, H28, H29;
  reg H30, H31, H32, H33, H34, H35, H36, H37, H38, H39;
  reg G01, G02, G03, G04, G05, G06, G07, G08, G09;
  reg G10, G11, G12, G13, G14, G15, G16, G17, G18, G19;
  reg G20, G21, G22, G23, G24, G25, G26, G27, G28, G29;
  reg G30, G31, G32, G33, G34, G35, G36, G37, G38, G39;
  reg NadoT;
  reg CurTime;
  reg NextStep;
  reg RingData ;
  reg TimeStep ;
  reg BeginPos;
  reg EndPos;
  reg CurShiftData;
  reg CurPos;
  reg EndPosSw;

  initial begin
    CurTime=0;
    NadoT=0;
    DeMuxCounter=0;
    BeginPos=0;
    EndPos=0;
    CurPos=79;
    EndPosSw=1;
  end

  always @(posedge InReady) begin
    LocData1={InData, Temp, InData, Temp, InData, Temp, InData, Temp,
              InData, Temp, InData, Temp, InData, Temp, InData, Temp};
  end

  always @(negedge InReady) Temp<=InData;

  always @(LocData1) begin
    if(DeMuxCounter) begin
      DeMuxCounter<=DeMuxCounter+1;
      {aLD0, aLD1, aLD2, aLD3, aLD4}<={aLD1, aLD2, aLD3, aLD4, LocData1};
      NextStep<=0;
    end else begin
      DeMuxCounter<=DeMuxCounter+1;
      {aLD0, aLD1, aLD2, aLD3, aLD4} <= {aLD1, aLD2, aLD3, aLD4, LocData1};
      {LD0, LD1, LD2, LD3, LD4} <= {aLD1, aLD2, aLD3, aLD4, LocData1};
      NextStep<=1;
    end
  end

  always @(posedge NextStep) begin
    begin
      G39<=H38; G38<=H37; G37<=H36; G36<=H35; G35<=H34; G34<=H33; G33<=H32; G32<=H31; G31<=H30; G30<=H29;
      G29<=H28; G28<=H27; G27<=H26; G26<=H25; G25<=H24; G24<=H23; G23<=H22; G22<=H21; G21<=H20; G20<=H19;
      G19<=H18; G18<=H17; G17<=H16; G16<=H15; G15<=H14; G14<=H13; G13<=H12; G12<=H11; G11<=H10; G10<=H09;
      G09<=H08; G08<=H07; G07<=H06; G06<=H05; G05<=H04; G04<=H03; G03<=H02; G02<=H01; G01<=H00;
      if(NadoT) begin
        RingData=H39;
        NadoT=NadoT-1;
        if((BeginPos&1023)==0) TimeStep]=CurTime;
        BeginPos=BeginPos+1;
      end
    end
    begin
      CurTime<=CurTime+1;
      H39<=G39; H38<=G38; H37<=G37; H36<=G36; H35<=G35; H34<=G34; H33<=G33; H32<=G32; H31<=G31; H30<=G30;
      H29<=G29; H28<=G28; H27<=G27; H26<=G26; H25<=G25; H24<=G24; H23<=G23; H22<=G22; H21<=G21; H20<=G20;
      H19<=G19; H18<=G18; H17<=G17; H16<=G16; H15<=G15; H14<=G14; H13<=G13; H12<=G12; H11<=G11; H10<=G10;
      H09<=G09; H08<=G08; H07<=G07; H06<=G06; H05<=G05; H04<=G04; H03<=G03; H02<=G02; H01<=G01;
      H00<={LD1, LD2, LD3, LD4};
      if(LD1>=2000 && LD2>=2000 && LD3>=2000 && LD4>=2000) begin
        NadoT<=(NadoT&1023)+3072;
      end
    end
  end

  always @(posedge OutClock) begin
    {CurShiftData, OutData}=CurShiftData;
    CurPos=CurPos-1;
    if(CurPos==0) begin
      CurPos=79;
      CurShiftData=EndPos;
      if((EndPos&1023)==0 && EndPosSw) begin
        CurShiftData=TimeStep];
        EndPosSw=0;
      end else begin
        CurShiftData=RingData;
        EndPosSw=1;
        EndPos=EndPos+1;
      end
    end
  end
endmodule
Altera_Forum

I found a construct in your code that's absolutely not synthesizable. You can't build a counter without a clock (a posedge or negedge condition).

always @(LocData1) begin
  if(DeMuxCounter) begin
    DeMuxCounter<=DeMuxCounter+1;

You should check what you want to achieve here and find a clear synchronous construct for it. I also noticed ripple clocks in the design that may prevent timing closure.
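
To illustrate the two points above (a counter needs a clock edge, and derived "ripple" clocks are better replaced by clock enables), here is a minimal sketch. It is not taken from the posted design: the module name, the port names, and the 8-bit input width are all assumptions made for illustration. One fast clock drives everything, and a counter decides on which cycle the wider register updates.

```verilog
// Sketch only: module, port, and signal names are hypothetical,
// and the 8-bit input width is an assumption.
module demux_enable_sketch (
    input  wire        clk,        // single fast clock (stands in for InReady)
    input  wire [7:0]  din,        // assumed 8-bit input word
    output reg  [31:0] word,       // four consecutive input words, oldest first
    output reg         word_valid  // high for one clk cycle after word updates
);
    reg [1:0]  count = 2'd0;       // advances only on a clock edge
    reg [23:0] shift = 24'd0;      // the three input words before the current one

    always @(posedge clk) begin
        count      <= count + 2'd1;
        shift      <= {shift[15:0], din};
        word_valid <= 1'b0;
        if (count == 2'd3) begin
            // Clock enable instead of a derived clock: this register is
            // still clocked by clk, but it only updates every 4th cycle.
            word       <= {shift, din};
            word_valid <= 1'b1;
        end
    end
endmodule
```

Downstream logic can then use `always @(posedge clk) if (word_valid) ...` rather than `always @(posedge some_counter_bit)`, so the timing analyzer only has to deal with one clock.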

 

You mentioned an input clock speed of 400 MHz. Do you mean that InReady is 400 MHz or 200 MHz?

I don't see an asynchronous read of RingData at first glance. I wonder if it has to do with the use of blocking assignments. Altera's RAM inference examples exclusively use non-blocking assignments, in keeping with the RAM's synchronous function. Or you may have removed the problem when simplifying the code.

In any case, I fear the design can't compile without forcing RAM inference for the large buffer structure.
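
For reference, the kind of construct that synthesis tools generally recognize as block RAM looks roughly like the sketch below: one write port and one read port, each in its own clocked always block, non-blocking assignments throughout, and a registered (synchronous) read. The module and port names are hypothetical, and the default width and depth are assumptions, not values confirmed by the code posted above.

```verilog
// Sketch only: names are hypothetical; DATA_WIDTH and ADDR_WIDTH
// defaults are assumptions chosen for illustration.
module ring_ram_sketch #(
    parameter DATA_WIDTH = 64,
    parameter ADDR_WIDTH = 13      // 2**13 = 8192 words
) (
    input  wire                  wr_clk,
    input  wire                  wr_en,
    input  wire [ADDR_WIDTH-1:0] wr_addr,
    input  wire [DATA_WIDTH-1:0] wr_data,
    input  wire                  rd_clk,
    input  wire [ADDR_WIDTH-1:0] rd_addr,
    output reg  [DATA_WIDTH-1:0] rd_data
);
    reg [DATA_WIDTH-1:0] mem [0:(1<<ADDR_WIDTH)-1];

    // Write port: synchronous, non-blocking, nothing else in the block.
    always @(posedge wr_clk) begin
        if (wr_en)
            mem[wr_addr] <= wr_data;
    end

    // Read port: the output is registered, so the data appears one
    // rd_clk cycle after the address. Reading the array without a
    // clock edge is what produces an "asynchronous read" message.
    always @(posedge rd_clk) begin
        rd_data <= mem[rd_addr];
    end
endmodule
```

Keeping the memory in a small module of its own, separate from counters and shift registers, also makes it easier to see in the compilation report whether the array was mapped to memory blocks or to registers.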
Altera_Forum

Dear FvM, 

 

thank you for your kind suggestions. I rewrote everything according to your suggestions and got my project compiled; however, I am still not sure that everything is OK.

I got very impressive Fmax figures for several clocks in my design: 438 MHz for InReady and 153 MHz for the internal clock, which is InReady demultiplexed by 4, so I need at least 100 MHz. In fact I have two designs: one runs with InReady clocked at 200 MHz, and the other, slightly different one has a wider reg [27:0] InData running at a 400 MHz clock.

What I cannot understand in my Quartus compilation is the following: I allocate reg [63:0] RingData [0:8191], which is 512 Kbits; however, the "Flow Summary" reports zero memory bits used.

Would anybody tell me where my RingData and TimeStep arrays were allocated? In the RTL view they are marked as sync_ram, but if the 608 Kbits of internal Cyclone III memory were used, why does the Flow Summary show zero bits of usage?

I am attaching the complete code of this project, along with the RTL, Fmax, and Flow summaries.

 

Sincerely, 

 

Ilghiz 

 

module My_First_Project (InData, InReady, OutClock, OutData);
  parameter MaxK=8;
  parameter BLKSIZE=1024;
  parameter MaxN=MaxK*BLKSIZE;
  input InData;
  input InReady, OutClock;
  output OutData;
  reg OutData;
  reg signed LD0, LD1, LD2, LD3, LD4;
  reg signed PE, NE;
  reg DeMuxCounter;
  reg H00, H01, H02, H03, H04, H05, H06, H07, H08, H09;
  reg H10, H11, H12, H13, H14, H15, H16, H17, H18, H19;
  reg H20, H21, H22, H23, H24, H25, H26, H27, H28, H29;
  reg H30, H31, H32, H33, H34, H35, H36, H37, H38, H39;
  reg SumS1, SumS2;
  reg LevelS1, LevelS2;
  reg signed x1, x2, x3, x4;
  reg y1, y2, y3, y4, yy1, yy2, yyy, LS1, LS2;
  reg z1, z2, z3, z4, zz1, zz2, zzz;
  reg NadoT;
  reg CurTime;
  reg RingData ;
  reg TimeStep ;
  reg BeginPos;
  reg EndPos;
  reg CurShiftData;
  reg CurPos;
  reg EndPosSw;

  initial begin
    LD4<=0;
    CurTime=0;
    SumS1=1;
    SumS2=1;
    LevelS1=1;
    LevelS2=1;
    NadoT=0;
    DeMuxCounter=0;
    BeginPos=0;
    EndPos=0;
    CurPos=79;
    EndPosSw=1;
  end

  always @(posedge InReady) begin
    PE <= {PE, InData};
  end

  always @(negedge InReady) begin
    NE <= {NE, InData};
    DeMuxCounter<=DeMuxCounter+1'b1;
  end

  always @(posedge DeMuxCounter) begin
    if(NadoT) begin
      RingData=H39;
      NadoT=NadoT-1'b1;
      if(BeginPos==0) TimeStep]=CurTime;
      BeginPos=BeginPos+1'b1;
    end
    begin
      LD0<=LD4;
      LD1<={PE, NE, PE, NE, PE, NE, PE, NE, PE, NE, PE, NE, PE, NE, PE, NE};
      LD2<={PE, NE, PE, NE, PE, NE, PE, NE, PE, NE, PE, NE, PE, NE, PE, NE};
      LD3<={PE, NE, PE, NE, PE, NE, PE, NE, PE, NE, PE, NE, PE, NE, PE, NE};
      LD4<={PE, NE, PE, NE, PE, NE, PE, NE, PE, NE, PE, NE, PE, NE, PE, NE};
    end
    begin
      {H00, H01, H02, H03, H04, H05, H06, H07, H08, H09,
       H10, H11, H12, H13, H14, H15, H16, H17, H18, H19,
       H20, H21, H22, H23, H24, H25, H26, H27, H28, H29,
       H30, H31, H32, H33, H34, H35, H36, H37, H38, H39}
        <= {LD1, LD2, LD3, LD4,
            H00, H01, H02, H03, H04, H05, H06, H07, H08, H09,
            H10, H11, H12, H13, H14, H15, H16, H17, H18, H19,
            H20, H21, H22, H23, H24, H25, H26, H27, H28, H29,
            H30, H31, H32, H33, H34, H35, H36, H37, H38};
      x1<=LD0-LD1;
      x2<=LD1-LD2;
      x3<=LD2-LD3;
      x4<=LD3-LD4;
      y1<=LD1*LD1;
      y2<=LD2*LD2;
      y3<=LD3*LD3;
      y4<=LD4*LD4;
      SumS1<=SumS1-(SumS1>>3);
      SumS2<=SumS2-(SumS2>>3);
      LS1<=LevelS1-(LevelS1>>3);
      LS2<=LevelS2-(LevelS2>>3);
      CurTime<=CurTime+1;
    end
    begin
      z1<=x1*x1;
      z2<=x2*x2;
      z3<=x3*x3;
      z4<=x4*x4;
      yy1<=y1+y2;
      yy2<=y3+y4;
    end
    begin
      zz1<=z1+z2;
      zz2<=z3+z4;
      yyy<=yy1+yy2;
    end
    begin
      zzz<=zz1+zz2;
      SumS1<=SumS1+(yyy>>5);
    end
    begin
      SumS2<=SumS2+(zzz>>5);
    end
    begin
      if(SumS1>=LevelS1 && SumS2>=LevelS2) begin
        LevelS1<=LS1+(SumS1>>5);
        LevelS2<=LS2+(SumS2>>5);
        NadoT<=NadoT+3072;
      end
    end
  end

  always @(OutClock) begin
    begin
      {CurShiftData, OutData}<=CurShiftData;
    end
    if(CurPos)
      CurPos<=CurPos-1'b1;
    else begin
      CurPos<=79;
      CurShiftData<=EndPos;
      if(EndPos==0 && EndPosSw) begin
        CurShiftData=TimeStep];
        EndPosSw=0;
      end else begin
        CurShiftData=RingData;
        EndPosSw=1;
        EndPos=EndPos+1'b1;
      end
    end
  end
endmodule
Altera_Forum

You have several issues in the code below:

- it's combinational (no edge-sensitive condition)

- CurShiftData is the RAM output register and a shift register at the same time. That doesn't work.

Because I don't know what you actually want to achieve here, I can't suggest a solution.

 

always @(OutClock) begin
  begin
    {CurShiftData, OutData}<=CurShiftData;
  end
  if(CurPos)
    CurPos<=CurPos-1'b1;
  else begin
    CurPos<=79;
    CurShiftData<=EndPos;
    if(EndPos==0 && EndPosSw) begin
      CurShiftData=TimeStep];
      EndPosSw=0;
    end else begin
      CurShiftData=RingData;
      EndPosSw=1;
      EndPos=EndPos+1'b1;
    end
  end
end
Altera_Forum

Dear FvM, 

 

thank you for your reply. Honestly, I cannot guess what to do; I am new to FPGAs... I need to implement the following algorithm:

on positive or on negative edges I should send one bit from the 80-bit array CurShiftData;

if there is no data left in CurShiftData, I need to generate CurShiftData according to the following rules:

[79:77] are zeros,

bits [76:64] correspond to EndPos (I am using [79:64]<=EndPos; am I right?),

the remaining 64 bits [63:0] are taken from the TimeStep array once every 1024 words, and from RingData in all other cases.
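
The frame layout in the rules above can be written down explicitly as a concatenation. The sketch below is illustrative only: the module and port names are hypothetical, it assumes EndPos is a 13-bit unsigned address, and it assumes the zero padding occupies the three bits above it, [79:77].

```verilog
// Sketch only: names are hypothetical; a 13-bit unsigned position and
// zero padding in bits [79:77] are assumptions.
module frame_pack_sketch (
    input  wire [12:0] end_pos,   // position counter (stands in for EndPos)
    input  wire [63:0] payload,   // word read from TimeStep or RingData
    output wire [79:0] frame      // 80-bit word to be shifted out
);
    // [79:77] zero padding, [76:64] position, [63:0] data
    assign frame = {3'b000, end_pos, payload};
endmodule
```

On the question in parentheses: assigning an unsigned 13-bit value to a 16-bit slice such as [79:64] zero-extends it in Verilog, so the upper three bits become zero without being written separately. The explicit concatenation just makes the layout visible in one place.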

 

Actually, I have no feeling for what your comment about the RAM output register means. Yes, I understand that it is a shift register, and probably it is in some sense also RAM, but I cannot tell when it is RAM and when it is not, so I cannot figure out how to fix this problem. Please help me!

 

Sincerely, 

 

Ilghiz 

Altera_Forum

As a general remark, your code is rather complex for a "First_Project". I hope you're able to solve the problems involved without too much frustration. Most people start learning HDL programming with more basic design problems.

 

--- Quote Start ---  

on positive or on negative edges I should send one bit 

--- Quote End ---  

 

I understand now why you wrote always @(OutClock), but unfortunately it's not synthesizable. You need a kind of DDIO (double-data-rate) output register. The code snippet below shows the principle (register two bits and use a multiplexer to select the right output data bit for each clock phase); for high OutClock speeds, explicit instantiation of a DDIO primitive may be required.

The other point is to meet the requirements for RAM inference. Below I'm showing a construct that is accepted by the Quartus compiler, but I'm not sure if it's acceptable to register the RAM output one clock cycle in advance. If it doesn't work this way, you have to use a different construct that reserves a one-clock-cycle delay for the RAM read action.

always @(posedge OutClock) begin
  begin
    {CurShiftData, OutData_n,OutData_p}<=CurShiftData;
  end
  if(CurPos) begin
    CurPos<=CurPos-1'b1;
  end else begin
    CurPos<=79;
    CurShiftData<=EndPos;
    if(EndPos==0 && EndPosSw) begin
      EndPosSw<=0;
      CurShiftData<=CurShiftData_s1;
    end else begin
      EndPosSw<=1;
      EndPos<=EndPos+1'b1;
      CurShiftData<=CurShiftData_s2;
    end
  end
  CurShiftData_s1<=TimeStep];
  CurShiftData_s2<=RingData;
end

assign OutData = (OutClock)?OutData_p:OutData_n;
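
As a rough illustration of the alternative mentioned above (reserving one clock cycle for the RAM read), the sketch below keeps the shift register separate from the RAM output register and requests the next word shortly before the current one runs out. It is a single-data-rate simplification, not the DDIO version, and every name in it is hypothetical: `rd_en` and `ram_q` stand for the interface of a separately inferred synchronous RAM whose address logic is not shown.

```verilog
// Sketch only: names are hypothetical. Assumes an external synchronous
// RAM that samples rd_en on a clock edge and presents the word on ram_q
// after that edge. Bits are sent LSB first.
module serializer_sketch (
    input  wire        clk,      // stands in for OutClock
    input  wire [79:0] ram_q,    // registered RAM output
    output reg         rd_en,    // one-cycle read request
    output wire        bit_out   // serial output
);
    reg [79:0] shift   = 80'd0;
    reg [6:0]  bit_cnt = 7'd2;   // shifts remaining before the next load

    assign bit_out = shift[0];

    always @(posedge clk) begin
        // Request the next word two cycles before it is loaded:
        // one cycle for rd_en to reach the RAM, one for the read itself.
        rd_en <= (bit_cnt == 7'd2);
        if (bit_cnt == 7'd0) begin
            shift   <= ram_q;                // load a fresh 80-bit word
            bit_cnt <= 7'd79;
        end else begin
            shift   <= {1'b0, shift[79:1]};  // shift toward the LSB
            bit_cnt <= bit_cnt - 7'd1;
        end
    end
endmodule
```

Because the RAM output register is only read here and never shifted, the memory itself keeps the plain synchronous-read shape that RAM inference expects.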
Altera_Forum

Dear FvM, 

 

thank you very much for your kind suggestion. Thanks to your help and help from other forums, I was able to rewrite this example so that it compiles and looks reasonable in the RTL view.

As for "My_First_Project", it really is my first project; I have never used Verilog/VHDL or other synthesis languages before. However, it is a really simple algorithm compared to my goal, a QR-like algorithm. I hope that my experience in numerical mathematics helps me to implement it fast enough :)

 

Sincerely, 

 

Ilghiz