Hi,
I am trying to implement a data conversion algorithm. It has 8 inputs clocked at 400 MHz and a 1-bit output clocked at about 1 MHz. In the project I need to detect when my input starts and collect 192*1024 bits of continuous input into "RingData", with the time stamp stored in "TimeStep". RingData is read in only one place and written in only one place; the same is true for TimeStep.
When I start the compilation in Quartus (on XP Professional 32-bit), I wait about 5 minutes and it is still at 9% of synthesis. The last messages I receive from Quartus are:
Info: Found 2 instances of uninferred RAM logic
Info: RAM logic "RingData" is uninferred due to asynchronous read logic
Info: RAM logic "TimeStep" is uninferred due to inappropriate RAM size
Warning: Cannot convert all sets of registers into RAM megafunctions when creating nodes; therefore the resulting number of registers remaining in design can cause longer compilation time or result in insufficient memory to complete Analysis and Synthesis
It seems that I did something wrong, but I cannot figure out what. Please advise me what to do; my complete Verilog file is attached.
Sincerely,
Ilghiz
PS: I simplified the source by removing unnecessary mathematics.
module My_First_Project (InData, InReady, OutClock, OutData);
parameter MaxK=8;
parameter MaxN=MaxK*1024;
input InData;
input InReady, OutClock;
output OutData;
reg OutData;
reg signed LocData1, aLD0, aLD1, aLD2, aLD3, aLD4;
reg signed LD0, LD1, LD2, LD3, LD4;
reg DeMuxCounter;
reg Temp;
reg H00, H01, H02, H03, H04, H05, H06, H07, H08, H09;
reg H10, H11, H12, H13, H14, H15, H16, H17, H18, H19;
reg H20, H21, H22, H23, H24, H25, H26, H27, H28, H29;
reg H30, H31, H32, H33, H34, H35, H36, H37, H38, H39;
reg G01, G02, G03, G04, G05, G06, G07, G08, G09;
reg G10, G11, G12, G13, G14, G15, G16, G17, G18, G19;
reg G20, G21, G22, G23, G24, G25, G26, G27, G28, G29;
reg G30, G31, G32, G33, G34, G35, G36, G37, G38, G39;
reg NadoT;
reg CurTime;
reg NextStep;
reg RingData ;
reg TimeStep ;
reg BeginPos;
reg EndPos;
reg CurShiftData;
reg CurPos;
reg EndPosSw;
initial
begin
CurTime=0;
NadoT=0;
DeMuxCounter=0;
BeginPos=0;
EndPos=0;
CurPos=79;
EndPosSw=1;
end
always @(posedge InReady)
begin
LocData1={InData, Temp, InData, Temp, InData, Temp, InData, Temp,
InData, Temp, InData, Temp, InData, Temp, InData, Temp};
end
always @(negedge InReady) Temp<=InData;
always @(LocData1)
begin
if(DeMuxCounter)
begin
DeMuxCounter<=DeMuxCounter+1;
{aLD0, aLD1, aLD2, aLD3, aLD4}<={aLD1, aLD2, aLD3, aLD4, LocData1};
NextStep<=0;
end
else
begin
DeMuxCounter<=DeMuxCounter+1;
{aLD0, aLD1, aLD2, aLD3, aLD4} <= {aLD1, aLD2, aLD3, aLD4, LocData1};
{LD0, LD1, LD2, LD3, LD4} <= {aLD1, aLD2, aLD3, aLD4, LocData1};
NextStep<=1;
end
end
always @(posedge NextStep)
begin
begin
G39<=H38; G38<=H37; G37<=H36; G36<=H35; G35<=H34; G34<=H33; G33<=H32; G32<=H31; G31<=H30; G30<=H29;
G29<=H28; G28<=H27; G27<=H26; G26<=H25; G25<=H24; G24<=H23; G23<=H22; G22<=H21; G21<=H20; G20<=H19;
G19<=H18; G18<=H17; G17<=H16; G16<=H15; G15<=H14; G14<=H13; G13<=H12; G12<=H11; G11<=H10; G10<=H09;
G09<=H08; G08<=H07; G07<=H06; G06<=H05; G05<=H04; G04<=H03; G03<=H02; G02<=H01; G01<=H00;
if(NadoT)
begin
RingData=H39;
NadoT=NadoT-1;
if((BeginPos&1023)==0) TimeStep]=CurTime;
BeginPos=BeginPos+1;
end
end
begin
CurTime<=CurTime+1;
H39<=G39; H38<=G38; H37<=G37; H36<=G36; H35<=G35; H34<=G34; H33<=G33; H32<=G32; H31<=G31; H30<=G30;
H29<=G29; H28<=G28; H27<=G27; H26<=G26; H25<=G25; H24<=G24; H23<=G23; H22<=G22; H21<=G21; H20<=G20;
H19<=G19; H18<=G18; H17<=G17; H16<=G16; H15<=G15; H14<=G14; H13<=G13; H12<=G12; H11<=G11; H10<=G10;
H09<=G09; H08<=G08; H07<=G07; H06<=G06; H05<=G05; H04<=G04; H03<=G03; H02<=G02; H01<=G01;
H00<={LD1, LD2, LD3, LD4};
if(LD1>=2000 && LD2>=2000 && LD3>=2000 && LD4>=2000)
begin
NadoT<=(NadoT&1023)+3072;
end
end
end
always @(posedge OutClock)
begin
{CurShiftData, OutData}=CurShiftData;
CurPos=CurPos-1;
if(CurPos==0)
begin
CurPos=79;
CurShiftData=EndPos;
if((EndPos&1023)==0 && EndPosSw)
begin
CurShiftData=TimeStep];
EndPosSw=0;
end
else
begin
CurShiftData=RingData;
EndPosSw=1;
EndPos=EndPos+1;
end
end
end
endmodule
6 Replies
I found a construct in your code that is absolutely not synthesizable. You can't build a counter without a clock (a posedge or negedge condition).
always @(LocData1)
begin
if(DeMuxCounter)
begin
DeMuxCounter<=DeMuxCounter+1;
You should check what you want to achieve here and find a clear synchronous construct for it. I also noticed ripple clocks in the design that may prevent timing closure. You mentioned an input clock speed of 400 MHz. Do you mean that InReady is 400 MHz or 200 MHz?
I don't see an asynchronous read of RingData at first glance. I wonder if it has to do with the use of blocking assignments. The Altera RAM inference examples exclusively use non-blocking assignments, in keeping with the RAM's synchronous function. Or you removed the problem when simplifying the code. In any case, without forcing RAM inference for the large buffer structure, I fear the design can't compile.
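A minimal sketch of the kind of synchronous construct meant here, assuming a 64-bit by 8192-word buffer with separate write and read clocks. The module name ring_ram and all port names are hypothetical and not taken from the attached design; whether a given Quartus version maps it to block RAM should be checked in the compilation report.

```verilog
// Hypothetical module and port names; sizes are assumptions for illustration.
module ring_ram #(
    parameter DATA_W = 64,
    parameter ADDR_W = 13              // 2**13 = 8192 words
) (
    input  wire              wclk,
    input  wire              we,
    input  wire [ADDR_W-1:0] waddr,
    input  wire [DATA_W-1:0] wdata,
    input  wire              rclk,
    input  wire [ADDR_W-1:0] raddr,
    output reg  [DATA_W-1:0] rdata
);
    reg [DATA_W-1:0] mem [0:(1<<ADDR_W)-1];

    // Write port: a single clock edge and a non-blocking assignment.
    always @(posedge wclk)
        if (we)
            mem[waddr] <= wdata;

    // Read port: the read data is registered on rclk, so the read is
    // synchronous and rdata appears one rclk cycle after raddr is applied.
    always @(posedge rclk)
        rdata <= mem[raddr];
endmodule
```

The array is touched in exactly one write statement and one read statement, each inside its own clocked block, with nothing else assigned to the array.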
Dear FvM,
thank you for your kind suggestions. I rewrote everything according to them and got my project to compile; however, I am still not sure that everything is OK. I got very impressive Fmax figures for the clocks of my design: 438 MHz for InReady and 153 MHz for the internal clock, which is InReady demultiplexed by 4, so I need at least 100 MHz there. In fact I have two designs: one runs with InReady clocked at 200 MHz, and the other, slightly different one with a wider reg [27:0] InData, runs with a 400 MHz clock.
What I cannot understand right now in my Quartus compilation is the following: I allocate reg [63:0] RingData [0:8191], which is 512 Kbits, yet the "Flow Summary" reports zero memory bits used. Could anybody tell me where my RingData and TimeStep arrays were allocated? In the RTL view the array is marked as sync_ram, but if the 608 Kbits of internal Cyclone III memory were used, why does the Flow Summary show zero bits of usage?
I am attaching the complete code of this project, along with the RTL, Fmax and Flow summaries.
Sincerely,
Ilghiz
module My_First_Project (InData, InReady, OutClock, OutData);
parameter MaxK=8;
parameter BLKSIZE=1024;
parameter MaxN=MaxK*BLKSIZE;
input InData;
input InReady, OutClock;
output OutData;
reg OutData;
reg signed LD0, LD1, LD2, LD3, LD4;
reg signed PE, NE;
reg DeMuxCounter;
reg H00, H01, H02, H03, H04, H05, H06, H07, H08, H09;
reg H10, H11, H12, H13, H14, H15, H16, H17, H18, H19;
reg H20, H21, H22, H23, H24, H25, H26, H27, H28, H29;
reg H30, H31, H32, H33, H34, H35, H36, H37, H38, H39;
reg SumS1, SumS2;
reg LevelS1, LevelS2;
reg signed x1, x2, x3, x4;
reg y1, y2, y3, y4, yy1, yy2, yyy, LS1, LS2;
reg z1, z2, z3, z4, zz1, zz2, zzz;
reg NadoT;
reg CurTime;
reg RingData ;
reg TimeStep ;
reg BeginPos;
reg EndPos;
reg CurShiftData;
reg CurPos;
reg EndPosSw;
initial
begin
LD4<=0;
CurTime=0;
SumS1=1;
SumS2=1;
LevelS1=1;
LevelS2=1;
NadoT=0;
DeMuxCounter=0;
BeginPos=0;
EndPos=0;
CurPos=79;
EndPosSw=1;
end
always @(posedge InReady)
begin
PE <= {PE, InData};
end
always @(negedge InReady)
begin
NE <= {NE, InData};
DeMuxCounter<=DeMuxCounter+1'b1;
end
always @(posedge DeMuxCounter)
begin
if(NadoT)
begin
RingData=H39;
NadoT=NadoT-1'b1;
if(BeginPos==0) TimeStep]=CurTime;
BeginPos=BeginPos+1'b1;
end
begin
LD0<=LD4;
LD1<={PE, NE, PE, NE, PE, NE, PE, NE,
PE, NE, PE, NE, PE, NE, PE, NE};
LD2<={PE, NE, PE, NE, PE, NE, PE, NE,
PE, NE, PE, NE, PE, NE, PE, NE};
LD3<={PE, NE, PE, NE, PE, NE, PE, NE,
PE, NE, PE, NE, PE, NE, PE, NE};
LD4<={PE, NE, PE, NE, PE, NE, PE, NE,
PE, NE, PE, NE, PE, NE, PE, NE};
end
begin
{H00, H01, H02, H03, H04, H05, H06, H07, H08, H09,
H10, H11, H12, H13, H14, H15, H16, H17, H18, H19,
H20, H21, H22, H23, H24, H25, H26, H27, H28, H29,
H30, H31, H32, H33, H34, H35, H36, H37, H38, H39} <=
{LD1, LD2, LD3, LD4,
H00, H01, H02, H03, H04, H05, H06, H07, H08, H09,
H10, H11, H12, H13, H14, H15, H16, H17, H18, H19,
H20, H21, H22, H23, H24, H25, H26, H27, H28, H29,
H30, H31, H32, H33, H34, H35, H36, H37, H38};
x1<=LD0-LD1;
x2<=LD1-LD2;
x3<=LD2-LD3;
x4<=LD3-LD4;
y1<=LD1*LD1;
y2<=LD2*LD2;
y3<=LD3*LD3;
y4<=LD4*LD4;
SumS1<=SumS1-(SumS1>>3);
SumS2<=SumS2-(SumS2>>3);
LS1<=LevelS1-(LevelS1>>3);
LS2<=LevelS2-(LevelS2>>3);
CurTime<=CurTime+1;
end
begin
z1<=x1*x1;
z2<=x2*x2;
z3<=x3*x3;
z4<=x4*x4;
yy1<=y1+y2;
yy2<=y3+y4;
end
begin
zz1<=z1+z2;
zz2<=z3+z4;
yyy<=yy1+yy2;
end
begin
zzz<=zz1+zz2;
SumS1<=SumS1+(yyy>>5);
end
begin
SumS2<=SumS2+(zzz>>5);
end
begin
if(SumS1>=LevelS1 && SumS2>=LevelS2)
begin
LevelS1<=LS1+(SumS1>>5);
LevelS2<=LS2+(SumS2>>5);
NadoT<=NadoT+3072;
end
end
end
always @(OutClock)
begin
begin
{CurShiftData, OutData}<=CurShiftData;
end
if(CurPos) CurPos<=CurPos-1'b1;
else
begin
CurPos<=79;
CurShiftData<=EndPos;
if(EndPos==0 && EndPosSw)
begin
CurShiftData=TimeStep];
EndPosSw=0;
end
else
begin
CurShiftData=RingData;
EndPosSw=1;
EndPos=EndPos+1'b1;
end
end
end
endmodule
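The block clocked by DeMuxCounter in the listing above uses a register output as a clock, which is the ripple-clock pattern mentioned earlier in the thread. One common alternative is to keep everything on the fast clock and use a one-cycle enable strobe instead. The following is only a sketch of that idea: the module name demux_enable, the port names and the divide-by-4 width are assumptions for illustration, not the actual design.

```verilog
// Hypothetical names; a 2-bit divider is assumed purely for illustration.
module demux_enable (
    input  wire       clk,         // fast input clock (InReady in the thread)
    input  wire       din,         // serial input bit
    output reg  [3:0] word,        // four collected bits, first bit in word[3]
    output reg        word_valid   // one-cycle strobe in the clk domain
);
    reg [1:0] cnt   = 2'd0;
    reg [2:0] shift = 3'd0;

    initial begin
        word       = 4'd0;
        word_valid = 1'b0;
    end

    always @(posedge clk) begin
        shift      <= {shift[1:0], din};
        cnt        <= cnt + 2'd1;
        word_valid <= 1'b0;
        if (cnt == 2'd3) begin
            // Every fourth clock: latch the last three bits plus the new one.
            word       <= {shift, din};
            word_valid <= 1'b1;
        end
    end
endmodule
```

Downstream logic would then be written as always @(posedge clk) with an if (word_valid) condition, so the whole chain stays in one clock domain.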
You have several issues in the code below:
- It's combinational (no edge-sensitive condition).
- CurShiftData is the RAM output register and a shift register at the same time. That doesn't work.
Because I don't know what you actually want to achieve here, I can't suggest a solution.
always @(OutClock)
begin
begin
{CurShiftData, OutData}<=CurShiftData;
end
if(CurPos) CurPos<=CurPos-1'b1;
else
begin
CurPos<=79;
CurShiftData<=EndPos;
if(EndPos==0 && EndPosSw)
begin
CurShiftData=TimeStep];
EndPosSw=0;
end
else
begin
CurShiftData=RingData;
EndPosSw=1;
EndPos=EndPos+1'b1;
end
end
end
Dear FvM,
thank you for your reply. Actually, I really cannot guess what to do; I am new to FPGAs... I need to implement the following algorithm: on each positive or negative edge I should send one bit from the 80-bit array CurShiftData. If there is no data left in CurShiftData, I need to generate a new CurShiftData according to the following rules: bits [79:77] are zeros, bits [76:64] correspond to EndPos (I am using [79:64]<=EndPos, am I right?), and the remaining 64 bits [63:0] are taken from the TimeStep array one time in 1024 and from RingData in all other cases.
I have no feeling for what your comment about the RAM output register means. Yes, I understand that it is a shift register, and probably it is by some means also RAM, but I cannot tell for myself when it counts as RAM and when it does not, so I cannot figure out how to fix this problem. Please help me!
Sincerely,
Ilghiz
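The 80-bit word layout described in the question can be written with an explicit concatenation, which makes the field boundaries visible. This is a sketch only, assuming a 13-bit EndPos in bits [76:64] (matching an 8192-entry buffer), which leaves three zero bits on top. The module name frame_pack and its port names are hypothetical.

```verilog
// Hypothetical helper module; field widths follow the description in the post.
module frame_pack (
    input  wire [12:0] end_pos,   // read position (EndPos in the thread)
    input  wire [63:0] payload,   // word read from TimeStep or RingData
    output wire [79:0] frame
);
    // 3 zero bits + 13 position bits + 64 data bits = 80 bits.
    assign frame = {3'b000, end_pos, payload};
endmodule
```

Assigning an unsigned 13-bit value to a 16-bit slice such as [79:64] gives the same result, because Verilog zero-extends the narrower unsigned operand; the concatenation simply states the padding outright.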
As a general remark, your code is rather complex for a "First_Project". I hope you're able to solve the problems involved without too much frustration. Most people start learning HDL programming with more basic design problems.
--- Quote Start ---
on positive or on negative edges I should send one bit
--- Quote End ---
I understand now why you wrote always @(OutClock), but unfortunately it's not synthesizable. You need a kind of DDIO (double-data-rate) output register. I present a solution in principle in the code snippet below (registering two bits and using a multiplexer to select the right output data bit for each clock phase). For high OutClock speeds, explicit instantiation of a DDIO primitive may be required.
The other point is to keep to the requirements for RAM inference. I'm showing below a construct that is accepted by the Quartus compiler, but I'm not sure if it's acceptable to register the RAM output one clock cycle in advance. If it doesn't work this way, you have to use a different construct that reserves a one-clock-cycle delay for the RAM read action.
always @(posedge OutClock)
begin
begin
{CurShiftData, OutData_n,OutData_p}<=CurShiftData;
end
if(CurPos)
begin
CurPos<=CurPos-1'b1;
end
else
begin
CurPos<=79;
CurShiftData<=EndPos;
if(EndPos==0 && EndPosSw)
begin
EndPosSw<=0;
CurShiftData<=CurShiftData_s1;
end
else
begin
EndPosSw<=1;
EndPos<=EndPos+1'b1;
CurShiftData<=CurShiftData_s2;
end
end
CurShiftData_s1<=TimeStep];
CurShiftData_s2<=RingData;
end
assign OutData = (OutClock)?OutData_p:OutData_n;
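For the alternative mentioned above, where one clock cycle is reserved for the RAM read, a possible shape is sketched below. It is deliberately simplified: single data rate, no TimeStep alternation, and the RAM itself kept outside the module. The module name frame_serializer and the ports ram_q and ram_addr are hypothetical; ram_q is assumed to be the registered output of a synchronous RAM addressed by ram_addr.

```verilog
// Hypothetical sketch: the read address changes only when a frame is loaded,
// so the registered RAM output has many cycles to settle before its next use.
module frame_serializer (
    input  wire        clk,
    input  wire [63:0] ram_q,      // registered RAM data for ram_addr
    output reg  [12:0] ram_addr,   // read address presented to the RAM
    output reg         out_bit     // serial output, LSB of each frame first
);
    reg [79:0] shift = 80'd0;
    reg [6:0]  cnt   = 7'd1;       // start at 1 so the first read can complete

    initial begin
        ram_addr = 13'd0;
        out_bit  = 1'b0;
    end

    always @(posedge clk) begin
        out_bit <= shift[0];                       // one bit per clock
        if (cnt != 7'd0) begin
            cnt   <= cnt - 7'd1;
            shift <= {1'b0, shift[79:1]};          // shift right
        end else begin
            cnt      <= 7'd79;
            shift    <= {3'b000, ram_addr, ram_q}; // load the next frame
            ram_addr <= ram_addr + 13'd1;          // start the following read
        end
    end
endmodule
```

Here the shift register is loaded from the RAM output register rather than being that register, which keeps the memory read in a form the inference template expects.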
Dear FvM,
thank you very much for your kind suggestion. Thanks to your help and help from other forums, I was able to rewrite this example in such a way that it compiles and looks reasonable in the RTL view. As for "My_First_Project", it really is my first project; I have never used Verilog, VHDL or any other synthesis language before. However, it is a really simple algorithm compared with my goal, a QR-like algorithm. I hope that my experience in numerical mathematics helps me to implement it fast enough :)
Sincerely,
Ilghiz