signal power estimation

Altera_Forum

Hello guys,

I'm trying to implement a block that computes the RMS value of a signal over n samples.

Using the standard definition, one must compute the following (x_k is a complex number, so I have to take |x_k| into account):

 

http://upload.wikimedia.org/math/0/1/4/014a453df92ed77eca97b228f0624d9f.png  
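The linked image is not reproduced here. Assuming it shows the standard RMS definition, then for complex samples written as x_k = I_k + jQ_k (the names I_k and Q_k are introduced here for illustration) it would read:

```latex
x_{\mathrm{rms}} = \sqrt{\frac{1}{n}\sum_{k=1}^{n} |x_k|^2},
\qquad |x_k|^2 = I_k^2 + Q_k^2
```

So the hardware only has to accumulate I_k^2 + Q_k^2; no per-sample square root is needed to obtain |x_k|^2.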

 

The simplest way I have in mind is to use an accumulator. However, it will introduce some overflow issues. Is there a good way of doing this?

Thank you!
17 Replies
Altera_Forum

 

 

Since your input is complex, square the real part (Re*Re), square the imaginary part (Im*Im), add the two, and accumulate the result over, say, 2^20 samples.

The accumulator needs 20 extra bits (beyond the width of the adder result) to avoid overflow. For the 1/n division, discard the 20 LSBs when you read the final result, before clearing the accumulator to restart.

As for the square root, avoid it if you don't need it; otherwise use a LUT or an IP core.
Altera_Forum

 

 

Dear Kaz,

Please correct me if I'm wrong. I think that in the worst case I need one extra bit for each addition. In that case, if I add 2^20 samples I will need 2^20 extra bits... or not?

Perhaps I just have to experiment to find the number of bits needed to accumulate 2^20 samples without overflow.
Altera_Forum

 

 

No, that many bits would be enough to count all the stars in the cosmos, or even my cash.

Each pairwise addition needs one extra bit, and summing 2^20 samples takes 20 levels of pairwise additions, so you need 20 more bits (power-of-2 maths).
Altera_Forum

Imagine your input is just 1 (one bit) and you add up 2^20 samples of 1. What do you get?

1*2^20 = 2^20, which needs 20 bits (+1 bit), i.e. 21 bits in total.
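The same argument can be written as a general bound. As a sketch, assume unsigned inputs d_k of W bits each and N = 2^m samples (the symbols d_k, W, N and m are introduced here and do not appear in the thread):

```latex
\sum_{k=1}^{N} d_k \;\le\; N\,(2^{W}-1) \;=\; 2^{W+m} - 2^{m} \;<\; 2^{W+m}
```

So W + m bits are always enough, i.e. m = log2(N) extra bits and not N extra bits. With W = 1 and m = 20 this gives the 21 bits of the example above.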
Altera_Forum

 

Dear kaz,

That's OK. But what about working with 16-bit words?

If you add up 2^20 samples of value 4 you get 4*2^20, which needs more than 21 bits... or not?

My problem is not the overflow of the counter but that of the sum.
Altera_Forum

 

 

4*2^20 requires 3 bits + 20 bits = 23 bits.

In your case you multiply, say, 8 bits * 8 bits => 16 bits; adding the two 16-bit squares => 17 bits; accumulating 2^20 of those sums adds 20 bits => 37 bits.

You can picture this as cascaded pairs of additions:

sample1 (17 bits) + sample2 (17 bits) needs 18 bits (res1)

sample3 (17 bits) + sample4 (17 bits) needs 18 bits (res2)

res1 (18 bits) + res2 (18 bits) needs 19 bits

...

So you can picture 20 cascaded stages of addition for 2^20 samples.
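As a quick check of the 37-bit figure, here is the worst case worked out under the assumption that I_k and Q_k are unsigned 8-bit values (the thread does not state the number format, and the symbols d_k, I_k, Q_k are introduced here):

```latex
d_k = I_k^2 + Q_k^2 \;\le\; 2\,(2^{8}-1)^2 \;<\; 2^{17},
\qquad
\sum_{k=1}^{2^{20}} d_k \;<\; 2^{20}\cdot 2^{17} \;=\; 2^{37}
```

The sum therefore always fits in 37 bits. If the inputs are instead signed two's-complement 8-bit values, then |I_k| <= 2^7, d_k <= 2^15 and the sum is at most 2^35, so 37 bits is still sufficient, with a little margin.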
Altera_Forum

Dear Kaz,

I've got the point. That seems a nice solution!

But if I want to implement the adder as cascaded stages, I have to write it by hand. I was thinking about something recursive:

https://www.alteraforum.com/forum/attachment.php?attachmentid=8569
Altera_Forum

 

 

Using parallel adders wastes a massive number of adders (2^19 for the first stage alone, since 2^20 samples form 2^19 pairs). A single accumulator does the equivalent job, but it needs a reset to start and it is slow, i.e. the result is only available after all 2^20 samples have gone through. For RMS that should be fine.
Altera_Forum

Thank you kaz.

How can I implement such an accumulator? Is there a precompiled standard block, or does it have to be written by hand?
Altera_Forum

 

 

It is just a feedback register updated on the clock edge:

sum <= sum + din; --din is I*I + Q*Q

You also need to read the final result, apply a reset, etc.

At the end, take care to extract the mean of squares by dropping the 20 LSBs:

sum_trunc <= sum(n downto 20);

Then decide what to do about the square root.
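For the din term mentioned above, a minimal sketch of a squared-magnitude front end could look like the following. The entity name sq_mag, the port names i_in and q_in, the generic W and the choice of signed inputs are all assumptions made for illustration; none of them come from the thread.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical front end: entity name, port names and widths are illustrative.
entity sq_mag is
  generic (W : positive := 16);            -- width of the signed I and Q inputs (assumed)
  port (
    clk  : in  std_logic;
    i_in : in  signed(W-1 downto 0);
    q_in : in  signed(W-1 downto 0);
    din  : out unsigned(2*W-1 downto 0)    -- I*I + Q*Q, registered
  );
end entity sq_mag;

architecture rtl of sq_mag is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Each signed product is 2*W bits wide and never negative, so it can be
      -- reinterpreted as unsigned. The sum is at most 2**(2*W-1), which still
      -- fits in 2*W unsigned bits, so no extra carry bit is needed here.
      din <= unsigned(i_in * i_in) + unsigned(q_in * q_in);
    end if;
  end process;
end architecture rtl;
```

With W = 16 the output is 32 bits wide, which would line up with a 32-bit squared-magnitude input on the accumulator. With unsigned inputs the sum would need one more bit.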
Altera_Forum

Thank you kaz, I'm on it.

I've written the following code:

LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE IEEE.std_logic_unsigned.all;

ENTITY rms_estimator IS
  GENERIC (ACCBIT : INTEGER := 10);
  PORT(
    squared_abs : IN  STD_LOGIC_VECTOR(31 DOWNTO 0);
    clk, clr    : IN  STD_LOGIC;
    squared_rms : OUT STD_LOGIC_VECTOR(31 DOWNTO 0);
    ready       : OUT STD_LOGIC
  );
END ENTITY rms_estimator;

ARCHITECTURE logic OF rms_estimator IS
  signal sum : STD_LOGIC_VECTOR(ACCBIT+31 downto 0) := (OTHERS => '0');
  signal i   : STD_LOGIC_VECTOR(ACCBIT downto 0) := (OTHERS => '0');
BEGIN
  PROCESS(clk, clr)
  BEGIN
    IF (clr = '1') THEN
      squared_rms <= (OTHERS => '0');
      sum <= (OTHERS => '0');
      ready <= '0';
      i <= (OTHERS => '0');
    ELSIF rising_edge(clk) THEN
      if (i = 2**ACCBIT) then
        squared_rms <= std_logic_vector(sum(ACCBIT+31 downto ACCBIT));
        ready <= '1';
      else
        sum <= sum + unsigned(pad & squared_abs);
        i <= i + 1;
        ready <= '0';
        squared_rms <= (others => '0');
      end if;
    END IF;
  END PROCESS;
END ARCHITECTURE logic;

However, squared_rms is always 'X' after the ready bit is set to '1'. Any suggestions? I'm really stuck.

Thank you for your time!
Altera_Forum

Some suggestions (not compiled):

ARCHITECTURE logic OF rms_estimator IS
  signal sum : unsigned(ACCBIT+31 downto 0) := (OTHERS => '0');
  signal i   : unsigned(ACCBIT downto 0) := (OTHERS => '0');
BEGIN
  PROCESS(clk, clr)
  BEGIN
    IF (clr = '1') THEN
      squared_rms <= (OTHERS => '0');
      sum <= (OTHERS => '0');
      ready <= '0';
      i <= (OTHERS => '0');
    ELSIF rising_edge(clk) THEN
      if (i = 2**ACCBIT - 1) then
        sum <= (others => '0');
        squared_rms <= std_logic_vector(sum(ACCBIT+31 downto ACCBIT));
        ready <= '1';
        i <= (others => '0');
      else
        sum <= sum + unsigned(pad & squared_abs); -- what is pad? '0'?
        i <= i + 1;
        ready <= '0';
        --squared_rms<=(others=>'0'); -- you may keep it latched
      end if;
    END IF;
  END PROCESS;
END ARCHITECTURE logic;
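For reference, here is one way the suggestions above might be consolidated into a self-contained unit. This is a sketch, not the code anyone in the thread actually ran: it assumes ieee.numeric_std in place of std_logic_unsigned, drops the stray pad, keeps the original port names, and uses a helper variable nxt (introduced here) so that the sample arriving on the last cycle of a block is also included in the mean.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity rms_estimator is
  generic (ACCBIT : positive := 10);        -- average over 2**ACCBIT samples
  port (
    clk, clr    : in  std_logic;
    squared_abs : in  std_logic_vector(31 downto 0);
    squared_rms : out std_logic_vector(31 downto 0);
    ready       : out std_logic
  );
end entity rms_estimator;

architecture rtl of rms_estimator is
  -- 32 input bits plus ACCBIT growth bits, so the sum cannot overflow.
  signal sum : unsigned(ACCBIT+31 downto 0) := (others => '0');
  signal i   : unsigned(ACCBIT-1 downto 0)  := (others => '0');
begin
  process (clk, clr)
    variable nxt : unsigned(sum'range);     -- sum including the current sample
  begin
    if clr = '1' then
      sum         <= (others => '0');
      i           <= (others => '0');
      squared_rms <= (others => '0');
      ready       <= '0';
    elsif rising_edge(clk) then
      nxt := sum + unsigned(squared_abs);
      if i = 2**ACCBIT - 1 then
        -- Last sample of the block: dropping ACCBIT LSBs divides by 2**ACCBIT.
        squared_rms <= std_logic_vector(nxt(ACCBIT+31 downto ACCBIT));
        ready       <= '1';
        sum         <= (others => '0');
        i           <= (others => '0');
      else
        sum   <= nxt;
        i     <= i + 1;
        ready <= '0';                       -- squared_rms keeps its last value
      end if;
    end if;
  end process;
end architecture rtl;
```

In this sketch ready pulses for one clock cycle per block and the estimator restarts on its own. If the accumulation should stop after the first block, as in the original attempt, the restart assignments would have to be changed accordingly.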
Altera_Forum

Thank you kaz.

The 'pad' was not meant to be there; it was left over from a previous attempt.

I did not reset the 'i' signal after it reaches 2**ACCBIT because, for testing purposes, I want the sum to stop once the ready bit is set.

However, I still always get XXXX... at the output when the ready bit is set. Hmm.

It seems that the problem is in the recursion.

If I add a constant instead of the input:

sum<=32767+sum ==> the rms is 32767 as expected
Altera_Forum

Try correcting the width of squared_rms (it should be 10 bits narrower).

Edit: ignore that note; squared_rms is 32 bits and so is squared_abs.

Change the type of sum to unsigned.
Altera_Forum

Dear kaz,

What I see is that at reset time:

sum=0;

Then at the first addition:

sum<=sum+unsigned(squared_abs) --squared_abs=2147483647 loaded from a .dat file

The result is XXXXXX, and it keeps that value until the ready bit is set and then cleared. After it is cleared, the accumulation starts correctly from 0.

Thank you!
Altera_Forum

I've found the bug. In the testbench I was not resetting the input. Now it seems to work correctly. Thank you for the help, kaz. You're a real guru!
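The actual testbench was not posted, so the following is only a sketch of how this kind of bug can be avoided: give the stimulus signal a defined initial value, so the accumulator never adds 'U' bits (one undefined operand is enough to turn the whole sum into 'X'). The testbench name, the done flag, the timing and the constant 32767 are all illustrative choices; only the port and generic names of rms_estimator are taken from the entity posted earlier in the thread.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity rms_estimator_tb is
end entity rms_estimator_tb;

architecture sim of rms_estimator_tb is
  signal clk         : std_logic := '0';
  signal clr         : std_logic := '1';
  -- Defined initial value: an uninitialised std_logic_vector starts as 'U'.
  signal squared_abs : std_logic_vector(31 downto 0) := (others => '0');
  signal squared_rms : std_logic_vector(31 downto 0);
  signal ready       : std_logic;
  signal done        : boolean := false;
begin
  -- 100 MHz clock that stops once the stimulus process is finished.
  clk <= not clk after 5 ns when not done else '0';

  dut : entity work.rms_estimator
    generic map (ACCBIT => 4)
    port map (clk => clk, clr => clr, squared_abs => squared_abs,
              squared_rms => squared_rms, ready => ready);

  stimulus : process
  begin
    wait for 20 ns;
    clr         <= '0';
    squared_abs <= std_logic_vector(to_unsigned(32767, 32));  -- constant input
    wait until ready = '1';
    -- Only the low 31 bits are converted so the value fits in a VHDL integer.
    report "mean of squares = " &
           integer'image(to_integer(unsigned(squared_rms(30 downto 0))));
    done <= true;
    wait;
  end process;
end architecture sim;
```

When the stimulus is read from a file, the same idea applies: drive a known value until the first valid sample has been loaded.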

Altera_Forum

Hello guys,

I'm working on RMS power estimation again. I have two complex signals, sig1 and sig2.

I measure the RMS of sig1 and sig2 with a VHDL block, and then in software I calculate [(float) RMS(sig1)] / [(float) RMS(sig2)], with sig1 > sig2.

I want RMS(sig1) ~= RMS(sig2). So I tried multiplying sig2 by a 16-bit integer constant derived from [(float) RMS(sig1)] / [(float) RMS(sig2)] and then taking the 16 MSBs of the product. However, I think I'm still missing something because I see strange results.

Any suggestions?

Thank you! Have a nice day!
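This question got no reply in the thread, so what follows is only a possible direction and not a confirmed diagnosis. One thing worth checking is the fixed-point scaling: the 16 MSBs of a 32-bit product of two 16-bit numbers correspond to dividing the product by 2^16, so a plain unsigned 16-bit constant always acts as a gain below 1, whereas boosting sig2 up to sig1 needs a gain above 1. The sketch below assumes the ratio is quantised as unsigned Q4.12, i.e. gain = round(ratio * 2^12), and saturates the result. The entity name gain_scale, the port names and the Q4.12 format are all assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical scaler: the Q4.12 gain format and all names are assumptions.
entity gain_scale is
  port (
    clk  : in  std_logic;
    x    : in  signed(15 downto 0);      -- one component (I or Q) of sig2
    gain : in  unsigned(15 downto 0);    -- round(ratio * 2**12), Q4.12
    y    : out signed(15 downto 0)
  );
end entity gain_scale;

architecture rtl of gain_scale is
begin
  process (clk)
    variable prod   : signed(32 downto 0);
    variable scaled : signed(20 downto 0);
  begin
    if rising_edge(clk) then
      -- Zero-extend the gain to 17 bits so it stays positive when read as signed.
      prod   := x * signed(resize(gain, 17));    -- 16 x 17 bits -> 33 bits
      scaled := prod(32 downto 12);              -- drop the 12 fractional bits
      -- Saturate to the 16-bit output range instead of wrapping around.
      if scaled > 32767 then
        y <= to_signed(32767, 16);
      elsif scaled < -32768 then
        y <= to_signed(-32768, 16);
      else
        y <= resize(scaled, 16);
      end if;
    end if;
  end process;
end architecture rtl;
```

The same scaler would be applied to both the I and Q components of sig2. Dropping the fractional bits truncates rather than rounds, which adds a small negative bias; whether that matters depends on the application.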