
FIR II Coefficient Read from Cyclone V HPS Freezing.

CCase1
Beginner
1,597 Views

Hello,

I am having an issue reading the FIR II coefficients through an Avalon memory-mapped slave interface from the lightweight bridge of a Cyclone V HPS.

Reading the location mapped in Qsys from either the kernel or userspace results in a Linux system freeze.

I have noticed the following behavior on the lightweight AXI bus.

Any help appreciated.

I have tried mapping the FIR II Avalon slave at multiple locations and reading from both kernel space (ioread32()) and userspace (pointer dereference after mmap()). Both result in a freeze.

Has anyone been successful doing this, or does anyone know of a project which has instantiated this successfully?

Attached is a picture of most of the lightweight AXI signals right after I access the coefficients at 0x3000 on the lightweight bridge.

 

I am using Quartus 18.0 Standard.

 

Regards,

Chris.

JC_FPGA
Novice
1,482 Views

There's a long-standing bug in the FIR filter read-back code. The read signal is messed up. I've attached a Python script that will fix the code. You have to re-run the script each time the FIR is regenerated by Quartus. You can place the following in a batch file or shell script to make life easier.

python3 fir_filter_read_coeff_fix.py YOUR_PROJECT_DIR/interpolator/interpolator_0002.vhd
python3 fir_filter_read_coeff_fix.py YOUR_PROJECT_DIR/interpolator_sim/interpolator.vhd

Replace YOUR_PROJECT_DIR with your required path and replace "interpolator" with whatever you named your filter. The main thing is that you have to fix both the implementation and the simulation versions of the filter.

 

Python script:

################################################################################
#
# Python3 script to fix bug in Intel/Altera FIR generator code
#
# Created: 2018-08-16
# Author: Jim Cox
################################################################################

# This script is meant to be used as a post-processing step after creating a
# FIR filter using the Quartus tools. The read signal for the coefficients
# is not connected properly to the lower level module.

import re
import sys

if (len(sys.argv) != 2):
    sys.exit("\nUsage: fir_filter_read_coeff_fix.py my_file.vhd\n")

### Lines to delete ###
### signal coeff_in_read_sig : std_logic;
###
### coeff_in_read_sig <= not(coeff_in_we(0));
###

### Line to modify ###
### busIn_read => coeff_in_read_sig,

#################################################################
# Input file
my_file = open(sys.argv[1], "rt+")

my_file_list = my_file.readlines()

# Move back to the beginning of the file
my_file.seek(0)

for line_of_code in my_file_list:
    # Drop the declaration of the intermediate read signal
    if (re.search(r'^\W+signal coeff_in_read_sig : std_logic;', line_of_code)):
        continue
    # Drop the assignment that derives the read signal from the write enable
    if (re.search(r'^\W+coeff_in_read_sig <= not\(coeff_in_we\(0\)\);', line_of_code)):
        continue
    # Connect the lower level module directly to the coeff_in_read port
    if (re.search(r'^\W+busIn_read => coeff_in_read_sig,', line_of_code)):
        print(" busIn_read => coeff_in_read,", file=my_file)
        print("File has been updated\n")
        continue
    print(line_of_code, end='', file=my_file)

# The rewritten file is shorter than the original, so cut off the leftover tail
my_file.truncate()
my_file.close()
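
To see what the script does without touching a generated file, here is a minimal sketch that applies the same three patterns to a list of lines in memory. The sample VHDL lines and the busIn_write mapping are made up for illustration, and unlike the script above this sketch keeps the original indentation of the rewired line.

```python
import re

# Same patterns as the post-processing script; applied in memory only.
DROP_PATTERNS = [
    re.compile(r'^\W+signal coeff_in_read_sig : std_logic;'),
    re.compile(r'^\W+coeff_in_read_sig <= not\(coeff_in_we\(0\)\);'),
]
REWIRE_PATTERN = re.compile(r'^(\W+)busIn_read => coeff_in_read_sig,')


def fix_lines(lines):
    """Return a patched copy of the given VHDL lines."""
    fixed = []
    for line in lines:
        # Remove the intermediate signal declaration and its assignment
        if any(p.search(line) for p in DROP_PATTERNS):
            continue
        # Point busIn_read at the real coeff_in_read port, keeping indentation
        match = REWIRE_PATTERN.search(line)
        if match:
            fixed.append(match.group(1) + "busIn_read => coeff_in_read,\n")
            continue
        fixed.append(line)
    return fixed


# Hypothetical excerpt; the real generated file is much longer.
sample = [
    "    signal coeff_in_read_sig : std_logic;\n",
    "    coeff_in_read_sig <= not(coeff_in_we(0));\n",
    "        busIn_read => coeff_in_read_sig,\n",
    "        busIn_write => coeff_in_we(0),\n",  # made-up neighbouring line
]
print("".join(fix_lines(sample)), end="")
```

Only the busIn_read line changes; the two lines that created coeff_in_read_sig disappear and everything else passes through untouched.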

 

CCase1
Beginner
1,482 Views

I will try this right away!

Very much appreciated!

CCase1
Beginner
1,482 Views

Well that helped a whole bunch.

I am now able to read certain coefficients.

However, certain offsets will freeze Linux even though the bus looks to resolve okay; see pictures.

I have made two edits to the FIR II core.

I redefined the reset_n to be a reset inside the HP_FIR_hw.tcl file.

Changed to:

add_interface_port coeff_reset coeff_in_areset reset Input 1

 

 

 

According to the FIR II guide, the reset is a reset and not a reset_n.

 

I also ran the provided Python script.

I can read the following coefficients for a symmetric 15-tap filter: 0, 4, 8, 12. They show up correctly as instantiated.

Accessing others freezes the Linux system, even though the access appears very similar in SignalTap.

 

The FIR II instantiates like so:

 

entity system_fir_compiler_ii_0 is
    port (
        clk                : in  STD_LOGIC;
        reset_n            : in  STD_LOGIC;
        coeff_in_clk       : in  STD_LOGIC;
        coeff_in_areset    : in  STD_LOGIC;
        coeff_in_address   : in  STD_LOGIC_VECTOR(4-1 downto 0);
        coeff_in_data      : in  STD_LOGIC_VECTOR(32-1 downto 0);
        coeff_in_we        : in  STD_LOGIC_VECTOR(0 downto 0);
        coeff_in_read      : in  STD_LOGIC;
        coeff_out_data     : out STD_LOGIC_VECTOR(32-1 downto 0);
        coeff_out_valid    : out STD_LOGIC_VECTOR(0 downto 0);
        ast_sink_data      : in  STD_LOGIC_VECTOR((0 + 1*24) * 1 + 0 - 1 downto 0);
        ast_sink_valid     : in  STD_LOGIC;
        ast_sink_sop       : in  STD_LOGIC;
        ast_sink_eop       : in  STD_LOGIC;
        ast_sink_error     : in  STD_LOGIC_VECTOR(1 downto 0);
        ast_source_data    : out STD_LOGIC_VECTOR(24 * 1*1 - 1 downto 0);
        ast_source_valid   : out STD_LOGIC;
        ast_source_sop     : out STD_LOGIC;
        ast_source_eop     : out STD_LOGIC;
        ast_source_channel : out STD_LOGIC_VECTOR(log2_ceil_one(4) - 1 downto 0);
        ast_source_error   : out STD_LOGIC_VECTOR(1 downto 0)
    );
end system_fir_compiler_ii_0;

I have included a failing read at offset 5 and a successful read at offset 0 in SignalTap.

The HPS LW bus is set to 32-bit width. Access to other peripherals on the lightweight bus works just fine.

Originally I was using a 16-bit coefficient bus. I bumped my coefficients to 17 bits in length to get a 32-bit bus. I thought the 16-bit bus might be a problem, especially since the HPS bus is 32-bit. However, certain addresses still freeze Linux.

Super odd.

Attached is an unsuccessful read at offset 5.

 

CCase1
Beginner
1,482 Views

posted a file.

Attached is a successful read at offset 4. (Linux does not lock up)

CCase1
Beginner
1,482 Views

Attached is the core as seen on the Qsys bus.

JC_FPGA
Novice
1,482 Views

The filter itself uses an active low reset and the coefficient reload interface uses an asynchronous active high reset. Don't ask me why. In practice, the coeff_in_areset doesn't appear to do anything (at least in my configuration). Writing a new coefficient appears to take effect immediately.

 

Using a 16-bit or less coefficient width will configure the FIR for a 16-bit coefficient data port. Using a 17-bit or larger coefficient width will configure the FIR for a 32-bit coefficient data port.
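
The width rule described above can be written down as a tiny helper. This is only a sketch of the rule as stated in this thread; the function name is made up and it is not taken from the FIR II documentation.

```python
def coeff_data_port_width(coeff_bits):
    """Coefficient data port width implied by the coefficient bit width.

    Encodes the rule of thumb from this thread: 16 bits or fewer gives a
    16-bit port, 17 bits or more gives a 32-bit port.
    """
    if coeff_bits < 1:
        raise ValueError("coefficient width must be at least 1 bit")
    return 16 if coeff_bits <= 16 else 32


for bits in (12, 16, 17, 24):
    print(bits, "->", coeff_data_port_width(bits))
```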

 

The two pictures from SignalTap appear to show identical operation for the two different coefficient addresses. Can you point out the difference?

 

Make sure your HPS addressing is being handled properly. You need to read coefficient addresses 0, 1, 2, 3, 4, 5, etc. Don't jump by 4 when using a 32-bit HPS bus. Be aware that you only need to write and read-back half of the coefficients for a symmetrical FIR filter. If you have an odd number of coefficients, write and read one additional address. For example, if you have 41 coefficients, you'll need to write to the first 21 addresses in the FIR filter to reprogram them all.
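
As a quick check of the arithmetic for symmetric filters, the number of coefficient addresses to write is half the tap count, rounded up. A minimal sketch (the function name is hypothetical):

```python
def symmetric_reload_count(num_taps):
    """Number of coefficient addresses to write for a symmetric FIR filter.

    Half of the taps, plus one extra address when the tap count is odd.
    """
    if num_taps < 1:
        raise ValueError("a filter needs at least one tap")
    return (num_taps + 1) // 2


print(symmetric_reload_count(41))  # the 41-tap example above
print(symmetric_reload_count(15))  # the 15-tap filter from earlier in the thread
```

For the 41-tap example this gives 21 addresses, and for the 15-tap filter mentioned earlier it gives 8 (coefficient addresses 0 through 7).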

 

What clock do you have connected to the coeff_in_clk port?

 

 

CCase1
Beginner
1,482 Views

Thanks for the help. I greatly appreciate it.

Thanks for the heads-up on the symmetric nature of the coefficients. I had realized this.

Currently I'm at a 32-bit coefficient bus, having bumped my coefficients up to 17 bits wide.

The coeff_clock is tied to h2f_user0_clock from the HPS, set at 100 MHz. All my other peripherals on the LW bus use this same clock and reset.

The coeff_reset is tied to the h2f_reset of the HPS.

I have found that I can write to all the symmetric coefficients just fine in a loop in Linux.

I have used mmap() and am dereferencing a uint32_t* pointer.

I can also read all the coefficients if I do other bus transactions before I go to read the FIR II coefficient bus. I can also only do one read. Doing reads in a loop does not work. So odd.

The address appears to be translating fine onto the offset. Addresses inside the FIR II coefficient block show as 0, 1, 2, 3, 4. The address on the AXI bus and in Linux counts by +4 bytes from the FIR II base. I hope this makes sense. Incrementing a uint32_t* maps to the offset I would expect, and I see that.
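
The relationship between the two address views can be sketched as follows, assuming a 32-bit (4-byte) lightweight bridge data width. The 0x3000 base is the offset mentioned in the first post; the function names are made up for illustration.

```python
WORD_BYTES = 4  # assumed 32-bit data width on the lightweight bridge


def coeff_byte_offset(fir_base, coeff_index):
    """Byte offset on the bridge for a given FIR II coefficient address."""
    return fir_base + coeff_index * WORD_BYTES


def coeff_index(fir_base, byte_offset):
    """Coefficient address seen by the FIR II core for a bridge byte offset."""
    delta = byte_offset - fir_base
    if delta < 0 or delta % WORD_BYTES != 0:
        raise ValueError("offset is not a word-aligned address inside the core")
    return delta // WORD_BYTES


FIR_BASE = 0x3000  # example base taken from the first post
for index in range(5):
    print(index, hex(coeff_byte_offset(FIR_BASE, index)))
```

So coefficient address 5 corresponds to byte offset 0x3014 from the bridge base, which is what incrementing a uint32_t* by 5 produces in C.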

 

 

JC_FPGA
Novice
1,482 Views

It appears that the read has a few clocks of latency. Maybe that's why reading in a loop doesn't work. Does a burst read operate properly?

 

Is everything working now other than the read-back?

CCase1
Beginner
1,482 Views

Yeah. Everything appears to be working, apart from only being able to read one FIR II coefficient after an mmap().

The reads in a user-space loop are spread very far out in time (relatively), so I don't think it's a burst issue.

Half of me thinks it's a Linux C programming issue, but identically programmed loop reads on other LW peripherals behave just fine.

 

I can loop write all the coefficients just fine and then read them back successfully one by one.

 

I am attaching a full poke application. The problem FIR II read loop is at the bottom. That won't ever run. The single offset read in the middle will run, but seemingly only if I investigate another LW peripheral before that.

 

I'll continue to play with this in my extra time, but for now it will deliver. The most important feature is coefficient write, which works great now. I bet if I study the LW AXI before and after a freeze I might come up with something.

 

I MASSIVELY appreciate the help on the problem JCox. I diffed the FIR II IP from 18.0 with the latest 19.1 release. No differences! This is HIGHLY alarming.

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

// The start address and length of the Lightweight bridge
#define HPS_TO_FPGA_LW_BASE 0xFF200000
#define HPS_TO_FPGA_LW_SPAN 0x0020000

#define HPS_TO_FPGA_HW_BASE 0xC0000000
#define HPS_TO_FPGA_HW_SPAN 0x04000000

#define GEN_INTERRUPT 0x20C0
#define DUMMYREG 0x2000
#define FIR_OFFSET 0x0000

volatile uint32_t *fir2base = 0;
volatile uint32_t *genint = 0;
volatile uint32_t *dummyreg = 0;
volatile uint32_t *gen_interupt = 0;
void * lw_bridge_map = 0;

int main(int argc, char ** argv)
{
    int devmem_fd = 0;
    int result = 0;
    int offset = 0;
    int write_enable = 0;
    int read_loop_enable = 0;
    int dummy_value = 0;

    if (argc < 4) {
        printf("Please supply offset and writeover option\n");
    } else {
        offset = atoi(argv[1]);
        printf("Offset is: %d \n", offset);
        write_enable = atoi(argv[2]);
        printf("Write enable is: %d \n", write_enable);
        read_loop_enable = atoi(argv[3]);
        printf("Read loop enable is: %d \n", read_loop_enable);
    }

    // Open up the /dev/mem device (aka, RAM)
    devmem_fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (devmem_fd < 0) {
        perror("devmem open");
        exit(EXIT_FAILURE);
    }

    // mmap() the entire address space of the Lightweight bridge so we can access our custom module
    lw_bridge_map = mmap(NULL, HPS_TO_FPGA_LW_SPAN, PROT_READ | PROT_WRITE, MAP_SHARED, devmem_fd, HPS_TO_FPGA_LW_BASE);
    if (lw_bridge_map == MAP_FAILED) {
        perror("devmem mmap");
        close(devmem_fd);
        exit(EXIT_FAILURE);
    }

    // Byte offsets are applied through char* (arithmetic on void* is a GNU extension)
    fir2base = (volatile uint32_t*)((char*)lw_bridge_map + FIR_OFFSET);
    dummyreg = (volatile uint32_t*)((char*)lw_bridge_map + DUMMYREG);
    genint = (volatile uint32_t*)((char*)lw_bridge_map + GEN_INTERRUPT);

    // Read another LW peripheral first
    printf("About to read DUMMYREG\n");
    for (int i = 0; i < 16; i++) {
        printf("ADDRESS IS %p ", (void*)(dummyreg + i));
        printf("DUMMYREG: %x \n", *(dummyreg + i));
        // printf("ADDRESS IS %p ", (void*)(dummyreg + i));
        // printf("DUMMYREG: %x \n", *(dummyreg + i));
    }

    // You can only write to half the space, the coefficients are symmetric.
    if (write_enable) {
        printf("About to write FIRBASE\n");
        for (int i = 0; i < 16; i++) {
            printf("ADDRESS IS %p \n", (void*)(fir2base + i));
            printf("Writing %x \n", i);
            *(fir2base + i) = i;
            usleep(100);
        }
    }

    // Single read at the requested offset
    printf("About to read FIRBASE\n");
    printf("ADDRESS IS %p \n", (void*)(fir2base + offset));
    int value = *(fir2base + offset);
    printf("Readback is %x::%d\n", value, value);

    // Problem case: reading the FIR II coefficients in a loop
    volatile uint32_t *cur_addr = 0;
    if (read_loop_enable) {
        printf("About to read FIRBASE\n");
        for (int i = 0; i < 16; i++) {
            printf("ADDRESS IS %p \n", (void*)(fir2base + i));

            // cur_addr = dummyreg + i;
            // printf("DUMMYREG: %x \n", *(cur_addr));
            // value = *(cur_addr);
            // printf("Value is: %x \n", value);
            // printf("ADDRESS IS %p \n", (void*)(fir2base + i));
            cur_addr = fir2base + i;
            value = *(cur_addr);
            // printf("Readback is %x::%d\n", *(fir2base + offset), *(fir2base + offset));
            printf("Fir_Reg value: %x \n", (value));
            // dummy_value = *(dummyreg + 15);
            // usleep(100);
        }
    }

    // Release the mapping and the file descriptor
    munmap(lw_bridge_map, HPS_TO_FPGA_LW_SPAN);
    close(devmem_fd);
    return 0;
}

 
