Cyclone V SOC EPCQ XIP Example Design





Introduction

This page presents a demo of a bare-metal application running from EPCQ flash on a Cyclone V SoC Development Kit, without using SDRAM. Some more details about the demo:

  • EPCQ flash (1MB window) is used for program and read-only data
  • HPS OCRAM (64KB) is used for regular data storage
  • OCRAM is cleared at the beginning of the bare-metal application
  • Caches are demonstrated, including cache pre-loading and cache locking
  • Interrupts are demonstrated

Prerequisites

The following are required for this demo:

  • For running the demo:
    • Cyclone V Development Kit, rev E preferable
    • SoC EDS 16.1 (for Quartus Flash Programmer)
    • Serial terminal running on PC (TeraTerm for example)
  • For compiling the Preloader:
    • SoC EDS 16.1
  • For compiling and debugging the bare-metal application:
    • SoC EDS 16.1, including ARM DS-5

Deliverables

The following are included with this demo:

  • Hardware project, precompiled preloader and sample application archive

The sample application archive soc_dev_kit_boot_epcq.tar.gz contains the following:

soc_dev_kit_xip_epcq/

  • ip/
    • soc_system.qsys - updated qsys file that uses the correct pinmuxing for UART on our devkit
  • output_files/
    • soc_system.sof - updated FPGA configuration file, produced by recompiling with the updated qsys
  • hps_isw_handoff/
    • soc_system_epcq_hps_0 - updated handoff file, produced by recompiling with the updated qsys
  • software/
    • Altera-SoCFPGA-HardwareLib-EPCQ-XIP-CV-GNU - bare-metal application: sources, precompiled binaries and debug launchers
    • spl_bsp_epcq - updated preloader: sources and precompiled binaries
  • preloader_epcq.patch - preloader patch
  • create_preloader.sh - script that automates creating the preloader
  • output_file.cof - Convert Programming File configuration, uses relative paths
  • output_file.jic - EPCQ flash image
  • program_epcq.cdf - Quartus Programmer configuration, uses relative paths
  • prepare_jic.sh - script that prepares the jic file; it uses the cof file to find the components
  • flash_jic.sh - script that flashes the jic to the EPCQ

Running the Demo

In order to run the demo, the following steps need to be performed:

1. Start an Embedded Command Shell

2. Configure the Board to boot from FPGA by setting the BSEL jumpers accordingly:

BSEL0 | BSEL1 | BSEL2 | Description
Left  | Right | Right | FPGA

3. Set the MSEL jumpers accordingly:

MSEL0 | MSEL1 | MSEL2 | MSEL3 | MSEL4 | MSEL5
Up    | Down  | Up    | Up    | Down  | Down

4. Connect the USB Blaster to the PC.

5. Connect the board to the PC using the USB serial connection, and start a serial terminal on the PC, using 115,200-8-N-1.

6. Run the script to flash the provided EPCQ image to the EPCQ, power cycle the board and see the Preloader and bare-metal application running on the console:

cd soc_dev_kit_xip_epcq

./flash_jic.sh


* takes ~4 minutes to flash EPCQ

Info: *******************************************************************

Info: Running Quartus Prime Programmer

Info: Version 16.0.0 Build 211 04/27/2016 SJ Standard Edition

Info: Copyright (C) 1991-2016 Altera Corporation. All rights reserved.

Info: Your use of Altera Corporation's design tools, logic functions

Info: and other software and tools, and its AMPP partner logic

Info: functions, and any output files from any of the foregoing

Info: (including device programming or simulation files), and any

Info: associated documentation or information are expressly subject

Info: to the terms and conditions of the Altera Program License

Info: Subscription Agreement, the Altera Quartus Prime License Agreement,

Info: the Altera MegaCore Function License Agreement, or other

Info: applicable license agreement, including, without limitation,

Info: that your use is for the sole purpose of programming logic

Info: devices manufactured by Altera and sold by Altera or its

Info: authorized distributors. Please refer to the applicable

Info: agreement for further details.

Info: Processing started: Mon Apr 3 12:46:55 2017

Info: Command: quartus_pgm program_epcq.cdf

Info (213045): Using programming cable "USB-BlasterII [1-2.2]"

Info (213011): Using programming file ./output_file.jic with checksum 0x2E6FC8AF for device 5CSXFC6D6@2

Info (209060): Started Programmer operation at Mon Apr 3 12:47:01 2017

Inconsistency detected by ld.so: dl-close.c: 762: _dl_close: Assertion `map->l_init_called' failed!

Info (209016): Configuring device index 2

Info (209017): Device 2 contains JTAG ID code 0x02D020DD

Info (209007): Configuration succeeded -- 1 device(s) configured

Info (209018): Device 2 silicon ID is 0x19

Info (209044): Erasing ASP configuration device(s)

Info (209023): Programming device(s)

Info (209021): Performing CRC verification on device(s)

Info (209011): Successfully performed operation(s)

Info (209061): Ended Programmer operation at Mon Apr 3 12:51:18 2017

Info: Quartus Prime Programmer was successful. 0 errors, 0 warnings

Info: Peak virtual memory: 475 megabytes

Info: Processing ended: Mon Apr 3 12:51:18 2017

Info: Elapsed time: 00:04:23

Info: Total CPU time (on all processors): 00:00:26


* power cycle the board, you will see this on the console:

U-Boot SPL 2013.01.01 (Apr 02 2017 - 09:22:37)

BOARD : Altera SOCFPGA Cyclone V Board

CLOCK: EOSC1 clock 25000 KHz

CLOCK: EOSC2 clock 25000 KHz

CLOCK: F2S_SDR_REF clock 0 KHz

CLOCK: F2S_PER_REF clock 0 KHz

CLOCK: MPU clock 800 MHz

CLOCK: DDR clock 400 MHz

CLOCK: UART clock 100000 KHz

CLOCK: MMC clock 488 KHz

CLOCK: QSPI clock 400000 KHz

RESET: WARM

Reading header from EPCQ flash

Hello EPCQ World!

Running 256KB test_function_1() without caches enabled : 19424723 ticks.

Caches enabled

Preload 256KB test_function_1() in L2 cache : 19427699 ticks.

Running 256KB test_function_1() from preloaded cache : 18111 ticks.

Running 512KB test_function_2() : 38785599 ticks.

Running 512KB test_function_2() : 24785337 ticks.

Running 256KB test_function_1() from preloaded cache : 18105 ticks.

Global Timer Interrupt: 1 of 1000

Global Timer Interrupt: 2 of 1000

Global Timer Interrupt: 3 of 1000

Global Timer Interrupt: 4 of 1000

...

Re-compiling the demo

This section describes how to re-compile the demo if any changes are necessary.

Generating and Compiling the Preloader

To re-create the Preloader from the handoff information, which may have changed, run:

cd soc_dev_kit_xip_epcq

./create_preloader.sh


* creates the following:

soc_dev_kit_xip_epcq/software/spl_bsp_epcq/preloader.hex <- Preloader HEX image, used for flashing

The script above performs the following steps, which can also be done manually:

1. Delete the existing preloader directory.

cd soc_dev_kit_xip_epcq/software/

rm -rf spl_bsp_epcq

2. Use bsp-editor to generate a Preloader

bsp-editor &

3. In bsp-editor, perform the following changes to the default configuration parameters

  • Go to File > New HPS BSP
  • In the new BSP window

* Select the Preloader Settings Directory to be hps_isw_handoff/soc_system_epcq_hps_0/

* Uncheck Use default locations

* Edit the BSP target directory to be soc_dev_kit_xip_epcq/software/spl_bsp_epcq

* Click OK

  • Select the following options in the bsp-editor window

* Uncheck BOOT_FROM_SDMMC

* Check BOOT_FROM_QSPI

* Check EXE_ON_FPGA

* Check SKIP_SDRAM

* Uncheck WATCHDOG_ENABLE

* Click Generate

4. Change current folder to the Generated Preloader

cd soc_dev_kit_xip_epcq/software/spl_bsp_epcq

6. Compile the Preloader using 'make' – this will bring in all source code

make

7. Clean the Preloader using the following command

make clean

8. Patch the Preloader source using the following command:

cd uboot-socfpga

patch -p1 < ../../../preloader_epcq.patch

The output will be something similar to the following:

patching file common/spl/spl.c

patching file drivers/mtd/spi/spi_spl_load.c

patching file include/configs/socfpga_common.h

cd ..

9. Recompile the Preloader using ‘make’:

make

10. Run the following command to convert the Preloader ELF file to HEX:

arm-altera-eabi-objcopy -O ihex --adjust-vma -0xc0000000 uboot-socfpga/spl/u-boot-spl preloader.hex

  • creates the following:

soc_dev_kit_xip_epcq/software/spl_bsp_epcq/preloader.hex <- Preloader HEX image, used for flashing
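Because the Preloader is linked at 0xC0000000 and the conversion subtracts 0xC0000000, the addresses in preloader.hex are expected to start at or near 0x0. The following Python sketch is one way to confirm that on the host. It is not part of the demo archive; the function name hex_address_range is hypothetical, and it only handles the standard Intel HEX record types 00, 02 and 04.

```python
def hex_address_range(lines):
    """Return (lowest, highest) byte address covered by Intel HEX data records.

    Hypothetical helper for illustration; not part of the demo archive.
    """
    base = 0
    lowest = highest = None
    for line in lines:
        line = line.strip()
        if not line.startswith(":"):
            continue
        record = bytes.fromhex(line[1:])
        # All bytes of a record, including its checksum, sum to 0 modulo 256.
        if sum(record) & 0xFF != 0:
            raise ValueError("bad checksum in record: " + line)
        length = record[0]
        offset = int.from_bytes(record[1:3], "big")
        rectype = record[3]
        data = record[4:4 + length]
        if rectype == 0x04:        # extended linear address: upper 16 bits
            base = int.from_bytes(data, "big") << 16
        elif rectype == 0x02:      # extended segment address: 16-byte paragraphs
            base = int.from_bytes(data, "big") << 4
        elif rectype == 0x00 and length:
            start = base + offset
            end = start + length - 1
            lowest = start if lowest is None else min(lowest, start)
            highest = end if highest is None else max(highest, end)
    return lowest, highest
```

A possible use is `hex_address_range(open("preloader.hex"))`, then checking that the lowest address is far below 0xC0000000.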

Recompiling the Baremetal Application

In this scenario, we re-compile the bare-metal application to pick up any source code or makefile changes:

1. Start an Embedded Command Shell

2. Start ARM DS-5 AE by running the command ‘eclipse &’

3. Select a new workspace (or reuse an existing one)

4. Go to File -> Import -> General -> Existing Projects into Workspace and click ‘Next’

5. Choose the ‘Select archive file’ option and click the associated ‘Browse’ button

6. Select the file ‘Altera-SoCFPGA-HardwareLib-EPCQ-XIP-CV-GNU’ and click ‘Open’

7. Click ‘Finish’ to import the project.

8. Go to Project->Build project. This will compile the project


To build from command line:

1. Start an Embedded Command Shell

2. Change the current folder to the Baremetal Source folder

cd soc_dev_kit_xip_epcq/software/Altera-SoCFPGA-HardwareLib-EPCQ-XIP-CV-GNU

3. Delete the existing binaries from previous build

make clean

4. Rebuild Baremetal binaries

make

  • creates the following:

soc_dev_kit_xip_epcq/epcq_demo.axf <- ELF image, used for debugging

soc_dev_kit_xip_epcq/epcq_demo-mkimage.hex <- HEX image, used for flashing

Debugging the Demo

This section presents how to debug the demo using ARM DS-5 Altera Edition.

Connecting to the running Bare-metal Application and Debugging It 

1. Flash the EPCQ image to the EPCQ by running the flash_jic.sh script.

2. Start Eclipse by running ‘eclipse &’ from the Embedded Command Shell.

3. Select any location on your hard drive as the ‘workspace’ location. soc_dev_kit_xip_epcq/software/workspace is recommended.

4. In Eclipse, go to File > Import > General > Existing Projects into Workspace and click Next.

5. In the Import Projects window, click Browse to select the folder soc_dev_kit_xip_epcq/software/Altera-SoCFPGA-HardwareLib-EPCQ-XIP-CV-GNU, then click Finish.

6. Go to Run > Debug Configurations.

7. In Debug Configurations, select DS-5 Debugger > Debug-Running-Application.

8. In the Connection tab, click Browse under Bare Metal Debug to select the desired USB Blaster.

9. In the Connection Browser window, select the USB Blaster that you have connected and click Select.

10. In the Debug Configurations window, click Debug.

11. DS-5 will connect to the board, stop the bare-metal application, and load the symbols from the ELF file.

12. Usual debugging can now be done.

Debugging the Bare-metal Application from the Beginning

Here we run the bootrom, preloader and then we stop at the beginning of the bare-metal application, so that we can debug it from the beginning: 

1. Repeat steps 1-5 as above 

2. In Eclipse, go to Run > Debug Configurations 

3. In Debug Configurations, select DS-5 Debugger > Debug-Application-From-Beginning

4. In Connection tab, click Browse under Bare Metal Debug to select the desired USB Blaster 

5. In the Connection Browser window, select the USB blaster to be the one you have connected and click Select 

6. In the Debug Configurations window, click Debug 

7. The DS-5 will connect to the board, reset it, then run BootROM and Preloader, load bare-metal symbols then run up to the bare-metal application main function. 

8. Usual debugging can now be done.

Demo Architecture

This section presents more detail about the demo architecture.

The placement of the images in EPCQ:

Element        | File Name                                                                    | EPCQ Address
Main SOF       | output_files/soc_system.sof                                                  | 0x00000000
Secondary SOF  | output_files/soc_system.sof                                                  | 0x00800000
Preloader      | software/spl_bsp_epcq/preloader.hex                                          | 0x01000000
BM Application | software/Altera-SoCFPGA-HardwareLib-EPCQ-XIP-CV-GNU/epcq_demo-mkimage.hex    | 0x01100000

Preloader

The Preloader included in the SoC EDS 16.1 release does not natively support EPCQ XIP mode.

To support it, the following changes were made:

1. The Preloader is linked at address 0xC0000000 (offset 0x0 behind the HPS2FPGA bridge). An internal address expander shifts this address to 0xC1000000, which maps to the Preloader residing at address 0x01000000 of the EPCQ.

2. File uboot-socfpga/include/configs/socfpga_common.h

EPCQ XIP is enabled, and the EPCQ jump address is defined as 0xC0100000, where the bare-metal application resides. This maps to the bare-metal application at address 0x01100000 of the EPCQ.

3. File uboot-socfpga/drivers/mtd/spi/spi_spl_load.c

When boot from QSPI is enabled, the logic in function spl_spi_load_image() is executed. Additional logic was added to support EPCQ XIP: it parses the header at address 0xC0100000 and then executes the bare-metal application.

Element        | H2F address | H2F address + address expander | EPCQ Address
Preloader      | 0xC0000000  | 0xC1000000                     | 0x01000000
BM Application | 0xC0100000  | 0xC1100000                     | 0x01100000
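The address arithmetic in the table can be sketched in a few lines of Python. The constant and function names here are illustrative only, and the 0x01000000 offset is simply the value implied by the table rows.

```python
H2F_BASE = 0xC0000000         # start of the HPS2FPGA bridge window
EXPANDER_OFFSET = 0x01000000  # shift applied by the address expander (from the table)


def h2f_to_expanded(h2f_addr):
    """Address after the expander has shifted an H2F address."""
    return h2f_addr + EXPANDER_OFFSET


def h2f_to_epcq(h2f_addr):
    """EPCQ flash offset that an H2F address ends up reading."""
    return (h2f_addr - H2F_BASE) + EXPANDER_OFFSET
```

For example, `hex(h2f_to_epcq(0xC0100000))` evaluates to `'0x1100000'`, which matches the BM Application row.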

Boot Sequence

The standard boot sequence is used, with the BootROM loading the Preloader, then the Preloader loading the bare-metal application:

[Figure: XIP boot flow (Xip-boot-flow.png)]
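Before jumping to the bare-metal application, the Preloader parses the 64-byte mkimage header at the start of the application image. The Python sketch below shows what such a check involves, assuming the image uses the legacy U-Boot image header layout (big-endian fields, magic 0x27051956, header CRC computed with the CRC field zeroed). It is a host-side illustration, not the code added by preloader_epcq.patch, and the function name is hypothetical.

```python
import struct
import zlib

UIMAGE_MAGIC = 0x27051956
# magic, hcrc, time, size, load, entry, dcrc, os, arch, type, comp, name[32]
HEADER_FORMAT = ">7I4B32s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 64 bytes


def parse_uimage_header(blob):
    """Validate a legacy mkimage header and return its main fields."""
    (magic, hcrc, _time, size, load, entry, dcrc,
     _os, _arch, _type, _comp, name) = struct.unpack(
        HEADER_FORMAT, blob[:HEADER_SIZE])
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy mkimage header")
    # The header CRC covers the 64 header bytes with the CRC field set to zero.
    zeroed = blob[:4] + b"\x00" * 4 + blob[8:HEADER_SIZE]
    if zlib.crc32(zeroed) & 0xFFFFFFFF != hcrc:
        raise ValueError("header CRC mismatch")
    return {
        "size": size,
        "load": load,
        "entry": entry,
        "data_crc": dcrc,
        "name": name.rstrip(b"\x00").decode("ascii", "replace"),
    }
```

In an XIP setup the payload is not copied, so the entry point is expected to sit right after the header, at 0xC0100040 for this demo.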

Memory Usage

The following table presents the linker sections that the bare-metal application uses. This setup was used for the following reasons:

  • EPCQ image needs to start with the actual entrypoint (Preloader requirement)
  • MMU L1 table needs to be aligned to 16KB (ARM requirement)
  • MMU L2 table needs to be aligned to 1KB (ARM requirement)
  • MMU L1 table needs to refer to MMU L2 table address as a constant value (MMU tables in QSPI read only memory constraint – address needs to be known at compile time)

Section           | Start      | Size     | Description
ram               | 0xFFFF0000 | 64K - 4K | On-chip RAM. Minus 4KB for the PLL workaround.
epcq_rom_startup  | 0xC0100040 | 16K - 64 | EPCQ: Startup code, needs to be at the beginning of the image. Minus 64 bytes for the mkimage header.
epcq_rom_mmu_ttb1 | 0xC0104000 | 16K      | EPCQ: L1 Translation Table
epcq_rom_mmu_ttb2 | 0xC0108000 | 1K       | EPCQ: L2 Translation Table
epcq_rom          | 0xC0108400 | 1M - 33K | EPCQ: Rest of it, code and constant data
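The EPCQ rows of the table can be cross-checked with a short Python sketch: each section should start where the previous one ends, the translation tables should meet their alignment requirements, and everything should add up to the 1MB window. The names below are illustrative, and this is not the check_mmu_tables.pl script shipped with the demo.

```python
KB = 1024
MB = 1024 * KB

IMAGE_BASE = 0xC0100000   # start of the application image, header included
HEADER_SIZE = 64          # mkimage header in front of the startup code

# (name, start, size) copied from the EPCQ rows of the table.
SECTIONS = [
    ("epcq_rom_startup", 0xC0100040, 16 * KB - 64),
    ("epcq_rom_mmu_ttb1", 0xC0104000, 16 * KB),
    ("epcq_rom_mmu_ttb2", 0xC0108000, 1 * KB),
    ("epcq_rom", 0xC0108400, 1 * MB - 33 * KB),
]
ALIGNMENT = {"epcq_rom_mmu_ttb1": 16 * KB, "epcq_rom_mmu_ttb2": 1 * KB}


def check_layout(sections=SECTIONS):
    """Return a list of problems; an empty list means the layout is consistent."""
    problems = []
    expected = IMAGE_BASE + HEADER_SIZE
    for name, start, size in sections:
        if start != expected:
            problems.append("%s starts at 0x%08X, expected 0x%08X"
                            % (name, start, expected))
        align = ALIGNMENT.get(name)
        if align and start % align:
            problems.append("%s is not aligned to %d bytes" % (name, align))
        expected = start + size
    if expected != IMAGE_BASE + MB:
        problems.append("sections do not fill the 1MB window exactly")
    return problems
```

With the values from the table, `check_layout()` returns an empty list.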

Cache Settings

The following table presents the cache settings that were used.

Area           | L1 Cacheable | L2 Cacheable
1MB EPCQ Flash | Yes          | Yes
64KB OCRAM     | Yes          | No
Rest of Memory | No           | No

Note that making OCRAM L2 cacheable as well did not improve the speed of the system, because OCRAM has a speed similar to that of the L2 cache. However, making OCRAM L1 cacheable did improve execution speed significantly.

MMU Translation Tables

The MMU tables are used to describe the cache settings for different memory areas. For this demo, the following Tables were used:

  • L1 Translation Table – describes memory like this:
    • 1MB cacheable section for EPCQ window
    • 1MB L2 described page table for last MB of address space
    • 1MB non-cacheable sections for the rest of address space
  • L2 Translation Table for last MB of address space:
    • Large page – 64KB OCRAM as L1 cacheable
    • Large pages – for the rest of the 1MB area
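To make the structure above concrete, here is a Python sketch that builds tables of the same shape using the ARMv7-A short-descriptor format. It is not the included generate_mmu_tables.pl script, and several values are assumptions that the page does not state: full-access permissions (AP = 0b11), domain 0, write-back write-allocate as the cacheable policy, and strongly-ordered attributes for the non-cacheable areas. The L2 table address is taken from the linker section table.

```python
L1_ENTRIES = 4096            # one entry per 1MB of the 4GB address space
L2_ENTRIES = 256             # one entry per 4KB of a 1MB region
EPCQ_BASE = 0xC0100000       # 1MB EPCQ window used by the application
OCRAM_BASE = 0xFFFF0000      # 64KB on-chip RAM
LAST_MB = 0xFFF00000         # region described by the L2 table
L2_TABLE_ADDR = 0xC0108000   # epcq_rom_mmu_ttb2

NON_CACHEABLE = 0b00
WRITE_BACK_ALLOC = 0b01      # assumed cacheable policy
STRONGLY_ORDERED = (0b000, 0, 0)


def cache_attrs(inner, outer):
    """(TEX, C, B) with TEX[2]=1: TEX[1:0] is the outer (L2) policy, C/B the inner (L1)."""
    return 0b100 | outer, (inner >> 1) & 1, inner & 1


def section(base, tex, c, b, ap=0b11):
    """L1 section descriptor: 1MB mapping, domain 0."""
    return (base & 0xFFF00000) | (tex << 12) | (ap << 10) | (c << 3) | (b << 2) | 0b10


def page_table(l2_addr):
    """L1 descriptor pointing to a 1KB-aligned L2 table, domain 0."""
    return (l2_addr & 0xFFFFFC00) | 0b01


def large_page(base, tex, c, b, ap=0b11):
    """L2 large page descriptor: 64KB mapping."""
    return (base & 0xFFFF0000) | (tex << 12) | (ap << 4) | (c << 3) | (b << 2) | 0b01


def build_tables():
    l1 = [section(i << 20, *STRONGLY_ORDERED) for i in range(L1_ENTRIES)]
    l1[EPCQ_BASE >> 20] = section(
        EPCQ_BASE, *cache_attrs(WRITE_BACK_ALLOC, WRITE_BACK_ALLOC))
    l1[LAST_MB >> 20] = page_table(L2_TABLE_ADDR)
    l2 = []
    for i in range(L2_ENTRIES):
        # A large page descriptor is repeated in 16 consecutive 4KB slots.
        page_base = (LAST_MB + i * 4096) & 0xFFFF0000
        if page_base == OCRAM_BASE:
            attrs = cache_attrs(WRITE_BACK_ALLOC, NON_CACHEABLE)
        else:
            attrs = STRONGLY_ORDERED
        l2.append(large_page(page_base, *attrs))
    return l1, l2
```

The table sizes fall out of the entry counts: 4096 four-byte entries give the 16KB L1 table, and 256 four-byte entries give the 1KB L2 table.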

Notes:

  • The MMU tables were generated using the included script – generate_mmu_tables.pl. The tables are human readable and editable, so the script is not really required. It was included for completeness.
  • The absolute placement of the MMU tables is checked by the script check_mmu_tables.pl that is called by the Makefile.

Lock Data in L2 Cache

The application preloads the desired piece of data into the L2 cache and locks it there to improve execution efficiency.

It uses the Lockdown by line feature of the L2 cache controller. When lockdown by line feature is enabled during a period of time, all newly allocated cache lines get marked as locked. The controller then considers them as locked and does not naturally evict them. Lockdown by line feature can be enabled by setting bit [0] of the Lockdown by Line Enable Register.
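The effect of lockdown by line can be illustrated with a toy model in Python. This is a deliberately simplified, fully associative cache with LRU replacement, which is not how the real L2 controller is organized; the class and attribute names are hypothetical. It only shows the property described above: lines allocated while lockdown is enabled are never chosen as eviction victims.

```python
from collections import OrderedDict


class ToyL2Cache:
    """Fully associative toy cache with LRU eviction and per-line lock flags."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()     # line address -> locked flag
        self.lockdown_enabled = False  # stands in for the enable bit

    def access(self, line_addr):
        """Touch one cache line; return True on a hit, False on a miss."""
        if line_addr in self.lines:
            self.lines.move_to_end(line_addr)
            return True
        if len(self.lines) >= self.num_lines:
            # Evict the least recently used line that is not locked.
            victim = next(
                (addr for addr, locked in self.lines.items() if not locked),
                None)
            if victim is None:
                return False           # every line is locked: do not allocate
            del self.lines[victim]
        # Lines allocated while lockdown is enabled are marked as locked.
        self.lines[line_addr] = self.lockdown_enabled
        return False

    def run(self, first_line, count):
        """Access `count` consecutive lines and return the number of hits."""
        return sum(self.access(first_line + i) for i in range(count))
```

In this model, preloading a small function with `lockdown_enabled = True`, clearing the flag, and then running a larger function through the cache leaves the preloaded lines resident, so a later run of the small function hits on every line.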

Two test functions were developed to test the cache lock feature:

  • void test_function_1(void) : contains 64K "nop" instructions (which take up 256 KB of memory)
  • void test_function_2(void) : contains 128K "nop" instructions (which take up 512 KB of memory)

The flow of the sample application is the following:

  1. Measure the duration of test_function_1() before enabling the caches
  2. Enable caches
  3. Preload test_function_1() in L2 cache
  4. Measure the duration of test_function_1() again, to see the effect of cache preloading
  5. Run test_function_2() and measure duration - without the preload and lock, this would evict test_function_1() from the cache
  6. Run test_function_2() again and measure duration - it will be faster because part of it will be loaded in cache
  7. Run test_function_1() again and measure duration - it will be almost the same since it is preloaded
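The tick counts printed on the console in the Running the Demo section can be turned into speedup factors with a short Python sketch. The numbers are copied from that log; the dictionary keys and the helper name are illustrative, and the page does not state what time unit a tick corresponds to.

```python
# Tick counts copied from the console log in "Running the Demo".
TICKS = {
    "f1_uncached": 19424723,       # test_function_1(), caches disabled
    "f1_preload": 19427699,        # preloading test_function_1() into L2
    "f1_locked": 18111,            # test_function_1() from the preloaded cache
    "f2_first": 38785599,          # first run of test_function_2()
    "f2_second": 24785337,         # second run of test_function_2()
    "f1_locked_after_f2": 18105,   # test_function_1() after test_function_2()
}


def speedup(before, after):
    """Ratio of two measurements; values above 1 mean `after` is faster."""
    return TICKS[before] / TICKS[after]
```

With these values, running test_function_1() from the locked cache is roughly a thousand times faster than running it uncached, the second run of test_function_2() is about 1.5 times faster than the first, and test_function_1() takes practically the same time before and after test_function_2() runs.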
Version history
Last update: 06-25-2019 04:27 PM