Difference between revisions of "APF6 SP DMA simple howto"
(One intermediate revision by the same user not shown) | |||
Line 1: | Line 1: | ||
− | [[Category: | + | [[Category: APF6SP]] |
− | + | ||
[[Category: FPGA]] | [[Category: FPGA]] | ||
[[Category: CycloneV]] | [[Category: CycloneV]] | ||
Line 6: | Line 5: | ||
[[Category: Qsys]] | [[Category: Qsys]] | ||
[[Category: DMA]] | [[Category: DMA]] | ||
+ | [[Category: PCIe]] | ||
== Introduction == | == Introduction == | ||
− | In this howto we will learn how to do DMA accesses through PCIe on an | + | In this howto we will learn how to do DMA accesses through PCIe on an APF6_SP board. |
− | [[File:APF6_SP_dma_simple_howto.png|frame|center|Architecture of the DMA howto in FPGA]] | + | [[File:APF6_SP_dma_simple_howto.png|frame|center|Architecture of the DMA howto in FPGA (TODO: adding pio and irq exports)]] |
== Make the quartus project == | == Make the quartus project == | ||
+ | {{Note| Quartus 15.1.1 was used for this tutorial}} | ||
− | == | + | Follow the [http://www.armadeus.com/wiki/index.php?title=APF6_SP_The_full_howto full howto] from chapter [http://www.armadeus.com/wiki/index.php?title=APF6_SP_The_full_howto#Make_the_quartus_project Make the quartus project] to chapter [http://www.armadeus.com/wiki/index.php?title=APF6_SP_The_full_howto#Adding_the_PCIe_and_CvP Adding the PCIe and CvP], configuring only two BARs (BAR0 and BAR1). |
− | == Tests == | + | * Add these ''Avalon-MM CycloneV Hard IP for PCI Express'' settings: |
+ | ** In ''Avalon to PCIe Address Translation Settings'' | ||
+ | *** set ''Number of address pages:'' to '''4''' | ||
+ | *** set ''Size of Address pages:'' to '''4Kbyte - 12 bits''' | ||
+ | |||
+ | === Adding Block of RAM === | ||
+ | |||
+ | * In the IP-Catalog: | ||
+ | <pre class="config"> | ||
+ | Library | ||
+ | Basic Functions | ||
+ | On Chip Memory | ||
+ | On-Chip Memory (RAM or ROM) | ||
+ | </pre> | ||
+ | |||
+ | * ''datawidth'': '''64''' bits | ||
+ | * Change capacity to : '''16384''' (16KBytes) | ||
+ | * Connect signals as following: | ||
+ | ** '''coreclkout''' -> '''clk1''' | ||
+ | ** '''nreset_status''' -> '''rst''' | ||
+ | ** '''Rxm_BAR1''' -> '''s1''' | ||
+ | |||
+ | === Adding Reading DMA block === | ||
+ | |||
+ | The register map of the Altera DMA block can be found [https://www.altera.co.jp/content/dam/altera-www/global/ja_JP/pdfs/literature/hb/nios2/n2cpu_nii5v3.pdf here] (starting on page 239). | ||
+ | |||
+ | The reading DMA block is used to transfer data from the block of RAM in the FPGA (CycloneV) to the DDR3 of the processor (i.MX6). | ||
+ | |||
+ | * In the IP-Catalog: | ||
+ | <pre class="config"> | ||
+ | Library | ||
+ | Basic Functions | ||
+ | DMA | ||
+ | DMA Controller | ||
+ | </pre> | ||
+ | |||
+ | * Set ''Width of the DMA length register (1-32)(bits):'' to '''32''' bits | ||
+ | * Set ''Enable burst transfers'': '''1024''' | ||
+ | * Set ''Depth fifo burst'': '''2048''' | ||
+ | * Connect the signals as follows: | ||
+ | ** '''coreclkout''' -> '''clk''' | ||
+ | ** '''nreset_status''' -> '''reset''' | ||
+ | ** '''Rxm_irq''' <- '''irq''' | ||
+ | ** onchip_memory.'''s1''' <- '''read_master''' | ||
+ | ** pcie.'''TXs''' <- '''write_master''' | ||
+ | |||
+ | ** '''Rxm_BAR0''' -> '''control_port_slave''' | ||
+ | *** Set base address to '''0x0000_4000''' | ||
+ | |||
+ | === Adding Writing DMA block === | ||
+ | |||
+ | The register map of the Altera DMA block can be found [https://www.altera.co.jp/content/dam/altera-www/global/ja_JP/pdfs/literature/hb/nios2/n2cpu_nii5v3.pdf here] (starting on page 239). | ||
+ | |||
+ | The writing DMA block is used to transfer data from the DDR3 of the processor (i.MX6) to the block of RAM in the FPGA (CycloneV). | ||
+ | |||
+ | * In the IP-Catalog: | ||
+ | <pre class="config"> | ||
+ | Library | ||
+ | Basic Functions | ||
+ | DMA | ||
+ | DMA Controller | ||
+ | </pre> | ||
+ | |||
+ | * Set ''Width of the DMA length register (1-32)(bits):'' to '''32''' bits | ||
+ | * Set ''Enable burst transfers'': '''1024''' | ||
+ | * Set ''Depth fifo burst'': '''2048''' | ||
+ | * Connect the signals as follows: | ||
+ | ** '''coreclkout''' -> '''clk''' | ||
+ | ** '''nreset_status''' -> '''reset''' | ||
+ | ** '''Rxm_irq''' <- '''irq''' | ||
+ | ** onchip_memory.'''s1''' <- '''write_master''' | ||
+ | ** pcie.'''TXs''' <- '''read_master''' | ||
+ | |||
+ | ** '''Rxm_BAR0''' -> '''control_port_slave''' | ||
+ | *** Set base address to '''0x0000_4020''' | ||
+ | |||
+ | === Adding irq mapper === | ||
+ | |||
+ | For performance measurements we need to see the IRQs on an oscilloscope, so we have to export them. | ||
+ | |||
+ | * In the IP-Catalog: | ||
+ | <pre class="config"> | ||
+ | Library | ||
+ | Qsys Interconnect | ||
+ | Interrupt | ||
+ | IRQ mapper | ||
+ | </pre> | ||
+ | |||
+ | * Leave all default options then click on finish | ||
+ | * Connect following signals | ||
+ | ** '''coreclkout''' -> '''clk''' | ||
+ | ** '''nreset_status''' -> '''clk_reset''' | ||
+ | ** dma_0.'''irq''' -> '''receiver0''' | ||
+ | ** dma_1.'''irq''' -> '''receiver1''' | ||
+ | * Export '''sender''' with name :'''irqs''' | ||
+ | |||
+ | === Adding PIO === | ||
+ | |||
+ | For performance measurements we need to toggle a signal from the driver. | ||
+ | |||
+ | <pre class="config"> | ||
+ | Library | ||
+ | Processors and peripherals | ||
+ | Peripherals | ||
+ | PIO (Parallel I/O) | ||
+ | </pre> | ||
+ | |||
+ | * Set following options | ||
+ | ** Width : '''1''' | ||
+ | ** Direction : '''Output''' | ||
+ | ** Click on finish | ||
+ | |||
+ | * Connect following signals: | ||
+ | ** '''coreclkout''' -> '''clk''' | ||
+ | ** '''nreset_status''' -> '''reset''' | ||
+ | ** '''Rxm_BAR0''' -> '''s1''' | ||
+ | *** Set base address to '''0x0000_4040''' | ||
+ | |||
+ | * Export '''external_connection''' with name : '''pio''' | ||
+ | |||
+ | === Mapping === | ||
+ | |||
+ | The FPGA design is mapped as follows: | ||
+ | |||
+ | * '''BAR0''' | ||
+ | {| class="wikitable" style="text-align:left; width:40%;" | ||
+ | |- | ||
+ | | bgcolor=#FFFF00 | begin addr (8bits) | ||
+ | | bgcolor=#FFFF00 | end addr (8bits) | ||
+ | | bgcolor=#FFFF00 | name | ||
+ | |- | ||
+ | | 0x0000_0000 | ||
+ | | 0x0000_3FFF | ||
+ | | bgcolor=#BBBBBB | Cra | ||
+ | |- | ||
+ | | 0x0000_4000 | ||
+ | | 0x0000_401F | ||
+ | | bgcolor=#BBBBBB | DMA Read Controller | ||
+ | |- | ||
+ | | 0x0000_4020 | ||
+ | | 0x0000_403F | ||
+ | | bgcolor=#BBBBBB | DMA write Controller | ||
+ | |- | ||
+ | | 0x0000_4040 | ||
+ | | 0x0000_404F | ||
+ | | bgcolor=#BBBBBB | PIO Controller | ||
+ | |} | ||
+ | |||
+ | * '''BAR1''': | ||
+ | {| class="wikitable" style="text-align:left; width:40%;" | ||
+ | |- | ||
+ | | scope=col bgcolor=#FFFF00 | begin addr (8bits) | ||
+ | | scope=col bgcolor=#FFFF00 | end addr (8bits) | ||
+ | | scope=col bgcolor=#FFFF00 | name | ||
+ | |- | ||
+ | | 0x0000_0000 | ||
+ | | 0x0000_3FFF | ||
+ | | bgcolor=#BBBBBB | Block of RAM 16Kbytes | ||
+ | |} | ||
+ | |||
+ | == Generate bitstream with Quartus == | ||
+ | |||
+ | * In the Qsys window, click on Finish and generate the component. | ||
+ | |||
+ | === Top === | ||
+ | |||
+ | Create a file named '''dmasimple.v''' with your favorite editor and add this code: | ||
+ | <source lang="verilog"> | ||
+ | |||
+ | `timescale 1 ps / 1 ps | ||
+ | |||
+ | module dmasimple ( | ||
+ | input wire hip_serial_rx_in0, | ||
+ | output wire hip_serial_tx_out0, | ||
+ | input wire npor_npor, | ||
+ | input wire npor_pin_perst, | ||
+ | input wire refclk_clk, | ||
+ | output wire pio); | ||
+ | |||
+ | wire [1:0] irqs_irq; | ||
+ | wire pio_export; | ||
+ | |||
+ | assign pio = pio_export | irqs_irq[0] | irqs_irq [1]; | ||
+ | |||
+ | qsys_dmasimple u0 ( | ||
+ | .hip_serial_rx_in0 (hip_serial_rx_in0 ), // hip_serial.rx_in0 | ||
+ | .hip_serial_tx_out0 (hip_serial_tx_out0), // .tx_out0 | ||
+ | .irqs_irq (irqs_irq ), // irqs.irq | ||
+ | .npor_npor (npor_npor ), // npor.npor | ||
+ | .npor_pin_perst (npor_pin_perst ), // .pin_perst | ||
+ | .refclk_clk (refclk_clk ), // refclk.clk | ||
+ | .pio_export (pio_export ) // pio.export | ||
+ | ); | ||
+ | |||
+ | endmodule | ||
+ | </source> | ||
+ | |||
+ | * Add this file to the Quartus project, then set it as '''top-level'''. | ||
+ | |||
+ | === Pinout === | ||
+ | |||
+ | * Outside Quartus, create a file named ''pcie_pinout.tcl'' and edit it with your favorite text editor. | ||
+ | * Copy all lines from [http://www.armadeus.com/wiki/index.php?title=APF6_SP_DDR3_PINOUT#PCIe| here]. | ||
+ | * Add this line: | ||
+ | <source lang="tcl"> | ||
+ | set_location_assignment PIN_AB22 -to pio | ||
+ | </source> | ||
+ | * Save your file, then add it to the project inside Quartus | ||
+ | * Run the script with following menu, select it then '''run''': | ||
+ | <pre class="config"> | ||
+ | Tools | ||
+ | TCL Scripts ... | ||
+ | </pre> | ||
+ | |||
+ | === Synthesis/place&route/bitstream === | ||
+ | |||
+ | * Right click on ''Assembly'' then run. | ||
+ | * Generate bitstream with menu | ||
+ | <pre class="config"> | ||
+ | File | ||
+ | Convert Programming Files | ||
+ | </pre> | ||
+ | * Set the following | ||
+ | ** Programming file type : '''Raw Binary File (.rbf)''' | ||
+ | ** File name: '''DMAsimple.rbf''' | ||
+ | * under ''Input files to convert'' click on '''SOF Data''' then '''Add File ...''' | ||
+ | ** Set '''DMAsimple.sof''' in directory '''output_files''' then '''open''' | ||
+ | ** Checkbox ''Create CvP files (Generate DMAsimple.periph.rbf and DMAsimple.core.rbf)'' | ||
+ | * Click on '''Generate''' | ||
+ | |||
+ | Once done, two files named ''DMAsimple.periph.rbf'' and ''DMAsimple.core.rbf'' will be available in the root project directory. | ||
+ | |||
+ | == Linux Driver == | ||
+ | |||
+ | Three source files are provided with this tutorial: | ||
+ | |||
+ | * '''Makefile''': to compile the driver | ||
+ | * '''dma_ram.h''': some useful constants | ||
+ | * '''dmasimple.c''': the driver code | ||
+ | |||
+ | === Modify Makefile === | ||
+ | |||
+ | * In the Makefile change the path to your local path of armadeus BSP: | ||
+ | <source lang="make"> | ||
+ | BSP_APF6_BROUTPUT := /usr/local/projects/armadeus/apf6/buildroot/output/ | ||
+ | </source> | ||
+ | |||
+ | * Then try to compile it with make : | ||
+ | <pre class="host"> | ||
+ | $ make | ||
+ | make -C /usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8/ SUBDIRS=/usr/local/projects/apf6sp/software/dmasimple ARCH=arm CROSS_COMPILE=arm-linux- modules CC=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-gcc LD=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-ld OBJDUMP=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-objdump | ||
+ | make[1]: Entering directory '/usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8' | ||
+ | CC [M] /usr/local/projects/apf6sp/software/dmasimple/dmasimple.o | ||
+ | Building modules, stage 2. | ||
+ | MODPOST 1 modules | ||
+ | CC /usr/local/projects/apf6sp/software/dmasimple/dmasimple.mod.o | ||
+ | LD [M] /usr/local/projects/apf6sp/software/dmasimple/dmasimple.ko | ||
+ | make[1]: Leaving directory '/usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8' | ||
+ | </pre> | ||
+ | |||
+ | * Copy dmasimple.ko to the target. | ||
+ | |||
+ | === Understand the driver === | ||
+ | |||
+ | The entry point of the driver is the probe function '''dmasimple_probe()'''. | ||
+ | The Linux driver allocates a buffer, '''dmabuff''', in the i.MX6 memory. | ||
+ | <source lang="C"> | ||
+ | /* Allocate and initialize shared control data */ | ||
+ | dmas->dmabuff = dmam_alloc_coherent(&pdev->dev, BUFF_SIZE, &dmas->dma_handle, GFP_KERNEL); | ||
+ | if (!dmas->dmabuff){ | ||
+ | printk("Error, can't alloc coherent\n"); | ||
+ | goto err_return; | ||
+ | } | ||
+ | </source> | ||
+ | |||
+ | It then registers the buffer's bus address, '''dma_handle''', in the address translation register of the CycloneV PCI Express hard IP. | ||
+ | |||
+ | <source lang="C"> | ||
+ | writel((unsigned long)(dmas->dma_handle), &dmas->bar0[CRA_REG_A2P_ADDR_MAP_LO0/4]); | ||
+ | </source> | ||
+ | |||
+ | ==== Dma read ==== | ||
+ | When an echo is done on the sysfs file named '''dmaread''': | ||
+ | <source lang="bash"> | ||
+ | echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmaread | ||
+ | </source> | ||
+ | |||
+ | the DMA transfer is started and data from the block of RAM in the FPGA is transferred to dmabuff. | ||
+ | <source lang="C"> | ||
+ | static ssize_t store_dmaread(struct device *dev, struct device_attribute *attr, | ||
+ | const char *buf, size_t count) | ||
+ | { | ||
+ | |||
+ | [...] | ||
+ | |||
+ | /* Launch DMA transfert */ | ||
+ | writel(ALT_DMA_CTRL_FLG_DBL_WD | ALT_DMA_CTRL_FLG_LEEN | | ||
+ | ALT_DMA_CTRL_FLG_GO | ALT_DMA_CTRL_FLG_I_EN, | ||
+ | &dmas->bar0[ALT_DMA_REG_CONTROL/4]); | ||
+ | |||
+ | return count; | ||
+ | } | ||
+ | </source> | ||
+ | |||
+ | Reading the file '''dmaread''' with ''cat'': | ||
+ | <source lang="bash"> | ||
+ | cat /sys/bus/pci/devices/0000\:01\:00.0/dmaread | ||
+ | </source> | ||
+ | triggers verification of the copied data. | ||
+ | <source lang="C"> | ||
+ | /* Checking DMA transfert */ | ||
+ | static ssize_t show_dmaread(struct device *dev, struct device_attribute *attr, char *buf) | ||
+ | { [...] } | ||
+ | </source> | ||
+ | |||
+ | ==== Dma write ==== | ||
+ | When an echo is done on the sysfs file named '''dmawrite''': | ||
+ | <source lang="bash"> | ||
+ | echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmawrite | ||
+ | </source> | ||
+ | |||
+ | the DMA transfer is started and data from the DDR3 of the processor is transferred to the block of RAM in the FPGA. | ||
+ | <source lang="C"> | ||
+ | static ssize_t store_dmawrite(struct device *dev, struct device_attribute *attr, | ||
+ | const char *buf, size_t count) | ||
+ | { | ||
+ | |||
+ | [...] | ||
+ | |||
+ | /* Launch DMA transfert */ | ||
+ | writel(ALT_DMA_CTRL_FLG_DBL_WD | ALT_DMA_CTRL_FLG_LEEN | | ||
+ | ALT_DMA_CTRL_FLG_GO | ALT_DMA_CTRL_FLG_I_EN, | ||
+ | &dmas->bar0[ALT_DMA_WRITE_REG_CONTROL/4]); | ||
+ | |||
+ | return count; | ||
+ | } | ||
+ | </source> | ||
+ | |||
+ | Reading the file '''dmawrite''' with ''cat'': | ||
+ | <source lang="bash"> | ||
+ | cat /sys/bus/pci/devices/0000\:01\:00.0/dmawrite | ||
+ | </source> | ||
+ | triggers verification of the copied data. | ||
+ | <source lang="C"> | ||
+ | /* Checking DMA transfert */ | ||
+ | static ssize_t show_dmawrite(struct device *dev, struct device_attribute *attr, char *buf) | ||
+ | { [...] } | ||
+ | </source> | ||
+ | |||
+ | == Boot system and Tests == | ||
+ | |||
+ | === Booting system === | ||
+ | Copy the bitstream files to the host tftpboot directory: | ||
+ | <pre class="host"> | ||
+ | $ cp -v *.rbf /tftpboot/ | ||
+ | ‘DMAsimple.core.rbf’ -> ‘/tftpboot/DMAsimple.core.rbf’ | ||
+ | ‘DMAsimple.periph.rbf’ -> ‘/tftpboot/DMAsimple.periph.rbf’ | ||
+ | </pre> | ||
+ | |||
+ | Flash the apf6_sp with a [http://www.armadeus.com/wiki/index.php?title=Target_Software_Installation fresh git view] of the Armadeus project. | ||
+ | Then: | ||
+ | |||
+ | * Boot under U-Boot and download periph bitstream: | ||
+ | <pre class="apf"> | ||
+ | BIOS> tftpboot ${loadaddr} DMAsimple.periph.rbf | ||
+ | BIOS> fpga load 0 ${loadaddr} ${filesize} | ||
+ | </pre> | ||
+ | |||
+ | * Then boot Linux | ||
+ | <pre class="apf"> | ||
+ | BIOS> boot | ||
+ | apf6 login: root | ||
+ | </pre> | ||
+ | |||
+ | * Download the core.rbf then load it (with 192.168.0.214 the address of the tftp host server): | ||
+ | <pre class="apf"> | ||
+ | $ tftp -g -r DMAsimple.core.rbf 192.168.0.214 | ||
+ | $ load_fpga DMAsimple.core.rbf | ||
+ | </pre> | ||
+ | |||
+ | * Download then load the DMAsimple driver: | ||
+ | <pre class="apf"> | ||
+ | $ tftp -g -r dmasimple.ko 192.168.0.214 | ||
+ | $ insmod dmasimple.ko | ||
+ | </pre> | ||
+ | |||
+ | === Test === | ||
+ | |||
+ | ==== i.MX <- FPGA ==== | ||
+ | |||
+ | * To launch a transfer, echo a value to the sysfs file: | ||
+ | <pre class="apf"> | ||
+ | echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmaread | ||
+ | </pre> | ||
+ | |||
+ | * To check that the values were transferred correctly, run: | ||
+ | <pre class="apf"> | ||
+ | cat /sys/bus/pci/devices/0000\:01\:00.0/dmaread | ||
+ | </pre> | ||
+ | |||
+ | ==== i.MX -> FPGA ==== | ||
+ | |||
+ | * To launch a transfer, echo a value to the sysfs file: | ||
+ | <pre class="apf"> | ||
+ | echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmawrite | ||
+ | </pre> | ||
+ | |||
+ | * To check that the values were transferred correctly, run: | ||
+ | <pre class="apf"> | ||
+ | cat /sys/bus/pci/devices/0000\:01\:00.0/dmawrite | ||
+ | </pre> | ||
+ | |||
+ | == Links == | ||
+ | '''Armadeus documentation''' | ||
+ | * [http://www.armadeus.com/wiki/index.php?title=APF6_SP_Interfaces_description All APF6_SP FPGA documentation] | ||
+ | * [http://www.armadeus.com/wiki/index.php?title=APF6_SP_The_full_howto The full howto explaining how to get started with the APF6_SP FPGA] | ||
+ | |||
+ | |||
+ | '''Altera documentation''' | ||
+ | * [http://www.altera.com/literature/ug/ug_c5_pcie_avmm.pdf Datasheet of the PCIe hard IP controller, with CRA (Configuration Registers access) in page 13 ] | ||
+ | * [https://www.altera.co.jp/content/dam/altera-www/global/ja_JP/pdfs/literature/hb/nios2/n2cpu_nii5v3.pdf Datasheet of Qsys components, with DMA controller page 233] |
Latest revision as of 12:27, 5 July 2021
Introduction
In this howto we will learn how to do DMA accesses through PCIe on an APF6_SP board.
Make the quartus project
Follow the full howto from chapter Make the quartus project to chapter Adding the PCIe and CvP, configuring only two BARs (BAR0 and BAR1).
- Add these Avalon-MM CycloneV Hard IP for PCI Express settings:
- In Avalon to PCIe Address Translation Settings
- set Number of address pages: to 4
- set Size of Address pages: to 4Kbyte - 12 bits
Adding Block of RAM
- In the IP-Catalog:
Library > Basic Functions > On Chip Memory > On-Chip Memory (RAM or ROM)
- Data width: 64 bits
- Change the capacity to 16384 bytes (16 KB)
- Connect the signals as follows:
- coreclkout -> clk1
- nreset_status -> rst
- Rxm_BAR1 -> s1
Adding Reading DMA block
The register map of the Altera DMA block can be found here (starting on page 239).
The reading DMA block is used to transfer data from the block of RAM in the FPGA (CycloneV) to the DDR3 of the processor (i.MX6).
- In the IP-Catalog:
Library > Basic Functions > DMA > DMA Controller
- Set Width of the DMA length register (1-32)(bits) to 32 bits
- Set Enable burst transfers: 1024
- Set Depth fifo burst: 2048
- Connect the signals as follows:
- coreclkout -> clk
- nreset_status -> reset
- Rxm_irq <- irq
- onchip_memory.s1 <- read_master
- pcie.TXs <- write_master
- Rxm_BAR0 -> control_port_slave
- Set base address to 0x0000_4000
Adding Writing DMA block
The register map of the Altera DMA block can be found here (starting on page 239).
The writing DMA block is used to transfer data from the DDR3 of the processor (i.MX6) to the block of RAM in the FPGA (CycloneV).
- In the IP-Catalog:
Library > Basic Functions > DMA > DMA Controller
- Set Width of the DMA length register (1-32)(bits) to 32 bits
- Set Enable burst transfers: 1024
- Set Depth fifo burst: 2048
- Connect the signals as follows:
- coreclkout -> clk
- nreset_status -> reset
- Rxm_irq <- irq
- onchip_memory.s1 <- write_master
- pcie.TXs <- read_master
- Rxm_BAR0 -> control_port_slave
- Set base address to 0x0000_4020
Adding irq mapper
For performance measurements we need to see the IRQs on an oscilloscope, so we have to export them.
- In the IP-Catalog:
Library > Qsys Interconnect > Interrupt > IRQ mapper
- Keep all default options, then click on Finish
- Connect the following signals:
- coreclkout -> clk
- nreset_status -> clk_reset
- dma_0.irq -> receiver0
- dma_1.irq -> receiver1
- Export sender with the name irqs
Adding PIO
For performance measurements we need to toggle a signal from the driver.
Library > Processors and peripherals > Peripherals > PIO (Parallel I/O)
- Set the following options:
- Width : 1
- Direction : Output
- Click on finish
- Connect following signals:
- coreclkout -> clk
- nreset_status -> reset
- Rxm_BAR0 -> s1
- Set base address to 0x0000_4040
- Export external_connection with name : pio
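The driver can raise and lower this exported output around a transfer so that its duration is visible on a scope. Below is a minimal sketch of such a toggle. It assumes that the PIO core's data register sits at offset 0 from the base address 0x0000_4040 set above, and that BAR0 is accessed as an array of 32-bit words; `PIO_BASE`, `PIO_REG_DATA` and `pio_set()` are hypothetical names, not taken from the tutorial's driver.

```c
#include <assert.h>
#include <stdint.h>

#define PIO_BASE      0x4040u  /* base address given to the PIO in Qsys */
#define PIO_REG_DATA  0x00u    /* assumption: data register at offset 0 */

/* Drive the 1-bit PIO output high (level != 0) or low (level == 0).
 * bar0 stands for the mapped BAR0 region, seen as 32-bit words. */
void pio_set(volatile uint32_t *bar0, int level)
{
    assert(bar0 != 0);
    bar0[(PIO_BASE + PIO_REG_DATA) / 4u] = level ? 1u : 0u;
}
```

In a kernel driver the store would go through `writel()` on the ioremapped BAR0 rather than a plain array access; the plain store only keeps the sketch testable in user space.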
Mapping
The FPGA design is mapped as follows:
- BAR0
begin addr (8bits) | end addr (8bits) | name |
0x0000_0000 | 0x0000_3FFF | Cra |
0x0000_4000 | 0x0000_401F | DMA Read Controller |
0x0000_4020 | 0x0000_403F | DMA write Controller |
0x0000_4040 | 0x0000_404F | PIO Controller |
- BAR1:
begin addr (8bits) | end addr (8bits) | name |
0x0000_0000 | 0x0000_3FFF | Block of RAM 16Kbytes |
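The BAR0 table above can be written down as data, which makes it easy to check which block a given offset belongs to. The following is a small sketch, not part of the tutorial's sources: the structure, the function names and the assumption that BAR0 is accessed through a pointer to 32-bit words (hence the division by 4 seen in the driver) are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One BAR0 region, copied from the mapping table. */
struct bar0_region {
    uint32_t begin;     /* first byte offset in BAR0 */
    uint32_t end;       /* last byte offset in BAR0 (inclusive) */
    const char *name;
};

static const struct bar0_region bar0_map[] = {
    { 0x00000000u, 0x00003FFFu, "Cra" },
    { 0x00004000u, 0x0000401Fu, "DMA Read Controller" },
    { 0x00004020u, 0x0000403Fu, "DMA write Controller" },
    { 0x00004040u, 0x0000404Fu, "PIO Controller" },
};

/* Return the name of the region containing a byte offset, or NULL. */
const char *bar0_region_name(uint32_t offset)
{
    size_t i;

    for (i = 0; i < sizeof(bar0_map) / sizeof(bar0_map[0]); i++) {
        if (offset >= bar0_map[i].begin && offset <= bar0_map[i].end)
            return bar0_map[i].name;
    }
    return NULL;
}

/* Convert a byte offset into an index usable with a 32-bit word pointer. */
size_t bar0_word_index(uint32_t offset)
{
    assert((offset & 3u) == 0);  /* registers are 32-bit aligned */
    return offset / 4u;
}
```

For example, `bar0_region_name(0x4018)` falls inside the DMA Read Controller, and `bar0_word_index(0x4020)` gives the word index of the first register of the DMA write controller.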
Generate bitstream with Quartus
- In the Qsys window, click on Finish and generate the component.
Top
Create a file named dmasimple.v with your favorite editor and add this code:
`timescale 1 ps / 1 ps
module dmasimple (
input wire hip_serial_rx_in0,
output wire hip_serial_tx_out0,
input wire npor_npor,
input wire npor_pin_perst,
input wire refclk_clk,
output wire pio);
wire [1:0] irqs_irq;
wire pio_export;
assign pio = pio_export | irqs_irq[0] | irqs_irq [1];
qsys_dmasimple u0 (
.hip_serial_rx_in0 (hip_serial_rx_in0 ), // hip_serial.rx_in0
.hip_serial_tx_out0 (hip_serial_tx_out0), // .tx_out0
.irqs_irq (irqs_irq ), // irqs.irq
.npor_npor (npor_npor ), // npor.npor
.npor_pin_perst (npor_pin_perst ), // .pin_perst
.refclk_clk (refclk_clk ), // refclk.clk
.pio_export (pio_export ) // pio.export
);
endmodule
- Add this file to the Quartus project, then set it as top-level.
Pinout
- Outside Quartus, create a file named pcie_pinout.tcl and edit it with your favorite text editor.
- Copy all lines from here.
- Add this line:
set_location_assignment PIN_AB22 -to pio
- Save your file, then add it to the project inside Quartus
- Run the script from the following menu: select it, then click run:
Tools > TCL Scripts ...
Synthesis/place&route/bitstream
- Right click on Assembly then run.
- Generate the bitstream with the menu:
File > Convert Programming Files
- Set the following:
- Programming file type : Raw Binary File (.rbf)
- File name: DMAsimple.rbf
- Under Input files to convert, click on SOF Data then Add File ...
- Select DMAsimple.sof in the output_files directory, then click Open
- Check Create CvP files (Generate DMAsimple.periph.rbf and DMAsimple.core.rbf)
- Click on Generate
Once done, two files named DMAsimple.periph.rbf and DMAsimple.core.rbf will be available in the root project directory.
Linux Driver
Three source files are provided with this tutorial:
- Makefile: to compile the driver
- dma_ram.h: some useful constants
- dmasimple.c: the driver code
Modify Makefile
- In the Makefile, set the path to your local Armadeus BSP build output:
BSP_APF6_BROUTPUT := /usr/local/projects/armadeus/apf6/buildroot/output/
- Then try to compile it with make :
$ make
make -C /usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8/ SUBDIRS=/usr/local/projects/apf6sp/software/dmasimple ARCH=arm CROSS_COMPILE=arm-linux- modules CC=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-gcc LD=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-ld OBJDUMP=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-objdump
make[1]: Entering directory '/usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8'
CC [M] /usr/local/projects/apf6sp/software/dmasimple/dmasimple.o
Building modules, stage 2.
MODPOST 1 modules
CC /usr/local/projects/apf6sp/software/dmasimple/dmasimple.mod.o
LD [M] /usr/local/projects/apf6sp/software/dmasimple/dmasimple.ko
make[1]: Leaving directory '/usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8'
- Copy dmasimple.ko to the target.
Understand the driver
The entry point of the driver is the probe function dmasimple_probe(). The Linux driver allocates a buffer, dmabuff, in the i.MX6 memory.
/* Allocate and initialize shared control data */
dmas->dmabuff = dmam_alloc_coherent(&pdev->dev, BUFF_SIZE, &dmas->dma_handle, GFP_KERNEL);
if (!dmas->dmabuff){
printk("Error, can't alloc coherent\n");
goto err_return;
}
It then registers the buffer's bus address, dma_handle, in the address translation register of the CycloneV PCI Express hard IP.
writel((unsigned long)(dmas->dma_handle), &dmas->bar0[CRA_REG_A2P_ADDR_MAP_LO0/4]);
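To see what this register write achieves, here is a rough model of the Avalon-to-PCIe translation configured earlier (4 pages of 4 KB, 12 bits). It is a sketch under stated assumptions: a 32-bit PCIe address, the page offset bits of a table entry ignored for addressing, and upper Avalon address bits selecting the table entry. The names `a2p_page_index()` and `a2p_translate()` are hypothetical, and the bus address used in the usage note is made up.

```c
#include <assert.h>
#include <stdint.h>

#define A2P_PAGE_BITS   12u   /* "Size of Address pages: 4Kbyte - 12 bits" */
#define A2P_PAGE_COUNT  4u    /* "Number of address pages: 4" */
#define A2P_PAGE_MASK   ((1u << A2P_PAGE_BITS) - 1u)

/* Translation-table entry selected by an Avalon address seen on TXs. */
unsigned a2p_page_index(uint32_t avalon_addr)
{
    return (avalon_addr >> A2P_PAGE_BITS) & (A2P_PAGE_COUNT - 1u);
}

/* PCIe address produced for an Avalon address: the page base comes from
 * the table entry, the low 12 bits pass through unchanged. */
uint32_t a2p_translate(const uint32_t *table, uint32_t avalon_addr)
{
    uint32_t page_base;

    assert(table != 0);
    page_base = table[a2p_page_index(avalon_addr)] & ~A2P_PAGE_MASK;
    return page_base | (avalon_addr & A2P_PAGE_MASK);
}
```

With entry 0 set to a hypothetical bus address 0x3C001000, an access by a DMA master to Avalon address 0x0ABC lands at PCIe address 0x3C001ABC. Under this model each entry covers a single 4 KB page, so a buffer spanning several pages would need one entry per page.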
Dma read
When a value is echoed to the sysfs file named dmaread:
echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmaread
the DMA transfer is started and data from the block of RAM in the FPGA is transferred to dmabuff.
static ssize_t store_dmaread(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
[...]
/* Launch DMA transfert */
writel(ALT_DMA_CTRL_FLG_DBL_WD | ALT_DMA_CTRL_FLG_LEEN |
ALT_DMA_CTRL_FLG_GO | ALT_DMA_CTRL_FLG_I_EN,
&dmas->bar0[ALT_DMA_REG_CONTROL/4]);
return count;
}
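The part elided by [...] is not shown in this howto, so the following is only a sketch of what programming one transfer could look like. The register offsets and control bit positions are assumptions taken from the register map of the Altera DMA controller core referenced earlier; the real values used by the driver live in dma_ram.h and should be checked there. All `DMA_*` names and `dma_program()` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed control-register bits of the Altera DMA controller core. */
#define DMA_CTRL_GO          (1u << 3)   /* start the transfer */
#define DMA_CTRL_I_EN        (1u << 4)   /* raise an IRQ on completion */
#define DMA_CTRL_LEEN        (1u << 7)   /* stop when length reaches 0 */
#define DMA_CTRL_DOUBLEWORD  (1u << 10)  /* 64-bit wide transfers */

/* Assumed register byte offsets inside one DMA controller block. */
#define DMA_REG_STATUS     0x00u
#define DMA_REG_READADDR   0x04u
#define DMA_REG_WRITEADDR  0x08u
#define DMA_REG_LENGTH     0x0Cu
#define DMA_REG_CONTROL    0x18u

/* Control word equivalent to the flag combination used in the driver. */
uint32_t dma_start_word(void)
{
    return DMA_CTRL_DOUBLEWORD | DMA_CTRL_LEEN | DMA_CTRL_GO | DMA_CTRL_I_EN;
}

/* Program one transfer; regs points at the controller's first register. */
void dma_program(volatile uint32_t *regs, uint32_t src, uint32_t dst,
                 uint32_t len_bytes)
{
    assert(regs != 0);
    regs[DMA_REG_STATUS / 4u]    = 0u;         /* clear a previous DONE */
    regs[DMA_REG_READADDR / 4u]  = src;        /* source address */
    regs[DMA_REG_WRITEADDR / 4u] = dst;        /* destination address */
    regs[DMA_REG_LENGTH / 4u]    = len_bytes;  /* transfer size in bytes */
    regs[DMA_REG_CONTROL / 4u]   = dma_start_word();
}
```

The control register is written last because setting GO starts the transfer with whatever addresses and length are programmed at that moment.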
Reading the file dmaread with cat:
cat /sys/bus/pci/devices/0000\:01\:00.0/dmaread
triggers verification of the copied data.
/* Checking DMA transfert */
static ssize_t show_dmaread(struct device *dev, struct device_attribute *attr, char *buf)
{ [...] }
Dma write
When a value is echoed to the sysfs file named dmawrite:
echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmawrite
the DMA transfer is started and data from the DDR3 of the processor is transferred to the block of RAM in the FPGA.
static ssize_t store_dmawrite(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
[...]
/* Launch DMA transfert */
writel(ALT_DMA_CTRL_FLG_DBL_WD | ALT_DMA_CTRL_FLG_LEEN |
ALT_DMA_CTRL_FLG_GO | ALT_DMA_CTRL_FLG_I_EN,
&dmas->bar0[ALT_DMA_WRITE_REG_CONTROL/4]);
return count;
}
Reading the file dmawrite with cat:
cat /sys/bus/pci/devices/0000\:01\:00.0/dmawrite
triggers verification of the copied data.
/* Checking DMA transfert */
static ssize_t show_dmawrite(struct device *dev, struct device_attribute *attr, char *buf)
{ [...] }
Boot system and Tests
Booting system
Copy the bitstream files to the host tftpboot directory:
$ cp -v *.rbf /tftpboot/
‘DMAsimple.core.rbf’ -> ‘/tftpboot/DMAsimple.core.rbf’
‘DMAsimple.periph.rbf’ -> ‘/tftpboot/DMAsimple.periph.rbf’
Flash the apf6_sp with a fresh git view of the Armadeus project. Then:
- Boot into U-Boot and download the periph bitstream:
BIOS> tftpboot ${loadaddr} DMAsimple.periph.rbf
BIOS> fpga load 0 ${loadaddr} ${filesize}
- Then boot Linux:
BIOS> boot
apf6 login: root
- Download the core.rbf, then load it (192.168.0.214 is the address of the TFTP host server):
$ tftp -g -r DMAsimple.core.rbf 192.168.0.214
$ load_fpga DMAsimple.core.rbf
- Download, then load the DMAsimple driver:
$ tftp -g -r dmasimple.ko 192.168.0.214
$ insmod dmasimple.ko
Test
i.MX <- FPGA
- To launch a transfer, echo a value to the sysfs file:
echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmaread
- To check that the values were transferred correctly, run:
cat /sys/bus/pci/devices/0000\:01\:00.0/dmaread
i.MX -> FPGA
- To launch a transfer, echo a value to the sysfs file:
echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmawrite
- To check that the values were transferred correctly, run:
cat /sys/bus/pci/devices/0000\:01\:00.0/dmawrite
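The same echo and cat steps can be driven from a small user-space program, which is handy when the test has to be repeated many times. This is a minimal sketch using only standard C file I/O; the helper names `trigger_transfer()` and `read_result()` are hypothetical, and the text returned by the driver when the attribute is read is not specified here, so the caller simply gets it back as a string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write "1" to a sysfs attribute such as .../dmaread or .../dmawrite.
 * Returns 0 on success, -1 on failure. */
int trigger_transfer(const char *attr_path)
{
    FILE *f = fopen(attr_path, "w");

    if (f == NULL)
        return -1;
    if (fputs("1\n", f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Read the attribute back into buf as a NUL-terminated string.
 * Returns the number of bytes read, or -1 if the file cannot be opened. */
long read_result(const char *attr_path, char *buf, size_t size)
{
    FILE *f;
    size_t n;

    assert(buf != NULL && size > 0);
    f = fopen(attr_path, "r");
    if (f == NULL)
        return -1;
    n = fread(buf, 1, size - 1, f);
    buf[n] = '\0';
    fclose(f);
    return (long)n;
}
```

On the target, `attr_path` would be one of the two sysfs paths shown above, for example `/sys/bus/pci/devices/0000:01:00.0/dmaread`.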
Links
Armadeus documentation
Altera documentation