APF6 SP DMA simple howto
Introduction
In this howto we will learn how to do DMA accesses through PCIe on an APF6_SP board.
Make the quartus project
Follow the full howto from chapter Make the quartus project to chapter Adding the PCIe and CvP, with only two BARs configured (BAR0 and BAR1).
- Add these Avalon-MM CycloneV Hard IP for PCI Express settings:
- In Avalon to PCIe Address Translation Settings
- set Number of address pages: to 4
- set Size of Address pages: to 4Kbyte - 12 bits
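With four pages of 4 KB each, the translation window is 16 KB wide. As a rough illustration, and assuming the usual paged scheme where the upper bits of the Avalon address select a translation table entry and the low 12 bits are passed through as the offset, the sketch below splits an address inside that window. The macro and function names are invented for this example and do not come from the driver.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>

/* Values taken from the Qsys settings above. */
#define A2P_PAGE_COUNT 4u
#define A2P_PAGE_BITS  12u                     /* 4 KB pages */
#define A2P_PAGE_SIZE  (1u << A2P_PAGE_BITS)

/* The whole window is 4 x 4 KB = 16 KB. */
static_assert(A2P_PAGE_COUNT * A2P_PAGE_SIZE == 16384u,
              "translation window must be 16 KB");

/* Hypothetical helper: which translation table entry an address selects. */
static inline uint32_t a2p_page_index(uint32_t avalon_addr)
{
    return (avalon_addr >> A2P_PAGE_BITS) % A2P_PAGE_COUNT;
}

/* Hypothetical helper: offset inside the selected page. */
static inline uint32_t a2p_page_offset(uint32_t avalon_addr)
{
    return avalon_addr & (A2P_PAGE_SIZE - 1u);
}
```

For example, address 0x2ABC falls in page 2 at offset 0xABC.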
Adding Block of RAM
- In the IP-Catalog:
Library > Basic Functions > On Chip Memory > On-Chip Memory (RAM or ROM)
- Set the data width to 64 bits
- Change the capacity to 16384 bytes (16 KB)
- Connect the signals as follows:
- coreclkout -> clk1
- nreset_status -> rst
- Rxm_BAR1 -> s1
Adding Reading DMA block
The register map of the Altera DMA block can be found here (starting on page 239).
The read DMA block transfers data from the block of RAM in the FPGA (Cyclone V) to the DDR3 of the processor (i.MX6).
- In the IP-Catalog:
Library > Basic Functions > DMA > DMA Controller
- Set Width of the DMA length register (1-32)(bits) to 32 bits
- Set Enable burst transfers: 1024
- Set Depth fifo burst: 2048
- Connect the signals as follows:
- coreclkout -> clk
- nreset_status -> reset
- Rxm_irq <- irq
- onchip_memory.s1 <- read_master
- pcie.TXs <- write_master
- Rxm_BAR0 -> control_port_slave
- Set base address to 0x0000_4000
Adding Writing DMA block
The register map of the Altera DMA block can be found here (starting on page 239).
The write DMA block transfers data from the DDR3 of the processor (i.MX6) to the block of RAM in the FPGA (Cyclone V).
- In the IP-Catalog:
Library > Basic Functions > DMA > DMA Controller
- Set Width of the DMA length register (1-32)(bits) to 32 bits
- Set Enable burst transfers: 1024
- Set Depth fifo burst: 2048
- Connect the signals as follows:
- coreclkout -> clk
- nreset_status -> reset
- Rxm_irq <- irq
- onchip_memory.s1 <- write_master
- pcie.TXs <- read_master
- Rxm_BAR0 -> control_port_slave
- Set base address to 0x0000_4020
Adding irq mapper
For performance measurements, the IRQs must be visible on an oscilloscope, so they have to be exported.
- In the IP-Catalog:
Library > Qsys Interconnect > Interrupt > IRQ mapper
- Leave all options at their defaults, then click Finish
- Connect the following signals:
- coreclkout -> clk
- nreset_status -> clk_reset
- dma_0.irq -> receiver0
- dma_1.irq -> receiver1
- Export sender with the name irqs
Adding PIO
For performance measurements, the driver needs to toggle a signal.
Library > Processors and peripherals > Peripherals > PIO (Parallel I/O)
- Set the following options:
- Width : 1
- Direction : Output
- Click Finish
- Connect the following signals:
- coreclkout -> clk
- nreset_status -> reset
- Rxm_BAR0 -> s1
- Set base address to 0x0000_4040
- Export external_connection with the name pio
Mapping
The FPGA design is mapped as follows:
- BAR0:
begin addr (8bits) | end addr (8bits) | name |
0x0000_0000 | 0x0000_3FFF | Cra |
0x0000_4000 | 0x0000_401F | DMA Read Controller |
0x0000_4020 | 0x0000_403F | DMA Write Controller |
0x0000_4040 | 0x0000_404F | PIO Controller |
- BAR1:
begin addr (8bits) | end addr (8bits) | name |
0x0000_0000 | 0x0000_3FFF | Block of RAM 16Kbytes |
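The BAR0 table above can be written down as data, which makes it easy to check which block a given offset belongs to. This is only a sketch: the structure and function names are invented for illustration, and the addresses are copied from the table.

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>
#include <stdint.h>

/* One entry of the BAR0 map listed above (end address is inclusive). */
struct bar0_region {
    uint32_t begin;
    uint32_t end;
    const char *name;
};

static const struct bar0_region bar0_map[] = {
    { 0x00000000u, 0x00003FFFu, "Cra" },
    { 0x00004000u, 0x0000401Fu, "DMA Read Controller" },
    { 0x00004020u, 0x0000403Fu, "DMA Write Controller" },
    { 0x00004040u, 0x0000404Fu, "PIO Controller" },
};

/* Each DMA controller window is 0x20 bytes, i.e. eight 32-bit words. */
static_assert(0x0000401Fu - 0x00004000u + 1u == 8u * 4u,
              "DMA window should hold eight 32-bit words");

/* Hypothetical helper: return the region containing a BAR0 byte offset,
 * or NULL when the offset is not mapped. */
static inline const struct bar0_region *bar0_lookup(uint32_t offset)
{
    size_t i;

    for (i = 0; i < sizeof(bar0_map) / sizeof(bar0_map[0]); i++) {
        if (offset >= bar0_map[i].begin && offset <= bar0_map[i].end)
            return &bar0_map[i];
    }
    return NULL;
}
```

An offset such as 0x4018 resolves to the DMA Read Controller, while 0x4050 lies past the PIO and returns NULL.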
Generate bitstream with Quartus
- In the Qsys window, click Finish and generate the component.
Top
Create a file named dmasimple.v with your favorite editor and add this code:
`timescale 1 ps / 1 ps
module dmasimple (
    input  wire hip_serial_rx_in0,
    output wire hip_serial_tx_out0,
    input  wire npor_npor,
    input  wire npor_pin_perst,
    input  wire refclk_clk,
    output wire pio
);
    // Outputs of the Qsys system: they must be nets (wire), not reg,
    // because they are driven by the ports of the instance below.
    wire [1:0] irqs_irq;
    wire       pio_export;

    // The pio pin shows the driver-controlled PIO bit ORed with both DMA IRQs
    assign pio = pio_export | irqs_irq[0] | irqs_irq[1];

    qsys_dmasimple u0 (
        .hip_serial_rx_in0  (hip_serial_rx_in0 ), // hip_serial.rx_in0
        .hip_serial_tx_out0 (hip_serial_tx_out0), //           .tx_out0
        .irqs_irq           (irqs_irq          ), // irqs.irq
        .npor_npor          (npor_npor         ), // npor.npor
        .npor_pin_perst     (npor_pin_perst    ), //     .pin_perst
        .refclk_clk         (refclk_clk        ), // refclk.clk
        .pio_export         (pio_export        )  // pio.export
    );
endmodule
- Add this file to the Quartus project, then set it as top-level.
Pinout
- Outside Quartus, create a file named pcie_pinout.tcl and edit it with your favorite text editor.
- Copy all lines from here.
- Add this line:
set_location_assignment PIN_AB22 -to pio
- Save your file, then add it to the project inside Quartus
- Run the script from the following menu: select it, then click Run:
Tools > Tcl Scripts...
Synthesis/place&route/bitstream
- Right-click on Assembler, then run it.
- Generate the bitstream with the menu
File > Convert Programming Files
- Set the following options:
- Programming file type : Raw Binary File (.rbf)
- File name: DMAsimple.rbf
- Under Input files to convert, click on SOF Data, then Add File ...
- Select DMAsimple.sof in the output_files directory, then click Open
- Check the Create CvP files box (Generate DMAsimple.periph.rbf and DMAsimple.core.rbf)
- Click on Generate
Once done, two files named DMAsimple.periph.rbf and DMAsimple.core.rbf will be available in the root project directory.
Linux Driver
Three source files are provided with this tutorial:
- Makefile: to compile the driver
- dma_ram.h: some useful constants
- dmasimple.c: the driver code
Modify Makefile
- In the Makefile, set the path to your local Armadeus BSP:
BSP_APF6_BROUTPUT := /usr/local/projects/armadeus/apf6/buildroot/output/
- Then try to compile it with make :
$ make
make -C /usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8/ SUBDIRS=/usr/local/projects/apf6sp/software/dmasimple ARCH=arm CROSS_COMPILE=arm-linux- modules CC=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-gcc LD=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-ld OBJDUMP=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-objdump
make[1]: Entering directory '/usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8'
  CC [M]  /usr/local/projects/apf6sp/software/dmasimple/dmasimple.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /usr/local/projects/apf6sp/software/dmasimple/dmasimple.mod.o
  LD [M]  /usr/local/projects/apf6sp/software/dmasimple/dmasimple.ko
make[1]: Leaving directory '/usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8'
- Copy dmasimple.ko to the target.
Understand the driver
The entry point of the driver is the probe function dmasimple_probe(). The Linux driver allocates a buffer, dmabuff, in the i.MX6 memory.
/* Allocate and initialize shared control data */
dmas->dmabuff = dmam_alloc_coherent(&pdev->dev, BUFF_SIZE,
                                    &dmas->dma_handle, GFP_KERNEL);
if (!dmas->dmabuff) {
    printk("Error, can't alloc coherent\n");
    goto err_return;
}
It then writes the buffer's bus address, dma_handle, to the address translation register of the Cyclone V PCI Express hard IP.
writel((unsigned long)(dmas->dma_handle), &dmas->bar0[CRA_REG_A2P_ADDR_MAP_LO0/4]);
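To see why the buffer address goes into a translation register, here is a small model of the translation. It assumes the common behaviour of a paged table: the entry supplies the upper address bits and the low 12 bits of the Avalon address pass through unchanged. Under that assumption, an access at some offset in page 0 reaches dma_handle plus that offset, provided dma_handle is aligned on 4 KB. The function names and the example bus address are made up.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>

#define A2P_PAGE_SIZE 0x1000u   /* 4 KB pages, as configured in Qsys */

/* The mask arithmetic below only works for power-of-two page sizes. */
static_assert((A2P_PAGE_SIZE & (A2P_PAGE_SIZE - 1u)) == 0u,
              "page size must be a power of two");

/* Hypothetical model of one translation: the table entry gives the upper
 * bits, the Avalon address gives the low 12 bits. */
static inline uint32_t a2p_translate(uint32_t table_entry, uint32_t avalon_addr)
{
    uint32_t page_mask = A2P_PAGE_SIZE - 1u;

    return (table_entry & ~page_mask) | (avalon_addr & page_mask);
}

/* A bus address loses no bits as a table entry only if it is page aligned. */
static inline int a2p_is_page_aligned(uint32_t bus_addr)
{
    return (bus_addr & (A2P_PAGE_SIZE - 1u)) == 0u;
}
```

With a made-up handle of 0x3A5C1000, an Avalon access at offset 0x010 would land at bus address 0x3A5C1010.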
DMA read
When a value is echoed to the sysfs file named dmaread:
echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmaread
the DMA transfer starts and data from the RAM block in the FPGA is transferred into dmabuff.
static ssize_t store_dmaread(struct device *dev, struct device_attribute *attr,
                             const char *buf, size_t count)
{
    [...]
    /* Launch DMA transfer */
    writel(ALT_DMA_CTRL_FLG_DBL_WD | ALT_DMA_CTRL_FLG_LEEN |
           ALT_DMA_CTRL_FLG_GO | ALT_DMA_CTRL_FLG_I_EN,
           &dmas->bar0[ALT_DMA_REG_CONTROL/4]);
    return count;
}
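The part elided in the function above presumably sets up the addresses and length before the control write. The sketch below shows what such a sequence could look like, using a plain array in place of the BAR0 registers so it can run in user space. The register offsets and bit positions are my reading of the Altera DMA Controller core register map and are assumptions here, since dma_ram.h is not shown; the names are invented and may not match the ALT_DMA_* constants of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Word offsets inside one DMA controller block (assumed values). */
enum {
    DMA_STATUS    = 0,
    DMA_READADDR  = 1,
    DMA_WRITEADDR = 2,
    DMA_LENGTH    = 3,
    DMA_CONTROL   = 6,
    DMA_NUM_REGS  = 8
};

/* Control bits (assumed values). */
#define DMA_CTRL_GO         (1u << 3)   /* start the transfer */
#define DMA_CTRL_I_EN       (1u << 4)   /* raise an IRQ when done */
#define DMA_CTRL_LEEN       (1u << 7)   /* stop when length reaches 0 */
#define DMA_CTRL_DOUBLEWORD (1u << 10)  /* 64-bit transfers */

/* Program one transfer into an array standing in for the registers.
 * In the driver these would be writel() calls on BAR0. */
static inline void dma_start(volatile uint32_t regs[DMA_NUM_REGS],
                             uint32_t src, uint32_t dst, uint32_t len_bytes)
{
    assert(len_bytes % 8u == 0u);       /* whole 64-bit words only */

    regs[DMA_STATUS]    = 0u;           /* clear a previous DONE flag */
    regs[DMA_READADDR]  = src;
    regs[DMA_WRITEADDR] = dst;
    regs[DMA_LENGTH]    = len_bytes;
    regs[DMA_CONTROL]   = DMA_CTRL_DOUBLEWORD | DMA_CTRL_LEEN |
                          DMA_CTRL_GO | DMA_CTRL_I_EN;
}
```

The control register is written last because setting the GO bit starts the transfer with whatever addresses and length are already programmed.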
Reading the dmaread file with cat:
cat /sys/bus/pci/devices/0000\:01\:00.0/dmaread
launches a verification of the copied data.
/* Check the DMA transfer */
static ssize_t show_dmaread(struct device *dev, struct device_attribute *attr, char *buf)
{ [...] }
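The body of the check is not shown, so the following is only a guess at the general shape of such a verification: compare the words received in the buffer with the values expected from the FPGA RAM and report the first difference. The function name and the idea of a reference array are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: compare what the DMA wrote into the buffer with the
 * values expected from the FPGA RAM. Returns the index of the first
 * mismatching 32-bit word, or -1 when everything matches. */
static inline long dma_first_mismatch(const uint32_t *got,
                                      const uint32_t *expected,
                                      size_t nwords)
{
    size_t i;

    assert(got != NULL && expected != NULL);

    for (i = 0; i < nwords; i++) {
        if (got[i] != expected[i])
            return (long)i;
    }
    return -1;
}
```

A sysfs show function could then print either a success message or the index returned here.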
DMA write
When a value is echoed to the sysfs file named dmawrite:
echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmawrite
the DMA transfer starts and data from the DDR3 of the processor is transferred into the block of RAM in the FPGA.
static ssize_t store_dmawrite(struct device *dev, struct device_attribute *attr,
                              const char *buf, size_t count)
{
    [...]
    /* Launch DMA transfer */
    writel(ALT_DMA_CTRL_FLG_DBL_WD | ALT_DMA_CTRL_FLG_LEEN |
           ALT_DMA_CTRL_FLG_GO | ALT_DMA_CTRL_FLG_I_EN,
           &dmas->bar0[ALT_DMA_WRITE_REG_CONTROL/4]);
    return count;
}
Reading the dmawrite file with cat:
cat /sys/bus/pci/devices/0000\:01\:00.0/dmawrite
launches a verification of the copied data.
/* Check the DMA transfer */
static ssize_t show_dmawrite(struct device *dev, struct device_attribute *attr, char *buf)
{ [...] }
Boot system and Tests
Booting system
Copy the bitstream files to the host tftpboot directory:
$ cp -v *.rbf /tftpboot/
‘DMAsimple.core.rbf’ -> ‘/tftpboot/DMAsimple.core.rbf’
‘DMAsimple.periph.rbf’ -> ‘/tftpboot/DMAsimple.periph.rbf’
Flash the APF6_SP with a fresh git checkout of the Armadeus project, then:
- Boot into U-Boot and download the periph bitstream:
BIOS> tftpboot ${loadaddr} DMAsimple.periph.rbf
BIOS> fpga load 0 ${loadaddr} ${filesize}
- Then boot Linux
BIOS> boot
apf6 login: root
- Download the core.rbf file, then load it (192.168.0.214 is the address of the TFTP host server):
$ tftp -g -r DMAsimple.core.rbf 192.168.0.214
$ load_fpga DMAsimple.core.rbf
- Download, then load the DMAsimple driver:
$ tftp -g -r dmasimple.ko 192.168.0.214
$ insmod dmasimple.ko
Test
i.MX <- FPGA
- To launch a transfer, echo a value to the sysfs file:
echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmaread
- To check that the values were transferred correctly, run:
cat /sys/bus/pci/devices/0000\:01\:00.0/dmaread
i.MX -> FPGA
- To launch a transfer, echo a value to the sysfs file:
echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmawrite
- To check that the values were transferred correctly, run:
cat /sys/bus/pci/devices/0000\:01\:00.0/dmawrite
Links
Armadeus documentation
Altera documentation