APF6 SP DMA simple howto

From ArmadeusWiki
Revision as of 12:27, 5 July 2021 by FabienM



Introduction

In this howto we will learn how to do DMA accesses through PCIe on an APF6_SP board.

Architecture of the DMA howto in FPGA (TODO: adding pio and irq exports)

Make the quartus project

Note: Quartus 15.1.1 was used for this tutorial.


Follow the full howto from chapter Make the quartus project to chapter Adding the PCIe and CvP, with only two BARs configured (BAR0 and BAR1).

  • Add these Avalon-MM CycloneV Hard IP for PCI Express settings:
    • In Avalon to PCIe Address Translation Settings
      • set Number of address pages: to 4
      • set Size of Address pages: to 4Kbyte - 12 bits
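With 4 pages of 4 KB each, the DMA masters see a 16 KB window on the Txs port. The sketch below is a plain C model of how such a page table could translate an Avalon-side offset into a PCIe bus address: the upper bits select a page entry and the low 12 bits pass through unchanged. It is an illustration only, not driver code. The names a2p_translate and A2P_* are hypothetical, and the model ignores any flag bits the real table entries may carry in their low bits.

```c
#include <assert.h>
#include <stdint.h>

#define A2P_PAGE_BITS  12u                         /* 4 KB pages */
#define A2P_NUM_PAGES  4u                          /* 4 pages -> 16 KB window */
#define A2P_PAGE_MASK  ((1u << A2P_PAGE_BITS) - 1u)

/* Translate an Avalon-side offset into a PCIe bus address (model only). */
static uint64_t a2p_translate(const uint64_t page_base[A2P_NUM_PAGES],
                              uint32_t avalon_off)
{
    uint32_t page = avalon_off >> A2P_PAGE_BITS;   /* which table entry */
    uint32_t off  = avalon_off & A2P_PAGE_MASK;    /* offset inside the page */

    assert(page < A2P_NUM_PAGES);                  /* outside the 16 KB window */
    return (page_base[page] & ~(uint64_t)A2P_PAGE_MASK) | off;
}
```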

Adding Block of RAM

  • In the IP-Catalog:
Library
    Basic Functions
        On Chip Memory
            On-Chip Memory (RAM or ROM)
  • datawidth: 64 bits
  • Change capacity to : 16384 (16KBytes)
  • Connect signals as following:
    • coreclkout -> clk1
    • nreset_status -> rst
    • Rxm_BAR1 -> s1

Adding Reading DMA block

The register map of the Altera DMA block can be found here (starting at page 239).

The reading DMA block is used to transfer data from the block of RAM in the FPGA (CycloneV) to the DDR3 of the processor (i.MX6).

  • In the IP-Catalog:
Library
    Basic Functions
        DMA
            DMA Controller
  • Set Width of the DMA length register (1-32)(bits): to : 32 bits
  • Set Enable burst transfers: 1024
  • Set Depth fifo burst: 2048
  • Connect signals as following:
    • coreclkout -> clk
    • nreset_status -> reset
    • Rxm_irq <- irq
    • onchip_memory.s1 <- read_master
    • pcie.TXs <- write_master
    • Rxm_BAR0 -> control_port_slave
      • Set base address to 0x0000_4000
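As a rough guide to what sits behind this base address, the sketch below lists the DMA controller registers and computes their byte offsets within BAR0. The word offsets are an assumption taken from Altera's DMA controller register map (32-bit registers, with offsets 4, 5 and 7 reserved); check them against the register map before relying on them. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Word offsets of the DMA controller registers (assumed from the
 * vendor register map; offsets 4, 5 and 7 are reserved). */
enum dma_reg {
    DMA_REG_STATUS       = 0,
    DMA_REG_READADDRESS  = 1,
    DMA_REG_WRITEADDRESS = 2,
    DMA_REG_LENGTH       = 3,
    DMA_REG_CONTROL      = 6
};

#define DMA_READ_BASE 0x4000u   /* base address set in Qsys for this block */

/* Byte offset of a DMA register inside BAR0. */
static uint32_t dma_reg_offset(uint32_t base, enum dma_reg reg)
{
    assert((unsigned)reg <= 7u);        /* the block has 8 register slots */
    return base + 4u * (uint32_t)reg;
}
```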

Adding Writing DMA block

The register map of the Altera DMA block can be found here (starting at page 239).

The writing DMA block is used to transfer data from the DDR3 of the processor (i.MX6) to the block of RAM in the FPGA (CycloneV).

  • In the IP-Catalog:
Library
    Basic Functions
        DMA
            DMA Controller
  • Set Width of the DMA length register (1-32)(bits): to : 32 bits
  • Set Enable burst transfers: 1024
  • Set Depth fifo burst: 2048
  • Connect signals as following:
    • coreclkout -> clk
    • nreset_status -> reset
    • Rxm_irq <- irq
    • onchip_memory.s1 <- write_master
    • pcie.TXs <- read_master
    • Rxm_BAR0 -> control_port_slave
      • Set base address to 0x0000_4020

Adding irq mapper

For performance measurements we need to see the IRQs on an oscilloscope, so we have to export them.

  • In the IP-Catalog:
Library
    Qsys Interconnect
        Interrupt
            IRQ mapper
  • Leave all default options then click on finish
  • Connect following signals
    • coreclkout -> clk
    • nreset_status -> clk_reset
    • dma_0.irq -> receiver0
    • dma_1.irq -> receiver1
  • Export sender with name: irqs

Adding PIO

For performance measurements we need to toggle a signal from the driver.

  • In the IP-Catalog:
Library
    Processors and peripherals
        Peripherals
            PIO (Parallel I/O)
  • Set following options
    • Width : 1
    • Direction : Output
    • Click on finish
  • Connect following signals:
    • coreclkout -> clk
    • nreset_status -> reset
    • Rxm_BAR0 -> s1
      • Set base address to 0x0000_4040
  • Export external_connection with name : pio
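To show how the driver could toggle this 1-bit output, here is a minimal userspace model. It assumes the PIO data register is at word offset 0 of the PIO base, and it keeps a software shadow of the output value so that it never depends on what a read of the data register returns. The struct and function names are hypothetical; a real driver would write through writel() on the ioremapped BAR0 instead of a plain pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PIO_REG_DATA 0u   /* word offset of the data register (assumed) */

struct pio_out {
    volatile uint32_t *regs;   /* PIO base, e.g. BAR0 + 0x4040 */
    uint32_t shadow;           /* last value written to the output */
};

/* Invert the single output bit and write the new value out. */
static void pio_toggle(struct pio_out *p)
{
    assert(p != NULL && p->regs != NULL);
    p->shadow ^= 1u;
    p->regs[PIO_REG_DATA] = p->shadow;
}
```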

Mapping

The mapping of the FPGA design will be as follows:

  • BAR0:
begin addr (8bits)   end addr (8bits)   name
0x0000_0000          0x0000_3FFF        Cra
0x0000_4000          0x0000_401F        DMA Read Controller
0x0000_4020          0x0000_403F        DMA Write Controller
0x0000_4040          0x0000_404F        PIO Controller
  • BAR1:
begin addr (8bits)   end addr (8bits)   name
0x0000_0000          0x0000_3FFF        Block of RAM 16Kbytes
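The BAR0 mapping can also be written down as a small C table, which is handy for sanity-checking register offsets while debugging. This is only a sketch: the struct and function names are hypothetical, and the ranges are copied from the mapping above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bar_region {
    uint32_t begin;
    uint32_t end;          /* inclusive */
    const char *name;
};

static const struct bar_region bar0_map[] = {
    { 0x0000, 0x3FFF, "Cra" },
    { 0x4000, 0x401F, "DMA Read Controller" },
    { 0x4020, 0x403F, "DMA Write Controller" },
    { 0x4040, 0x404F, "PIO Controller" },
};

/* Return the region containing `offset`, or NULL if nothing is mapped. */
static const struct bar_region *bar0_lookup(uint32_t offset)
{
    size_t i;

    for (i = 0; i < sizeof(bar0_map) / sizeof(bar0_map[0]); i++) {
        if (offset >= bar0_map[i].begin && offset <= bar0_map[i].end)
            return &bar0_map[i];
    }
    return NULL;
}

/* Size of a region in bytes. */
static uint32_t bar0_region_size(const struct bar_region *r)
{
    assert(r != NULL);
    return r->end - r->begin + 1u;
}
```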

Generate bitstream with Quartus

  • In the Qsys window, click on Finish and generate the component.

Top

Edit a file named dmasimple.v with your favorite editor and add this code :

`timescale 1 ps / 1 ps 

module dmasimple (
                input wire hip_serial_rx_in0, 
                output wire hip_serial_tx_out0,
                input wire npor_npor, 
                input wire npor_pin_perst,
                input wire refclk_clk,   
                output wire pio);       

// Driven by outputs of the Qsys instance, so these must be nets, not regs
wire [1:0] irqs_irq;
wire pio_export;

// The pin is high when the driver sets the PIO or when either DMA raises its IRQ
assign pio = pio_export | irqs_irq[0] | irqs_irq[1];

qsys_dmasimple u0 (
    .hip_serial_rx_in0  (hip_serial_rx_in0 ), // hip_serial.rx_in0
    .hip_serial_tx_out0 (hip_serial_tx_out0), //           .tx_out0
    .irqs_irq           (irqs_irq          ), //       irqs.irq
    .npor_npor          (npor_npor         ), //       npor.npor
    .npor_pin_perst     (npor_pin_perst    ), //           .pin_perst
    .refclk_clk         (refclk_clk        ), //     refclk.clk
    .pio_export         (pio_export        )  //        pio.export
  );

endmodule
  • Add this file to the Quartus project, then set it as top-level.

Pinout

  • Outside Quartus, create a file named pcie_pinout.tcl and edit it with your favorite text editor.
  • Copy all lines from here.
  • Add this line:
set_location_assignment PIN_AB22 -to pio
  • Save your file, then add it to the project inside Quartus
  • Open the following menu, select the script, then run it:
Tools
    TCL Scripts ...

Synthesis/place&route/bitstream

  • Right click on Assembly then run.
  • Generate bitstream with menu
File
    Convert Programming Files
  • Set the following
    • Programming file type : Raw Binary File (.rbf)
    • File name: DMAsimple.rbf
  • under Input files to convert click on SOF Data then Add File ...
    • Set DMAsimple.sof in directory output_files then open
    • Check the box Create CvP files (generates DMAsimple.periph.rbf and DMAsimple.core.rbf)
  • Click on Generate

Once done, two files named DMAsimple.periph.rbf and DMAsimple.core.rbf will be available in the root project directory.

Linux Driver

Three source files are provided with this tutorial:

  • Makefile: to compile the driver
  • dma_ram.h: some useful constants
  • dmasimple.c: the driver code

Modify Makefile

  • In the Makefile change the path to your local path of armadeus BSP:
BSP_APF6_BROUTPUT := /usr/local/projects/armadeus/apf6/buildroot/output/
  • Then try to compile it with make :
$ make
make -C /usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8/ SUBDIRS=/usr/local/projects/apf6sp/software/dmasimple ARCH=arm CROSS_COMPILE=arm-linux- modules CC=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-gcc LD=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-ld OBJDUMP=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-objdump
make[1]: Entering directory '/usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8'
  CC [M]  /usr/local/projects/apf6sp/software/dmasimple/dmasimple.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /usr/local/projects/apf6sp/software/dmasimple/dmasimple.mod.o
  LD [M]  /usr/local/projects/apf6sp/software/dmasimple/dmasimple.ko
make[1]: Leaving directory '/usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8'
  • Copy dmasimple.ko to the target.

Understand the driver

The entry point of the driver is the probe function dmasimple_probe(). The driver first allocates a coherent buffer, dmabuff, in the i.MX6 memory.

/* Allocate and initialize shared control data */
dmas->dmabuff = dmam_alloc_coherent(&pdev->dev, BUFF_SIZE, &dmas->dma_handle, GFP_KERNEL);
if (!dmas->dmabuff){
  printk("Error, can't alloc coherent\n");
  goto err_return;
}

It then registers the buffer's bus address, dma_handle, in the address translation register of the CycloneV PCI Express hard IP.

writel((unsigned long)(dmas->dma_handle), &dmas->bar0[CRA_REG_A2P_ADDR_MAP_LO0/4]);
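Because bar0 is indexed as an array of 32-bit words, the byte offset of the register is divided by 4. The userspace model below shows that indexing with the BAR replaced by an ordinary array. The value 0x1000 for CRA_REG_A2P_ADDR_MAP_LO0 is an assumption (dma_ram.h is not reproduced on this page), and bar_write32 is a hypothetical helper standing in for writel().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRA_REG_A2P_ADDR_MAP_LO0 0x1000u   /* byte offset; assumed value */

/* Store `val` at byte offset `off` of a BAR viewed as 32-bit words. */
static void bar_write32(uint32_t *bar, size_t off, uint32_t val)
{
    assert(bar != NULL);
    assert(off % 4 == 0);      /* registers are 32-bit aligned */
    bar[off / 4] = val;
}
```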

DMA read

When an echo is done on the sysfs file named dmaread:

echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmaread

the DMA transfer starts and the data from the block of RAM in the FPGA is transferred into dmabuff.

static ssize_t store_dmaread(struct device *dev, struct device_attribute *attr,
               const char *buf, size_t count)
{

  [...]

  /* Launch DMA transfer */
  writel(ALT_DMA_CTRL_FLG_DBL_WD | ALT_DMA_CTRL_FLG_LEEN |
      ALT_DMA_CTRL_FLG_GO | ALT_DMA_CTRL_FLG_I_EN,
      &dmas->bar0[ALT_DMA_REG_CONTROL/4]);

  return count;
}
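To make the control word written above more concrete, the sketch below composes it from individual bits. The bit positions are assumptions taken from Altera's DMA controller register map, since dma_ram.h is not shown here; ALT_DMA_CTRL_FLG_WORD and dma_start_word are hypothetical names added for illustration. With these positions, a 64-bit (doubleword) transfer gives 0x498.

```c
#include <assert.h>
#include <stdint.h>

/* Control register bits (positions assumed from the vendor register map). */
#define ALT_DMA_CTRL_FLG_WORD    (1u << 2)    /* 32-bit transfers */
#define ALT_DMA_CTRL_FLG_GO      (1u << 3)    /* start the transfer */
#define ALT_DMA_CTRL_FLG_I_EN    (1u << 4)    /* raise an IRQ when done */
#define ALT_DMA_CTRL_FLG_LEEN    (1u << 7)    /* stop when length reaches 0 */
#define ALT_DMA_CTRL_FLG_DBL_WD  (1u << 10)   /* 64-bit transfers */

/* Control word that starts an interrupt-driven transfer using words of
 * `width_bytes` bytes (4 or 8). */
static uint32_t dma_start_word(unsigned width_bytes)
{
    uint32_t size_flag;

    assert(width_bytes == 4 || width_bytes == 8);
    size_flag = (width_bytes == 8) ? ALT_DMA_CTRL_FLG_DBL_WD
                                   : ALT_DMA_CTRL_FLG_WORD;
    return size_flag | ALT_DMA_CTRL_FLG_LEEN |
           ALT_DMA_CTRL_FLG_GO | ALT_DMA_CTRL_FLG_I_EN;
}
```

The driver uses the doubleword flag, which matches the 64-bit data width chosen for the on-chip memory.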

A cat command on the file dmaread:

cat /sys/bus/pci/devices/0000\:01\:00.0/dmaread

launches a verification of the copied data.

/* Checking DMA transfer */
static ssize_t show_dmaread(struct device *dev, struct device_attribute *attr, char *buf)
{ [...] }
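The body of the check is not reproduced on this page. As a purely hypothetical illustration of what such a verification can look like, the sketch below assumes the source memory was filled with an incrementing 32-bit pattern and reports the first word that does not match. The pattern, the function name and its return convention are all assumptions, not the driver's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first word that differs from the expected
 * incrementing pattern (start, start+1, ...), or -1 if all words match. */
static long first_mismatch(const uint32_t *buf, size_t nwords, uint32_t start)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < nwords; i++) {
        if (buf[i] != start + (uint32_t)i)
            return (long)i;
    }
    return -1;
}
```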

DMA write

When an echo is done on the sysfs file named dmawrite:

echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmawrite

the DMA transfer starts and the data from the DDR3 of the processor is transferred into the block of RAM in the FPGA.

static ssize_t store_dmawrite(struct device *dev, struct device_attribute *attr,
               const char *buf, size_t count)
{

  [...]

  /* Launch DMA transfer */
  writel(ALT_DMA_CTRL_FLG_DBL_WD | ALT_DMA_CTRL_FLG_LEEN |
      ALT_DMA_CTRL_FLG_GO | ALT_DMA_CTRL_FLG_I_EN,
      &dmas->bar0[ALT_DMA_WRITE_REG_CONTROL/4]);

  return count;
}
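The setup that precedes the control write is elided above. For orientation only, the sketch below shows a generic programming sequence for this kind of DMA controller against a simulated register array: clear the status, set the source, destination and length in bytes, and write the control word last because that is what starts the transfer. The register offsets are assumptions taken from Altera's register map, the names are hypothetical, and the real driver may do this differently (it would also use writel() on the ioremapped BAR0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_WRITE_BASE     0x4020u   /* from the BAR0 mapping */
#define DMA_REG_STATUS     0x00u     /* byte offsets below are assumed */
#define DMA_REG_READADDR   0x04u
#define DMA_REG_WRITEADDR  0x08u
#define DMA_REG_LENGTH     0x0Cu
#define DMA_REG_CONTROL    0x18u

/* Program one transfer on the DMA controller located at `base`. */
static void dma_program(uint32_t *bar0, uint32_t base, uint32_t src,
                        uint32_t dst, uint32_t len, uint32_t ctrl)
{
    assert(bar0 != NULL && base % 4 == 0);
    bar0[(base + DMA_REG_STATUS) / 4]    = 0;     /* clear previous status */
    bar0[(base + DMA_REG_READADDR) / 4]  = src;   /* where to read from */
    bar0[(base + DMA_REG_WRITEADDR) / 4] = dst;   /* where to write to */
    bar0[(base + DMA_REG_LENGTH) / 4]    = len;   /* length in bytes */
    bar0[(base + DMA_REG_CONTROL) / 4]   = ctrl;  /* GO bit starts it */
}
```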

A cat command on the file dmawrite:

cat /sys/bus/pci/devices/0000\:01\:00.0/dmawrite

launches a verification of the copied data.

/* Checking DMA transfer */
static ssize_t show_dmawrite(struct device *dev, struct device_attribute *attr, char *buf)
{ [...] }

Boot system and Tests

Booting system

Copy the bitstream files to the host tftpboot directory:

$ cp -v *.rbf /tftpboot/
‘DMAsimple.core.rbf’ -> ‘/tftpboot/DMAsimple.core.rbf’
‘DMAsimple.periph.rbf’ -> ‘/tftpboot/DMAsimple.periph.rbf’

Flash the APF6_SP with a fresh git checkout of the Armadeus project, then:

  • Boot under U-Boot and download periph bitstream:
BIOS> tftpboot ${loadaddr} DMAsimple.periph.rbf
BIOS> fpga load 0 ${loadaddr} ${filesize}
  • Then boot Linux
BIOS> boot
apf6 login: root
  • Download the core.rbf then load it (where 192.168.0.214 is the address of the TFTP host server):
$ tftp -g -r DMAsimple.core.rbf 192.168.0.214
$ load_fpga DMAsimple.core.rbf
  • Download then load the DMAsimple driver:
$ tftp -g -r dmasimple.ko 192.168.0.214
$ insmod dmasimple.ko 

Test

i.MX <- FPGA

  • To launch a transfer, echo a value into the sysfs file:
echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmaread
  • To check that the values were correctly transferred, do:
cat /sys/bus/pci/devices/0000\:01\:00.0/dmaread

i.MX -> FPGA

  • To launch a transfer, echo a value into the sysfs file:
echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmawrite
  • To check that the values were correctly transferred, do:
cat /sys/bus/pci/devices/0000\:01\:00.0/dmawrite
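The same tests can be driven from a small C program on the target instead of the shell. The sketch below builds the sysfs path for a given PCI address and attribute and writes "1" to it, like the echo commands above. The function names are hypothetical; note that the backslashes before the colons are shell escaping only and are not part of the path.

```c
#include <assert.h>
#include <stdio.h>

/* Build "/sys/bus/pci/devices/<bdf>/<attr>" into `out`.
 * Returns 0 on success, -1 if the buffer is too small. */
static int dma_sysfs_path(char *out, size_t outlen,
                          const char *bdf, const char *attr)
{
    int n;

    assert(out != NULL && bdf != NULL && attr != NULL);
    n = snprintf(out, outlen, "/sys/bus/pci/devices/%s/%s", bdf, attr);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Write "1" to the attribute, like `echo 1 > path`. Returns 0 on success. */
static int dma_trigger(const char *path)
{
    FILE *f = fopen(path, "w");

    if (f == NULL)
        return -1;
    if (fputs("1\n", f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

For example, dma_sysfs_path(path, sizeof path, "0000:01:00.0", "dmaread") followed by dma_trigger(path) corresponds to the first echo command above.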

Links

Armadeus documentation


Altera documentation