Difference between revisions of "APF6 SP DMA simple howto"

From ArmadeusWiki
Latest revision as of 12:27, 5 July 2021


Introduction

In this howto we will learn how to perform DMA transfers over PCIe on an APF6_SP board.

[Figure: architecture of the DMA howto design in the FPGA. TODO: add the pio and irq exports.]

Make the quartus project

Note: Quartus 15.1.1 was used for this tutorial


Follow the full howto (http://www.armadeus.com/wiki/index.php?title=APF6_SP_The_full_howto) from chapter "Make the quartus project" through chapter "Adding the PCIe and CvP", configuring only two BARs (BAR0 and BAR1).

  • Add these Avalon-MM CycloneV Hard IP for PCI Express settings:
    • In Avalon to PCIe Address Translation Settings
      • set Number of address pages: to 4
      • set Size of Address pages: to 4Kbyte - 12 bits
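
With 4 address pages of 4 KB each, the Avalon-to-PCIe translation window should cover 16 KB, and each page has its own translation table entry. The sketch below only illustrates how an Avalon byte address splits into a page index and an in-page offset. The helper names are hypothetical (they come neither from the Altera IP nor from the driver), and the exact behaviour should be checked against the PCIe hard IP user guide.

```c
#include <stdint.h>

/* Values taken from the Qsys settings above. */
#define A2P_PAGE_BITS   12u                       /* 4 KB pages */
#define A2P_NUM_PAGES   4u
#define A2P_PAGE_SIZE   (1u << A2P_PAGE_BITS)
#define A2P_WINDOW_SIZE (A2P_NUM_PAGES * A2P_PAGE_SIZE)

/* Hypothetical helper: index of the translation table entry used. */
static inline uint32_t a2p_page_index(uint32_t avalon_addr)
{
    return (avalon_addr % A2P_WINDOW_SIZE) >> A2P_PAGE_BITS;
}

/* Hypothetical helper: low bits that pass through unchanged. */
static inline uint32_t a2p_page_offset(uint32_t avalon_addr)
{
    return avalon_addr & (A2P_PAGE_SIZE - 1u);
}

/* Hypothetical helper: PCIe address once the page base is substituted.
 * page_base[] stands for the values programmed in the translation table
 * and is assumed to be aligned on the page size. */
static inline uint32_t a2p_translate(const uint32_t page_base[A2P_NUM_PAGES],
                                     uint32_t avalon_addr)
{
    return page_base[a2p_page_index(avalon_addr)] | a2p_page_offset(avalon_addr);
}
```

For example, if page 1 were mapped to 0x20000000, Avalon address 0x1804 would fall in page 1 at offset 0x804 and translate to 0x20000804.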

Adding Block of RAM

  • In the IP-Catalog:
Library
    Basic Functions
        On Chip Memory
            On-Chip Memory (RAM or ROM)
  • Set the data width to 64 bits
  • Set the capacity to 16384 bytes (16 KBytes)
  • Connect the signals as follows:
    • coreclkout -> clk1
    • nreset_status -> rst
    • Rxm_BAR1 -> s1

Adding Reading DMA block

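The register map of the Altera DMA block can be found in the Qsys components datasheet (https://www.altera.co.jp/content/dam/altera-www/global/ja_JP/pdfs/literature/hb/nios2/n2cpu_nii5v3.pdf), starting at page 239.

The reading DMA block is used to transfer data from the block of RAM in the FPGA (CycloneV) to the DDR3 of the processor (i.MX6).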

  • In the IP-Catalog:
Library
    Basic Functions
        DMA
            DMA Controller
  • Set Width of the DMA length register (1-32)(bits): to 32 bits
  • Set Enable burst transfers: to 1024
  • Set Depth fifo burst: to 2048
  • Connect the signals as follows:
    • coreclkout -> clk
    • nreset_status -> reset
    • Rxm_irq <- irq
    • onchip_memory.s1 <- read_master
    • pcie.TXs <- write_master
    • Rxm_BAR0 -> control_port_slave
      • Set base address to 0x0000_4000

Adding Writing DMA block

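The register map of the Altera DMA block can be found in the Qsys components datasheet (https://www.altera.co.jp/content/dam/altera-www/global/ja_JP/pdfs/literature/hb/nios2/n2cpu_nii5v3.pdf), starting at page 239.

The writing DMA block is used to transfer data from the DDR3 of the processor (i.MX6) to the block of RAM in the FPGA (CycloneV).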

  • In the IP-Catalog:
Library
    Basic Functions
        DMA
            DMA Controller
  • Set Width of the DMA length register (1-32)(bits): to 32 bits
  • Set Enable burst transfers: to 1024
  • Set Depth fifo burst: to 2048
  • Connect the signals as follows:
    • coreclkout -> clk
    • nreset_status -> reset
    • Rxm_irq <- irq
    • onchip_memory.s1 <- write_master
    • pcie.TXs <- read_master
    • Rxm_BAR0 -> control_port_slave
      • Set base address to 0x0000_4020

Adding irq mapper

To measure performance we need to observe the IRQs on an oscilloscope, so they have to be exported.

  • In the IP-Catalog:
Library
    Qsys Interconnect
        Interrupt
            IRQ mapper
  • Leave all options at their defaults, then click on Finish
  • Connect the following signals:
    • coreclkout -> clk
    • nreset_status -> clk_reset
    • dma_0.irq -> receiver0
    • dma_1.irq -> receiver1
  • Export sender with the name: irqs

Adding PIO

To measure performance we also need to toggle a signal from the driver.

  • In the IP-Catalog:
Library
    Processors and peripherals
        Peripherals
            PIO (Parallel I/O)
  • Set the following options:
    • Width : 1
    • Direction : Output
    • Click on Finish
  • Connect the following signals:
    • coreclkout -> clk
    • nreset_status -> reset
    • Rxm_BAR0 -> s1
      • Set base address to 0x0000_4040
  • Export external_connection with the name: pio
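
To give an idea of how the driver could toggle this output, here is a minimal userspace-style sketch that writes the PIO data register through a BAR0 pointer. It assumes BAR0 is accessed as an array of 32-bit words and that the data register is the first register of the PIO. The names `PIO_BASE`, `PIO_REG_DATA` and `pio_set()` are hypothetical and are not taken from the driver sources.

```c
#include <stdint.h>

#define PIO_BASE     0x4040u  /* base address set in Qsys above */
#define PIO_REG_DATA 0x00u    /* assumption: data register is at offset 0 */

/* Drives the single PIO output bit high (level != 0) or low (level == 0).
 * bar0 points to the start of BAR0, viewed as 32-bit words. */
static void pio_set(volatile uint32_t *bar0, int level)
{
    bar0[(PIO_BASE + PIO_REG_DATA) / 4u] = level ? 1u : 0u;
}
```

In a kernel driver the plain store would be replaced by `writel()` on the ioremapped BAR0.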

Mapping

The address map of the FPGA design is as follows:

  • BAR0:

Begin address (bytes)   End address (bytes)   Name
0x0000_0000             0x0000_3FFF           Cra
0x0000_4000             0x0000_401F           DMA Read Controller
0x0000_4020             0x0000_403F           DMA write Controller
0x0000_4040             0x0000_404F           PIO Controller

  • BAR1:

Begin address (bytes)   End address (bytes)   Name
0x0000_0000             0x0000_3FFF           Block of RAM 16Kbytes
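
The BAR0 map above can be written down as a small table, which makes it easy to check that regions do not overlap and to find which block an offset belongs to. This is only a sketch: the structure and function names are hypothetical, and the real constants live in dma_ram.h, which is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of the BAR0 table (byte addresses, end address inclusive). */
struct bar0_region {
    uint32_t begin;
    uint32_t end;
    const char *name;
};

static const struct bar0_region bar0_map[] = {
    { 0x0000u, 0x3FFFu, "Cra" },
    { 0x4000u, 0x401Fu, "DMA Read Controller" },
    { 0x4020u, 0x403Fu, "DMA write Controller" },
    { 0x4040u, 0x404Fu, "PIO Controller" },
};

#define BAR0_REGIONS (sizeof(bar0_map) / sizeof(bar0_map[0]))

/* Returns the region containing a BAR0 byte offset, or NULL if unmapped. */
static const struct bar0_region *bar0_lookup(uint32_t offset)
{
    for (size_t i = 0; i < BAR0_REGIONS; i++)
        if (offset >= bar0_map[i].begin && offset <= bar0_map[i].end)
            return &bar0_map[i];
    return NULL;
}

/* Returns 1 if the table is sorted and no two regions overlap, else 0. */
static int bar0_map_is_consistent(void)
{
    for (size_t i = 0; i < BAR0_REGIONS; i++) {
        if (bar0_map[i].begin > bar0_map[i].end)
            return 0;
        if (i > 0 && bar0_map[i].begin <= bar0_map[i - 1].end)
            return 0;
    }
    return 1;
}
```

For instance, `bar0_lookup(0x4024)` returns the "DMA write Controller" row, and any offset from 0x4050 upwards returns NULL.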

Generate bitstream with Quartus

  • In the Qsys window, click on Finish and generate the component.

Top

Create a file named dmasimple.v with your favorite editor and add this code:

`timescale 1 ps / 1 ps 

module dmasimple (
                input wire hip_serial_rx_in0, 
                output wire hip_serial_tx_out0,
                input wire npor_npor, 
                input wire npor_pin_perst,
                input wire refclk_clk,   
                output wire pio);       

// Outputs of the Qsys instance must be connected to nets, not regs
wire [1:0] irqs_irq;
wire pio_export;

assign pio = pio_export | irqs_irq[0] | irqs_irq [1];

qsys_dmasimple u0 (
    .hip_serial_rx_in0  (hip_serial_rx_in0 ),  // hip_serial.rx_in0
    .hip_serial_tx_out0 (hip_serial_tx_out0), //           .tx_out0
    .irqs_irq           (irqs_irq          ),           //       irqs.irq
    .npor_npor          (npor_npor         ),          //       npor.npor
    .npor_pin_perst     (npor_pin_perst    ),     //           .pin_perst
    .refclk_clk         (refclk_clk        ),         //     refclk.clk
    .pio_export         (pio_export        )          //        pio.export
  );

endmodule
  • Add this file to the Quartus project, then set it as top-level.

Pinout

  • Outside Quartus, create a file named pcie_pinout.tcl and open it with your favorite text editor.
  • Copy all the lines from the PCIe section of the APF6_SP pinout page (http://www.armadeus.com/wiki/index.php?title=APF6_SP_DDR3_PINOUT#PCIe).
  • Add this line:
set_location_assignment PIN_AB22 -to pio
  • Save the file, then add it to the project inside Quartus.
  • Run the script from the following menu: select it, then click run:
Tools
    TCL Scripts ...

Synthesis/place&route/bitstream

  • Right-click on Assembly, then run it.
  • Generate the bitstream with the menu:
File
    Convert Programming Files
  • Set the following:
    • Programming file type : Raw Binary File (.rbf)
    • File name: DMAsimple.rbf
  • Under Input files to convert, click on SOF Data then Add File ...
    • Select DMAsimple.sof in the directory output_files, then click open
    • Check the box Create CvP files (Generate DMAsimple.periph.rbf and DMAsimple.core.rbf)
  • Click on Generate

Once done, two files named DMAsimple.periph.rbf and DMAsimple.core.rbf will be available in the root project directory.

Linux Driver

Three source files are provided with this tutorial:

  • Makefile: to compile the driver
  • dma_ram.h: some useful constants
  • dmasimple.c: the driver code

Modify Makefile

  • In the Makefile, change the path to the local path of your Armadeus BSP:
BSP_APF6_BROUTPUT := /usr/local/projects/armadeus/apf6/buildroot/output/
  • Then compile it with make:
$ make
make -C /usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8/ SUBDIRS=/usr/local/projects/apf6sp/software/dmasimple ARCH=arm CROSS_COMPILE=arm-linux- modules CC=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-gcc LD=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-ld OBJDUMP=/usr/local/projects/armadeus/apf6/buildroot/output/host/usr/bin/arm-linux-gnueabihf-objdump
make[1]: Entering directory '/usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8'
  CC [M]  /usr/local/projects/apf6sp/software/dmasimple/dmasimple.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /usr/local/projects/apf6sp/software/dmasimple/dmasimple.mod.o
  LD [M]  /usr/local/projects/apf6sp/software/dmasimple/dmasimple.ko
make[1]: Leaving directory '/usr/local/projects/armadeus/apf6/buildroot/output/build/linux-3.19.8'
  • Copy dmasimple.ko to the target.

Understand the driver

The entry point of the driver is the probe function dmasimple_probe(). The Linux driver allocates a buffer dmabuff in the i.MX6 memory.

/* Allocate a coherent DMA buffer shared between the i.MX6 and the FPGA.
 * dma_handle receives the bus address to hand over to the device. */
dmas->dmabuff = dmam_alloc_coherent(&pdev->dev, BUFF_SIZE, &dmas->dma_handle, GFP_KERNEL);
if (!dmas->dmabuff) {
  dev_err(&pdev->dev, "Error, can't alloc coherent\n");
  goto err_return;
}

It then writes the bus address of this buffer, dma_handle, into the address translation register of the CycloneV PCI Express hard IP.

writel((unsigned long)(dmas->dma_handle), &dmas->bar0[CRA_REG_A2P_ADDR_MAP_LO0/4]);
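
The `/4` in the register index suggests that bar0 is a pointer to 32-bit words, so byte offsets from the register map have to be converted to word indexes. The following sketch models this with plain memory instead of a real BAR. The value 0x1000 for `CRA_REG_A2P_ADDR_MAP_LO0` is an assumption to be checked against the PCIe hard IP user guide and dma_ram.h, and `reg_write32()` / `reg_read32()` are hypothetical stand-ins for `writel()` / `readl()`.

```c
#include <stdint.h>

/* Assumption: byte offset, inside the Cra region, of the first
 * Avalon-to-PCIe address translation entry (low 32 bits). */
#define CRA_REG_A2P_ADDR_MAP_LO0 0x1000u

/* Stand-in for writel() on a BAR viewed as an array of 32-bit words. */
static void reg_write32(volatile uint32_t *bar, uint32_t byte_offset,
                        uint32_t value)
{
    bar[byte_offset / 4u] = value;   /* byte offset -> word index */
}

/* Stand-in for readl() on the same BAR. */
static uint32_t reg_read32(const volatile uint32_t *bar, uint32_t byte_offset)
{
    return bar[byte_offset / 4u];
}
```

With these helpers, writing the buffer address to byte offset 0x1000 lands in word 0x400 of the BAR.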

Dma read

When a value is written with echo to the sysfs file named dmaread:

echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmaread

the DMA transfer is started and data from the block of RAM in the FPGA is transferred into dmabuff.

static ssize_t store_dmaread(struct device *dev, struct device_attribute *attr,
               const char *buf, size_t count)
{

  [...]

  /* Launch DMA transfer */
  writel(ALT_DMA_CTRL_FLG_DBL_WD | ALT_DMA_CTRL_FLG_LEEN |
      ALT_DMA_CTRL_FLG_GO | ALT_DMA_CTRL_FLG_I_EN,
      &dmas->bar0[ALT_DMA_REG_CONTROL/4]);

  return count;
}
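
The part elided by [...] presumably programs the source address, destination address and length before the control register is written. The sketch below shows one plausible sequence on a simulated register block. The register offsets and bit positions are assumptions based on the Altera DMA controller register map and must be checked against the datasheet and dma_ram.h. Only `ALT_DMA_REG_CONTROL` and the four `ALT_DMA_CTRL_FLG_*` names come from the driver excerpt; every other name is hypothetical.

```c
#include <stdint.h>

/* Assumed register byte offsets, relative to the DMA controller base. */
#define ALT_DMA_REG_STATUS    0x00u
#define ALT_DMA_REG_READADDR  0x04u   /* hypothetical name */
#define ALT_DMA_REG_WRITEADDR 0x08u   /* hypothetical name */
#define ALT_DMA_REG_LENGTH    0x0Cu   /* hypothetical name */
#define ALT_DMA_REG_CONTROL   0x18u

/* Assumed control register bit positions. */
#define ALT_DMA_CTRL_FLG_GO     (1u << 3)   /* start the transfer */
#define ALT_DMA_CTRL_FLG_I_EN   (1u << 4)   /* raise an IRQ when done */
#define ALT_DMA_CTRL_FLG_LEEN   (1u << 7)   /* stop when length reaches 0 */
#define ALT_DMA_CTRL_FLG_DBL_WD (1u << 10)  /* doubleword accesses */

/* Programs one transfer on a DMA register block viewed as 32-bit words
 * and returns the control word that was written. */
static uint32_t dma_start(volatile uint32_t *dma, uint32_t src,
                          uint32_t dst, uint32_t len_bytes)
{
    uint32_t ctrl = ALT_DMA_CTRL_FLG_DBL_WD | ALT_DMA_CTRL_FLG_LEEN |
                    ALT_DMA_CTRL_FLG_GO | ALT_DMA_CTRL_FLG_I_EN;

    dma[ALT_DMA_REG_STATUS / 4u] = 0u;         /* clear a previous DONE */
    dma[ALT_DMA_REG_READADDR / 4u] = src;      /* address seen by read_master */
    dma[ALT_DMA_REG_WRITEADDR / 4u] = dst;     /* address seen by write_master */
    dma[ALT_DMA_REG_LENGTH / 4u] = len_bytes;  /* transfer length in bytes */
    dma[ALT_DMA_REG_CONTROL / 4u] = ctrl;      /* written last: sets GO */
    return ctrl;
}
```

Under these assumptions the control word is 0x498. Writing the control register last matters because the GO bit starts the transfer with whatever addresses and length are programmed at that moment.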

Reading the file dmaread with cat:

cat /sys/bus/pci/devices/0000\:01\:00.0/dmaread

launches a verification of the copied data.

/* Checking DMA transfer */
static ssize_t show_dmaread(struct device *dev, struct device_attribute *attr, char *buf)
{ [...] }

Dma write

When a value is written with echo to the sysfs file named dmawrite:

echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmawrite

the DMA transfer is started and data from the DDR3 of the processor is transferred into the block of RAM in the FPGA.

static ssize_t store_dmawrite(struct device *dev, struct device_attribute *attr,
               const char *buf, size_t count)
{

  [...]

  /* Launch DMA transfer */
  writel(ALT_DMA_CTRL_FLG_DBL_WD | ALT_DMA_CTRL_FLG_LEEN |
      ALT_DMA_CTRL_FLG_GO | ALT_DMA_CTRL_FLG_I_EN,
      &dmas->bar0[ALT_DMA_WRITE_REG_CONTROL/4]);

  return count;
}

Reading the file dmawrite with cat:

cat /sys/bus/pci/devices/0000\:01\:00.0/dmawrite

launches a verification of the copied data.

/* Checking DMA transfer */
static ssize_t show_dmawrite(struct device *dev, struct device_attribute *attr, char *buf)
{ [...] }

Boot system and Tests

Booting system

Copy the bitstream files to the host tftpboot directory:

$ cp -v *.rbf /tftpboot/
‘DMAsimple.core.rbf’ -> ‘/tftpboot/DMAsimple.core.rbf’
‘DMAsimple.periph.rbf’ -> ‘/tftpboot/DMAsimple.periph.rbf’

Flash the apf6_sp with a fresh git view of the armadeus project (see http://www.armadeus.com/wiki/index.php?title=Target_Software_Installation). Then:

  • Boot into U-Boot and download the periph bitstream:
BIOS> tftpboot ${loadaddr} DMAsimple.periph.rbf
BIOS> fpga load 0 ${loadaddr} ${filesize}
  • Then boot Linux
BIOS> boot
apf6 login: root
  • Download the core.rbf then load it (192.168.0.214 is the address of the tftp host server):
$ tftp -g -r DMAsimple.core.rbf 192.168.0.214
$ load_fpga DMAsimple.core.rbf
  • Download then load the DMAsimple driver:
$ tftp -g -r dmasimple.ko 192.168.0.214
$ insmod dmasimple.ko 

Test

i.MX <- FPGA

  • To launch a transfer, echo a value to the sysfs file:
echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmaread
  • To check that the values were correctly transferred, do:
cat /sys/bus/pci/devices/0000\:01\:00.0/dmaread

i.MX -> FPGA

  • To launch a transfer, echo a value to the sysfs file:
echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/dmawrite
  • To check that the values were correctly transferred, do:
cat /sys/bus/pci/devices/0000\:01\:00.0/dmawrite
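
The echo/cat pairs above can also be driven from a small C program, for example to repeat the test in a loop. The sketch below uses only standard file I/O. The function names are hypothetical, and the attribute path is passed in by the caller (for instance /sys/bus/pci/devices/0000:01:00.0/dmaread).

```c
#include <stdio.h>

/* Writes a short string to an attribute file; returns 0 on success. */
static int attr_write(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = (fputs(value, f) >= 0);
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Reads an attribute file into buf (always NUL-terminated on success);
 * returns the number of bytes read, or -1 on error. */
static long attr_read(const char *path, char *buf, size_t size)
{
    FILE *f;
    size_t n;

    if (size == 0)
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1;
    n = fread(buf, 1, size - 1, f);
    buf[n] = '\0';
    fclose(f);
    return (long)n;
}

/* Equivalent of "echo 1 > attr" followed by "cat attr". */
static long dma_run_and_check(const char *attr_path, char *report, size_t size)
{
    if (attr_write(attr_path, "1\n") != 0)
        return -1;
    return attr_read(attr_path, report, size);
}
```

Since the transfer is launched with the interrupt enabled, it probably completes asynchronously, so a real test tool may need a short delay between the write and the read.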

Links

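Armadeus documentation

  • All APF6_SP FPGA documentation: http://www.armadeus.com/wiki/index.php?title=APF6_SP_Interfaces_description
  • The full howto, to know how to begin with the APF6_SP FPGA: http://www.armadeus.com/wiki/index.php?title=APF6_SP_The_full_howto

Altera documentation

  • Datasheet of the PCIe hard IP controller, with CRA (Configuration Registers Access) on page 13: http://www.altera.com/literature/ug/ug_c5_pcie_avmm.pdf
  • Datasheet of Qsys components, with the DMA controller on page 233: https://www.altera.co.jp/content/dam/altera-www/global/ja_JP/pdfs/literature/hb/nios2/n2cpu_nii5v3.pdf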