APF27 FPGA-IMX interface description
Latest revision as of 11:47, 10 December 2019
This article describes the interface between the i.MX and the Spartan3A on the APF27.
Documentation of the i.MX interface can be found in the i.MX reference manual, chapter 17, «Wireless External Interface Module (WEIM)».
Hardware
The detailed electronic schematics of the APF27 FPGA interface can be found on page 11 of this document. A simplified schematic is shown below in figure 1 (FPGA-i.MXL wiring).
Signals used in the design are:
- CLKO: Clock generated by i.MX. Used as general clock by the FPGA.
- DATA[16]: 16-bit data bus.
- ADDR[13]: 12-bit address bus; the least significant bit (ADDR[0]) is not used because only 16-bit word accesses are performed.
- CS4N_DTACK: Chip Select 4 or Data Transmit ACKnowledge.
- CS5,CS1: Chip Select 5 and 1.
- EB0N and EB1N: Enable Byte, write signals for the lower byte and the upper byte of the data bus.
- OEN: Output Enable, read signal.
- DMA_GRANT# and DMA_REQ#: Signals to use DMA on i.MX.
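As a minimal sketch of the addressing described in the list above (the function names are illustrative and not part of any Armadeus tool): since ADDR[0] is unused, a byte offset seen by the i.MX maps onto the 12 wired address lines by dropping bit 0, and the FPGA side can rebuild the byte address by appending a zero bit.

```python
ADDR_LINES = 12  # ADDR[12:1] are used; ADDR[0] is not


def byte_offset_to_addr_lines(offset: int) -> int:
    """Return the value carried by ADDR[12:1] for a 16-bit access at `offset`."""
    if offset % 2:
        raise ValueError("only 16-bit word accesses are supported")
    if not 0 <= offset < (1 << (ADDR_LINES + 1)):
        raise ValueError("offset outside the window reachable with 12 address lines")
    return offset >> 1


def addr_lines_to_byte_offset(lines: int) -> int:
    """Rebuild the byte offset from the 12 address lines (append a zero bit)."""
    return (lines << 1) & ((1 << (ADDR_LINES + 1)) - 1)
```

With 12 address lines and 16-bit words, the window covers 4096 words, i.e. byte offsets 0x0000 to 0x1FFE.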
Each chip select has its own configuration (timing, address range, ...) that can be used for different slaves in the FPGA.
CLKO is now configured at 100MHz by default to simplify FPGA IP design.
Chip Select Timings configuration
Old configuration (133MHz)
The old configuration uses CS5 to access the FPGA. The 32-bit CS5 control registers are used to configure all timings for this chip select. The old configuration was:
- CS5U (Upper 16bits, see page 521 of reference manual): mw D8002050 00000600
This adds 6 wait states to each access so that values are read correctly (WSC: Wait State Control).
- CS5L (Lower 16bits, see page 525 of reference manual): mw D8002054 00000D01
Enables the chip select (CSEN), asserts EB[] only on write accesses (EBC) and sets the data port size to 16 bits (DSZ: Data port SiZe).
- CS5A (additional register, page 528): mw D8002058 0
- WCR (WEIM Configuration Register): mw D8002060 00002000
Address unshifted for CS5 (AUS5)
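The register values above can be checked by extracting the fields they are said to set. The sketch below is only illustrative: the bit positions (WSC in bits 13:8 of CS5U; EBC in bit 11, DSZ in bits 10:8 and CSEN in bit 0 of CS5L; AUS5 in bit 13 of WCR) are assumptions taken from my reading of the i.MX27 register layout and should be verified against the reference manual pages cited above.

```python
def field(value: int, msb: int, lsb: int) -> int:
    """Extract bits msb..lsb (inclusive) from a 32-bit register value."""
    return (value >> lsb) & ((1 << (msb - lsb + 1)) - 1)


CS5U = 0x00000600  # value written to D8002050
CS5L = 0x00000D01  # value written to D8002054
WCR = 0x00002000   # value written to D8002060

decoded = {
    "WSC": field(CS5U, 13, 8),    # wait states (assumed bit position)
    "EBC": field(CS5L, 11, 11),   # EB[] on write accesses only (assumed)
    "DSZ": field(CS5L, 10, 8),    # data port size code (assumed)
    "CSEN": field(CS5L, 0, 0),    # chip select enable (assumed)
    "AUS5": field(WCR, 13, 13),   # address unshifted for CS5 (assumed)
}

for name, value in decoded.items():
    print(f"{name} = {value}")
```

Under these assumptions WSC decodes to 6, which matches the 6 wait states stated above, and EBC, CSEN and AUS5 are all set.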
With this configuration, the access time (read/write) to the FPGA was set to 44ns (see figure 2, static timings chronograms).
This configuration was interesting because all timings are under control. However, to be perfectly synchronous, the FPGA had to be clocked at 133MHz like the WEIM, and some IP designs do not work at this frequency.
Alternative configuration (with DTACK)
To solve this problem, another solution is to use the DTACK signal (asynchronous protocol). The slave asserts DTACK to tell the master that the read or write is done. With this solution, the access time is variable and the timing is no longer static.
i.MX registers configuration
To configure CS5N access using dtack, the GPIO PF21 must be configured as an input with a_out:
md 10015500 1 # read direction register PTF_DDIR
md 10015510 1 # read register PTF_ICONFA2
By default the configuration is correct; we only have to select the GPIO in use:
mw 10015520 FFBF1E80 # PTF_GIUS
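The `mw` command above overwrites the whole PTF_GIUS register. As a hedged sketch, the helpers below show how the bit for a single pin could be checked or set with a read-modify-write instead. They assume that bit n of PTF_GIUS corresponds to pin PFn, which is not stated in this article; the helper names are hypothetical.

```python
PF21_BIT = 21                 # assumption: bit n of PTF_GIUS maps to pin PFn
PTF_GIUS_VALUE = 0xFFBF1E80   # value written by the mw command above


def pin_in_use(gius: int, bit: int) -> bool:
    """Return True if the 'GPIO in use' bit is set for the given pin."""
    return bool((gius >> bit) & 1)


def set_pin_in_use(gius: int, bit: int) -> int:
    """Return the register value with only the given pin's bit set in addition."""
    return (gius | (1 << bit)) & 0xFFFFFFFF
```

In the value written above, bit 21 is set while bit 22 is clear.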
FPGA design
In the design, the Wishbone ack signal is routed back to the dtack pin. Here is sample code used to test dtack:
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;
-- ----------------------------------------------------------------------------
Entity imx27_wb16_wrapper is
-- ----------------------------------------------------------------------------
port
(
-- i.MX Signals
imx_address : in std_logic_vector(11 downto 0);
imx_data : inout std_logic_vector(15 downto 0);
imx_cs_n : in std_logic;
imx_oe_n : in std_logic;
imx_eb3_n : in std_logic;
imx_dtack : out std_logic ;
data0_out : out std_logic ;
addr1_out : out std_logic ;
cs_n_out : out std_logic ;
oe_n_out : out std_logic ;
eb3_n_out : out std_logic ;
dtack_out : out std_logic ;
-- Global Signals
gls_reset : in std_logic;
gls_clk : in std_logic
);
end entity;
-- ----------------------------------------------------------------------------
Architecture RTL of imx27_wb16_wrapper is
-- ----------------------------------------------------------------------------
constant DELAY : natural := 2;
signal write : std_logic;
signal read : std_logic;
signal strobe : std_logic;
signal writedata : std_logic_vector(15 downto 0);
signal address : std_logic_vector(12 downto 0);
signal reg1 : std_logic_vector(15 downto 0);
signal reg2 : std_logic_vector(15 downto 0);
signal write_ack : std_logic ;
signal read_ack : std_logic ;
signal wbm_address : std_logic_vector(12 downto 0);
signal wbm_writedata: std_logic_vector(15 downto 0);
signal wbm_readdata: std_logic_vector(15 downto 0);
signal wbm_strobe : std_logic ;
signal wbm_write : std_logic ;
signal wbm_cycle : std_logic ;
signal dtack_s : std_logic ; -- dtack
signal dtack_d : std_logic ; -- dtack delayed
signal dtack_reg : std_logic_vector( DELAY-1 downto 0);
--i signal dtack_old: std_logic_vector( DELAY-1 downto 0);
signal dtack_old : std_logic ;
begin
dtack_s <= write_ack or read_ack;
-- imx_dtack <= dtack_d;
-- dtack_out <= dtack_d;
data0_out <= imx_data(0);
addr1_out <= imx_address(1);
cs_n_out <= imx_cs_n;
oe_n_out <= imx_oe_n;
eb3_n_out <= imx_eb3_n;
-- ----------------------------------------------------------------------------
-- External signals synchronization process
-- ----------------------------------------------------------------------------
process(gls_clk, gls_reset)
begin
if(gls_reset='1') then
write <= '0';
read <= '0';
strobe <= '0';
writedata <= (others => '0');
address <= (others => '0');
elsif(rising_edge(gls_clk)) then
strobe <= not (imx_cs_n) and not(imx_oe_n and imx_eb3_n);
write <= not (imx_cs_n or imx_eb3_n);
read <= not (imx_cs_n or imx_oe_n);
address <= imx_address & '0';
writedata <= imx_data;
end if;
end process;
wbm_address <= address when (strobe = '1') else (others => '0');
wbm_writedata <= writedata when (write = '1') else (others => '0');
wbm_strobe <= strobe;
wbm_write <= write;
wbm_cycle <= strobe;
sync_p : process (gls_clk,gls_reset)
variable ack: std_logic ;
begin
if gls_reset = '1' then
imx_data <= (others => 'Z');
imx_dtack <= '0';
dtack_old <= '0';
dtack_out <= '0'; -- also reset the debug copy of dtack
elsif rising_edge(gls_clk) then
if read = '1' then
imx_data <= wbm_readdata;
dtack_old <= (read_ack or write_ack);
imx_dtack <= dtack_old;
dtack_out <= dtack_old;
else
imx_data <= (others => 'Z');
dtack_old <= '0';
imx_dtack <= '0';
dtack_out <= '0';
end if;
end if;
end process sync_p;
register_write : process(gls_clk,gls_reset)
begin
if gls_reset = '1' then
reg1 <= x"caca";
reg2 <= x"5599";
write_ack <= '0'; -- give the acknowledge a defined value after reset
elsif rising_edge(gls_clk) then
if (wbm_strobe = '1') and (wbm_cycle = '1') and (wbm_write = '1') then
if wbm_address = "0000000000000" then
write_ack <= '1';
reg1 <= wbm_writedata;
elsif wbm_address = "0000000000010" then
write_ack <= '1';
reg2 <= wbm_writedata;
end if;
else
write_ack <= '0';
end if;
end if;
end process register_write;
register_read : process(gls_clk,gls_reset)
begin
if gls_reset = '1' then
wbm_readdata <= (others => '0');
read_ack <= '0'; -- give the acknowledge a defined value after reset
elsif rising_edge(gls_clk) then
if (wbm_strobe = '1') and (wbm_cycle = '1') and (wbm_write = '0') then
if wbm_address = "0000000000000" then
read_ack <= '1';
wbm_readdata <= reg1;
elsif wbm_address = "0000000000010" then
read_ack <= '1';
wbm_readdata <= reg2;
end if;
else
read_ack <= '0';
end if;
end if;
end process register_read;
end architecture RTL;
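To get a feel for the latency of this wrapper, the registered signals of the read path can be stepped edge by edge. The following is a simplified Python model, not a replacement for an HDL simulation: it assumes the i.MX inputs are already stable (imx_cs_n = '0', imx_oe_n = '0', imx_eb3_n = '1') before the first rising edge, that write_ack stays at '0', and that read_ack starts at '0' after reset.

```python
def simulate_read(edges: int = 6, address: int = 0) -> list:
    """Return the value of imx_dtack after each rising edge of gls_clk."""
    # Registered signals, all '0' after reset (assumed for read_ack).
    read = strobe = read_ack = dtack_old = imx_dtack = 0
    trace = []
    for _ in range(edges):
        # Every next value is computed from the current values, as signal
        # assignments inside a clocked VHDL process behave.
        next_read, next_strobe = 1, 1  # synchronised copies of the inputs
        if strobe:
            # register_read: acknowledge only the two decoded addresses,
            # otherwise read_ack keeps its previous value.
            next_read_ack = 1 if address in (0, 2) else read_ack
        else:
            next_read_ack = 0
        if read:
            # sync_p: two register stages between read_ack and imx_dtack.
            next_dtack_old, next_imx_dtack = read_ack, dtack_old
        else:
            next_dtack_old, next_imx_dtack = 0, 0
        read, strobe = next_read, next_strobe
        read_ack, dtack_old, imx_dtack = next_read_ack, next_dtack_old, next_imx_dtack
        trace.append(imx_dtack)
    return trace
```

In this model imx_dtack goes high on the fourth rising edge after the inputs become stable, which at 133MHz is roughly 30ns. An address other than 0 or 2 never produces an acknowledge.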
With the following UCF constraint file:
# Constraint file
#
NET "gls_clk" TNM_NET = "gls_clk";
TIMESPEC "TS_rstgen_syscon00_ext_clk" = PERIOD "gls_clk" 7.5188 ns HIGH 50 %;
NET "gls_clk" LOC="N9" | IOSTANDARD=LVCMOS18;# CLK0
NET "imx_cs_n" LOC="P10" | IOSTANDARD=LVCMOS18;# CS5N
NET "imx_eb3_n" LOC="P9" | IOSTANDARD=LVCMOS18;# EB0N
NET "imx_oe_n" LOC="R9" | IOSTANDARD=LVCMOS18;# OEN
NET "imx_dtack" LOC="R3" | IOSTANDARD=LVCMOS18 | DRIVE=8;# CS4N_DTACK
NET "imx_address<0>" LOC="N5" | IOSTANDARD=LVCMOS18;# ADDR1
NET "imx_address<1>" LOC="L7" | IOSTANDARD=LVCMOS18;# ADDR2
NET "imx_address<2>" LOC="M7" | IOSTANDARD=LVCMOS18;# ADDR3
NET "imx_address<3>" LOC="M8" | IOSTANDARD=LVCMOS18;# ADDR4
NET "imx_address<4>" LOC="L8" | IOSTANDARD=LVCMOS18;# ADDR5
NET "imx_address<5>" LOC="L9" | IOSTANDARD=LVCMOS18;# ADDR6
NET "imx_address<6>" LOC="L10" | IOSTANDARD=LVCMOS18;# ADDR7
NET "imx_address<7>" LOC="M11" | IOSTANDARD=LVCMOS18;# ADDR8
NET "imx_address<8>" LOC="P11" | IOSTANDARD=LVCMOS18;# ADDR9
NET "imx_address<9>" LOC="N11" | IOSTANDARD=LVCMOS18;# ADDR10
NET "imx_address<10>" LOC="N12" | IOSTANDARD=LVCMOS18;# ADDR11
NET "imx_address<11>" LOC="P13" | IOSTANDARD=LVCMOS18;# ADDR12
NET "imx_data<0>" LOC="T5" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA0
NET "imx_data<1>" LOC="T6" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA1
NET "imx_data<2>" LOC="P7" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA2
NET "imx_data<3>" LOC="N8" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA3
NET "imx_data<4>" LOC="P12" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA4
NET "imx_data<5>" LOC="T13" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA5
NET "imx_data<6>" LOC="R13" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA6
NET "imx_data<7>" LOC="T14" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA7
NET "imx_data<8>" LOC="P5" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA8
NET "imx_data<9>" LOC="N6" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA9
NET "imx_data<10>" LOC="T3" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA10
NET "imx_data<11>" LOC="T11" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA11
NET "imx_data<12>" LOC="T4" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA12
NET "imx_data<13>" LOC="R5" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA13
NET "imx_data<14>" LOC="M10" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA14
NET "imx_data<15>" LOC="T10" | IOSTANDARD=LVCMOS18 | DRIVE=8;# DATA15
NET "data0_out" LOC="D16" | IOSTANDARD=LVCMOS18 | DRIVE=12;#IO_L22P_1
NET "addr1_out" LOC="D15" | IOSTANDARD=LVCMOS18 | DRIVE=12;#IO_L22N_1
NET "cs_n_out" LOC="N3" | IOSTANDARD=LVCMOS18 | DRIVE=12;#IO_L24P_3
NET "oe_n_out" LOC="R1" | IOSTANDARD=LVCMOS18 | DRIVE=12;#IO_L23P_3
NET "eb3_n_out" LOC="N2" | IOSTANDARD=LVCMOS18 | DRIVE=12;#IO_L22P_3
NET "dtack_out" LOC="M1" | IOSTANDARD=LVCMOS18 | DRIVE=8;#IO_L20P_3
#end
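The PERIOD value in the constraint file follows directly from the clock frequency. As a small illustrative sketch (the helper names are mine), the conversion can be written as follows; the same helper gives the period to use if gls_clk is driven at 100MHz instead.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def timespec_line(freq_mhz: float, net: str = "gls_clk") -> str:
    """Build a TIMESPEC line in the same form as the constraint file above."""
    return (f'TIMESPEC "TS_rstgen_syscon00_ext_clk" = PERIOD "{net}" '
            f'{period_ns(freq_mhz):.4f} ns HIGH 50 %;')


print(timespec_line(133.0))
print(timespec_line(100.0))
```

For 133MHz this yields the 7.5188 ns used above; for 100MHz the period is 10 ns.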
Timings
The main problem with this solution is that the i.MX waits too long (~42ns) after dtack rises before de-asserting its chip select (see figure 3, read access using dtack).
Synchronous access with FPGA at 100MHz (default configuration on APF27)
The main problem with the FPGA clocked at 100MHz is that the Wishbone bus is no longer synchronous with the WEIM interface (clocked at 133MHz).
Simulation
To ensure that the interface works well, we simulate it (figure 4: timings simulation with FPGA clock at 100MHz, with WSC=6).
Registers configuration
Changing CLKO to 100MHz:
To change CLKO to 100MHz, we use the HCLK source (400MHz) divided by 4.
Selecting the HCLK source (CCSR):
mw 10027028 00008305 # HCLK Source (MPLL 2x clock output / 3) := 400MHz
Divide by 4 (PCDR0):
mw 10027018 12C41083 # divide by 4
And we have to add one more clock cycle for CS (CSCR5U):
mw.l D8002050 00000600
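The relationship between the two clocks can be worked out with exact fractions. This sketch assumes that the WEIM clock is the 400MHz HCLK source divided by 3 (about 133MHz) and that CLKO is the same source divided by 4; the divide-by-3 figure is my assumption, only the divide-by-4 is stated in this article.

```python
from fractions import Fraction
from math import gcd

HCLK_SOURCE_MHZ = 400
WEIM_DIV = 3   # assumption: 400MHz / 3 is roughly the 133MHz WEIM clock
CLKO_DIV = 4   # from this article: 400MHz / 4 = 100MHz

source_period_ns = Fraction(1000, HCLK_SOURCE_MHZ)   # 2.5 ns
weim_period_ns = WEIM_DIV * source_period_ns         # 7.5 ns
clko_period_ns = CLKO_DIV * source_period_ns         # 10 ns

# Both clocks line up again after lcm(3, 4) source cycles.
lcm_cycles = WEIM_DIV * CLKO_DIV // gcd(WEIM_DIV, CLKO_DIV)
realign_ns = lcm_cycles * source_period_ns

# Delay of each CLKO rising edge after the preceding WEIM rising edge.
phases_ns = [(k * clko_period_ns) % weim_period_ns
             for k in range(lcm_cycles // CLKO_DIV)]

print(float(clko_period_ns), float(weim_period_ns), float(realign_ns))
print([float(p) for p in phases_ns])
```

Under these assumptions a CLKO edge falls 0, 2.5 or 5 ns after a WEIM edge, and the pattern repeats every 30 ns, so FPGA sampling has to tolerate a varying phase rather than a fixed one.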
Timing register configuration
All register configurations for external memory are done in U-Boot. The configuration file can be found in buildroot/project_build_armv5te/<project_name>/u-boot-1.3.4/include/configs/apf27.h and is saved in the armadeus tree at buildroot/target/device/armadeus/apf27/apf27.h.
Linux testing program
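A program for testing i.MX-FPGA communication speed is available in the armadeus tree in target/linux/debug/imx-fpga-test (http://armadeus.git.sourceforge.net/git/gitweb.cgi?p=armadeus/armadeus;a=tree;f=target/linux/debug/imx-fpga-test;h=c9582b3e2b47cff36cbc2aeab610f6c79f0919b3;hb=HEAD).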
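For quick experiments from user space, 16-bit accesses can also be sketched in Python. This is not the armadeus test program, only an illustration under stated assumptions: the physical base address 0xD6000000 for CS5 is my assumption and must be checked for your board, the helper names are hypothetical, and `struct` does not guarantee that each access reaches the bus as a single 16-bit transfer.

```python
import mmap
import os
import struct

FPGA_BASE = 0xD6000000  # assumption: physical base address of CS5
WINDOW = 0x2000         # bytes reachable with 12 word-address lines


def read16(mem, offset: int) -> int:
    """Read one little-endian 16-bit word from a buffer-like object."""
    return struct.unpack_from("<H", mem, offset)[0]


def write16(mem, offset: int, value: int) -> None:
    """Write one little-endian 16-bit word into a buffer-like object."""
    struct.pack_into("<H", mem, offset, value & 0xFFFF)


def map_fpga(base: int = FPGA_BASE, length: int = WINDOW):
    """Map the FPGA window through /dev/mem (requires root on the target)."""
    fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
    try:
        return mmap.mmap(fd, length, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE, offset=base)
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed
```

With the sample design shown earlier, offsets 0 and 2 would correspond to reg1 and reg2. The helpers accept any writable buffer, so they can be exercised on a `bytearray` without hardware.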