Particle Physics Group

Calice LDA Docs

(Version 1.2)

Firmware and Hardware information from Manchester Uni.



Comments and ideas are in this format of text throughout the document.







Table of Contents

Overview

Design Requirements

Hardware

HDMI Board

Gigabit Interface Board

Firmware

Overview of Firmware design.

DIF SERDES Design

LDA-DIF Link Design

LDA-ODR Link Design

Ethernet Version

S-Link Version

USB Link Design

LDA Register Space

LDA-DIF Link Registers. (BLOCK ADDRESS 0x1)

LDA-ODR Link Registers. (BLOCK ADDRESS 0x2)

LDA Status Registers. (BLOCK ADDRESS 0x3)

LDA Control Registers. (BLOCK ADDRESS 0x4)

LDA Packet Format

Fast command packet

Slow command packet

Data packet



Revision History

Version            Author   Comments
1.0 (30/4/08)      MPK      Initial design information and comments about the current design. Missing details in several sections.
1.1 (~July 2008)   MPK      New details for the logical LDA-DIF protocol, plus changes to the structures for ODR sent command sequences. Following the meeting in Camb on 30th June.
1.2 (Dec 2008)     MPK      Alterations to the DIF-LDA information, detailing the new Packet interface level.





Overview

Design Requirements

The LDA (Link Data Aggregation) board is an intermediate board in the envisioned DAQ system that allows multiple DIFs (Detector Interface Boards) to be connected to a single ODR (Off Detector Receiver). The downstream link, towards the ODR, is expected to run at around 1 Gigabit per second. The upstream links, towards the DIFs, are expected to be in the 50-100 Megabit per second range.

Current thinking is that the downstream link is based on Gigabit Ethernet, and all work has been done with that in mind. There are options to use a custom protocol instead, as the ODR is programmable and the LDA can have a different SERDES if needed. Upstream links are envisioned to be totally custom, and implemented in a synchronous SERDES design inside the FPGA. The possibility of using external SERDES devices is also kept in mind, in case things transition to higher speed fibre links to the DIFs.

Off the shelf prototyping is envisioned, so that requires a generic FPGA development board and some custom additional boards to handle the interfaces. This has pros and cons, but is our chosen path.

Hardware

The development board is a BroadDown2 from Enterpoint UK. This is a generic PCI form factor Spartan3-2000 board with large available IO. Additional boards are required for DIF and ODR interfaces.




HDMI Board

For the LDA-DIF link it was decided to go with commercially available cables, with high enough data capacity to handle any of the possible connection requirements. The easiest and most effective solution was to employ HDMI cables, due to their multiple LVDS compatible links, their size and their availability. To interface these to the BroadDown2 an adaptor board was designed and produced by Enterpoint. This has 10 HDMI connectors, and a small FPGA to do the signal driving.




The BroadDown2 FPGA drives a single clock into the HDMI FPGA, which then fans it out onto all links. Signals to/from the BroadDown2 are single ended, and the HDMI FPGA drives the DIF links with LVDS. Internal termination is used on all RX channels.

However, due to an oversight in the design review, the data channels are NOT AC coupled. This could be changed in a re-spin of the board. The FPGA on the HDMI board is the S3-400, which only allows 8 of the channels to be used; an S3-1000 is needed to enable all 10 channels. Those connectors marked in the photo are not 100% connected. A SATA connector is available on the top of the board, with a single ended input and a single ended output. These are currently unused. The current board is untested, and has only been powered up. A future re-spin of the board could make all 10 connectors work and add AC coupling. We need to get ownership of the full design files to allow us to do this in house.

Gigabit Interface Board

Since the form of the link was not finalised, it was decided to go with both Ethernet and a 2.5 Gigabit link from TI, the TLK2501 chip set. This was chosen as it is also used in various CERN DAQ systems, and the ODR's Gigabit SERDES blocks should be programmable to handle the protocol. The board also contains a simple Cypress FX2 USB interface, which is designed to be used when testing LDAs and DIFs on a bench without the full ODR structure.




The physical connection is done via SFP modules, and so could be either Fibre or CAT5 copper for the Ethernet connection. The TLK link is envisioned to be fibre only, although it would be possible to use straight copper Infiniband style cables.

The USB interface has an 8 bit data and 2 bit control connection to the BroadDown2 FPGA, and contains an embedded 8051 CPU which would need to be programmed during enumeration with any USB host. The board is mounted on 2 Samtec GZF high density connectors, which are 10x40 rows. These are not the cheapest method, and 10 connectors are required if we wish to use our 5 interface boards. Expected cost is ~20 GBP per connector.

Issues with the Gigabit interface board are mainly due to pin constraints. Because two link options are available, there were not enough pins to the BroadDown2 to wire everything. Some of the SERDES options are available via jumpers, and others have been hard wired to values. Future versions would of course only have 1 link option, and then these signals could be routed to the BroadDown2.

Firmware

Overview of Firmware design.




The firmware builds on existing work done for the Calice project when we were testing FPGA Ethernet drivers. The bulk of the Ethernet interface is made up of Xilinx CoreGen blocks and FIFOs associated with that work. The DIF link side is written from scratch and mainly targeted at Xilinx, using Xilinx CoreGen blocks for the 8B10B encoders/decoders. A modified Opencores.org decoder/encoder pair also exists, but is not as well understood as the CoreGen parts.

The following is how data received by the LDA (from the ODR) is processed. The Ethernet block contains a Xilinx Gigabit MAC and the Xilinx PCS layer (which includes the GMII to TBI interface, auto negotiation and the 8B10B encoders/decoders). The output from this is fed into a MUX that would allow the USB interface to also feed data into the design. The USB block would need to be coded by whoever wants to use it, and should basically output “Ethernet” style frames into the FIFO in the same manner the Ethernet core does, so that the downstream logic does not need to be changed.

After that there is a “Fast Decoder” that splits off packets that have a packet type defined as containing fast command frames. These are sent to a simple block that reads the packet header before sending the data over the clock crossing boundary to the DIF state machines. This allows us to define 16 bits that can be passed to the DIF in a kind of bypass of normal data. It is envisioned that at the DIF state machines these would be encoded into the data stream via a short cut/fast path route giving minimal latency.

Non “Fast Decoder” packets are queued into an RX FIFO block where they await processing by the main LDA state machine. Packets that contain data to forward to the DIF would be examined and the data sent to the correct DIF state machine for sending. If a reply is expected, then the state machine would wait for the DIF to reply, and then forward that back up the chain to the ODR.

Packets that deal with local LDA registers would be processed by decoding them, setting or getting the correct register data and then filling a reply packet to be sent to the ODR. Packets involving Event data would be handed to the Event readout state machine, which would access the memory and send the correct data to the ODR, or fill the memory with data in the case of a test scenario. This Event readout state machine could be separate from the main one, and have a special port to the TX packet buffer giving it priority to send data. We did this in our V4FX Ethernet testing and it worked.

Data coming from the LDA state machine to the DIF state machines needs to cross a clock boundary. How this is done still needs to be decided. One option is an asynchronous FIFO feeding a 1:10 MUX on the DIF side of the boundary. Alternatively we could have 10 asynchronous FIFOs and put the 10:1 MUX on the LDA side of the boundary.

The DIF event filters pull any event data off to the side before it hits the FIFOs and store it into memory, again using some kind of 10:1 MUX to buffer into an asynchronous FIFO that is then read and pushed into memory (in the same clock domain as the LDA). This memory controller has 2 ports, with the second used to read out any data (or to store test data).

Possible issues with regard to the amount of block memory needed are envisioned. If we run out of memory we may need to go to a different design, in which more and more logic is put into the same clock domain. Ideally we want to run in the Gigabit domain, as that way we are not causing back pressure on the LDA-ODR link.

The LDA state machine needs to be written. The Readout state machine can be based on the work we did with the V4FX, but will need to be re-written. The DIF state machines need to be written from scratch, and the Event filters can be based on the Ethernet packet filter block with some minor changes. The memory controller MUX can be based on the SuperNemo design, which has a 30:1 MUX in 2 stages; we could use just 1 stage of that to do the work. It doesn't feed an asynchronous FIFO, but that is easily changed. The USB interface is not really my (Manchester's) concern, as we are planning on using Ethernet to talk to the prototypes in testing.



DIF SERDES Design

The DIF SERDES is done entirely inside the FPGA, and runs off 2 clocks. The first is the bit clock, and the second is a 90 degree shifted version of the bit clock used in the Phase Recovery Module. This module allows us to ignore the incoming phase difference in the data, as it recovers the data using DDR registers and the shifted clock. This is required because, although the TX is synchronous with the clock we are sending to the DIF, we do not receive a clock back from the DIF with which to clock the return data. The design was taken from a Xilinx Application Note, and seems to work well.


The bit clock is envisioned to be in the 40-100 MHz range, with the TX_VLD line being used to determine which of the cycles is used to latch and define the 20:1 divider that forms the “word” clock. The RX valid is determined by the 8B10B decoder/Sync block, which locks onto valid 8B10B symbols and so defines the RX “word” clock by asserting RX_VLD. The decoder also has logic to determine if data is corrupt or not. The Signal Detect simply looks for periods when there are no bit transitions on the line, indicating that the DIF is not connected or not powered, since a working DIF would at the very least be sending a constant stream of IDLE words.

The Serial TX Delay block exists to allow us to fine tune the delay on transmission, to take into consideration any cable length delays etc. This is used in conjunction with the Calc Trip Delay block, which measures round trip times. By adjusting the delays you can arrange for all DIFs to receive data at the same time over different path lengths. The TX data is compared to the RX data, and the bit shift delay computed. This requires the DIF to be in serial loop back mode, which is entered by sending a sequence of custom commands over the link. LDA loop back can be enabled via a signal, and is not controllable remotely.
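The comparison of TX and RX data described above can be sketched in C as a search for the bit shift at which the looped-back stream matches what was sent. This is only an illustration of the idea, not the firmware implementation: the function name, the one-bit-per-byte buffers and the max_delay bound are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: find the round trip delay in bit clocks.
 * tx and rx hold one bit per array element (0 or 1).
 * Returns the smallest shift d such that rx[d + i] == tx[i] for every
 * transmitted bit, or -1 if no shift up to max_delay matches. */
int round_trip_delay(const uint8_t *tx, size_t tx_len,
                     const uint8_t *rx, size_t rx_len,
                     size_t max_delay)
{
    assert(tx != NULL && rx != NULL && tx_len > 0);

    for (size_t d = 0; d <= max_delay && d + tx_len <= rx_len; d++) {
        size_t i = 0;
        while (i < tx_len && rx[d + i] == tx[i])
            i++;
        if (i == tx_len)
            return (int)d;   /* every transmitted bit matched at shift d */
    }
    return -1;               /* no match within the search window */
}
```

A random transmit pattern matters here: a repetitive pattern would match at several shifts and the smallest one would be reported, which may not be the true delay.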

The DIF SERDES is almost the same as the LDAs, with some minor changes to the RX path, as the DIF does not need to adjust for unknown phase delays, so does not need the 90degree adjusted clock.

The current SERDES assumes that the bit clock used to send the data back from the DIF will be the same one we sent with the TX data. We also assume that all logic in the parallel, “word” clock region of the SERDES is running at the same rate as the bit clock, and that the valid signals are just 1 clock period enable lines. This makes the design easier. We could of course put a DCM into the SERDES, but it's a waste of resources.

LDA-DIF Link Design
Physical Hardware design

The link is designed to provide a self contained module that allows the “application” firmware to ignore most of the internals. It contains various state machines that perform operations on power up and during running.

The data is presented on a 16 bit, 2 byte wide interface. For transmission it is 8B10B encoded, and the application firmware can specify whether bytes are “K” Control characters or not. A “ready” signal is sent to the user firmware 1 clock cycle before new data is required. If the application does not present data at the next clock with a data valid qualifier, then an “Idle” sequence will be transmitted to keep the link alive.

After power up the link attempts to train and synchronise with its partner. This is done in several stages.

  • LDA holds line idle for 128 bit clocks, causing any DIF to reset its state machine.

  • LDA transmits 32 IDLE sequences to let the DIF obtain bit lock.

  • If the link is set to come up, then the LDA switches to transmitting the LINK_UP sequence. Otherwise it keeps sending IDLE. If the link is set to do a round trip delay calculation then it will transmit the LOOPBACK sequence.

  • If, while sending LINK_UP, we see 3 LINK_ACKs, we then wait for the DIF to begin sending the IDLE sequence, indicating it is locked and running. Once this happens, control of the link TX is handed over to the user firmware.

  • If we are sending the LOOPBACK sequence and receive confirmation that the remote is looped back, we begin sending random data and perform the calculation for delay times. To exit, we transmit the LOOP_EXIT sequence and re-initialize the link from the start. This will clear the DIF and bring it back up in normal mode.

  • At any time during normal operation a detected loss of signal will trigger a reinitialization of the link. A DIF that loses sync will stop sending data for long enough to make the LDA go into reinitialization as well.

  • During normal operation the round trip delay calculation can be performed at any time.

  • During running, the low level bit sync state machine monitors for Disparity and Code errors. If more than 3 occur in a row it flags the link as down and signals the start of link reinitialization.
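The bring-up stages listed above can be sketched as a small state machine in C. This is a minimal model for discussion rather than the VHDL that runs in the LDA: the state names, the input flags and the idea of one link_step() call per counted event (a bit clock, an IDLE sequence or a received symbol, depending on the state) are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state names; the document does not name the states. */
typedef enum {
    LINK_RESET,      /* line held idle for 128 bit clocks              */
    LINK_SEND_IDLE,  /* 32 IDLE sequences so the DIF obtains bit lock  */
    LINK_WAIT_MODE,  /* keep sending IDLE until a mode is requested    */
    LINK_SEND_UP,    /* sending LINK_UP, counting LINK_ACKs            */
    LINK_WAIT_DIF,   /* 3 ACKs seen, waiting for the DIF to send IDLE  */
    LINK_RUNNING,    /* TX handed over to the user firmware            */
    LINK_LOOPBACK    /* round trip delay measurement in progress       */
} link_state;

typedef struct {
    link_state state;
    unsigned   count;  /* bit clocks, IDLE sequences or ACKs seen so far */
} link_fsm;

/* Inputs sampled once per step; all field names are assumptions. */
typedef struct {
    bool want_up;          /* link is configured to come up               */
    bool want_loopback;    /* round trip delay calculation requested      */
    bool rx_link_ack;      /* a LINK_ACK was received                     */
    bool rx_idle;          /* the DIF is sending IDLE                     */
    bool rx_looped;        /* the DIF confirmed loop back                 */
    bool loop_done;        /* delay calculation finished                  */
    bool signal_lost;      /* signal detect dropped                       */
    bool too_many_errors;  /* more than 3 code/disparity errors in a row  */
} link_inputs;

void link_step(link_fsm *l, const link_inputs *in)
{
    assert(l && in);

    /* During normal running, loss of signal or repeated errors restart training. */
    if (l->state == LINK_RUNNING && (in->signal_lost || in->too_many_errors)) {
        l->state = LINK_RESET;
        l->count = 0;
        return;
    }

    switch (l->state) {
    case LINK_RESET:
        if (++l->count >= 128) { l->state = LINK_SEND_IDLE; l->count = 0; }
        break;
    case LINK_SEND_IDLE:
        if (++l->count >= 32) { l->state = LINK_WAIT_MODE; l->count = 0; }
        break;
    case LINK_WAIT_MODE:
        if (in->want_loopback)  l->state = LINK_LOOPBACK;
        else if (in->want_up)   l->state = LINK_SEND_UP;
        break;
    case LINK_SEND_UP:
        if (in->rx_link_ack && ++l->count >= 3) { l->state = LINK_WAIT_DIF; l->count = 0; }
        break;
    case LINK_WAIT_DIF:
        if (in->rx_idle) l->state = LINK_RUNNING;
        break;
    case LINK_RUNNING:
        if (in->want_loopback) l->state = LINK_LOOPBACK;
        break;
    case LINK_LOOPBACK:
        /* After the measurement, LOOP_EXIT is sent and training restarts. */
        if (in->rx_looped && in->loop_done) { l->state = LINK_RESET; l->count = 0; }
        break;
    }
}
```

Restricting the signal-loss check to the running state is a choice made for this sketch, since the DIF is not expected to be transmitting while the LDA is still holding the line idle.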

Status registers accessible by the LDA state machine allow monitoring and controlling of the link. A simple 3 bit status line allows the user firmware to monitor the link status in real time, so that it can determine when it has access to the TX and RX data.

The firmware block for the links is as follows:

entity lda_dif_links is

port (

clk : in std_logic; -- Main serial clock in

rst : in std_logic; -- Main serial rst in


-- Serial LDA_DIF links

dif_serial_in : in std_logic_vector(MAX_DIFS downto 1);

dif_serial_out : out std_logic_vector(MAX_DIFS downto 1);

dif_serial_clk : out std_logic; -- Master clock out

-- DIF status lines.

dif_link_status : out link_status_array;


-- DIF parallel links. MAX_DIFS things in an array

dif_parallel_tx : in link_data_array;

dif_parallel_tx_k : in link_misc_array;

dif_parallel_tx_en : in std_logic_vector(MAX_DIFS downto 1);

dif_parallel_tx_rdy : out std_logic_vector(MAX_DIFS downto 1);

dif_parallel_rx : out link_data_array;

dif_parallel_rx_k : out link_misc_array;

dif_parallel_rx_valid : out std_logic_vector(MAX_DIFS downto 1);

dif_parallel_rx_code_err : out link_misc_array;

dif_parallel_rx_disp_err : out link_misc_array;

-- Management interface


host_data_in : in std_logic_vector(15 downto 0);

host_data_out : out std_logic_vector(15 downto 0);

host_data_add : in std_logic_vector(5 downto 0); -- 64 regs

host_wr : in std_logic;

host_rd : in std_logic

);

end entity lda_dif_links;



The DIF SERDES block contains a similar structure to this, and has internally the corresponding link negotiating state machines:



entity dif_module is

port (

serial_clk : in std_logic;

serial_rst : in std_logic;

dif_link_restart : in std_logic; -- Re neg the link.

dif_link_status : out std_logic_vector(2 downto 0); -- See above.

clk5 : out std_logic;

clk5_stb : out std_logic_vector(19 downto 0);

-- Serial Side

dif_serial_in : in std_logic;

dif_serial_out : out std_logic;

-- Local side of things.

dif_parallel_tx : in std_logic_vector(15 downto 0);

dif_parallel_tx_k : in std_logic_vector(1 downto 0);

dif_parallel_tx_en : in std_logic;

dif_parallel_tx_rdy : out std_logic;

dif_parallel_rx : out std_logic_vector(15 downto 0);

dif_parallel_rx_k : out std_logic_vector(1 downto 0);

dif_parallel_rx_valid : out std_logic;

dif_parallel_rx_code_err : out std_logic_vector(1 downto 0);

dif_parallel_rx_disp_err : out std_logic_vector(1 downto 0)

);

end entity dif_module;




On the LDA it is assumed that the logical-link layer will interface to this to provide the packet based interface, with a layer that allows for the Fast Commands to be interfaced. Here there also exists a clk5, which is a 1:20 divided bit clock that is synchronised to the word clock. The block also exports 20 strobe lines that can be used for timing purposes by the DIF. These should not be used during link changes, only once the link is confirmed as being established.

Logical Link Protocol

The LDA-DIF link is based on a packet system, where the LDA forwards data packets to the DIF without having to do any detailed decoding of them internally. The ODR/Run control software is expected to build the correct packets for the type of DIF that is connected. The ODR-LDA packet format is shown in the LDA Packet Format section.

The links are 8B10B encoded, which allows the usage of “K-Codes” to denote special functions; see any good reference on 8B10B for a full description. Several of these are pre-defined, and are used by the link protocol state machines to control the link. Others will be used to control the link protocol (partially borrowed from the Gigabit Ethernet specification).

K Character   Function
K28.0         SYNCCMD. Signals that the next symbol is a SYNCCMD. Used to retime the /20 clock at the DIF.
K28.3         COMMAND. Signals that the next symbol is a COMMAND. Used to transfer FAST Commands to the DIF.
K28.4         NEIGHBOURCMD. Fast Command for the neighbour DIF.
K28.5         Comma Character, used in the control of the low level link protocol. Should never be sent by User-Logic.
K28.7         Not used, EVER. Can cause issues if mixed with other symbols, and form a “fake” synchronisation sequence.
K27.7         Signals Start of DATA Frame. (/S/)
K29.7         Signals End of DATA Frame. (/T/)
K23.7         Carrier Extend. Used to PAD the end of a data frame out to an even number of Symbols, so that the next frame or IDLE sequence starts on an even footing. (/R/)



Several special sequences of symbols are defined, known as Ordered Sets, some of which are used by the low level state machines to handle the link. By default the link will, when idle, be sending/receiving IDLE sequences.

Set           Sequence           Comment
/I1/          /K28.5/D5.6/       Idle sequence, sent when the running disparity is +, flips it to -. Sent automatically.
/I2/          /K28.5/D16.2/      Idle sequence, sent when the running disparity is -, maintains it as -. Sent automatically.
/EPD/         /T/R/ or /T/R/R/   Used to end a data frame. The extra /R/ is used to pad things out to an even number.
/LOOP/        /K28.5/D12.6/      Low-level link loop back start. (DON'T SEND from User-Logic)
/ENDLOOP/     /K28.5/D16.7/      Low-level link loop back end. (DON'T SEND from User-Logic)
/LINKSTART/   /K28.5/D1.4/       Link Start. (DON'T SEND from User-Logic)
/LINKACK/     /K28.5/D30.3/      Link Start ACK. (DON'T SEND from User-Logic)
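For readers less familiar with the 8B10B naming, the following C sketch shows how a code group name such as D5.6 or K28.5 maps to its 8 bit value before encoding: the number before the dot is the low 5 bits and the number after the dot is the high 3 bits. The function names are made up for the example, and placing the K character in the lower byte of the 16 bit word is an assumption carried over from the SYNCCMD/COMMAND layout described in this document; the ordered sets may be arranged differently in the actual firmware.

```c
#include <assert.h>
#include <stdint.h>

/* 8 bit value of code group Dx.y or Kx.y before 8B10B encoding:
 * x is the low 5 bits (0..31), y is the high 3 bits (0..7).
 * Whether the byte is a D or a K character is carried separately
 * (for example on the tx_k lines). */
uint8_t code_group(unsigned x, unsigned y)
{
    assert(x < 32 && y < 8);
    return (uint8_t)((y << 5) | x);
}

/* Hypothetical helper: one /K28.5/Dx.y/ ordered set as a 16 bit word,
 * assuming the K character sits in the lower 8 bits. */
uint16_t ordered_set(unsigned dx, unsigned dy)
{
    return (uint16_t)((code_group(dx, dy) << 8) | code_group(28, 5));
}
```

With this mapping K28.5 is 0xBC, D5.6 is 0xC5 and D16.2 is 0x50, so ordered_set(5, 6) gives 0xC5BC for /I1/ under the stated byte order assumption.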



SYNCCMD and COMMAND

These are special command words that can be sent at any time, even during a data packet. If they occur during a data packet, then the packet is paused for the duration of the command's transfer and then continues. Any Data Packet CRC checking will need to be paused, as the command will not be part of it.

The commands consist of a single K Character followed by a single data character. These will always occur such that the K is in the lower 8 bits of the SERDES word, and the data character is in the upper 8 bits.

For the SYNCCMD the data is arranged so that 5 bits represent an “offset” into the normal word clock and 3 bits represent the command. Currently, there is only 1 command.



SYNCCMD = /K28.0/<[0:2 Command ][3:7 Offset ]>/



For the COMMAND the data is 6 bits, with 2 bits to indicate some routing status. Broadcast set means it should be passed along to the Neighbour DIF, as well as being acted upon by the receiving DIF. Neighbour set means that the receiving DIF should not act upon it, but pass it along. If neither is set, then the receiving DIF acts upon it alone.



COMMAND = /K28.3/<[0:5 Command][6 Broadcast][7 Neighbour]>/
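The two layouts above can be sketched in C as follows. This is an illustration only: it assumes that bit 0 in the notation is the least significant bit of the data character, and the function names are invented for the example. The K character values (K28.0 = 0x1C, K28.3 = 0x7C) follow from the standard 8B10B naming.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define K28_0 0x1Cu  /* SYNCCMD marker */
#define K28_3 0x7Cu  /* COMMAND marker */

/* Build the 16 bit SERDES word for a SYNCCMD: K character in the lower
 * 8 bits, data character in the upper 8 bits.
 * Assumed data layout: bits 0..2 command, bits 3..7 offset. */
uint16_t make_synccmd(unsigned command, unsigned offset)
{
    assert(command < 8 && offset < 32);
    uint8_t data = (uint8_t)((offset << 3) | command);
    return (uint16_t)((data << 8) | K28_0);
}

/* Build the 16 bit SERDES word for a COMMAND.
 * Assumed data layout: bits 0..5 command, bit 6 broadcast, bit 7 neighbour. */
uint16_t make_command(unsigned command, bool broadcast, bool neighbour)
{
    assert(command < 64);
    uint8_t data = (uint8_t)(command | (broadcast ? 0x40u : 0u)
                                     | (neighbour ? 0x80u : 0u));
    return (uint16_t)((data << 8) | K28_3);
}

/* Routing on the receiving DIF: it acts on the command unless the
 * neighbour bit is set, and forwards it if either routing bit is set.
 * The text does not say what happens when both bits are set; this
 * sketch lets the neighbour bit win. */
bool dif_acts_on(uint16_t word)  { return ((word >> 8) & 0x80u) == 0; }
bool dif_forwards(uint16_t word) { return ((word >> 8) & 0xC0u) != 0; }
```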


Further ideas about this mechanism are welcome. It needs to be tested on a prototype to make sure it's stable and does what is needed.

DIF Firmware Block

The DIF side of the link contains a similar physical layer, although it does not have the same phase recovery logic, as this is not needed on the RX on the DIF. A logical packet layer is then wrapped around this, which is what the DIF firmware is expected to interface to. The firmware for the DIF side can be found in CVS.


entity dif_module is

port (

serial_clk : in std_logic;

serial_rst : in std_logic;

dif_link_restart : in std_logic;

dif_link_status : out std_logic_vector(2 downto 0);

-- Serial Side

dif_serial_in : in std_logic;

dif_serial_out : out std_logic;

-- Local side of things, from MAC layer

rx_data : out std_logic_vector(15 downto 0);

rx_data_valid : out std_logic;

rx_start_frame : out std_logic;

rx_good_frame : out std_logic;

rx_bad_frame : out std_logic;

tx_data : in std_logic_vector(15 downto 0);

tx_data_valid : in std_logic;

tx_end_frame : in std_logic;

tx_underrun : in std_logic;

tx_ack : out std_logic;

-- Fast Command Frames

fast_cmd_comma : out std_logic_vector(7 downto 0);

fast_cmd_data : out std_logic_vector(7 downto 0);

fast_cmd_strobe : out std_logic;

-- SlowClock stuff

slowclk : out std_logic;

slowclk_resync : out std_logic;

slowclk_stb : out std_logic_vector(19 downto 0)

);

end entity dif_module;



Fast Commands are processed and the comma and data values are output, qualified with the fast_cmd_strobe signal. For clock retimings, the slowclk_resync signals when a retiming event has happened. During the retiming sequence slowclk is held low, and the slowclk_stb lines are held static as well.

RX data is started by a low to high transition of the rx_data_valid strobe, together with a high value on rx_start_frame. Each word is then received on subsequent high transitions of rx_data_valid. The end of packet is signalled by either rx_good_frame or rx_bad_frame, which is determined by the result of the CRC16 that has been performed on the received data. The actual CRC word is also passed on.

TX data frames are initiated by presenting data and transitioning tx_data_valid to high. The internal state machine will then wait for the appropriate timing before asserting tx_ack, which acknowledges that the word has been processed and that the next word can be presented. This continues until the last word, which is presented with tx_end_frame high. You can hold tx_data_valid high during the whole packet, and just change the data on each pulse of tx_ack. Holding tx_underrun high can be used to abort the packet, for instance if a buffer has underrun. DIF firmware does NOT need to append the CRC16 to the data; the DIF link firmware will do this automatically when tx_end_frame is asserted. A fake CRC is added when the packet is aborted, so that the LDA will ignore the packet.

LDA-DIF Data Packet

This is defined by the DIF group, and follows a packet based system. The document is in progress and will be available from them. It is envisioned that the LDA will simply pass over any data packets from the ODR to the DIF. However, it might require ACK/NAK processing, which would come back down the link, and this needs to be thought about.

LDA-DIF Trigger Line

We can use the line to signal a simple asynchronous pulse to the DIF, probably used as a trigger or similar. The lines are AC coupled, so something more complex than a simple 0-1 transition will be used. This is likely to be done entirely on the HDMI interface board and not involve the LDA at all. If it does involve the LDA, the LDA will be a slave to the HDMI board and not generate the pulse itself. A “FULL/BUSY” line is also envisioned as coming back, and will basically be the reverse of the trigger. Again the LDA will probably just get a single copy of it, and the HDMI FPGA will OR together all channels and output the result direct to the CCC box.

LDA-ODR Link Design
Ethernet Version

The initial design uses Ethernet as the LDA-ODR transport layer, driving RAW Ethernet packets. This was chosen based on previous work done using Ethernet and was deemed easiest to carry over into the LDA design. It also allows the LDA to be used on normal Ethernet switches to aggregate multiple LDAs. By choosing Ethernet it also allows the possibility of using a cheap off the shelf Gigabit Ethernet card as the ODR receiver.

Since we are designing for a Xilinx S3 FPGA the cores used are from Xilinx. A 1000Base-X PCS layer handles the Ten Bit Interface (TBI) to GMII link, and deals with the 8B10B encoding, link auto negotiation and other low level physical layer requirements.


The Gigabit MAC is connected to a standard GMII interface, and could in theory be replaced by any MAC module that has such a connection. The Xilinx one is best suited as it will work with the least changes. The RX elastic buffering is done in the PCS block, so the MAC runs entirely off a single 125 MHz reference (the TX GTX clock), and requires no RX reference clock.

Data from the MAC block is then fed into the Fast Decode MUX, which extracts any packets destined for the Fast Decoder logic. Any other packets are pushed into the TX/RX FIFO blocks for readout by the LDA's main state machine. For TX, the MAC will automatically empty any packets that are in the TX FIFO.

We may put in a MUX here to allow event data to take priority over other data, like we did in the V4FX Ethernet, it will drop in as it uses the same interface protocols.

The link's auto negotiation is handled by a separate state machine that sits on the MAC's local host interface and drives the various registers to bring the link up. Once the link is available, then the block allows the LDA state machine access to the registers to set/get various things. The MAC also has a statistics output, which could be used to get various metrics that the LDA could send back to the monitoring software.

Currently some way to store a unique MAC address per board is needed. One possibility is the CPLD(s) mounted on top. A 16 bit serial number could be stored in those, allowing us to generate a number for the MAC's lower 2 bytes. There do not seem to be any other options for this at present, apart from custom firmware per board, which we want to avoid. Programming the CPLD is a one time event, and should be easy.
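As a rough sketch of the idea, the C function below builds a 6 byte MAC address from a 16 bit board serial number. The 4 byte prefix is a placeholder chosen for the example, not an address allocated to this project; its first byte, 0x02, marks the address as locally administered and unicast. The function name and the byte order of the serial number are also assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Placeholder prefix: 0x02 sets the "locally administered" bit and keeps
 * the multicast bit clear. The other three bytes are arbitrary. */
static const uint8_t LDA_MAC_PREFIX[4] = { 0x02, 0x00, 0x00, 0x00 };

/* Hypothetical helper: fill mac[0..5], using the 16 bit serial number
 * (for example read from the CPLD) as the lower 2 bytes. */
void lda_mac_from_serial(uint16_t serial, uint8_t mac[6])
{
    assert(mac != NULL);
    memcpy(mac, LDA_MAC_PREFIX, sizeof LDA_MAC_PREFIX);
    mac[4] = (uint8_t)(serial >> 8);    /* high byte of the serial number */
    mac[5] = (uint8_t)(serial & 0xFF);  /* low byte of the serial number  */
}
```

A 16 bit serial number gives 65536 distinct addresses under one prefix, which is far more than the number of boards envisioned.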

S-Link Version

A version using the TLK2501 chip set is possible, since the ODR can handle the 8B10B encoded data it produces. It runs the link at 20x the GTX clock of 125 MHz, which is a 2.5 Gbit/s line rate, or 2 Gbit/s of data once the 8B10B overhead is removed. We would therefore need to slow the data rate down a little, as the LDA could not cope with 2 Gigabit per second of data.

USB Link Design

There is no plan to do the USB interface in Manchester at present. However it is envisioned that it will output something that resembles an Ethernet packet into the MUX block. That way the down stream code doesn't need to change. The USB host PC could use a pretty simple driver that just dumps data blocks into the USB's memory and the embedded code copies it over to the FPGA and streams it into the buffers. The reverse for readout.

LDA Register Space

The LDA has several internal registers used to configure and control its operation. They can be broken down into the following groups:



LDA-DIF Link Registers. (BLOCK ADDRESS 0x1)

Register Name                    Address
LDA_DIF_REG_SERIAL_TX_EN         0x000
LDA_DIF_REG_SERIAL_RX_EN         0x002
LDA_DIF_REG_LINK_RTT             0x004
LDA_DIF_REG_LINK_PAUSE           0x006
LDA_DIF_REG_LINK_RESTART         0x008
LDA_DIF_REG_LINK_STATUS1         0x00A
LDA_DIF_REG_LINK_STATUS2         0x00B
LDA_DIF_REG_LINK_NOSIG           0x00E
LDA_DIF_REG_LINK_LOCKED          0x022
LDA_DIF_REG_SERIAL_DELAY1        0x010
LDA_DIF_REG_SERIAL_DELAY2        0x011
LDA_DIF_REG_SERIAL_DELAY3        0x012
LDA_DIF_REG_SERIAL_DELAY4        0x013
LDA_DIF_REG_SERIAL_RTTDEL1       0x018
LDA_DIF_REG_SERIAL_RTTDEL2       0x019
LDA_DIF_REG_SERIAL_RTTDEL3       0x01A
LDA_DIF_REG_SERIAL_RTTDEL4       0x01B
LDA_DIF_REG_SERIAL_RTTDEL_DONE   0x020
LDA_DIF_REG_SERIAL_REG_DCM       0x024


LDA-ODR Link Registers. (BLOCK ADDRESS 0x2)
LDA Status Registers. (BLOCK ADDRESS 0x3)
LDA Control Registers. (BLOCK ADDRESS 0x4)



If we assume a 16 bit address space, then we can use the upper 4 bits to define which block we are talking to. This allows us to expand to more blocks in future. It also gives a 12 bit lower address space for each block, or 4096 addresses. For now we assume that registers are usually 32 bits wide; any that are narrower will have their upper bits zeroed on reads and ignored on writes. This gives a good mapping to the existing registers inside the MAC, for instance. Registers that are 16 bits wide will just ignore the upper bits, and we will deal with that in firmware.
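The address split described above can be sketched in C as follows. The block numbers are taken from the block addresses given in the section headings; the enum and function names are invented for the example and are not part of any existing LDA software.

```c
#include <assert.h>
#include <stdint.h>

/* Block numbers from the "BLOCK ADDRESS" values in the headings above. */
enum lda_block {
    LDA_BLOCK_DIF_LINK = 0x1,
    LDA_BLOCK_ODR_LINK = 0x2,
    LDA_BLOCK_STATUS   = 0x3,
    LDA_BLOCK_CONTROL  = 0x4
};

/* Upper 4 bits of the 16 bit address select the block. */
unsigned lda_addr_block(uint16_t addr)  { return addr >> 12; }

/* Lower 12 bits are the register offset inside the block (0..4095). */
unsigned lda_addr_offset(uint16_t addr) { return addr & 0x0FFFu; }

/* Compose a full address from a block number and a register offset. */
uint16_t lda_make_addr(unsigned block, unsigned offset)
{
    assert(block < 16 && offset < 4096);
    return (uint16_t)((block << 12) | offset);
}

/* A 16 bit register read through the 32 bit view: upper bits read as zero. */
uint32_t lda_read16_as32(uint16_t reg_value) { return (uint32_t)reg_value; }
```

For example, LDA_DIF_REG_LINK_STATUS1 at offset 0x00A in block 0x1 would have the full address 0x100A under this scheme.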

LDA Packet Format

The basic format is RAW Ethernet, the Ethernet_II style frame, or DIX. Linux will allow you to talk this frame type directly without problems, providing your application has root permissions to open raw sockets.

Field                 Size
Destination Address   6 Bytes
Source Address        6 Bytes
Packet Type           2 Bytes
Data                  46-1500 Bytes
FCS                   4 Bytes (Not usually added by user.)

On power up the LDA could generate a packet that is sent to the broadcast address to announce its presence. This will allow any switches in the link to at least learn its MAC address, and any directly connected ODR to know its address as well.

Currently there are a few defined Packet Types.

Packet Type    Value
Fast Command   0x0809
Slow Command   0x0810
Data Packet    0x0811
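A receiver can tell these packet types apart by looking at the 2 byte Packet Type field, which follows the two 6 byte addresses in an Ethernet II frame and is sent most significant byte first. The C sketch below illustrates this; the enum and function names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Packet type values listed above. */
enum lda_ethertype {
    LDA_TYPE_FAST = 0x0809,
    LDA_TYPE_SLOW = 0x0810,
    LDA_TYPE_DATA = 0x0811
};

/* Return the Packet Type field of a raw Ethernet II frame (bytes 12 and 13,
 * big-endian on the wire), or 0 if the buffer is too short to hold a header. */
uint16_t frame_packet_type(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < 14)
        return 0;
    return (uint16_t)((frame[12] << 8) | frame[13]);
}

/* True for frames that should be split off to the Fast Decoder. */
int is_fast_command(const uint8_t *frame, size_t len)
{
    return frame_packet_type(frame, len) == LDA_TYPE_FAST;
}
```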

Fast Commands are special: they are processed by the LDA very quickly, and consist of just a 2 byte bit mask that is used to signal special events to the LDA. They are expected to be sent by the ODR to accomplish timing related matters. This is the mechanism by which the ODR can replicate/initiate the LDA-DIF special link commands mentioned above in the LDA-DIF link protocol. They will never be sent by the LDA to the ODR.

Fast command packet

The basic C style definition of the packet is as follows:

struct fast_packet {
    struct ethhdr ethernet_header; /* Defined in linux/if_ether.h */
    u_int16_t fast_cmd;            /* Simple identifier */
    u_int8_t fast_dif;             /* 8 Bit DIF/HDMI port ID */
    u_int8_t fast_comma;           /* the COMMA flags, 8 bits of status data */
    u_int8_t fast_flags;           /* the flags */
    u_int16_t fast_parity;         /* parity data */
    u_int8_t fast_pad[44];         /* Padded to min packet size */
} __attribute__((packed));


The fast_pad is there to make sure we fill to a valid Ethernet size, and is ignored by the LDA. The fast_cmd field is currently defined as being 0xFA57, for the simple reason that it looks like “fast”. fast_dif is an 8 bit field that selects the DIF connection to send the data via. A value of 0xFF indicates that it should be sent to all DIF connections. fast_comma and fast_flags are the 8 bit fields sent as the comma and data to the DIFs in the 8B10B encoded words. fast_parity is a simple parity check on the fast_cmd, fast_dif, fast_comma and fast_flags fields. The lower 5 bits of the parity field correspond to the following:

Bit 0    Parity of Command's Low Byte
Bit 1    Parity of Command's High Byte
Bit 2    Parity of DIF Byte
Bit 3    Parity of comma Byte
Bit 4    Parity of the flags Byte

We are using an Even Parity scheme in everything here.
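A minimal sketch of how the parity field could be computed is given below. It assumes the usual even parity convention, where each parity bit is the XOR of the bits of the byte it covers, so that the byte plus its parity bit always holds an even number of ones. The function names are hypothetical, and the upper bits of fast_parity are simply left at zero since they are not defined above.

```c
#include <stdint.h>

/* XOR of all 8 bits: 1 if the byte has an odd number of ones, so that
   byte plus parity bit together contain an even number of ones. */
static unsigned byte_parity(uint8_t b)
{
    b ^= b >> 4;
    b ^= b >> 2;
    b ^= b >> 1;
    return b & 1u;
}

/* Build the fast_parity field from the other fast packet fields,
   using the bit assignment from the table above. */
static uint16_t fast_parity_bits(uint16_t cmd, uint8_t dif,
                                 uint8_t comma, uint8_t flags)
{
    uint16_t p = 0;
    p |= (uint16_t)(byte_parity((uint8_t)(cmd & 0xFFu)) << 0); /* cmd low  */
    p |= (uint16_t)(byte_parity((uint8_t)(cmd >> 8))    << 1); /* cmd high */
    p |= (uint16_t)(byte_parity(dif)                    << 2); /* DIF      */
    p |= (uint16_t)(byte_parity(comma)                  << 3); /* comma    */
    p |= (uint16_t)(byte_parity(flags)                  << 4); /* flags    */
    return p;
}
```

For example, with fast_cmd 0xFA57 the low byte 0x57 has five bits set and the high byte 0xFA has six, so bit 0 of the parity field is 1 and bit 1 is 0.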

Slow command/data packet

struct lda_packet {
    struct ethhdr ethernet_header; /* Defined in linux/if_ether.h */
    u_int16_t lda_packettype;
    u_int16_t lda_typemodifier;
    u_int16_t lda_pktID;
    u_int16_t lda_data_length;
    u_int8_t lda_packetdata[];
} __attribute__((packed));


The slow command packet is the most commonly used one, and it is processed internally by the main LDA state machine. The current provisional lda_packettype values are:

  0x0001 /* Register Write */
  0x0002 /* Register Read */
x 0x0003 /* Write ACK */
x 0x0004 /* Read Reply */

  0x0101 /* DIF Write, so pass on packet to correct link */
  0x0102 /* DIF Read, so pass on packet to correct link, and send reply back to ODR */
x 0x0103 /* DIF Write ACK */
x 0x0104 /* DIF Read Reply */
x 0x0105 /* DIF Read NAK */

  0x0201 /* Write to Event buffer */
  0x0202 /* Read from Event buffer */
x 0x0203 /* Write ACK */
x 0x0204 /* Read Reply, with event data. Will be inside a DATA Packet */
x 0x0205 /* Write NAK */


Types that are marked with “x” are those which only the LDA sends. It will not process any packets it receives with those types.

lda_pktID is used to track replies etc., and as a simple way to track packets.

lda_data_length is the total number of bytes used in the lda_packetdata[] array.

lda_typemodifier is used to determine some specific things, such as on DIF commands it specifies which DIF we are talking to/from.

For register writes it gives the number of registers we are going to access in the packet. When doing register accesses to LDA registers, the packet data contains one or more of the following structure:

struct lda_register {
    u_int16_t address; /* the generic address we need */
    u_int32_t data;    /* the data for the address */
} __attribute__((packed));
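To illustrate how several lda_register entries could be laid out back to back in lda_packetdata[], here is a small serialisation sketch. The byte order on the wire is not stated in this document, so network byte order (most significant byte first) is assumed here purely for illustration. The function and constant names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Packed size of one lda_register entry: 2 byte address + 4 byte data. */
#define LDA_REG_ENTRY_SIZE 6u

/* Append one register entry at position pos of buf (capacity cap).
   Returns the new write position, or 0 if the entry does not fit.
   ASSUMPTION: fields are sent most significant byte first. */
static size_t lda_put_register(uint8_t *buf, size_t cap, size_t pos,
                               uint16_t address, uint32_t data)
{
    if (pos > cap || cap - pos < LDA_REG_ENTRY_SIZE)
        return 0;
    buf[pos + 0] = (uint8_t)(address >> 8);
    buf[pos + 1] = (uint8_t)(address & 0xFFu);
    buf[pos + 2] = (uint8_t)(data >> 24);
    buf[pos + 3] = (uint8_t)(data >> 16);
    buf[pos + 4] = (uint8_t)(data >> 8);
    buf[pos + 5] = (uint8_t)(data & 0xFFu);
    return pos + LDA_REG_ENTRY_SIZE;
}
```

After packing N entries, lda_typemodifier would carry N and lda_data_length would carry N * 6, following the field descriptions above.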


Data packet

The basic data packet is the same as the lda_packet above, but the lda_typemodifier specifies how many lda_event structures are enclosed in the lda_packetdata[] array. The reason to do it based on another packet type is that it allows us to do filtering in hardware pretty easily, and allows us to do things with Ethernet switches etc.

struct lda_event {
    u_int32_t address; /* Base address, offset in memory basically */
    u_int16_t length;  /* Length of data following, a multiple of 2!! */
    u_int8_t data[];   /* the data */
};


The same structure is used in the read and write requests used in the above commands as well. Basically, all requests involving event data from the ODR are sent inside Slow Command Ethernet Packets, while replies that contain data will be sent inside Data Command packets.
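A receiver could step through the lda_event structures in lda_packetdata[] roughly as sketched below. The sketch assumes that events are packed back to back with a 6 byte header (4 byte address, 2 byte length) and that multi-byte fields are sent most significant byte first. Neither assumption is confirmed by this document, and the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* 4 byte address + 2 byte length in front of each event's data. */
#define LDA_EVENT_HDR_SIZE 6u

/* Walk `count` events stored back to back in buf[0..len).
   Returns the number of bytes they occupy, or 0 if the buffer is
   truncated or a length field is not a multiple of 2.
   ASSUMPTION: big-endian fields, no padding between events. */
static size_t lda_walk_events(const uint8_t *buf, size_t len, unsigned count)
{
    size_t pos = 0;
    for (unsigned i = 0; i < count; i++) {
        if (len - pos < LDA_EVENT_HDR_SIZE)
            return 0;                     /* header does not fit */
        size_t dlen = ((size_t)buf[pos + 4] << 8) | buf[pos + 5];
        if (dlen % 2 != 0)
            return 0;                     /* length must be even */
        pos += LDA_EVENT_HDR_SIZE;
        if (len - pos < dlen)
            return 0;                     /* data does not fit   */
        pos += dlen;
    }
    return pos;
}
```

The count passed in would come from lda_typemodifier, and the returned size can be compared against lda_data_length as a consistency check.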




We need to have some kind of CRC16 or CRC32 after the lda_packetdata[] in the packet; which one needs to be decided based on how easy it is to implement in hardware. Also, if the packet is smaller than the required Ethernet size, then the rest of the Ethernet frame after the lda_packet is padded with zeros. The LDA CRC therefore follows the packet data right away, and the pad words come after it, up to the correct size to fill the Ethernet frame.



1 http://daqlic.hep.man.ac.uk/cgi-bin/cvsweb.cgi/calice_dif/


Last modified Sun  7 December 2008