Nanoseconds Timing System Based on IEEE 1588 FPGA Implementation
Davide Pedretti, on behalf of the JUNO Padova Electronics Group and the JUNO Collaboration
21st IEEE Real Time Conference, June 9-15, 2018, Colonial Williamsburg, VA, USA
JUNO Synchronization System
The primary task of the JUNO synchronization system is to distribute time accurately, with a target resolution of 32 ns. It serves trigger and event synchronization and accurate timestamping against the GPS global time.
[Figure: JUNO detector timing chain. The central trigger and timing system feeds a White Rabbit network that carries the GPS global time to the backend electronics (BEC 1, BEC 2). Each BEC FPGA holds the global time and connects to the frontend GCUs (GCU 1 to GCU 48) over 80 m CAT-5e or equivalent twisted-pair cables; each GCU FPGA keeps a local time. The link between backend and frontend electronics is the IEEE 1588-2008 digital implementation.]
IEEE 1588-2008 Digital Implementation
This work addresses the clock offset correction mechanism between backend and frontend electronics, which is needed so that timestamps in all devices use the same time base.
1) Clock syntonization. The global clock signal is distributed to all the GCUs as encoded information. Each GCU recovers the global clock with a clock and data recovery (CDR) circuit and counts time locally.
2) Clock offset correction. Every local time is offset from the global time, because the GCUs do not start counting at the same instant. The clock offset correction mechanism is based on a digital implementation of the IEEE 1588-2008 Precision Time Protocol (PTP).
3) Communication channel. A full-duplex, deterministic-latency communication channel between the BEC and each GCU is implemented over the two copper twisted pairs available.
[Figure: BEC 1 FPGA (global time) connected to GCU 1, GCU 2, ..., GCU 48 (FPGA local time) over 80 m CAT-5e twisted-pair cables.]
Clock Offset Correction Mechanism Implemented
Master: global time (timestamps with suffix _g). Slave: local time (timestamps with suffix _l).
- Assumption: delay master-slave = delay slave-master = delay.
- t1_g - t1_l = offset to be measured.
- The master sends the sync message to the slave and records the transmission time t1_g.
- The slave records the reception time t2_l and computes: t1_g - t2_l = offset - delay.
- The slave sends the delay request message to the master and records the transmission time t3_l.
- The master records the reception time t4_g.
- The master sends the slave the delay response message containing t4_g.
- The slave computes: t4_g - t3_l = offset + delay.
- The slave computes: (t1_g - t2_l) + (t4_g - t3_l) = 2 x offset.
- The slave computes the offset with a 1-bit right shift.
The PTP protocol assumes that the transmit and receive paths are symmetric; any source of asymmetry in the communication channel between master and slave degrades the accuracy.
[Figure: PTP message sequence between master and slave: sync(t1_g), delay_req, delay_resp(t4_g), with timestamps t1_g, t4_g, t5_g on the master side and t1_l, t2_l, t3_l, t6_l on the slave side. © 2008 IEEE]
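The slave-side arithmetic listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the firmware: the function name `ptp_offset` and the use of integer clock ticks are hypothetical, while the four timestamps and the 1-bit right shift follow the steps on the slide.

```python
def ptp_offset(t1_g: int, t2_l: int, t3_l: int, t4_g: int) -> int:
    """Estimate offset = global time - local time, in clock ticks (assumed unit)."""
    sync_diff = t1_g - t2_l        # equals offset - delay
    delay_req_diff = t4_g - t3_l   # equals offset + delay
    two_offset = sync_diff + delay_req_diff
    return two_offset >> 1         # arithmetic 1-bit right shift (halves the sum)


# Worked example with made-up numbers: true offset 50 ticks, 10 ticks each way.
offset, delay = 50, 10
t1_g = 100                      # master sends sync
t2_l = t1_g - offset + delay    # slave receives sync (local time)
t3_l = t2_l + 5                 # slave sends delay request
t4_g = t3_l + offset + delay    # master receives delay request (global time)
estimate = ptp_offset(t1_g, t2_l, t3_l, t4_g)
```

With symmetric delays the delay term cancels in the sum, so `estimate` recovers the true offset regardless of the cable length.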
PTP Hardware Implementation Advantages
✓ Timestamping
✓ Clock count
✓ Physical and data link layer symmetry
The synchronization accuracy reaches ±1 global clock period. The residual phase difference ϕ between the global and local clocks is mainly determined by the transmission latency, and the PTP cannot resolve it. ϕ is unknown, but as long as the cable length does not change it is in principle constant, with a standard deviation set by jitter.
The Interconnection Model Implemented
PTP is not limited to Ethernet. The interconnection model that provides bidirectional messaging between the master and the slaves is based on the concept of CERN's Timing, Trigger and Control (TTC) system.
✓ Framing:
  - Broadcast commands: 16-bit frames.
  - Individually addressed commands/data: 42-bit frames.
✓ Error correction and detection:
  - Recovery from single-bit errors (Hamming checksum).
  - Detection of double-bit errors.
✓ Cable length mismatch
✓ Deterministic latency
✓ Firmware asymmetry compensation
✗ Cable asymmetry compensation
[Figure: the Central Trigger System drives the BEC card, whose FPGA hosts the WR PTP core, a TTC encoder, TTC decoders 1 to 48, and delay control. Each GCU (GCU 1 to GCU 48, Kintex-7) hosts a TTC decoder, a TTC encoder, and delay control. The BEC and each GCU exchange TIMING and TRIGGER signals over ~80 m of CAT-5e, on the connections labelled 1, 2 and 3, 6.]
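The error handling named above (correct single-bit errors, detect double-bit errors) is the behaviour of an extended Hamming code. The sketch below shows that behaviour on a toy 4-bit payload using the textbook extended Hamming(8,4) code. It is an assumption-laden illustration: it does not reproduce the TTC 16-bit or 42-bit frame layout or its actual check-bit positions, and the names `secded_encode` and `secded_decode` are hypothetical.

```python
def secded_encode(nibble: int) -> list:
    """Encode 4 data bits into 8 bits: index 0 is overall parity, 1..7 is Hamming(7,4)."""
    code = [0] * 8
    # Data bits go to the non-power-of-two positions 3, 5, 6, 7.
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4, 5, 6, 7
    for pos in range(1, 8):
        code[0] ^= code[pos]                # overall parity enables double-error detection
    return code


def secded_decode(received: list):
    """Return (nibble, status); status is 'ok', 'corrected' or 'double_error'."""
    code = list(received)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                 # XOR of the positions holding a 1
    overall = 0
    for bit in code:
        overall ^= bit                      # 0 when the overall parity still holds
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        code[syndrome] ^= 1                 # single error: syndrome is its position
        status = "corrected"
    else:
        return None, "double_error"         # parity holds but syndrome is non-zero
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status
```

A receiver built this way can silently repair one flipped bit per frame and count frames flagged as `double_error`, which is the kind of figure a frame error counter would report.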
TTC Physical Layer and Clock Syntonization
The global clock signal is distributed to the frontend nodes as encoded information in two communication channels, CHA and CHB, that are time-division multiplexed (TDM). At the physical layer the data is biphase mark coded (BMC):
- data = 1: biphase (the level changes in the middle of the bit)
- data = 0: constant level
✓ DC balanced
✓ Self-clocking
[Figure: TDM, serial link synchronization, and syntonization. Labels: CHA and CHB, example bits 0 1 0 0, 62.5 MHz data, 125 MHz, BMC data at 250 Mbps, ADN2817 CDR, recovered clock at 250 MHz (low jitter), Kintex-7.]
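The encoding rule above can be sketched as follows. This follows the standard definition of biphase mark coding, in which the level also toggles at every bit boundary (that boundary transition is what makes the stream self-clocking); the function names and the list-of-half-bits representation are assumptions made for illustration, not the serializer used in the design.

```python
def bmc_encode(bits, level=0):
    """Return two half-bit line levels per data bit (biphase mark coding)."""
    half_bits = []
    for bit in bits:
        level ^= 1              # transition at every bit boundary
        half_bits.append(level)
        if bit:
            level ^= 1          # data = 1: extra transition in the middle of the bit
        half_bits.append(level)  # data = 0: level stays constant over the bit
    return half_bits


def bmc_decode(half_bits):
    """A bit is 1 when its two halves differ, 0 when they are equal."""
    pairs = zip(half_bits[0::2], half_bits[1::2])
    return [int(first != second) for first, second in pairs]


encoded = bmc_encode([0, 1, 0, 0])
decoded = bmc_decode(encoded)
```

Because decoding only compares the two halves of each bit, it does not depend on the absolute polarity of the line.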
Serial Link Synchronization Problem
The BEC receives 48 LVDS data streams at 250 Mbps, each with a different phase relationship to the global clock, and captures them into the FPGA synchronous logic, so a marginal capture problem is likely.
The issue has been solved with a cascade of 4 programmable delay primitives in the frontend FPGA, whose tap count is remotely incremented or decremented by the master delay control core running in the backend FPGA.
Each IDELAYE2 primitive provides 31 delay taps of about 78 ps each, so the cascade of 4 gives a total delay of about 10 ns.
[Figure: in the GCU FPGA (recovered clock domain), the slave delay control receives tap +1 / tap -1 commands through the TTC decoder and drives the tap enable and tap increment inputs of the fine delay on the TTC encoder output. In the BEC FPGA (global clock domain), the master delay control reads the error counts of TTC decoders 1 to 48 and sends the tap commands through the TTC encoder. Each 250 Mbps link passes through a cable driver, the cable, and a cable equalizer. FPGA view from PlanAhead.]
Serial Link Synchronization Solution
The data stream captured in the backend FPGA is delayed incrementally in steps of 78 ps. Plotting the TTC decoder frame error count against the tap count reveals the eye opening of the input data stream and the best sampling point.
In this example the data stream was delayed by 35 taps to match the best sampling point.
Reliable, error-free communication between backend and frontend electronics is essential in an 18000-channel setup like JUNO.
[Figure: frame error count versus tap count; 51 taps correspond to about 4 ns; the best sampling point is marked.]
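One plausible way to turn such a scan into a tap setting is to take the centre of the widest run of error-free taps. The sketch below assumes exactly that rule; the slides do not state how the best sampling point is chosen, the function name `best_sampling_tap` is hypothetical, and the scan data is synthetic rather than measured.

```python
TAP_PS = 78  # delay per tap in picoseconds, as quoted on the slide


def best_sampling_tap(error_counts):
    """Return the centre tap of the widest run of zero-error taps, or None."""
    best_start, best_len = None, 0
    run_start = None
    # A non-zero sentinel closes a run that reaches the end of the scan.
    for tap, errors in enumerate(list(error_counts) + [1]):
        if errors == 0:
            if run_start is None:
                run_start = tap
        elif run_start is not None:
            run_len = tap - run_start
            if run_len > best_len:
                best_start, best_len = run_start, run_len
            run_start = None
    if best_start is None:
        return None
    return best_start + (best_len - 1) // 2


# Synthetic scan: errors at both edges, an error-free eye 51 taps wide.
scan = [40] * 10 + [0] * 51 + [35] * 10
tap = best_sampling_tap(scan)
delay_ps = tap * TAP_PS
```

Centring in the widest window leaves the largest margin on both sides against jitter and slow drift of the link phase.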
RTL Design Overview
✓ All we need is a low-cost FPGA in the backend, a low-cost FPGA in the frontend, and a CAT-5e cable.
✓ HR pins, low power, and a low percentage of FPGA resources used.
✓ The VHDL code is generic; it can easily be synthesized for a different FPGA family or manufacturer.
✓ The PTP is conceived as a finite state machine (FSM).
✓ If a message is not delivered correctly, a watchdog timer takes the FSMs back to the IDLE state.
✓ The synchronization procedure is periodic.
[Figure: RTL block diagram. Backend electronics (Kintex-7, global clock): White Rabbit node, PLL, GLOBAL TIME counter, 1588 PTP master FSM with timestamping (t1_g, t4_g), and serial link synchronization master FSM. Frontend electronics (Kintex-7, CDR clock, 250 MHz PLL, local clock): LOCAL TIME counter, 1588 PTP slave FSM with timestamping and offset computation, and serial link synchronization slave FSM. Both sides contain a TTC TX (addressed and broadcast frame generators with req/grant handshake, BMC serializer, 250 Mbps) and a TTC RX (coarse delay, channel bonding and deserializer, BRD and addressed frame decoders, error counter, REGs). Other labels: delay_req, offset, tap_incr, tap_decr, tap_rst, fine delay, local free-running oscillator, 0b00000, 0b00110, TO GCU 2, TO GCU 48. Legend: syntonization and serial link synchronization; physical and data link layer; clock offset correction.]
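The FSM-plus-watchdog idea in the list above can be sketched in software as follows. The actual design is VHDL and its states are not given in the slides, so the two states, the class name `PtpSlaveFsm`, the method names, and the tick-based watchdog are all assumptions chosen for illustration.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    WAIT_DELAY_RESP = auto()


class PtpSlaveFsm:
    """Toy model of a PTP slave FSM with a watchdog (hypothetical structure)."""

    def __init__(self, watchdog_ticks=1000):
        self.state = State.IDLE
        self.watchdog_ticks = watchdog_ticks
        self.elapsed = 0
        self.offset = None      # last computed offset, in clock ticks
        self._sync_diff = 0
        self._t3_l = 0

    def on_sync(self, t1_g, t2_l, t3_l):
        """Sync carrying t1_g arrived at local time t2_l; delay_req sent at t3_l."""
        self._sync_diff = t1_g - t2_l
        self._t3_l = t3_l
        self.elapsed = 0
        self.state = State.WAIT_DELAY_RESP

    def on_delay_resp(self, t4_g):
        """Delay response carrying t4_g arrived; ignored unless one is expected."""
        if self.state is not State.WAIT_DELAY_RESP:
            return
        self.offset = (self._sync_diff + (t4_g - self._t3_l)) >> 1
        self.state = State.IDLE

    def tick(self):
        """Advance the watchdog by one clock tick; fall back to IDLE on timeout."""
        if self.state is State.IDLE:
            return
        self.elapsed += 1
        if self.elapsed >= self.watchdog_ticks:
            self.state = State.IDLE


fsm = PtpSlaveFsm(watchdog_ticks=3)
fsm.on_sync(t1_g=100, t2_l=60, t3_l=70)
fsm.on_delay_resp(t4_g=130)
```

Because the procedure is periodic, a round that times out simply leaves the previous offset in place until the next sync message arrives.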
Test 1 Setup and Results
The backend and frontend boards have been programmed to generate a pulse at a scheduled time.
This small test setup with 3 m long cables leads to a time accuracy of 1 ns. As expected, there is no phase control, and the phases of the frontend clocks move within a range of ±4 ns when the cable length changes.
[Figure: measurement labels -555 ps and 445 ps.]
Test 2 Setup and Results
The test was repeated with longer cables to reproduce conditions similar to the final installation in the field.
The result is still good: the time accuracy achieved is well within the 32 ns requirement. GCU 2 appears to have an offset error of one clock period.
[Figure: BEC connected to GCU 1, GCU 2, and GCU 3; cable lengths shown are 3 m, 50 m, and ~80 m; time labels 0 s and 4 ns.]
Cable Asymmetry
This asymmetry of 10 ns causes the one-clock-period offset error observed. Without any compensation, the offset error introduced by the asymmetry of a 100 m long CAT-5e cable may be up to 25 ns.
- If the cable layout does not change after installation, the asymmetry can be measured manually and compensated with coarse delay primitives in the firmware.
- Automatic measurement and compensation of the cable length imbalance is possible only by adding, in the frontend and backend electronics, the hardware needed to swap the transmit and receive differential pairs.
- The digital implementation of the PTP is compatible with a TTC optical distribution. The optical fiber asymmetry would be negligible for a timing system claiming ±4 ns.
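Why asymmetry turns into an offset error follows from the PTP arithmetic: with unequal path delays the two differences become offset - delay_ms and offset + delay_sm, so their half-sum is offset + (delay_sm - delay_ms) / 2. The sketch below checks this numerically. The function name `ptp_estimate_ps`, the 400 ns path delay, the 50 ns offset, and the turnaround time are hypothetical; only the 10 ns asymmetry is taken from the slide.

```python
def ptp_estimate_ps(offset_ps, delay_ms_ps, delay_sm_ps,
                    t1_g_ps=0, turnaround_ps=1_000):
    """Simulate one PTP exchange and return the slave's offset estimate (ps)."""
    t2_l = t1_g_ps - offset_ps + delay_ms_ps   # sync received, local time
    t3_l = t2_l + turnaround_ps                # delay request sent, local time
    t4_g = t3_l + offset_ps + delay_sm_ps      # delay request received, global time
    return ((t1_g_ps - t2_l) + (t4_g - t3_l)) // 2


true_offset_ps = 50_000                                          # assumed value
symmetric = ptp_estimate_ps(true_offset_ps, 400_000, 400_000)
skewed = ptp_estimate_ps(true_offset_ps, 400_000, 410_000)       # 10 ns asymmetry
error_ps = skewed - true_offset_ps
```

Half of the path asymmetry appears as offset error before any rounding to whole clock periods, which is why the asymmetry has to be measured and compensated separately.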
Conclusions
The full hardware implementation of the IEEE 1588-2008 PTP over a full-duplex, deterministic-latency communication channel based on the CERN TTC system enables the synchronization of thousands of timing receivers with a resolution of a few nanoseconds.
The proposed timing system has been developed for the JUNO experiment, where the potting of the underwater electronics imposes tight constraints on the communication medium between backend and frontend electronics and makes commercial timing systems unsuitable.
The test results demonstrate that a time accuracy of ±4 ns is achievable using two low-cost FPGAs (one in the backend and one in the frontend electronics) and, as the communication medium between them, two twisted pairs in a CAT-5e cable.
The communication over copper cables is the only source of asymmetry in the implemented timing system and, if not compensated, may degrade the time accuracy.
The design is compatible with a TTC optical distribution that, outside the JUNO context, would extend the transmission range and remove the cable asymmetry problem.
Thanks! Questions?