Ultra Low Latency (ULL) Electronic Trading: Why Deterministic Latencies are Critical & Processing Speed still Significant

Ted Hruzd, Sr. Infrastructure Architect, RBC Capital Markets (since March 2016)
• Wall Street IT since 1983
• ULL Architect at Citi, DB, JP Morgan 2005-2016
• NYU Teacher of ULL Architectures for Electronic Trading course – Summer 2017
• Panelist, 12/8 Intelligent Trading Summit’s ULL Market Data Webinar

Any comments made by Ted Hruzd here or on the webinar are their own personal views and not those of Ted’s current or past employers.

Describe optimal access to Low Latency Mkt Data and its importance

ULL
• CoLo instead of ExtraNets
• FPGA (full or part)
• Tick 2 Trade (T2T) 1-2 uSecs
• L1-3 Switches
  • 70 ns Exchg to switch in CoLo
  • 5 ns Mkt Data Fan-out to FPGA data normaliz.
• Exch subscriber Mkt Data < 1 uSec
• $18K for 48 port L1-3 ULL switch
• FIX Engines & order books in FPGA NICs
• Simple Algo’s 100% in FPGA
• Complex Algo’s in GPU’s or Intel Cores
• Direct Feeds not Consolidated; UBBO v NBBO
• Deterministic Latencies; $$$ during spikes

NOT-ULL
• ExtraNets
• Non FPGA Mkt Data Tick Plant
• High end Cisco/Arista Switches/Routers
• 1-2 uSec Fan-out
• 1-2 milliseconds
• MultiThrd TBB, OMP, AVX-512, kernel/app tuning
• Complex Algo’s in GPU’s or Intel Cores
• Consolidated for most; Direct for Book Builds
• Latency Spikes; missed opportunities

Both
• Aggressive SLA’s - Network Q’s & pkt drops
• Separate NICs for Mkt Data & Order Flow
• Kernel Bypass (FPGA based preferred)
• KPI’s all trading partners performance
• Internal metrics for ROI
• RT mkt data/news analytics - seek alpha
• Measure exch, FH, direct & cache clients

Diagram labels: Microwave Chi-NY 4.5 ms; L1-3 ULL switches in CoLo

What is Speed 2? Why is it more important than Speed 1 (Raw)?
• Speed 2 revolves around meta-speed (information about speed): the dynamics, timeliness, measurability, auditability & transparency of speed & latency.
• A critical aspect of Speed 2 is deterministic latency.
• What good is raw speed if firms do not understand whether the sell-side systems, exchanges, markets, & clients they are connecting to are fully functional?
  • Or if a trading partner is in the beginning stages of failure, or in the process of being degraded – ex: Exec broker mkt data latencies spiking with a Fed announcement or Trump Tweet?
• Buy-Side firms can learn of Exec Brkr order ack latency spikes (due to Exec Brkr spikes handling Mkt Data)
  • and immediately route to alternate sell-side execution brokers.
  • This is where high speed memory analytics provide a significant competitive advantage.
• Speed 2 is most critical to Market Makers – they lose $$ on stale market data
• Tabb and Corvil refer to the above trade decision point as ‘Speed 2’ – the ability to be fully aware of speed (or lack of speed), and internally determinately fast, in order to maximize trading revenue.
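
The routing decision described in this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything from the deck: the class name `LatencyMonitor`, the function `pick_broker`, the 100-sample window, and the "p99 more than 3x the median" spike rule are all hypothetical choices. A production system would do this in hardware or in a low-latency language.

```python
from collections import deque
from statistics import median


class LatencyMonitor:
    """Rolling window of order-ack latencies (microseconds) for one broker."""

    def __init__(self, window=100, spike_ratio=3.0):
        self.samples = deque(maxlen=window)  # keep only the most recent latencies
        self.spike_ratio = spike_ratio       # assumed threshold: p99 / median

    def record(self, latency_us):
        self.samples.append(latency_us)

    def p99(self):
        ordered = sorted(self.samples)
        # nearest-rank 99th percentile: ceil(0.99 * n) - 1, in integer arithmetic
        rank = (99 * len(ordered) + 99) // 100 - 1
        return ordered[rank]

    def is_deterministic(self):
        """True while tail latency stays close to the median."""
        if len(self.samples) < 10:
            return True  # too few samples to judge
        return self.p99() <= self.spike_ratio * median(self.samples)


def pick_broker(monitors, preferred):
    """Keep the preferred broker unless its latency has become erratic."""
    if monitors[preferred].is_deterministic():
        return preferred
    healthy = [name for name, m in monitors.items() if m.is_deterministic()]
    if not healthy:
        return preferred  # nothing better is known, stay put
    # fall back to the healthy broker with the lowest median latency
    return min(healthy, key=lambda name: median(monitors[name].samples))
```

The point of the sketch is that the decision uses the spread of latencies (tail versus median), not just the average, which is the "deterministic latency" aspect of Speed 2.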

What factors disrupt optimal scenario?
• Improper bandwidth capacity planning/implementation + errors in SLA plans
• Errors in integration with Exchanges + vendor products: L1-3 switches + FPGA Mkt Data + FPGA or cpu Order Flow
• Not validating your vendor’s ‘deterministic’ latency claims
• Combining mkt data, order flow, and admin commands on same NIC!
• NOT using kernel bypass
• Poor OS kernel tuning - ex: not using RHEL 7’s “network-latency” profile (optimized for speed, not power savings; no auto NUMA, fewer interrupts, 2 way TCP handshakes, …); see the Red Hat site
• Little or no latency measuring
  • Can’t fix if no analysis; can’t analyze if not measuring
  • Essential to measure exchange, feed handler, client, end-end latencies (some vendors provide this easily)
• No ‘Speed 2’-like metrics, including metrics for ROI analysis
• No continuous real-time alerting, analysis for performance tuning, capacity planning, HA, DR
• Not paying attention to ‘deterministic’ latencies
• Plan for faster SIP (20 uSecs from 350 uSecs).
  • Will speed bump exchanges adapt to this? Will prop traders route less to them?
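
To make the "measure exchange, feed handler, client, end-end latencies" point concrete, here is a minimal Python sketch of a per-hop breakdown. The field names in `TickTimestamps` and the helper names are hypothetical, and the sketch assumes all four timestamps come from clocks that are already synchronized (for example via PTP); without that, the per-hop numbers are meaningless.

```python
from dataclasses import dataclass


@dataclass
class TickTimestamps:
    """Nanosecond timestamps for one market data tick (hypothetical field names)."""
    exchange_send_ns: int      # stamped by the exchange
    feed_handler_in_ns: int    # tick arrives at the feed handler
    feed_handler_out_ns: int   # normalized tick leaves the feed handler
    client_recv_ns: int        # trading app receives the tick


def hop_latencies(ts):
    """Break end-to-end latency into the hops named in the slide."""
    return {
        "exchange_to_feed_handler": ts.feed_handler_in_ns - ts.exchange_send_ns,
        "feed_handler": ts.feed_handler_out_ns - ts.feed_handler_in_ns,
        "feed_handler_to_client": ts.client_recv_ns - ts.feed_handler_out_ns,
        "end_to_end": ts.client_recv_ns - ts.exchange_send_ns,
    }


def worst_hop(ticks):
    """Return the hop (excluding end_to_end) with the largest observed latency."""
    worst = {}
    for ts in ticks:
        for hop, ns in hop_latencies(ts).items():
            if hop != "end_to_end":
                worst[hop] = max(worst.get(hop, 0), ns)
    return max(worst, key=worst.get)
```

Tracking the maximum per hop, rather than only the end-to-end figure, shows where a spike originated, which is what makes the measurement actionable.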

Deterministic Latencies & optimal ULL app: how it works, how to attain
• Infrastructure and app design in prior slide detail ULL deterministic architecture
• Serious traders know the sell-side, execution brokers, dark pools, and trading venues that exhibit deterministic speed (and low latencies). Trade with them especially in volatile times: best $$ opportunities
• Trading signals have a short life, often in microseconds.
• Best Buy side traders route to the fastest and most deterministic sell side firms.
• IB’s optimize their deterministic dark pools route
• Exec Brokers route to deterministic latency dark pools and lit exchanges
• Metrics for ROI
  • Test, Profile, Analyze, Project
  • Architects, engineers, developers, QA to recommend new infrastructures for positive ROI
  • Next, meticulously engineer, configure, profile, tune, validate expected latencies, fill rates, and thus revenue (positive ROI)
  • Metrics may play a significant role as to whether a trading firm should stay in ET or exit the business.
  • TABB: a large global investment bank has stated that every millisecond lost results in $100m per annum in lost opportunity. (public info to Tabb)
  • ARCA …. Competitive (order acks, execs, ARCA BOOK), in the BLACK
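
As a rough illustration of the ROI arithmetic, the Tabb figure can be turned into a back-of-envelope payback estimate. This sketch assumes, purely for illustration, that lost opportunity scales linearly with milliseconds saved and that the $100m-per-millisecond figure applies to the firm in question; the slide makes neither claim, and the function names and example numbers are hypothetical.

```python
def annual_opportunity_gain(ms_saved, value_per_ms=100_000_000):
    """Estimated yearly gain in dollars, assuming a linear value per millisecond."""
    return ms_saved * value_per_ms


def payback_years(upgrade_cost, ms_saved, value_per_ms=100_000_000):
    """Years needed for the estimated gain to cover the upgrade cost."""
    gain = annual_opportunity_gain(ms_saved, value_per_ms)
    if gain <= 0:
        raise ValueError("ms_saved and value_per_ms must be positive")
    return upgrade_cost / gain
```

For example, a hypothetical $5m infrastructure upgrade that saves 0.5 ms would give `payback_years(5_000_000, 0.5)` of about 0.1 years under these assumptions. A smaller firm would substitute its own measured value per millisecond.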

Latency Impacts of execution, TCA, surveillance & algo back testing
• Mkt data for TCA, market surveillance & algo back testing must NOT impact T2T, order ack times, or trade execution times.
• Offload any analytics and above functions separately & asynchronously
• Proper software can replay market data with multiple algo’s at original rates, latencies, or alter (ex: speed-up, change dynamics, algo goals, etc). Many Mkt Data vendors provide this service
• TCA (maybe should be referred to as RT analytics, or RTA, for alpha instead)
  • From Tabb: TCA is increasingly being used in real time. TCA can therefore generate alpha by projecting and exposing lower costs and specific trading venues for buying or selling securities. This is referred to as “opportunity cost” and is very time sensitive.
  • THE POOL for NON ULL Trading is decreasing
  • Nearly 50% of equity desks globally now use TCA (RTA) as both a pre- and post-trade tool, with one-quarter using TCA (RTA) for real-time, in-trade analytics. Due to their access to the resources for deploying and integrating new technology, the highest-volume firms are the biggest users.
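
The replay idea mentioned in this slide (original rates, or sped up) comes down to preserving the gaps between captured ticks and scaling them. A minimal Python sketch follows; the function name `replay_schedule` and the `(timestamp_seconds, payload)` tick format are assumptions made for illustration, not any vendor's API.

```python
def replay_schedule(ticks, speed=1.0):
    """Yield (delay_seconds, payload) pairs that reproduce the original tick spacing.

    ticks: iterable of (timestamp_seconds, payload) in capture order.
    speed: 1.0 replays at the original rate, 2.0 at twice the rate.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    previous = None
    for timestamp, payload in ticks:
        # the first tick is sent immediately; later ticks wait out the scaled gap
        gap = 0.0 if previous is None else (timestamp - previous) / speed
        previous = timestamp
        yield gap, payload
```

A caller would wait `delay_seconds` before handing each payload to the algo under test. Because replay runs offline from captured data, it stays off the live T2T path, in line with the "offload separately & asynchronously" point. Note that a plain `time.sleep` is far too coarse for microsecond-level spacing, so a real replayer would need a busy-wait or hardware pacing.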

HW Acceleration + in-mem DB’s for higher performance in Mkt Data Mgt
• FPGA’s are deterministic by design and process in parallel
  • Already discussed role in ULL and deterministic Mkt Data
• Re: Analytics
  • Both are required, as hardware acceleration will speed up (and deterministically) market data to high speed memory regions for analytics that may include timely alpha seeking strategies.

Are ULL Mkt Data BP’s agreed to? Or do they vary per specialization, function, size?
• Agreed upon - NO.
• Some feel latencies of a few milliseconds or more for market data still suffice.
  • Prime ex: Low Freq, alpha-seeking per fundamental analysis for long term holds
• Others strive for nanoseconds.
• Key is to ID your business goals & ROI on ULL spending.
• Specialization and function are significant factors.
• Goals of Prop Traders, Market Makers, HFT traders, arbitrage are much more latency sensitive than most Buy side and asset managers.
• Equities & Futures traders are more apt to opt for ULL; FX not so, but getting closer; Bond and Commodity trading much less likely to choose ULL.

Why is pure speed still significant?
• Industry leading Tick-2-Trade latencies have decreased by factor of 10 every 3 years
• ULL trading firms continue to prioritize speed.
• Buy Side tracks latencies with Sell Side (SS) and routes to fastest SS firms
• Buy Side has accelerated adoption of SS-like ULL technologies; hence, proportion of trades to non ULL firms is decreasing
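
The "factor of 10 every 3 years" claim can be written as a simple exponential decay. This is a worked restatement under the assumption that the decline is smooth over time; the symbols $L_0$ (leading latency today) and $t$ (years from now) are introduced here and do not appear in the deck.

```latex
L(t) = L_0 \cdot 10^{-t/3}
\qquad\Rightarrow\qquad
\frac{L(t+1)}{L(t)} = 10^{-1/3} \approx 0.464,
\qquad
t_{1/2} = 3\log_{10} 2 \approx 0.90\ \text{years}
```

Read this way, leading Tick-2-Trade latency falls by roughly 54% per year and halves about every 11 months, so a firm that stands still for a year is running at more than twice the latency of the leaders.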

Top Trends Past 2 years, continuing
• In 2017, Nasdaq will decrease time to disseminate SIP (Consolidated Market Data) from 480 uSecs to 50 uSecs, then 20 uSecs
  • Almost all Trading Firms actively use the SIP
  • Trading Firms will optimize infrastructure for quicker access to SIP
• More firms & algo applications are upgraded for single digit uSecs for Tick-2-Trades
• Speed enhancements in FPGA’s, GPU’s, Servers, Caches, Middleware continue to lead to lower trading latencies
  • FIX Engines and Mkt Data order books in FPGA NICs
  • Simple algo’s, along with FIX Engines, SOR, Risk checks – 100% FPGA ready
• Increasing # of vendors now compete in space of ULL appliances that feature parallelized processing using 2 or more of following:
  • FPGA’s, GPU’s, Intel Cores, ULL Switch capabilities
• Some ULL Switches now transmit Market Data to subscribers in 5 ns
• 48 port ULL switches are inexpensive (under $20K) - easier to meet ROI
• Kernel Bypass is now ubiquitous; I/O’s are at approx. 1 uSec, down from 12 uSecs
  • Increased use of FPGA based kernel bypass to sub uSec I/O latencies
• More Trading firms embracing FPGA’s for deterministic latencies
• GPU’s remain important speed factor but more for risk (ex Monte Carlo) & analytics

Additional Top Trends (Cont): very early or longer term strategic
• Real Time Big Data (Machine Learning), some in Cloud, now sends more time sensitive alpha trading signals over high speed interconnects to Trading Apps
  • Another reason to speed up trading systems
• New Binary FIX protocol, expected in 2017, will further decrease latencies
• FPGA programming is becoming easier and is increasing in use cases
  • New Intel libraries and C like A++ language a major reason
• Matching Engines 100% in FPGA

CoLo App – using L1-3 ULL switches
Optimized for lowering network deterministic latencies from as much as 1+ ms to under 1 uSec; passing Mkt Data to a pure FPGA based Algo App may result in 1 uSec T2T

Pure FPGA Based Solution
Optimized for lowering network deterministic latencies from few ms to under 1 uSec: