Virtual Queues as a Trade Processing Pattern (Uri Cohen)

- Slides: 55
Virtual Queues as a Trade Processing Pattern. Uri Cohen, @uri1803 | github.com/uric, Head of Product @ GigaSpaces
Event Processing at Massive Scale: Approaches to Concurrency. Uri Cohen, @uri1803 | github.com/uric, Head of Product @ GigaSpaces
This is What It Used to Be Like
That’s What It’s Like Now
Some Numbers: 15 billion trades/day on NYSE alone (http://www.nytimes.com/2011/08/27/business/as-trade-volumes-soar-exchanges-cash-in.html)
Some Numbers: that’s 641K trades/second (http://www.nytimes.com/2011/08/27/business/as-trade-volumes-soar-exchanges-cash-in.html)
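The slide does not show how the per-second figure is derived. One reading that reproduces it, assuming the daily volume is spread evenly over a regular 6.5-hour trading session (that session length is an assumption, not stated in the deck):

```python
# Assumption (not from the slides): a 6.5-hour trading session, 09:30 to 16:00.
TRADES_PER_DAY = 15_000_000_000          # figure quoted on the slide
SESSION_SECONDS = int(6.5 * 60 * 60)     # 23,400 seconds

trades_per_second = TRADES_PER_DAY / SESSION_SECONDS
print(f"{trades_per_second:,.0f} trades/second")   # roughly 641 thousand
```

Spread over a full 24 hours instead, the same volume would average about 174K trades/second, so the quoted figure is consistent with session hours only.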
Some Numbers: 12 billion shares change hands every day (http://www.bloomberg.com/news/2012-01-23/stock-trading-is-lowest-in-u-s-since-2008.html)
Some Numbers: $4 million, the cost of 1 millisecond of latency to a broker (http://www.tabbgroup.com/PublicationDetail.aspx?PublicationID=346)
The Problem: massive stream of events; time is money, literally; can’t lose a single message; fairness is a must
Order Book, Simplistic Example. Buy: 50 @ $12, 60 @ $11, 30 @ $10. Sell: 60 @ $10, 100 @ $11, 30 @ $12.
Order Book, Simplistic Example. Price: $10. Buy: 50 @ $12, 60 @ $11, 30 @ $10. Sell: 60 @ $10, 100 @ $11, 30 @ $12.
Order Book, Simplistic Example. Price: $10. Buy: 60 @ $11, 30 @ $10. Sell: 10 @ $10, 100 @ $11, 30 @ $12.
Order Book, Simplistic Example. Price: $10. Buy: 50 @ $11, 30 @ $10. Sell: 100 @ $11, 30 @ $12.
Order Book, Simplistic Example. Price: $11. Buy: 50 @ $11, 30 @ $10. Sell: 100 @ $11, 30 @ $12.
Order Book, Simplistic Example. Price: $11. Buy: 30 @ $10. Sell: 50 @ $11, 30 @ $12.
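The frames above can be replayed with a few lines of matching logic. This is a minimal sketch, not a real matching engine: `Order` and `match` are hypothetical names, and executing each trade at the sell price is an assumption chosen because it reproduces the prices shown on the slides.

```python
from dataclasses import dataclass


@dataclass
class Order:
    qty: int
    price: int


def match(buys, sells):
    """Match the best bid against the best ask while their prices cross.

    `buys` must be sorted by price descending and `sells` by price ascending.
    Both lists are mutated in place. Returns the trades as (qty, price) tuples.
    """
    trades = []
    while buys and sells and buys[0].price >= sells[0].price:
        bid, ask = buys[0], sells[0]
        qty = min(bid.qty, ask.qty)
        trades.append((qty, ask.price))   # assumption: trade at the sell price
        bid.qty -= qty
        ask.qty -= qty
        if bid.qty == 0:
            buys.pop(0)                   # bid fully filled
        if ask.qty == 0:
            sells.pop(0)                  # ask fully filled
    return trades


buys = [Order(50, 12), Order(60, 11), Order(30, 10)]
sells = [Order(60, 10), Order(100, 11), Order(30, 12)]
trades = match(buys, sells)
print(trades)   # [(50, 10), (10, 10), (50, 11)]
print(buys)     # [Order(qty=30, price=10)]
print(sells)    # [Order(qty=50, price=11), Order(qty=30, price=12)]
```

The loop only works because exactly one thread touches the book at a time, which is the exclusivity requirement the rest of the deck is about.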
What It Really Means: low latency (in memory, GC tuning); scalability (multi-core, multi-node); ordering (by price, order time); exclusivity; resiliency
Trading is Just One Use Case: all things first-come, first-served (FCFS) with a limited stock, e.g. flight booking, betting, online auctions, cloud spot instances, eCommerce
Let’s Talk Solutions
Queue (SEDA/Actor Style) [diagram labels: Not Validated, Validator, Processed, Processor]
Queue (SEDA/Actor Style). The Good: ordered (is it fair?); multi-threaded. The Bad: not very scalable; locking; context switching; transient
The Cost of Locking (source: http://disruptor.googlecode.com/files/Disruptor-1.0.pdf)
Method                            Time in msec
Single Thread                              300
Single Thread w/ Lock                   10,000
2 Threads w/ Lock                      224,000
Single Thread w/ CAS                     5,700
2 Threads w/ CAS                        30,000
Single Thread w/ Volatile Write          4,700
Queue (Lack of) Fairness [diagram, three frames: queued orders labelled Buy 50, Sell 100 and 60 are picked up by Consumer Thread 1 and Consumer Thread 2]
Queue (Lack of) Fairness: can you tell which order will be executed first?
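The answer is that you can't: dequeuing is FIFO, but execution order depends on how the consumer threads are scheduled. The sketch below forces one legal interleaving with events so that it is repeatable. The function and order names are hypothetical, and the `wait()` call merely stands in for a GC pause, a context switch or slow validation.

```python
import queue
import threading


def run_two_consumers():
    orders = queue.Queue()
    orders.put("order-1")                 # arrived first
    orders.put("order-2")                 # arrived second

    executed = []                         # order in which trades actually run
    first_dequeued = threading.Event()
    second_executed = threading.Event()

    def consumer_1():
        order = orders.get()              # FIFO dequeue: receives order-1
        first_dequeued.set()
        second_executed.wait()            # stalled before it can execute
        executed.append(order)

    def consumer_2():
        first_dequeued.wait()             # consumer 1 has already dequeued
        order = orders.get()              # receives order-2
        executed.append(order)
        second_executed.set()

    threads = [threading.Thread(target=consumer_1),
               threading.Thread(target=consumer_2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return executed


print(run_two_consumers())   # ['order-2', 'order-1']
```

The later order executes first even though the queue itself never reordered anything.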
Single-Threaded Queue [diagram labels: Validator, Processor]
Single-Threaded Queue. The Good: fast, very fast; no contention; no context switches; always fair. The Bad: multi-core?; not fit for intense compute and I/O (needs to be async); transient
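A minimal sketch of the single-consumer approach, assuming an in-memory queue and a hypothetical `handler` callback. With exactly one thread popping events, processing order always equals arrival order and the handler needs no locks.

```python
import queue
import threading


def run_single_threaded(events, handler):
    """Feed events to one consumer thread; return results in processing order."""
    inbox = queue.Queue()
    results = []
    stop = object()                       # sentinel that ends the loop

    def loop():
        while True:
            event = inbox.get()
            if event is stop:
                return
            results.append(handler(event))   # only this thread touches results

    worker = threading.Thread(target=loop)
    worker.start()
    for event in events:
        inbox.put(event)
    inbox.put(stop)
    worker.join()
    return results


print(run_single_threaded([3, 1, 2], lambda qty: qty * 10))   # [30, 10, 20]
```

The catch is visible in the code: a slow handler blocks every event behind it, which is why compute-heavy or I/O-heavy work has to be pushed out asynchronously.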
Single-Threaded Queue: They do it…
Disruptor (LMAX)
Segmented Queue: processor thread pool per segment [diagram labels: Validator; segments Symbol=A-H, Symbol=I-S, Symbol=T-Z; Processor]
Segmented Queue, Optimization: single processor thread pool, pick random segment [diagram labels: segments Symbol=A-H, Symbol=I-S, Symbol=T-Z; Processor]
Segmented Queue. The Good: scalable (but segments can get hot); minimizes contention. The Bad: not trivial to implement; still unfair (is total ordering needed?); transient
What about Fairness? Exclusivity is key: process only one message per segment at any given time; no exclusivity is needed across segments
Implicit Exclusivity: single processor thread per segment [diagram labels: segments Symbol=A-H, Symbol=I-S, Symbol=T-Z; Processor]
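A sketch of implicit exclusivity using the three symbol ranges from the slide. Routing on the first letter of the symbol and the names `segment_for` and `process` are assumptions made for illustration. Because each segment has exactly one worker thread, orders for the same symbol can never be processed concurrently.

```python
import queue
import threading

SEGMENTS = [("A", "H"), ("I", "S"), ("T", "Z")]   # symbol ranges from the slide
STOP = object()                                    # sentinel that ends a worker


def segment_for(symbol):
    """Return the index of the segment whose range covers the symbol."""
    first = symbol[0].upper()
    for index, (low, high) in enumerate(SEGMENTS):
        if low <= first <= high:
            return index
    raise ValueError(f"no segment for symbol {symbol!r}")


def process(orders, handler):
    """Route (symbol, qty) orders to segments; one worker thread per segment."""
    queues = [queue.Queue() for _ in SEGMENTS]
    processed = [[] for _ in SEGMENTS]     # each list is written by one thread

    def worker(index):
        while True:
            order = queues[index].get()
            if order is STOP:
                return
            handler(order)
            processed[index].append(order)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(len(SEGMENTS))]
    for thread in threads:
        thread.start()
    for order in orders:
        queues[segment_for(order[0])].put(order)
    for q in queues:
        q.put(STOP)
    for thread in threads:
        thread.join()
    return processed


orders = [("GOOG", 50), ("MSFT", 10), ("AAPL", 30), ("TSLA", 5), ("GOOG", 60)]
print(process(orders, lambda order: None))
```

In this run the A-H segment receives three of the five orders, a small-scale version of a segment getting hot.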
Explicit Exclusivity: shared thread pool, mark segments under processing (CAS) [diagram, four frames: Segment 1, Segment 2 and Segment 3 feed a shared Processor pool; a segment is marked X while one of its messages is being processed]
Explicit Exclusivity: the number of segments is key. Too few: little concurrency. Too many: wasting memory
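A sketch of explicit exclusivity with a shared pool. Python exposes no user-level compare-and-set, so a non-blocking `Lock.acquire(blocking=False)` stands in for the CAS that marks a segment as under processing; `Segment` and `drain` are hypothetical names, and the workers busy-spin for brevity.

```python
import collections
import threading
from concurrent.futures import ThreadPoolExecutor


class Segment:
    def __init__(self):
        self.messages = collections.deque()
        self.busy = threading.Lock()      # held means "under processing"


def drain(segments, handler, workers=4):
    """Process all queued messages, at most one per segment at any time."""
    def work():
        while True:
            pending = False
            for segment in segments:
                if not segment.messages:
                    continue
                pending = True
                # Try to mark the segment; skip it if another worker owns it.
                if not segment.busy.acquire(blocking=False):
                    continue
                try:
                    if segment.messages:  # re-check now that we own the segment
                        handler(segment.messages.popleft())
                finally:
                    segment.busy.release()
            if not pending:
                return                    # nothing left in any segment

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(work) for _ in range(workers)]
        for future in futures:
            future.result()               # re-raise handler errors, if any


segments = [Segment() for _ in range(3)]
for index, segment in enumerate(segments):
    for n in range(5):
        segment.messages.append((index, n))

seen = {0: [], 1: [], 2: []}
drain(segments, lambda message: seen[message[0]].append(message[1]))
print(seen)   # each segment's messages appear in their original order
```

Any worker can serve any segment, yet order within a segment is preserved because only the thread holding the mark may pop from it.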
Dynamic Segmentation: segments are created and removed as needed [diagram, five frames: segments for “GOOG”, “AAPL” and “AMZN” appear as messages carrying those symbols arrive; a segment under processing is marked X; Processor]
Dynamic Segmentation: segments are created as needed; randomize over segments until an available one is found; fast, scalable, fair. We call it “FIFO groups” or “Virtual Queues”
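A sketch of the idea, not the GigaSpaces implementation: `VirtualQueues` and its methods are hypothetical, and a single coarse lock replaces the CAS-style marking to keep the example short. A segment exists only while its group has pending or in-flight messages, and consumers claim a random idle group.

```python
import collections
import random
import threading


class VirtualQueues:
    def __init__(self):
        self._lock = threading.Lock()
        self._groups = {}     # group key (e.g. symbol) -> deque of messages
        self._busy = set()    # keys whose head message is being processed

    def put(self, key, message):
        with self._lock:
            # The segment is created on first use.
            self._groups.setdefault(key, collections.deque()).append(message)

    def take(self):
        """Claim the oldest message of a random idle group, or return None."""
        with self._lock:
            idle = [key for key in self._groups if key not in self._busy]
            if not idle:
                return None
            key = random.choice(idle)
            self._busy.add(key)
            return key, self._groups[key].popleft()

    def done(self, key):
        """Release the group; drop its segment if nothing is left."""
        with self._lock:
            self._busy.discard(key)
            if not self._groups[key]:
                del self._groups[key]

    def group_count(self):
        with self._lock:
            return len(self._groups)


vq = VirtualQueues()
vq.put("GOOG", "buy 50")
vq.put("GOOG", "sell 60")
vq.put("AAPL", "buy 10")

first, second = vq.take(), vq.take()   # one message from each symbol
print(vq.take())                       # None: both groups are under processing
vq.done("GOOG")
vq.done("AAPL")
print(vq.take())                       # ('GOOG', 'sell 60')
```

Per-symbol FIFO order holds because a group is never handed to a second consumer until `done` is called for it, while different symbols proceed in parallel.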
It Can (and Does) Get Much More Complex: memory state can get corrupted on errors; it’s not always as simple as “pop off the queue” (limits, priorities, circuit breakers, etc.); resiliency is always a pain
A Bit about Usability. What you don’t want to do: implement data structures; handle concurrency; handle HA; handle transactions
A Bit about Usability. What you want to control: event flow; grouping attribute (e.g. symbol); event handlers
Data Grid as a Foundation: transactional; highly available; supports complex matching
How We Thought of It
Thank You!
References:
http://martinfowler.com/articles/lmax.html
http://www.nytimes.com/2011/08/27/business/as-trade-volumes-soar-exchanges-cash-in.html
http://disruptor.googlecode.com/files/Disruptor-1.0.pdf
http://www.gigaspaces.com/wiki/display/XAP9/FIFO+Grouping