Physics of Market Data: Ticker Plants & Books
Why normalization costs 5 microseconds. The physics of Book Building (Deltas vs Snapshots) and Time Series Databases (KDB+/q).
🎯 What You'll Learn
- Deconstruct the Ticker Plant (Normalization)
- Analyze Book Building (Deltas vs Snapshots)
- Trace a Tick: Exchange (ITCH) -> Ticker Plant (SBE) -> Strategy
- Calculate the Memory Bandwidth of a Full Book Rebuild
- Audit a KDB+ Query for Column-Store efficiency
Introduction
Raw market data is chaos. NASDAQ speaks ITCH. CME speaks FIX-based MDP 3.0. Binance speaks JSON. Your strategy cannot speak 50 languages. It needs one language. The Ticker Plant is the Babel Fish of trading. It translates chaos into structure in < 5 microseconds.
This lesson explores the physics of Normalization and the database engines that store trillions of ticks.
The Physics: Ticker Plant (Normalization)
The Ticker Plant has one job: Translate & Broadcast. Input: Exchange Specific Protocol (e.g., NASDAQ ITCH). Output: Internal Standard Format (e.g., SBE Flat Struct).
The Physics:
- Parsing: Read 64 bytes. Extract Price/Qty.
- Mapping: Convert SymbolID: 19434 -> AAPL.
- Arithmetic: Convert Fixed Point: 150000000 -> 150.00.
- Distribution: Multicast the Normalized struct to internal switches.
Total Latency Budget: < 5 microseconds.
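The first three steps above (Parsing, Mapping, Arithmetic) can be sketched in a few lines. This is a minimal illustration, not a real feed handler: the 64-byte layout in `WIRE` is a made-up format (not the actual ITCH message layout), `PRICE_SCALE` assumes six implied decimal places so that 150000000 maps to 150.00, and the `SYMBOLS` directory holds only the one mapping mentioned in the text. The Distribution step is left out.

```python
import struct

# Hypothetical 64-byte wire layout (NOT real ITCH), little-endian:
#   uint64 timestamp_ns | uint32 symbol_id | uint32 qty | int64 price_fixed | 40 pad bytes
WIRE = struct.Struct("<QIIq40x")

# Assumption: six implied decimal places, so 150000000 -> 150.00
PRICE_SCALE = 1_000_000

# Symbol directory, normally loaded at start of day (19434 -> AAPL is from the text)
SYMBOLS = {19434: "AAPL"}


def normalize(packet: bytes) -> dict:
    """Turn one exchange-format packet into the internal normalized record."""
    ts_ns, symbol_id, qty, price_fixed = WIRE.unpack(packet)  # Parsing
    return {
        "ts_ns": ts_ns,
        "sym": SYMBOLS[symbol_id],              # Mapping
        "price": price_fixed / PRICE_SCALE,     # Arithmetic
        "qty": qty,
    }


# Build a sample packet and normalize it
raw = WIRE.pack(34_200_000_000_000, 19434, 100, 150_000_000)
tick = normalize(raw)
print(tick)
```

A production plant would do the same work in C or C++ over a fixed struct with no allocation per message; the Python version only shows the shape of the transformation.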
Deep Dive: Book Building (Deltas vs Snapshots)
Exchanges rarely send the full order book (Snapshot). It takes too much bandwidth. They send Deltas (Add, Update, Delete). Your system must maintain the “State” in local RAM.
The Physics:
- Snapshot: [100.00, 100.01, 100.02...]. Huge Bandwidth. Easy CPU.
- Delta: [Update 100.01 to Size 50]. Tiny Bandwidth. Hard CPU (Search & Modify).
- Risk: If you miss ONE delta, your local book stays wrong until you resync from a snapshot. A Crossed Book is the most visible symptom.
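To make the delta mechanics concrete, here is a small sketch of a price-level book that applies deltas in sequence and flags itself stale as soon as a sequence number is skipped. The message shape (`seq`, `side`, `price`, `size`), the rule that size 0 deletes a level, and the use of integer cents for prices are all assumptions for illustration; real feeds define their own fields.

```python
class Book:
    """Price-level book: maps price (integer cents) -> resting size per side."""

    def __init__(self):
        self.bids = {}
        self.asks = {}
        self.next_seq = 1      # sequence number we expect next
        self.stale = False     # set once a gap is seen; book can no longer be trusted

    def apply(self, seq, side, price, size):
        """Apply one delta. Returns False if the book is (or becomes) stale."""
        if self.stale or seq != self.next_seq:
            self.stale = True  # missed at least one delta: state is unknown
            return False
        self.next_seq += 1
        levels = self.bids if side == "B" else self.asks
        if size == 0:
            levels.pop(price, None)   # delete the level
        else:
            levels[price] = size      # add or update the level
        return True

    def best_bid(self):
        return max(self.bids) if self.bids else None

    def best_ask(self):
        return min(self.asks) if self.asks else None


book = Book()
book.apply(1, "B", 10000, 200)        # bid 100.00 x 200
book.apply(2, "S", 10001, 75)         # ask 100.01 x 75
book.apply(3, "S", 10001, 50)         # "Update 100.01 to Size 50"
ok = book.apply(5, "S", 10002, 10)    # seq 4 never arrived: gap detected
```

Checking the sequence number on every message is what lets the book notice a missed delta immediately, rather than waiting for a visibly impossible state to appear.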
Strategy: Time Series DB (KDB+/q)
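Saving ticks to PostgreSQL is suicide.
PostgreSQL is Row Oriented. Tick queries are Column Oriented: they scan one or two fields across millions of rows.
KDB+ (and its language q) is the standard because it stores each column as a contiguous vector, which matches CPU caches and vector instructions.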
Physics of a Query:
select avg price from trade where sym=`IBM
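- Row Store: Read row 1 (Sym, Price, Size), check Sym. Read row 2… Cache misses everywhere, because every row drags unused fields through the cache.
- Column Store: Load the Sym vector. Find the indices where Sym matches IBM. Load the Price vector at those indices. SIMD instructions compare many symbol IDs at once (for example, sixteen 32-bit IDs per AVX-512 compare).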
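The two access patterns can be sketched side by side. This is only a layout illustration under assumed data: the three trades and the integer symbol IDs in `SYM_IDS` are invented, and pure Python does not use SIMD, so the point here is which bytes each approach has to touch, not the speed.

```python
from array import array

# Row store: one record per trade; testing Sym means walking whole records.
rows = [("IBM", 130.0, 100), ("AAPL", 150.0, 200), ("IBM", 132.0, 300)]
row_prices = [price for sym, price, size in rows if sym == "IBM"]
row_avg = sum(row_prices) / len(row_prices)

# Column store: one contiguous typed array per field.
# Symbols are stored as small integer IDs (hypothetical mapping).
SYM_IDS = {"IBM": 0, "AAPL": 1}
sym_col = array("i", [0, 1, 0])
price_col = array("d", [130.0, 150.0, 132.0])
size_col = array("i", [100, 200, 300])   # never touched by this query

# Step 1: scan only the Sym column for matching indices
target = SYM_IDS["IBM"]
idx = [i for i, s in enumerate(sym_col) if s == target]

# Step 2: gather only the Price column at those indices
col_avg = sum(price_col[i] for i in idx) / len(idx)

print(row_avg, col_avg)
```

Both paths give the same answer; the column version never reads `size_col`, which is the saving a column store makes on wide tables.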
Architecture: The Distribution Bus
Once normalized, how does data reach 50 internal strats? Internal Multicast. The Ticker Plant shouts the normalized data onto the internal LAN. Strats listen via Kernel Bypass. Latency: The switch adds ~300 nanoseconds.
Code: KDB+ Query (q)
Real KDB code is terse and vector-optimized.
/ Create a table with 10 million random trades
n:10000000
t:([] time:n?09:30:00.0; sym:n?`AAPL`GOOG`MSFT; price:100+n?50.0; size:100*1+n?10)
/ Query: VWAP (Volume Weighted Average Price) by Symbol
/ runs in milliseconds due to Column Physics
select wavg[size;price] by sym from t
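For readers who do not know q, the following is a rough Python equivalent of what the VWAP query computes: for each symbol, the sum of price times size divided by the sum of size. The function name `vwap_by_sym` and the three sample trades are invented for illustration; this sketch loops row by row and says nothing about how KDB+ executes the query internally.

```python
from collections import defaultdict


def vwap_by_sym(syms, prices, sizes):
    """Volume-weighted average price per symbol, from three parallel columns."""
    notional = defaultdict(float)   # running sum of price * size
    volume = defaultdict(int)       # running sum of size
    for sym, price, size in zip(syms, prices, sizes):
        notional[sym] += price * size
        volume[sym] += size
    return {sym: notional[sym] / volume[sym] for sym in volume}


syms = ["AAPL", "GOOG", "AAPL"]
prices = [100.0, 120.0, 110.0]
sizes = [100, 200, 300]

result = vwap_by_sym(syms, prices, sizes)
print(result)
```

For AAPL the arithmetic is (100 x 100 + 110 x 300) / 400 = 107.5, which sits closer to 110 because the larger trade carries more weight.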
Practice Exercises
Exercise 1: The Bad Map (Beginner)
Scenario: Ticker plant uses a tree-based ordered map (std::map) for Symbol IDs.
Result: O(log N) lookup time. Cache misses on every node hop.
Fix: Use a flat array (std::vector). O(1) lookup. Symbol ID is the index.
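The difference between the two lookups can be sketched as follows. Python has no direct counterpart to std::map, so a binary search over sorted IDs stands in for the O(log N) ordered lookup, and a plain list indexed by symbol ID stands in for the std::vector. `MAX_SYMBOL_ID` and every ID other than 19434 are assumptions made up for the example.

```python
import bisect

# Ordered lookup (stand-in for std::map): binary search, O(log N) comparisons
sorted_ids = [17, 19434, 40000]          # hypothetical IDs, kept sorted
names = ["MSFT", "AAPL", "GOOG"]


def lookup_ordered(symbol_id):
    i = bisect.bisect_left(sorted_ids, symbol_id)
    if i < len(sorted_ids) and sorted_ids[i] == symbol_id:
        return names[i]
    return None


# Flat lookup (stand-in for std::vector): the symbol ID is the index, O(1)
MAX_SYMBOL_ID = 65_535                   # assumed upper bound on symbol IDs
symbol_table = [None] * (MAX_SYMBOL_ID + 1)
for sid, name in zip(sorted_ids, names):
    symbol_table[sid] = name


def lookup_flat(symbol_id):
    return symbol_table[symbol_id]


print(lookup_ordered(19434), lookup_flat(19434))
```

The flat table trades memory for speed: it reserves a slot for every possible ID, which is cheap when IDs are small dense integers and wasteful when they are sparse.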
Exercise 2: The Crossed Book (Intermediate)
Scenario: Your local book shows Best Bid 100.05, Best Ask 100.00.
Cause: You missed a “Modify” message that moved the Ask up, or a “Cancel” that removed the Bid.
Action: Detect cross -> Dump State -> Request Snapshot -> Rebuild.
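The detect-and-rebuild action might look like this in outline. It is a sketch under assumptions: prices are integer cents, `fake_snapshot` is a hypothetical stand-in for whatever snapshot request the exchange actually offers, and the snapshot contents assume the missed message was the Modify that moved the Ask up.

```python
def is_crossed(bids, asks):
    """True when the best bid is at or above the best ask (locked or crossed)."""
    return bool(bids) and bool(asks) and max(bids) >= min(asks)


def recover(request_snapshot):
    """Dump local state and rebuild both sides from a fresh snapshot."""
    snap = request_snapshot()
    return dict(snap["bids"]), dict(snap["asks"])


def fake_snapshot():
    # Hypothetical snapshot source; a real one would query the exchange.
    return {"bids": {10005: 300, 9999: 100}, "asks": {10006: 150}}


# Local state from the scenario: Best Bid 100.05, Best Ask 100.00
bids = {10005: 300, 9999: 100}
asks = {10000: 200, 10006: 150}

was_crossed = is_crossed(bids, asks)
if was_crossed:
    bids, asks = recover(fake_snapshot)   # Dump State -> Request Snapshot -> Rebuild
```

In practice the deltas that arrive while the snapshot is in flight also have to be buffered and replayed on top of it, which this sketch leaves out.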
Exercise 3: Storage Math (Advanced)
Scenario: 100 Million ticks per day. 64 bytes per tick.
Size: 100,000,000 x 64 bytes = 6.4 GB per day raw.
Compression: Trade ticks are highly compressible (e.g., delta encoding timestamps). KDB+ can compress columns on disk once compression is configured; it is not enabled by default.
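The storage arithmetic and the idea behind delta encoding can be checked with a short script. The timestamps here are synthetic (gaps of 1 to 1000 ns from a fixed seed), and `zlib` is used only as a convenient general-purpose compressor; this illustrates why deltas compress well, not how any particular database stores them.

```python
import random
import struct
import zlib
from itertools import accumulate

# Storage math from the scenario
TICKS_PER_DAY = 100_000_000
BYTES_PER_TICK = 64
raw_bytes_per_day = TICKS_PER_DAY * BYTES_PER_TICK   # 6.4 GB

# Synthetic nanosecond timestamps starting at 09:30:00 (assumed gap distribution)
rng = random.Random(0)
ts = [34_200_000_000_000]
for _ in range(9_999):
    ts.append(ts[-1] + rng.randint(1, 1000))

# Delta encoding: keep the first value, then store only the differences
deltas = [ts[0]] + [b - a for a, b in zip(ts, ts[1:])]


def pack(values):
    """Serialize as little-endian 64-bit integers."""
    return struct.pack(f"<{len(values)}q", *values)


raw_z = zlib.compress(pack(ts))
delta_z = zlib.compress(pack(deltas))

# Decoding is a running sum over the deltas
restored = list(accumulate(deltas))

print(raw_bytes_per_day, len(raw_z), len(delta_z))
```

The deltas are small numbers whose upper bytes are all zero, so the compressor finds far more repetition in them than in the full-width timestamps.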
Knowledge Check
- What is Normalization?
- Why use Deltas instead of Snapshots?
- Why is KDB+ faster than MySQL for ticks?
- What happens if a Ticker Plant slows down?
- What is a “Crossed Book”?
Answers
- Translation. Converting proprietary exchange formats to a standard internal format.
- Bandwidth. Sending only the changes is far smaller than resending the full state on every update.
- Columnar Storage. Matches CPU cache lines and SIMD instructions.
- Stale Strats. The entire trading firm runs on old data. Major PnL risk.
- Impossible State. Bid > Ask (Bid == Ask is a Locked Book). Implies a trade should have happened but didn’t.
Summary
- Ticker Plant: The Translator.
- Book Builder: The State Keeper.
- KDB+: The Historian.