HFT Infrastructure Scorecard
Learn how institutional trading firms build low-latency systems. Toggle each item to calculate your score.
Physics & Network
The speed of light doesn't negotiate
These are physics constraints. No software optimization can overcome them.
Colocated with exchange/matching engine?
+120µs. Cloud = 2-20ms. Colo = sub-100µs.
Why it matters: Light travels ~300km per millisecond in a vacuum, and only about 200km per millisecond in fiber. If you're in AWS us-east-1 and the exchange is in Secaucus, NJ, you're paying a ~2ms round-trip tax on every order. That's 2000 microseconds where a colocated competitor can react first.
The math: At 10,000 trades/day, if you lose 1 tick on 10% of trades due to latency, and each tick is $0.10, that's $100/day or $25,000/year in lost alpha.
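The same arithmetic as a self-contained check (the inputs are the illustrative figures above; ~250 trading days per year is an added assumption):

```cpp
// Back-of-the-envelope version of the numbers above. All inputs are the
// article's illustrative figures; 250 trading days/year is an assumption.
constexpr long trades_per_day  = 10'000;
constexpr long losing_trades   = trades_per_day / 10;           // 1 tick lost on 10% of trades
constexpr long tick_cents      = 10;                            // $0.10 per tick
constexpr long lost_cents_day  = losing_trades * tick_cents;    // 10,000 cents = $100/day
constexpr long lost_cents_year = lost_cents_day * 250;          // 2,500,000 cents = $25,000/year
static_assert(lost_cents_day == 10'000 && lost_cents_year == 2'500'000, "matches the prose");
```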
What top firms do: Firms like Citadel, Jump, and Two Sigma pay $100k+/month for colo space within 100 meters of the matching engine.
Using kernel bypass (DPDK/RDMA/XDP)?
+80µs. Standard TCP/IP = 5+ kernel layers.
Why it matters: A normal network packet traverses: NIC → Driver → Kernel Space → TCP/IP Stack → Socket Buffer → User Space. Each transition costs 1-5µs due to context switches and memory copies.
The alternative: DPDK (Data Plane Development Kit) maps the NIC directly to user-space memory. Your application polls the NIC directly - zero kernel involvement. This reduces per-packet latency from ~15µs to ~1µs.
Trade-off: You lose the kernel's TCP implementation. You must implement your own protocol handling or use a library like Seastar.
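For a sense of what the kernel-bypass path looks like in practice, here is a minimal DPDK-flavoured receive loop. It is a sketch only: it assumes the EAL, a packet mempool, and port 0 have already been configured elsewhere, and all error handling is omitted.

```cpp
#include <cstdint>
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

// Busy-poll the NIC's RX queue directly from user space: no interrupt,
// no syscall, no kernel TCP/IP stack in the path.
static void rx_loop(uint16_t port_id) {
    constexpr uint16_t kBurst = 32;
    rte_mbuf* bufs[kBurst];
    for (;;) {
        const uint16_t n = rte_eth_rx_burst(port_id, /*queue_id=*/0, bufs, kBurst);
        for (uint16_t i = 0; i < n; ++i) {
            const uint8_t* frame = rte_pktmbuf_mtod(bufs[i], const uint8_t*);
            // ... feed-handler / protocol parsing goes here ...
            (void)frame;
            rte_pktmbuf_free(bufs[i]);   // return the buffer to its mempool
        }
    }
}

int main(int argc, char** argv) {
    rte_eal_init(argc, argv);   // maps hugepages, probes DPDK-bound NICs
    rx_loop(0);                 // assumes port 0 was configured with rte_eth_dev_configure()
}
```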
NIC IRQ affinity pinned to isolated cores?
+40µs. Unpinned = 10-50µs jitter spikes.
Why it matters: When a packet arrives, the NIC fires an interrupt (IRQ). By default, Linux can route this interrupt to any CPU. If your trading thread is on CPU 2 and the IRQ lands on CPU 5, the data must cross the CPU interconnect.
Worse: If the IRQ lands on the same CPU as your trading thread, it interrupts your critical path. Both scenarios add 10-50µs of jitter.
The fix: Pin NIC IRQs to dedicated cores that do nothing but handle interrupts. Keep trading threads on separate isolated cores.
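A hedged sketch of that fix in code: write a CPU bitmask into `/proc/irq/<N>/smp_affinity` (needs root). The IRQ number and core choice below are placeholders; look up the NIC's real RX-queue IRQs in `/proc/interrupts`.

```cpp
#include <fstream>
#include <string>

// Route one IRQ to a single CPU by writing a hex bitmask (one bit per CPU).
bool pin_irq_to_cpu(int irq, int cpu) {
    std::ofstream f("/proc/irq/" + std::to_string(irq) + "/smp_affinity");
    if (!f) return false;                  // IRQ doesn't exist or we're not root
    f << std::hex << (1UL << cpu) << "\n";
    f.flush();
    return f.good();
}

// Example: send (hypothetical) IRQ 120 to housekeeping core 1, keeping
// cores 2-7 free for trading threads.
// pin_irq_to_cpu(120, 1);
```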
Hardware timestamping on NICs?
+15µs. Software timestamps = µs-level drift.
Why it matters: A software timestamp from `clock_gettime()` records when your code got around to reading the clock, not when the packet hit the wire. The call itself costs tens to hundreds of nanoseconds, and by the time you make it the packet has already sat in NIC and socket queues for an unpredictable amount of time.
Hardware timestamps: Modern NICs (Intel X710, Mellanox ConnectX) can stamp packets at the nanosecond level directly in hardware. This is essential for:
- Proving execution time to regulators
- Detecting MEV timestamp manipulation
- Accurate latency measurement
Accuracy: Software timestamps: 1-10µs error. Hardware timestamps: <50ns error.
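On Linux the usual way to get those NIC timestamps into your process is `SO_TIMESTAMPING` plus `recvmsg()` control messages. The sketch below assumes the NIC/driver supports hardware RX timestamping and that it has been enabled on the interface (e.g. with `hwstamp_ctl`); error handling is trimmed.

```cpp
#include <cstddef>
#include <linux/net_tstamp.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <time.h>

// Ask for NIC (raw hardware clock) timestamps on received packets.
void enable_rx_hw_timestamps(int fd) {
    int flags = SOF_TIMESTAMPING_RX_HARDWARE | SOF_TIMESTAMPING_RAW_HARDWARE;
    setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
}

// Receive one packet and pull its hardware timestamp out of the ancillary data.
bool recv_with_hw_timestamp(int fd, char* buf, size_t len, timespec* hw) {
    char ctrl[256];
    iovec iov{buf, len};
    msghdr msg{};
    msg.msg_iov        = &iov;
    msg.msg_iovlen     = 1;
    msg.msg_control    = ctrl;
    msg.msg_controllen = sizeof(ctrl);
    if (recvmsg(fd, &msg, 0) < 0) return false;

    for (cmsghdr* c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_TIMESTAMPING) {
            auto* ts = reinterpret_cast<const timespec*>(CMSG_DATA(c));
            *hw = ts[2];   // index 2 = raw hardware clock, index 0 = software fallback
            return true;
        }
    }
    return false;
}
```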
Architecture
Structure determines your ceiling
Your code architecture creates hard limits. Choose wrong, and no tuning will save you.
Single-threaded hot path (no locks)?
+60µs. Mutex = unpredictable tail latency.
Why it matters: A mutex lock in your critical path means threads can block each other. Even if contention is rare, when it happens, you see 50-100µs stalls. This destroys your P99 latency.
Real numbers: Uncontended mutex: ~20ns. Contended mutex: 10,000-100,000ns.
The pattern: Use a single-threaded event loop for order handling. Communicate with other threads via lock-free SPSC (Single-Producer Single-Consumer) queues.
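A minimal sketch of such an SPSC queue (illustrative only; capacity must be a power of two, and a production queue would add padding and batching):

```cpp
#include <atomic>
#include <cstddef>

template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert((Capacity & (Capacity - 1)) == 0, "Capacity must be a power of two");
public:
    bool try_push(const T& item) {                         // producer thread only
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head - tail_.load(std::memory_order_acquire) == Capacity)
            return false;                                  // full
        buf_[head & (Capacity - 1)] = item;
        head_.store(head + 1, std::memory_order_release);  // publish the slot to the consumer
        return true;
    }

    bool try_pop(T& out) {                                 // consumer thread only
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire))
            return false;                                  // empty
        out = buf_[tail & (Capacity - 1)];
        tail_.store(tail + 1, std::memory_order_release);  // hand the slot back to the producer
        return true;
    }

private:
    T buf_[Capacity];
    alignas(64) std::atomic<std::size_t> head_{0};         // written only by the producer
    alignas(64) std::atomic<std::size_t> tail_{0};         // written only by the consumer
};
```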
NUMA-aware memory allocation?
+35µs. Cross-socket = +30-100ns per access.
Why it matters: Modern servers have 2+ CPU sockets. Each socket has its own memory controller. Accessing memory "local" to your CPU: 70ns. Accessing memory on the other socket: 130ns.
The trap: Linux allocates memory from any NUMA node by default. Your trading thread on Socket 0 might be reading order book data from Socket 1's memory.
At scale: Each remote access costs an extra ~60ns. A hot path that touches remote memory tens of millions of times per second accumulates milliseconds of extra stall time every second, and every individual order-book lookup nearly doubles in cost.
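One common fix is explicit placement with libnuma; a hedged sketch (link with `-lnuma`, and node 0 below is a placeholder for whichever node your trading cores and NIC actually live on):

```cpp
#include <cstddef>
#include <numa.h>

// Allocate the order book's backing memory on a specific NUMA node.
void* alloc_on_node(std::size_t bytes, int node) {
    if (numa_available() < 0) return nullptr;   // kernel/libnuma not NUMA-capable
    return numa_alloc_onnode(bytes, node);      // pages come from `node`'s memory controller
}

void free_on_node(void* p, std::size_t bytes) {
    numa_free(p, bytes);                        // libnuma needs the size back
}

// e.g. at startup: void* book = alloc_on_node(std::size_t(1) << 30, /*node=*/0);
```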
Pre-allocated memory pools?
+50µs. malloc() = 100µs+ stalls.
Why it matters: `malloc()` and `new` are not constant-time. They search free lists, may call `mmap()`, and can trigger page faults. Worst case: you wait for the kernel to find memory.
Measured: Average malloc: 50ns. But P99 can spike to 100,000ns when memory is fragmented.
The fix: Pre-allocate everything at startup. Use object pools or arena allocators. Your hot path should never call malloc.
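A minimal sketch of the pattern: allocate once at startup, hand out slots in O(1), and never let the hot path near the allocator (the element type and pool size are placeholders):

```cpp
#include <cstddef>
#include <vector>

template <typename T>
class ObjectPool {
public:
    explicit ObjectPool(std::size_t count) : slots_(count) {
        free_.reserve(count);
        for (auto& s : slots_) free_.push_back(&s);   // the only allocation happens here
    }

    T* acquire() {                                    // O(1): no syscall, no lock, no malloc
        if (free_.empty()) return nullptr;            // exhaustion is a sizing bug, not an OOM
        T* p = free_.back();
        free_.pop_back();
        return p;
    }

    void release(T* p) { free_.push_back(p); }

private:
    std::vector<T>  slots_;                           // contiguous storage, touched at startup
    std::vector<T*> free_;
};

// e.g.: ObjectPool<Order> orders(1'000'000);   // `Order` is a placeholder type
```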
Order gateway on same NUMA node as NIC?
+25µs. Cross-socket = QPI/UPI bus latency.
Why it matters: The NIC is physically connected to one CPU socket. If your gateway process runs on the other socket, every packet crosses the inter-socket link (Intel QPI/UPI).
Hidden cost: This adds ~60ns per packet each way. For a market data feed processing 1M messages/second, that's 60ms of extra latency per second.
How to check: `cat /sys/class/net/<nic>/device/numa_node` (or `lspci -vv`) shows which NUMA node the NIC hangs off. Pin your gateway process to cores on that socket, for example with `numactl --cpunodebind=<node>`.
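A small sketch tying the check to process placement (the interface name is hypothetical; it uses libnuma for the binding, so link with `-lnuma`):

```cpp
#include <fstream>
#include <numa.h>
#include <string>

// Which NUMA node does this NIC's PCIe slot report? -1 means "unknown".
int nic_numa_node(const char* iface) {
    std::ifstream f(std::string("/sys/class/net/") + iface + "/device/numa_node");
    int node = -1;
    f >> node;
    return node;
}

// Restrict the current process to the CPUs of the NIC's socket.
bool bind_to_nic_node(const char* iface) {
    const int node = nic_numa_node(iface);
    if (node < 0 || numa_available() < 0) return false;
    return numa_run_on_node(node) == 0;
}

// e.g. early in gateway startup: bind_to_nic_node("ens1f0");   // hypothetical NIC name
```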
Software Config
The devil is in the kernel parameters
Linux defaults are optimized for throughput and power savings, not latency. You must override them.
CPU isolation (isolcpus + nohz_full)?
+25µs. Prevents OS from stealing cycles.
Why it matters: The Linux scheduler can migrate your thread to any CPU at any time. Even when it doesn't migrate, it can preempt your thread to run kernel work (RCU callbacks, timer ticks, etc.).
`isolcpus`: Tells Linux "never schedule anything on CPUs 2-7 unless explicitly asked."
`nohz_full`: Stops the kernel's periodic timer tick (typically 250-1000Hz) on those CPUs whenever only a single task is runnable there. No tick = no interruptions.
Combined effect: Your trading thread runs uninterrupted. P99 drops from 50µs to 5µs.
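Isolation only pays off if the trading thread actually lands on an isolated core. A hedged sketch, assuming a kernel command line along the lines of `isolcpus=2-7 nohz_full=2-7 rcu_nocbs=2-7` (the core range is an assumption):

```cpp
#include <sched.h>

// Pin the calling thread to one CPU; Linux will not move it afterwards.
bool pin_current_thread_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set) == 0;   // 0 = the calling thread
}

// e.g. at the top of the order-handling thread:
// pin_current_thread_to_cpu(2);   // core 2 is isolated per the boot parameters above
// run_event_loop();               // hypothetical hot-path function
```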
Huge pages (2MB or 1GB)?
+30µs. Reduces TLB misses 10-100x.
Why it matters: CPUs use a TLB (Translation Lookaside Buffer) to cache page table lookups. With 4KB pages, a 1GB dataset needs 262,144 pages. TLB only holds ~1,500 entries → constant misses.
With 2MB pages: Same 1GB dataset = 512 pages. Fits in TLB = zero misses.
Impact: Each TLB miss costs 10-100 cycles (5-50ns). For memory-intensive HFT (order books, tick databases), this adds up to 1-5µs per operation.
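A hedged sketch of backing a hot arena with huge pages via `mmap(MAP_HUGETLB)`. It assumes huge pages have been reserved up front (e.g. via the `vm.nr_hugepages` sysctl) and that the requested size is a multiple of the huge-page size:

```cpp
#include <cstddef>
#include <sys/mman.h>

// Map `bytes` of anonymous memory backed by the default huge-page size
// (typically 2MB). Returns nullptr if no huge pages are available.
void* alloc_huge(std::size_t bytes) {
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// e.g.: void* book_arena = alloc_huge(std::size_t(1) << 30);   // 1GB = 512 huge pages
```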
TCP_NODELAY and buffer tuning?
+20µs. Nagle's algorithm = 40ms delay.
Why it matters: By default, TCP runs Nagle's algorithm: small packets are held back until previously sent data has been acknowledged, and combined with delayed ACKs on the far end a small write can stall for up to ~40ms. This is great for throughput, terrible for latency.
TCP_NODELAY: Disables Nagle. Every `send()` goes out immediately.
Buffer tuning: Default socket buffers are sized for throughput. For low-latency, you want smaller buffers to reduce queuing delay.
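A minimal sketch of both settings on an order-entry socket (the 64KB send buffer is purely illustrative; the right size depends on your message rate and link):

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

void tune_order_socket(int fd) {
    int one = 1;
    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));    // disable Nagle batching

    int sndbuf = 64 * 1024;                                         // illustrative size only
    setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf)); // cap queuing delay
}
```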
C-states and P-states disabled?
+15µs. Power savings = 2-100µs wake latency.
Why it matters: Modern CPUs save power by entering "C-states" (sleep states) when idle. C1: 2µs wake-up. C3: 50µs. C6: 100µs+.
The trap: Your trading thread is waiting for a packet. CPU goes to C3. Packet arrives. You pay 50µs to wake up before processing.
P-states: CPU also varies frequency for power savings. Ramping from 2GHz to 3.5GHz takes 10-50µs.
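One common mitigation (besides BIOS settings) is the kernel's PM QoS interface: hold `/dev/cpu_dma_latency` open with a target of 0µs, and the CPUs stay out of deep C-states for as long as the file descriptor is open. A hedged sketch; it needs root and assumes the platform honors PM QoS:

```cpp
#include <cstdint>
#include <fcntl.h>
#include <unistd.h>

// Returns an fd that must stay open for the lifetime of the process;
// closing it restores the default power-management policy.
int forbid_deep_c_states() {
    int fd = open("/dev/cpu_dma_latency", O_WRONLY);
    if (fd < 0) return -1;                 // not root, or interface not present
    int32_t target_us = 0;                 // 0 = keep CPUs in the shallowest C-state
    if (write(fd, &target_us, sizeof(target_us)) != sizeof(target_us)) {
        close(fd);
        return -1;
    }
    return fd;
}
```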
Monitoring
You can't fix what you don't measure
Without proper measurement, you're optimizing blind. Averages lie. Tails tell the truth.
P99/P999 latency tracking?
- Averages hide the trades you lose.
Why it matters: Your average latency could be 10µs, but if P99 is 500µs, you're losing 1% of your trades to a competitor with consistent 50µs.
The math: In a race to the exchange, you only need to be slow once to lose that trade. If you trade 100 times/second, P99 = 1 loss/second.
What to track:
- P50 (median): your typical case
- P99: once per 100 events
- P999: once per 1000 events - often reveals GC, page faults, or scheduler issues
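A minimal sketch of pulling those percentiles out of a window of latency samples. Production systems usually stream samples into a histogram (HdrHistogram-style) instead of sorting, but the idea is the same:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// p is in [0, 100], e.g. 50, 99, or 99.9. Takes the window by value so the
// caller's buffer is left untouched.
uint64_t percentile(std::vector<uint64_t> samples_ns, double p) {
    if (samples_ns.empty()) return 0;
    const std::size_t idx =
        static_cast<std::size_t>(p / 100.0 * (samples_ns.size() - 1));
    std::nth_element(samples_ns.begin(), samples_ns.begin() + idx, samples_ns.end());
    return samples_ns[idx];
}

// e.g. per reporting interval:
// auto p50 = percentile(window, 50), p99 = percentile(window, 99), p999 = percentile(window, 99.9);
```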
Real-time latency dashboards?
- If you can't see it, you can't fix it.
Why it matters: A latency regression at 2:47 PM on Tuesday will be invisible if you only look at daily averages. You need real-time visibility to correlate spikes with events.
What to look for:
- Latency heatmaps (not line charts)
- Histogram distributions over time
- Correlation with system events (GC, cron jobs, etc.)
Alerting: Set up P99 alerts. If median is 10µs, alert at P99 > 50µs.
Toggle items to assess your infrastructure.
Want to discuss your infrastructure challenges?
Connect on LinkedIn
This scorecard is based on infrastructure patterns from Akuna Capital, Gemini, and other institutional trading firms.