Real-Time Data Pipeline Architecture for Trading Systems

9 minute read (2,274 words)

May 5th, 2026

[Figure: architecture diagram showing data flow from exchanges through processing layers to trading strategies]

Your trading strategy needs real-time market data. Not "real-time" as in "updated every minute"—real-time as in microseconds matter.

We've seen trades go wrong because of a 50ms delay that seemed insignificant. We've seen million-dollar strategies fail because a feed went stale and nobody noticed for 3 minutes. Building infrastructure that reliably delivers market data from exchange to strategy, with minimal latency and maximum reliability, is a foundational challenge in quantitative trading.

This post covers the architecture patterns, technology choices, and trade-offs involved in building real-time data pipelines for trading systems.

Why Real-Time Matters

Different use cases have different latency requirements:

High-frequency trading: Microseconds matter. Every microsecond of added latency is a competitive disadvantage.

Medium-frequency strategies: Milliseconds matter. Stale data means missed opportunities or adverse fills.

Lower-frequency strategies: Seconds to minutes acceptable. But even here, data freshness enables better execution and risk management.

Risk monitoring: Sub-second updates critical. You need to know your exposure in real-time, not from a batch job.

Even if your strategy doesn't require ultra-low latency, your risk systems do. And reliable data delivery matters at every timescale.

The Data Flow

A real-time data pipeline has five layers:

Sources → Ingestion → Processing → Storage → Consumption

Sources

Where the data comes from:

  • Exchanges: Direct feeds, co-located connections
  • Data vendors: Consolidated feeds, normalized data
  • Brokers: Proprietary data, execution feeds
  • Alternative data: News, sentiment, satellite imagery

Each source has different characteristics:

  • Latency (direct feeds faster than vendor feeds)
  • Coverage (vendors aggregate multiple exchanges)
  • Format (FIX protocol, JSON, binary protocols)
  • Reliability (redundancy, failover)

Ingestion

Getting data into your system:

  • Connection management: Maintain persistent connections, handle reconnects
  • Protocol handling: Parse exchange-specific formats
  • Timestamping: Record when data arrived (not just exchange timestamp)
  • Initial buffering: Handle bursts without dropping data
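As a rough sketch of the last two points, here is a minimal, stdlib-only buffer that stamps each message with its local arrival time and absorbs bursts up to a bounded size. The message shape and sizes are hypothetical; a production connector would add reconnect and protocol-parsing logic around this:

```python
import time
from collections import deque

class IngestBuffer:
    """Bounded buffer that stamps each message with its local arrival time."""

    def __init__(self, max_size=100_000):
        self.queue = deque(maxlen=max_size)  # oldest message drops when full
        self.dropped = 0

    def on_message(self, raw_msg: bytes, exchange_ts_ns: int):
        # Record our own receive time; never rely on the exchange timestamp alone.
        recv_ts_ns = time.time_ns()
        if len(self.queue) == self.queue.maxlen:
            self.dropped += 1  # count drops so monitoring can alert on them
        self.queue.append((recv_ts_ns, exchange_ts_ns, raw_msg))

    def drain(self):
        """Hand buffered messages to the processing layer in arrival order."""
        while self.queue:
            yield self.queue.popleft()
```

Counting drops explicitly matters: a silently lossy buffer is one of the failure modes discussed later.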

Processing

Transforming raw data into usable form:

  • Normalization: Convert to common format
  • Validation: Check data quality
  • Enrichment: Add derived fields
  • Aggregation: Build bars, compute features
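The normalization and validation steps might look like the following sketch. The raw field names (`sym`, `px`, `qty`, `ts`) are invented vendor fields, not any real feed format:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Tick:
    """Common internal format, independent of the source protocol."""
    symbol: str
    price: float
    size: int
    exchange_ts_ns: int

def normalize(raw: dict) -> Optional[Tick]:
    """Convert one (invented) vendor JSON message into a Tick, or reject it."""
    try:
        tick = Tick(
            symbol=raw["sym"].upper(),
            price=float(raw["px"]),
            size=int(raw["qty"]),
            exchange_ts_ns=int(raw["ts"]),
        )
    except (KeyError, ValueError, TypeError):
        return None  # malformed: in practice, route to a dead-letter topic
    # Validation: refuse impossible values instead of passing them downstream.
    if tick.price <= 0 or tick.size <= 0:
        return None
    return tick
```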

Storage

Persisting data for different use cases:

  • Hot storage: In-memory for real-time access
  • Warm storage: Recent history for analysis
  • Cold storage: Full history for backtesting

Consumption

Delivering data to consumers:

  • Push: Stream to subscribers
  • Pull: Query on demand
  • Hybrid: Subscribe to updates, query for history
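A toy in-process illustration of the hybrid pattern, with push and pull served from the same store. A real system would back this with Redis pub/sub or a message bus rather than Python lists:

```python
from collections import defaultdict

class MarketDataHub:
    """Hybrid delivery: push live updates to subscribers, serve history on demand."""

    def __init__(self):
        self.history = defaultdict(list)      # symbol -> ordered updates
        self.subscribers = defaultdict(list)  # symbol -> callbacks

    def subscribe(self, symbol, callback):
        self.subscribers[symbol].append(callback)

    def publish(self, symbol, update):
        self.history[symbol].append(update)  # retained for pull queries
        for cb in self.subscribers[symbol]:  # pushed to live consumers
            cb(update)

    def query(self, symbol, last_n=100):
        """Pull interface: a late joiner backfills history, then subscribes."""
        return self.history[symbol][-last_n:]
```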

Architecture Patterns

Simple Pipeline

For straightforward requirements:

Exchange → Connector → Kafka → Consumer → Strategy
                          ↓
                     TimescaleDB

Components:

  • Exchange connector handles protocol-specific parsing
  • Kafka provides buffering and durability
  • Consumers process and route to destinations
  • TimescaleDB stores time-series history

Pros: Simple, few moving parts
Cons: Limited flexibility, all-or-nothing consumption

Fan-Out Pipeline

For multiple consumers with different needs:

Exchange → Connector → Kafka → [Topic per asset class]
                                    ↓
                    ├─→ Real-time Consumer → Redis (hot)
                    ├─→ Analytics Consumer → TimescaleDB (warm)
                    └─→ Archive Consumer → S3/Parquet (cold)

Pros: Consumers can process independently
Cons: More complexity, potential consistency issues

Event-Driven Architecture

For complex processing requirements:

Exchange → Connector → Event Bus
                          ↓
            ├─→ Normalizer → Normalized Event Bus
            │                     ↓
            │       ├─→ Feature Calculator → Feature Store
            │       ├─→ Risk Calculator → Risk Engine
            │       └─→ Signal Generator → Order Manager
            │
            └─→ Raw Archiver → Cold Storage

Pros: Highly decoupled, each component can scale independently
Cons: Complex, harder to debug, potential latency accumulation

Technology Choices

Message Queues

Apache Kafka:

  • High throughput, durable
  • Good for most trading workloads
  • Typical latency: 1-10ms

NATS:

  • Lower latency than Kafka
  • Less durable (but NATS JetStream adds persistence)
  • Simpler operations

ZeroMQ:

  • Very low latency (microseconds)
  • No broker (peer-to-peer)
  • No durability by default

Aeron:

  • Ultra-low latency (low single-digit microseconds)
  • Designed for trading systems
  • Requires more expertise to operate

For most teams, Kafka is the right default. It's well-understood, widely deployed, and fast enough for all but the most latency-sensitive strategies.

Stream Processing

Kafka Streams:

  • Tight Kafka integration
  • Good for simple transformations
  • Exactly-once semantics

Apache Flink:

  • Powerful windowing and stateful processing
  • Lower latency than batch alternatives
  • More operational complexity

Custom processing:

  • For maximum control and minimum latency
  • More development effort
  • No framework overhead

For trading systems, custom processing often wins. Framework overhead matters when microseconds count.

Storage

Hot tier (in-memory):

  • Redis: Simple key-value, pub/sub
  • Aerospike: Higher throughput, persistence
  • Custom in-process: Lowest latency

Warm tier (recent history):

  • TimescaleDB: PostgreSQL-based, SQL interface
  • QuestDB: Column-oriented, very fast queries
  • InfluxDB: Purpose-built for time-series

Cold tier (full history):

  • Parquet on S3: Columnar, cost-effective
  • Delta Lake: ACID transactions on object storage
  • Apache Iceberg: Modern table format

Design Considerations

Latency vs Throughput

You can optimize a single path for one or the other, but not both simultaneously.

Low latency design:

  • Process immediately, don't batch
  • In-memory everything
  • Direct connections, no intermediaries
  • Single-threaded to avoid lock contention
  • Kernel bypass networking (DPDK, Solarflare OpenOnload)
  • Busy-polling instead of interrupt-driven I/O

High throughput design:

  • Batch for efficiency
  • Compress data
  • Parallelize processing
  • Trade latency for volume

The tension is real: batching improves throughput but adds latency. Compression saves bandwidth but costs CPU cycles. Parallelization increases throughput but introduces coordination overhead.

Most trading systems need both—low latency for the hot path (signal generation, order submission) and high throughput for the warm path (analytics, storage). Design separate paths rather than trying to optimize one path for both.
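One concrete piece of the warm path is micro-batching, which makes the trade-off explicit. In this sketch, `max_batch` buys throughput and `max_wait_s` caps the latency that batching adds; both thresholds are illustrative:

```python
import time

class MicroBatcher:
    """Warm-path batching: flush when the batch fills or a deadline passes.

    Larger max_batch raises throughput; smaller max_wait_s bounds added latency.
    """

    def __init__(self, flush_fn, max_batch=500, max_wait_s=0.050):
        self.flush_fn = flush_fn
        self.max_batch = max_batch
        self.max_wait_s = max_wait_s
        self.batch = []
        self.oldest_ts = None  # when the current batch started filling

    def add(self, msg, now=None):
        now = time.monotonic() if now is None else now
        if not self.batch:
            self.oldest_ts = now
        self.batch.append(msg)
        if len(self.batch) >= self.max_batch or now - self.oldest_ts >= self.max_wait_s:
            self.flush()

    def flush(self):
        if self.batch:
            self.flush_fn(self.batch)
            self.batch = []
```

A real implementation would also flush on a timer, so a lull in traffic cannot strand a partial batch indefinitely.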

Reliability

Data loss is expensive. A missed tick during a flash crash could mean a missed trading opportunity—or worse, a risk system that doesn't know your true exposure.

Redundancy: Multiple connections to data sources. Not just primary/backup, but ideally from different network paths. If your primary and backup both go through the same switch, you haven't solved the problem.

Persistence: Don't rely on in-memory only. But understand the latency cost of persistence. Write-ahead logs add microseconds. Synchronous replication adds milliseconds. Know which data must be durable and which can be rebuilt.

Monitoring: Know immediately when something fails. "Immediately" means seconds, not minutes. If your alerting latency is longer than your data staleness tolerance, you'll learn about problems from traders, not dashboards.

Recovery: Ability to replay from checkpoint. This requires careful design—your consumers must be idempotent, your timestamps must be deterministic, and your replay must not affect live trading.
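One way to make replay safe is to track applied sequence numbers so duplicates are skipped. A simplified in-memory sketch; a real system would persist the applied set alongside the checkpoint:

```python
class IdempotentConsumer:
    """Replay-safe consumer: applies each sequence number at most once."""

    def __init__(self, apply_fn):
        self.apply_fn = apply_fn
        self.applied = set()  # in practice: persisted with the checkpoint
        self.checkpoint = 0

    def consume(self, seq: int, msg):
        if seq in self.applied:
            return False  # replay overlap: safe to skip
        self.apply_fn(msg)
        self.applied.add(seq)
        self.checkpoint = max(self.checkpoint, seq)
        return True

    def replay_from(self, log, start_seq):
        """Re-deliver everything from a checkpoint; duplicates are harmless."""
        for seq, msg in log:
            if seq >= start_seq:
                self.consume(seq, msg)
```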

Failure modes to design for:

  • Network partition between data center and exchange
  • Vendor feed going stale (sending data, but old data)
  • Upstream system sending malformed messages
  • Clock drift causing timestamp inconsistencies
  • Memory pressure causing garbage collection pauses
  • Disk full preventing persistence

Scalability

As data volume grows, your pipeline needs to scale. But scaling a real-time system is different from scaling a batch system.

Horizontal scaling: Add more consumers. But this only works if your data is partitionable. Order book updates for a single symbol cannot be parallelized—they must be processed in order.

Partitioning strategies:

  • By symbol: Most common, works well for independent instruments
  • By exchange: Good for multi-venue strategies
  • By asset class: Useful when processing logic differs
  • By client: For multi-tenant systems

The partition key determines your parallelism ceiling. Choose carefully—repartitioning a live system is painful.
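A minimal symbol partitioner, using a stable digest (Python's built-in `hash` is seeded per process, so it cannot be used for routing). Note how the modulus bakes the partition count into every routing decision:

```python
import hashlib

def partition_for(symbol: str, num_partitions: int) -> int:
    """Route all updates for one symbol to one partition, preserving per-symbol order.

    A stable digest is used because Python's hash() varies between processes.
    """
    digest = hashlib.md5(symbol.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

Changing `num_partitions` remaps almost every symbol, which is one concrete reason repartitioning a live system is painful.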

Backpressure: Handle bursts without dropping data. Market opens, economic announcements, and flash crashes all produce traffic spikes of 10-100x normal volume. Your system must either buffer (adding latency) or shed load intelligently (dropping less important data first).
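A sketch of priority-based shedding with a bounded buffer: when full, the lowest-priority message is dropped first. The priority scheme (trades above depth updates) is illustrative:

```python
import heapq

class SheddingBuffer:
    """Bounded buffer that sheds the lowest-priority message when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.heap = []    # min-heap on priority: lowest priority at the root
        self.counter = 0  # tie-breaker keeps FIFO order within a priority

    def offer(self, priority: int, msg) -> bool:
        item = (priority, self.counter, msg)
        self.counter += 1
        if len(self.heap) < self.capacity:
            heapq.heappush(self.heap, item)
            return True
        if priority > self.heap[0][0]:          # newcomer outranks the weakest
            heapq.heapreplace(self.heap, item)  # shed the lowest-priority message
            return True
        return False                            # shed the newcomer instead

    def drain(self):
        """Return surviving messages in arrival order."""
        return [m for _, _, m in sorted(self.heap, key=lambda t: t[1])]
```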

Data Volume

Typical volumes for different data types:

Data Type          Volume per Day     Storage per Year
Daily bars         MB                 GB
Minute bars        GB                 TB
Tick data          Tens of GB         Tens of TB
Full order book    Hundreds of GB     PB

Plan your storage accordingly. Full order book data for US equities alone can exceed 5TB per day uncompressed. Most firms keep full depth for recent history (days to weeks) and progressively downsample older data.

The economics matter: storing a year of tick data costs roughly $10-50K in cloud storage. Storing a year of full order book data costs 10-100x more. Factor in egress costs if you're running backtests that scan historical data.

Operational Realities

What Actually Goes Wrong

In theory, data flows smoothly from exchange to strategy. In practice:

Vendor feeds go stale. The connection stays up, messages keep arriving, but the timestamps stop advancing. Your system thinks it's receiving live data when it's actually receiving delayed or replayed data. We've seen a feed replay 10-minute-old data during a flash crash—the system happily traded on prices that no longer existed. Detection requires comparing feed timestamps to wall clock time—and handling the legitimate case where markets are simply quiet.
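Detection can be sketched as a monitor that compares the newest feed timestamp to wall-clock time. The threshold here is an assumption and would need to vary with market hours:

```python
import time

class StalenessMonitor:
    """Flags a feed whose timestamps stop advancing while messages keep arriving.

    max_staleness_s is an assumed threshold; a real system must also allow for
    legitimately quiet markets (e.g. widen the threshold outside trading hours).
    """

    def __init__(self, max_staleness_s=2.0):
        self.max_staleness_s = max_staleness_s
        self.last_feed_ts = None

    def on_message(self, feed_ts: float):
        # Track the newest feed timestamp seen, even if messages arrive out of order.
        self.last_feed_ts = max(self.last_feed_ts or feed_ts, feed_ts)

    def is_stale(self, now=None) -> bool:
        now = time.time() if now is None else now
        if self.last_feed_ts is None:
            return False  # nothing received yet: a different alarm covers that
        return now - self.last_feed_ts > self.max_staleness_s
```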

Exchanges send bad data. Erroneous prints, crossed markets, trades at impossible prices. Your pipeline needs to either filter these (risking filtering legitimate data) or pass them through with quality flags (requiring downstream systems to handle bad data).

Sequence gaps appear. You receive message 1000, then message 1002. Is message 1001 lost forever, or just delayed? The answer determines whether you should wait, request retransmission, or proceed without it.
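A small tracker for exactly this situation distinguishes gaps that stay open from gaps filled by stragglers:

```python
class GapTracker:
    """Tracks sequence gaps and whether late messages eventually fill them."""

    def __init__(self):
        self.expected = None   # next sequence number we expect
        self.open_gaps = set() # sequence numbers never seen
        self.filled_late = 0   # gaps closed by out-of-order arrivals

    def on_message(self, seq: int):
        if self.expected is None:
            self.expected = seq + 1
            return
        if seq == self.expected:
            self.expected += 1
        elif seq > self.expected:
            # Jumped ahead: everything in between is, for now, missing.
            self.open_gaps.update(range(self.expected, seq))
            self.expected = seq + 1
        elif seq in self.open_gaps:
            self.open_gaps.remove(seq)  # a straggler, not a loss
            self.filled_late += 1
        # else: duplicate, ignore
```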

Timestamps lie. Exchange timestamps reflect when the event occurred at the exchange. Your receipt timestamps reflect when you received it. The difference varies by milliseconds to seconds depending on network conditions. Reconciling these for accurate latency measurement is surprisingly difficult.

Bursts overwhelm buffers. Market opens produce 100x normal message rates. FOMC announcements are worse—we've seen 500x normal traffic in the 100ms after a rate decision. If your buffer fills, you either drop messages (bad) or block upstream (also bad, and may cascade). Proper backpressure design is essential but rarely implemented correctly the first time.

Monitoring That Actually Helps

Generic infrastructure monitoring (CPU, memory, disk) isn't enough. You need domain-specific observability:

Message rates by symbol and exchange. A sudden drop might indicate a feed problem—or a trading halt. You need context to distinguish.

Latency percentiles, not averages. P99 latency matters more than mean latency. A system with 1ms mean latency but 100ms P99 will cause problems that don't show up in average-based dashboards.
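The point is easy to demonstrate with a nearest-rank percentile over synthetic latencies: one slow message in a hundred barely moves the mean but dominates P99:

```python
def percentile(samples, p):
    """Nearest-rank percentile: p=0.99 gives P99."""
    ranked = sorted(samples)
    idx = min(len(ranked) - 1, int(p * len(ranked)))
    return ranked[idx]

# 99 fast messages and one slow one: the mean hides what P99 reveals.
latencies_ms = [1.0] * 99 + [100.0]
mean = sum(latencies_ms) / len(latencies_ms)  # ~2 ms: looks fine
p99 = percentile(latencies_ms, 0.99)          # 100 ms: the real problem
```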

Sequence gap tracking. How many gaps per hour? How long until gaps are filled? Trending upward is a warning sign.

Cross-feed divergence. If you have multiple feeds for the same instruments, they should agree. Divergence indicates a problem with at least one feed.

Consumer lag. How far behind real-time is each consumer? Lag that grows over time indicates a consumer that can't keep up with production rate.

This is a core area where our observability platform adds value: real-time visibility into the pipeline health metrics that matter for trading. But the tuning is the hard part, and it is ongoing work: the metrics must distinguish operational issues from normal market behavior. What counts as abnormal latency changes as your infrastructure evolves. What counts as a suspicious throughput drop depends on time of day and market conditions. The platform provides the visibility; your team provides the judgment about what the numbers mean and when thresholds need adjustment.

Common Architectures

For Small Teams

Vendor Feed → Python Connector → Redis → Strategy
                    ↓
               PostgreSQL

Simple, maintainable, good enough for many use cases. Use a vendor feed to avoid exchange connectivity complexity. Redis for real-time, PostgreSQL for history.

For Medium Teams

Exchange Feeds → Go Connector → Kafka → Consumer Group
                                            ↓
                          ├─→ Redis (latest quotes)
                          ├─→ TimescaleDB (bars, history)
                          └─→ S3 (tick archive)

Multiple exchanges, robust message bus, separate storage tiers. Go or Rust connectors for performance.

For Large Teams

Multiple Exchanges → Custom FPGAs → Low-Latency Bus
                                         ↓
                         ├─→ Strategy Engines (co-located)
                         ├─→ Risk Systems
                         └─→ Archival Pipeline → Data Lake

Hardware-accelerated where latency matters, sophisticated infrastructure throughout. Significant engineering investment.

The Build vs Buy Decision

When Custom Makes Sense

Build custom infrastructure when:

Latency is your edge. If you're competing on speed, every component in your stack is a potential optimization target. Generic solutions optimize for the general case, not your specific case.

Your requirements are unusual. Multi-asset strategies, exotic instruments, or unique data sources may not fit vendor assumptions.

You have the team. Building and operating real-time infrastructure requires specific expertise. If you don't have it, buying buys you time to develop it.

When Vendor Solutions Win

Buy when:

Time-to-market matters more than optimization. A vendor solution that's 80% as good but available today often beats a custom solution that's perfect but takes 18 months.

Your edge is elsewhere. If your alpha comes from better signals, not faster execution, infrastructure is a cost center. Minimize it.

Operational burden exceeds value. Running Kafka, TimescaleDB, and Redis in production requires on-call rotations, upgrade planning, and incident response. Managed services transfer that burden.

The Hybrid Approach

Most mature firms end up with a hybrid: vendor solutions for commodity infrastructure (message queues, databases), custom code for the hot path (connectors, signal generation, order routing).

The key is knowing which is which. Don't build a message queue. Don't buy a trading strategy.

Conclusion

Real-time data pipeline architecture is foundational to trading systems. The choices you make here constrain everything downstream—latency, reliability, scalability.

Start simple. Measure everything. Optimize where it matters.

Most teams over-engineer initially and under-monitor. Build something that works, instrument it thoroughly, and iterate based on data. The teams that succeed are not the ones with the most sophisticated initial architecture—they're the ones who can see what's happening in their pipeline and respond quickly when things go wrong.

That's the value of systematic observability: not eliminating operational work, but changing it from reactive debugging to proactive refinement. When you can see latency percentiles, message rates, and consumer lag in real-time, you catch problems before they cascade. When you have historical baselines, you can tune alerting thresholds based on data rather than guesswork. The infrastructure requires ongoing attention—but instrumented infrastructure tells you where to focus that attention.

If you need help designing data infrastructure for trading systems—or building the observability layer to understand what's actually happening—reach out. We've built these systems at multiple scales and can help you make the right trade-offs for your situation.

Frequently Asked Questions

What is the best message queue for trading systems?
For most teams, Apache Kafka is the right default—it's well-understood, widely deployed, and fast enough for all but ultra-low-latency strategies (1-10ms typical latency). For sub-millisecond requirements, consider ZeroMQ (microseconds) or Aeron (low single-digit microseconds, designed for trading systems). NATS offers lower latency than Kafka with simpler operations.
How do you handle market data bursts at market open?
Market opens can produce 100x normal message rates; FOMC announcements can spike to 500x. Design for backpressure: either buffer (adding latency) or shed load intelligently (dropping less important data first). Your system must handle bursts without dropping critical data or blocking upstream.
What latency should I target for trading data pipelines?
It depends on your strategy. High-frequency trading requires microseconds. Medium-frequency strategies need milliseconds. Lower-frequency strategies can tolerate seconds. Risk monitoring systems need sub-second updates. Even if your strategy doesn't require ultra-low latency, your risk systems do.
Should I build or buy trading data infrastructure?
Build custom when latency is your competitive edge, your requirements are unusual, or you have the specialized team. Buy when time-to-market matters more than optimization, your alpha comes from signals not execution, or operational burden exceeds value. Most mature firms use a hybrid: vendor solutions for commodity infrastructure, custom code for the hot path.