Real-Time Data Pipeline Architecture for Trading Systems
9 minute read (2,274 words)
May 5th, 2026

Your trading strategy needs real-time market data. Not "real-time" as in "updated every minute"—real-time as in microseconds matter.
We've seen trades go wrong because of a 50ms delay that seemed insignificant. We've seen million-dollar strategies fail because a feed went stale and nobody noticed for 3 minutes. Building infrastructure that reliably delivers market data from exchange to strategy, with minimal latency and maximum reliability, is a foundational challenge in quantitative trading.
This post covers the architecture patterns, technology choices, and trade-offs involved in building real-time data pipelines for trading systems.
Why Real-Time Matters
Different use cases have different latency requirements:
High-frequency trading: Microseconds matter, and at the extreme end nanoseconds do too. Every increment of latency is a competitive disadvantage.
Medium-frequency strategies: Milliseconds matter. Stale data means missed opportunities or adverse fills.
Lower-frequency strategies: Seconds to minutes acceptable. But even here, data freshness enables better execution and risk management.
Risk monitoring: Sub-second updates critical. You need to know your exposure in real-time, not from a batch job.
Even if your strategy doesn't require ultra-low latency, your risk systems do. And reliable data delivery matters at every timescale.
The Data Flow
A real-time data pipeline has five layers:
Sources → Ingestion → Processing → Storage → Consumption
Sources
Where the data comes from:
- Exchanges: Direct feeds, co-located connections
- Data vendors: Consolidated feeds, normalized data
- Brokers: Proprietary data, execution feeds
- Alternative data: News, sentiment, satellite imagery
Each source has different characteristics:
- Latency (direct feeds faster than vendor feeds)
- Coverage (vendors aggregate multiple exchanges)
- Format (FIX protocol, JSON, binary protocols)
- Reliability (redundancy, failover)
Ingestion
Getting data into your system:
- Connection management: Maintain persistent connections, handle reconnects
- Protocol handling: Parse exchange-specific formats
- Timestamping: Record when data arrived, not just the exchange timestamp (see the connector sketch after this list)
- Initial buffering: Handle bursts without dropping data
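To make the first three concrete, here is a minimal connector sketch using only the standard library. The feed host, port, and newline-delimited framing are assumptions for illustration; a real vendor feed has its own framing, handshake, and heartbeats.

```python
import socket
import time

FEED_HOST, FEED_PORT = "feed.example.com", 9000  # hypothetical vendor endpoint

def run_connector(handle_message):
    """Maintain a persistent connection, reconnect with backoff, timestamp on arrival."""
    backoff = 0.1
    while True:
        try:
            with socket.create_connection((FEED_HOST, FEED_PORT), timeout=5) as sock:
                sock.settimeout(None)  # connect timeout only; block indefinitely on reads
                backoff = 0.1          # reset after a successful connect
                for raw in sock.makefile("rb"):  # assumes newline-delimited messages
                    recv_ts_ns = time.time_ns()  # our receipt time, not the exchange's
                    handle_message(raw, recv_ts_ns)
        except OSError:
            time.sleep(backoff)  # reconnect with exponential backoff, capped at 5s
            backoff = min(backoff * 2, 5.0)
```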
Processing
Transforming raw data into usable form:
- Normalization: Convert to a common format (sketched after this list)
- Validation: Check data quality
- Enrichment: Add derived fields
- Aggregation: Build bars, compute features
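A sketch of what normalization, validation, and enrichment can look like in one place. The raw field names ("s", "b", "a", "t") are hypothetical vendor-specific keys, not a standard:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quote:
    symbol: str
    bid: float
    ask: float
    exch_ts_ns: int  # exchange timestamp
    recv_ts_ns: int  # our receipt timestamp
    mid: float       # derived field added during enrichment

def normalize(raw: dict, recv_ts_ns: int) -> Quote:
    bid, ask = float(raw["b"]), float(raw["a"])
    if not (0 < bid <= ask):  # validation: reject crossed or nonsensical quotes
        raise ValueError(f"bad quote: {raw}")
    return Quote(raw["s"], bid, ask, int(raw["t"]), recv_ts_ns, (bid + ask) / 2)
```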
Storage
Persisting data for different use cases:
- Hot storage: In-memory for real-time access
- Warm storage: Recent history for analysis
- Cold storage: Full history for backtesting
Consumption
Delivering data to consumers:
- Push: Stream to subscribers
- Pull: Query on demand
- Hybrid: Subscribe to updates, query for history (sketched below)
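A sketch of the hybrid pattern with the redis-py client. The key and channel naming (hist:SYMBOL, quotes:SYMBOL) is our own convention for illustration:

```python
import time
import redis

r = redis.Redis()
symbol = "AAPL"

# Pull: backfill the last minute from a sorted set scored by receipt time (ns)
since = time.time_ns() - 60_000_000_000
history = r.zrangebyscore(f"hist:{symbol}", since, "+inf")

# Push: then switch to live updates over pub/sub
sub = r.pubsub()
sub.subscribe(f"quotes:{symbol}")
for msg in sub.listen():
    if msg["type"] == "message":
        print(msg["data"])  # hand off to the strategy in a real system
```

In production you would subscribe first, then backfill, and deduplicate the overlap; otherwise updates published between the query and the subscribe are lost.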
Architecture Patterns
Simple Pipeline
For straightforward requirements:
Exchange → Connector → Kafka → Consumer → Strategy
                         ↓
                    TimescaleDB
Components:
- Exchange connector handles protocol-specific parsing
- Kafka provides buffering and durability
- Consumers process and route to destinations
- TimescaleDB stores time-series history
Pros: Simple, few moving parts
Cons: Limited flexibility, all-or-nothing consumption
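A minimal consumer for this pattern with confluent-kafka; the broker address, topic, and group id are placeholders:

```python
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "strategy-feed",
    "auto.offset.reset": "latest",  # a live strategy wants fresh data, not a replay
})
consumer.subscribe(["ticks.equities"])

try:
    while True:
        msg = consumer.poll(timeout=0.1)
        if msg is None:
            continue
        if msg.error():
            continue  # log and alert in a real system
        print(msg.key(), msg.value())  # route to the strategy and TimescaleDB here
finally:
    consumer.close()
```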
Fan-Out Pipeline
For multiple consumers with different needs:
Exchange → Connector → Kafka → [Topic per asset class]
                         ↓
                         ├─→ Real-time Consumer → Redis (hot)
                         ├─→ Analytics Consumer → TimescaleDB (warm)
                         └─→ Archive Consumer → S3/Parquet (cold)
Pros: Consumers can process independently
Cons: More complexity, potential consistency issues
Event-Driven Architecture
For complex processing requirements:
Exchange → Connector → Event Bus
                         ↓
                         ├─→ Normalizer → Normalized Event Bus
                         │                  ↓
                         │                  ├─→ Feature Calculator → Feature Store
                         │                  ├─→ Risk Calculator → Risk Engine
                         │                  └─→ Signal Generator → Order Manager
                         │
                         └─→ Raw Archiver → Cold Storage
Pros: Highly decoupled, each component can scale independently
Cons: Complex, harder to debug, potential latency accumulation
Technology Choices
Message Queues
Apache Kafka:
- High throughput, durable
- Good for most trading workloads
- Typical latency: 1-10ms
NATS:
- Lower latency than Kafka
- Less durable (but NATS JetStream adds persistence)
- Simpler operations
ZeroMQ:
- Very low latency (microseconds)
- No broker (peer-to-peer)
- No durability by default
Aeron:
- Ultra-low latency (nanoseconds)
- Designed for trading systems
- Requires more expertise to operate
For most teams, Kafka is the right default. It's well-understood, widely deployed, and fast enough for all but the most latency-sensitive strategies.
Stream Processing
Kafka Streams:
- Tight Kafka integration
- Good for simple transformations
- Exactly-once semantics
Apache Flink:
- Powerful windowing and stateful processing
- Lower latency than batch alternatives
- More operational complexity
Custom processing:
- For maximum control and minimum latency
- More development effort
- No framework overhead
For trading systems, custom processing often wins. Framework overhead matters when microseconds count.
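Part of the appeal is how little code the hot path actually needs. A minimal in-process 1-second bar builder, with a field layout of our own choosing:

```python
bars = {}  # symbol -> state of the bar currently being built

def on_trade(symbol: str, price: float, size: int, ts_ns: int):
    """Fold each trade into a 1-second OHLCV bar; emit when the second rolls over."""
    second = ts_ns // 1_000_000_000
    bar = bars.get(symbol)
    if bar is None or bar["second"] != second:
        if bar is not None:
            emit(symbol, bar)  # the previous bar is complete
        bars[symbol] = bar = {"second": second, "open": price, "high": price,
                              "low": price, "close": price, "volume": 0}
    bar["high"] = max(bar["high"], price)
    bar["low"] = min(bar["low"], price)
    bar["close"] = price
    bar["volume"] += size

def emit(symbol: str, bar: dict):
    print(symbol, bar)  # publish downstream in a real system
```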
Storage
Hot tier (in-memory):
- Redis: Simple key-value, pub/sub
- Aerospike: Higher throughput, persistence
- Custom in-process: Lowest latency
Warm tier (recent history):
- TimescaleDB: PostgreSQL-based, SQL interface
- QuestDB: Column-oriented, very fast queries
- InfluxDB: Purpose-built for time-series
Cold tier (full history):
- Parquet on S3: Columnar, cost-effective
- Delta Lake: ACID transactions on object storage
- Apache Iceberg: Modern table format
Design Considerations
Latency vs Throughput
You can optimize for one or the other, but not both simultaneously.
Low latency design:
- Process immediately, don't batch
- In-memory everything
- Direct connections, no intermediaries
- Single-threaded to avoid lock contention
- Kernel bypass networking (DPDK, Solarflare OpenOnload)
- Busy-polling instead of interrupt-driven I/O
High throughput design:
- Batch for efficiency
- Compress data
- Parallelize processing
- Trade latency for volume
The tension is real: batching improves throughput but adds latency. Compression saves bandwidth but costs CPU cycles. Parallelization increases throughput but introduces coordination overhead.
Most trading systems need both—low latency for the hot path (signal generation, order submission) and high throughput for the warm path (analytics, storage). Design separate paths rather than trying to optimize one path for both.
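The warm-path side of that split is often just a small batcher that flushes on size or on age, whichever comes first. A sketch with illustrative thresholds:

```python
import time

class Batcher:
    """Trade bounded latency for throughput: flush at max_size or after max_age_s."""

    def __init__(self, flush, max_size=500, max_age_s=0.05):
        self.flush, self.max_size, self.max_age_s = flush, max_size, max_age_s
        self.items, self.oldest = [], None

    def add(self, item):
        if not self.items:
            self.oldest = time.monotonic()
        self.items.append(item)
        if (len(self.items) >= self.max_size
                or time.monotonic() - self.oldest >= self.max_age_s):
            self.flush(self.items)
            self.items, self.oldest = [], None
```

A real implementation also needs a timer so a quiet market still flushes the tail; that detail is omitted here.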
Reliability
Data loss is expensive. A missed tick during a flash crash could mean a missed trading opportunity—or worse, a risk system that doesn't know your true exposure.
Redundancy: Multiple connections to data sources. Not just primary/backup, but ideally from different network paths. If your primary and backup both go through the same switch, you haven't solved the problem.
Persistence: Don't rely on in-memory only. But understand the latency cost of persistence. Write-ahead logs add microseconds. Synchronous replication adds milliseconds. Know which data must be durable and which can be rebuilt.
Monitoring: Know immediately when something fails. "Immediately" means seconds, not minutes. If your alerting latency is longer than your data staleness tolerance, you'll learn about problems from traders, not dashboards.
Recovery: Ability to replay from checkpoint. This requires careful design—your consumers must be idempotent, your timestamps must be deterministic, and your replay must not affect live trading.
Failure modes to design for:
- Network partition between data center and exchange
- Vendor feed going stale (sending data, but old data)
- Upstream system sending malformed messages
- Clock drift causing timestamp inconsistencies
- Memory pressure causing garbage collection pauses
- Disk full preventing persistence
Scalability
As data volume grows, your pipeline needs to scale. But scaling a real-time system is different from scaling a batch system.
Horizontal scaling: Add more consumers. But this only works if your data is partitionable. Order book updates for a single symbol cannot be parallelized—they must be processed in order.
Partitioning strategies:
- By symbol: Most common, works well for independent instruments
- By exchange: Good for multi-venue strategies
- By asset class: Useful when processing logic differs
- By client: For multi-tenant systems
The partition key determines your parallelism ceiling. Choose carefully—repartitioning a live system is painful.
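With Kafka, partitioning by symbol is just keying by symbol: messages with the same key always land in the same partition, which preserves per-symbol ordering. A sketch with confluent-kafka (the topic name is a placeholder):

```python
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def publish(symbol: str, payload: bytes):
    # Same key -> same partition -> in-order consumption per symbol
    producer.produce("ticks.equities", key=symbol, value=payload)

publish("AAPL", b'{"bid": 187.21, "ask": 187.22}')
producer.flush()
```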
Backpressure: Handle bursts without dropping data. Market opens, economic announcements, and flash crashes all produce traffic spikes of 10-100x normal volume. Your system must either buffer (adding latency) or shed load intelligently (dropping less important data first).
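A sketch of one shedding policy, a bounded queue that drops the oldest entries under pressure. This is appropriate for last-value-wins data like quotes, not for order book deltas, where every message matters:

```python
from collections import deque

buffer = deque(maxlen=100_000)  # a full deque silently evicts the oldest item

def enqueue(msg):
    buffer.append(msg)  # bounded: bursts cost old data, not memory or blocking
```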
Data Volume
Typical volumes for different data types:
| Data Type | Volume per Day | Storage per Year |
|---|---|---|
| Daily bars | MB | GB |
| Minute bars | GB | TB |
| Tick data | Tens of GB | Hundreds of TB |
| Full order book | Hundreds of GB | PB |
Plan your storage accordingly. Full order book data for US equities alone can exceed 5TB per day uncompressed. Most firms keep full depth for recent history (days to weeks) and progressively downsample older data.
The economics matter: storing a year of tick data costs roughly $10-50K in cloud storage. Storing a year of full order book data costs 10-100x more. Factor in egress costs if you're running backtests that scan historical data.
Operational Realities
What Actually Goes Wrong
In theory, data flows smoothly from exchange to strategy. In practice:
Vendor feeds go stale. The connection stays up, messages keep arriving, but the timestamps stop advancing. Your system thinks it's receiving live data when it's actually receiving delayed or replayed data. We've seen a feed replay 10-minute-old data during a flash crash—the system happily traded on prices that no longer existed. Detection requires comparing feed timestamps to wall clock time—and handling the legitimate case where markets are simply quiet.
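A sketch of that staleness check. It distinguishes "messages arriving but timestamps not advancing" (stale or replayed) from "no messages at all" (quiet market, halt, or dead connection), and assumes your clock is synchronized well enough for a seconds-level threshold:

```python
import time

STALE_NS = 2_000_000_000  # 2s tolerance; tune per feed and per market

class StalenessMonitor:
    def __init__(self):
        self.last_exch_ts_ns = 0
        self.last_recv_ns = 0

    def on_message(self, exch_ts_ns: int):
        self.last_exch_ts_ns = max(self.last_exch_ts_ns, exch_ts_ns)
        self.last_recv_ns = time.time_ns()

    def check(self) -> str:
        now = time.time_ns()
        if now - self.last_recv_ns > STALE_NS:
            return "quiet-or-dead"  # nothing arriving: halt? quiet market? dead feed?
        if now - self.last_exch_ts_ns > STALE_NS:
            return "stale"  # messages arriving, timestamps not advancing: replay/delay
        return "ok"
```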
Exchanges send bad data. Erroneous prints, crossed markets, trades at impossible prices. Your pipeline needs to either filter these (risking filtering legitimate data) or pass them through with quality flags (requiring downstream systems to handle bad data).
Sequence gaps appear. You receive message 1000, then message 1002. Is message 1001 lost forever, or just delayed? The answer determines whether you should wait, request retransmission, or proceed without it.
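A sketch of gap tracking with a bounded wait, assuming per-channel monotonically increasing sequence numbers:

```python
import time

class GapTracker:
    """Record sequence gaps; after a deadline, declare the missing messages lost."""

    def __init__(self, wait_s=0.25):
        self.expected = None  # next sequence number we expect
        self.pending = {}     # missing seq -> deadline (monotonic seconds)
        self.wait_s = wait_s

    def on_message(self, seq: int):
        if self.expected is not None and seq > self.expected:
            deadline = time.monotonic() + self.wait_s
            for missing in range(self.expected, seq):
                self.pending[missing] = deadline  # wait, or request retransmission
        self.pending.pop(seq, None)  # a late arrival fills its gap
        self.expected = max(self.expected or 0, seq + 1)

    def lost(self) -> list:
        now = time.monotonic()
        return [s for s, d in self.pending.items() if now > d]
```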
Timestamps lie. Exchange timestamps reflect when the event occurred at the exchange. Your receipt timestamps reflect when you received it. The difference varies by milliseconds to seconds depending on network conditions. Reconciling these for accurate latency measurement is surprisingly difficult.
Bursts overwhelm buffers. Market opens produce 100x normal message rates. FOMC announcements are worse—we've seen 500x normal traffic in the 100ms after a rate decision. If your buffer fills, you either drop messages (bad) or block upstream (also bad, and may cascade). Proper backpressure design is essential but rarely implemented correctly the first time.
Monitoring That Actually Helps
Generic infrastructure monitoring (CPU, memory, disk) isn't enough. You need domain-specific observability:
Message rates by symbol and exchange. A sudden drop might indicate a feed problem—or a trading halt. You need context to distinguish.
Latency percentiles, not averages. P99 latency matters more than mean latency. A system with 1ms mean latency but 100ms P99 will cause problems that don't show up in average-based dashboards.
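Tracking tail latency inline is cheap. A sketch over a rolling window of samples:

```python
from collections import deque

window = deque(maxlen=10_000)  # most recent latency samples, in microseconds

def record(latency_us: float):
    window.append(latency_us)

def percentile(p: float) -> float:
    xs = sorted(window)
    if not xs:
        return 0.0
    return xs[min(int(p / 100 * len(xs)), len(xs) - 1)]

# Alert on the tail, not the mean: a healthy mean can hide a broken P99.
# if percentile(99) > P99_BUDGET_US: page_someone()
```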
Sequence gap tracking. How many gaps per hour? How long until gaps are filled? Trending upward is a warning sign.
Cross-feed divergence. If you have multiple feeds for the same instruments, they should agree. Divergence indicates a problem with at least one feed.
Consumer lag. How far behind real-time is each consumer? Lag that grows over time indicates a consumer that can't keep up with production rate.
This is a core area where our observability platform adds value: real-time visibility into the pipeline health metrics that matter for trading. But the platform has to be tuned to distinguish operational issues from normal market behavior, and that tuning is ongoing work. What counts as abnormal latency changes as your infrastructure evolves. What counts as a suspicious throughput drop depends on the time of day and market conditions. The platform provides the visibility; your team provides the judgment about what the numbers mean and when thresholds need adjustment.
Common Architectures
For Small Teams
Vendor Feed → Python Connector → Redis → Strategy
                     ↓
                PostgreSQL
Simple, maintainable, good enough for many use cases. Use a vendor feed to avoid exchange connectivity complexity. Redis for real-time, PostgreSQL for history.
For Medium Teams
Exchange Feeds → Go Connector → Kafka → Consumer Group
                                           ↓
                                           ├─→ Redis (latest quotes)
                                           ├─→ TimescaleDB (bars, history)
                                           └─→ S3 (tick archive)
Multiple exchanges, robust message bus, separate storage tiers. Go or Rust connectors for performance.
For Large Teams
Multiple Exchanges → Custom FPGAs → Low-Latency Bus
                                         ↓
                                         ├─→ Strategy Engines (co-located)
                                         ├─→ Risk Systems
                                         └─→ Archival Pipeline → Data Lake
Hardware-accelerated where latency matters, sophisticated infrastructure throughout. Significant engineering investment.
The Build vs Buy Decision
When Custom Makes Sense
Build custom infrastructure when:
Latency is your edge. If you're competing on speed, every component in your stack is a potential optimization target. Generic solutions optimize for the general case, not your specific case.
Your requirements are unusual. Multi-asset strategies, exotic instruments, or unique data sources may not fit vendor assumptions.
You have the team. Building and operating real-time infrastructure requires specific expertise. If you don't have it, buying buys you time to develop it.
When Vendor Solutions Win
Buy when:
Time-to-market matters more than optimization. A vendor solution that's 80% as good but available today often beats a custom solution that's perfect but takes 18 months.
Your edge is elsewhere. If your alpha comes from better signals, not faster execution, infrastructure is a cost center. Minimize it.
Operational burden exceeds value. Running Kafka, TimescaleDB, and Redis in production requires on-call rotations, upgrade planning, and incident response. Managed services transfer that burden.
The Hybrid Approach
Most mature firms end up with a hybrid: vendor solutions for commodity infrastructure (message queues, databases), custom code for the hot path (connectors, signal generation, order routing).
The key is knowing which is which. Don't build a message queue. Don't buy a trading strategy.
Conclusion
Real-time data pipeline architecture is foundational to trading systems. The choices you make here constrain everything downstream—latency, reliability, scalability.
Start simple. Measure everything. Optimize where it matters.
Most teams over-engineer initially and under-monitor. Build something that works, instrument it thoroughly, and iterate based on data. The teams that succeed are not the ones with the most sophisticated initial architecture—they're the ones who can see what's happening in their pipeline and respond quickly when things go wrong.
That's the value of systematic observability: not eliminating operational work, but changing it from reactive debugging to proactive refinement. When you can see latency percentiles, message rates, and consumer lag in real-time, you catch problems before they cascade. When you have historical baselines, you can tune alerting thresholds based on data rather than guesswork. The infrastructure requires ongoing attention—but instrumented infrastructure tells you where to focus that attention.
If you need help designing data infrastructure for trading systems—or building the observability layer to understand what's actually happening—reach out. We've built these systems at multiple scales and can help you make the right trade-offs for your situation.