Data Quality Monitoring - Referential Labs

Bad data in, bad signals out. A single corrupted data point can cascade through your entire pipeline—and you won't know until your P&L tells you.

The Problem

Trading systems depend on data quality, but data problems are often invisible until they cause losses:

  • Gaps: Missing ticks, bars, or updates that create false signals
  • Stale data: Feeds that stop updating without throwing errors
  • Outliers: Bad prints, erroneous quotes, or corrupted records
  • Schema changes: Upstream changes that break your pipeline
  • Distribution drift: Gradual changes in data characteristics that invalidate models

Our Approach

Systematic monitoring that surfaces data issues for investigation. The platform catches anomalies; your team determines which are real problems versus unusual but legitimate market behavior.

Gap Detection

Real-time detection of missing data. Alert immediately when expected updates don't arrive.

Anomaly Detection

Statistical outlier detection for prices, volumes, and derived metrics. Catch bad data before it propagates.
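One common technique (a sketch of the general approach, not our exact production logic) is a robust z-score built on the median absolute deviation, which the bad prints themselves cannot easily distort:

```python
import statistics

def flag_outliers(prices: list[float], threshold: float = 6.0) -> list[int]:
    """Flag indices whose robust z-score exceeds the threshold.
    Median/MAD resist distortion from the very outliers being hunted."""
    med = statistics.median(prices)
    mad = statistics.median(abs(p - med) for p in prices)
    if mad == 0:
        return []  # constant series: nothing to scale against
    scale = 1.4826 * mad  # makes MAD comparable to a std dev under normality
    return [i for i, p in enumerate(prices)
            if abs(p - med) / scale > threshold]
```

A mean/standard-deviation z-score would shrink in the presence of a large bad print; the median-based version keeps flagging it.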

Schema Monitoring

Detect field additions, removals, and type changes. Prevent silent pipeline failures.
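The core of such a check is a diff between an expected schema and what actually arrives. A minimal sketch, using Python types as a stand-in for a real schema registry:

```python
def diff_schema(baseline: dict[str, type], record: dict) -> dict[str, list]:
    """Compare one record against the expected schema.
    Returns added fields, missing fields, and type changes."""
    observed = {k: type(v) for k, v in record.items()}
    return {
        "added":   sorted(observed.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - observed.keys()),
        "retyped": sorted(k for k in baseline.keys() & observed.keys()
                          if observed[k] is not baseline[k]),
    }
```

The "retyped" case is the quiet killer: a numeric field arriving as a string often parses far downstream before anything visibly breaks.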

Distribution Tracking

Monitor feature distributions over time. Alert when characteristics drift outside expected ranges.
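One widely used drift metric is the population stability index, sketched below with bin edges taken from the baseline sample. The 0.1/0.25 cutoffs are conventional rules of thumb, not platform defaults:

```python
import math

def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population Stability Index between a baseline and a current sample.
    Conventional reading: < 0.1 stable, 0.1-0.25 drifting, > 0.25 shifted."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline
    def hist(xs: list[float]) -> list[float]:
        counts = [0] * bins
        for x in xs:
            # Clamp out-of-range values into the edge buckets.
            i = min(max(int((x - lo) / width), 0), bins - 1)
            counts[i] += 1
        # Smooth empty buckets so the log term stays finite.
        return [(c + 0.5) / (len(xs) + 0.5 * bins) for c in counts]
    return sum((a - e) * math.log(a / e)
               for e, a in zip(hist(expected), hist(actual)))
```

Identical distributions score zero; mass moving between buckets pushes the score up regardless of direction.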

Feature Store Monitoring

For ML-based strategies, feature quality is everything. We monitor:

  • Freshness: Are features being updated on schedule?
  • Completeness: What's the null rate? Are there unexpected gaps?
  • Distribution: Has the feature distribution shifted from training data?
  • Point-in-time correctness: Are historical lookups returning the right values?
  • Cross-feature consistency: Do related features stay in sync?

Data Lineage

When something goes wrong, you need to know the blast radius. Our lineage tracking shows:

  • Which raw data sources feed which features
  • Which strategies depend on which features
  • Impact analysis when a source has issues
  • Historical audit trail of data changes
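At its core, lineage is a directed graph from sources through features to strategies, and impact analysis is a graph walk. A minimal sketch with a hypothetical lineage (the feed, feature, and strategy names are invented for illustration):

```python
from collections import deque

def impacted(downstream: dict[str, list[str]], source: str) -> set[str]:
    """Walk the lineage graph breadth-first from a failing source,
    collecting every downstream feature and strategy it touches."""
    seen, queue = set(), deque([source])
    while queue:
        node = queue.popleft()
        for child in downstream.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

# Hypothetical lineage: raw feed -> derived features -> strategies.
lineage = {
    "nyse_tape":  ["vwap_5m", "spread_ema"],
    "vwap_5m":    ["momo_strategy"],
    "spread_ema": ["mm_strategy", "momo_strategy"],
}
```

A bad `nyse_tape` tick therefore puts both strategies in the blast radius, while a `vwap_5m` computation bug touches only one.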

Ongoing Calibration

Data quality monitoring requires continuous refinement:

  • Thresholds need tuning per instrument and data source
  • New data sources require new validation rules
  • Market regime changes affect what counts as "anomalous"
  • False positives need investigation and rule adjustment
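The per-instrument tuning above often reduces to a layered configuration: global defaults with instrument-level overrides. A small sketch (the instrument names, keys, and values are hypothetical):

```python
# Hypothetical threshold table: per-instrument overrides over global defaults.
DEFAULTS = {"max_gap_s": 5.0, "outlier_z": 6.0}
OVERRIDES = {
    "ILLIQUID_BOND": {"max_gap_s": 300.0},  # sparse quotes are normal here
    "ES_FUT":        {"outlier_z": 8.0},    # wide intraday swings are routine
}

def threshold(instrument: str, name: str) -> float:
    """Resolve a monitoring threshold: instrument override, else default."""
    return OVERRIDES.get(instrument, {}).get(name, DEFAULTS[name])
```

Keeping the overrides in data rather than code means a false-positive investigation ends in a config change, not a redeploy.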

The platform handles routine detection so your team can focus on judgment calls: Is this a bad print or a legitimate block trade? Is this gap a feed problem or a trading halt? Over time, your rules improve as you learn what patterns matter for your specific data.

Get Started

Get visibility into your data infrastructure. We'll help you establish baseline monitoring and refine it as you learn what issues actually affect your strategies.

Contact Us

Frequently Asked Questions

What data sources can you monitor?
We monitor market data feeds, alternative data sources, feature stores, and derived signals. We support real-time streaming and batch data. Each source type has different characteristics—we help you configure appropriate checks for each.
How do you detect data quality issues?
We use statistical methods to detect gaps, outliers, stale data, schema changes, and distribution shifts. Checks are configurable per data source and asset class. Flagged anomalies require human review—distinguishing bad data from unusual but legitimate market events often requires judgment.
Can you monitor my feature store?
Yes. We track feature freshness, null rates, distribution drift, and point-in-time correctness. For ML-based strategies, feature quality directly impacts model performance. We help you establish monitoring that catches issues before they affect predictions.
Do you provide data lineage tracking?
Yes. We trace data from source through transformations to consumption. When an issue is detected, you can see which downstream strategies and features are affected. This helps prioritize which issues need immediate attention versus which can wait.