Streaming Machine Learning
Build real-time ML pipelines that compute features from event streams, execute streaming inference, and implement online learning that adapts models continuously.
Streaming Feature Engineering
Real-time features capture up-to-the-moment context that batch features, computed hours or days earlier, miss. Stream processing frameworks enable continuous feature computation:
| Feature Type | Example | Window |
|---|---|---|
| Count Aggregations | Transactions in last 5 minutes | Tumbling or sliding |
| Statistical Features | Average order value last hour | Sliding window |
| Behavioral Sequences | Last 10 pages visited | Session-based |
| Rate Features | Login attempts per minute | Tumbling window |
| Recency Features | Time since last purchase | Event-triggered |
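As a minimal sketch of one row in this table, a count aggregation like "transactions in the last 5 minutes" can be maintained with a deque of timestamps, assuming events arrive with non-decreasing timestamps (the class and its interface are illustrative, not from any particular framework):

```python
from collections import deque

class RollingCount:
    """Count events in the most recent `window_seconds`, assuming
    timestamps arrive in non-decreasing order."""
    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()

    def add(self, timestamp):
        self.events.append(timestamp)
        self._evict(timestamp)

    def value(self, now):
        self._evict(now)
        return len(self.events)

    def _evict(self, now):
        # Drop timestamps that have fallen out of the window.
        while self.events and self.events[0] <= now - self.window:
            self.events.popleft()
```

The same evict-on-read pattern extends to statistical features by keeping running sums alongside the deque.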
Windowing Strategies
Tumbling Windows
Fixed-size, non-overlapping time intervals. Each event belongs to exactly one window. Best for periodic aggregations like hourly counts or daily averages.
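The "exactly one window" property follows from integer division of the timestamp by the window size; a sketch of the assignment (function name is illustrative):

```python
def tumbling_window(timestamp, size):
    """Return the single [start, end) tumbling window containing the event."""
    start = (timestamp // size) * size
    return (start, start + size)
```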
Sliding Windows
Fixed-size windows that advance by a smaller step. Events can belong to multiple windows. Best for smooth rolling aggregations like moving averages.
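Because sliding windows overlap, one event maps to several windows. A sketch of the assignment, assuming window starts are aligned to multiples of the step:

```python
def sliding_windows(timestamp, size, step):
    """Return every [start, start + size) window containing the event."""
    first_k = (timestamp - size) // step + 1  # earliest start still covering the event
    last_k = timestamp // step                # latest start at or before the event
    return [(k * step, k * step + size) for k in range(first_k, last_k + 1)]
```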
Session Windows
Dynamic windows defined by gaps in activity. A new window starts when no events arrive within a timeout period. Best for user session analytics.
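The gap-timeout rule can be sketched over a sorted stream of timestamps (a batch-style illustration; real stream processors do this incrementally with timers):

```python
def session_windows(timestamps, gap):
    """Group sorted timestamps into sessions; a new session starts
    when the gap since the previous event exceeds `gap`."""
    sessions, current = [], []
    for t in timestamps:
        if current and t - current[-1] > gap:
            sessions.append(current)
            current = []
        current.append(t)
    if current:
        sessions.append(current)
    return sessions
```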
Global Windows
A single window that spans all time, with custom triggers for output. Best for maintaining running totals or lifetime aggregations per entity.
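A per-entity lifetime aggregation over a global window reduces to keyed running state; the sketch below emits the running total on every update, i.e. a count-based trigger that fires on each event (names are illustrative):

```python
from collections import defaultdict

class RunningTotal:
    """Per-entity lifetime sum in a single global window; each
    update acts as a trigger that emits the current total."""
    def __init__(self):
        self.totals = defaultdict(float)

    def update(self, entity, amount):
        self.totals[entity] += amount
        return self.totals[entity]
```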
Streaming Inference Patterns
Embedded Inference
Load the model directly into the stream processor. Each event triggers a prediction within the same process, minimizing latency and network overhead.
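A minimal sketch of embedded inference, with a hand-rolled linear model standing in for any loaded model artifact; scoring happens synchronously in the same process as event handling:

```python
class EmbeddedScorer:
    """A model held in the stream processor's own process."""
    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def score(self, features):
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias

def process_event(event, scorer):
    # Prediction runs inline: no network hop, no serialization.
    event["score"] = scorer.score(event["features"])
    return event
```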
External Service Call
Stream processor calls an external model serving endpoint for predictions. Adds network latency but enables independent model scaling and updates.
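A sketch of the call-out pattern; the endpoint URL and response shape are hypothetical, and the `post` callable can be injected so the transport (and its latency) stays swappable and testable:

```python
import json
from urllib import request

def score_remote(event, endpoint, post=None):
    """Enrich an event by calling an external model-serving endpoint.
    `endpoint` and the {"score": ...} response shape are assumptions."""
    payload = json.dumps({"features": event["features"]}).encode()
    if post is None:
        def post(url, body):
            req = request.Request(url, data=body,
                                  headers={"Content-Type": "application/json"})
            with request.urlopen(req) as resp:
                return json.loads(resp.read())
    event["score"] = post(endpoint, payload)["score"]
    return event
```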
Async Enrichment
Publish events to a prediction topic, consume results from a response topic. Decouples stream processing from inference for better resilience.
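The decoupling can be sketched with in-memory queues standing in for the prediction and response topics (in production these would be broker-backed topics such as Kafka's; the function names are illustrative):

```python
from queue import Queue

prediction_topic = Queue()  # stand-in for the prediction topic
response_topic = Queue()    # stand-in for the response topic

def submit(event):
    # The stream processor publishes and moves on without blocking.
    prediction_topic.put(event)

def inference_worker(model):
    # A separate consumer scores events and publishes results.
    while not prediction_topic.empty():
        event = prediction_topic.get()
        event["score"] = model(event["features"])
        response_topic.put(event)
```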
Side-Output Scoring
Fork the event stream, score events asynchronously, and rejoin results with the main flow. Enables non-blocking inference in latency-sensitive pipelines.
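A simplified, synchronous sketch of the fork-score-rejoin shape (the real pattern scores the side branch asynchronously; here the join key and function names are assumptions for illustration):

```python
def fork_and_rejoin(events, model):
    """Fork the stream, score the side output, rejoin by event id."""
    main_flow = list(events)                      # main branch continues untouched
    side_output = {e["id"]: model(e["features"])  # scoring branch
                   for e in main_flow}
    return [{**e, "score": side_output[e["id"]]} for e in main_flow]  # rejoin
```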
Online Learning
Online learning updates models continuously as new data arrives, rather than waiting for batch retraining:
- Incremental Updates: Update model parameters with each new observation using algorithms like online gradient descent or incremental matrix factorization
- Mini-Batch Streaming: Collect small batches of events and update the model periodically, balancing update frequency with computational efficiency
- Concept Drift Adaptation: Detect distribution shifts and accelerate model updates or trigger full retraining when drift exceeds thresholds
- A/B Validation: Run online-updated models alongside batch-trained baselines to verify that continuous updates improve rather than degrade predictions
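The first bullet, incremental updates via online gradient descent, can be sketched for a linear model: each observation nudges the weights against the squared-error gradient, so the model adapts as the stream arrives (class name and learning rate are illustrative):

```python
class OnlineLinearModel:
    """Online (stochastic) gradient descent for linear regression."""
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def update(self, x, y):
        # One gradient step on the squared error for this observation.
        err = self.predict(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err
        return err
```

The same update loop is what a mini-batch streaming variant would run once per collected batch instead of once per event.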
Lilly Tech Systems