Streaming Machine Learning

Build real-time ML pipelines that compute features from event streams, execute streaming inference, and implement online learning that adapts models continuously.

Streaming Feature Engineering

Real-time features capture the latest context that batch features miss. Stream processing frameworks enable continuous feature computation:

Feature Type           Example                           Window
Count Aggregations     Transactions in last 5 minutes    Tumbling or sliding
Statistical Features   Average order value last hour     Sliding window
Behavioral Sequences   Last 10 pages visited             Session-based
Rate Features          Login attempts per minute         Tumbling window
Recency Features       Time since last purchase          Event-triggered
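As a concrete illustration, an event-triggered recency feature can be maintained with nothing more than per-entity state. This is a minimal sketch with hypothetical names (`update_recency`, `last_purchase`), not a framework API:

```python
# Minimal sketch: an event-triggered recency feature.
# Each purchase event updates the per-user "time since last purchase".
from typing import Optional

last_purchase: dict[str, float] = {}  # per-user state: timestamp of last purchase

def update_recency(user_id: str, event_ts: float) -> Optional[float]:
    """Return seconds since the user's previous purchase, then record this one."""
    recency = event_ts - last_purchase[user_id] if user_id in last_purchase else None
    last_purchase[user_id] = event_ts
    return recency

assert update_recency("u1", 100.0) is None  # first purchase: no recency yet
assert update_recency("u1", 160.0) == 60.0  # 60 seconds since last purchase
```

In a real pipeline this per-user state would live in the stream processor's keyed state store rather than an in-memory dict.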

Windowing Strategies

  1. Tumbling Windows

    Fixed-size, non-overlapping time intervals. Each event belongs to exactly one window. Best for periodic aggregations like hourly counts or daily averages.

  2. Sliding Windows

    Fixed-size windows that advance by a smaller step. Events can belong to multiple windows. Best for smooth rolling aggregations like moving averages.

  3. Session Windows

    Dynamic windows defined by gaps in activity. A new window starts when no events arrive within a timeout period. Best for user session analytics.

  4. Global Windows

    A single window that spans all time, with custom triggers for output. Best for maintaining running totals or lifetime aggregations per entity.
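The core difference between tumbling and sliding windows is how a single event timestamp maps to window bounds. The following standalone sketch (no framework, assumed half-open bounds [start, start + size)) makes that mapping explicit:

```python
# Minimal sketch: assigning an event timestamp to tumbling vs. sliding windows.

def tumbling_window(ts: float, size: float) -> tuple[float, float]:
    """Each timestamp falls in exactly one fixed, non-overlapping window."""
    start = (ts // size) * size
    return (start, start + size)

def sliding_windows(ts: float, size: float, step: float) -> list[tuple[float, float]]:
    """A timestamp can belong to several overlapping windows that advance by `step`."""
    first = (ts // step) * step - size + step  # earliest window still covering ts
    return [(s, s + size)
            for s in (first + i * step for i in range(int(size // step)))
            if s <= ts < s + size]

# A 60 s tumbling window holds the event once; 60 s windows sliding by 30 s hold it twice.
assert tumbling_window(125.0, 60) == (120.0, 180.0)
assert sliding_windows(125.0, 60, 30) == [(90.0, 150.0), (120.0, 180.0)]
```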

Critical Consideration: Handle late-arriving events with watermark strategies. Define how long to wait for late data before closing windows, and decide whether to update or discard features computed from late events.
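A watermark policy can be sketched as state that trails the maximum observed event time by an allowed-lateness bound; a window is safe to close once the watermark passes its end. The class and parameter names below are illustrative, not any specific framework's API:

```python
# Minimal sketch of a watermark policy: the watermark trails the max event
# time seen so far by `allowed_lateness` seconds.

class Watermark:
    def __init__(self, allowed_lateness: float):
        self.allowed_lateness = allowed_lateness
        self.max_event_time = float("-inf")

    def advance(self, event_ts: float) -> float:
        """Observe an event and return the current watermark."""
        self.max_event_time = max(self.max_event_time, event_ts)
        return self.max_event_time - self.allowed_lateness

    def is_late(self, event_ts: float) -> bool:
        """Events older than the watermark are late: update or discard."""
        return event_ts < self.max_event_time - self.allowed_lateness

wm = Watermark(allowed_lateness=10.0)
wm.advance(100.0)                # watermark is now 90.0
assert not wm.is_late(95.0)      # within the lateness bound: still accepted
assert wm.is_late(85.0)          # behind the watermark: handle as late data
```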

Streaming Inference Patterns

Embedded Inference

Load the model directly into the stream processor. Each event triggers a prediction within the same process, minimizing latency and network overhead.
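The embedded pattern can be sketched as loading the model once at startup and calling it inline for every event. The model, event fields, and scores below are hypothetical stand-ins:

```python
# Minimal sketch of embedded inference: the model lives inside the consumer
# process, so each event is scored in-process with no network hop.

def load_model():
    """Stand-in for deserializing a trained model at processor startup."""
    weights = {"amount": 0.002, "attempts": 0.3}  # assumed linear model
    bias = -1.0
    return lambda event: bias + sum(w * event.get(k, 0.0) for k, w in weights.items())

score = load_model()  # loaded once, reused for every event

def process(event: dict) -> dict:
    """Per-event handler: prediction happens in the same process as the stream."""
    event["fraud_score"] = score(event)
    return event

out = process({"amount": 500.0, "attempts": 2})
assert abs(out["fraud_score"] - 0.6) < 1e-9  # -1.0 + 0.002*500 + 0.3*2
```

The trade-off is that every model update requires redeploying (or hot-reloading) the stream processor itself.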

External Service Call

Stream processor calls an external model serving endpoint for predictions. Adds network latency but enables independent model scaling and updates.

Async Enrichment

Publish events to a prediction topic, consume results from a response topic. Decouples stream processing from inference for better resilience.

Side-Output Scoring

Fork the event stream, score events asynchronously, and rejoin results with the main flow. Enables non-blocking inference in latency-sensitive pipelines.

Online Learning

Online learning updates models continuously as new data arrives, rather than waiting for batch retraining:

  • Incremental Updates: Update model parameters with each new observation using algorithms like online gradient descent or incremental matrix factorization
  • Mini-Batch Streaming: Collect small batches of events and update the model periodically, balancing update frequency with computational efficiency
  • Concept Drift Adaptation: Detect distribution shifts and accelerate model updates or trigger full retraining when drift exceeds thresholds
  • A/B Validation: Run online-updated models alongside batch-trained baselines to verify that continuous updates improve rather than degrade predictions
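The incremental-update idea can be shown with one online gradient descent step for linear regression: each arriving observation nudges the weights, with no batch retrain. The learning rate and feature layout are assumptions for illustration:

```python
# Minimal sketch of incremental updates via online gradient descent
# (squared-error loss on a linear model; not a production learner).

def sgd_step(w: list[float], x: list[float], y: float, lr: float = 0.1) -> list[float]:
    """One online update from a single observation (x, y)."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    err = pred - y
    return [wi - lr * err * xi for wi, xi in zip(w, x)]

w = [0.0, 0.0]
for x, y in [([1.0, 2.0], 5.0), ([2.0, 1.0], 4.0)]:  # events arrive one at a time
    w = sgd_step(w, x, y)

assert all(wi > 0 for wi in w)  # weights moved toward the positive signal
```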
Looking Ahead: In the next lesson, we will explore real-time decision systems, covering decision engines, rule-model hybrids, and production patterns for fraud detection and dynamic pricing.