Advanced
Enhancements
Streaming features, governance, team onboarding, and frequently asked questions.
Enhancement 1: Streaming Features
# Stream features from Kafka using Feast
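from feast import Field, KafkaSource, StreamFeatureView
from feast.data_format import JsonFormat
from feast.types import Float32, Int64

# `driver` is the Entity defined earlier in the feature repo.
kafka_source = KafkaSource(
    name="driver_trips_stream",
    kafka_bootstrap_servers="localhost:9092",
    topic="driver_trips",
    timestamp_field="event_timestamp",
    # Feast expects a StreamFormat object here, not a plain dict.
    message_format=JsonFormat(schema_json="..."),
    # Recent Feast releases may also require a batch_source argument here.
)

driver_trips_stream = StreamFeatureView(
    name="driver_trips_stream",
    entities=[driver],
    schema=[
        Field(name="trips_today", dtype=Int64),
        Field(name="total_distance", dtype=Float32),
    ],
    source=kafka_source,
    online=True,
)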
Enhancement 2: Feature Governance
# Feature catalog with metadata and ownership
feature_catalog = {
    "driver_stats": {
        "owner": "ml-platform-team",
        "description": "Core driver performance metrics",
        "sla": "99.9% availability, <5ms p99 latency",
        "pii": False,
        "tags": ["driver", "performance"],
        "consumers": ["trip-prediction", "driver-ranking"],
    }
}

# Access control
def check_access(team, feature_view):
    allowed = {
        "ml-team": ["driver_stats", "trip_features"],
        "analytics": ["driver_stats"],
    }
    return feature_view in allowed.get(team, [])
Enhancement 3: Team Onboarding
# Feature discovery CLI
# feast feature-views list
# feast entities list
# feast feature-services list
# SDK usage for new teams
from feast import FeatureStore
store = FeatureStore(repo_path="feature_repo")
# Discover available features
for fv in store.list_feature_views():
    print(f"{fv.name}: {[f.name for f in fv.features]}")
    print(f" Entity: {fv.entities}")
    print(f" Tags: {fv.tags}")
Frequently Asked Questions
Q: Feast vs Tecton vs Hopsworks?
Feast is open-source and self-hosted. Tecton is a managed platform with streaming support. Hopsworks focuses on feature pipelines. Feast is best for teams wanting full control.
Q: How often should I materialize?
It depends on how fresh each feature needs to be. For real-time features, use streaming ingestion. For hourly freshness, run materialization from a cron job. For daily freshness, a batch pipeline is enough.
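For the hourly and daily cases, a scheduled job mainly has to decide which time window to materialize. The sketch below shows one way to compute that window. The helper `next_window` and the `INTERVALS` values are assumptions made for illustration and are not part of Feast; in a real job the resulting window could be passed to `FeatureStore.materialize(start_date, end_date)`.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical mapping from freshness tier to job interval (illustrative values).
INTERVALS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
}

def next_window(last_end, now, tier):
    """Return (start, end) for the next materialization run, or None if it is not due yet."""
    interval = INTERVALS[tier]
    if now - last_end < interval:
        return None  # not enough time has passed since the last run
    return last_end, now

last_end = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
now = datetime(2024, 1, 1, 1, 30, tzinfo=timezone.utc)

print(next_window(last_end, now, "hourly"))  # due: returns the (start, end) pair
print(next_window(last_end, now, "daily"))   # not due yet: None

# In a real job the window would then be handed to Feast, for example:
# store.materialize(start_date=start, end_date=end)
```

Starting each window at the end of the previous one keeps consecutive runs contiguous, so no events fall between two runs.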
Q: How do I handle feature versioning?
Put the version in the feature view name (e.g., driver_stats_v2) and move consumers to the new version in stages. Keep old versions alive until all consumers have migrated.
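One way to know when an old version can be removed is to track which consumers still read it. The following is a minimal sketch under that assumption. The `consumers_by_version` table and the `migrate` and `can_retire` helpers are hypothetical names for illustration, not Feast APIs.

```python
# Hypothetical record of which consumers still read each feature view version.
consumers_by_version = {
    "driver_stats": {"driver-ranking"},
    "driver_stats_v2": {"trip-prediction"},
}

def migrate(consumer, old, new):
    """Move one consumer from the old version to the new one."""
    consumers_by_version[old].discard(consumer)
    consumers_by_version.setdefault(new, set()).add(consumer)

def can_retire(version):
    """A version is safe to remove only when no consumer reads it."""
    return not consumers_by_version.get(version)

print(can_retire("driver_stats"))  # False: driver-ranking still reads it
migrate("driver-ranking", "driver_stats", "driver_stats_v2")
print(can_retire("driver_stats"))  # True: no consumers remain
```

The same consumer lists could live in the feature catalog's "consumers" field, so that retirement checks and ownership metadata stay in one place.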
Q: Can I use this with non-Python models?
Yes. The FastAPI serving API returns features as JSON, so a client written in any language can consume them. You can also use gRPC for lower latency.
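The language independence comes from the wire format: the response body is plain JSON text, which every mainstream language can parse. The sketch below shows the round trip in Python for consistency with the rest of this guide. The payload shape is an assumption, and the real field names depend on how the serving API was defined.

```python
import json

# Hypothetical response payload; real field names depend on the serving API.
features = {"driver_id": 1001, "trips_today": 12, "total_distance": 87.5}

body = json.dumps(features)  # text the server would send over HTTP
decoded = json.loads(body)   # what a client in any language would parse

print(body)
print(decoded["trips_today"])
```

A Java, Go, or JavaScript client would perform the same parsing step with its own JSON library after an ordinary HTTP request.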
Congratulations
You have built a complete ML feature platform with feature definitions, offline/online stores, a serving API, and monitoring. These are the same patterns used by leading ML teams at scale.