Advanced
Design TikTok/Instagram Content Feed
Design the content ranking system for a short-video platform with 500M daily active users. This is among the hardest recommendation problems because content is highly ephemeral (new videos appear every second), user preferences shift rapidly, and the system must balance user engagement against creator fairness and societal impact.
Step 1: Clarify Requirements
Requirements (confirmed with interviewer):
Scale:
- 500M daily active users (DAU)
- Average session: 30 minutes, ~100 videos viewed per session
- 2M new videos uploaded per day
- Content library: 5B total videos
- Peak QPS: 2M feed requests per second
Latency:
- Feed generation: < 200ms for initial 10 videos
- Prefetch: next batch ready before user finishes current video
- Real-time adaptation: incorporate engagement signals within same session
Functional:
- "For You" feed (personalized, not following-based)
- Must surface new creators (not just popular ones)
- Content diversity (topics, formats, creators)
- Support cold start for new users (no watch history)
Success Metrics:
- Online: Session duration, videos watched, next-day retention
- NOT just engagement: also track user satisfaction surveys, regret rate
- Creator metrics: views distribution, new creator reach
- Safety: harmful content exposure rate
Constraints:
- Must handle 100+ languages and cultural contexts
- Content moderation (filter unsafe content before ranking)
- Regulatory: transparency on why content was shown
- Ethical: avoid addictive dark patterns, respect user well-being
Step 2: High-Level Architecture
Architecture: Short-Video Content Feed System
User opens app -> Feed Request (user_id, session_context)
|
v
[Feed Service] (< 200ms total)
|
|--- Stage 1: CANDIDATE SOURCING (retrieve ~10,000 from 5B)
| |
| |--- [Interest-Based] (~5000 videos)
| | Two-tower model: user interest embedding vs. video embedding
| | ANN search on video embedding index
| |
| |--- [Following + Social] (~1000 videos)
| | Recent videos from followed creators
| | Videos liked by similar users (collaborative)
| |
| |--- [Trending] (~1000 videos)
| | Videos trending in user's geo/language
| | Videos trending in user's interest categories
| |
| |--- [New Creator Exploration] (~2000 videos)
| | Random sampling from new creators (< 30 days, < 1000 followers)
| | Content-based features only (no engagement history)
| |
| |--- [Diversity / Serendipity] (~1000 videos)
| | Topics user has NOT engaged with before
| | High-quality content from underrepresented categories
| |
| Union + dedup + content safety filter = ~10,000 candidates
|
|--- Stage 2: PRE-RANKING (score 10,000 -> 500)
| |
| |--- Lightweight model (2-layer MLP, < 20ms)
| | Features: user embedding, video embedding, basic stats
| | Purpose: fast pruning, not final ranking
|
|--- Stage 3: RANKING (score 500 -> 100)
| |
| |--- [Feature Store] -> rich features per (user, video) pair
| |
| |--- Deep ranking model (multi-task, GPU inference):
| | P(complete_view): probability user watches > 80%
| | P(like): probability user likes
| | P(share): probability user shares
| | P(comment): probability user comments
| | P(follow_creator): probability user follows
| | E(watch_time): expected watch time in seconds
| | P(not_interested): probability user swipes away quickly
| |
| |--- Combined score = weighted sum of predictions
| | Key insight: SUBTRACT negative signals
| | score = w1*E(watch_time) + w2*P(like) + w3*P(share)
| | - w4*P(not_interested) - w5*P(regret)
|
|--- Stage 4: POST-RANKING REBALANCING (100 -> final 50)
| |
| |--- Diversity rules:
| | Max 3 videos from same creator in top 50
| | Min 5 different topic categories
| | Min 2 videos from new creators
| | Max 2 consecutive videos of same format
| |
| |--- Explore vs. Exploit mixing:
| | 80% exploit (high predicted engagement)
| | 20% explore (uncertain predictions, new content)
| |
| |--- Business insertions:
| | Ad slots at positions 5, 12, 20, 30, 40
| | Promoted content at position 3
|
v
[Feed Response: ordered list of 50 videos]
|--- Client prefetches video content from CDN
|--- Client reports engagement events back in real-time
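The Stage 1 interest-based source in the diagram above can be sketched as two-tower retrieval: the user's interest embedding is matched against a precomputed video embedding index by inner product. This is an illustrative sketch with made-up dimensions and random data; a brute-force dot product stands in for a production ANN index such as FAISS or ScaNN.

```python
import numpy as np

EMBED_DIM = 64  # hypothetical embedding size

def retrieve_interest_candidates(user_embedding: np.ndarray,
                                 video_embeddings: np.ndarray,
                                 k: int = 5000) -> np.ndarray:
    """Return indices of the top-k videos by inner-product similarity."""
    scores = video_embeddings @ user_embedding      # (num_videos,)
    k = min(k, len(scores))
    top_k = np.argpartition(-scores, k - 1)[:k]     # unordered top-k (O(n))
    return top_k[np.argsort(-scores[top_k])]        # sorted by descending score

# Toy corpus standing in for the 5B-video index
rng = np.random.default_rng(0)
user_emb = rng.normal(size=EMBED_DIM)
video_embs = rng.normal(size=(100_000, EMBED_DIM))
candidates = retrieve_interest_candidates(user_emb, video_embs, k=5000)
```

The `argpartition`-then-sort pattern avoids fully sorting all scores; an ANN index replaces the exhaustive dot product at real scale.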
Step 3: Deep Dive — Multi-Stage Ranking
The Deep Ranking Model
# Multi-task ranking model for content feed
import torch
import torch.nn as nn
import torch.nn.functional as F
class ContentFeedRanker(nn.Module):
def __init__(self):
super().__init__()
# Feature encoders
self.user_encoder = UserFeatureEncoder(output_dim=256)
self.video_encoder = VideoFeatureEncoder(output_dim=256)
self.context_encoder = ContextEncoder(output_dim=64) # time, device, session state
# Cross-feature interaction network
self.interaction = nn.Sequential(
nn.Linear(256 + 256 + 64 + 128, 1024), # user + video + context + cross_features
nn.ReLU(), nn.BatchNorm1d(1024), nn.Dropout(0.2),
nn.Linear(1024, 512),
nn.ReLU(), nn.BatchNorm1d(512), nn.Dropout(0.1),
nn.Linear(512, 256),
nn.ReLU()
)
# Multi-task heads
self.complete_view_head = nn.Linear(256, 1) # P(watch > 80%)
self.like_head = nn.Linear(256, 1) # P(like)
self.share_head = nn.Linear(256, 1) # P(share)
self.comment_head = nn.Linear(256, 1) # P(comment)
self.follow_head = nn.Linear(256, 1) # P(follow creator)
self.watch_time_head = nn.Linear(256, 1) # E(watch seconds)
self.skip_head = nn.Linear(256, 1) # P(skip in < 2 sec)
self.regret_head = nn.Linear(256, 1) # P(user regrets watching)
def forward(self, user_features, video_features, context, cross_features):
user_emb = self.user_encoder(user_features)
video_emb = self.video_encoder(video_features)
ctx_emb = self.context_encoder(context)
combined = torch.cat([user_emb, video_emb, ctx_emb, cross_features], dim=-1)
shared = self.interaction(combined)
return {
"p_complete": torch.sigmoid(self.complete_view_head(shared)),
"p_like": torch.sigmoid(self.like_head(shared)),
"p_share": torch.sigmoid(self.share_head(shared)),
"p_comment": torch.sigmoid(self.comment_head(shared)),
"p_follow": torch.sigmoid(self.follow_head(shared)),
"watch_time": F.softplus(self.watch_time_head(shared)),
"p_skip": torch.sigmoid(self.skip_head(shared)),
"p_regret": torch.sigmoid(self.regret_head(shared))
}
# CRITICAL: The ranking formula determines platform behavior
# Pure engagement: score = watch_time * p_complete (maximizes session time)
# Responsible ranking: also subtract negative signals
RANKING_WEIGHTS = {
"watch_time": 0.30, # Expected engagement
"p_complete": 0.15, # Quality signal
"p_like": 0.15, # Explicit positive feedback
"p_share": 0.15, # Strongest quality signal
"p_comment": 0.05, # Engagement (can be negative comments)
"p_follow": 0.10, # Creator discovery
"p_skip": -0.20, # Penalize likely-to-skip content
"p_regret": -0.30, # Penalize regrettable content (strongest negative)
}
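Applying these weights is a straightforward weighted sum over the model's output heads. In this sketch, `watch_time` is normalized to [0, 1] with a hypothetical 60-second cap so all terms share a scale before weighting; the two example prediction dicts are fabricated to illustrate how the negative signals demote clickbait.

```python
RANKING_WEIGHTS = {
    "watch_time": 0.30, "p_complete": 0.15, "p_like": 0.15, "p_share": 0.15,
    "p_comment": 0.05, "p_follow": 0.10, "p_skip": -0.20, "p_regret": -0.30,
}

MAX_WATCH_SECONDS = 60.0  # assumed cap for normalizing watch time

def combined_score(predictions: dict) -> float:
    """Weighted sum of model outputs; negative weights subtract bad signals."""
    score = 0.0
    for signal, weight in RANKING_WEIGHTS.items():
        value = predictions[signal]
        if signal == "watch_time":
            value = min(value / MAX_WATCH_SECONDS, 1.0)
        score += weight * value
    return score

# Fabricated examples: high watch time alone does not win if skip/regret are high
clickbait = {"watch_time": 45, "p_complete": 0.2, "p_like": 0.05, "p_share": 0.01,
             "p_comment": 0.02, "p_follow": 0.01, "p_skip": 0.6, "p_regret": 0.5}
quality = {"watch_time": 40, "p_complete": 0.7, "p_like": 0.3, "p_share": 0.15,
           "p_comment": 0.05, "p_follow": 0.1, "p_skip": 0.1, "p_regret": 0.02}
```

Despite higher raw watch time, the clickbait example scores below the quality one because `p_skip` and `p_regret` pull it down.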
Step 3 (continued): Deep Dive — Explore vs. Exploit
The explore-exploit tradeoff is critical for content platforms. Pure exploitation (only show what the model predicts users will like) creates filter bubbles and kills new creator growth. Pure exploration (random content) destroys user experience.
# Explore vs. Exploit strategies
import numpy as np
class ExploreExploitMixer:
def mix(self, ranked_videos: list, user: User) -> list:
final_feed = []
# 1. Thompson Sampling for uncertain videos
# For videos where the model is uncertain (high variance),
# sample from the predicted distribution instead of using point estimate
for video in ranked_videos:
if video.prediction_variance > UNCERTAINTY_THRESHOLD:
# Sample from posterior: could rank higher or lower
video.sampled_score = np.random.normal(
video.predicted_score, video.prediction_variance
)
else:
video.sampled_score = video.predicted_score
# 2. New creator boost
# Reserve 10-20% of feed for creators with < 30 days / < 1000 followers
new_creator_slots = int(len(ranked_videos) * 0.15)
new_creator_videos = [v for v in ranked_videos if v.creator.is_new]
# Select from new creators based on content quality (not engagement, which is low)
new_creator_picks = self.select_by_content_quality(new_creator_videos, new_creator_slots)
        # 3. Topic diversification
        # Ensure user sees at least 5 different topic categories:
        # pick top videos from each bucket, round-robin style
        topic_buckets = self.group_by_topic(ranked_videos)
        diverse_picks = self.round_robin_pick(topic_buckets, k=10)
        # 4. Anti-addiction measures
        if user.session_duration > 60 * 60:  # > 1 hour
            # Increase exploration, decrease engagement optimization
            # Insert "take a break" reminders
            pass
        # Exploit bucket: everything else, ordered by (possibly sampled) score
        exploit_videos = sorted(
            (v for v in ranked_videos
             if v not in new_creator_picks and v not in diverse_picks),
            key=lambda v: v.sampled_score, reverse=True
        )
        return self.interleave(exploit_videos, new_creator_picks, diverse_picks)
# Measuring exploration effectiveness:
# - Track new creator impression -> follower conversion rate
# - Track topic diversity score per user per session
# - A/B test explore percentages: measure long-term retention (not just session length)
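The per-session topic diversity score mentioned above can be measured as normalized Shannon entropy over the topics a user was shown: 0 means a single-topic filter bubble, 1 means a uniform spread. This is one common choice, sketched here with fabricated session data.

```python
import math
from collections import Counter

def topic_diversity(session_topics: list) -> float:
    """Normalized Shannon entropy over topics shown in one session.

    Returns 0.0 for a single topic, 1.0 for a perfectly uniform mix.
    """
    counts = Counter(session_topics)
    if len(counts) <= 1:
        return 0.0
    total = len(session_topics)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))  # normalize by max possible entropy

# Fabricated sessions for illustration
filter_bubble = ["dance"] * 20
mixed = ["dance", "cooking", "sports", "comedy", "pets"] * 4
skewed = ["dance"] * 15 + ["pets"] * 5
```

Tracking this metric per user per session makes "did exploration actually widen feeds?" an answerable A/B-test question rather than a hunch.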
Real-Time Signal Adaptation
# Within-session real-time adaptation
class RealTimeAdapter:
"""Adjusts rankings based on user behavior within current session."""
def adapt(self, session_events: list, remaining_candidates: list) -> list:
# Track within-session signals
session_profile = {
"topics_engaged": set(), # Topics where user watched > 50%
"topics_skipped": set(), # Topics user skipped quickly
"creators_followed": set(), # Creators followed this session
"avg_watch_percentage": 0.0, # Are they actively watching or half-engaged?
"session_fatigue_score": 0.0, # Decreasing engagement over time?
}
        watch_pcts = []
        for event in session_events:
            watch_pcts.append(event.watch_pct)
            if event.watch_pct > 0.5:
                session_profile["topics_engaged"].add(event.video.topic)
            elif event.watch_pct < 0.1:
                session_profile["topics_skipped"].add(event.video.topic)
        if watch_pcts:
            session_profile["avg_watch_percentage"] = sum(watch_pcts) / len(watch_pcts)
            # Fatigue: recent watch percentages falling below the session average
            recent = sum(watch_pcts[-10:]) / len(watch_pcts[-10:])
            session_profile["session_fatigue_score"] = max(
                0.0, 1.0 - recent / max(session_profile["avg_watch_percentage"], 1e-6)
            )
# Boost videos matching session interests
for video in remaining_candidates:
if video.topic in session_profile["topics_engaged"]:
video.score *= 1.3 # 30% boost for engaged topics
if video.topic in session_profile["topics_skipped"]:
video.score *= 0.5 # 50% penalty for skipped topics
# Detect session fatigue
if session_profile["session_fatigue_score"] > 0.7:
# User engagement is dropping -> show more diverse/novel content
# or suggest "take a break" message
pass
return sorted(remaining_candidates, key=lambda v: v.score, reverse=True)
Step 4: Trade-Offs Discussion
| Decision | Chosen | Alternative | Rationale |
|---|---|---|---|
| Candidate sourcing | Multiple sources merged | Single ANN model | Multiple sources ensure diversity and new creator coverage; single model would bias toward popular content |
| Ranking objective | Multi-objective with negative signals | Pure engagement (watch time) | Subtracting regret and skip prevents optimizing for clickbait; improves long-term retention |
| Exploration | Thompson Sampling + reserved slots | Epsilon-greedy | Thompson Sampling adapts exploration based on uncertainty; epsilon-greedy wastes exploration budget randomly |
| New creator boost | 15% reserved feed slots | No special treatment | Without boost, new creators get zero impressions (cold start death spiral), killing platform growth |
| Real-time adaptation | Within-session feature updates | Batch-only (daily profiles) | User mood shifts within sessions; real-time adaptation increases engagement by 10-15% |
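The Stage 4 diversity rules can be implemented as a greedy pass over the ranked list: walk candidates in score order and skip any that would violate a constraint. This illustrative sketch (the `Video` class is hypothetical) enforces only the max-3-per-creator rule; topic, format, and new-creator constraints follow the same pattern.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Video:
    id: int
    creator: str
    score: float

MAX_PER_CREATOR = 3  # Stage 4 rule: at most 3 videos per creator in the feed

def rebalance(ranked: list, feed_size: int = 50) -> list:
    """Greedy pass in score order, skipping videos whose creator hit the cap."""
    per_creator = Counter()
    feed = []
    for video in sorted(ranked, key=lambda v: v.score, reverse=True):
        if per_creator[video.creator] >= MAX_PER_CREATOR:
            continue  # constraint violated: skip, keep scanning lower-ranked videos
        per_creator[video.creator] += 1
        feed.append(video)
        if len(feed) == feed_size:
            break
    return feed

# Fabricated ranked list: 100 videos across 20 creators
ranked = [Video(i, f"creator_{i % 20}", 100.0 - i) for i in range(100)]
feed = rebalance(ranked, feed_size=50)
```

Greedy skipping preserves score order among admitted videos, so the highest-scoring compliant feed falls out of a single O(n log n) pass.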
Ethical Considerations
Responsible feed design: Optimizing purely for engagement maximizes addictive content, misinformation, and outrage. Production systems must include: (1) regret prediction to penalize content users wish they hadn't seen, (2) session time limits and "take a break" prompts, (3) content diversity requirements to prevent echo chambers, (4) transparency about why content was shown, and (5) user controls to adjust feed behavior. Mentioning these in an interview demonstrates maturity and awareness that interviewers value highly.
Key Takeaways
- Content feeds use 4-stage pipelines: sourcing (5B → 10K), pre-ranking (10K → 500), ranking (500 → 100), rebalancing (100 → 50)
- Multi-task ranking with NEGATIVE signals (skip, regret) prevents optimizing for addictive low-quality content
- Explore vs. exploit is existential: without exploration, new creators die and the platform stagnates
- Real-time session adaptation (boosting topics user engaged with, penalizing skipped topics) increases relevance by 10–15%
- Reserved slots for new creators (15% of feed) and diversity rules prevent content concentration
- Always discuss ethical considerations: regret prediction, session limits, echo chamber prevention
Lilly Tech Systems