Advanced
Design TikTok/Instagram Content Feed
Design the content ranking system for a short-video platform with 500M daily active users. This is among the hardest recommendation problems because content is highly ephemeral (new videos appear every second), user preferences shift rapidly, and the system must balance user engagement against creator fairness and societal impact.
Step 1: Clarify Requirements
Requirements (confirmed with interviewer):
Scale:
- 500M daily active users (DAU)
- Average session: 30 minutes, ~100 videos viewed per session
- 2M new videos uploaded per day
- Content library: 5B total videos
- Peak QPS: 2M feed requests per second
Latency:
- Feed generation: < 200ms for initial 10 videos
- Prefetch: next batch ready before user finishes current video
- Real-time adaptation: incorporate engagement signals within same session
Functional:
- "For You" feed (personalized, not following-based)
- Must surface new creators (not just popular ones)
- Content diversity (topics, formats, creators)
- Support cold start for new users (no watch history)
Success Metrics:
- Online: Session duration, videos watched, next-day retention
- NOT just engagement: also track user satisfaction surveys, regret rate
- Creator metrics: views distribution, new creator reach
- Safety: harmful content exposure rate
Constraints:
- Must handle 100+ languages and cultural contexts
- Content moderation (filter unsafe content before ranking)
- Regulatory: transparency on why content was shown
- Ethical: avoid addictive dark patterns, respect user well-being
Step 2: High-Level Architecture
Architecture: Short-Video Content Feed System
User opens app -> Feed Request (user_id, session_context)
|
v
[Feed Service] (< 200ms total)
|
|--- Stage 1: CANDIDATE SOURCING (retrieve ~10,000 from 5B)
| |
| |--- [Interest-Based] (~5000 videos)
| | Two-tower model: user interest embedding vs. video embedding
| | ANN search on video embedding index
| |
| |--- [Following + Social] (~1000 videos)
| | Recent videos from followed creators
| | Videos liked by similar users (collaborative)
| |
| |--- [Trending] (~1000 videos)
| | Videos trending in user's geo/language
| | Videos trending in user's interest categories
| |
| |--- [New Creator Exploration] (~2000 videos)
| | Random sampling from new creators (< 30 days, < 1000 followers)
| | Content-based features only (no engagement history)
| |
| |--- [Diversity / Serendipity] (~1000 videos)
| | Topics user has NOT engaged with before
| | High-quality content from underrepresented categories
| |
| Union + dedup + content safety filter = ~10,000 candidates
|
|--- Stage 2: PRE-RANKING (score 10,000 -> 500)
| |
| |--- Lightweight model (2-layer MLP, < 20ms)
| | Features: user embedding, video embedding, basic stats
| | Purpose: fast pruning, not final ranking
|
|--- Stage 3: RANKING (score 500 -> 100)
| |
| |--- [Feature Store] -> rich features per (user, video) pair
| |
| |--- Deep ranking model (multi-task, GPU inference):
| | P(complete_view): probability user watches > 80%
| | P(like): probability user likes
| | P(share): probability user shares
| | P(comment): probability user comments
| | P(follow_creator): probability user follows
| | E(watch_time): expected watch time in seconds
| | P(not_interested): probability user swipes away quickly
| |
| |--- Combined score = weighted sum of predictions
| | Key insight: SUBTRACT negative signals
| | score = w1*E(watch_time) + w2*P(like) + w3*P(share)
| | - w4*P(not_interested) - w5*P(regret)
|
|--- Stage 4: POST-RANKING REBALANCING (100 -> final 50)
| |
| |--- Diversity rules:
| | Max 3 videos from same creator in top 50
| | Min 5 different topic categories
| | Min 2 videos from new creators
| | Max 2 consecutive videos of same format
| |
| |--- Explore vs. Exploit mixing:
| | 80% exploit (high predicted engagement)
| | 20% explore (uncertain predictions, new content)
| |
| |--- Business insertions:
| | Ad slots at positions 5, 12, 20, 30, 40
| | Promoted content at position 3
|
v
[Feed Response: ordered list of 50 videos]
|--- Client prefetches video content from CDN
|--- Client reports engagement events back in real-time
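The Stage 1 interest-based source in the diagram above can be sketched as two-tower retrieval: the user's interest embedding is matched against a precomputed video embedding index by inner product. This is an illustrative sketch with made-up dimensions and random data; a brute-force dot product stands in for a production ANN index such as FAISS or ScaNN.

```python
import numpy as np

EMBED_DIM = 64  # hypothetical embedding size

def retrieve_interest_candidates(user_embedding: np.ndarray,
                                 video_embeddings: np.ndarray,
                                 k: int = 5000) -> np.ndarray:
    """Return indices of the top-k videos by inner-product similarity."""
    scores = video_embeddings @ user_embedding      # (num_videos,)
    k = min(k, len(scores))
    top_k = np.argpartition(-scores, k - 1)[:k]     # unordered top-k (O(n))
    return top_k[np.argsort(-scores[top_k])]        # sorted by descending score

# Toy corpus standing in for the 5B-video index
rng = np.random.default_rng(0)
user_emb = rng.normal(size=EMBED_DIM)
video_embs = rng.normal(size=(100_000, EMBED_DIM))
candidates = retrieve_interest_candidates(user_emb, video_embs, k=5000)
```

The `argpartition`-then-sort pattern avoids fully sorting all scores; an ANN index replaces the exhaustive dot product at real scale.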
Step 3: Deep Dive — Multi-Stage Ranking
The Deep Ranking Model
# Multi-task ranking model for content feed
import torch
import torch.nn as nn
import torch.nn.functional as F
class ContentFeedRanker(nn.Module):
def __init__(self):
super().__init__()
# Feature encoders
self.user_encoder = UserFeatureEncoder(output_dim=256)
self.video_encoder = VideoFeatureEncoder(output_dim=256)
self.context_encoder = ContextEncoder(output_dim=64) # time, device, session state
# Cross-feature interaction network
self.interaction = nn.Sequential(
nn.Linear(256 + 256 + 64 + 128, 1024), # user + video + context + cross_features
nn.ReLU(), nn.BatchNorm1d(1024), nn.Dropout(0.2),
nn.Linear(1024, 512),
nn.ReLU(), nn.BatchNorm1d(512), nn.Dropout(0.1),
nn.Linear(512, 256),
nn.ReLU()
)
# Multi-task heads
self.complete_view_head = nn.Linear(256, 1) # P(watch > 80%)
self.like_head = nn.Linear(256, 1) # P(like)
self.share_head = nn.Linear(256, 1) # P(share)
self.comment_head = nn.Linear(256, 1) # P(comment)
self.follow_head = nn.Linear(256, 1) # P(follow creator)
self.watch_time_head = nn.Linear(256, 1) # E(watch seconds)
self.skip_head = nn.Linear(256, 1) # P(skip in < 2 sec)
self.regret_head = nn.Linear(256, 1) # P(user regrets watching)
def forward(self, user_features, video_features, context, cross_features):
user_emb = self.user_encoder(user_features)
video_emb = self.video_encoder(video_features)
ctx_emb = self.context_encoder(context)
combined = torch.cat([user_emb, video_emb, ctx_emb, cross_features], dim=-1)
shared = self.interaction(combined)
return {
"p_complete": torch.sigmoid(self.complete_view_head(shared)),
"p_like": torch.sigmoid(self.like_head(shared)),
"p_share": torch.sigmoid(self.share_head(shared)),
"p_comment": torch.sigmoid(self.comment_head(shared)),
"p_follow": torch.sigmoid(self.follow_head(shared)),
"watch_time": F.softplus(self.watch_time_head(shared)),
"p_skip": torch.sigmoid(self.skip_head(shared)),
"p_regret": torch.sigmoid(self.regret_head(shared))
}
# CRITICAL: The ranking formula determines platform behavior
# Pure engagement: score = watch_time * p_complete (maximizes session time)
# Responsible ranking: also subtract negative signals
RANKING_WEIGHTS = {
"watch_time": 0.30, # Expected engagement
"p_complete": 0.15, # Quality signal
"p_like": 0.15, # Explicit positive feedback
"p_share": 0.15, # Strongest quality signal
"p_comment": 0.05, # Engagement (can be negative comments)
"p_follow": 0.10, # Creator discovery
"p_skip": -0.20, # Penalize likely-to-skip content
"p_regret": -0.30, # Penalize regrettable content (strongest negative)
}
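Applying these weights is a straightforward weighted sum over the model's output heads. In this sketch, `watch_time` is normalized to [0, 1] with a hypothetical 60-second cap so all terms share a scale before weighting; the two example prediction dicts are fabricated to illustrate how the negative signals demote clickbait.

```python
RANKING_WEIGHTS = {
    "watch_time": 0.30, "p_complete": 0.15, "p_like": 0.15, "p_share": 0.15,
    "p_comment": 0.05, "p_follow": 0.10, "p_skip": -0.20, "p_regret": -0.30,
}

MAX_WATCH_SECONDS = 60.0  # assumed cap for normalizing watch time

def combined_score(predictions: dict) -> float:
    """Weighted sum of model outputs; negative weights subtract bad signals."""
    score = 0.0
    for signal, weight in RANKING_WEIGHTS.items():
        value = predictions[signal]
        if signal == "watch_time":
            value = min(value / MAX_WATCH_SECONDS, 1.0)
        score += weight * value
    return score

# Fabricated examples: high watch time alone does not win if skip/regret are high
clickbait = {"watch_time": 45, "p_complete": 0.2, "p_like": 0.05, "p_share": 0.01,
             "p_comment": 0.02, "p_follow": 0.01, "p_skip": 0.6, "p_regret": 0.5}
quality = {"watch_time": 40, "p_complete": 0.7, "p_like": 0.3, "p_share": 0.15,
           "p_comment": 0.05, "p_follow": 0.1, "p_skip": 0.1, "p_regret": 0.02}
```

Despite higher raw watch time, the clickbait example scores below the quality one because `p_skip` and `p_regret` pull it down.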
Step 3 (continued): Deep Dive — Explore vs. Exploit
The explore-exploit tradeoff is critical for content platforms. Pure exploitation (only show what the model predicts users will like) creates filter bubbles and kills new creator growth. Pure exploration (random content) destroys user experience.
# Explore vs. Exploit strategies
import numpy as np
class ExploreExploitMixer:
def mix(self, ranked_videos: list, user: User) -> list:
final_feed = []
# 1. Thompson Sampling for uncertain videos
# For videos where the model is uncertain (high variance),
# sample from the predicted distribution instead of using point estimate
for video in ranked_videos:
if video.prediction_variance > UNCERTAINTY_THRESHOLD:
# Sample from posterior: could rank higher or lower
video.sampled_score = np.random.normal(
video.predicted_score, video.prediction_variance
)
else:
video.sampled_score = video.predicted_score
# 2. New creator boost
# Reserve 10-20% of feed for creators with < 30 days / < 1000 followers
new_creator_slots = int(len(ranked_videos) * 0.15)
new_creator_videos = [v for v in ranked_videos if v.creator.is_new]
# Select from new creators based on content quality (not engagement, which is low)
new_creator_picks = self.select_by_content_quality(new_creator_videos, new_creator_slots)
        # 3. Topic diversification
        # Ensure user sees at least 5 different topic categories:
        # pick top videos from each bucket, round-robin style
        topic_buckets = self.group_by_topic(ranked_videos)
        diverse_picks = self.round_robin_pick(topic_buckets, k=10)
        # 4. Anti-addiction measures
        if user.session_duration > 60 * 60:  # > 1 hour
            # Increase exploration, decrease engagement optimization
            # Insert "take a break" reminders
            pass
        # Exploit bucket: everything else, ordered by (possibly sampled) score
        exploit_videos = sorted(
            (v for v in ranked_videos
             if v not in new_creator_picks and v not in diverse_picks),
            key=lambda v: v.sampled_score, reverse=True
        )
        return self.interleave(exploit_videos, new_creator_picks, diverse_picks)
# Measuring exploration effectiveness:
# - Track new creator impression -> follower conversion rate
# - Track topic diversity score per user per session
# - A/B test explore percentages: measure long-term retention (not just session length)
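The per-session topic diversity score mentioned above can be measured as normalized Shannon entropy over the topics a user was shown: 0 means a single-topic filter bubble, 1 means a uniform spread. This is one common choice, sketched here with fabricated session data.

```python
import math
from collections import Counter

def topic_diversity(session_topics: list) -> float:
    """Normalized Shannon entropy over topics shown in one session.

    Returns 0.0 for a single topic, 1.0 for a perfectly uniform mix.
    """
    counts = Counter(session_topics)
    if len(counts) <= 1:
        return 0.0
    total = len(session_topics)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))  # normalize by max possible entropy

# Fabricated sessions for illustration
filter_bubble = ["dance"] * 20
mixed = ["dance", "cooking", "sports", "comedy", "pets"] * 4
skewed = ["dance"] * 15 + ["pets"] * 5
```

Tracking this metric per user per session makes "did exploration actually widen feeds?" an answerable A/B-test question rather than a hunch.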
Real-Time Signal Adaptation
# Within-session real-time adaptation
class RealTimeAdapter:
"""Adjusts rankings based on user behavior within current session."""
def adapt(self, session_events: list, remaining_candidates: list) -> list:
# Track within-session signals
session_profile = {
"topics_engaged": set(), # Topics where user watched > 50%
"topics_skipped": set(), # Topics user skipped quickly
"creators_followed": set(), # Creators followed this session
"avg_watch_percentage": 0.0, # Are they actively watching or half-engaged?
"session_fatigue_score": 0.0, # Decreasing engagement over time?
}
        watch_pcts = []
        for event in session_events:
            watch_pcts.append(event.watch_pct)
            if event.watch_pct > 0.5:
                session_profile["topics_engaged"].add(event.video.topic)
            elif event.watch_pct < 0.1:
                session_profile["topics_skipped"].add(event.video.topic)
        if watch_pcts:
            session_profile["avg_watch_percentage"] = sum(watch_pcts) / len(watch_pcts)
            # Fatigue: recent watch percentages falling below the session average
            recent = sum(watch_pcts[-10:]) / len(watch_pcts[-10:])
            session_profile["session_fatigue_score"] = max(
                0.0, 1.0 - recent / max(session_profile["avg_watch_percentage"], 1e-6)
            )
# Boost videos matching session interests
for video in remaining_candidates:
if video.topic in session_profile["topics_engaged"]:
video.score *= 1.3 # 30% boost for engaged topics
if video.topic in session_profile["topics_skipped"]:
video.score *= 0.5 # 50% penalty for skipped topics
# Detect session fatigue
if session_profile["session_fatigue_score"] > 0.7:
# User engagement is dropping -> show more diverse/novel content
# or suggest "take a break" message
pass
return sorted(remaining_candidates, key=lambda v: v.score, reverse=True)
Step 4: Trade-Offs Discussion
| Decision | Chosen | Alternative | Rationale |
|---|---|---|---|
| Candidate sourcing | Multiple sources merged | Single ANN model | Multiple sources ensure diversity and new creator coverage; single model would bias toward popular content |
| Ranking objective | Multi-objective with negative signals | Pure engagement (watch time) | Subtracting regret and skip prevents optimizing for clickbait; improves long-term retention |
| Exploration | Thompson Sampling + reserved slots | Epsilon-greedy | Thompson Sampling adapts exploration based on uncertainty; epsilon-greedy wastes exploration budget randomly |
| New creator boost | 15% reserved feed slots | No special treatment | Without boost, new creators get zero impressions (cold start death spiral), killing platform growth |
| Real-time adaptation | Within-session feature updates | Batch-only (daily profiles) | User mood shifts within sessions; real-time adaptation increases engagement by 10-15% |
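The Stage 4 diversity rules can be implemented as a greedy pass over the ranked list: walk candidates in score order and skip any that would violate a constraint. This illustrative sketch (the `Video` class is hypothetical) enforces only the max-3-per-creator rule; topic, format, and new-creator constraints follow the same pattern.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Video:
    id: int
    creator: str
    score: float

MAX_PER_CREATOR = 3  # Stage 4 rule: at most 3 videos per creator in the feed

def rebalance(ranked: list, feed_size: int = 50) -> list:
    """Greedy pass in score order, skipping videos whose creator hit the cap."""
    per_creator = Counter()
    feed = []
    for video in sorted(ranked, key=lambda v: v.score, reverse=True):
        if per_creator[video.creator] >= MAX_PER_CREATOR:
            continue  # constraint violated: skip, keep scanning lower-ranked videos
        per_creator[video.creator] += 1
        feed.append(video)
        if len(feed) == feed_size:
            break
    return feed

# Fabricated ranked list: 100 videos across 20 creators
ranked = [Video(i, f"creator_{i % 20}", 100.0 - i) for i in range(100)]
feed = rebalance(ranked, feed_size=50)
```

Greedy skipping preserves score order among admitted videos, so the highest-scoring compliant feed falls out of a single O(n log n) pass.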
Ethical Considerations
Responsible feed design: Optimizing purely for engagement maximizes addictive content, misinformation, and outrage. Production systems must include: (1) regret prediction to penalize content users wish they hadn't seen, (2) session time limits and "take a break" prompts, (3) content diversity requirements to prevent echo chambers, (4) transparency about why content was shown, and (5) user controls to adjust feed behavior. Mentioning these in an interview demonstrates maturity and awareness that interviewers value highly.
Key Takeaways
- Content feeds use 4-stage pipelines: sourcing (5B → 10K), pre-ranking (10K → 500), ranking (500 → 100), rebalancing (100 → 50)
- Multi-task ranking with NEGATIVE signals (skip, regret) prevents optimizing for addictive low-quality content
- Explore vs. exploit is existential: without exploration, new creators die and the platform stagnates
- Real-time session adaptation (boosting topics user engaged with, penalizing skipped topics) increases relevance by 10–15%
- Reserved slots for new creators (15% of feed) and diversity rules prevent content concentration
- Always discuss ethical considerations: regret prediction, session limits, echo chamber prevention
Lilly Tech Systems