Intermediate

Design a News Feed Ranking System

A complete ML system design walkthrough for one of the most commonly asked interview questions. Learn how to design a personalized news feed ranking system like those used at Facebook, LinkedIn, and Twitter.

Step 1: Clarify Requirements

Before designing anything, ask the interviewer these questions:

📝
Sample dialogue:
  • “What types of content appear in the feed?” — Posts, photos, videos, ads, stories
  • “What is the scale?” — 500M DAU, each user follows ~200 accounts
  • “What are we optimizing for?” — User engagement (clicks, likes, comments, shares, time spent)
  • “Latency requirements?” — Feed must render in under 200ms
  • “Any content policy constraints?” — Misinformation demotion, diversity requirements

ML Problem Formulation

# Problem formulation
# Business goal:    Maximize user engagement (composite metric)
# ML task:          Multi-objective ranking
# Predict:          P(click), P(like), P(comment), P(share), P(hide)
# Final score:      weighted_sum(predictions) = w1*P(click) + w2*P(like) + ...
# Training data:    Historical user-post interaction logs
# Label:            Binary labels for each action type
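The final-score line above can be made concrete with a small Python sketch. The weight values here are hypothetical placeholders; in practice they are tuned via A/B tests:

```python
# Combine per-task probabilities into one ranking score.
# Weights are illustrative only; hide carries a negative weight.
WEIGHTS = {"click": 1.0, "like": 2.0, "comment": 4.0, "share": 6.0, "hide": -10.0}

def feed_score(predictions: dict) -> float:
    """predictions maps action -> P(action), e.g. {"click": 0.3, ...}."""
    return sum(WEIGHTS[action] * p for action, p in predictions.items())

score = feed_score(
    {"click": 0.30, "like": 0.10, "comment": 0.02, "share": 0.01, "hide": 0.005}
)
```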

Step 2: High-Level Architecture

The system uses a two-stage architecture to handle the scale of ranking thousands of candidate posts per user request.

# Architecture overview (draw this on whiteboard)
#
# [User Request] --> [Candidate Generation] --> [Ranking] --> [Re-ranking] --> [Feed]
#                          |                       |              |
#                    ~10,000 posts            ~500 posts     ~50 posts
#                    (fast retrieval)        (ML scoring)   (policy + diversity)
#
# Supporting systems:
# [Feature Store] - serves user/post features to ranking
# [Training Pipeline] - daily model retraining on interaction logs
# [Monitoring] - tracks CTR, engagement, model drift

Why Two Stages?

| Stage | Purpose | Model Complexity | Latency Budget |
|---|---|---|---|
| Candidate Generation | Narrow from millions to ~10K candidates | Simple (ANN, collaborative filtering) | ~20ms |
| Ranking | Score ~10K candidates with rich features, keep top ~500 | Complex (deep neural network) | ~50ms |
| Re-ranking | Apply business rules, diversity, freshness | Rule-based + lightweight ML | ~10ms |

Step 3: Deep Dive — Feature Engineering

Features are often the highest-leverage part of a ranking system. Group them into four categories:

User Features

| Feature | Type | Description |
|---|---|---|
| user_age_bucket | Categorical | Age group: 18–24, 25–34, 35–44, etc. |
| user_country | Categorical | User’s country (hashed embedding) |
| avg_session_duration | Numerical | Average session length in last 7 days |
| posts_liked_7d | Numerical | Number of posts liked in the last 7 days |
| content_type_affinity | Embedding | Vector representing preference for photos/videos/text |
| topic_interests | Embedding | Learned topic interest vector from interaction history |

Post Features

| Feature | Type | Description |
|---|---|---|
| post_age_hours | Numerical | Hours since post was created |
| content_type | Categorical | Photo, video, text, link, poll |
| author_follower_count | Numerical | Log-transformed follower count of author |
| historical_ctr | Numerical | Click-through rate of this post so far |
| text_embedding | Embedding | BERT embedding of post text |
| has_media | Binary | Whether post contains image/video |

Context Features

| Feature | Type | Description |
|---|---|---|
| time_of_day | Cyclical | sin/cos encoded hour of day |
| day_of_week | Categorical | Monday through Sunday |
| device_type | Categorical | Mobile, tablet, desktop |
| connection_type | Categorical | WiFi, 4G, 5G (affects video auto-play) |
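The cyclical encoding in the table above is a frequent follow-up question: a plain 0–23 integer puts hour 23 and hour 0 far apart, while sin/cos places them next to each other. A minimal sketch (the 24-hour period is the only assumption):

```python
import math

def encode_hour(hour: int) -> tuple:
    """sin/cos encoding so that hour 23 and hour 0 land close together."""
    angle = 2 * math.pi * hour / 24
    return (math.sin(angle), math.cos(angle))
```

The same trick applies to day-of-week or day-of-year with period 7 or 365.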

Cross Features (User × Post Interactions)

| Feature | Type | Description |
|---|---|---|
| user_author_interaction_count | Numerical | How many times user interacted with this author |
| user_topic_affinity_score | Numerical | Dot product of user topic vector and post topic vector |
| friend_interaction_count | Numerical | How many of user’s friends interacted with this post |
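The user_topic_affinity_score cross feature is just a dot product of the two embedding vectors; a minimal sketch:

```python
def topic_affinity(user_vec, post_vec):
    """Cross feature: dot product of the user's topic-interest vector
    and the post's topic vector. Higher = closer match."""
    return sum(u * p for u, p in zip(user_vec, post_vec))
```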

Deep Dive — Ranking Model Architecture

The ranking model is a multi-task deep neural network that predicts multiple engagement types simultaneously.

# Model architecture (simplified)
#
# Input features
#   |
# [Embedding Layer] -- converts sparse categoricals to dense vectors
#   |
# [Feature Interaction Layer] -- DCN (Deep & Cross Network) or DeepFM
#   |
# [Shared Hidden Layers] -- 3 layers of 512, 256, 128 units (ReLU)
#   |
# [Task-Specific Towers]
#   |-- P(click)    -- sigmoid output
#   |-- P(like)     -- sigmoid output
#   |-- P(comment)  -- sigmoid output
#   |-- P(share)    -- sigmoid output
#   |-- P(hide)     -- sigmoid output (negative signal)
#
# Final score = w1*P(click) + w2*P(like) + w3*P(comment)
#             + w4*P(share) - w5*P(hide)
💡
Why multi-task learning? Training a single model with multiple heads is more parameter-efficient than separate models. The shared layers learn general engagement patterns, while task-specific towers capture what makes a user click vs. comment. This approach also reduces serving latency since one forward pass produces all predictions.
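The shared-bottom-plus-towers idea can be sketched in a few lines of NumPy. Layer sizes and random weights below are illustrative only, not the production configuration, and training is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
TASKS = ["click", "like", "comment", "share", "hide"]

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Shared hidden layers learn general engagement patterns.
W_shared = [rng.normal(size=(32, 64)), rng.normal(size=(64, 16))]
# One small tower per task produces that task's probability.
W_tower = {t: rng.normal(size=(16, 1)) for t in TASKS}

def forward(features: np.ndarray) -> dict:
    """One forward pass yields all engagement probabilities at once."""
    h = features
    for W in W_shared:
        h = relu(h @ W)
    return {t: float(sigmoid(h @ W_tower[t])[0]) for t in TASKS}

preds = forward(rng.normal(size=32))
```

Note how a single forward pass produces all five probabilities, which is the serving-latency argument made above.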

Model Alternatives and Trade-Offs

| Model | Pros | Cons | When to Use |
|---|---|---|---|
| Logistic Regression | Fast, interpretable, easy to debug | Limited feature interactions | V1 baseline, high-QPS systems |
| GBDT (XGBoost) | Handles non-linear interactions, robust | Hard to serve at low latency | Offline scoring, re-ranking |
| Deep & Cross Network | Automatic feature crosses, good accuracy | More complex to train | Production ranking at scale |
| Transformer Ranker | Captures sequence patterns, state-of-the-art | High latency, expensive | When accuracy matters most |

Deep Dive — Serving Infrastructure

Candidate Generation

Use multiple candidate generators in parallel for coverage:

  • Friends’ posts: Fetch recent posts from followed accounts (simple database query)
  • Collaborative filtering: “Users similar to you also engaged with these posts” (ANN index like FAISS)
  • Content-based: Posts similar to what user engaged with recently (embedding similarity)
  • Trending/viral: Posts with high engagement velocity (popularity-based)
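The outputs of these parallel generators must be unioned and deduplicated before ranking. A minimal sketch, assuming each generator returns a list of post IDs:

```python
def merge_candidates(*sources, limit=10_000):
    """Union candidates from parallel generators, dropping duplicates
    while preserving first-seen order, capped at the ranking budget."""
    seen, merged = set(), []
    for source in sources:
        for post_id in source:
            if post_id not in seen:
                seen.add(post_id)
                merged.append(post_id)
    return merged[:limit]
```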

Serving Flow

# Serving flow (latency breakdown)
#
# 1. User opens app                        [0ms]
# 2. Fetch user features from Feature Store [5ms]
# 3. Candidate generation (parallel)        [20ms]
#    - Friends' posts: 15ms
#    - CF candidates: 18ms
#    - Content-based: 12ms
# 4. Fetch post features for candidates     [10ms]
# 5. Compute cross features                 [5ms]
# 6. Run ranking model (batch inference)    [30ms]
# 7. Re-ranking (diversity, freshness)      [10ms]
# 8. Return ranked feed                     [Total: ~80ms]
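The fan-out in step 3 is why candidate generation costs ~20ms rather than the ~45ms sum of its parts: wall-clock cost is roughly the slowest generator. A sketch with a thread pool, where the generator bodies are stubs standing in for real fetches:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_friends():  return ["f1", "f2"]   # ~15ms in production
def fetch_cf():       return ["c1", "f2"]   # ~18ms
def fetch_content():  return ["t1"]         # ~12ms

def generate_candidates():
    """Run generators concurrently; deduplication happens downstream."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(f) for f in (fetch_friends, fetch_cf, fetch_content)]
        results = []
        for fut in futures:
            results.extend(fut.result())
        return results
```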

Deep Dive — Metrics & Evaluation

Offline Metrics

| Metric | What It Measures | Target |
|---|---|---|
| AUC-ROC | Ranking quality of click prediction | > 0.80 |
| NDCG@10 | Quality of top-10 ranked items | > 0.65 |
| Log Loss | Calibration of probability predictions | < 0.45 |
| Precision@5 | Fraction of top-5 that user engages with | > 0.30 |
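NDCG@10 is the offline metric candidates most often stumble on. A minimal sketch that computes it from relevance gains listed in ranked order:

```python
import math

def ndcg_at_k(relevances, k=10):
    """NDCG@k for one ranked list. `relevances` are gains in the order
    the model ranked them; 1.0 means the ordering is ideal."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```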

Online Metrics (A/B Test)

| Metric | What It Measures | Guardrail |
|---|---|---|
| CTR | Fraction of impressions that get clicks | Must not decrease |
| Session Duration | Average time spent per session | Primary success metric |
| Daily Active Users | Long-term retention signal | Must not decrease |
| Content Diversity | Variety of topics/authors in feed | Must not decrease by >5% |
| Negative Feedback Rate | Hide/report/unfollow actions | Must not increase |

Step 4: Trade-Offs & Extensions

Engagement vs. Well-Being

Optimizing purely for clicks can promote clickbait and outrage content. Use a composite score that penalizes regretful clicks (user hides post after clicking).


Freshness vs. Relevance

Highly relevant older posts compete with newer but less relevant posts. Apply a time decay multiplier to balance freshness with relevance score.
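One common form of the multiplier is exponential decay with a configurable half-life; the 24-hour default here is an assumption, not a standard value:

```python
def decayed_score(relevance: float, age_hours: float,
                  half_life_hours: float = 24.0) -> float:
    """Exponential time decay: the score halves every half_life_hours."""
    return relevance * 0.5 ** (age_hours / half_life_hours)
```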


Personalization vs. Filter Bubble

Strong personalization can create echo chambers. Inject a diversity bonus to surface content outside the user’s typical interests.
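A simple way to implement the bonus is a greedy re-rank that boosts posts from topics not yet present in the feed; the bonus value below is a hypothetical placeholder:

```python
def rerank_with_diversity(posts, diversity_bonus=0.2):
    """Greedy re-rank. `posts` is a list of (score, topic) tuples; a post
    whose topic is not yet in the feed gets a temporary score bonus."""
    remaining, feed, seen_topics = list(posts), [], set()
    while remaining:
        best = max(
            remaining,
            key=lambda p: p[0] + (diversity_bonus if p[1] not in seen_topics else 0.0),
        )
        remaining.remove(best)
        seen_topics.add(best[1])
        feed.append(best)
    return feed
```

With a bonus of 0.2, a 0.8-scored music post can leapfrog a 0.85-scored second sports post.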

Cold Start Problem

New users and new posts lack interaction data. Use content-based features and popularity signals until enough interaction data is collected.

💡
Interview tip: Mentioning the tension between engagement optimization and user well-being demonstrates product maturity and awareness of responsible AI — a strong signal at senior levels.