Small Language Models

Not every task needs a 400-billion-parameter model. Small Language Models (SLMs) deliver impressive performance at a fraction of the cost, latency, and compute. Learn about the latest SLM families — Phi, Gemma, and more — along with quantization techniques and on-device deployment strategies.

6 Lessons · 15+ Models · ~3hr Total Time

What You'll Learn

By the end of this course, you'll understand when and how to use small language models effectively, from model selection to on-device deployment.

💫 Model Families

Deep dive into Phi, Gemma, and other SLM families. Understand their architectures, training data strategies, and where each excels.

📦 Quantization

Learn techniques to compress models from 16-bit to 4-bit and beyond, dramatically reducing memory and compute requirements.
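To make the idea concrete, here is a minimal sketch of symmetric 4-bit quantization in plain Python: each weight is scaled into the integer range [-7, 7] and stored as a small integer, then rescaled at inference time. Real schemes (e.g. GPTQ, AWQ, bitsandbytes NF4) are considerably more sophisticated; the function names and the per-tensor scaling here are simplifications of our own.

```python
def quantize_4bit(weights):
    """Symmetric 4-bit quantization: map floats to integers in [-7, 7].

    One scale factor per tensor (real quantizers typically use
    per-channel or per-group scales for better accuracy).
    """
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [round(w / scale) for w in weights]  # each value fits in 4 bits
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from 4-bit integers."""
    return [v * scale for v in q]

weights = [0.42, -1.31, 0.07, 0.93, -0.55]
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)
# Each restored weight differs from the original by at most scale / 2,
# while storage drops from 16 bits to 4 bits per weight.
```

The quantization error is bounded by half the scale factor, which is why models with well-behaved weight distributions lose surprisingly little quality at 4 bits.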

📱 On-Device Deployment

Run language models on phones, laptops, and edge devices. Covers frameworks, optimization, and real-world deployment patterns.
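A useful first step before any on-device deployment is a back-of-the-envelope memory estimate. The sketch below multiplies parameter count by bits per weight; the 20% runtime overhead factor is our own rough assumption (KV cache, activations, and framework buffers vary by workload and runtime).

```python
def model_memory_gb(n_params, bits_per_weight, overhead=1.2):
    """Rough weight-memory estimate in GB.

    overhead=1.2 is a hypothetical 20% allowance for runtime
    buffers; actual usage depends on context length and framework.
    """
    return n_params * bits_per_weight / 8 / 1e9 * overhead

# A 3.8B-parameter model (roughly Phi-3-mini scale):
fp16 = model_memory_gb(3.8e9, 16)  # too large for most phones
int4 = model_memory_gb(3.8e9, 4)   # 4x smaller: feasible on-device
```

The 4x reduction from 16-bit to 4-bit weights is often the difference between a model that fits in a phone's memory budget and one that does not.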

🎯 Use Case Selection

Know when a small model is the right choice and when you need something larger. Make informed cost-performance trade-off decisions.

Course Lessons

Follow the lessons in order for a structured learning experience, or jump directly to the topic you need.

Prerequisites

What you need before starting this course.

Before You Begin:
  • Basic understanding of how language models work
  • Familiarity with Python programming
  • Understanding of model inference concepts (helpful)
  • Experience with Hugging Face ecosystem (helpful but not required)