AI Skills

Master the practitioner skills that AI engineers reach for every day. Not theory, not framework tours — the practical techniques that separate engineers who ship reliable AI from those who don't. 47 skills, 282 hands-on lessons.

47 Skills
282 Lessons
100% Free
💻 Hands-On Code

All Skills

47 skills organized into 6 categories spanning the full AI engineering stack — from prompts to production.

Prompting & LLM Mastery

💡

Advanced Prompt Engineering

Master the craft of prompt engineering: structured prompts, system messages, role design, output contracts, and reliable instruction-following at scale.

6 Lessons
🧠

Chain-of-Thought Prompting

Use step-by-step reasoning to dramatically improve LLM accuracy on math, logic, multi-hop QA, and planning tasks. Learn zero-shot CoT, few-shot CoT, and self-consistency.

6 Lessons
🎯

Few-Shot Learning Skills

Teach LLMs new tasks by showing 2-10 examples instead of fine-tuning. Master example selection, ordering, and the bias-variance tradeoffs of in-context learning.

6 Lessons
🎭

Role and Persona Prompting

Use roles and personas to steer LLM tone, expertise, and behavior. Build expert assistants, customer service voices, and domain specialists with persona prompts.

6 Lessons
📊

Prompt Compression

Cut prompt tokens 50-90% without losing accuracy. Learn LLMLingua, summarization, schema compression, and token-aware prompt engineering for cost reduction.

6 Lessons
🎨

Multimodal Prompting

Prompt vision-language and audio-language models. Build OCR pipelines, chart readers, document analyzers, and image-grounded chat with GPT-4V, Claude, and Gemini.

6 Lessons
📄

Structured Output Prompting

Force LLMs to emit valid JSON, XML, YAML, or any structured format. Master JSON mode, function calling, Pydantic schemas, and grammar-constrained decoding.

6 Lessons
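The core validation loop can be sketched in stdlib Python: parse the model's raw text as JSON, check required fields and types, and raise a descriptive error the caller can feed back to the model on retry. The ticket schema here is a made-up example; real pipelines typically use Pydantic models for this.

```python
import json

# Hypothetical schema for illustration: field name -> required type.
REQUIRED = {"name": str, "priority": int, "tags": list}

def parse_ticket(raw: str) -> dict:
    """Parse an LLM response into a ticket dict, enforcing a schema.

    Raises ValueError on malformed JSON or type mismatches so the caller
    can retry the model with the error message appended to the prompt.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    for key, typ in REQUIRED.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        if not isinstance(data[key], typ):
            raise ValueError(f"field {key!r} must be {typ.__name__}")
    return data
```

JSON mode and function calling reduce how often this fails, but never to zero, so the validate-and-retry loop stays.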
🛡

Prompt Injection Defense

Defend production LLM apps from prompt injection, jailbreaks, and indirect attacks. Layer detection, isolation, and constraint techniques to harden user-facing AI.

6 Lessons

RAG & Retrieval

📑

Document Chunking Strategies

The chunking strategy makes or breaks RAG quality. Master fixed-size, recursive, semantic, structural, and late chunking patterns for documents, code, and PDFs.

6 Lessons
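The simplest of these patterns fits in a few lines: a fixed-size character chunker with overlap. Sizes here are arbitrary; production chunkers usually count tokens and respect sentence or section boundaries.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap.

    Overlap keeps context that straddles a boundary retrievable
    from both neighboring chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

Recursive, semantic, and structural chunking replace the fixed `step` with boundaries derived from the document itself.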
🔬

Vector Embedding Selection

Choose the right embedding model for your data, language, and budget. Compare OpenAI, Cohere, Voyage, BGE, E5, and learn when to fine-tune your own.

6 Lessons
🔍

Hybrid Search (BM25 + Vector)

Combine keyword (BM25) and semantic (vector) search to beat either alone. Learn fusion techniques, RRF, weighted scoring, and tuning for production retrieval.

6 Lessons
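Reciprocal Rank Fusion (RRF), the workhorse fusion technique, needs no score normalization at all; a minimal sketch operating on ranked lists of document IDs:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse multiple ranked lists of doc IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 is the conventional constant. Appearing in both lists beats
    ranking first in only one.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Feed it the BM25 ranking and the vector ranking; documents that both retrievers surface float to the top.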
📋

Reranking Models

Boost RAG accuracy 10-30% with a second-stage reranker. Master cross-encoders, Cohere Rerank, BGE-Reranker, and ColBERT for precision-critical retrieval.

6 Lessons

Query Rewriting and Expansion

Rewrite vague user queries into search-optimized forms. Master HyDE, multi-query, query decomposition, and step-back prompting for better retrieval.

6 Lessons
🏷

Metadata Filtering

Combine vector search with structured filters: dates, tenants, permissions, categories. Build multi-tenant RAG that returns only the documents users are allowed to see.

6 Lessons
📊

RAG Evaluation Metrics

Measure RAG quality across retrieval (hit rate, MRR, NDCG) and generation (faithfulness, answer relevance). Build eval suites with RAGAS, TruLens, and DeepEval.

6 Lessons
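The retrieval-side metrics are simple enough to compute by hand, which is worth doing once before reaching for a framework. Hit rate and MRR over per-query result lists:

```python
def hit_rate(results: list[list[str]], relevant: list[str], k: int = 5) -> float:
    """Fraction of queries whose relevant doc appears in the top-k results."""
    hits = sum(rel in res[:k] for res, rel in zip(results, relevant))
    return hits / len(relevant)

def mrr(results: list[list[str]], relevant: list[str]) -> float:
    """Mean Reciprocal Rank: average of 1/rank of the first relevant hit,
    0 for queries where it never appears."""
    total = 0.0
    for res, rel in zip(results, relevant):
        if rel in res:
            total += 1.0 / (res.index(rel) + 1)
    return total / len(relevant)
```

This sketch assumes one relevant doc per query; RAGAS and friends generalize to graded relevance and add the generation-side metrics.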
📚

Long Context RAG

Put the 100K-1M token context windows of Claude, Gemini, and GPT-4 to work. Learn when long context replaces RAG and when to combine them.

6 Lessons

Model Customization

🔧

Fine-Tuning with LoRA

Fine-tune 7B-70B parameter LLMs on consumer GPUs using LoRA adapters. Train domain-specific models for 1-10% of the cost of full fine-tuning.

6 Lessons

QLoRA Quantized Fine-Tuning

Fine-tune 70B parameter models on a single A100 with 4-bit quantization. Master bitsandbytes, NF4, double quantization, and the QLoRA training recipe.

6 Lessons
📝

Instruction Tuning

Turn a base LLM into an instruction-following assistant. Curate datasets like Alpaca, Dolly, and OpenHermes; format with chat templates; train SFT pipelines.

6 Lessons

DPO and RLHF Alignment

Align LLMs with human preferences using Direct Preference Optimization (DPO) and Reinforcement Learning from Human Feedback (RLHF). Build preference datasets and training loops.

6 Lessons
🧶

Model Distillation

Compress a large teacher model into a small student model. Cut inference cost 10-100x while preserving most capability through knowledge distillation.

6 Lessons
📦

Model Quantization (GGUF, AWQ, GPTQ)

Run 70B models on a laptop or 7B models on a phone via 4-bit and 2-bit quantization. Master GGUF, AWQ, GPTQ, and bitsandbytes quantization formats.

6 Lessons
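The core idea behind every format is the same: map floats onto a small integer grid plus a scale factor. A toy symmetric int8 round trip (GGUF, AWQ, and GPTQ add block-wise scales, calibration, and packing on top of this):

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric int8 quantization: one scale maps floats onto [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid 0 for all-zero input
    return [round(w / scale) for w in weights], scale

def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate floats; error is bounded by half a quantization step."""
    return [q * scale for q in quantized]
```

Storing one int8 plus a shared scale instead of a float32 per weight is where the ~4x memory saving comes from; 4-bit formats push the same trade further.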
🔗

Model Merging

Combine the strengths of multiple fine-tuned models without further training. Master TIES, DARE, SLERP, and frankenmerges with mergekit.

6 Lessons
📖

Continued Pretraining

Inject new domain knowledge by continuing pretraining on raw text. Build models that know your codebase, medical literature, or legal corpus deeply.

6 Lessons

Agent Skills

Production AI

💰

LLM Cost Optimization

Cut LLM bills 50-90% without sacrificing quality. Master model routing, prompt caching, batch inference, and the levers that drive production LLM cost.

6 Lessons
💾

Prompt Caching Mastery

Slash latency and cost with prompt caching. Master Anthropic prompt caching, OpenAI auto-caching, semantic caching, and cache invalidation patterns.

6 Lessons
🌊

Streaming LLM Responses

Build snappy chat UIs with streaming responses. Master SSE, WebSockets, partial JSON parsing, and stream cancellation across web, mobile, and serverless apps.

6 Lessons
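The SSE wire format is simple enough to parse by hand, which demystifies what streaming clients do. A sketch for OpenAI-style streams, where each event is a `data:` line and the stream ends with `data: [DONE]`:

```python
def parse_sse(raw: str) -> list[str]:
    """Extract payloads from Server-Sent Events text.

    Blank lines separate events; a '[DONE]' sentinel terminates the stream.
    Real clients parse incrementally as bytes arrive rather than from a
    complete string.
    """
    payloads = []
    for line in raw.splitlines():
        if not line.startswith("data:"):
            continue  # ignore comments, event names, and blank separators
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        payloads.append(payload)
    return payloads
```

In a live stream each payload is a JSON delta; accumulating partial JSON across chunks is the next layer up.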

Inference Latency Tuning

Cut p50 and p99 LLM latency with the right techniques. Master TTFT optimization, speculative decoding, KV cache reuse, and batching for low-latency inference.

6 Lessons
🚀

Model Serving with vLLM

Serve LLMs in production with vLLM. Master PagedAttention, continuous batching, tensor parallelism, and the OpenAI-compatible API for high-throughput inference.

6 Lessons
💻

GPU Memory Management

Fit big models on small GPUs. Master OOM debugging, gradient checkpointing, model offloading, FlashAttention, and the techniques that double effective memory.

6 Lessons
📦

Batch Inference Optimization

Process millions of LLM calls cheaply with batch APIs. Master OpenAI/Anthropic batch APIs, Ray Data, async parallelism, and offline LLM workflows.

6 Lessons
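When the provider batch APIs don't fit, async parallelism with a concurrency cap is the standard pattern for self-managed batches. A sketch where `call` stands in for any async LLM client function:

```python
import asyncio

async def bounded_batch(prompts: list[str], call, max_concurrency: int = 8) -> list:
    """Run many LLM calls concurrently, capped by a semaphore.

    `call` is any async function taking a prompt and returning a result;
    asyncio.gather preserves input order in the output.
    """
    sem = asyncio.Semaphore(max_concurrency)

    async def one(prompt: str):
        async with sem:  # at most max_concurrency calls in flight
            return await call(prompt)

    return await asyncio.gather(*(one(p) for p in prompts))
```

The cap keeps you under provider rate limits; production versions add retry with backoff inside `one`.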
💵

Token Budget Management

Stay within context windows and cost ceilings. Master tokenizers, dynamic context trimming, summarization, and budget-aware request routing.

6 Lessons
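The trimming policy itself is a few lines once you can count tokens. A sketch using the rough ~4 characters-per-token heuristic for English; real systems count with the model's actual tokenizer (e.g. tiktoken):

```python
def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)

def trim_to_budget(messages: list[dict], budget: int) -> list[dict]:
    """Drop the oldest non-system messages until the estimate fits the budget.

    The system message is always kept; only conversation turns are trimmed.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    def total(msgs: list[dict]) -> int:
        return sum(estimate_tokens(m["content"]) for m in msgs)

    while rest and total(system + rest) > budget:
        rest.pop(0)  # sacrifice the oldest turn first
    return system + rest
```

Summarizing dropped turns instead of discarding them is the usual refinement once plain trimming loses too much context.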

Evaluation & Safety

LLM-as-Judge Evaluation

Use a strong LLM to grade the outputs of another. Master pairwise comparison, rubric scoring, judge calibration, and avoiding the bias traps.

6 Lessons
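Position bias, where judges favor whichever answer is shown first, has a cheap mitigation: judge both orderings and only accept verdicts that survive the swap. A sketch where `judge` stands in for any callable wrapping the judge-model call:

```python
def pairwise_judge(judge, prompt: str, answer_a: str, answer_b: str) -> str:
    """Position-debiased pairwise comparison.

    `judge(prompt, first, second)` returns "first" or "second" for which
    shown answer is better. Run both orderings; a verdict counts only if
    it is consistent across the swap, otherwise declare a tie.
    """
    v1 = judge(prompt, answer_a, answer_b)  # A shown first
    v2 = judge(prompt, answer_b, answer_a)  # B shown first
    if v1 == "first" and v2 == "second":
        return "A"
    if v1 == "second" and v2 == "first":
        return "B"
    return "tie"  # inconsistent verdicts: position bias detected
```

The same swap-and-compare trick generalizes to detecting length bias and self-preference.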
👀

Hallucination Detection

Detect when LLMs make things up. Master citation grounding, NLI verification, self-consistency checks, and SelfCheckGPT for production hallucination detection.

6 Lessons
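The self-consistency check reduces to a voting problem: sample the same question several times at temperature > 0 and measure agreement. A minimal sketch with an arbitrary agreement threshold:

```python
from collections import Counter

def self_consistency(answers: list[str], threshold: float = 0.6):
    """Flag likely hallucination when independently sampled answers disagree.

    Returns (majority answer, agreement ratio, reliable?). The 0.6
    threshold is illustrative; tune it on your own eval data.
    """
    normalized = [a.strip().lower() for a in answers]  # naive normalization
    top, count = Counter(normalized).most_common(1)[0]
    agreement = count / len(normalized)
    return top, agreement, agreement >= threshold
```

Exact string matching is the weakest link here; SelfCheckGPT replaces it with NLI-based comparison between samples.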

Bias and Fairness Auditing

Audit LLMs and ML models for demographic bias. Master fairness metrics, counterfactual probing, BBQ benchmark, and bias mitigation techniques.

6 Lessons
🔥

AI Red Teaming

Stress-test AI systems for harmful, unsafe, and policy-violating behavior. Master jailbreak techniques, automated red teaming, and reporting findings.

6 Lessons
🛡

PII Detection and Redaction

Strip personal data from prompts and outputs. Master Microsoft Presidio, regex, NER, and LLM-based redaction for HIPAA, GDPR, and CCPA compliance.

6 Lessons
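The regex layer is the easiest to stand up and catches well-formed identifiers; a sketch with deliberately simplified patterns (real rules, and Presidio's defaults, are far more thorough):

```python
import re

# Order matters: SSN must run before the looser PHONE pattern,
# which would otherwise consume 123-45-6789 as a phone number.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace matched PII spans with type placeholders."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Regex alone misses names, addresses, and misspelled identifiers, which is why production stacks layer NER and LLM-based detection on top.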

AI Output Validation

Validate every LLM output before it ships to users or systems. Master Pydantic, Guardrails AI, JSON Schema, and the validation patterns that prevent disasters.

6 Lessons
📁

Eval Dataset Creation

Build the eval datasets your AI features need. Master synthetic generation, human labeling, edge case mining, and the iteration loop that compounds quality.

6 Lessons
🔎

Production AI Monitoring

Monitor AI in production: cost, latency, quality, drift, and abuse. Master LangSmith, Arize, Helicone, and the observability stack for LLM applications.

6 Lessons

Why a Skills Track?

Projects show you what to build. Skills show you the techniques you reuse across every project.

🔨

Transferable

Each skill applies to dozens of projects. Learn it once, use it for the rest of your career.

🎯

Practical

Every lesson includes runnable code and a production checklist. No theory dumps.

📊

Job-Ready

The exact skills hiring managers screen for in AI engineer, ML engineer, and applied scientist roles.

🧩

Composable

Skills are designed to combine. Pair them to solve problems no single technique can.