llama.cpp Mastery
Master llama.cpp, the C++ inference engine that runs LLMs on hardware ranging from laptops and servers to edge devices. Learn the GGUF file format, quantization formats, the Metal and CUDA backends, and performance tuning for CPU, GPU, and edge deployments.
6 Lessons
💻 Code Examples
✅ Production-Ready
100% Free
Lessons in This Topic
Work through these 6 lessons in order, or jump to whichever lesson you need most.
Lilly Tech Systems