Methods Taxonomy
Papers With Code organizes ML techniques into a comprehensive taxonomy of methods. Learn to navigate this knowledge graph to understand how architectures and components relate to each other.
What Are Methods?
In the Papers With Code taxonomy, a method is a specific technique, component, or architectural pattern used in machine learning. Methods range from high-level architectures (like Transformers) to specific components (like Multi-Head Attention) to training techniques (like Dropout).
The Methods Hierarchy
Methods are organized in a tree-like hierarchy:
- General categories: Attention, Convolutions, Normalization, Activation Functions, etc.
- Specific methods: Self-Attention, Depthwise Separable Convolution, Layer Normalization, GELU, etc.
- Variants: Flash Attention, Grouped Query Attention, RoPE, etc.
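The three levels above form a simple tree, which can be sketched as a nested mapping. This is an illustrative toy structure, not the actual Papers With Code taxonomy data (the method names and groupings here are only examples):

```python
# A minimal sketch of the category -> method -> variant hierarchy.
# Names and groupings are illustrative, not the real taxonomy.
taxonomy = {
    "Attention": {
        "Self-Attention": ["Flash Attention", "Grouped Query Attention"],
        "Cross-Attention": [],
    },
    "Normalization": {
        "Layer Normalization": ["RMSNorm"],
        "Batch Normalization": [],
    },
}

def variants_of(category: str, method: str) -> list[str]:
    """Look up the known variants of a method within a category."""
    return taxonomy.get(category, {}).get(method, [])

print(variants_of("Attention", "Self-Attention"))
```

Walking this structure top-down mirrors how you would browse the site: pick a category page, then a method page, then follow its variant links.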
Key Method Categories
| Category | Examples | Used In |
|---|---|---|
| Attention Mechanisms | Self-Attention, Cross-Attention, Flash Attention | Transformers, LLMs, Vision models |
| Normalization | BatchNorm, LayerNorm, RMSNorm, GroupNorm | Nearly all deep learning models |
| Activation Functions | ReLU, GELU, SiLU/Swish, Mish | All neural networks |
| Regularization | Dropout, Weight Decay, Label Smoothing | Training procedures |
| Positional Encoding | Sinusoidal, RoPE, ALiBi, Learned | Transformers, sequence models |
| Loss Functions | Cross-Entropy, Focal Loss, Contrastive Loss | Training objectives |
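As a concrete instance of one table row, GELU (from the activation-function category) can be computed directly from its definition, GELU(x) = x · Φ(x), where Φ is the standard normal CDF. A standalone sketch, not a library implementation:

```python
import math

def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def relu(x: float) -> float:
    """ReLU, for comparison: hard zero below 0."""
    return max(0.0, x)

# Unlike ReLU's hard cutoff at 0, GELU is smooth and lets small
# negative inputs pass through slightly attenuated.
print(round(gelu(1.0), 4), relu(1.0))
print(round(gelu(-0.5), 4), relu(-0.5))
```

In practice you would use a framework's built-in version (e.g. `torch.nn.GELU`); the point here is that a method page's description usually gives you enough to reproduce the operation from scratch.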
Using Method Pages
Each method page on Papers With Code provides:
- Description: What the method does and how it works
- Diagram: Visual explanation of the method (when available)
- Papers: The original paper and subsequent papers that use or improve the method
- Code: Implementations in various frameworks (PyTorch, TensorFlow, JAX)
- Related methods: Parent and child methods in the taxonomy
Tracing Architectural Evolution
The methods section is particularly useful for understanding how architectures have evolved over time:
- Transformers: Scaled Dot-Product Attention → Multi-Head Attention → Grouped Query Attention, with Flash Attention as a memory-efficient implementation of the same computation
- CNNs: Standard Convolution → Depthwise Separable → Inverted Residuals → ConvNeXt
- Normalization: BatchNorm → LayerNorm → RMSNorm (used in modern LLMs)
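The normalization line above can be made concrete by comparing its last two steps: RMSNorm drops LayerNorm's mean-centering and normalizes only by the root-mean-square, which is cheaper and works well in practice for LLMs. A minimal pure-Python sketch (per-vector, with the learnable gain omitted for brevity):

```python
import math

def layer_norm(x: list[float], eps: float = 1e-5) -> list[float]:
    """LayerNorm: subtract the mean, divide by the standard deviation."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def rms_norm(x: list[float], eps: float = 1e-5) -> list[float]:
    """RMSNorm: divide by the root-mean-square; no centering step."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]

x = [1.0, 2.0, 3.0, 4.0]
print([round(v, 3) for v in layer_norm(x)])
print([round(v, 3) for v in rms_norm(x)])
```

LayerNorm's output always has (approximately) zero mean; RMSNorm's does not, which is exactly the simplification the method page for RMSNorm describes.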
Finding Implementations
When you find a method you want to use, look for implementations in your preferred framework. Many methods are available in standard libraries:
- PyTorch: torch.nn contains most standard methods
- Hugging Face Transformers: Pre-built transformer components and models
- timm: PyTorch Image Models library with vision method implementations
- Custom repos: For newer methods, check the linked GitHub repositories