Intermediate

Methods Taxonomy

Papers With Code organizes ML techniques into a comprehensive taxonomy of methods. Learn to navigate this knowledge graph to understand how architectures and components relate to each other.

What Are Methods?

In the Papers With Code taxonomy, a method is a specific technique, component, or architectural pattern used in machine learning. Methods range from high-level architectures (like Transformers) to specific components (like Multi-Head Attention) to training techniques (like Dropout).

The Methods Hierarchy

Methods are organized in a tree-like hierarchy:

  • General categories: Attention, Convolutions, Normalization, Activation Functions, etc.
  • Specific methods: Self-Attention, Depthwise Separable Convolution, Layer Normalization, GELU, etc.
  • Variants: Flash Attention, Grouped Query Attention, RoPE, etc.
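The parent–child structure above can be pictured as a small lookup table. The sketch below is a hypothetical miniature of the taxonomy (the real site stores far more methods); the method names follow the examples in this section, and `lineage` walks from a specific variant up to its general category:

```python
# Hypothetical miniature of the methods taxonomy: each method maps to its
# parent (None marks a top-level category). Illustration only -- not the
# actual Papers With Code data model.
PARENT = {
    "Attention": None,
    "Self-Attention": "Attention",
    "Flash Attention": "Self-Attention",
    "Normalization": None,
    "Layer Normalization": "Normalization",
}

def lineage(method):
    """Walk from a specific method up through its ancestors to the root category."""
    chain = [method]
    while PARENT[method] is not None:
        method = PARENT[method]
        chain.append(method)
    return chain

print(lineage("Flash Attention"))
# e.g. a variant resolves to: variant -> specific method -> general category
```

Tracing a variant's lineage this way mirrors how the site's "related methods" links let you climb from Flash Attention back to Attention in general.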

Key Method Categories

| Category             | Examples                                           | Used In                          |
|----------------------|----------------------------------------------------|----------------------------------|
| Attention Mechanisms | Self-Attention, Cross-Attention, Flash Attention   | Transformers, LLMs, vision models |
| Normalization        | BatchNorm, LayerNorm, RMSNorm, GroupNorm           | Nearly all deep learning models  |
| Activation Functions | ReLU, GELU, SiLU/Swish, Mish                       | All neural networks              |
| Regularization       | Dropout, Weight Decay, Label Smoothing             | Training procedures              |
| Positional Encoding  | Sinusoidal, RoPE, ALiBi, Learned                   | Transformers, sequence models    |
| Loss Functions       | Cross-Entropy, Focal Loss, Contrastive Loss        | Training objectives              |
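To make one of these categories concrete, here is a minimal plain-Python sketch of scaled dot-product attention, the computation at the heart of the Self-Attention family in the table. This is a teaching sketch (no batching, masking, or learned projections), not a production implementation:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Sketch of scaled dot-product attention: each query attends to all
    keys, and the attention weights mix the corresponding values."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs
```

Variants in the taxonomy such as Multi-Head Attention and Flash Attention keep this core computation but change how it is split across heads or executed on hardware.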

Using Method Pages

Each method page on Papers With Code provides:

  • Description: What the method does and how it works
  • Diagram: Visual explanation of the method (when available)
  • Papers: The original paper and subsequent papers that use or improve the method
  • Code: Implementations in various frameworks (PyTorch, TensorFlow, JAX)
  • Related methods: Parent and child methods in the taxonomy

Learning strategy: Use the methods taxonomy to build a mental model of how ML techniques relate. Start from a high-level category and drill down into specific variants. Understanding the taxonomy helps you read papers faster because you can quickly place new techniques in context.

Tracing Architectural Evolution

The methods section is particularly useful for understanding how architectures have evolved over time:

  1. Transformers: Original Attention → Multi-Head Attention → Grouped Query Attention → Flash Attention
  2. CNNs: Standard Convolution → Depthwise Separable → Inverted Residuals → ConvNeXt
  3. Normalization: BatchNorm → LayerNorm → RMSNorm (used in modern LLMs)
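The normalization lineage in item 3 is easy to see in code. Below is a framework-agnostic sketch contrasting LayerNorm with RMSNorm (learnable scale/shift parameters omitted for clarity): RMSNorm drops the mean subtraction and normalizes by the root mean square alone, which is why it is cheaper and favored in modern LLMs:

```python
import math

def layer_norm(x, eps=1e-5):
    # LayerNorm: center by the mean, then divide by the standard deviation.
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    return [(xi - mean) / math.sqrt(var + eps) for xi in x]

def rms_norm(x, eps=1e-5):
    # RMSNorm: divide by the root mean square only -- no mean subtraction.
    rms = math.sqrt(sum(xi * xi for xi in x) / len(x) + eps)
    return [xi / rms for xi in x]
```

Comparing the two on the same vector shows the difference: `layer_norm` output always has zero mean, while `rms_norm` preserves the input's mean direction and only rescales its magnitude.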

Finding Implementations

When you find a method you want to use, look for implementations in your preferred framework. Many methods are available in standard libraries:

  • PyTorch: torch.nn contains most standard methods
  • Hugging Face Transformers: Pre-built transformer components and models
  • timm: PyTorch Image Models library with vision method implementations
  • Custom repos: For newer methods, check the linked GitHub repositories