Intermediate

Methods Taxonomy

Papers With Code organizes ML techniques into a comprehensive taxonomy of methods. Learn to navigate this knowledge graph to understand how architectures and components relate to each other.

What Are Methods?

In the Papers With Code taxonomy, a method is a specific technique, component, or architectural pattern used in machine learning. Methods range from high-level architectures (like Transformers) to specific components (like Multi-Head Attention) to training techniques (like Dropout).

The Methods Hierarchy

Methods are organized in a tree-like hierarchy:

  • General categories: Attention, Convolutions, Normalization, Activation Functions, etc.
  • Specific methods: Self-Attention, Depthwise Separable Convolution, Layer Normalization, GELU, etc.
  • Variants: Flash Attention, Grouped Query Attention, RoPE, etc.
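The parent–child structure above can be pictured as a small lookup table. The sketch below is a hypothetical miniature of the taxonomy (the real site stores far more methods); the method names follow the examples in this section, and `lineage` walks from a specific variant up to its general category:

```python
# Hypothetical miniature of the methods taxonomy: each method maps to its
# parent (None marks a top-level category). Illustration only -- not the
# actual Papers With Code data model.
PARENT = {
    "Attention": None,
    "Self-Attention": "Attention",
    "Flash Attention": "Self-Attention",
    "Normalization": None,
    "Layer Normalization": "Normalization",
}

def lineage(method):
    """Walk from a specific method up through its ancestors to the root category."""
    chain = [method]
    while PARENT[method] is not None:
        method = PARENT[method]
        chain.append(method)
    return chain

print(lineage("Flash Attention"))
# e.g. a variant resolves to: variant -> specific method -> general category
```

Tracing a variant's lineage this way mirrors how the site's "related methods" links let you climb from Flash Attention back to Attention in general.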

Key Method Categories

| Category             | Examples                                           | Used In                          |
|----------------------|----------------------------------------------------|----------------------------------|
| Attention Mechanisms | Self-Attention, Cross-Attention, Flash Attention   | Transformers, LLMs, vision models |
| Normalization        | BatchNorm, LayerNorm, RMSNorm, GroupNorm           | Nearly all deep learning models  |
| Activation Functions | ReLU, GELU, SiLU/Swish, Mish                       | All neural networks              |
| Regularization       | Dropout, Weight Decay, Label Smoothing             | Training procedures              |
| Positional Encoding  | Sinusoidal, RoPE, ALiBi, Learned                   | Transformers, sequence models    |
| Loss Functions       | Cross-Entropy, Focal Loss, Contrastive Loss        | Training objectives              |
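To make one of these categories concrete, here is a minimal plain-Python sketch of scaled dot-product attention, the computation at the heart of the Self-Attention family in the table. This is a teaching sketch (no batching, masking, or learned projections), not a production implementation:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Sketch of scaled dot-product attention: each query attends to all
    keys, and the attention weights mix the corresponding values."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs
```

Variants in the taxonomy such as Multi-Head Attention and Flash Attention keep this core computation but change how it is split across heads or executed on hardware.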

Using Method Pages

Each method page on Papers With Code provides:

  • Description: What the method does and how it works
  • Diagram: Visual explanation of the method (when available)
  • Papers: The original paper and subsequent papers that use or improve the method
  • Code: Implementations in various frameworks (PyTorch, TensorFlow, JAX)
  • Related methods: Parent and child methods in the taxonomy

Learning strategy: Use the methods taxonomy to build a mental model of how ML techniques relate. Start from a high-level category and drill down into specific variants. Understanding the taxonomy helps you read papers faster because you can quickly place new techniques in context.

Tracing Architectural Evolution

The methods section is particularly useful for understanding how architectures have evolved over time:

  1. Transformers: Original Attention → Multi-Head Attention → Grouped Query Attention → Flash Attention
  2. CNNs: Standard Convolution → Depthwise Separable → Inverted Residuals → ConvNeXt
  3. Normalization: BatchNorm → LayerNorm → RMSNorm (used in modern LLMs)
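The normalization lineage in item 3 is easy to see in code. Below is a framework-agnostic sketch contrasting LayerNorm with RMSNorm (learnable scale/shift parameters omitted for clarity): RMSNorm drops the mean subtraction and normalizes by the root mean square alone, which is why it is cheaper and favored in modern LLMs:

```python
import math

def layer_norm(x, eps=1e-5):
    # LayerNorm: center by the mean, then divide by the standard deviation.
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    return [(xi - mean) / math.sqrt(var + eps) for xi in x]

def rms_norm(x, eps=1e-5):
    # RMSNorm: divide by the root mean square only -- no mean subtraction.
    rms = math.sqrt(sum(xi * xi for xi in x) / len(x) + eps)
    return [xi / rms for xi in x]
```

Comparing the two on the same vector shows the difference: `layer_norm` output always has zero mean, while `rms_norm` preserves the input's mean direction and only rescales its magnitude.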

Finding Implementations

When you find a method you want to use, look for implementations in your preferred framework. Many methods are available in standard libraries:

  • PyTorch: torch.nn contains most standard methods
  • Hugging Face Transformers: Pre-built transformer components and models
  • timm: PyTorch Image Models library with vision method implementations
  • Custom repos: For newer methods, check the linked GitHub repositories