ML Algorithm Landscape Overview
Machine learning encompasses a vast ecosystem of algorithms, each designed for specific types of problems. This overview maps out the full landscape, helps you understand how to read each algorithm entry in this directory, and provides a guide for selecting the right algorithm for your task.
The 10 Algorithm Categories
We organize 100+ algorithms into 10 major categories based on their learning paradigm, problem type, and output:
| # | Category | Count | Learning Type | Description |
|---|---|---|---|---|
| 1 | Regression | 15 | Supervised | Predict continuous numeric values |
| 2 | Classification | 17 | Supervised | Predict discrete class labels |
| 3 | Clustering | 11 | Unsupervised | Group similar data points |
| 4 | Dimensionality Reduction | 10 | Unsupervised | Reduce feature space while preserving information |
| 5 | Ensemble Methods | 7 | Supervised | Combine multiple models for better performance |
| 6 | Reinforcement Learning | 14 | RL | Learn optimal actions through environment interaction |
| 7 | Neural Networks & Deep Learning | 14 | Supervised / Unsupervised | Layered architectures for complex pattern recognition |
| 8 | Time Series | 5 | Supervised | Forecast future values from sequential data |
| 9 | Recommendation | 4 | Varies | Suggest items based on user behavior |
| 10 | Other Algorithms | 10+ | Varies | Anomaly detection, association rules, probabilistic models |
How to Read Each Algorithm Entry
Every algorithm in this directory follows a consistent format to make comparison easy:
Algorithm Name
- Description: A concise explanation of what the algorithm does and the core idea behind it.
- Type: The learning paradigm (supervised, unsupervised, semi-supervised, reinforcement).
- Use Cases: Real-world scenarios where this algorithm excels.
- Key Parameters: The most important hyperparameters you need to tune.
- Pros: Strengths and advantages of the algorithm.
- Cons: Limitations and weaknesses.
- Python Code: A working code snippet using popular libraries (scikit-learn, TensorFlow, PyTorch).
- Complexity: Time complexity (training and inference) and space complexity.
Algorithm Selection Guide
Choosing the right algorithm depends on your data, your problem, and your constraints. Use this decision framework:
Step 1: Define Your Problem Type
| Question | If Yes | Go To |
|---|---|---|
| Do you have labeled data with a continuous target? | Regression | Regression |
| Do you have labeled data with a categorical target? | Classification | Classification |
| Do you need to find groups in unlabeled data? | Clustering | Clustering |
| Do you have too many features? | Dimensionality Reduction | Dimensionality Reduction |
| Do you need to maximize cumulative reward? | Reinforcement Learning | Reinforcement Learning |
| Is your data sequential or temporal? | Time Series / RNN | Time Series |
| Do you need image/text/audio processing? | Deep Learning | Neural Networks |
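The questions in the table above can be read as a small lookup. The sketch below is one possible encoding, not part of the directory itself: the function name `suggest_category` and its arguments (`labeled`, `target`, `goal`) are hypothetical names chosen for illustration, and the fallback to "Other Algorithms" is an assumption.

```python
def suggest_category(labeled, target=None, goal=None):
    """Map answers to the Step 1 questions onto a category name.

    labeled: True if the data has labels.
    target:  "continuous" or "categorical" (only meaningful when labeled).
    goal:    optional hint: "groups", "reduce_features", "reward",
             "sequential", or "perception" (image/text/audio).
    All argument names here are illustrative assumptions.
    """
    # Goals that decide the category regardless of labels.
    by_goal = {
        "reward": "Reinforcement Learning",
        "sequential": "Time Series",
        "perception": "Neural Networks",
        "reduce_features": "Dimensionality Reduction",
        "groups": "Clustering",
    }
    if goal in by_goal:
        return by_goal[goal]
    if labeled and target == "continuous":
        return "Regression"
    if labeled and target == "categorical":
        return "Classification"
    # Assumed fallback when none of the questions is answered "yes".
    return "Other Algorithms"


print(suggest_category(labeled=True, target="continuous"))
print(suggest_category(labeled=False, goal="groups"))
```

Real problems often answer "yes" to more than one question (for example, labeled sequential data), so treat the result as a starting point rather than a verdict.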
Step 2: Consider Your Constraints
| Constraint | Favor | Avoid |
|---|---|---|
| Small dataset (<1000 samples) | Linear models, Naive Bayes, KNN, SVM | Deep learning, ensemble methods |
| Large dataset (>1M samples) | SGD-based models, LightGBM, deep learning | KNN, SVM with RBF kernel |
| High dimensionality | Lasso, Ridge, PCA, tree-based models | KNN, plain linear regression |
| Interpretability required | Linear/logistic regression, decision trees | Neural networks, ensemble methods |
| Real-time inference | Linear models, small trees, KNN with KD-tree (low-dimensional data) | Large ensembles, deep networks |
| Missing values common | XGBoost, LightGBM, CatBoost | SVM, KNN, linear models |
| Categorical features | CatBoost, decision trees, Naive Bayes | SVM, KNN (without encoding) |
Step 3: Start Simple, Then Iterate
- Baseline: Start with the simplest algorithm for your problem type (linear regression, logistic regression, K-Means).
- Evaluate: Measure performance with appropriate metrics (RMSE, accuracy, F1, silhouette score).
- Improve: Try more complex algorithms (ensemble methods, neural networks).
- Tune: Optimize hyperparameters with cross-validation.
- Compare: Use this directory to find alternatives in the same category.
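The five steps above can be sketched with scikit-learn roughly as follows. The dataset (`load_breast_cancer`), the choice of a random forest as the "more complex" model, and the small parameter grid are all assumptions picked for illustration, not recommendations from this directory.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Baseline + Evaluate: scaled logistic regression scored by 5-fold F1.
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
baseline_f1 = cross_val_score(baseline, X, y, cv=5, scoring="f1").mean()

# Improve: try a more complex model under the same evaluation protocol.
forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest_f1 = cross_val_score(forest, X, y, cv=5, scoring="f1").mean()

# Tune: search a small hyperparameter grid with cross-validation.
search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid={"max_depth": [None, 5, 10], "min_samples_leaf": [1, 5]},
    cv=5,
    scoring="f1",
)
search.fit(X, y)

# Compare: report all candidates side by side.
print(f"baseline logistic regression F1: {baseline_f1:.3f}")
print(f"random forest F1:                {forest_f1:.3f}")
print(f"tuned random forest F1:          {search.best_score_:.3f}")
print("best parameters:", search.best_params_)
```

Keeping the cross-validation setup identical across candidates is what makes the comparison meaningful; if the baseline is already close to the tuned model, the simpler one is usually the better choice.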
Complexity Comparison
Below is a high-level comparison of training and prediction complexity for popular algorithms. Here n = number of samples, p = number of features, k = number of clusters/neighbors, T = number of trees, and n_sv = number of support vectors.
| Algorithm | Training Time | Prediction Time | Space |
|---|---|---|---|
| Linear Regression | O(np²) | O(p) | O(p) |
| Logistic Regression | O(np) | O(p) | O(p) |
| KNN | O(1) | O(np) | O(np) |
| SVM (RBF kernel) | O(n²p) to O(n³) | O(n_sv · p) | O(n_sv · p) |
| Decision Tree | O(np log n) | O(log n) | O(nodes) |
| Random Forest | O(T · np log n) | O(T · log n) | O(T · nodes) |
| Gradient Boosting | O(T · np) | O(T · log n) | O(T · nodes) |
| K-Means | O(nkp · iterations) | O(kp) | O(kp) |
| DBSCAN | O(n log n) with index | N/A | O(n) |
| PCA | O(np²) | O(np) | O(p²) |
| Naive Bayes | O(np) | O(p) | O(p · classes) |
| Neural Network (MLP) | O(epochs · n · layers · neurons²) | O(layers · neurons²) | O(weights) |
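To get a feel for what these growth rates mean in practice, the following sketch plugs concrete sizes into a few of the training-time formulas from the table. The function name `training_ops` and its dictionary keys are hypothetical; constants are ignored, so the results only indicate orders of magnitude, and the SVM entry uses the upper end of the range given in the table.

```python
from math import log2


def training_ops(algorithm, n, p, k=1, T=1, iterations=1):
    """Return the leading-order training cost from the complexity table.

    Constant factors are dropped, so use the result to compare orders
    of magnitude, not to predict wall-clock time.
    """
    formulas = {
        "linear_regression": n * p**2,
        "logistic_regression": n * p,
        "decision_tree": n * p * log2(n),
        "random_forest": T * n * p * log2(n),
        "k_means": n * k * p * iterations,
        "svm_rbf_worst_case": n**3,
    }
    return formulas[algorithm]


# Compare a few algorithms on a large, moderately wide dataset.
n, p = 1_000_000, 100
for name in ("logistic_regression", "linear_regression", "svm_rbf_worst_case"):
    print(f"{name:>22}: {training_ops(name, n, p):.1e}")
```

At this size the cubic SVM bound is many orders of magnitude above the linear-model costs, which is the reasoning behind steering large datasets away from kernel SVMs.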
History of ML Algorithms Timeline
The statistical roots of machine learning go back more than two centuries, although most of the field has developed since the 1940s. Here are the key milestones:
| Year | Algorithm / Milestone | Creator(s) |
|---|---|---|
| 1805 | Least Squares Regression | Legendre, Gauss |
| 1936 | Linear Discriminant Analysis (LDA) | R.A. Fisher |
| 1943 | McCulloch-Pitts Neuron (first neural model) | McCulloch, Pitts |
| 1957 | Perceptron | Frank Rosenblatt |
| 1958 | Logistic Regression (popularized) | David Cox |
| 1963 | Support Vector Machine (linear) | Vapnik, Lerner |
| 1967 | K-Nearest Neighbors formalized | Cover, Hart |
| 1967 | K-Means Clustering | MacQueen |
| 1970 | ARIMA (Box-Jenkins method) | Box, Jenkins |
| 1982 | Self-Organizing Maps | Teuvo Kohonen |
| 1984 | Classification and Regression Trees (CART) | Breiman et al. |
| 1986 | Backpropagation popularized | Rumelhart, Hinton, Williams |
| 1989 | Convolutional Neural Networks (CNN) | Yann LeCun |
| 1992 | Support Vector Machine (kernel trick) | Boser, Guyon, Vapnik |
| 1994 | Apriori Algorithm | Agrawal, Srikant |
| 1995 | Random Decision Forests (precursor to Random Forest) | Tin Kam Ho |
| 1996 | DBSCAN | Ester, Kriegel, Sander, Xu |
| 1997 | Long Short-Term Memory (LSTM) | Hochreiter, Schmidhuber |
| 1997 | AdaBoost | Freund, Schapire |
| 1999 | Gradient Boosting Machine | Jerome Friedman |
| 2001 | Random Forest (final form) | Leo Breiman |
| 2006 | Deep Learning breakthrough (deep belief nets) | Geoffrey Hinton |
| 2008 | t-SNE | van der Maaten, Hinton |
| 2012 | AlexNet (CNN revolution) | Krizhevsky, Sutskever, Hinton |
| 2013 | Variational Autoencoders (VAE) | Kingma, Welling |
| 2014 | Generative Adversarial Networks (GAN) | Ian Goodfellow et al. |
| 2014 | GRU (Gated Recurrent Unit) | Cho et al. |
| 2015 | ResNet (152 layers deep) | He et al. |
| 2016 | XGBoost gains popularity | Tianqi Chen |
| 2017 | Transformer architecture | Vaswani et al. (Google) |
| 2017 | LightGBM | Microsoft Research |
| 2017 | Proximal Policy Optimization (PPO) | Schulman et al. (OpenAI) |
| 2018 | UMAP | McInnes, Healy, Melville |
| 2018 | CatBoost | Yandex |
| 2018 | BERT (Transformer for NLP) | Devlin et al. (Google) |
| 2020 | Vision Transformer (ViT) | Google Research |
| 2020 | GPT-3 | OpenAI |
| 2022–2025 | Foundation models, LLMs, multimodal AI | Various |
Notation Used in This Directory
| Symbol | Meaning |
|---|---|
| n | Number of training samples |
| p or d | Number of features (dimensions) |
| k | Number of clusters, neighbors, or classes |
| T | Number of trees (in ensemble methods) |
| α | Learning rate |
| λ | Regularization parameter |
| γ | Discount factor (RL) or kernel coefficient (SVM) |