Attack Surface Analysis

Lesson 4 of 7 in the AI Security Fundamentals course.

Mapping the AI Attack Surface

An attack surface analysis identifies all the points where an attacker could interact with or influence an AI system. For machine learning systems, the attack surface is significantly larger than that of traditional applications because it spans data, code, models, infrastructure, and human processes.

A thorough attack surface analysis is the foundation for prioritizing security investments and designing effective defenses. Without it, organizations tend to over-invest in visible threats (like API security) while neglecting less obvious but equally dangerous vectors (like training data integrity).

The ML Lifecycle Attack Surface

Every phase of the ML lifecycle presents distinct attack opportunities:

1. Data Collection and Preparation

  • Web scraping pipelines: Attackers can manipulate web content that gets scraped into training datasets
  • Third-party data feeds: Compromised data providers can inject poisoned samples
  • Annotation platforms: Malicious annotators can systematically mislabel data
  • Data storage: Unauthorized access to data lakes, S3 buckets, or databases containing training data
  • Data preprocessing: Compromised ETL code can modify data before it reaches training
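One practical defense at the data-collection boundary is to validate incoming samples against expected ranges before they reach training. The sketch below is illustrative, not a complete defense: the `FeatureRule` class, the rule values, and the sample records are all hypothetical, and range checks catch only crude tampering, not subtle poisoning.

```python
from dataclasses import dataclass

# Hypothetical validation rule for one feature of an incoming dataset.
@dataclass
class FeatureRule:
    name: str
    min_value: float
    max_value: float

def validate_samples(samples, rules):
    """Split samples into (accepted, rejected) based on range rules.

    Out-of-range or missing values are a coarse signal of feed tampering
    or poisoning, not proof of it.
    """
    accepted, rejected = [], []
    for sample in samples:
        ok = all(
            rule.min_value <= sample.get(rule.name, float("nan")) <= rule.max_value
            for rule in rules
        )
        (accepted if ok else rejected).append(sample)
    return accepted, rejected

rules = [FeatureRule("age", 0, 120), FeatureRule("income", 0, 10_000_000)]
batch = [
    {"age": 34, "income": 52_000},
    {"age": -5, "income": 52_000},   # suspicious: negative age
]
accepted, rejected = validate_samples(batch, rules)
print(f"accepted={len(accepted)} rejected={len(rejected)}")
```

Note that a missing feature also fails the check, because comparisons against `float("nan")` are false; silently dropped fields are themselves a tampering signal worth logging.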

2. Model Training

  • Training code: Malicious modifications to training scripts, hyperparameters, or loss functions
  • Dependencies: Compromised ML libraries (PyTorch, TensorFlow, scikit-learn) or their dependencies
  • Compute environment: Shared GPU clusters where other tenants could access training processes
  • Pre-trained models: Backdoored foundation models or transfer learning base models
  • Experiment tracking: Manipulated metrics in MLflow or Weights & Biases to promote a compromised model
💡
Practical tip: Create a data flow diagram for your entire ML pipeline. Trace every piece of data from its source to the final model prediction. Every point where data crosses a trust boundary is a potential attack surface.
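The data flow diagram in the tip above can be kept in code as well as on a whiteboard. A minimal sketch, assuming a hypothetical pipeline with nodes tagged by trust zone: any edge whose endpoints sit in different zones crosses a trust boundary and needs validation.

```python
# Hypothetical pipeline: nodes tagged with a trust zone, edges are data movements.
ZONES = {
    "web_scraper": "external",
    "vendor_feed": "external",
    "data_lake": "internal",
    "training_job": "internal",
    "model_registry": "internal",
}

EDGES = [
    ("web_scraper", "data_lake"),
    ("vendor_feed", "data_lake"),
    ("data_lake", "training_job"),
    ("training_job", "model_registry"),
]

def trust_boundary_crossings(edges, zones):
    """Return the edges where data moves between different trust zones."""
    return [(src, dst) for src, dst in edges if zones[src] != zones[dst]]

for src, dst in trust_boundary_crossings(EDGES, ZONES):
    print(f"VALIDATE: {src} -> {dst}")
```

Keeping the diagram as data means new pipeline components automatically surface new boundary crossings in review.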

3. Model Evaluation and Selection

  • Test data contamination: Training data leaking into test sets, inflating metrics and masking real weaknesses
  • Metric manipulation: Cherry-picking evaluation metrics that hide vulnerabilities
  • Model registry: Unauthorized model promotion from staging to production
  • A/B testing: Manipulating experiment results to deploy a compromised model variant
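Exact-duplicate contamination between training and test sets can be detected cheaply by fingerprinting each serialized sample. A minimal sketch (the sample strings are hypothetical); this catches verbatim overlap only, and near-duplicates need fuzzier techniques such as MinHash or embedding similarity.

```python
import hashlib

def fingerprint(record: str) -> str:
    """Stable fingerprint of a serialized sample (exact-match only)."""
    return hashlib.sha256(record.encode("utf-8")).hexdigest()

def contamination(train, test):
    """Return the test samples that also appear verbatim in the training data."""
    train_hashes = {fingerprint(r) for r in train}
    return [r for r in test if fingerprint(r) in train_hashes]

train = ["the cat sat", "dogs bark loudly", "rain falls"]
test = ["snow melts", "dogs bark loudly"]
print(contamination(train=train, test=test))
```

Running this kind of check as a CI gate before any evaluation run makes contamination a build failure rather than a silent metric inflation.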

4. Model Deployment and Serving

  • Model artifacts: Tampered model files loaded into serving infrastructure
  • API endpoints: Unauthenticated or poorly rate-limited prediction APIs
  • Feature stores: Manipulated real-time features that influence model predictions
  • Edge deployment: Models on devices that can be physically accessed and reverse-engineered
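Rate limiting the prediction API is one of the cheapest serving-side mitigations: it raises the query cost of model extraction and brute-force adversarial search, though it does not stop either outright. A minimal sketch of a per-client token bucket (the rate and capacity values are illustrative):

```python
import time

class TokenBucket:
    """Per-client token bucket: refill `rate` tokens/sec, burst up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; refill based on elapsed time."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5.0, capacity=10)
allowed = sum(bucket.allow() for _ in range(100))
print(f"{allowed} of 100 burst requests allowed")
```

In production you would key one bucket per API credential and back the state with a shared store; extraction attacks distributed across many keys also need aggregate, per-model quotas.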

Automated Attack Surface Enumeration

Use systematic approaches to enumerate your attack surface:

Python
class MLAttackSurfaceAnalyzer:
    """Enumerate and score attack surfaces for ML systems."""

    PHASES = {
        "data_collection": {
            "vectors": [
                ("Web scraping sources", "Data poisoning via manipulated web content", "HIGH"),
                ("Third-party data APIs", "Compromised data feed injection", "HIGH"),
                ("User-submitted data", "Adversarial training samples", "MEDIUM"),
                ("Internal databases", "Insider data manipulation", "MEDIUM"),
            ]
        },
        "training": {
            "vectors": [
                ("ML framework dependencies", "Supply chain attack via compromised package", "CRITICAL"),
                ("Pre-trained model downloads", "Backdoored foundation model", "HIGH"),
                ("Shared GPU cluster", "Cross-tenant data leakage", "HIGH"),
                ("Training scripts in git", "Malicious code injection", "MEDIUM"),
            ]
        },
        "serving": {
            "vectors": [
                ("Prediction API", "Adversarial examples / model extraction", "HIGH"),
                ("Model file storage", "Model tampering or theft", "HIGH"),
                ("Feature pipeline", "Real-time feature manipulation", "MEDIUM"),
                ("Monitoring endpoints", "Information disclosure", "LOW"),
            ]
        }
    }

    def analyze(self):
        """Generate a complete attack surface report."""
        report = []
        for phase, data in self.PHASES.items():
            for vector, threat, severity in data["vectors"]:
                report.append({
                    "phase": phase,
                    "vector": vector,
                    "threat": threat,
                    "severity": severity
                })
        return sorted(report, key=lambda x: {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}[x["severity"]])

analyzer = MLAttackSurfaceAnalyzer()
for item in analyzer.analyze():
    print(f"[{item['severity']:8s}] {item['phase']:20s} | {item['vector']}")
    print(f"           Threat: {item['threat']}")

Trust Boundaries in ML Systems

Trust boundaries are the lines between components with different levels of trust. Data crossing these boundaries must be validated:

  1. External to internal: User inputs, third-party data, downloaded models
  2. Training to serving: Model artifacts, configuration files, feature schemas
  3. Data team to ML team: Datasets, labels, data quality reports
  4. Development to production: Code, models, infrastructure configurations
  5. Internal services: Feature store to model server, model server to monitoring
Warning: One of the most commonly overlooked trust boundaries is between pre-trained models and your system. Downloading a model from Hugging Face or TensorFlow Hub is equivalent to running someone else's code. Always verify model integrity and test for backdoors.
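The integrity half of that warning is straightforward to automate: pin the SHA-256 digest of every model artifact you depend on and refuse to load anything that does not match. A minimal sketch (the file path and pinned digest in the usage comment are hypothetical); note that a matching digest proves the file was not swapped in transit or at rest, not that the original model is backdoor-free.

```python
import hashlib
from pathlib import Path

def verify_model(path: Path, expected_sha256: str, chunk_size: int = 1 << 20) -> bool:
    """Compare a model artifact's SHA-256 digest against a pinned value.

    Reads in chunks so multi-gigabyte model files don't need to fit in memory.
    """
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256

# Usage (hypothetical path and pinned digest):
# if not verify_model(Path("models/base_model.safetensors"), PINNED_DIGEST):
#     raise RuntimeError("model artifact failed integrity check")
```

Behavioral backdoor testing is a separate, harder problem; the digest check only closes the tampering window between download and load.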

Reducing the Attack Surface

After mapping the attack surface, prioritize reducing it:

  • Minimize data exposure: Only collect and retain the data you actually need for training
  • Pin dependencies: Lock all ML library versions and verify checksums
  • Isolate environments: Use separate networks for training, evaluation, and serving
  • Limit API capabilities: Expose only prediction endpoints, not model internals
  • Encrypt everywhere: Data at rest, in transit, and during computation where possible
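The dependency-pinning item above can be enforced mechanically. A minimal sketch that flags requirement lines not pinned to an exact version (the sample requirements text is hypothetical); only `==` pins, ideally combined with hash entries in a lock file, give reproducible and verifiable installs.

```python
import re

def unpinned(requirements_text: str) -> list:
    """Flag requirement lines that are not pinned to an exact version.

    Version ranges and bare package names invite supply-chain drift:
    a newly published release is installed without review.
    """
    bad = []
    for line in requirements_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if not re.match(r"^[A-Za-z0-9_.\-\[\]]+==", line):
            bad.append(line)
    return bad

reqs = """\
torch==2.3.1
numpy>=1.24
scikit-learn
"""
print(unpinned(reqs))
```

Wiring a check like this into CI turns an unpinned dependency into a failed build instead of a silent upgrade path.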

Summary

A comprehensive attack surface analysis reveals the full scope of security challenges in ML systems. By systematically mapping attack vectors across every lifecycle phase and identifying trust boundaries, you can prioritize defenses where they matter most. The next lesson covers defense in depth strategies for protecting each layer of this attack surface.