Compliance & Governance
AI systems face a rapidly evolving regulatory landscape. SOC2 auditors now ask about AI-specific controls, HIPAA requires safeguards for AI that touches health data, and the EU AI Act mandates risk assessments for high-risk AI systems. This lesson shows you how to design audit logging, build model documentation, and implement governance frameworks that satisfy regulators and auditors.
SOC2 for AI Systems
SOC2 Type II audits now include AI-specific controls. Here are the key areas auditors examine and what you need to implement:
| SOC2 Trust Principle | AI-Specific Controls | What Auditors Want to See |
|---|---|---|
| Security | Prompt injection defenses, model access controls, API authentication | Evidence of input validation, RBAC policies, penetration test reports |
| Availability | Model serving SLAs, failover between providers, degradation plans | Uptime metrics, incident response logs, provider failover test results |
| Processing Integrity | Model version control, output validation, hallucination monitoring | Model deployment logs, A/B test results, output accuracy metrics |
| Confidentiality | PII filtering, data encryption, tenant isolation, data residency | PII scan reports, encryption configurations, isolation architecture diagrams |
| Privacy | Data minimization, consent management, retention enforcement, DSAR handling | Privacy impact assessments, consent records, data deletion logs |
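One way to make this table operational is an evidence checklist you can diff against the artifacts actually collected before the audit window opens. The principle and artifact names below are an illustrative sketch, not an official SOC2 taxonomy:

```python
# Illustrative mapping of SOC2 trust principles to evidence artifacts.
SOC2_AI_EVIDENCE = {
    "security": ["input_validation_tests", "rbac_policy", "pentest_report"],
    "availability": ["uptime_metrics", "incident_log", "failover_test_results"],
    "processing_integrity": ["deployment_log", "ab_test_results", "accuracy_metrics"],
    "confidentiality": ["pii_scan_report", "encryption_config", "isolation_diagram"],
    "privacy": ["privacy_impact_assessment", "consent_records", "deletion_log"],
}

def evidence_gaps(collected: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return, per trust principle, the evidence artifacts still missing."""
    return {
        principle: sorted(set(required) - set(collected.get(principle, [])))
        for principle, required in SOC2_AI_EVIDENCE.items()
        if set(required) - set(collected.get(principle, []))
    }
```

Running this monthly, rather than scrambling at audit time, turns the table into a living control.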
Audit Logging Design
A well-designed audit log is the foundation of every compliance program covered in this lesson. Here is a production audit logging system that captures the evidence auditors ask for:
import json
import time
import uuid
import hashlib
from dataclasses import dataclass, field, asdict
from enum import Enum
from typing import Optional
class AuditEventType(Enum):
# Authentication events
AUTH_SUCCESS = "auth.success"
AUTH_FAILURE = "auth.failure"
KEY_CREATED = "auth.key_created"
KEY_REVOKED = "auth.key_revoked"
KEY_ROTATED = "auth.key_rotated"
# Model access events
MODEL_REQUEST = "model.request"
MODEL_RESPONSE = "model.response"
MODEL_ERROR = "model.error"
MODEL_TIMEOUT = "model.timeout"
# Data events
PII_DETECTED = "data.pii_detected"
PII_REDACTED = "data.pii_redacted"
PII_BLOCKED = "data.pii_blocked"
DATA_EXPORT = "data.export"
DATA_DELETION = "data.deletion"
# Security events
INJECTION_DETECTED = "security.injection_detected"
INJECTION_BLOCKED = "security.injection_blocked"
RATE_LIMIT_HIT = "security.rate_limit"
EXTRACTION_SUSPECTED = "security.extraction_suspected"
# Model lifecycle events
MODEL_DEPLOYED = "model.deployed"
MODEL_RETIRED = "model.retired"
MODEL_UPDATED = "model.updated"
MODEL_EVALUATED = "model.evaluated"
# Admin events
CONFIG_CHANGED = "admin.config_changed"
ROLE_ASSIGNED = "admin.role_assigned"
POLICY_UPDATED = "admin.policy_updated"
@dataclass
class AuditEvent:
event_type: AuditEventType
timestamp: float = field(default_factory=time.time)
event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
# Who
user_id: Optional[str] = None
team_id: Optional[str] = None
api_key_id: Optional[str] = None
ip_address: Optional[str] = None
# What
model: Optional[str] = None
action: Optional[str] = None
resource: Optional[str] = None
# Details
input_tokens: int = 0
output_tokens: int = 0
latency_ms: float = 0.0
status: str = "success"
error_message: Optional[str] = None
# Security
risk_score: float = 0.0
pii_types_found: list = field(default_factory=list)
injection_patterns: list = field(default_factory=list)
# Metadata
metadata: dict = field(default_factory=dict)
class AuditLogger:
"""
Production audit logging system for AI compliance.
Immutable, tamper-evident, queryable.
"""
def __init__(self, db, stream, config: dict):
self.db = db
self.stream = stream
self.config = config
        # In production, load the last persisted hash on startup so the chain
        # survives restarts; "genesis" seeds a brand-new log.
        self._previous_hash = "genesis"
async def log(self, event: AuditEvent):
"""Log an audit event with tamper-evident chaining."""
record = asdict(event)
record["event_type"] = event.event_type.value
# Create chain hash (tamper evidence)
record["previous_hash"] = self._previous_hash
record_str = json.dumps(record, sort_keys=True)
record["hash"] = hashlib.sha256(record_str.encode()).hexdigest()
self._previous_hash = record["hash"]
# Persist to immutable storage
await self.db.insert("audit_log", record)
# Real-time stream for monitoring dashboards
await self.stream.publish("audit_events", json.dumps(record))
        # Alert on security events
        if event.risk_score > 0.7:
            await self._send_security_alert(event)

    async def _send_security_alert(self, event: AuditEvent):
        """Publish high-risk events to an alert stream (stub: wire to your paging/SIEM)."""
        await self.stream.publish(
            "security_alerts",
            json.dumps({"event_id": event.event_id, "risk_score": event.risk_score}),
        )
    async def query_for_audit(
        self,
        start_date: str,
        end_date: str,
        event_types: Optional[list[str]] = None,
        user_id: Optional[str] = None,
    ) -> list[dict]:
"""Query audit logs for compliance reporting."""
filters = {
"timestamp_gte": start_date,
"timestamp_lte": end_date,
}
if event_types:
filters["event_type_in"] = event_types
if user_id:
filters["user_id"] = user_id
return await self.db.query("audit_log", filters, order_by="timestamp")
async def verify_integrity(self, records: list[dict]) -> dict:
"""Verify audit log chain has not been tampered with."""
valid = True
issues = []
for i, record in enumerate(records):
# Verify hash
stored_hash = record.pop("hash")
expected_hash = hashlib.sha256(
json.dumps(record, sort_keys=True).encode()
).hexdigest()
if stored_hash != expected_hash:
valid = False
issues.append(f"Hash mismatch at record {i}: {record['event_id']}")
# Verify chain
if i > 0 and record.get("previous_hash") != records[i-1].get("hash"):
valid = False
issues.append(f"Chain broken at record {i}")
record["hash"] = stored_hash # Restore
return {
"integrity_valid": valid,
"records_checked": len(records),
"issues": issues,
}
async def generate_compliance_report(
self, start_date: str, end_date: str
) -> dict:
"""Generate a compliance report for auditors."""
records = await self.query_for_audit(start_date, end_date)
return {
"report_period": {"start": start_date, "end": end_date},
"total_events": len(records),
"summary": {
"total_model_requests": sum(
1 for r in records if r["event_type"] == "model.request"
),
"auth_failures": sum(
1 for r in records if r["event_type"] == "auth.failure"
),
"pii_incidents": sum(
1 for r in records if r["event_type"].startswith("data.pii")
),
"security_events": sum(
1 for r in records if r["event_type"].startswith("security.")
),
"injection_attempts_blocked": sum(
1 for r in records
if r["event_type"] == "security.injection_blocked"
),
},
"data_handling": {
"pii_types_detected": list(set(
pii_type
for r in records
for pii_type in r.get("pii_types_found", [])
)),
"data_deletion_requests": sum(
1 for r in records if r["event_type"] == "data.deletion"
),
},
"integrity_check": await self.verify_integrity(records),
}
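The tamper-evidence guarantee rests entirely on the hash chain, so it is worth seeing in isolation: each record's hash covers its own contents plus the previous record's hash, and editing any record invalidates every hash after it. A standalone sketch with toy records, separate from the AuditLogger class:

```python
import json
import hashlib

def chain(records: list[dict]) -> list[dict]:
    """Link records into a hash chain; each hash covers the record plus its predecessor's hash."""
    prev = "genesis"
    out = []
    for rec in records:
        rec = dict(rec, previous_hash=prev)
        rec["hash"] = hashlib.sha256(json.dumps(rec, sort_keys=True).encode()).hexdigest()
        prev = rec["hash"]
        out.append(rec)
    return out

def verify(chained: list[dict]) -> bool:
    """Recompute every hash and check each link points at its predecessor."""
    prev = "genesis"
    for rec in chained:
        body = {k: v for k, v in rec.items() if k != "hash"}
        if rec["previous_hash"] != prev:
            return False
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != rec["hash"]:
            return False
        prev = rec["hash"]
    return True

log = chain([{"event": "model.request"}, {"event": "auth.failure"}])
assert verify(log)
log[0]["event"] = "auth.success"   # tamper with the first record
assert not verify(log)             # the whole chain now fails verification
```

Note that the chain only proves tampering happened; pairing it with append-only storage (WORM buckets, write-once tables) is what prevents it.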
HIPAA for Healthcare AI
Healthcare AI systems that handle Protected Health Information (PHI) must meet specific HIPAA requirements. Here are the key controls:
# HIPAA compliance controls for healthcare AI
HIPAA_AI_CONTROLS = {
"technical_safeguards": {
"access_control": {
"requirement": "Unique user identification, automatic logoff, encryption",
"ai_implementation": [
"Per-clinician API keys with role-based model access",
"Session timeout after 15 minutes of inactivity",
"AES-256 encryption for all PHI in transit and at rest",
"MFA required for all users accessing AI systems with PHI",
],
},
"audit_controls": {
"requirement": "Record and examine activity in systems containing PHI",
"ai_implementation": [
"Log every AI query that includes patient data",
"Log model outputs that contain PHI",
"Retain audit logs for minimum 6 years",
"Tamper-evident logging with hash chains",
],
},
"integrity_controls": {
"requirement": "Protect PHI from improper alteration or destruction",
"ai_implementation": [
"Model output validation (clinical accuracy checks)",
"Version-controlled model deployments with rollback",
"Input validation to prevent malformed medical queries",
"Backup and disaster recovery for AI model artifacts",
],
},
"transmission_security": {
"requirement": "Protect PHI during electronic transmission",
"ai_implementation": [
"TLS 1.3 for all API communications",
"mTLS between internal AI services",
"No PHI in URLs, query parameters, or logs",
"VPN or private network for model serving endpoints",
],
},
},
"administrative_safeguards": {
"risk_analysis": "Conduct AI-specific risk assessment annually",
"workforce_training": "Train all staff on AI system PHI handling",
"incident_procedures": "AI-specific incident response plan for PHI breaches",
"baa_requirements": "BAA with all AI/LLM providers (OpenAI, Anthropic, etc.)",
},
"ai_specific_requirements": {
"phi_filtering": "Strip all 18 HIPAA identifiers before sending to external AI models",
"model_validation": "Clinical validation of AI outputs before use in patient care",
"de_identification": "Use Safe Harbor or Expert Determination for training data",
"minimum_necessary": "Only include PHI in AI prompts that is necessary for the task",
"patient_consent": "Obtain consent before using patient data with AI systems",
},
}
# The 18 HIPAA identifiers that must be removed before external AI processing
HIPAA_IDENTIFIERS = [
"names",
"geographic_data_smaller_than_state",
"dates_except_year",
"phone_numbers",
"fax_numbers",
"email_addresses",
"social_security_numbers",
"medical_record_numbers",
"health_plan_beneficiary_numbers",
"account_numbers",
"certificate_license_numbers",
"vehicle_identifiers_serial_numbers",
"device_identifiers_serial_numbers",
"web_urls",
"ip_addresses",
"biometric_identifiers",
"full_face_photographs",
"any_other_unique_identifying_number",
]
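Before a prompt leaves your trust boundary, the identifier list above should drive a redaction pass. The sketch below covers only three of the eighteen identifier classes with simple regexes; a real deployment needs a dedicated PHI detection pipeline, since names, dates, and medical record numbers in particular resist regex matching:

```python
import re

# Illustrative patterns for a small subset of HIPAA identifiers. Real systems
# need far broader coverage via a dedicated PHI detection service.
PHI_PATTERNS = {
    "social_security_numbers": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone_numbers": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "email_addresses": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact_phi(text: str) -> tuple[str, list[str]]:
    """Replace matched identifiers with placeholders; return redacted text and the types found."""
    found = []
    for phi_type, pattern in PHI_PATTERNS.items():
        if pattern.search(text):
            found.append(phi_type)
            text = pattern.sub(f"[{phi_type.upper()}]", text)
    return text, found
```

The returned types feed directly into the audit log's pii_types_found field, so every redaction leaves an evidentiary trail.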
EU AI Act Requirements
The EU AI Act classifies AI systems by risk level and imposes requirements accordingly. Most enterprise AI falls into "limited risk" or "high risk":
# EU AI Act risk classification and requirements
EU_AI_ACT = {
"unacceptable_risk": {
"description": "Banned AI applications",
"examples": [
"Social scoring systems by governments",
"Real-time biometric identification in public spaces (with exceptions)",
"Manipulation of vulnerable groups",
"Subliminal manipulation techniques",
],
"action": "PROHIBITED - Do not build",
},
"high_risk": {
"description": "AI in critical domains requiring strict controls",
"examples": [
"AI in recruitment and HR decisions",
"AI in credit scoring and lending",
"AI in healthcare diagnostics",
"AI in law enforcement and judicial systems",
"AI in education (grading, admissions)",
"AI in critical infrastructure",
],
"requirements": {
"risk_management": {
"what": "Continuous risk identification and mitigation",
"implementation": "Quarterly risk assessments using the framework below",
},
"data_governance": {
"what": "Training data quality, relevance, and representativeness",
"implementation": "Data quality reports, bias audits, dataset cards",
},
"technical_documentation": {
"what": "Detailed system documentation before market release",
"implementation": "Model cards, system architecture docs, API specs",
},
"record_keeping": {
"what": "Automatic logging of system operation",
"implementation": "Audit logging system (see above)",
},
"transparency": {
"what": "Clear instructions and information to users",
"implementation": "User-facing documentation of AI capabilities and limitations",
},
"human_oversight": {
"what": "Effective human supervision capabilities",
"implementation": "Human-in-the-loop workflows, override mechanisms",
},
"accuracy_robustness": {
"what": "Appropriate levels of accuracy and cybersecurity",
"implementation": "Evaluation metrics, adversarial testing, security audits",
},
},
},
"limited_risk": {
"description": "AI with transparency obligations",
"examples": [
"Chatbots (must disclose AI interaction)",
"Emotion recognition systems",
"Deepfake generators",
],
"requirements": {
"transparency": "Users must be informed they are interacting with AI",
"content_labeling": "AI-generated content must be labeled as such",
},
},
"minimal_risk": {
"description": "AI with no specific requirements",
"examples": ["Spam filters", "AI in video games", "Inventory management"],
"requirements": "Voluntary codes of conduct encouraged",
},
}
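One lightweight use for a structure like this is a pre-deployment lookup that maps a proposed use case to its tier and obligations. The tags and tiering below are illustrative only; actual EU AI Act classification is a legal determination, not a dictionary lookup:

```python
# Hypothetical use-case tags mapped to EU AI Act risk tiers.
USE_CASE_TIER = {
    "social_scoring": "unacceptable_risk",
    "recruitment_screening": "high_risk",
    "credit_scoring": "high_risk",
    "customer_chatbot": "limited_risk",
    "spam_filter": "minimal_risk",
}

def obligations(use_case: str) -> str:
    """Summarize the obligations attached to a use case's risk tier."""
    tier = USE_CASE_TIER.get(use_case, "unclassified")
    return {
        "unacceptable_risk": "PROHIBITED",
        "high_risk": "risk management, documentation, logging, human oversight",
        "limited_risk": "transparency and content labeling",
        "minimal_risk": "voluntary codes of conduct",
    }.get(tier, "classify before deployment")
```

Wiring a check like this into your deployment pipeline forces the classification conversation to happen before launch rather than after.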
Model Cards and Documentation
Model cards are standardized documentation describing a model's intended use, limitations, evaluation metrics, and ethical considerations. They map directly onto the EU AI Act's technical-documentation requirement for high-risk systems and are increasingly expected by SOC2 auditors:
from dataclasses import dataclass, field, asdict
from typing import Optional
import json
@dataclass
class ModelCard:
"""
Standardized model documentation for compliance.
Based on Mitchell et al. (2019) "Model Cards for Model Reporting."
"""
# Model Details
model_name: str = ""
model_version: str = ""
model_type: str = "" # e.g., "text-classification", "text-generation"
organization: str = ""
release_date: str = ""
license: str = ""
# Intended Use
primary_use_cases: list[str] = field(default_factory=list)
out_of_scope_uses: list[str] = field(default_factory=list)
target_users: list[str] = field(default_factory=list)
# Training Data
training_data_description: str = ""
training_data_size: str = ""
data_preprocessing: str = ""
data_collection_consent: str = ""
known_data_biases: list[str] = field(default_factory=list)
# Evaluation
evaluation_metrics: dict = field(default_factory=dict)
evaluation_datasets: list[str] = field(default_factory=list)
performance_by_group: dict = field(default_factory=dict)
# Limitations
known_limitations: list[str] = field(default_factory=list)
failure_modes: list[str] = field(default_factory=list)
not_suitable_for: list[str] = field(default_factory=list)
# Ethical Considerations
ethical_review_date: Optional[str] = None
bias_assessment: str = ""
fairness_metrics: dict = field(default_factory=dict)
privacy_considerations: str = ""
# Security
security_review_date: Optional[str] = None
adversarial_testing: str = ""
known_vulnerabilities: list[str] = field(default_factory=list)
watermark_embedded: bool = False
# Maintenance
update_frequency: str = ""
monitoring_metrics: list[str] = field(default_factory=list)
feedback_mechanism: str = ""
retirement_criteria: str = ""
def to_json(self) -> str:
return json.dumps(asdict(self), indent=2)
def validate_completeness(self) -> dict:
"""Check if the model card meets compliance requirements."""
required_fields = {
"eu_ai_act": [
"model_name", "model_version", "model_type",
"primary_use_cases", "training_data_description",
"evaluation_metrics", "known_limitations",
"ethical_review_date", "security_review_date",
],
"soc2": [
"model_name", "model_version", "evaluation_metrics",
"known_limitations", "monitoring_metrics",
"update_frequency",
],
"hipaa": [
"model_name", "model_version", "training_data_description",
"data_collection_consent", "privacy_considerations",
"security_review_date",
],
}
results = {}
for framework, fields in required_fields.items():
missing = [f for f in fields if not getattr(self, f)]
results[framework] = {
"complete": len(missing) == 0,
"missing_fields": missing,
"completion_pct": round(
(len(fields) - len(missing)) / len(fields) * 100
),
}
return results
# Example model card for a production AI system
example_card = ModelCard(
model_name="Customer Support Classifier v2.1",
model_version="2.1.0",
model_type="text-classification",
organization="Acme Corp",
release_date="2026-03-01",
license="Proprietary",
primary_use_cases=[
"Route customer support tickets to correct department",
"Priority classification (P1-P4) for incoming tickets",
],
out_of_scope_uses=[
"Medical diagnosis or health-related classification",
"Legal advice or case classification",
"Autonomous decision-making without human review",
],
target_users=["Customer support team leads", "Support operations managers"],
training_data_description="500K customer support tickets from 2023-2025, "
"manually labeled by senior support agents",
training_data_size="500,000 labeled examples",
known_data_biases=["English-language bias (95% English training data)",
"Enterprise customer overrepresentation"],
evaluation_metrics={"accuracy": 0.94, "f1_macro": 0.91, "f1_weighted": 0.93},
known_limitations=[
"Accuracy drops to 78% for non-English tickets",
"Novel issue types not in training data may be misrouted",
"Performance degrades for tickets shorter than 20 words",
],
privacy_considerations="Model trained on de-identified ticket data. "
"PII removed before training using PIIDetector pipeline.",
security_review_date="2026-02-15",
adversarial_testing="Tested against 1000 adversarial inputs. "
"3 bypass patterns found and mitigated.",
monitoring_metrics=["accuracy", "f1_score", "latency_p99", "drift_score"],
update_frequency="Quarterly retraining with new ticket data",
)
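The completeness check reduces to a set difference between required and populated fields. A trimmed standalone version, with abbreviated field lists and a plain dict in place of the dataclass:

```python
def check_completeness(card: dict, required: dict[str, list[str]]) -> dict:
    """For each framework, report missing fields and a completion percentage."""
    results = {}
    for framework, fields in required.items():
        missing = [f for f in fields if not card.get(f)]
        results[framework] = {
            "complete": not missing,
            "missing_fields": missing,
            "completion_pct": round((len(fields) - len(missing)) / len(fields) * 100),
        }
    return results

card = {"model_name": "Support Classifier", "model_version": "2.1.0",
        "evaluation_metrics": {"accuracy": 0.94}, "ethical_review_date": None}
report = check_completeness(card, {
    "eu_ai_act": ["model_name", "evaluation_metrics", "ethical_review_date"],
    "soc2": ["model_name", "model_version", "evaluation_metrics"],
})
# eu_ai_act is flagged incomplete (no ethical review date); soc2 passes
```

Treating falsy values (empty strings, empty lists, None) as missing keeps the check honest: a placeholder field does not count as documentation.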
Risk Assessment Framework
Use this framework to systematically assess risks in your AI system. It maps to EU AI Act requirements and is useful for SOC2 risk assessments:
import time

class AIRiskAssessment:
"""
AI-specific risk assessment framework.
Run quarterly for high-risk systems, annually for others.
"""
RISK_CATEGORIES = {
"safety": {
"description": "Can the AI cause physical or psychological harm?",
"questions": [
"Can incorrect outputs lead to safety-critical decisions?",
"Is there human oversight for all consequential outputs?",
"What happens if the model hallucinates in a critical context?",
"Are there kill switches to disable AI in emergencies?",
],
},
"fairness": {
"description": "Does the AI treat all groups equitably?",
"questions": [
"Has the model been tested for performance across demographic groups?",
"Are there known biases in the training data?",
"Do outcomes disproportionately affect protected groups?",
"Is there a process for bias complaints and remediation?",
],
},
"privacy": {
"description": "Does the AI protect personal data?",
"questions": [
"Is PII filtered before model processing?",
"Can the model memorize and reproduce training data?",
"Are data retention policies enforced?",
"Can users request deletion of their data?",
],
},
"security": {
"description": "Is the AI protected against attacks?",
"questions": [
"Are prompt injection defenses deployed?",
"Is model extraction attack detection in place?",
"Are model artifacts signed and integrity-verified?",
"Has penetration testing been conducted?",
],
},
"transparency": {
"description": "Do users understand the AI's capabilities and limitations?",
"questions": [
"Are users informed they are interacting with AI?",
"Can users request explanations for AI decisions?",
"Is there a model card documenting limitations?",
"Are AI-generated outputs labeled as such?",
],
},
"accountability": {
"description": "Is there clear ownership and oversight?",
"questions": [
"Who is responsible for the AI system's behavior?",
"Is there an incident response plan for AI failures?",
"Are audit logs maintained for all AI interactions?",
"Is there a process for model retirement?",
],
},
}
def assess(self, system_name: str, responses: dict) -> dict:
"""
Generate risk assessment report.
responses: {category: {question_index: {answer: str, risk_level: str, mitigation: str}}}
"""
report = {
"system": system_name,
"assessment_date": time.strftime("%Y-%m-%d"),
"categories": {},
"overall_risk": "LOW",
}
risk_scores = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}
max_risk = 0
for category, details in self.RISK_CATEGORIES.items():
category_responses = responses.get(category, {})
category_risks = [
risk_scores.get(r.get("risk_level", "LOW"), 1)
for r in category_responses.values()
]
avg_risk = sum(category_risks) / max(len(category_risks), 1)
max_risk = max(max_risk, max(category_risks, default=1))
report["categories"][category] = {
"description": details["description"],
"average_risk_score": round(avg_risk, 2),
"highest_risk_item": max(category_risks, default=1),
"items_assessed": len(category_responses),
"items_total": len(details["questions"]),
}
# Overall risk level
risk_labels = {1: "LOW", 2: "MEDIUM", 3: "HIGH", 4: "CRITICAL"}
report["overall_risk"] = risk_labels.get(max_risk, "UNKNOWN")
return report
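The aggregation rule in assess is worth calling out: the single worst answer anywhere drives the overall rating, so one CRITICAL response cannot be averaged away by a sea of LOWs. A standalone sketch of that rule, with hypothetical responses:

```python
RISK_SCORES = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}
RISK_LABELS = {score: label for label, score in RISK_SCORES.items()}

def overall_risk(responses: dict[str, list[str]]) -> str:
    """Overall rating is the single worst answer across all categories."""
    worst = 1
    for answers in responses.values():
        for level in answers:
            worst = max(worst, RISK_SCORES.get(level, 1))
    return RISK_LABELS[worst]

rating = overall_risk({
    "safety": ["LOW", "MEDIUM"],
    "privacy": ["HIGH", "LOW"],
    "security": ["LOW"],
})
# One HIGH answer anywhere makes the whole system HIGH risk
```

Using the maximum rather than the mean is deliberate: averaging would let a catastrophic safety answer hide behind many benign ones.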
Key Takeaways
- SOC2 auditors now examine AI-specific controls across all five trust principles. Security (prompt injection defenses), Confidentiality (PII filtering), and Processing Integrity (model versioning) are the most scrutinized areas.
- Build a tamper-evident audit logging system with hash-chained records. Log authentication, model access, PII incidents, security events, and model lifecycle events. Retain logs at least as long as your strictest applicable framework requires; HIPAA, for example, mandates six years.
- HIPAA requires stripping all 18 HIPAA identifiers before sending data to external AI models, plus BAAs with every AI provider, and 6-year audit log retention.
- The EU AI Act classifies AI by risk level. Most enterprise AI falls into "limited risk" (transparency required) or "high risk" (risk management, documentation, human oversight, and accuracy requirements).
- Create model cards for every production model documenting intended use, training data, evaluation metrics, known limitations, and ethical considerations. These satisfy auditors and improve team awareness.
- Run AI risk assessments quarterly for high-risk systems. Assess safety, fairness, privacy, security, transparency, and accountability.
What Is Next
In the final lesson, we will bring everything together with an AI security checklist, penetration testing approaches for AI, incident response playbooks for AI breaches, and a comprehensive FAQ for security engineers.