AI Audit Documentation
Documentation is the primary output of an AI audit. Well-structured documentation enables accountability, facilitates regulatory compliance, and creates a verifiable record of AI system governance.
Model Cards
Model cards (Mitchell et al., 2019) provide structured documentation of an AI model's capabilities, limitations, and intended use. They are increasingly expected by regulators and auditors:
| Section | Contents | Audit Relevance |
|---|---|---|
| Model Details | Architecture, training procedure, version, developers | Establishes what was audited and by whom it was built |
| Intended Use | Primary use cases, out-of-scope uses, users | Defines the boundaries for evaluating appropriateness |
| Performance Metrics | Accuracy, precision, recall, F1 across subgroups | Provides verifiable claims for auditor validation |
| Limitations | Known failure modes, edge cases, environmental constraints | Demonstrates awareness of risks and honest disclosure |
| Ethical Considerations | Sensitive use cases, fairness analysis, potential harms | Shows ethical reflection and risk awareness |
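The Performance Metrics row is the part of a model card an auditor can most directly recompute. The following is a minimal sketch of per-subgroup accuracy, precision, recall, and F1 for a binary classifier, written in plain Python so it has no dependencies. The function name `subgroup_metrics`, the group labels, and the toy data are illustrative assumptions, not part of any model card standard.

```python
from collections import defaultdict


def subgroup_metrics(y_true, y_pred, groups):
    """Return accuracy, precision, recall, and F1 per subgroup (binary labels 0/1)."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1 and pred == 1:
            key = "tp"
        elif truth == 0 and pred == 1:
            key = "fp"
        elif truth == 1 and pred == 0:
            key = "fn"
        else:
            key = "tn"
        counts[group][key] += 1

    report = {}
    for group, c in counts.items():
        total = sum(c.values())
        predicted_pos = c["tp"] + c["fp"]
        actual_pos = c["tp"] + c["fn"]
        # Report 0.0 when a ratio is undefined instead of raising
        precision = c["tp"] / predicted_pos if predicted_pos else 0.0
        recall = c["tp"] / actual_pos if actual_pos else 0.0
        denom = precision + recall
        report[group] = {
            "n": total,
            "accuracy": (c["tp"] + c["tn"]) / total,
            "precision": precision,
            "recall": recall,
            "f1": 2 * precision * recall / denom if denom else 0.0,
        }
    return report


# Toy, made-up labels purely for illustration
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
report = subgroup_metrics(y_true, y_pred, groups)
```

In this toy data both groups have the same accuracy (0.75), yet group A has perfect precision with lower recall while group B shows the reverse. That kind of gap is why the table asks for metrics across subgroups and not just in aggregate.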
Datasheets for Datasets
- Motivation: Document why the dataset was created, who created it, and who funded it. This establishes the context and potential biases introduced by the dataset's purpose
- Composition: Describe what the data instances represent, the total count, any sampling methodology, demographic distributions, and whether the data is raw or processed
- Collection Process: Detail how data was acquired: web scraping, surveys, sensors, purchases. Document consent mechanisms, compensation to data subjects, and any IRB approval obtained
- Uses and Distribution: Describe how the dataset has been used, who has access, licensing terms, and any restrictions on use. Flag any uses the creators consider inappropriate
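One way an auditor might check a datasheet against the four sections above is a simple completeness pass. The sketch below is only an illustration: the field names in `REQUIRED_FIELDS` are assumptions derived from the descriptions above, not an official datasheet schema, and the `draft` record is invented.

```python
# Hypothetical field names derived from the section descriptions above
REQUIRED_FIELDS = {
    "motivation": ["purpose", "creators", "funders"],
    "composition": ["instance_description", "instance_count",
                    "sampling_method", "demographics", "processing_state"],
    "collection_process": ["acquisition_method", "consent_mechanism",
                           "compensation", "irb_approval"],
    "uses_and_distribution": ["prior_uses", "access", "license",
                              "restrictions", "inappropriate_uses"],
}


def missing_fields(datasheet):
    """Return 'section.field' for every required entry that is absent or blank."""
    gaps = []
    for section, fields in REQUIRED_FIELDS.items():
        provided = datasheet.get(section, {})
        for name in fields:
            value = provided.get(name)
            # Only None and "" count as missing; an explicit "none" or False
            # is a documented answer and passes the check
            if value is None or value == "":
                gaps.append(f"{section}.{name}")
    return gaps


# Invented draft datasheet with deliberate gaps
draft = {
    "motivation": {"purpose": "Benchmark resume screening",
                   "creators": "Example Lab", "funders": ""},
    "composition": {"instance_description": "One resume per record",
                    "instance_count": 12000,
                    "sampling_method": "Stratified by region",
                    "demographics": "Self-reported, see appendix",
                    "processing_state": "processed"},
    "collection_process": {"acquisition_method": "survey",
                           "consent_mechanism": "opt-in form",
                           "compensation": "none"},
}
gaps = missing_fields(draft)
```

A check like this only shows that each item was answered. Whether the answers are accurate still needs the auditor's review.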
Audit Report Structure
Executive Summary
High-level overview of scope, methodology, key findings, and overall assessment. Written for non-technical stakeholders including executives, board members, and regulators.
Detailed Findings
Each finding with severity classification, evidence, analysis, and specific recommendations. Include references to the criteria against which the finding was evaluated.
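The elements of a finding listed above map naturally onto a small record type. The following sketch assumes a four-level severity scale and hypothetical names (`Finding`, `Severity`, `order_for_report`), since the text does not prescribe a scale or schema. The example findings and criteria are invented.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Severity(IntEnum):
    # Assumed four-level scale; substitute the scale your audit criteria define
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class Finding:
    finding_id: str
    title: str
    severity: Severity
    criterion: str  # the requirement the finding was evaluated against
    evidence: list = field(default_factory=list)  # pointers into the appendices
    analysis: str = ""
    recommendations: list = field(default_factory=list)

    def gaps(self):
        """List the elements still missing before the finding is report-ready."""
        missing = []
        if not self.criterion.strip():
            missing.append("criterion")
        if not self.evidence:
            missing.append("evidence")
        if not self.analysis.strip():
            missing.append("analysis")
        if not self.recommendations:
            missing.append("recommendations")
        return missing


def order_for_report(findings):
    """Most severe first; ties broken by finding ID for a stable order."""
    return sorted(findings, key=lambda f: (-f.severity, f.finding_id))


# Invented findings for illustration only
findings = [
    Finding("F-02", "Subgroup recall not reported", Severity.MEDIUM,
            criterion="Internal model card policy (hypothetical)",
            evidence=["Appendix B, table 3"],
            analysis="Recall is reported only in aggregate.",
            recommendations=["Report recall per subgroup."]),
    Finding("F-01", "No record of consent mechanism", Severity.HIGH,
            criterion="Datasheet collection-process section (hypothetical)"),
]
ordered = order_for_report(findings)
```

Here `ordered[0]` is the high-severity finding F-01, and `ordered[0].gaps()` flags that it still lacks evidence, analysis, and recommendations.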
Methodology
Describe audit procedures, tools used, data accessed, and any limitations encountered. This enables stakeholders to evaluate the rigor and completeness of the audit.
Appendices
Supporting evidence, statistical analyses, metric calculations, interview summaries, and technical details. These provide the evidentiary foundation for the audit conclusions.
Compliance Evidence Management
- Evidence repository: Maintain a centralized, tamper-evident repository for all audit evidence including model artifacts, data samples, test results, and documentation
- Version control: Track versions of all audited artifacts. An audit of model v2.1 is not valid for model v3.0. Maintain clear traceability between audit findings and specific versions
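- Retention policies: Define retention periods aligned with regulatory requirements. Under the rules implementing NYC LL144, the summary of bias audit results must remain publicly posted for at least six months after the tool's latest use, and the bias audit itself must be no more than one year old. The EU AI Act requires providers of high-risk AI systems to keep technical documentation for 10 years after the system is placed on the market or put into service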
- Access controls: Restrict access to audit working papers to authorized auditors and compliance personnel. Protect sensitive information while maintaining availability for regulatory examination
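The tamper-evidence and version-traceability points above can be illustrated with a hash chain, in which each log entry commits to the one before it. This is an in-memory sketch under stated assumptions, not a production repository: the class `EvidenceLog`, its method names, and the artifact names are hypothetical, and real tamper evidence also requires anchoring the latest entry hash somewhere the repository's administrators cannot rewrite.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(body):
    # Canonical JSON so identical content always produces the same hash
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class EvidenceLog:
    def __init__(self):
        self.entries = []

    def append(self, artifact_name, artifact_version, content):
        """Record an artifact (bytes) and chain it to the previous entry."""
        body = {
            "artifact_name": artifact_name,
            "artifact_version": artifact_version,
            "content_sha256": hashlib.sha256(content).hexdigest(),
            "prev_hash": self.entries[-1]["entry_hash"] if self.entries else GENESIS,
        }
        entry = dict(body, entry_hash=_entry_hash(body))
        self.entries.append(entry)
        return entry

    def verify(self):
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev = GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if body["prev_hash"] != prev or _entry_hash(body) != entry["entry_hash"]:
                return False
            prev = entry["entry_hash"]
        return True

    def covers(self, artifact_name, artifact_version):
        """True only if evidence exists for this exact artifact version."""
        return any(
            e["artifact_name"] == artifact_name
            and e["artifact_version"] == artifact_version
            for e in self.entries
        )


log = EvidenceLog()
log.append("credit-model", "2.1", b"serialized model weights")
log.append("bias-test-results", "2.1", b"subgroup metrics report")
intact_before = log.verify()
audited_v3 = log.covers("credit-model", "3.0")

# Simulate someone relabeling old evidence as applying to the new version
log.entries[0]["artifact_version"] = "3.0"
intact_after = log.verify()
```

Before the edit the chain verifies and `covers("credit-model", "3.0")` is false, which reflects the point that an audit of v2.1 does not carry over to v3.0. After the relabeling, `verify()` fails because the stored hash no longer matches the entry's content.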