AI Disclosure & Provenance
Master AI disclosure and provenance end to end. 60 deep dives across 360 lessons covering foundations (transparency vs disclosure vs provenance, audiences, history, failure modes), content provenance standards (C2PA / Content Credentials, IPTC AI tags, JPEG Trust, PAI synthetic media framework, production implementation), watermarking (text, image, audio, video, robustness, evaluation), AI-content detection (text, image, audio, limits, program operations), model & system disclosure (model cards, system cards, datasheets, RAI cards, frontier-model norms, transparency reports, comparability), training-data provenance (lineage, dataset cards, opt-out registries, TDM ethics, copyright, contamination checking), AI disclosure law & regulation (EU AI Act Article 50 / 53, US state stack, election & political ads, platform labelling, FTC guidance, copyright mandates, cross-border), UX (labels, uncertainty communication, chatbot identity, accessibility), operations & engineering (provenance pipeline, versioning, audit trail, vendor requirements, incident response), and industry / standards / future (W3C / ISO / IEEE / NIST / ITU, C2PA / CAI, newsroom policies, deepfake counter-tech, research frontier).
AI disclosure and provenance is the discipline of making it possible for users, regulators, and downstream systems to know, with appropriate confidence, what an AI system is, what it was trained on, what it just produced, and what they should rely on. It sits at the intersection of content provenance standards (C2PA / Content Credentials, IPTC, JPEG Trust), watermarking research (text, image, audio, video), AI-content detection (and its honest limits), model and system disclosure (model cards, system cards, datasheets, RAI cards, transparency reports), training-data provenance (opt-out registries, TDM exceptions, copyright), and the regulatory machinery that turns the whole stack into a legal obligation (EU AI Act Article 50 / Article 53, US state laws, FTC guidance, platform AI-content labelling rules).
This track is written for the practitioners doing this work day to day: provenance engineers shipping C2PA pipelines, ML engineers integrating watermarks, T&S leads building AI-content labelling, RAI leads authoring system cards, legal / policy partners interpreting disclosure law, newsroom standards leads, and program leaders standing up cross-functional disclosure programs. Every topic explains the underlying discipline (drawing on the C2PA spec, IPTC and JPEG Trust standards, NIST AI RMF, EU AI Act, US state laws, frontier-lab system cards, the canonical research literature on watermarking and detection), the practical methodology that operationalises it, and the failure modes where disclosure work quietly fails to inform anyone. The aim is that a reader can stand up a credible AI disclosure and provenance function, integrate it with engineering and governance, and defend it to regulators, journalists, oversight boards, and the users the system actually informs.
All Topics
60 AI disclosure & provenance topics organized into 10 categories. Each has 6 detailed lessons with frameworks, methodologies, and operational patterns.
Disclosure & Provenance Foundations
AI Disclosure & Provenance Overview
Master what AI disclosure and provenance actually mean. Learn the scope, the lineage from broadcast / publishing standards, the deliverables, and the operating model used by mature programs.
6 Lessons
Disclosure Principles
Translate disclosure principles into engineering decisions. Learn the FAIR / OECD / NIST AI RMF principles, the audience-fit principle, the verifiability principle, and the no-harm principle.
6 Lessons
Transparency vs Disclosure vs Provenance
Disentangle transparency, disclosure, and provenance. Learn the distinct meanings, the audiences each serves, the engineering implications, and the failure modes when teams conflate them.
6 Lessons
Audiences for Disclosure
Map the audiences for AI disclosure. Learn the user, regulator, developer, public, journalist, oversight-board, and auditor audiences, each with its own evidentiary standard and language.
6 Lessons
Disclosure History & Landmark Cases
Trace AI disclosure from model-cards research to regulated obligation. Learn the milestones (Mitchell et al. model cards, GPT-4 system card, C2PA launch, EU AI Act Art. 50) and lessons each cemented.
6 Lessons
Disclosure Failure Modes
Recognise disclosure failure modes early. Learn the recurring patterns (disclosure theatre, opacity, information overload, audience mismatch, stale-but-unrevised disclosures) and the discipline that prevents each from quietly capturing your program.
6 Lessons
Content Provenance Standards
C2PA & Content Credentials
Master C2PA (Coalition for Content Provenance and Authenticity) and Content Credentials. Learn the spec, the manifest, signed assertions, and the canonical use cases for AI-generated content.
6 Lessons
C2PA Manifest Structure
Read and write C2PA manifests well. Learn the manifest store, claim generators, ingredient tracking, redaction, signature chains, and the verifier-side validation flow.
6 Lessons
IPTC Photo Metadata & AI Tags
Use IPTC Photo Metadata Standard with AI-specific tags. Learn the digital-source-type vocabulary (trainedAlgorithmicMedia etc.), the IPTC / C2PA bridge, and the newsroom workflow.
6 Lessons
JPEG Trust & Image Standards
Map the JPEG Trust standard and adjacent image-trust work. Learn JPEG Trust core concepts, embedding, the relationship to C2PA, and the trade-offs of multiple competing standards.
6 Lessons
Partnership on AI Synthetic Media Framework
Read the Partnership on AI Responsible Practices for Synthetic Media. Learn the 18 best practices, the builder / creator / distributor split, the framework adoption signals, and integration patterns.
6 Lessons
C2PA Implementation in Production
Implement C2PA in production. Learn the SDK landscape, signing-key custody, watermark + manifest layering, verifier UX, the long-tail of formats, and CDN / re-encoding survivability.
6 Lessons
Watermarking
Watermarking Overview
Map the watermarking landscape. Learn the families (text, image, audio, video), invisible vs visible, statistical vs cryptographic, and the watermark-vs-provenance complementarity.
6 Lessons
LLM Text Watermarking
Reason about LLM text watermarking. Learn green-list / red-list approaches (Kirchenbauer et al.), SynthID-Text, the robustness-vs-quality trade-off, evaluation, and deployment realities.
6 Lessons
Image Watermarking
Reason about image watermarking. Learn DCT / DWT-domain methods, neural watermarks (SynthID-Image, Stable Signature), survivability under crop / compression / re-photo, and the disclosure UX.
6 Lessons
Audio Watermarking
Reason about audio watermarking. Learn echo-hiding / spread-spectrum methods, AudioSeal-style neural watermarks, robustness to compression and re-recording, and voice-clone disclosure.
6 Lessons
Video Watermarking
Reason about video watermarking. Learn frame-level, temporal, and per-clip approaches, the social-platform re-encoding gauntlet, deepfake disclosure, and live-stream constraints.
6 Lessons
Watermark Robustness & Attacks
Reason about watermark robustness. Learn the canonical attack categories (paraphrase, regenerate, transform, ensemble), formal robustness research, and the layered-defence pattern.
6 Lessons
Watermark Evaluation
Evaluate watermarks credibly. Learn detection metrics (TPR at low FPR), robustness eval, the WAVES / VeriBench-style suites, slice eval, and the link to standards work at NIST and ISO.
6 Lessons
AI-Generated Content Detection
AI Content Detection Overview
Map AI-generated content detection. Learn the family taxonomy, the watermark-vs-detector split, eval realities, and why detection is a complement to provenance, not a replacement.
6 Lessons
AI-Text Detection
Reason about AI-text detection. Learn statistical detectors (perplexity, log-likelihood), classifier detectors (GPTZero, OpenAI text-classifier era), the false-positive bias problem, and policy stance.
6 Lessons
AI-Image Detection
Reason about AI-image detection. Learn frequency-domain artefacts, classifier-based detectors, the deepfake-detection arms race, and the limits when models train against detectors.
6 Lessons
AI-Audio & Voice-Clone Detection
Reason about AI-audio and voice-clone detection. Learn artefact-based detectors, liveness checks, the phone-fraud use case, evaluation realities, and the integration with KYC.
6 Lessons
Limits of AI Detection
Reason honestly about detection limits. Learn the false-positive harm story, the cat-and-mouse curve, the disclosure ethics, and why detection alone cannot carry the weight of policy.
6 Lessons
Detection Program Operations
Run a detection program responsibly. Learn the threshold-setting discipline, audit logging, false-positive review, the human-in-loop requirement, and the user-redress pathway.
6 Lessons
Model & System Disclosure
Model Cards (Mitchell et al.)
Author model cards properly. Learn the canonical Mitchell et al. structure, the Hugging Face / Google Cloud variants, intended-use sections, eval reporting, and the maintenance discipline.
6 Lessons
System Cards
Author system cards (frontier-lab style). Learn the canonical sections, capability vs deployment context split, safety-eval reporting, residual-risk disclosure, and the cross-version comparability ritual.
6 Lessons
Datasheets for Datasets
Author datasheets for datasets (Gebru et al.). Learn the canonical structure, motivation / composition / collection / preprocessing sections, intended use, and the maintenance discipline.
6 Lessons
Responsible AI / Use Cards
Author responsible-AI cards alongside model and system cards. Learn the format, the intended-audience split (developer vs admin vs user), and the integration with deployment guides.
6 Lessons
Frontier Model Disclosure Norms
Read frontier-model disclosure norms. Learn the canonical artefacts (system card, RSP / scaling policy, evaluation reports, AISI access), the comparability question, and the regulator linkage.
6 Lessons
AI Transparency Reports
Publish AI transparency reports. Learn the canonical metric set, methodology disclosure, comparability discipline, the DSA-aligned report shape, and the audience-specific publication pattern.
6 Lessons
Disclosure Comparability Across Vendors
Compare AI disclosures across vendors. Learn the comparability problem, the eval-set disclosure question, methodology fingerprinting, the Stanford CRFM Foundation Model Transparency Index, and procurement use cases.
6 Lessons
Training Data Provenance
Training Data Provenance Overview
Build training-data provenance end to end. Learn the lineage tracking, source-permission disclosure, the EU AI Act Article 53 obligation for GPAI training-data summaries, and the engineering pattern.
6 Lessons
Dataset Cards & Documentation
Write dataset cards beyond the academic minimum. Learn the disclosure-grade dataset card, the regulator-grade dataset card, contractor-treatment disclosure, and ongoing maintenance.
6 Lessons
Opt-Out Registries & Honour
Honour creator and rights-holder opt-outs. Learn Have-I-Been-Trained / Spawning, robots.txt / ai.txt, TDM reservations under EU CDSM, the engineering of honoured opt-outs, and disclosure.
6 Lessons
Web Scraping Ethics & TDM
Scrape the web ethically for AI training. Learn the EU CDSM TDM exception, US fair-use posture, terms-of-service respect, robots.txt / ai.txt, and the scraping-disclosure pattern.
6 Lessons
Copyright & Provenance
Engineer copyright-aware provenance. Learn licensed-corpus tracking, fair-use claim documentation, output-similarity defences, and the disclosure layer that supports both compliance and litigation.
6 Lessons
Eval Contamination Checking
Catch eval contamination in training. Learn the contamination-detection methodology, public benchmark hygiene, decontamination workflows, disclosure norms, and the link to model-card eval reporting.
6 Lessons
AI Disclosure Law & Regulation
EU AI Act Article 50 Disclosure
Engineer for EU AI Act Article 50. Learn the chatbot-disclosure duty, deepfake-labelling duty, GPAI disclosure under Article 53, the machine-readable requirement, and enforcement timelines.
6 Lessons
US State AI Disclosure Laws
Navigate the US state AI disclosure patchwork. Learn California (AB 2013, AB 2655, AB 2839), Illinois, Texas, Colorado, NYC AEDT, and the federal-action stance from the FTC and EOs.
6 Lessons
Election & Political Ad Disclosure
Engineer election and political-ad AI disclosure. Learn the Tech Accord on AI & Elections commitments, FEC / FCC stances, state laws, platform policies, and the labelling pattern.
6 Lessons
Platform AI-Content Labeling Rules
Read platform AI-content labelling rules. Learn the Meta / TikTok / Google-YouTube / X policies, the labelling-vs-removal distinction, edge cases (parody, art, journalism), and the enforcement gaps.
6 Lessons
FTC AI Disclosure Guidance
Work with FTC AI disclosure guidance. Learn Section 5 unfair / deceptive framing, the Operation AI Comply enforcement actions, endorsement / testimonial rules, and the consumer-disclosure expectations.
6 Lessons
Copyright & Training-Data Disclosure Mandates
Comply with copyright and training-data disclosure mandates. Learn EU AI Act Art. 53 training-data summaries, US Copyright Office guidance, JP/UK developments, and the disclosure-pipeline implication.
6 Lessons
Cross-Border Disclosure Compliance
Run cross-border disclosure compliance. Learn the obligation map (EU, US-state, UK, China, Japan, Korea, Brazil, India), the highest-common-denominator strategy, and per-region overrides.
6 Lessons
UX & User-Facing Disclosure
Disclosure UX Overview
Design disclosure UX that users actually read. Learn the noticing / understanding / acting framework, dark-pattern avoidance, plain-language standards, and the layered-disclosure pattern.
6 Lessons
AI-Content Labels in Product
Engineer in-product AI-content labels. Learn the canonical label vocabulary (made-with-AI, AI-edited, AI-detected, manipulated), placement rules, accessibility, and DSA Article 21 alignment.
6 Lessons
Uncertainty Communication
Communicate model uncertainty to users. Learn the confidence-display patterns, the explanation-vs-honesty trade-off, the calibration-grounded disclosure, and the high-stakes UX standards.
6 Lessons
Chatbot Identity Disclosure
Disclose chatbot identity correctly. Learn the ‘I am AI’ rule, the on-question-answer requirement, persona-but-honest design, the regulatory floor, and the high-deception failure mode.
6 Lessons
Accessibility & Plain-Language Disclosure
Make AI disclosure accessible. Learn WCAG conformance, screen-reader / low-vision / motor / cognitive accessibility, plain-language standards, multilingual disclosure, and the audit pattern.
6 Lessons
Operations & Engineering
Provenance Pipeline Engineering
Engineer the provenance pipeline end to end. Learn ingest signing, transform-time provenance preservation, storage, retrieval, distribution, and the SLA between provenance owners and consumers.
6 Lessons
Disclosure Versioning & Updates
Version disclosures like code. Learn change management, deprecation, the public-changelog pattern, regulator-notification triggers, and the back-fill question on existing artefacts.
6 Lessons
Disclosure Audit Trail
Maintain a disclosure audit trail that survives inspection. Learn the artefact set, retention policy, signed-claims discipline, third-party audit readiness, and the regulator-facing workflow.
6 Lessons
Vendor Disclosure Requirements
Require disclosure from AI vendors. Learn the procurement-grade disclosure questionnaire, model-card / system-card review, training-data audit, ongoing monitoring, and contractual remedies.
6 Lessons
Disclosure Incident Response
Respond when disclosure breaks. Learn the incident definitions (false claim, missing label, mislabelled, mass mislabel), severity ladder, triage, public correction, and regulator notification.
6 Lessons
Industry, Standards & Future
Standards Bodies (W3C, ISO, IEEE, NIST, ITU)
Engage with the standards bodies that shape AI disclosure. Learn W3C, ISO/IEC JTC 1, IEEE, NIST, ITU, and the contributor / observer / adopter pattern for engineering teams.
6 Lessons
Coalition for Content Provenance & Authenticity
Read the C2PA / Content Authenticity Initiative ecosystem. Learn the founding-member story, governance, working groups, the open-source SDK, and the consumer-brand strategy.
6 Lessons
Newsroom AI Disclosure Policies
Read newsroom AI disclosure policies. Learn AP / Reuters / NYT / BBC / Guardian / WaPo policies, the canonical do / do-not / disclose-when patterns, and the engineering implications for newsroom CMS.
6 Lessons
Future of Deepfakes & Counter-Tech
Reason about where the deepfake / counter-tech curve is heading. Learn the generation-quality vs detection-difficulty curve, the watermark + provenance + detection stack, and the policy direction.
6 Lessons
Disclosure Research Frontier
Track the disclosure research frontier. Learn the open problems (cross-vendor watermark interop, robust text watermarking, provenance under generative editing), the venues, and the engagement pattern.
6 Lessons
Lilly Tech Systems