Data Governance
Master data governance end to end. 60 deep dives across 360 lessons covering foundations (DG vs DM, DAMA DMBOK / DCAM frameworks, history, business value, maturity), operating model & roles (operating models, CDO, stewardship, council, RACI, budgeting), data strategy & architecture (strategy, domains, data-as-a-product, fabric vs mesh, lakehouse, analytics engineering), metadata / catalogs / lineage (metadata management, catalog platforms, lineage with OpenLineage, business glossary, discovery, active metadata), data quality & observability (DQ dimensions, rules / engines, observability with Monte Carlo / Bigeye / Anomalo, incident management, scorecards, ML data quality), master & reference data (MDM overview, MDM architectures, entity resolution, RDM, MDM platforms, MDM + AI), security & privacy governance (classification, access governance, DLP, masking / tokenisation, privacy governance, third-party data risk), lifecycle & retention (lifecycle, retention schedules, archival / tiering, deletion / right-to-erasure, legal hold, records management), regulatory (landscape, GDPR, CCPA / CPRA, HIPAA, BCBS 239, SOX), and AI / ML modern DG (AI data governance, training data governance, model lineage / inventory, LLMOps DG, vector DB governance, future of DG including DSPM convergence).
Data governance is the foundational discipline of deciding who decides about data: who owns it, who can use it, what it means, what quality standard it must meet, where it lives, how long it stays, and how the organisation defends those decisions to regulators and auditors. It sits at the intersection of records management, the DAMA DMBOK and EDM Council DCAM frameworks, the modern data-mesh and data-product movement, the metadata / catalog / lineage stack, the data quality and observability stack, master and reference data management, security and privacy governance, lifecycle and retention, the regulatory landscape (GDPR, CCPA, HIPAA, BCBS 239, SOX, EU AI Act), and the modern AI / ML extension that turns DG into a prerequisite for any serious AI deployment. Over the last decade it has stopped being a back-office function and has become a board-level operating commitment subject to regulator inspection, material fines, and procurement scrutiny.
This track is written for the practitioners doing this work day to day: chief data officers, data governance leads, data stewards, data product owners, data engineers, analytics engineers, data scientists, ML engineers integrating DG controls, privacy and security partners, lawyers and compliance leads, and the cross-functional partners who make DG land. Every topic explains the underlying discipline (drawing on DAMA DMBOK 2, the EDM Council DCAM, ISO 8000, BCBS 239, the OpenLineage / OpenMetadata / DataHub specifications, the canonical research literature on data quality and entity resolution, regulator guidance, and hard-won production experience), the practical methodology that operationalises it, the artefacts and rituals that make it stick, and the failure modes where data governance work quietly fails to govern anything. The aim is that a reader can stand up a credible data governance function, integrate it with engineering, business, and regulatory partners, and defend it to boards, auditors, and customers.
All Topics
60 data governance topics organised into 10 categories. Each has 6 detailed lessons with frameworks, methodologies, and operational patterns.
Data Governance Foundations
Data Governance Overview
Master what data governance actually is. Learn the scope, the lineage from records management and EDM, the deliverables, and the operating model used by mature programs.
6 Lessons
Data Governance vs Data Management
Disentangle data governance from data management. Learn the boundary, how DG sets the rules and DM operates the systems, the integration patterns, and the failure modes when teams blur them.
6 Lessons
DAMA DMBOK, DCAM & EDM Council Frameworks
Read the canonical data governance frameworks. Learn DAMA DMBOK, DCAM (EDM Council), CMMI DMM, ISO 8000, and how to map them into a single working framework for your organisation.
6 Lessons
Data Governance History & Evolution
Trace data governance from records management through Sarbanes-Oxley to modern data mesh and AI governance. Learn the eras, regulatory drivers, and the lessons each era cemented.
6 Lessons
Business Value of Data Governance
Articulate the business value of DG without resorting to abstractions. Learn the canonical value categories (risk, productivity, revenue, compliance), the ROI model, and the board-level pitch.
6 Lessons
Data Governance Maturity Model
Assess and advance DG maturity. Learn the canonical 5-level maturity ladder, assessment methodology, gap analysis, the 18-month roadmap, and the maturity-to-investment conversation.
6 Lessons
Operating Model & Roles
DG Operating Models
Pick the right DG operating model. Learn centralised, federated, hub-and-spoke, and data-mesh patterns, the model-evolution decision, and the failure modes of mismatched models.
6 Lessons
Chief Data Officer Role & Mandate
Stand up or operate as a CDO. Learn the role mandate, reporting line (CEO / COO / CTO / CIO), authority, the 90-day plan, the multi-year strategy, and the CDO-vs-CIO boundary.
6 Lessons
Data Stewardship & Ownership
Run data stewardship that actually works. Learn business stewards vs technical stewards vs data owners, role definitions, recruitment, accountability, and the steward operating cadence.
6 Lessons
Data Governance Council & Boards
Stand up a DG council that decides things. Learn membership, decision rights, escalation, cadence, the steering vs working board split, and the link to the executive committee.
6 Lessons
RACI Across Data Roles
Build a RACI that holds. Learn RACI across owner / steward / engineer / analyst / scientist / privacy / security / business sponsor, the canonical decisions, and the per-decision matrix.
6 Lessons
DG Budgeting & ROI
Model DG budget and demonstrate ROI. Learn the budget structure, capacity-driver mapping, automation-investment ROI, the regulator-driven floor, and the budget-defence narrative.
6 Lessons
Data Strategy & Architecture
Data Strategy Authoring
Author a data strategy that survives two budget cycles. Learn the strategy template, prioritisation against business strategy, OKR setting, the 18-month roadmap, and the executive pitch.
6 Lessons
Data Domains & Decomposition
Decompose the enterprise into data domains. Learn domain identification, boundary definition, ownership, the upstream / source domain pattern, and the domain-to-product mapping.
6 Lessons
Data-as-a-Product
Ship data products properly. Learn the product-thinking framing, data-product specifications, SLAs / SLOs, contracts, the four data-product types (source, aggregate, consumer-aligned, ML-feature), and adoption.
6 Lessons
Data Fabric vs Data Mesh
Disentangle data fabric and data mesh. Learn the architectural differences, organisational implications, the hybrid pattern that often wins, and the vendor-marketing-vs-reality gap.
6 Lessons
Enterprise Data Architecture
Architect enterprise data flows. Learn the canonical layers (sources, ingest, lake, warehouse, marts, semantic layer, consumer), the lakehouse pattern, governance integration, and reference architectures.
6 Lessons
Analytics Engineering Discipline
Adopt analytics engineering as a governance partner. Learn the role, dbt as the de-facto framework, transformation governance, semantic-layer engineering, and the BI-handoff pattern.
6 Lessons
Metadata, Catalogs & Lineage
Metadata Management
Manage metadata as a first-class asset. Learn technical / business / operational metadata, metadata models, ingestion patterns, active metadata, and the discoverability KPI.
6 Lessons
Data Catalog Platforms
Pick and operate a data catalog. Learn the vendor landscape (Collibra, Alation, Atlan, DataHub, OpenMetadata, Unity Catalog, Polaris, Horizon), build-vs-buy, and adoption strategy.
6 Lessons
Data Lineage
Build data lineage that engineers and auditors actually use. Learn technical vs business lineage, column-level lineage, OpenLineage, the parsing approach, and the impact-analysis workflow.
6 Lessons
Business Glossary & Taxonomy
Build a business glossary that is actually consulted. Learn term authoring, definition discipline, term-to-system binding, governance workflow, and the discoverability strategy.
6 Lessons
Data Discovery & Search
Build data discovery that gets analysts to the right asset. Learn search ranking, social signals, lineage-aware results, LLM-augmented discovery, and the time-to-data KPI.
6 Lessons
Active Metadata & Knowledge Graph
Use active metadata to drive action. Learn active vs passive metadata, the metadata knowledge graph, automation triggers (DQ, retention, access, classification), and the modern catalog stack.
6 Lessons
Data Quality & Observability
Data Quality Dimensions
Master DQ dimensions. Learn the DAMA / ISO 8000 set (accuracy, completeness, consistency, timeliness, uniqueness, validity), critical-to-quality criteria, and the DQ-to-business-outcome link.
6 Lessons
DQ Rules & Engines
Build DQ rules that fire reliably. Learn rule types (single-column, multi-column, cross-table, statistical), engines (Great Expectations (GX), Soda, dbt tests), thresholds, and rule-as-code discipline.
6 Lessons
Data Observability
Adopt data observability. Learn the five pillars (freshness, volume, schema, distribution, lineage), the vendor landscape (Monte Carlo, Bigeye, Anomalo, Datadog Data Streams), and detection patterns.
6 Lessons
Data Quality Incident Management
Run DQ incident management. Learn incident definitions, severity ladders, on-call rotations for data, the runbook discipline, the post-incident review, and downstream-consumer comms.
6 Lessons
DQ Scorecards & SLAs
Publish DQ scorecards and SLAs. Learn scorecard design, per-domain / per-product views, SLA definition, the green-amber-red discipline, and the executive-readout cadence.
6 Lessons
Data Quality for AI/ML
Adapt DQ to AI/ML. Learn training-set quality, feature-store governance, label quality, drift detection, eval-data hygiene, and the link to model performance.
6 Lessons
Master & Reference Data
Master Data Management Overview
Master MDM as a discipline. Learn the canonical master entities (customer, product, employee, location, supplier, account), the golden-record concept, and the MDM operating model.
6 Lessons
MDM Architectures
Pick the right MDM architecture. Learn registry, hub, coexistence, and transactional patterns, the trade-offs, the migration path between styles, and the multi-domain MDM question.
6 Lessons
Entity Resolution
Master entity resolution. Learn deterministic matching, probabilistic matching (Fellegi-Sunter), ML-based matching, blocking strategies, evaluation, and the link from matches to golden records.
6 Lessons
Reference Data Management
Manage reference data well. Learn code-list governance (countries, currencies, accounts, products), the cross-system mapping problem, ISO standards, RDM platforms, and the change-management discipline.
6 Lessons
MDM Platforms
Navigate the MDM platform landscape. Learn enterprise vendors (Informatica, IBM, Reltio, Stibo, Profisee, SAP MDG, Oracle), build-vs-buy, the cloud-MDM era, and the integration discipline.
6 Lessons
MDM + AI Integration
Integrate MDM with AI/ML. Learn LLM-augmented stewardship, AI-driven entity resolution, MDM as ground truth for ML, the embedding-based discovery pattern, and the governance overlay.
6 Lessons
Data Security & Privacy Governance
Data Classification
Classify data so controls follow it. Learn classification schemes, automated classification (rule + ML + LLM), tagging at ingest, and the classification-to-control mapping that holds.
6 Lessons
Data Access Governance
Govern data access. Learn ABAC vs RBAC vs PBAC, the policy-as-code pattern (OPA, Cedar, Apache Ranger, Immuta), purpose-based access, just-in-time access, and the audit pattern.
6 Lessons
DLP & Data Egress Controls
Engineer DLP and data egress controls. Learn endpoint / network / cloud DLP, AI-aware DLP for prompts and outputs, the egress-path inventory, the false-positive trade-off, and incident response.
6 Lessons
Data Masking & Tokenization
Apply data masking and tokenisation for governance use cases. Learn static vs dynamic masking, format-preserving encryption, vault vs vaultless tokenisation, and the test-data-management pattern.
6 Lessons
Privacy Governance
Run privacy governance under the data governance umbrella. Learn the DG / privacy interface, ROPA upkeep, DPIA triggers, DSAR fulfilment, and the operating overlap with the privacy engineering function.
6 Lessons
Third-Party Data Risk
Govern third-party data risk. Learn vendor data inventories, DPA / data sharing agreements, sub-processor disclosure, breach-notification clauses, and ongoing monitoring.
6 Lessons
Data Lifecycle & Retention
Data Lifecycle Overview
Govern the data lifecycle end to end. Learn the canonical phases (create / capture, store, process, share, archive, destroy), per-phase controls, and the lifecycle-to-policy mapping.
6 Lessons
Retention Schedules
Run retention as engineering, not policy. Learn retention schedules per record type, the legal-hold override, automated enforcement, anomaly alerts, and the audit trail.
6 Lessons
Archival & Tiering Strategies
Tier data well. Learn hot / warm / cold / archive tiers, cloud-storage class economics, retrieval SLAs, the tiering policy, and the failure modes of misconfigured lifecycle rules.
6 Lessons
Data Deletion & Right-to-Erasure
Engineer real deletion. Learn deletion-vs-anonymisation, deletion across replicas / backups / caches / search indexes / ML artefacts, the legal-hold exception, and proof-of-deletion.
6 Lessons
Legal Hold Discipline
Run legal-hold discipline. Learn hold triggers, scope definition, custodian notification, technical preservation, the e-discovery linkage, and the release procedure.
6 Lessons
Records Management
Run records management as the parent discipline. Learn ISO 15489, file plans, record series, vital records, the records-management / data-governance overlap, and the modern RM platform.
6 Lessons
Regulatory & Compliance
Data Regulatory Landscape Overview
Map the data regulatory landscape. Learn the privacy stack (GDPR, CCPA), sectoral (HIPAA, GLBA, SOX, BCBS 239, FERPA, COPPA), AI-era (EU AI Act, US EO), and the cross-border patchwork.
6 Lessons
GDPR for Data Governance
Translate GDPR into data governance controls. Learn lawful bases, controller / processor, data subject rights, ROPA, DPIA, transfers, and the canonical GDPR-to-DG-control mapping.
6 Lessons
CCPA / CPRA for Data Governance
Implement CCPA / CPRA at the DG layer. Learn consumer rights, sale / share / sensitive-PI definitions, opt-out (incl. GPC), service-provider obligations, and the canonical control mapping.
6 Lessons
HIPAA & Healthcare Data Governance
Run healthcare data governance under HIPAA. Learn covered entities and business associates, PHI handling, privacy / security / breach rules, BAAs, de-identification standards, and HIPAA-aware AI.
6 Lessons
BCBS 239
Implement BCBS 239 risk-data aggregation principles. Learn the 14 principles, the four themes (governance, capabilities, reporting, supervisory), and the data-lineage implication.
6 Lessons
SOX & Financial Reporting DG
Govern data for SOX. Learn Section 302 / 404 controls, the financial-reporting data scope, ITGCs, change-management discipline, segregation of duties, and the audit-readiness pattern.
6 Lessons
AI/ML & Modern Data Governance
AI Data Governance Overview
Stand up AI data governance. Learn the AI-specific obligations, the integration with classical DG, the canonical AI-DG controls, and the cross-functional operating model.
6 Lessons
Training Data Governance
Govern training data end to end. Learn lineage, source-permission tracking, dataset cards, honouring opt-outs, contamination checking, the EU AI Act Article 53 obligation, and audit.
6 Lessons
Model Lineage & Inventory
Maintain a model inventory and lineage. Learn the model registry, the data-to-model lineage, the model-to-deployment lineage, version governance, and the inventory KPI.
6 Lessons
LLMOps Data Governance
Govern LLMOps data flows. Learn prompt-data governance, retrieval-source governance, output-logging policy, vendor data flows (no-train clauses), and the eval-data hygiene discipline.
6 Lessons
Vector Database Governance
Govern vector databases. Learn embedding lineage, source-document classification propagation, access control, deletion (incl. embedding-level deletion), and contamination concerns.
6 Lessons
Future of Data Governance
Reason about where data governance is heading. Learn DSPM, AI-augmented DG, agentic stewardship, the regulatory consolidation curve, and the strategic-posture template.
6 Lessons