Intermediate

Differential Privacy Tools & Libraries

A practical guide to the open-source libraries that make differential privacy accessible for production ML systems. From data analysis to model training, these tools handle the complex math so you can focus on your application.
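To see what these libraries abstract away, here is a minimal hand-rolled sketch of the Laplace mechanism, the building block behind several of the examples below. The function names are illustrative, not from any of these libraries; production code should use a vetted library, since naive floating-point noise sampling has known vulnerabilities.

```python
import random

def laplace_noise(scale: float) -> float:
    """Sample from Laplace(0, scale) as the difference of two exponentials."""
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def private_count(data, epsilon: float) -> float:
    """A counting query has sensitivity 1: adding or removing one person
    changes the count by at most 1, so the noise scale is 1 / epsilon."""
    sensitivity = 1.0
    return len(data) + laplace_noise(sensitivity / epsilon)

random.seed(0)
print(private_count(range(1000), epsilon=1.0))  # roughly 1000, plus noise
```

Smaller epsilon means a larger noise scale and stronger privacy; the libraries below automate exactly this calibration, plus the bookkeeping around it.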

Library Comparison

Library          | Maintainer        | Use Case                   | Language
OpenDP           | Harvard/Microsoft | General DP data analysis   | Rust + Python bindings
Google DP        | Google            | Aggregate analytics        | C++, Java, Go, Python
Opacus           | Meta              | PyTorch DP-SGD training    | Python (PyTorch)
TF Privacy      | Google            | TensorFlow DP-SGD training | Python (TensorFlow)
PipelineDP       | OpenMined/Google  | DP on Spark/Beam pipelines | Python
Tumult Analytics | Tumult Labs       | DP SQL-like queries        | Python (PySpark)

OpenDP

OpenDP is a community effort to build trustworthy, open-source software tools for statistical analysis with differential privacy:

Python - OpenDP Example
from opendp.mod import enable_features
enable_features("contrib")

import opendp.prelude as dp

# Build a measurement: private mean of ages
input_space = dp.vector_domain(dp.atom_domain(T=float)), \
              dp.symmetric_distance()

# Chain transformations
mean_measurement = (
    input_space >>
    dp.t.then_clamp(bounds=(0.0, 120.0)) >>
    dp.t.then_resize(size=1000, constant=50.0) >>
    dp.t.then_mean() >>
    dp.m.then_laplace(scale=0.12)  # Calibrated noise
)

# Check the privacy guarantee
print(f"Privacy loss: ε = {mean_measurement.map(1)}")

# Apply to data
ages = [25.0, 34.0, 42.0, 56.0, ...]  # 1000 ages
private_mean = mean_measurement(ages)
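The `scale=0.12` above is not arbitrary: for a mean over exactly 1,000 values clamped to [0, 120], one record can shift the result by at most (120 − 0)/1000 = 0.12, and the Laplace mechanism's privacy loss is sensitivity divided by scale. A back-of-the-envelope check in plain Python (independent of OpenDP, which accounts for exact stability constants in `map`, so its reported value may differ by a small factor):

```python
# Sensitivity of a mean over a fixed-size, clamped dataset:
# one changed record moves the sum by at most (upper - lower),
# hence the mean by (upper - lower) / n.
lower, upper, n = 0.0, 120.0, 1000
sensitivity = (upper - lower) / n   # 0.12

# Laplace mechanism: epsilon = sensitivity / scale
scale = 0.12
epsilon = sensitivity / scale
print(epsilon)  # 1.0 -- so the measurement above gives roughly epsilon = 1
```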

Google DP Library

Google's differential privacy library provides battle-tested implementations of DP aggregation functions:

Python - Google DP Library
from pydp.algorithms.laplacian import BoundedMean, Count

data = [12, 44, 8, 63, 27]  # toy example values
# Private count
count = Count(epsilon=1.0, dtype="int")
for val in data:
    count.add_entry(val)
private_count = count.result()

# Private bounded mean
mean = BoundedMean(
    epsilon=1.0,
    lower_bound=0,
    upper_bound=100,
    dtype="float"
)
for val in data:
    mean.add_entry(val)
private_mean = mean.result()
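Note that running both aggregations above on the same data spends privacy budget twice: under basic sequential composition the epsilons simply add, for a total of ε = 2.0. A toy sketch of that bookkeeping (plain Python, not part of PyDP, which does not enforce a total budget for you):

```python
class BudgetTracker:
    """Toy sequential-composition accountant (illustrative only)."""

    def __init__(self, total_epsilon: float):
        self.total = total_epsilon
        self.spent = 0.0

    def charge(self, epsilon: float) -> None:
        # Basic composition: privacy losses of sequential queries add up.
        if self.spent + epsilon > self.total:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon

tracker = BudgetTracker(total_epsilon=2.0)
tracker.charge(1.0)  # the Count query
tracker.charge(1.0)  # the BoundedMean query
print(tracker.spent)  # 2.0 -- budget fully spent; a third query would raise
```

Advanced composition and RDP accounting give tighter totals for many queries, but the additive rule is the safe upper bound.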

Opacus for PyTorch

Opacus is Meta's library for training PyTorch models with differential privacy. It wraps your existing training loop with minimal code changes. See the DP-SGD lesson for a full training example.
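Under the hood, Opacus replaces the usual batch gradient with per-example gradients that are clipped to a fixed L2 norm and then noised before averaging. A dependency-free sketch of that core step, using plain Python lists instead of the PyTorch tensors Opacus actually operates on:

```python
import math
import random

def dp_sgd_step(per_example_grads, l2_norm_clip, noise_multiplier):
    """Clip each example's gradient to L2 norm <= l2_norm_clip, sum,
    add Gaussian noise with std = noise_multiplier * l2_norm_clip,
    then average over the batch."""
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    summed = [0.0] * dim
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        # Scale down only gradients whose norm exceeds the clip bound.
        factor = min(1.0, l2_norm_clip / norm) if norm > 0 else 1.0
        for i in range(dim):
            summed[i] += g[i] * factor
    std = noise_multiplier * l2_norm_clip
    return [(s + random.gauss(0.0, std)) / n for s in summed]

random.seed(0)
grads = [[3.0, 4.0], [0.3, 0.4]]  # L2 norms 5.0 and 0.5
noisy = dp_sgd_step(grads, l2_norm_clip=1.0, noise_multiplier=0.0)
# With noise_multiplier=0 only clipping applies: the first gradient
# is scaled to norm 1, the second passes through unchanged.
```

Clipping bounds each example's influence on the update (the sensitivity), and the Gaussian noise then hides any single example's contribution; Opacus does the equivalent efficiently on tensors and tracks the resulting privacy loss with its accountant.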

TensorFlow Privacy

TensorFlow Privacy provides DP-SGD optimizers that drop into standard TensorFlow/Keras training:

Python - TensorFlow Privacy
import tensorflow as tf
from tensorflow_privacy.privacy.optimizers.dp_optimizer_keras \
    import DPKerasSGDOptimizer
from tensorflow_privacy.privacy.analysis \
    import compute_dp_sgd_privacy

optimizer = DPKerasSGDOptimizer(
    l2_norm_clip=1.0,
    noise_multiplier=1.1,
    num_microbatches=256,
    learning_rate=0.01
)

# DP-SGD with microbatches requires a per-example (vector) loss,
# so disable the default mean reduction
loss = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction=tf.losses.Reduction.NONE)
model.compile(optimizer=optimizer, loss=loss)
model.fit(x_train, y_train, epochs=10, batch_size=256)

# Compute the achieved privacy guarantee
eps, _ = compute_dp_sgd_privacy.compute_dp_sgd_privacy(
    n=len(x_train), batch_size=256,
    noise_multiplier=1.1, epochs=10, delta=1e-5
)
print(f"Achieved ε = {eps:.2f}")

Choosing a library: For ML model training, use Opacus (PyTorch) or TF Privacy (TensorFlow). For data analytics and queries, use OpenDP or Google's DP library. For large-scale pipeline processing, use PipelineDP or Tumult Analytics.