Differential Privacy Tools & Libraries
A practical guide to the open-source libraries that make differential privacy accessible for production ML systems. From data analysis to model training, these tools handle the complex math so you can focus on your application.
Library Comparison
| Library | Maintainer | Use Case | Language |
|---|---|---|---|
| OpenDP | Harvard/Microsoft | General DP data analysis | Rust + Python bindings |
| Google DP | Google | Aggregate analytics | C++, Java, Go, Python |
| Opacus | Meta | PyTorch DP-SGD training | Python (PyTorch) |
| TF Privacy | Google | TensorFlow DP-SGD training | Python (TensorFlow) |
| PipelineDP | OpenMined/Google | DP on Spark/Beam pipelines | Python |
| Tumult Analytics | Tumult Labs | DP SQL-like queries | Python (PySpark) |
OpenDP
OpenDP is a community effort to build trustworthy, open-source software tools for statistical analysis with differential privacy:
```python
from opendp.mod import enable_features
enable_features("contrib")
import opendp.prelude as dp

# Build a measurement: private mean of ages
input_space = (
    dp.vector_domain(dp.atom_domain(T=float)),
    dp.symmetric_distance(),
)

# Chain transformations
mean_measurement = (
    input_space
    >> dp.t.then_clamp(bounds=(0.0, 120.0))
    >> dp.t.then_resize(size=1000, constant=50.0)
    >> dp.t.then_mean()
    >> dp.m.then_laplace(scale=0.12)  # Calibrated noise
)

# Check the privacy guarantee
print(f"Privacy loss: ε = {mean_measurement.map(1)}")

# Apply to data
ages = [25.0, 34.0, 42.0, 56.0, ...]  # 1000 ages
private_mean = mean_measurement(ages)
```
Google DP Library
Google's differential privacy library provides battle-tested implementations of DP aggregation functions. The PyDP package exposes Python bindings:
```python
from pydp.algorithms.laplacian import BoundedMean, Count

# Private count
count = Count(epsilon=1.0, dtype="int")
for val in data:
    count.add_entry(val)
private_count = count.result()

# Private bounded mean
mean = BoundedMean(
    epsilon=1.0,
    lower_bound=0,
    upper_bound=100,
    dtype="float",
)
for val in data:
    mean.add_entry(val)
private_mean = mean.result()
```
Opacus for PyTorch
Opacus is Meta's library for training PyTorch models with differential privacy. It wraps your existing training loop with minimal code changes. See the DP-SGD lesson for a full training example.
TensorFlow Privacy
TensorFlow Privacy provides DP-SGD optimizers that drop into standard TensorFlow/Keras training:
```python
from tensorflow_privacy.privacy.optimizers.dp_optimizer_keras import (
    DPKerasSGDOptimizer,
)
from tensorflow_privacy.privacy.analysis import compute_dp_sgd_privacy

optimizer = DPKerasSGDOptimizer(
    l2_norm_clip=1.0,
    noise_multiplier=1.1,
    num_microbatches=256,
    learning_rate=0.01,
)
model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy')
model.fit(x_train, y_train, epochs=10, batch_size=256)

# Compute the achieved privacy guarantee
eps, _ = compute_dp_sgd_privacy.compute_dp_sgd_privacy(
    n=len(x_train),
    batch_size=256,
    noise_multiplier=1.1,
    epochs=10,
    delta=1e-5,
)
print(f"Achieved ε = {eps:.2f}")
```