# W&B Best Practices
Collaborate effectively with your team, create impactful reports, integrate with popular frameworks, and build production-grade ML workflows.
## Team Collaboration
- Shared projects: Create team-level projects so everyone logs to the same workspace.
- Standardize configs: Agree on config keys and metric names across the team.
- Use tags consistently: Tag runs with experiment purpose, data version, and team member.
- Review via Reports: Create W&B Reports for experiment reviews instead of slide decks.
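The conventions above can be sketched in code. This is a minimal sketch, not a W&B API: `STANDARD_CONFIG_KEYS`, `standard_tags`, and `init_team_run` are hypothetical helper names, and the entity/project names are illustrative.

```python
# Team-wide conventions: one shared list of config keys and one tag builder,
# so every run is comparable in the shared workspace.
STANDARD_CONFIG_KEYS = ["learning_rate", "batch_size", "architecture", "data_version"]

def standard_tags(purpose: str, data_version: str, author: str) -> list[str]:
    """Build the tag set every team run should carry."""
    return [f"purpose:{purpose}", f"data:{data_version}", f"author:{author}"]

def init_team_run(config: dict, purpose: str, author: str):
    """Start a run in the shared team project using the conventions above.
    (wandb is imported lazily so the pure helpers stay importable without it.)"""
    import wandb
    return wandb.init(
        entity="acme-ml-team",       # illustrative team entity
        project="fraud-detection",   # illustrative shared project
        config=config,
        tags=standard_tags(purpose, config["data_version"], author),
    )
```

Keeping the key list and tag builder in a small shared module is usually enough to make cross-run comparison work without any further coordination.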
## W&B Reports
```python
import wandb

api = wandb.Api()

# Reports are best created via the W&B UI, but you can
# also create them programmatically:
# 1. Create a report from the UI with live charts
# 2. Embed run comparisons, tables, and markdown
# 3. Share via URL - viewers don't need a W&B account
# 4. Export to PDF or LaTeX for publications

# Query runs for analysis
runs = api.runs("team/project", filters={"tags": "production"})
for run in runs:
    print(f"{run.name}: accuracy={run.summary.get('val/accuracy')}")
```
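For report prep, the queried runs can be flattened into rows for a table or plot. A minimal sketch: `summarize_runs` is a hypothetical helper, and it only assumes each run exposes `.name` and a dict-like `.summary`, as the `wandb.Api` run objects do.

```python
def summarize_runs(runs, metric="val/accuracy"):
    """Collect (name, metric) rows from an iterable of W&B run objects."""
    rows = []
    for run in runs:
        rows.append({"name": run.name, metric: run.summary.get(metric)})
    return rows

# Usage with a logged-in session:
#   import wandb
#   runs = wandb.Api().runs("team/project", filters={"tags": "production"})
#   table = summarize_runs(runs)
```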
## Framework Integration Patterns
```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint
from pytorch_lightning.loggers import WandbLogger

wandb_logger = WandbLogger(
    project="lightning-demo",
    log_model="all",  # log all checkpoints as artifacts
)

trainer = Trainer(
    max_epochs=50,
    logger=wandb_logger,
    callbacks=[
        ModelCheckpoint(monitor="val_loss"),
        EarlyStopping(monitor="val_loss", patience=5),
    ],
)
trainer.fit(model, datamodule)  # model and datamodule defined elsewhere
```
```python
import wandb
from wandb.integration.keras import WandbMetricsLogger, WandbModelCheckpoint

wandb.init(project="keras-demo")
model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=50,
    callbacks=[
        WandbMetricsLogger(),
        WandbModelCheckpoint("models/best.keras"),
    ],
)
```
## Project Organization
| Level | Convention | Example |
|---|---|---|
| Entity | Team or organization | acme-ml-team |
| Project | One per ML task | fraud-detection, churn-prediction |
| Run name | Descriptive, unique | resnet50-augmented-lr0.001 |
| Tags | Categorical labels | ["baseline", "v2", "production"] |
| Groups | Cross-validation folds, ablations | "ablation-study-march" |
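The conventions in the table can be applied in one `wandb.init` call. A sketch using the example names from the table; `make_run_name` is a hypothetical helper for the run-name convention.

```python
def make_run_name(architecture: str, variant: str, lr: float) -> str:
    """Build a descriptive, unique run name like 'resnet50-augmented-lr0.001'."""
    return f"{architecture}-{variant}-lr{lr}"

def init_organized_run():
    """Sketch of wandb.init following the table's conventions.
    (wandb is imported lazily so make_run_name stays importable without it.)"""
    import wandb
    return wandb.init(
        entity="acme-ml-team",          # team or organization
        project="fraud-detection",      # one project per ML task
        name=make_run_name("resnet50", "augmented", 0.001),
        tags=["baseline", "v2"],        # categorical labels
        group="ablation-study-march",   # groups related runs in the dashboard
    )
```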
## Production Workflow
1. **Develop & experiment**: Use W&B tracking for all experiments. Compare runs in the dashboard.
2. **Optimize with Sweeps**: Run hyperparameter sweeps on the best approaches from step 1.
3. **Version with Artifacts**: Log the best model and dataset as versioned artifacts.
4. **Register the model**: Promote the best artifact to the Model Registry with a "staging" alias.
5. **Validate & promote**: Run validation tests. If they pass, update the alias to "production".
6. **Monitor in production**: Log inference metrics to W&B to detect model drift.
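Steps 3 through 5 can be sketched with the Artifacts API. This is a sketch, not a drop-in script: the artifact name, project name, and `next_alias` helper are illustrative, and registry link paths vary by entity.

```python
def next_alias(validation_passed: bool) -> str:
    """Decide which alias a model version should carry (step 5)."""
    return "production" if validation_passed else "staging"

def register_best_model(model_path: str, validation_passed: bool):
    """Log the best model as a versioned artifact and alias it (steps 3-5).
    (wandb is imported lazily so next_alias stays testable without it.)"""
    import wandb

    with wandb.init(project="fraud-detection", job_type="registration") as run:
        artifact = wandb.Artifact("fraud-model", type="model")
        artifact.add_file(model_path)
        # New versions start as "staging"; validated versions get "production".
        run.log_artifact(artifact, aliases=[next_alias(validation_passed)])
```

Promotion into the Model Registry itself can then be done from the UI, or programmatically by linking the logged artifact to a registered model.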
## Common Pitfalls
- Forgetting wandb.finish(): Always call it at the end of a run, especially in scripts that run multiple experiments. Use context managers or try/finally blocks.
- Logging too frequently: Logging every batch can create large runs. Log every N steps or at epoch level.
- Inconsistent config keys: Use the same key names across runs so you can compare them.
- Not using groups: Group related runs (CV folds, ablation studies) to keep the dashboard organized.
- Hardcoding API keys: Use environment variables or `wandb login`; never hardcode keys in source code.
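Most of these pitfalls can be avoided with a little structure. A sketch, assuming a hypothetical `should_log` throttle; using `wandb.init` as a context manager calls `finish()` automatically on exit.

```python
def should_log(step: int, every: int = 50) -> bool:
    """Throttle logging to every N steps instead of every batch."""
    return step % every == 0

def train_with_tracking(num_steps: int = 200):
    """Sketch of a run that avoids the pitfalls above.
    (wandb is imported lazily so should_log stays testable without it.)"""
    import wandb

    # Authenticate via the WANDB_API_KEY environment variable or `wandb login`;
    # never hardcode the key in source.
    with wandb.init(project="demo") as run:  # finish() runs on exit, even on error
        for step in range(num_steps):
            loss = 1.0 / (step + 1)  # placeholder metric
            if should_log(step):     # log every N steps, not every batch
                run.log({"train/loss": loss}, step=step)
```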
## Frequently Asked Questions
**Is W&B free for individual researchers?**
Yes. W&B offers free unlimited tracking for individual researchers and academic teams. The free tier includes unlimited runs, 100 GB of storage, and full feature access. Enterprise features (SSO, audit logs, dedicated support) are paid.
**Can I self-host W&B?**
Yes, but only with the Enterprise plan. W&B Server can be deployed on-premises or in your private cloud (AWS, GCP, Azure). For open-source self-hosting, consider MLflow instead.
**How does W&B compare to TensorBoard?**
TensorBoard is free and works offline, but W&B offers stronger collaboration features, persistent cloud storage, hyperparameter sweeps, artifact versioning, and richer run-comparison tools. Many teams use W&B as a TensorBoard replacement for team-level work.