Design ML Solutions (30-35%)

This is the highest-weighted domain on the DP-100 exam. You must understand how to design the Azure ML workspace, choose compute targets, manage data assets, and build pipeline architectures. Master this domain and you are a third of the way to passing.

Azure Machine Learning Workspace

The workspace is the top-level resource for Azure ML. Everything — compute, data, models, endpoints, experiments — lives inside a workspace.

Key Workspace Components

| Component | Purpose | Exam Focus |
|---|---|---|
| Workspace | Top-level container for all ML assets | Creation, RBAC, networking (private endpoints) |
| Resource Group | Azure container for related resources | Organizing workspaces by team/project |
| Storage Account | Default datastore (Blob Storage) | Automatically created with workspace |
| Key Vault | Secrets and credentials management | Connection strings, API keys |
| Application Insights | Monitoring and telemetry | Endpoint monitoring, logging |
| Container Registry | Docker images for environments | Custom environments, deployment images |

```python
# Creating a workspace with Python SDK v2
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Workspace
from azure.identity import DefaultAzureCredential

# Authenticate
credential = DefaultAzureCredential()

# Define workspace
ws = Workspace(
    name="dp100-exam-workspace",
    location="eastus",
    display_name="DP-100 Study Workspace",
    description="Workspace for DP-100 exam preparation",
    tags={"purpose": "certification-prep"}
)

# Create workspace (also creates Storage, Key Vault, App Insights)
ml_client = MLClient(
    credential=credential,
    subscription_id="your-subscription-id",
    resource_group_name="dp100-rg"
)
ml_client.workspaces.begin_create_or_update(ws)
```

Compute Targets

Choosing the right compute is critical for both the exam and real projects. Know when to use each type.

| Compute Type | Use Case | Key Features | Exam Tips |
|---|---|---|---|
| Compute Instance | Development, notebooks | Single VM, Jupyter, VS Code, SSH | One per user, auto-shutdown, not for production |
| Compute Cluster | Training jobs, pipelines | Auto-scaling, multi-node, spot VMs | Min nodes = 0 saves cost, max nodes limits spend |
| Serverless Compute | On-demand training | No cluster management, pay per job | Newer option; no need to pre-provision |
| Kubernetes (AKS) | Production inference | Scalable, GPU support, custom networking | Attach existing AKS cluster to workspace |
| Managed Online Endpoint | Real-time inference | Built-in load balancing, blue-green deployments | Preferred for real-time scoring scenarios |
| Managed Batch Endpoint | Batch scoring | Process large datasets, parallel jobs | Best for offline/scheduled predictions |

```python
# Creating a compute cluster with SDK v2
from azure.ai.ml.entities import AmlCompute

# Define compute cluster
cluster = AmlCompute(
    name="dp100-cluster",
    type="amlcompute",
    size="Standard_DS3_v2",      # VM size
    min_instances=0,              # Scale to zero when idle
    max_instances=4,              # Max parallel nodes
    idle_time_before_scale_down=120,  # Seconds before scale-down
    tier="Dedicated"              # or "LowPriority" for spot VMs
)

ml_client.compute.begin_create_or_update(cluster)
```

Exam Tip: Know the difference between Dedicated and Low Priority (Spot) VMs. Low Priority VMs cost up to 80% less but can be preempted. Use them for fault-tolerant training jobs with checkpointing. Never use them for real-time endpoints.

Data Assets and Datastores

Azure ML uses datastores to connect to storage services and data assets to reference specific datasets.

Datastore Types

  • Azure Blob Storage — Default datastore, best for unstructured data (images, text files)
  • Azure Data Lake Storage Gen2 — Hierarchical namespace, best for large-scale analytics
  • Azure SQL Database — Structured data, direct SQL queries
  • Azure Files — File shares, mountable as network drive

Data Asset Types

| Type | Description | When to Use |
|---|---|---|
| URI File | Points to a single file | Single CSV, Parquet, or model file |
| URI Folder | Points to a folder | Image datasets, multiple files in a directory |
| MLTable | Tabular data with schema | Structured data with column types, transformations |

```python
# Register a data asset (URI File)
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

# CSV file in blob storage
data_asset = Data(
    name="customer-churn-data",
    description="Customer churn dataset for training",
    path="azureml://datastores/workspaceblobstore/paths/data/churn.csv",
    type=AssetTypes.URI_FILE,
    version="1"
)

ml_client.data.create_or_update(data_asset)

# Register an MLTable asset
mltable_asset = Data(
    name="customer-churn-mltable",
    description="Customer churn as MLTable with schema",
    path="azureml://datastores/workspaceblobstore/paths/data/churn-mltable/",
    type=AssetTypes.MLTABLE,
    version="1"
)

ml_client.data.create_or_update(mltable_asset)
```
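An MLTable asset's path points at a folder that must contain a definition file named exactly `MLTable`. A hedged sketch of what that file might look like for the churn example above (file names and read options are illustrative):

```yaml
# MLTable file inside the churn-mltable/ folder
type: mltable
paths:
  - file: ./churn.csv
transformations:
  - read_delimited:
      delimiter: ","
      header: all_files_same_headers
```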

Environments

Environments define the software dependencies for training and deployment. Azure ML supports curated environments (pre-built) and custom environments.

```python
# Working with environments in SDK v2
from azure.ai.ml.entities import Environment

# List environments registered in the workspace
# (curated environments live in the shared "azureml" registry;
# in Studio they appear under Environments > Curated environments)
envs = ml_client.environments.list()
for env in envs:
    if "sklearn" in env.name.lower():
        print(f"{env.name}: {env.version}")

# Create a custom environment: a conda file layered on a base image
custom_env = Environment(
    name="dp100-custom-env",
    description="Custom environment for DP-100 training",
    conda_file="conda.yml",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
    version="1"
)

ml_client.environments.create_or_update(custom_env)
```
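The `conda_file` referenced above is a standard conda specification. One possible `conda.yml`, with illustrative package choices rather than a required set:

```yaml
# Example conda.yml for the custom environment (versions/packages illustrative)
name: dp100-env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pip
  - pip:
      - azureml-mlflow
      - scikit-learn
      - pandas
```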

Pipeline Architecture

Pipelines orchestrate multi-step ML workflows. For the exam, understand pipeline components, data passing, and scheduling.

```python
# Building a pipeline with SDK v2
from azure.ai.ml import dsl, Input, load_component

# Load the step components from YAML definitions
# (file names are illustrative; components can also be defined in code)
prep_component = load_component(source="prep_data.yml")
train_component = load_component(source="train_model.yml")
eval_component = load_component(source="eval_model.yml")

@dsl.pipeline(
    description="DP-100 training pipeline",
    compute="dp100-cluster"
)
def training_pipeline(input_data):
    # Step 1: Data preparation
    prep_step = prep_component(
        raw_data=input_data
    )

    # Step 2: Train model
    train_step = train_component(
        training_data=prep_step.outputs.prepared_data,
        learning_rate=0.01,
        n_estimators=100
    )

    # Step 3: Evaluate model
    eval_step = eval_component(
        model=train_step.outputs.model,
        test_data=prep_step.outputs.test_data
    )

    return {
        "trained_model": train_step.outputs.model,
        "metrics": eval_step.outputs.metrics
    }

# Submit pipeline
pipeline_job = training_pipeline(
    input_data=Input(
        type="uri_file",
        path="azureml://datastores/workspaceblobstore/paths/data/churn.csv"
    )
)

ml_client.jobs.create_or_update(pipeline_job)
```

Practice Questions

Question 1: You need to create a compute resource for a data scientist to run Jupyter notebooks interactively. The compute should automatically stop when not in use to minimize cost. Which compute type should you create?

A. Compute Cluster with min_instances=0
B. Compute Instance with auto-shutdown enabled
C. Managed Online Endpoint
D. Azure Databricks cluster

Answer

B. Compute Instance with auto-shutdown enabled. Compute Instances are designed for individual development work including Jupyter notebooks. Auto-shutdown stops the VM after a period of inactivity. Compute Clusters are for training jobs, not interactive notebook use. Managed Online Endpoints are for inference. Databricks is a separate service not directly managed within Azure ML Studio.

Question 2: Your team needs to train models on large datasets using GPU VMs. The training jobs are fault-tolerant and support checkpointing. You need to minimize cost. What should you configure?

A. Compute Cluster with Dedicated tier and Standard_NC6 VMs
B. Compute Cluster with Low Priority tier and Standard_NC6 VMs
C. Compute Instance with Standard_NC6 VM
D. Serverless Compute with GPU

Answer

B. Compute Cluster with Low Priority tier and Standard_NC6 VMs. Low Priority (Spot) VMs cost up to 80% less than Dedicated VMs. Since the jobs are fault-tolerant with checkpointing, preemption is acceptable. Compute Instances are for development, not batch training. Serverless Compute is an option but Low Priority clusters give more cost control for predictable GPU workloads.

Question 3: You need to register a dataset that consists of thousands of image files stored in Azure Blob Storage. Which data asset type should you use?

A. URI File
B. URI Folder
C. MLTable
D. Azure Open Dataset

Answer

B. URI Folder. URI Folder references a directory of files, making it ideal for image datasets with many files. URI File is for a single file. MLTable is for tabular/structured data with schema definitions. Azure Open Datasets are pre-existing public datasets, not for registering your own data.

Question 4: You are designing a workspace for a team of 10 data scientists. You need to ensure that only team members can access the workspace and that secrets are stored securely. Which two resources are automatically created with the workspace? (Select two.)

A. Azure Key Vault
B. Azure Kubernetes Service
C. Storage Account
D. Azure Container Registry
E. Virtual Network

Answer

A (Azure Key Vault) and C (Storage Account). When you create an Azure ML workspace, Azure automatically provisions a Storage Account (default datastore), Key Vault (secrets), and Application Insights (monitoring). Container Registry is created on-demand when you first build a custom environment or deploy a model. AKS and VNet must be created separately.

Question 5: You need to schedule a training pipeline to run every Monday at 8 AM UTC. Which approach should you use?

A. Create a cron schedule on the pipeline job
B. Use Azure Logic Apps to trigger the pipeline
C. Set up a recurring compute instance
D. Use Azure Data Factory to orchestrate

Answer

A. Create a cron schedule on the pipeline job. Azure ML pipelines natively support cron-based scheduling using the SDK v2 or CLI v2. While Logic Apps and Data Factory can also trigger pipelines, the simplest and most direct approach is the built-in schedule. Compute instances do not have recurring job scheduling.