Advanced

Practice Exam

30 exam-style questions covering all six domains. Try to answer each question before revealing the explanation. Aim to complete all questions within 60 minutes to simulate exam pacing (the real exam gives 120 minutes for 50–60 questions).

💡
How to Use This Practice Exam: Read each question carefully, select your answer mentally, then scroll to see the explanation. Track your score — aim for 70% or higher. Review every question you get wrong by revisiting the corresponding lesson.

Domain 1: Architecting Low-Code ML Solutions

Q1
A marketing team wants to predict which customers will respond to a promotional campaign. All customer data is stored in BigQuery. The team consists of data analysts who know SQL but have no Python or ML experience. Which approach minimizes complexity while delivering a working model?

A. Export data to CSV and use Vertex AI AutoML Tables
B. Use BigQuery ML to create a logistic regression model
C. Hire an ML engineer to build a custom TensorFlow model
D. Use the Cloud Natural Language API
Answer: B. BigQuery ML allows the SQL-proficient team to build and deploy a model without learning Python or ML frameworks. The data is already in BigQuery, so there is no data movement needed. Logistic regression is appropriate for binary classification (respond vs. not respond). AutoML (A) requires data export. Hiring (C) is unnecessarily complex. NLP API (D) is for text analysis.
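The BigQuery ML approach from this answer can be sketched in a few lines of SQL. Here the statement is composed as a Python string so it is easy to adapt; the project, dataset, table, and column names are hypothetical placeholders, not part of the question.

```python
# Sketch of the Q1 solution: a CREATE MODEL statement for BigQuery ML
# logistic regression. All identifiers below are illustrative -- substitute
# your own project, dataset, and feature columns.
def build_create_model_sql(project: str, dataset: str) -> str:
    """Compose the BigQuery ML DDL for a binary classification model."""
    return f"""
    CREATE OR REPLACE MODEL `{project}.{dataset}.campaign_response_model`
    OPTIONS(
      model_type = 'LOGISTIC_REG',          -- binary: respond vs. not respond
      input_label_cols = ['responded']      -- the label column in the table
    ) AS
    SELECT age, tenure_months, past_purchases, responded
    FROM `{project}.{dataset}.customer_campaign_history`
    """

sql = build_create_model_sql("my-project", "marketing")
```

The analysts run this directly in the BigQuery console; no data leaves BigQuery and no Python is required in production.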
Q2
A company needs to extract text from scanned invoices and classify them by department. They have no labeled training data. Which GCP services should they use?

A. Cloud Vision API (OCR) + Cloud Natural Language API (classification)
B. Document AI
C. AutoML Vision + AutoML Text
D. Vertex AI Custom Training with a BERT model
Answer: B. Document AI is purpose-built for extracting structured data from documents (invoices, receipts, forms). It handles OCR and entity extraction out of the box with pre-trained parsers for common document types like invoices. No labeled data is needed for pre-trained processors. Vision + NLP APIs (A) would work but require more integration. AutoML (C) requires labeled training data they do not have. Custom BERT (D) requires labeled data and ML expertise.
Q3
Your company wants to forecast quarterly revenue for the next 8 quarters. Historical revenue data for the past 5 years is in BigQuery. Which BigQuery ML model type should you use?

A. LINEAR_REG
B. BOOSTED_TREE_REGRESSOR
C. ARIMA_PLUS
D. DNN_REGRESSOR
Answer: C. ARIMA_PLUS is BigQuery ML's time series forecasting model. It handles seasonality, trends, and holiday effects automatically. It is specifically designed for forecasting future values from historical time series data. LINEAR_REG (A) does not handle temporal patterns. BOOSTED_TREE (B) is for tabular classification/regression, not time series. DNN (D) is overkill and does not natively handle time series structure.
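For reference, the ARIMA_PLUS model from this answer looks like the sketch below (again composed as Python strings; the dataset and column names are hypothetical). The `horizon` option controls how many future periods are forecast.

```python
# Q3 as BigQuery ML SQL: train a time series model, then forecast 8 quarters.
# `my-project.finance.*` and the column names are illustrative placeholders.
create_model_sql = """
CREATE OR REPLACE MODEL `my-project.finance.revenue_forecast`
OPTIONS(
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'quarter_start',
  time_series_data_col = 'revenue',
  horizon = 8                               -- forecast the next 8 quarters
) AS
SELECT quarter_start, revenue
FROM `my-project.finance.quarterly_revenue`
"""

forecast_sql = """
SELECT *
FROM ML.FORECAST(MODEL `my-project.finance.revenue_forecast`,
                 STRUCT(8 AS horizon, 0.9 AS confidence_level))
"""
```

ML.FORECAST returns the predicted values with confidence intervals, which is usually what a revenue-planning team wants to chart.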

Domain 2: Collaborating within & across Teams

Q4
A lending company deploys a credit scoring model. Regulators require that the model does not discriminate based on race or gender. Which two GCP tools should you use to ensure fairness? (Select two)

A. Vertex Explainable AI
B. Vertex AI Model Monitoring
C. TensorFlow Fairness Indicators (via What-If Tool)
D. Cloud Data Loss Prevention
E. BigQuery audit logs
Answer: A and C. Vertex Explainable AI (A) shows which features influence each prediction, helping identify if protected attributes affect decisions. Fairness Indicators / What-If Tool (C) allows slicing model performance by demographic groups to detect disparate impact. Model Monitoring (B) detects drift, not fairness. DLP (D) is for sensitive data detection. Audit logs (E) track access, not model behavior.
Q5
Your ML team uses Jupyter notebooks in Vertex AI Workbench. They want to collaborate on notebooks, track experiments, and share results with stakeholders. Which combination of tools should they use?

A. Vertex AI Workbench + Git integration + Vertex AI Experiments
B. Vertex AI Workbench + Cloud Storage for notebook sharing
C. Colab Enterprise + BigQuery for experiment logs
D. Compute Engine VMs with JupyterLab + shared file system
Answer: A. Vertex AI Workbench with Git integration provides version control for notebooks. Vertex AI Experiments tracks parameters, metrics, and artifacts across runs, making it easy to compare and share results. Cloud Storage (B) does not provide version control. BigQuery (C) is not designed for experiment tracking. Manual VMs (D) lack managed features.

Domain 3: Scaling Prototypes into ML Models

Q6
You are training a large language model that does not fit into a single GPU's memory. The model has 10 billion parameters. Which training strategy should you use?

A. Data parallelism with MirroredStrategy
B. Model parallelism across multiple GPUs
C. Train on a TPU v4 pod
D. Reduce model size until it fits in one GPU
Answer: B. When a model does not fit in a single GPU's memory, model parallelism is needed to split the model across GPUs. Data parallelism (A) replicates the full model on each GPU, which would not work if the model is too large. TPU pods (C) could also work, but the question specifically mentions GPUs. Reducing model size (D) changes the model architecture and would likely sacrifice quality.
Q7
Your training job runs on Vertex AI with 4 NVIDIA A100 GPUs and takes 48 hours. The training is fault-tolerant (checkpoints saved every 30 minutes). How can you reduce cost by approximately 60–70%?

A. Use preemptible/spot VMs
B. Switch to CPUs
C. Use a smaller batch size
D. Reduce the number of training epochs
Answer: A. Preemptible/spot VMs cost 60–91% less than regular VMs. Since the job is fault-tolerant with frequent checkpoints, it can resume from the last checkpoint if preempted. CPUs (B) would be dramatically slower, negating any savings. Smaller batch size (C) may increase training time. Fewer epochs (D) may reduce model quality.
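The "fault-tolerant with checkpoints" property that makes spot VMs safe can be shown with a minimal resume loop. This is a local-file sketch of the idea only; a real Vertex AI job would checkpoint to Cloud Storage (for example, the directory Vertex AI passes in the `AIP_CHECKPOINT_DIR` environment variable).

```python
import json
import os
import tempfile

# Preemption-tolerant training in miniature: progress is checkpointed every
# few steps, and a restarted job resumes from the last checkpoint rather
# than from step 0. The "training step" is a placeholder.
def train(checkpoint_path, total_steps, checkpoint_every):
    step = 0
    if os.path.exists(checkpoint_path):          # resume after a preemption
        with open(checkpoint_path) as f:
            step = json.load(f)["step"]
    while step < total_steps:
        step += 1                                # one real training step here
        if step % checkpoint_every == 0:
            with open(checkpoint_path, "w") as f:
                json.dump({"step": step}, f)     # persist progress
    return step

ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
train(ckpt, total_steps=6, checkpoint_every=3)   # first run, then "preempted"
train(ckpt, total_steps=12, checkpoint_every=3)  # restart resumes at step 6
```

Because a preemption costs at most `checkpoint_every` steps of rework (here 30 minutes in the question's setup), the spot discount dominates the occasional restart.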
Q8
A data scientist develops a model in a Jupyter notebook using scikit-learn. The model achieves good results on a small dataset. Now they need to train it on 500 GB of data. What is the recommended migration path?

A. Keep using scikit-learn but add more RAM to the notebook
B. Convert to TensorFlow and use Vertex AI Training with GPUs
C. Package the scikit-learn code and run a Vertex AI Custom Training job with a larger machine
D. Rewrite in PySpark and use Dataproc
Answer: C. Vertex AI Custom Training supports scikit-learn with pre-built containers. Packaging the existing code and running on a larger machine (high-memory instance) is the least disruptive path. Adding RAM to a notebook (A) is limited and not production-grade. Converting to TF (B) requires rewriting. PySpark (D) requires rewriting and is not necessary for scikit-learn workflows.

Domain 4: Serving & Scaling Models

Q9
Your online prediction endpoint receives 100 requests per second during business hours but only 2 requests per second overnight. How should you configure the endpoint to minimize cost?

A. Fixed 10 replicas, 24/7
B. Autoscaling with min replicas = 1 and target CPU utilization = 60%
C. Autoscaling with min replicas = 0 (scale to zero)
D. Two separate endpoints: one for day, one for night
Answer: B. Autoscaling with min replicas = 1 ensures the endpoint is always available (no cold start) while scaling up during peak hours and down overnight. Min replicas = 0 (C) would cause cold start latency. Fixed replicas (A) wastes money overnight. Two endpoints (D) adds unnecessary operational complexity.
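The winning configuration maps onto a handful of deployment parameters. The names below mirror arguments of `Model.deploy()` in the google-cloud-aiplatform SDK (verify them against the SDK version you use); here they are collected in a plain dict for illustration, and the machine type is an assumption.

```python
# Q9 answered as deployment parameters: always one warm replica, scale out
# toward the daytime peak when CPU utilization crosses the target.
deploy_config = {
    "machine_type": "n1-standard-4",            # assumed; size for your model
    "min_replica_count": 1,                     # always-on: no cold starts overnight
    "max_replica_count": 10,                    # headroom for 100 req/s at peak
    "autoscaling_target_cpu_utilization": 60,   # add replicas above 60% CPU
}
```

Overnight, the endpoint idles at one replica (the 2 req/s load); during business hours the autoscaler adds replicas until CPU per replica drops back under the target.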
Q10
A TensorFlow model takes 200ms per prediction on CPU. The SLA requires predictions under 50ms. The model uses dense matrix operations. Which optimization would most likely meet the SLA?

A. Add more CPU replicas
B. Attach a GPU (NVIDIA T4) to the serving machine
C. Convert the model to TensorFlow Lite
D. Increase the batch size for serving
Answer: B. GPUs accelerate dense matrix operations (the core of neural network inference), typically providing 4–10x speedup for TF models. This would bring 200ms down to 20–50ms. More replicas (A) handle throughput, not per-request latency. TF Lite (C) is for mobile/edge devices. Larger batch sizes (D) increase latency for individual requests.
Q11
You need to serve a model that requires custom pre-processing (image resizing and normalization) before inference. The model is a PyTorch model. What is the best approach on Vertex AI?

A. Use a pre-built PyTorch serving container
B. Build a custom serving container with pre-processing logic and push to Artifact Registry
C. Add pre-processing to the client application
D. Use Cloud Functions for pre-processing, then call the endpoint
Answer: B. Custom serving containers let you include pre-processing, inference, and post-processing in a single container. This reduces latency (no network hop) and ensures consistency. Pre-built containers (A) do not support custom pre-processing. Client-side pre-processing (C) creates training-serving skew risk. Cloud Functions (D) add latency and complexity.

Domain 5: Automating & Orchestrating ML Pipelines

Q12
A team has a Vertex AI Pipeline with 5 steps. Step 3 (model training) was recently updated with a new algorithm. Steps 1–2 (data preparation) are unchanged. How can you re-run the pipeline efficiently?

A. Delete the pipeline and create a new one
B. Re-run the full pipeline; step caching will skip unchanged steps automatically
C. Manually run only step 3 in isolation
D. Disable caching and re-run all steps to ensure consistency
Answer: B. Vertex AI Pipelines supports step caching. When you re-run the pipeline, steps with unchanged inputs and code automatically reuse their cached outputs. Steps 1–2 will be cached, and only step 3 onward will execute. Deleting (A) is destructive. Running step 3 alone (C) is not how pipeline orchestration works. Disabling caching (D) wastes time and compute.
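The caching behavior is easy to demystify with a toy version: a step's cache key is derived from its name, code, and inputs, so a step re-executes only when one of those changes. This sketch is illustrative of the idea, not the Vertex AI Pipelines implementation.

```python
import hashlib
import json

# Toy input-based step cache: identical (name, code, inputs) reuses the
# stored output; any change to the code or inputs forces re-execution.
_cache = {}

def run_step(name, fn, inputs):
    key_src = json.dumps(
        {"name": name, "code": fn.__code__.co_code.hex(), "inputs": inputs},
        sort_keys=True,
    )
    key = hashlib.sha256(key_src.encode()).hexdigest()
    if key in _cache:
        return _cache[key]          # cache hit: skip execution entirely
    result = fn(**inputs)           # cache miss: execute and store the output
    _cache[key] = result
    return result

calls = []
def prepare_data(path):
    calls.append(path)              # side effect lets us count executions
    return f"cleaned:{path}"

run_step("prep", prepare_data, {"path": "gs://bucket/raw.csv"})
run_step("prep", prepare_data, {"path": "gs://bucket/raw.csv"})  # cached
```

In the Q12 scenario, steps 1-2 hit the cache (unchanged code and inputs) while step 3's new algorithm changes its key, so step 3 and everything downstream re-run.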
Q13
Your organization wants to automatically retrain a model whenever new data arrives in Cloud Storage. The training should be a Vertex AI Pipeline. What is the simplest trigger mechanism?

A. Cloud Storage trigger → Cloud Function → Vertex AI Pipeline
B. Cloud Scheduler running every hour to check for new files
C. Cloud Composer DAG with a file sensor
D. Eventarc trigger → Vertex AI Pipeline
Answer: A. The documented pattern is a Cloud Storage object-finalized event routed to a Cloud Function, which submits the Vertex AI Pipeline run. Eventarc (D) cannot target Vertex AI Pipelines directly — its destinations are services such as Cloud Run, Cloud Functions, GKE, and Workflows — so some intermediate compute is still required to call the Pipelines API. Cloud Scheduler (B) is polling-based, not event-driven, and wastes runs when no new files arrive. Cloud Composer (C) is overkill for a simple trigger.
Q14
Your ML team needs to track which dataset version, code commit, and hyperparameters produced each model version in production. Which Vertex AI feature provides this?

A. Vertex AI Model Registry
B. Vertex AI Metadata (ML Metadata)
C. Vertex AI TensorBoard
D. Cloud Audit Logs
Answer: B. Vertex AI Metadata (based on ML Metadata) tracks the full lineage of ML artifacts: datasets, models, metrics, and the pipeline runs that produced them. Model Registry (A) stores model versions but not full lineage. TensorBoard (C) visualizes training metrics, not lineage. Audit Logs (D) track API calls, not ML artifacts.

Domain 6: Monitoring ML Solutions

Q15
Your model monitoring shows significant prediction drift (the distribution of model outputs has changed), but no data drift is detected on any input feature. What is the most likely explanation?

A. The monitoring configuration is wrong
B. Concept drift: the relationship between features and target has changed
C. The model was accidentally replaced with a different version
D. The serving infrastructure is malfunctioning
Answer: B. Concept drift occurs when the underlying relationship between inputs and outputs changes, even though the input distributions remain the same. For example, user preferences may shift (same demographics, different buying patterns). This causes prediction drift without data drift. Wrong config (A) and infrastructure issues (D) would cause errors, not systematic drift. Model replacement (C) is possible but less likely than concept drift.
Q16
You want to monitor your model's prediction quality in real time, but ground truth labels are only available after a 30-day delay (e.g., whether a loan defaulted). What should you monitor in the meantime?

A. Prediction accuracy using estimated labels
B. Input feature distributions (data drift) and prediction distributions (prediction drift)
C. Training loss from the last training run
D. GPU utilization of the serving infrastructure
Answer: B. When ground truth labels are delayed, you cannot measure accuracy directly. The best proxy is monitoring input feature distributions and prediction distributions for drift. Significant drift is an early warning signal that model performance may be degrading. Estimated labels (A) would be unreliable. Training loss (C) is static after training. GPU utilization (D) measures infrastructure, not model quality.
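One common way to quantify the drift this answer describes is the Population Stability Index (PSI), which compares a baseline (training-time) distribution to the live serving distribution over the same bins. The thresholds in the comment are a widely used rule of thumb, not a GCP-specific setting.

```python
import math

# Population Stability Index: a scalar drift score between two binned
# distributions. Rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift,
# > 0.25 significant drift worth investigating.
def psi(baseline, current):
    """Both inputs are per-bin proportions that each sum to 1."""
    eps = 1e-6  # avoid log(0) when a bin is empty
    return sum(
        (c - b) * math.log((c + eps) / (b + eps))
        for b, c in zip(baseline, current)
    )

stable = psi([0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25])
drifted = psi([0.25, 0.25, 0.25, 0.25], [0.10, 0.15, 0.25, 0.50])
```

Running the same score over both input features and the prediction distribution gives early warning of degradation long before the 30-day labels arrive.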
Q17
Your model monitoring detects that a feature's distribution has changed significantly. You want to automatically trigger model retraining when this happens. Which GCP architecture should you use?

A. Vertex AI Model Monitoring → Cloud Monitoring alert → Pub/Sub → Cloud Function → Vertex AI Pipeline
B. Vertex AI Model Monitoring → email notification → manual retraining
C. Cloud Scheduler → daily pipeline run regardless of drift
D. Vertex AI Model Monitoring → Cloud Logging → BigQuery
Answer: A. This architecture creates a fully automated drift-triggered retraining loop: monitoring detects drift, alerts route through Pub/Sub, a Cloud Function triggers the retraining pipeline. Email (B) requires human intervention. Scheduled retraining (C) does not respond to drift. Logging to BigQuery (D) is for analysis, not action.
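The Cloud Function link in this chain can be sketched as follows. The alert payload field names (`incident`, `state`, `policy_name`) follow the shape of Cloud Monitoring notifications but should be treated as assumptions; the actual pipeline submission (via the google-cloud-aiplatform SDK) is stubbed out so the decision logic stands alone.

```python
import base64
import json

# Sketch of the drift-triggered retraining handler: decode the Pub/Sub
# message carrying the Cloud Monitoring alert, and launch the retraining
# pipeline only for newly opened incidents. `submit_pipeline` stands in for
# the real PipelineJob submission call.
def handle_drift_alert(event, submit_pipeline=print):
    payload = json.loads(base64.b64decode(event["data"]))
    incident = payload.get("incident", {})
    if incident.get("state") == "open":
        submit_pipeline({
            "pipeline": "retrain",
            "reason": incident.get("policy_name"),
        })
        return True
    return False  # closed/resolved incidents do not trigger retraining
```

Gating on incident state prevents the "alert resolved" notification from kicking off a second, redundant training run.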

Mixed Domain Questions

Q18
You are building an end-to-end ML system for a ride-sharing company to predict surge pricing. The system must: (1) ingest streaming ride requests, (2) compute real-time features (ride density per zone), (3) serve predictions in under 100ms, (4) retrain weekly. Which architecture is correct?

A. Pub/Sub → Dataflow → Feature Store → Vertex AI Endpoint + weekly Vertex Pipeline
B. Pub/Sub → BigQuery → BigQuery ML prediction
C. Pub/Sub → Dataproc → Cloud Storage → Vertex AI Batch Prediction
D. Pub/Sub → Cloud Functions → Firestore → App Engine
Answer: A. This architecture correctly handles all requirements: Pub/Sub for streaming ingestion, Dataflow for real-time feature computation (ride density), Feature Store for serving features with low latency, Vertex AI Endpoint for sub-100ms predictions, and a weekly Vertex Pipeline for retraining. BigQuery (B) is not designed for sub-100ms serving. Batch prediction (C) does not meet real-time requirements. Cloud Functions (D) is not a scalable ML serving solution.
Q19
A healthcare company must comply with HIPAA. They want to train an ML model on patient data. Which two GCP features are essential for compliance? (Select two)

A. Customer-Managed Encryption Keys (CMEK)
B. VPC Service Controls
C. AutoML Tables
D. Cloud CDN
E. BigQuery public datasets
Answer: A and B. CMEK (A) ensures data is encrypted with customer-controlled keys, required for HIPAA. VPC Service Controls (B) create a security perimeter that prevents data from leaving the authorized network, essential for protecting PHI. AutoML (C) is a training method, not a security feature. Cloud CDN (D) is for content delivery. Public datasets (E) are the opposite of compliance.
Q20
You are evaluating two models for a spam detection system. Model A has 95% precision and 80% recall. Model B has 85% precision and 95% recall. Your business requirement is that users should never miss important emails. Which model should you deploy?

A. Model A (higher precision)
B. Model B (higher recall)
C. Model A because 95% precision is better overall
D. Neither — retrain to improve both metrics
Answer: A. "Users should never miss important emails" means minimizing false positives (legitimate emails classified as spam). High precision means when the model says "spam," it is almost certainly spam. Model A's 95% precision means only 5% of emails marked as spam are actually legitimate. Model B's 85% precision would incorrectly filter 15% of flagged emails. In spam detection, precision is critical to avoid losing important emails.
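To make the trade-off concrete, here are the two metrics computed from confusion-matrix counts. The counts are illustrative numbers chosen to roughly match Model A, not figures from the question.

```python
# Precision vs. recall from confusion-matrix counts, and what each error
# type costs a spam-filter user.
def precision(tp, fp):
    return tp / (tp + fp)   # of everything flagged as spam, how much really is

def recall(tp, fn):
    return tp / (tp + fn)   # of all real spam, how much was caught

# Illustrative Model A-like counts:
p = precision(tp=95, fp=5)   # 0.95: only 5% of flagged mail is legitimate
r = recall(tp=95, fn=24)     # ~0.80: some spam still reaches the inbox

# A false positive = a legitimate email buried in the spam folder (the user
# "misses" it). A false negative = one spam message in the inbox. The stated
# requirement makes false positives the expensive error, so precision wins.
```

The same reasoning reverses in, say, cancer screening, where a false negative is the expensive error and recall dominates.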
Q21
Your team has built a recommendation model and wants to measure its real-world business impact before full deployment. The model is deployed on Vertex AI. What should you do?

A. Compare offline evaluation metrics (AUC, precision) against the current model
B. Run an A/B test with traffic splitting on the Vertex AI endpoint, measuring click-through rate
C. Deploy to all users and monitor revenue for one week
D. Use shadow deployment and compare prediction distributions
Answer: B. A/B testing with traffic splitting measures real business impact (click-through rate, revenue) on live users. Offline metrics (A) do not capture business impact. Full deployment (C) is risky. Shadow deployment (D) does not measure user behavior since users only see the old model's recommendations.
Q22
You need to store and serve ML features for both real-time predictions (sub-10ms) and batch training jobs. Features are shared across 5 different ML models. Which GCP service should you use?

A. Cloud Memorystore (Redis)
B. Vertex AI Feature Store
C. Cloud Bigtable
D. Cloud Spanner
Answer: B. Vertex AI Feature Store is purpose-built for this exact use case: it provides low-latency online serving for real-time predictions and bulk offline serving for training, with feature sharing across models. Redis (A) handles online but not offline. Bigtable (C) can handle low-latency reads but lacks ML-specific features like point-in-time lookups and feature sharing. Spanner (D) is for transactional workloads.
Q23
A model trained on English text is deployed globally. Users in Japan report poor performance on Japanese text inputs. What is the root cause and best fix?

A. Data drift — retrain the model with recent data
B. Training-serving skew — fix the feature preprocessing
C. The model was never trained on Japanese text — collect Japanese training data and fine-tune or retrain
D. Model monitoring threshold is too sensitive — adjust the alert
Answer: C. This is a coverage gap, not drift. The model was trained on English text and cannot generalize to Japanese, which has different characters, grammar, and semantics. The fix is to collect labeled Japanese text data and either fine-tune the model or train a separate multilingual model. Retraining with "recent data" (A) would still be English. This is not a skew issue (B). Adjusting thresholds (D) ignores the problem.
Q24
You want to deploy a model that uses both a tabular model (XGBoost) and a text model (BERT) in a single prediction. The tabular model's output is a feature for the BERT model. What is the best serving architecture?

A. Two separate Vertex AI endpoints called sequentially by the client
B. A custom serving container that loads both models and chains predictions
C. A Vertex AI Pipeline that runs both models
D. BigQuery ML for the tabular model and Vertex AI for the BERT model
Answer: B. A custom serving container that loads both models and chains them internally minimizes latency (single network call) and ensures atomic predictions. Two endpoints (A) double the latency and add failure points. Pipelines (C) are for training, not real-time serving. Split across services (D) adds complexity and latency.
Q25
Your model uses a feature "days_since_last_purchase." During training, this feature ranged from 0 to 365. In production, you see values of -5 and 1,200. What should you do?

A. Retrain the model with a wider feature range
B. Add input validation to clip values to [0, 365] before prediction
C. Investigate the data pipeline for bugs, fix the source, and add schema validation
D. Ignore the outliers since the model can handle them
Answer: C. Negative days since last purchase (-5) is physically impossible, indicating a data pipeline bug. Values of 1,200 (3+ years) may also indicate stale or incorrect data. The root cause must be investigated and fixed. Schema validation (using TFDV or ExampleValidator) should be added to catch future anomalies. Clipping (B) masks the bug. Retraining (A) does not fix the source. Ignoring (D) lets bad data through.
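The schema validation this answer calls for can be as simple as a range check at the pipeline boundary. At scale you would express the same constraint in TFDV / the pipeline's ExampleValidator; this sketch (with an assumed feature name matching the question) shows the key design choice — reject and surface anomalies rather than silently clip them.

```python
# Minimal schema check for Q25: rows with out-of-range values are routed to
# an anomalies list for investigation instead of being clipped, so the
# upstream pipeline bug stays visible.
def validate_rows(rows, feature="days_since_last_purchase", lo=0, hi=365):
    valid, anomalies = [], []
    for row in rows:
        (valid if lo <= row[feature] <= hi else anomalies).append(row)
    return valid, anomalies

valid, anomalies = validate_rows([
    {"days_since_last_purchase": 30},
    {"days_since_last_purchase": -5},     # impossible: a pipeline bug
    {"days_since_last_purchase": 1200},   # stale or incorrect source data
])
```

Alerting on a nonzero anomaly count turns a silent data-quality failure into an actionable incident.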
Q26
Your organization wants to implement a feature store but is concerned about latency for online serving. The feature store must serve features in under 5ms. Which underlying storage does Vertex AI Feature Store use for online serving?

A. BigQuery
B. Cloud SQL
C. Bigtable
D. Cloud Memorystore (Redis)
Answer: C. Vertex AI Feature Store uses Bigtable as its online serving backend, which provides single-digit millisecond latency at scale. BigQuery (A) is used for offline serving (batch). Cloud SQL (B) is for relational transactional workloads. Memorystore (D) is a separate service not used by Feature Store internally.
Q27
You are training a model on a dataset with 1 million rows. After splitting 80/10/10 (train/validation/test), the validation accuracy is 98% but test accuracy is 72%. What is the most likely problem?

A. Underfitting
B. Data leakage between training and validation sets
C. The test set is too small
D. The model is not complex enough
Answer: B. High validation accuracy (98%) but much lower test accuracy (72%) strongly indicates data leakage. Information from the training set has leaked into the validation set, making validation results artificially high. This commonly happens with improper time-based splitting, duplicate records, or features derived from the target. Underfitting (A) would show low accuracy on both. Test set size (C) is 100K rows, which is adequate. The model is already complex enough (D) given 98% validation accuracy.
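One of the quickest leakage checks for this symptom is looking for identical records shared between splits, a frequent cause of inflated validation scores. The row format here (hashable tuples) and the example records are assumptions for illustration.

```python
# Quick leakage check for Q27: find exact-duplicate records that appear in
# both the training and validation splits.
def split_overlap(train_rows, val_rows):
    return set(train_rows) & set(val_rows)

train = [("alice", 34, 1), ("bob", 51, 0)]
val = [("bob", 51, 0), ("carol", 29, 1)]
leaked = split_overlap(train, val)   # "bob" was duplicated across splits
```

Exact duplicates are only the easiest case — also audit for time-based leakage (validation rows older than training rows) and for features computed from the target.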
Q28
Your team needs to deploy a new model version. The current model serves 10,000 requests per second. You cannot afford any downtime or accuracy regression. Which deployment strategy is safest?

A. Direct replacement (undeploy old, deploy new)
B. Canary deployment: 5% traffic to new model, gradually increase over 2 weeks
C. Blue-green deployment with instant switch
D. A/B test with 50/50 traffic split
Answer: B. Canary deployment is the safest strategy: it exposes only 5% of traffic to the new model initially. If any issues arise, rollback affects only 5% of users. Gradual increase over 2 weeks provides ample time to detect problems. Direct replacement (A) risks all traffic. Blue-green (C) with instant switch is all-or-nothing. A/B 50/50 (D) exposes half the traffic immediately, which is risky at 10K RPS.
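Canary routing boils down to a weighted split over model versions; on a Vertex AI endpoint the same idea is expressed as a traffic-split map over deployed model IDs. The model names below are hypothetical.

```python
import random

# Canary routing in miniature: weighted random choice sending ~5% of traffic
# to the new model. Increasing the canary's weight over time is the "gradual
# rollout" from Q28.
def route(weights, rng=random.random):
    r = rng() * sum(weights.values())
    for model, w in weights.items():
        r -= w
        if r < 0:
            return model
    return model  # fallback for floating-point edge cases

traffic_split = {"model-v1": 95, "model-v2-canary": 5}
chosen = route(traffic_split)
```

If monitoring on the canary slice looks healthy, the weights shift (95/5 → 80/20 → 50/50 → 0/100); if not, setting the canary weight to 0 is an instant rollback affecting only the canary cohort.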
Q29
You are building a data pipeline for ML training. The raw data contains PII (names, email addresses) that should not be used as features. Which GCP service should you use to detect and redact PII before training?

A. Cloud Data Loss Prevention (DLP) API
B. Cloud IAM
C. Vertex AI Feature Store
D. Cloud Key Management Service (KMS)
Answer: A. Cloud DLP API automatically detects PII (names, emails, phone numbers, SSNs) and can redact, mask, or tokenize it. Integrate DLP into your Dataflow pipeline before data reaches the training pipeline. Cloud IAM (B) controls access, not data content. Feature Store (C) stores features but does not detect PII. KMS (D) manages encryption keys, not PII detection.
Q30
Your model monitoring shows that one specific feature ("promo_code") has high attribution drift — it has become the most important feature for predictions, whereas it was previously a minor feature. The model's overall accuracy has not changed. What should you do?

A. Nothing — accuracy is fine
B. Remove the "promo_code" feature and retrain
C. Investigate whether promo_code is leaking target information or causing the model to learn a shortcut
D. Add more features to dilute the importance of promo_code
Answer: C. A sudden change in feature importance is a red flag. If promo_code became dominant, it may be leaking information about the target (e.g., only people who convert get promo codes, creating a proxy for the label). This creates a model that looks accurate but makes decisions for the wrong reasons and will fail when the promo campaign ends. Always investigate attribution drift before taking action. Ignoring (A) is risky. Removing blindly (B) does not address the root cause. Adding features (D) is a workaround, not a solution.

Score Yourself

📊
Scoring Guide:
  • 27–30 correct (90%+): Excellent — you are ready for the exam
  • 21–26 correct (70–89%): Good — review the domains where you made mistakes
  • 15–20 correct (50–69%): Needs work — re-read lessons 2–6 and retake this practice exam
  • Below 15 (<50%): Start from lesson 1 and work through each lesson carefully