Advanced

Practice Exam

20 scenario-based questions covering all topics in this course. Try to answer each question before revealing the answer. Aim for 14+ correct (70%) before scheduling your real exam.

💡
How to use this exam: Read each scenario carefully, choose your answer, then click "Show Answer" to check. Track your score. If you score below 70%, review the relevant lesson before retaking.

Questions 1-5: Core Concepts

📝
Q1: You need to deploy 5 replicas of a model inference server. When you update the model version, all old Pods should be replaced with new ones gradually, with zero downtime. Which resource should you use?

A) StatefulSet
B) DaemonSet
C) Deployment
D) ReplicaSet
Show Answer

C) Deployment. Deployments manage rolling updates, gradually replacing old Pods with new ones. StatefulSets provide stable identities (not needed for stateless inference). DaemonSets run one Pod per node. ReplicaSets do not support rolling updates natively.
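As a sketch, a Deployment like the following (name and image are placeholders) delivers the zero-downtime rolling update described above:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-server        # placeholder name
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0         # never drop below 5 ready Pods during the rollout
      maxSurge: 1               # bring up one new Pod at a time
  selector:
    matchLabels:
      app: inference
  template:
    metadata:
      labels:
        app: inference
    spec:
      containers:
        - name: server
          image: registry.example.com/model-server:v2   # placeholder image
```

Updating the image field triggers the gradual replacement of old Pods with new ones.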

📝
Q2: A Pod in the ml-training namespace is in CrashLoopBackOff status. Which command shows the Pod's recent logs to diagnose the issue?

A) kubectl describe pod <name> -n ml-training
B) kubectl logs <name> -n ml-training --previous
C) kubectl get events -n ml-training
D) All of the above are useful
Show Answer

D) All of the above are useful. kubectl logs --previous shows logs from the crashed container's last run. kubectl describe pod shows events, restart reasons, and status. kubectl get events shows namespace-level events. On the CKA exam, use all three together for comprehensive troubleshooting.

📝
Q3: Which Kubernetes object sets the maximum total CPU and memory that all Pods in a namespace can consume combined?

A) LimitRange
B) ResourceQuota
C) PodDisruptionBudget
D) HorizontalPodAutoscaler
Show Answer

B) ResourceQuota. ResourceQuota limits aggregate resource consumption per namespace. LimitRange sets per-Pod or per-container defaults and maximums. PodDisruptionBudget controls voluntary disruptions. HPA scales replicas based on metrics.
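A minimal ResourceQuota sketch for the namespace in question (the name and the numeric values are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ml-training-quota     # placeholder name
  namespace: ml-training
spec:
  hard:
    requests.cpu: "40"        # sum of all Pod CPU requests in the namespace
    requests.memory: 160Gi
    limits.cpu: "80"          # sum of all Pod CPU limits
    limits.memory: 320Gi
```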

📝
Q4: An inference server container defines requests.memory: 4Gi and limits.memory: 8Gi. The container currently uses 6Gi. What is its status?

A) OOMKilled — it exceeded the limit
B) Running normally — it is between request and limit
C) Throttled — memory is being restricted
D) Pending — insufficient resources
Show Answer

B) Running normally. The container is using 6Gi, which is above the request (4Gi) but below the limit (8Gi). Memory requests are used for scheduling (guaranteed minimum), while limits are the maximum. The container runs normally as long as it stays below the limit. It would be OOMKilled only if it exceeds 8Gi.
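The scenario corresponds to a container spec fragment like this:

```yaml
resources:
  requests:
    memory: 4Gi   # reserved on the node for scheduling
  limits:
    memory: 8Gi   # hard ceiling; exceeding it triggers OOMKill
```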

📝
Q5: You want to store database connection strings and API keys that your training Pods need. The data is sensitive. Which Kubernetes object should you use?

A) ConfigMap
B) Secret
C) PersistentVolume
D) Annotation
Show Answer

B) Secret. Secrets are designed for sensitive data (passwords, API keys, tokens). They are base64-encoded and can be mounted as volumes or exposed as environment variables. ConfigMaps are for non-sensitive configuration. Note: Kubernetes Secrets are not encrypted by default — enable encryption at rest for production clusters.
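A sketch of such a Secret (name and values are placeholders; stringData lets you write plain text, which the API server stores base64-encoded):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: training-credentials            # placeholder name
type: Opaque
stringData:
  DB_CONN: postgres://user:pass@db:5432/features   # placeholder value
  API_KEY: example-key                             # placeholder value
```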

Questions 6-10: GPU Scheduling

📝
Q6: A cluster has 3 GPU nodes, each with 4 NVIDIA A100 GPUs. The NVIDIA device plugin is running. You submit a Pod requesting nvidia.com/gpu: 5. What happens?

A) The Pod is scheduled across two nodes
B) The Pod remains Pending because no single node has 5 GPUs
C) The Pod is scheduled with only 4 GPUs
D) The scheduler automatically adds a second container
Show Answer

B) The Pod remains Pending. A Pod is always scheduled on a single node. Since no node has 5 GPUs, the Pod cannot be scheduled and stays in Pending state. Kubernetes does not split a Pod across nodes. To use more than 4 GPUs, you need distributed training with multiple Pods.
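The request in question is a container-level fragment like:

```yaml
resources:
  limits:
    nvidia.com/gpu: 5   # no single node has 5 GPUs, so the Pod stays Pending
```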

📝
Q7: You taint all GPU nodes with gpu=true:NoSchedule. A web server Deployment (no toleration) tries to schedule on a GPU node. What happens?

A) The web server Pods are scheduled on the GPU nodes
B) The web server Pods remain Pending
C) The web server Pods are scheduled but with a warning
D) The taint is ignored for Deployments
Show Answer

B) The web server Pods remain Pending (assuming only GPU nodes exist). With the NoSchedule taint, Pods without a matching toleration cannot be scheduled on those nodes. If non-GPU nodes exist, the Pods would be scheduled there instead. If only GPU nodes exist, they stay Pending.
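For contrast, a GPU workload that should land on those tainted nodes would carry a toleration like this sketch:

```yaml
tolerations:
  - key: gpu
    operator: Equal
    value: "true"
    effect: NoSchedule   # matches the gpu=true:NoSchedule taint on GPU nodes
```

The web server Deployment has no such toleration, so the taint repels it.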

📝
Q8: Which Kubernetes object type runs the NVIDIA device plugin, ensuring exactly one instance per GPU node?

A) Deployment
B) StatefulSet
C) DaemonSet
D) Job
Show Answer

C) DaemonSet. A DaemonSet ensures exactly one Pod runs on each node (or a subset of nodes). The NVIDIA device plugin runs as a DaemonSet so it discovers and registers GPUs on every GPU-equipped node in the cluster.

📝
Q9: You want to schedule training jobs on nodes with A100 GPUs only, but allow the scheduler to place them on V100 nodes if no A100 nodes are available. Which affinity type should you use?

A) requiredDuringSchedulingIgnoredDuringExecution
B) preferredDuringSchedulingIgnoredDuringExecution
C) requiredDuringSchedulingRequiredDuringExecution
D) nodeSelector
Show Answer

B) preferredDuringSchedulingIgnoredDuringExecution. The preferred affinity is a soft constraint — the scheduler tries to place the Pod on matching nodes but falls back to other nodes if none are available. required is a hard constraint that would leave the Pod Pending if no A100 nodes exist.
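A sketch of the soft constraint (the gpu-type label key and a100 value are assumed node labels, not standard ones):

```yaml
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100                # highest preference for A100 nodes
        preference:
          matchExpressions:
            - key: gpu-type        # assumed node label
              operator: In
              values: ["a100"]
```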

📝
Q10: After deploying the NVIDIA device plugin, kubectl describe node gpu-node-1 shows nvidia.com/gpu: 0 in allocatable resources despite the node having 4 physical GPUs. What is the most likely cause?

A) The GPU driver is not installed on the node
B) The device plugin Pod needs to be restarted
C) The NVIDIA Container Toolkit is not installed
D) Any of the above could be the cause
Show Answer

D) Any of the above could be the cause. The GPU stack requires: (1) NVIDIA drivers installed on the host, (2) NVIDIA Container Toolkit for GPU passthrough to containers, (3) a running device plugin Pod. If any layer fails, GPUs will not appear as allocatable resources. Check nvidia-smi on the node, verify the container runtime config, and check device plugin Pod logs.

Questions 11-15: ML Workloads & Scheduling

📝
Q11: A CronJob runs model retraining every night at midnight. Last night's job is still running at midnight. You want to skip tonight's job. Which concurrencyPolicy achieves this?

A) Allow
B) Forbid
C) Replace
D) Skip
Show Answer

B) Forbid. With Forbid, if the previous Job is still running, the CronJob skips the new run. Allow permits concurrent runs, so both Jobs would execute. Replace kills the running Job and starts a new one. Skip is not a valid concurrencyPolicy value.
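A CronJob sketch for this scenario (name and image are placeholders):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-retrain           # placeholder name
spec:
  schedule: "0 0 * * *"           # every night at midnight
  concurrencyPolicy: Forbid       # skip tonight's run if last night's is still going
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: retrain
              image: registry.example.com/retrain:latest   # placeholder image
```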

📝
Q12: A training Job has backoffLimit: 3 and the training script fails 4 times. What is the final state of the Job?

A) The Job keeps retrying indefinitely
B) The Job is marked as Failed
C) The Job succeeds on the 4th attempt
D) The Job is deleted automatically
Show Answer

B) The Job is marked as Failed. With backoffLimit: 3, the Job allows 3 retries (for a total of 4 attempts including the original). After 4 failures, the Job transitions to a Failed state and no more Pods are created.
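The relevant Job fields, as a sketch (name and image are placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: train-model         # placeholder name
spec:
  backoffLimit: 3           # 3 retries; after the 4th failure the Job is Failed
  template:
    spec:
      restartPolicy: Never  # each failure creates a new Pod, counted against backoffLimit
      containers:
        - name: train
          image: registry.example.com/train:latest   # placeholder image
```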

📝
Q13: You deploy a model serving Pod with initialDelaySeconds: 120 on the readiness probe. What happens during the first 120 seconds?

A) The Pod receives traffic normally
B) The Pod does not receive traffic from the Service
C) The Pod is restarted
D) The Pod is deleted
Show Answer

B) The Pod does not receive traffic from the Service. During the initialDelaySeconds period, the readiness probe has not been checked yet, so the Pod is considered not ready. The Service does not route traffic to non-ready Pods. This gives the model time to load into memory before receiving requests.
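The probe in question looks like this fragment (the /healthz path and port are assumed values for the serving container):

```yaml
readinessProbe:
  httpGet:
    path: /healthz             # assumed health endpoint
    port: 8080                 # assumed serving port
  initialDelaySeconds: 120     # no probe (and no traffic) for the first 2 minutes
  periodSeconds: 10            # then checked every 10 seconds
```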

📝
Q14: You need to run a data preprocessing step before the main training container starts. The preprocessor should download and extract the dataset, then exit. Which Kubernetes feature should you use?

A) Sidecar container
B) Init container
C) PostStart lifecycle hook
D) CronJob
Show Answer

B) Init container. Init containers run before the main containers and must complete successfully before the main containers start. They are perfect for setup tasks like downloading data, populating volumes, or waiting for dependencies. Sidecar containers run alongside the main container (not before).
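A Pod spec sketch for this pattern (images and the download command are placeholders):

```yaml
spec:
  initContainers:
    - name: fetch-dataset      # must exit successfully before the trainer starts
      image: registry.example.com/fetcher:latest       # placeholder image
      command: ["sh", "-c", "download-and-extract /data"]   # placeholder command
      volumeMounts:
        - name: dataset
          mountPath: /data
  containers:
    - name: trainer
      image: registry.example.com/train:latest         # placeholder image
      volumeMounts:
        - name: dataset        # shared volume hands the data to the trainer
          mountPath: /data
  volumes:
    - name: dataset
      emptyDir: {}
```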

📝
Q15: An HPA is configured with minReplicas: 2, maxReplicas: 10, and target CPU utilization of 70%. Current CPU utilization is 35% with 4 replicas running. What will the HPA do?

A) Scale down to 2 replicas
B) Scale down to 3 replicas
C) Keep 4 replicas
D) Scale up to 5 replicas
Show Answer

A) Scale down to 2 replicas. The desired replicas formula is: ceil(currentReplicas x (currentUtilization / targetUtilization)) = ceil(4 x (35/70)) = ceil(4 x 0.5) = ceil(2) = 2. Since 2 equals the minReplicas, the HPA scales down to 2 replicas.
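The HPA in the scenario corresponds to a manifest like this sketch (names are placeholders):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: inference-hpa           # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: inference-server      # placeholder target Deployment
  minReplicas: 2                # floor reached in this scenario
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```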

Questions 16-20: Networking & Storage

📝
Q16: A training Pod writes model checkpoints to an emptyDir volume. The Pod crashes and is restarted by Kubernetes. Are the checkpoints still available?

A) Yes, emptyDir survives Pod restarts
B) No, emptyDir is deleted when the Pod crashes
C) Yes, but only if the Pod is restarted on the same node
D) It depends on the restartPolicy
Show Answer

A) Yes, emptyDir survives Pod restarts. An emptyDir volume's lifetime is tied to the Pod, not the container. When a container crashes and is restarted within the same Pod, the emptyDir data persists. However, if the Pod is deleted (and rescheduled as a new Pod), the emptyDir is lost. For important checkpoints, use PersistentVolumeClaims.
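For checkpoints that must outlive the Pod itself, a PVC sketch (name and size are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: checkpoints             # placeholder name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 50Gi             # example size
```

Mounting this claim instead of an emptyDir keeps checkpoints through Pod deletion and rescheduling.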

📝
Q17: You create a PVC with accessMode: ReadWriteOnce and a Pod mounts it on Node A. A second Pod on Node B tries to mount the same PVC. What happens?

A) Both Pods mount the PVC successfully
B) The second Pod stays Pending until the first Pod releases the PVC
C) The first Pod is evicted to allow the second Pod to mount
D) Both Pods mount in read-only mode
Show Answer

B) The second Pod stays Pending. ReadWriteOnce (RWO) means the volume can be mounted read-write by only one node at a time. Since Node A already has it mounted, the second Pod on Node B cannot use it; depending on the storage driver, it either stays Pending or gets scheduled and hangs in ContainerCreating with a Multi-Attach error. If the second Pod were on Node A, it could potentially mount the volume (RWO is per-node, not per-Pod, in practice).

📝
Q18: You need to create an Ingress that terminates TLS and routes /api/predict to the prediction Service and /api/train to the training Service. How many Ingress resources do you need?

A) Two (one per path)
B) One (with multiple path rules)
C) Three (one per path plus one for TLS)
D) One per backend Service
Show Answer

B) One (with multiple path rules). A single Ingress resource can define multiple path rules under the same host, each routing to different backend Services. TLS configuration is also included in the same Ingress resource under the tls section.
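A single-Ingress sketch (host, TLS Secret name, Service names, and ports are all assumed placeholders):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ml-api                      # placeholder name
spec:
  tls:
    - hosts: ["ml.example.com"]     # assumed host
      secretName: ml-tls-cert       # assumed TLS Secret
  rules:
    - host: ml.example.com
      http:
        paths:
          - path: /api/predict
            pathType: Prefix
            backend:
              service:
                name: prediction    # assumed Service name
                port:
                  number: 80
          - path: /api/train
            pathType: Prefix
            backend:
              service:
                name: training      # assumed Service name
                port:
                  number: 80
```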

📝
Q19: A NetworkPolicy with an empty podSelector: {} and policyTypes: ["Ingress"] with no ingress rules is applied to the ml-training namespace. What effect does this have?

A) All ingress traffic to all Pods in the namespace is allowed
B) All ingress traffic to all Pods in the namespace is denied
C) Only egress traffic is affected
D) The policy has no effect
Show Answer

B) All ingress traffic to all Pods in the namespace is denied. An empty podSelector selects all Pods in the namespace. A policy with policyTypes: ["Ingress"] and no ingress rules means "select all Pods and deny all incoming traffic." This is the default-deny pattern commonly used as a baseline security policy.
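The default-deny policy described above, in full:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: ml-training
spec:
  podSelector: {}     # empty selector = every Pod in the namespace
  policyTypes:
    - Ingress         # Ingress listed but no ingress rules -> deny all inbound
```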

📝
Q20: A StorageClass has volumeBindingMode: WaitForFirstConsumer. When is the PersistentVolume created?

A) When the PVC is created
B) When a Pod that uses the PVC is scheduled
C) When the StorageClass is created
D) When the administrator manually provisions it
Show Answer

B) When a Pod that uses the PVC is scheduled. WaitForFirstConsumer delays volume provisioning until a Pod that references the PVC is scheduled. This ensures the volume is created in the same availability zone as the Pod. With Immediate (default), the volume is created as soon as the PVC is created, which may cause scheduling issues if the volume is in a different zone.
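A StorageClass sketch showing the binding mode (the name is a placeholder; the provisioner shown is the AWS EBS CSI driver, used here only as an example):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                          # placeholder name
provisioner: ebs.csi.aws.com              # example CSI driver
volumeBindingMode: WaitForFirstConsumer   # provision only once a consuming Pod is scheduled
```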

Score Guide

  • 18-20 correct: Excellent! You are well-prepared for the CKA exam.
  • 14-17 correct: Good foundation. Review missed topics and retake in a few days.
  • 10-13 correct: Review core concepts, GPU scheduling, and workload management lessons.
  • Below 10: Spend more time on hands-on practice. Re-read all lessons and try again.