NCA-AIIO Free Practice Questions: "NVIDIA-Certified Associate AI Infrastructure and Operations"
Your AI cluster is managed using Kubernetes with NVIDIA GPUs. Due to a sudden influx of jobs, your cluster experiences resource overcommitment, where more jobs are scheduled than the available GPU resources can handle. Which strategy would most effectively manage this situation to maintain cluster stability?
Correct answer: A
Your AI team notices that the training jobs on your NVIDIA GPU cluster are taking longer than expected.
Upon investigation, you suspect underutilization of the GPUs. Which monitoring metric is the most critical to determine if the GPUs are being underutilized?
Correct answer: B
Your organization is setting up an AI infrastructure to support a range of AI workloads, including data processing, model training, and inference. The infrastructure needs to be scalable, support distributed training, and handle large datasets efficiently. Which NVIDIA solution would be most suitable for managing and orchestrating this AI infrastructure?
Correct answer: B
You are managing an AI infrastructure using NVIDIA GPUs to train large language models for a social media company. During training, you observe that the GPU utilization is significantly lower than expected, leading to longer training times. Which of the following actions is most likely to improve GPU utilization and reduce training time?
Correct answer: A
A retail company wants to implement an AI-based system to predict customer behavior and personalize product recommendations across its online platform. The system needs to analyze vast amounts of customer data, including browsing history, purchase patterns, and social media interactions. Which approach would be the most effective for achieving these goals?
Correct answer: D
You are tasked with designing a highly available AI data center platform that can continue to operate smoothly even in the event of hardware failures. The platform must support both training and inference workloads with minimal downtime. Which architecture would best meet these requirements?
Correct answer: D
Your team is running an AI inference workload on a Kubernetes cluster with multiple NVIDIA GPUs. You observe that some nodes with GPUs are underutilized, while others are overloaded, leading to inconsistent inference performance across the cluster. Which strategy would most effectively balance the GPU workload across the Kubernetes cluster?
Correct answer: B