CCA-500 Free Practice Questions: "Cloudera Certified Administrator for Apache Hadoop (CCAH)"
A user comes to you, complaining that when she attempts to submit a Hadoop job, it fails. There is a directory in HDFS
named /data/input. The JAR is named j.jar, and the driver class is named DriverClass.
She runs the command:
hadoop jar j.jar DriverClass /data/input /data/output
The error message returned includes the line:
PriviledgedActionException as:training (auth:SIMPLE)
cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
Input path does not exist: file:/data/input
What is the cause of the error?
Answer: D
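The decisive clue is the file: scheme in the error message: the job resolved /data/input against the local filesystem instead of HDFS, which points to a client-side configuration problem rather than a missing directory. A minimal troubleshooting sketch, assuming a standard Hadoop client; the NameNode URI shown is illustrative:

# Confirm the input directory really exists in HDFS:
hdfs dfs -ls /data/input

# Check which filesystem the client resolves bare paths against.
# If this prints file:/// instead of an hdfs:// URI, core-site.xml on
# the client is not pointing at the cluster:
hdfs getconf -confKey fs.defaultFS
# expected: something like hdfs://namenode.example.com:8020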
You use the hadoop fs -put command to add a file "sales.txt" to HDFS. This file is small enough that it fits into a single
block, which is replicated to three nodes in your cluster (with a replication factor of 3). One of the nodes holding this
file (a single block) fails. How will the cluster handle the replication of the file in this situation?
Answer: A
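For background, HDFS repairs this situation automatically: when the NameNode stops receiving heartbeats from the failed DataNode (roughly ten minutes with default settings), it marks the node dead and schedules a new copy of the block on another live DataNode, restoring the replication factor to 3. A minimal way to observe this, assuming a running cluster; the HDFS target path is illustrative:

# Upload the file as in the question:
hadoop fs -put sales.txt /user/training/sales.txt

# Show the block's replica count and which DataNodes hold it:
hdfs fsck /user/training/sales.txt -files -blocks -locations

# Cluster-wide view of live/dead DataNodes and under-replicated blocks:
hdfs dfsadmin -report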
You need to analyze 60,000,000 images stored in JPEG format, each of which is approximately 25 KB. Because your
Hadoop cluster isn't optimized for storing and processing many small files, you decide to take the following actions:
1. Group the individual images into a set of larger files
2. Use the set of larger files as input for a MapReduce job that processes them directly in Python using Hadoop
Streaming.
Which data serialization system gives you the flexibility to do this?
Answer: A, C
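For reference, SequenceFiles (and Avro container files) are the usual way to pack many small binary files into large, splittable inputs, and Hadoop Streaming can consume them without any Java code. A hedged sketch of step 2, assuming the images have already been packed into SequenceFiles in HDFS; the streaming jar location, the /data/images-packed and /data/image-stats paths, and the analyze.py mapper are all illustrative:

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
  -inputformat org.apache.hadoop.mapred.SequenceFileAsTextInputFormat \
  -input /data/images-packed \
  -output /data/image-stats \
  -mapper analyze.py \
  -file analyze.py

# SequenceFileAsTextInputFormat feeds each record to the Python mapper as a
# tab-separated key/value text line, so the packed files can be processed
# directly with a streaming script.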