
Exam Question Answers and Brain Dumps: AWS-Certified-Data-Analytics-Specialty Exam Question Set PDF
Free Download: Amazon AWS-Certified-Data-Analytics-Specialty Real Exam Questions
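Amazon DAS-C01 is a professional-level certification exam that validates the skills and knowledge required to design, build, and maintain data analytics solutions on the Amazon Web Services (AWS) platform. The certification is intended for individuals who have a strong understanding of data analytics concepts and experience with the AWS services related to data analytics.
Question # 99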
An online retail company with millions of users around the globe wants to improve its ecommerce analytics capabilities. Currently, clickstream data is uploaded directly to Amazon S3 as compressed files. Several times each day, an application running on Amazon EC2 processes the data and makes search options and reports available for visualization by editors and marketers. The company wants to make website clicks and aggregated data available to editors and marketers in minutes to enable them to connect with users more effectively.
Which options will help meet these requirements in the MOST efficient way? (Choose two.)
- A. Use Amazon Kinesis Data Firehose to upload compressed and batched clickstream records to Amazon Elasticsearch Service.
- B. Use Amazon Elasticsearch Service deployed on Amazon EC2 to aggregate, filter, and process the data. Refresh content performance dashboards in near-real time.
- C. Upload clickstream records from Amazon S3 to Amazon Kinesis Data Streams and use a Kinesis Data Streams consumer to send records to Amazon Elasticsearch Service.
- D. Upload clickstream records to Amazon S3 as compressed files. Then use AWS Lambda to send data to Amazon Elasticsearch Service from Amazon S3.
- E. Use Kibana to aggregate, filter, and visualize the data stored in Amazon Elasticsearch Service. Refresh content performance dashboards in near-real time.
Correct answer: A, E
Question # 100
A bank wants to migrate a Teradata data warehouse to the AWS Cloud. The bank needs a solution for reading large amounts of data and requires the highest possible performance. The solution also must maintain the separation of storage and compute. Which solution meets these requirements?
- A. Use Amazon Athena to query the data in Amazon S3
- B. Use PrestoDB on Amazon EMR to query the data in Amazon S3
- C. Use Amazon Redshift with dense compute nodes to query the data in Amazon Redshift managed storage
- D. Use Amazon Redshift with RA3 nodes to query the data in Amazon Redshift managed storage
Correct answer: D
Question # 101
A company hosts its analytics solution on premises. The analytics solution includes a server that collects log files. The analytics solution uses an Apache Hadoop cluster to analyze the log files hourly and to produce output files. All the files are archived to another server for a specified duration.
The company is expanding globally and plans to move the analytics solution to multiple AWS Regions in the AWS Cloud. The company must adhere to the data archival and retention requirements of each country where the data is stored.
Which solution will meet these requirements?
- A. Create an Amazon S3 bucket in one Region to collect the log files. Use S3 event notifications to invoke an AWS Glue job for log analysis. Store the output files in the target S3 bucket. Use S3 Lifecycle rules on the target S3 bucket to set an expiration period that meets the retention requirements of the country that contains the Region.
- B. Create an Amazon Kinesis Data Firehose delivery stream in each Region to collect log data. Specify an Amazon S3 bucket in each Region as the destination. Use S3 Storage Lens for data analysis. Use S3 Lifecycle rules on each destination S3 bucket to set an expiration period that meets the retention requirements of the country that contains the Region.
- C. Create an Amazon S3 bucket in each Region to collect log files. Create an Amazon EMR cluster. Submit steps on the EMR cluster for analysis. Store the output files in a target S3 bucket in each Region. Use S3 Lifecycle rules on each target S3 bucket to set an expiration period that meets the retention requirements of the country that contains the Region.
- D. Create a Hadoop Distributed File System (HDFS) file system on an Amazon EMR cluster in one Region to collect the log files. Set up a bootstrap action on the EMR cluster to run an Apache Spark job. Store the output files in a target Amazon S3 bucket. Schedule a job on one of the EMR nodes to delete files that no longer need to be retained.
Correct answer: C
Question # 102
A company plans to store quarterly financial statements in a dedicated Amazon S3 bucket. The financial statements must not be modified or deleted after they are saved to the S3 bucket.
Which solution will meet these requirements?
- A. Create the S3 bucket with S3 Object Lock in governance mode.
- B. Create the S3 bucket with MFA delete enabled.
- C. Create S3 buckets in two AWS Regions. Use S3 Cross-Region Replication (CRR) between the buckets.
- D. Create the S3 bucket with S3 Object Lock in compliance mode.
Correct answer: D
Explanation:
This solution meets the requirements because:
S3 Object Lock is a feature in Amazon S3 that allows users and businesses to store files in a highly secure, tamper-proof way. It's used for situations in which businesses must be able to prove that data has not been modified or destroyed after it was written, and it relies on a model known as write once, read many (WORM).
S3 Object Lock provides two ways to manage object retention: retention periods and legal holds. A retention period specifies a fixed period of time during which an object remains locked. A legal hold provides the same protection as a retention period, but it has no expiration date.
S3 Object Lock has two retention modes: governance mode and compliance mode. Governance mode allows users with specific IAM permissions to overwrite or delete an object version before its retention period expires. Compliance mode prevents anyone, including the root user of the account that owns the bucket, from overwriting or deleting an object version or altering its lock settings until the retention period expires.
By creating the S3 bucket with S3 Object Lock in compliance mode, the company can ensure that the quarterly financial statements are stored in a WORM model and cannot be modified or deleted by anyone until the retention period expires or the legal hold is removed. This can help meet regulatory requirements that require WORM storage, or add another layer of protection against object changes and deletion.
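The difference between the two retention modes can be sketched as a small toy model. This is only an illustration of the rules described above, not S3's implementation: `LockedVersion` and `can_delete_version` are hypothetical names, and the 7-year retention value in the usage note is an assumed example. The dictionary returned by `default_retention_config` follows the shape of a bucket-level default retention rule.

```python
from dataclasses import dataclass

# IAM permission that lets a caller bypass governance-mode retention.
BYPASS_PERMISSION = "s3:BypassGovernanceRetention"


@dataclass(frozen=True)
class LockedVersion:
    """Simplified view of one locked object version (illustrative only)."""
    mode: str                  # "GOVERNANCE" or "COMPLIANCE"
    retention_expired: bool    # True once the retain-until date has passed
    legal_hold: bool = False   # a legal hold has no expiration date


def can_delete_version(version: LockedVersion, caller_permissions: set) -> bool:
    """Return True if this toy model would allow deleting the version."""
    if version.legal_hold:
        return False  # a legal hold blocks deletion until it is removed
    if version.retention_expired:
        return True   # protection ends when the retention period ends
    if version.mode == "GOVERNANCE":
        # Governance mode can be bypassed by callers holding the permission.
        return BYPASS_PERMISSION in caller_permissions
    return False      # compliance mode: nobody can delete before expiry


def default_retention_config(mode: str, years: int) -> dict:
    """Build a bucket-level default retention rule for Object Lock."""
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": mode, "Years": years}},
    }
```

For example, `can_delete_version(LockedVersion("GOVERNANCE", False), {BYPASS_PERMISSION})` is allowed in this model, while the same call with `"COMPLIANCE"` is not, which is why governance mode does not satisfy a strict "must not be modified or deleted" requirement. A call such as `default_retention_config("COMPLIANCE", 7)` shows the kind of rule a bucket would carry.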
Question # 103
A company wants to use a data lake that is hosted on Amazon S3 to provide analytics services for historical data. The data lake consists of 800 tables but is expected to grow to thousands of tables. More than 50 departments use the tables, and each department has hundreds of users. Different departments need access to specific tables and columns.
Which solution will meet these requirements with the LEAST operational overhead?
- A. Create an IAM role for each department. Use AWS Lake Formation based access control to grant each IAM role access to specific tables and columns. Use Amazon Athena to analyze the data.
- B. Create an Amazon EMR cluster for each department. Configure an IAM service role for each EMR cluster to access relevant S3 files. For each department's users, create an IAM role that provides access to the relevant EMR cluster. Use Amazon EMR to analyze the data.
- C. Create an IAM role for each department. Use AWS Lake Formation tag-based access control to grant each IAM role access to only the relevant resources. Create LF-tags that are attached to tables and columns. Use Amazon Athena to analyze the data.
- D. Create an Amazon Redshift cluster for each department. Use AWS Glue to ingest into the Redshift cluster only the tables and columns that are relevant to that department. Create Redshift database users. Grant the users access to the relevant department's Redshift cluster. Use Amazon Redshift to analyze the data.
Correct answer: C
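The reason tag-based access control keeps overhead low can be sketched as follows: each department needs one grant against an LF-tag expression, no matter how many tables later carry that tag. This is a minimal sketch that only builds request payloads in the shape of a Lake Formation permission grant. The tag key `department`, the account ID, and the role names are assumptions made for illustration.

```python
def lf_tag_grant(role_arn: str, tag_key: str, tag_values: list, permissions: list) -> dict:
    """Build one grant that targets every table matching an LF-tag expression."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {
            "LFTagPolicy": {
                "ResourceType": "TABLE",
                "Expression": [{"TagKey": tag_key, "TagValues": tag_values}],
            }
        },
        "Permissions": permissions,
    }


# Hypothetical department roles; a real deployment would have 50 or more.
DEPARTMENT_ROLES = {
    "finance": "arn:aws:iam::111122223333:role/finance-analysts",
    "marketing": "arn:aws:iam::111122223333:role/marketing-analysts",
}

# One grant per department, independent of the number of tagged tables.
GRANTS = [
    lf_tag_grant(role_arn, "department", [department], ["SELECT"])
    for department, role_arn in DEPARTMENT_ROLES.items()
]
```

New tables only need the right LF-tag attached; the grants in `GRANTS` do not have to change as the data lake grows from hundreds to thousands of tables.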
Question # 104
A company currently uses Amazon Athena to query its global datasets. The regional data is stored in Amazon S3 in the us-east-1 and us-west-2 Regions. The data is not encrypted. To simplify the query process and manage it centrally, the company wants to use Athena in us-west-2 to query data from Amazon S3 in both Regions. The solution should be as low-cost as possible.
What should the company do to achieve this goal?
- A. Run the AWS Glue crawler in us-west-2 to catalog datasets in all Regions. Once the data is crawled, run Athena queries in us-west-2.
- B. Enable cross-Region replication for the S3 buckets in us-east-1 to replicate data in us-west-2. Once the data is replicated in us-west-2, run the AWS Glue crawler there to update the AWS Glue Data Catalog in us-west-2 and run Athena queries.
- C. Update AWS Glue resource policies to provide us-east-1 AWS Glue Data Catalog access to us-west-2. Once the catalog in us-west-2 has access to the catalog in us-east-1, run Athena queries in us-west-2.
- D. Use AWS DMS to migrate the AWS Glue Data Catalog from us-east-1 to us-west-2. Run Athena queries in us-west-2.
Correct answer: B
Question # 105
A mortgage company has a microservice for accepting payments. This microservice uses the Amazon DynamoDB encryption client with AWS KMS managed keys to encrypt the sensitive data before writing the data to DynamoDB. The finance team should be able to load this data into Amazon Redshift and aggregate the values within the sensitive fields. The Amazon Redshift cluster is shared with other data analysts from different business units.
Which steps should a data analyst take to accomplish this task efficiently and securely?
- A. Create an AWS Lambda function to process the DynamoDB stream. Save the output to a restricted S3 bucket for the finance team. Create a finance table in Amazon Redshift that is accessible to the finance team only. Use the COPY command with the IAM role that has access to the KMS key to load the data from S3 to the finance table.
- B. Create an Amazon EMR cluster with an EMR_EC2_DefaultRole role that has access to the KMS key. Create Apache Hive tables that reference the data stored in DynamoDB and the finance table in Amazon Redshift. In Hive, select the data from DynamoDB and then insert the output to the finance table in Amazon Redshift.
- C. Create an Amazon EMR cluster. Create Apache Hive tables that reference the data stored in DynamoDB. Insert the output to the restricted Amazon S3 bucket for the finance team. Use the COPY command with the IAM role that has access to the KMS key to load the data from Amazon S3 to the finance table in Amazon Redshift.
- D. Create an AWS Lambda function to process the DynamoDB stream. Decrypt the sensitive data using the same KMS key. Save the output to a restricted S3 bucket for the finance team. Create a finance table in Amazon Redshift that is accessible to the finance team only. Use the COPY command to load the data from Amazon S3 to the finance table.
Correct answer: A
Question # 106
A large ride-sharing company has thousands of drivers globally serving millions of unique customers every day. The company has decided to migrate an existing data mart to Amazon Redshift. The existing schema includes the following tables.
A trips fact table for information on completed rides.
A drivers dimension table for driver profiles.
A customers fact table holding customer profile information.
The company analyzes trip details by date and destination to examine profitability by region. The drivers data rarely changes. The customers data frequently changes.
What table design provides optimal query performance?
- A. Use DISTSTYLE EVEN for the drivers table and sort by date. Use DISTSTYLE ALL for both fact tables.
- B. Use DISTSTYLE KEY (destination) for the trips table and sort by date. Use DISTSTYLE ALL for the drivers table. Use DISTSTYLE EVEN for the customers table.
- C. Use DISTSTYLE KEY (destination) for the trips table and sort by date. Use DISTSTYLE ALL for the drivers and customers tables.
- D. Use DISTSTYLE EVEN for the trips table and sort by date. Use DISTSTYLE ALL for the drivers table. Use DISTSTYLE EVEN for the customers table.
Correct answer: B
Explanation:
https://www.matillion.com/resources/blog/aws-redshift-performance-choosing-the-right-distribution-styles/#:~:te
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-best-dist-key.html
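The table design in the selected answer can be sketched as Redshift DDL. The snippet below only assembles the statements as strings; every column name and type is an assumption, since the question names the tables but not their columns.

```python
def create_table(name: str, columns: list, table_attributes: str) -> str:
    """Assemble a CREATE TABLE statement from (column, type) pairs."""
    column_sql = ",\n    ".join(f"{col} {sql_type}" for col, sql_type in columns)
    return f"CREATE TABLE {name} (\n    {column_sql}\n)\n{table_attributes};"


DDL = {
    # Large fact table: distribute on destination, sort by date for range scans.
    "trips": create_table(
        "trips",
        [("trip_id", "BIGINT"), ("trip_date", "DATE"), ("destination", "VARCHAR(64)"),
         ("driver_id", "BIGINT"), ("customer_id", "BIGINT"), ("fare", "DECIMAL(10, 2)")],
        "DISTSTYLE KEY\nDISTKEY (destination)\nSORTKEY (trip_date)",
    ),
    # Rarely changing table: a full copy on every node avoids redistribution in joins.
    "drivers": create_table(
        "drivers",
        [("driver_id", "BIGINT"), ("driver_name", "VARCHAR(128)")],
        "DISTSTYLE ALL",
    ),
    # Frequently changing table: EVEN avoids rewriting a full copy on every node.
    "customers": create_table(
        "customers",
        [("customer_id", "BIGINT"), ("customer_name", "VARCHAR(128)")],
        "DISTSTYLE EVEN",
    ),
}
```

Printing `DDL["trips"]` shows the full statement. The key trade-off is that DISTSTYLE ALL multiplies write cost by the number of nodes, so it suits the drivers table but not the frequently updated customers table.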
Question # 107
An online retailer needs to deploy a product sales reporting solution. The source data is exported from an external online transaction processing (OLTP) system for reporting. Roll-up data is calculated each day for the previous day's activities. The reporting system has the following requirements:
Have the daily roll-up data readily available for 1 year.
After 1 year, archive the daily roll-up data for occasional but immediate access.
The source data exports stored in the reporting system must be retained for 5 years. Query access will be needed only for re-evaluation, which may occur within the first 90 days.
Which combination of actions will meet these requirements while keeping storage costs to a minimum? (Choose two.)
- A. Store the source data initially in the Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class. Apply a lifecycle configuration that changes the storage class to Amazon S3 Glacier Deep Archive 90 days after creation, and then deletes the data 5 years after creation.
- B. Store the daily roll-up data initially in the Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class. Apply a lifecycle configuration that changes the storage class to Amazon S3 Glacier 1 year after data creation.
- C. Store the source data initially in the Amazon S3 Glacier storage class. Apply a lifecycle configuration that changes the storage class from Amazon S3 Glacier to Amazon S3 Glacier Deep Archive 90 days after creation, and then deletes the data 5 years after creation.
- D. Store the daily roll-up data initially in the Amazon S3 Standard storage class. Apply a lifecycle configuration that changes the storage class to Amazon S3 Glacier Deep Archive 1 year after data creation.
- E. Store the daily roll-up data initially in the Amazon S3 Standard storage class. Apply a lifecycle configuration that changes the storage class to Amazon S3 Standard-Infrequent Access (S3 Standard-IA) 1 year after data creation.
Correct answer: A, E
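The two selected options map onto two S3 Lifecycle rules. The sketch below builds a lifecycle configuration in the shape S3 accepts. The prefixes `source/` and `rollup/` and the rule IDs are assumptions, and five years is approximated as 5 x 365 days. The initial storage class is chosen at upload time and is not part of the lifecycle rules.

```python
FIVE_YEARS_IN_DAYS = 5 * 365  # approximation that ignores leap days


def source_export_rule(prefix: str) -> dict:
    """Source exports: Deep Archive after 90 days, deleted after 5 years."""
    return {
        "ID": "source-exports",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": 90, "StorageClass": "DEEP_ARCHIVE"}],
        "Expiration": {"Days": FIVE_YEARS_IN_DAYS},
    }


def rollup_rule(prefix: str) -> dict:
    """Daily roll-ups: Standard-IA after 1 year, still immediately readable."""
    return {
        "ID": "daily-rollups",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": 365, "StorageClass": "STANDARD_IA"}],
    }


LIFECYCLE_CONFIGURATION = {
    "Rules": [source_export_rule("source/"), rollup_rule("rollup/")]
}
```

The roll-up rule has no expiration because the question states no deletion requirement for that data. Standard-IA keeps millisecond access, which is what "occasional but immediate access" calls for, whereas the Glacier classes offered in the other options require a restore.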
Question # 108
A large media company is looking for a cost-effective storage and analysis solution for its daily media recordings formatted with embedded metadata. Daily data sizes range between 10-12 TB with stream analysis required on timestamps, video resolutions, file sizes, closed captioning, audio languages, and more. Based on the analysis, processing the datasets is estimated to take between 30-180 minutes depending on the underlying framework selection. The analysis will be done by using business intelligence (BI) tools that can be connected to data sources with AWS or Java Database Connectivity (JDBC) connectors.
Which solution meets these requirements?
- A. Store the video files in Amazon S3 and use AWS Lambda to extract the metadata from the files and load it to Amazon S3. Use Amazon Athena to provide the data to be analyzed by the BI tools.
- B. Store the video files in Amazon DynamoDB and use Amazon EMR to extract the metadata from the files and load it to Apache Hive. Use Apache Hive to provide the data to be analyzed by the BI tools.
- C. Store the video files in Amazon S3 and use AWS Glue to extract the metadata from the files and load it to Amazon Redshift. Use Amazon Redshift to provide the data to be analyzed by the BI tools.
- D. Store the video files in Amazon DynamoDB and use AWS Lambda to extract the metadata from the files and load it to DynamoDB. Use DynamoDB to provide the data to be analyzed by the BI tools.
Correct answer: A
Question # 109
A company is hosting an enterprise reporting solution with Amazon Redshift. The application provides reporting capabilities to three main groups: an executive group to access financial reports, a data analyst group to run long-running ad-hoc queries, and a data engineering group to run stored procedures and ETL processes.
The executive team requires queries to run with optimal performance. The data engineering team expects queries to take minutes.
Which Amazon Redshift feature meets the requirements for this task?
- A. Workload management (WLM)
- B. Short query acceleration (SQA)
- C. Concurrency scaling
- D. Materialized views
Correct answer: D
Explanation:
Materialized views:
Question # 110
A manufacturing company uses Amazon S3 to store its data. The company wants to use AWS Lake Formation to provide granular-level security on those data assets. The data is in Apache Parquet format. The company has set a deadline for a consultant to build a data lake.
How should the consultant create the MOST cost-effective solution that meets these requirements?
- A. Create multiple IAM roles for different users and groups. Assign IAM roles to different data assets in Amazon S3 to create table-based and column-based access controls.
- B. Run Lake Formation blueprints to move the data to Lake Formation. Once Lake Formation has the data, apply permissions on Lake Formation.
- C. To create the data catalog, run an AWS Glue crawler on the existing Parquet data. Register the Amazon S3 path and then apply permissions through Lake Formation to provide granular-level security.
- D. Install Apache Ranger on an Amazon EC2 instance and integrate with Amazon EMR. Using Ranger policies, create role-based access control for the existing data assets in Amazon S3.
Correct answer: B
Explanation:
https://aws.amazon.com/blogs/big-data/building-securing-and-managing-data-lakes-with-aws-lake-formation/
Question # 111
A manufacturing company has been collecting IoT sensor data from devices on its factory floor for a year and is storing the data in Amazon Redshift for daily analysis. A data analyst has determined that, at an expected ingestion rate of about 2 TB per day, the cluster will be undersized in less than 4 months. A long-term solution is needed. The data analyst has indicated that most queries only reference the most recent 13 months of data, yet there are also quarterly reports that need to query all the data generated from the past 7 years. The chief technology officer (CTO) is concerned about the costs, administrative effort, and performance of a long-term solution.
Which solution should the data analyst use to meet these requirements?
- A. Create a daily job in AWS Glue to UNLOAD records older than 13 months to Amazon S3 and delete those records from Amazon Redshift. Create an external table in Amazon Redshift to point to the S3 location. Use Amazon Redshift Spectrum to join to data that is older than 13 months.
- B. Take a snapshot of the Amazon Redshift cluster. Restore the cluster to a new cluster using dense storage nodes with additional storage capacity.
- C. Execute a CREATE TABLE AS SELECT (CTAS) statement to move records that are older than 13 months to quarterly partitioned data in Amazon Redshift Spectrum backed by Amazon S3.
- D. Unload all the tables in Amazon Redshift to an Amazon S3 bucket using S3 Intelligent-Tiering. Use AWS Glue to crawl the S3 bucket location to create external tables in an AWS Glue Data Catalog. Create an Amazon EMR cluster using Auto Scaling for any daily analytics needs, and use Amazon Athena for the quarterly reports, with both using the same AWS Glue Data Catalog.
Correct answer: A
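The daily archival step in the selected answer can be sketched as follows: compute the 13-month cutoff, then build the UNLOAD statement that writes older rows to Amazon S3 as Parquet. The table name, the `reading_date` column, the bucket, and the role ARN are assumptions made for illustration. The sketch only produces the SQL text and does not connect to a cluster.

```python
from datetime import date


def months_before(day: date, months: int) -> date:
    """Return the first day of the month that lies `months` months before `day`."""
    index = day.year * 12 + (day.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def unload_statement(cutoff: date, table: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Build an UNLOAD statement for rows older than the cutoff date."""
    # Single quotes inside the UNLOAD query text must be doubled.
    select = f"SELECT * FROM {table} WHERE reading_date < ''{cutoff.isoformat()}''"
    return (
        f"UNLOAD ('{select}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET;"
    )


# Hypothetical values for one daily run.
CUTOFF = months_before(date(2024, 3, 15), 13)
STATEMENT = unload_statement(
    CUTOFF,
    "sensor_readings",
    "s3://example-iot-archive/sensor_readings/",
    "arn:aws:iam::111122223333:role/redshift-unload",
)
```

After the unload succeeds, the same cutoff would drive the DELETE on the local table, and an external table over the S3 prefix makes the archived rows available to Redshift Spectrum for the quarterly reports.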
Question # 112
A company collects and transforms data files from third-party providers by using an on-premises SFTP server. The company uses a Python script to transform the data.
The company wants to reduce the overhead of maintaining the SFTP server and storing large amounts of data on premises. However, the company does not want to change the existing upload process for the third-party providers.
Which solution will meet these requirements with the LEAST development effort?
- A. Use AWS Transfer Family to create an SFTP server that includes a publicly accessible endpoint. Configure the new server to use Amazon S3 storage. Change the server name to match the name of the on-premises SFTP server. Schedule a Python shell job in AWS Glue to use the existing Python script to run periodically and transform the uploaded files.
- B. Create an Amazon S3 bucket that includes a separate prefix for each provider. Provide the S3 URL to each provider for its respective prefix. Instruct the providers to use the S3 COPY command to upload data. Configure an AWS Lambda function that transforms the data when new files are uploaded.
- C. Deploy the Python script on an Amazon EC2 instance. Install a third-party SFTP server on the EC2 instance. Schedule the script to run periodically on the EC2 instance to perform a data transformation on new files. Copy the transformed files to Amazon S3.
- D. Use AWS Transfer Family to create an SFTP server that includes a publicly accessible endpoint. Configure the new server to use Amazon S3 storage. Change the server name to match the name of the on-premises SFTP server. Use AWS Data Pipeline to schedule a transient Amazon EMR cluster with an Apache Spark step to periodically transform the files.
Correct answer: A
Explanation:
This solution meets the requirements because:
AWS Transfer Family is a fully managed service that enables secure file transfers to and from Amazon S3 or Amazon EFS using standard protocols such as SFTP, FTPS, and FTP. By using AWS Transfer Family, the company can reduce the overhead of maintaining the on-premises SFTP server and storing large amounts of data on premises.
The company can create an SFTP-enabled server with a publicly accessible endpoint using AWS Transfer Family. This endpoint can be accessed by the third-party providers over the internet using their existing SFTP clients. The company can also change the server name to match the name of the on-premises SFTP server, so that the existing upload process for the third-party providers does not change. For more information, see Create an SFTP-enabled server.
The company can configure the new SFTP server to use Amazon S3 as the storage service. This way, the data files uploaded by the third-party providers will be stored in an Amazon S3 bucket. The company can also use AWS Identity and Access Management (IAM) roles and policies to control access to the S3 bucket and its objects. For more information, see Using Amazon S3 as your storage service.
The company can schedule a Python shell job in AWS Glue to use the existing Python script to run periodically and transform the uploaded files. AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy to prepare and load data for analytics. A Python shell job is a type of job that runs Python scripts as a shell, without requiring an Apache Spark environment. The company can use AWS Glue triggers to schedule the Python shell job based on time or events. For more information, see Working with Python shell jobs.
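The scheduling part of this answer can be sketched as two request payloads: one that defines the Python shell job around the existing script, and one that defines a time-based trigger for it. The job name, role ARN, script location, Python version, capacity, and hourly schedule are all assumptions, since the question only says the script runs periodically.

```python
def python_shell_job(name: str, role_arn: str, script_s3_path: str) -> dict:
    """Describe a Glue Python shell job that runs an existing script."""
    return {
        "Name": name,
        "Role": role_arn,
        "Command": {
            "Name": "pythonshell",          # job type for plain Python scripts
            "ScriptLocation": script_s3_path,
            "PythonVersion": "3.9",         # assumed version
        },
        "MaxCapacity": 0.0625,              # assumed: smallest Python shell size
    }


def hourly_trigger(job_name: str) -> dict:
    """Describe a scheduled trigger that starts the job at the top of each hour."""
    return {
        "Name": f"{job_name}-hourly",
        "Type": "SCHEDULED",
        "Schedule": "cron(0 * * * ? *)",
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


# Hypothetical names for illustration.
JOB = python_shell_job(
    "transform-provider-files",
    "arn:aws:iam::111122223333:role/glue-transform",
    "s3://example-scripts/transform.py",
)
TRIGGER = hourly_trigger(JOB["Name"])
```

Because the existing script is reused as the job's script, the development effort is limited to configuration, which is what separates this option from the Lambda, EC2, and EMR alternatives.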
Question # 113
A large ride-sharing company has thousands of drivers globally serving millions of unique customers every day. The company has decided to migrate an existing data mart to Amazon Redshift. The existing schema includes the following tables.
A trips fact table for information on completed rides.
A drivers dimension table for driver profiles.
A customers fact table holding customer profile information.
The company analyzes trip details by date and destination to examine profitability by region. The drivers data rarely changes. The customers data frequently changes.
What table design provides optimal query performance?
- A. Use DISTSTYLE EVEN for the drivers table and sort by date. Use DISTSTYLE ALL for both fact tables.
- B. Use DISTSTYLE KEY (destination) for the trips table and sort by date. Use DISTSTYLE ALL for the drivers table. Use DISTSTYLE EVEN for the customers table.
- C. Use DISTSTYLE KEY (destination) for the trips table and sort by date. Use DISTSTYLE ALL for the drivers and customers tables.
- D. Use DISTSTYLE EVEN for the trips table and sort by date. Use DISTSTYLE ALL for the drivers table. Use DISTSTYLE EVEN for the customers table.
Correct answer: B
Explanation:
https://www.matillion.com/resources/blog/aws-redshift-performance-choosing-the-right-distribution-styles/#:~:text=The%20distribution%20style%20is%20how,you%20want%20to%20distribute%20it%E2%80%A6
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-best-dist-key.html
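The Amazon AWS-Certified-Data-Analytics-Specialty exam is designed for professionals who want to demonstrate their expertise in data analytics on the AWS platform. The AWS Certified Data Analytics - Specialty (DAS-C01) exam covers a broad range of data analytics topics, including data collection, storage, processing, visualization, and security. The exam is intended for individuals with at least two years of experience in data analytics or a related field.
To prepare for the AWS Certified Data Analytics - Specialty (DAS-C01) exam, candidates need a solid understanding of AWS services and familiarity with data analytics tools and techniques. They should also have hands-on experience with data analytics solutions in real-world settings. AWS offers a variety of training resources and certification preparation materials, including online courses, practice exams, and study guides, to help candidates prepare for the exam and achieve their certification goals.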
Latest Amazon AWS-Certified-Data-Analytics-Specialty real exam question set PDF: https://www.jpntest.com/shiken/AWS-Certified-Data-Analytics-Specialty-mondaishu
AWS-Certified-Data-Analytics-Specialty exam question set and practice test questions: https://drive.google.com/open?id=1llJX1XKX7kZkyLZbXJXJ2JvXN3xkkCO3