Databricks-Certified-Professional-Data-Engineer 無料問題集「Databricks Certified Professional Data Engineer」

A Delta Lake table representing metadata about content posts from users has the following schema:
user_id LONG, post_text STRING, post_id STRING, longitude FLOAT, latitude FLOAT, post_time TIMESTAMP, date DATE This table is partitioned by the date column. A query is run with the following filter:
longitude < 20 & longitude > -20
Which statement describes how data will be filtered?

解説: (JPNTest メンバーにのみ表示されます)
Review the following error traceback:

Which statement describes the error being raised?

解説: (JPNTest メンバーにのみ表示されます)
A data engineer, User A, has promoted a new pipeline to production by using the REST API to programmatically create several jobs. A DevOps engineer, User B, has configured an external orchestration tool to trigger job runs through the REST API. Both users authorized the REST API calls using their personal access tokens.
Which statement describes the contents of the workspace audit logs concerning these events?

解説: (JPNTest メンバーにのみ表示されます)
A junior data engineer is working to implement logic for a Lakehouse table namedsilver_device_recordings.
The source data contains 100 unique fields in a highly nested JSON structure.
Thesilver_device_recordingstable will be used downstream to power several production monitoring dashboards and a production model. At present, 45 of the 100 fields are being used in at least one of these applications.
The data engineer is trying to determine the best approach for dealing with schema declaration given the highly-nested structure of the data and the numerous fields.
Which of the following accurately presents information about Delta Lake and Databricks that may impact their decision-making process?

解説: (JPNTest メンバーにのみ表示されます)
An upstream system has been configured to pass the date for a given batch of data to the Databricks Jobs API as a parameter. The notebook to be scheduled will use this parameter to load data with the following code:
df = spark.read.format("parquet").load(f"/mnt/source/(date)")
Which code block should be used to create the date Python variable used in the above code block?

解説: (JPNTest メンバーにのみ表示されます)
A junior data engineer on your team has implemented the following code block.

The viewnew_eventscontains a batch of records with the same schema as theeventsDelta table.
Theevent_idfield serves as a unique key for this table.
When this query is executed, what will happen with new records that have the sameevent_idas an existing record?

解説: (JPNTest メンバーにのみ表示されます)
Incorporating unit tests into a PySpark application requires upfront attention to the design of your jobs, or a potentially significant refactoring of existing code.
Which statement describes a main benefit that offset this additional effort?

A nightly job ingests data into a Delta Lake table using the following code:

The next step in the pipeline requires a function that returns an object that can be used to manipulate new records that have not yet been processed to the next table in the pipeline.
Which code snippet completes this function definition?
def new_records():

解説: (JPNTest メンバーにのみ表示されます)
The viewupdatesrepresents an incremental batch of all newly ingested data to be inserted or updated in the customerstable.
The following logic is used to process these records.

Which statement describes this implementation?

解説: (JPNTest メンバーにのみ表示されます)
A junior developer complains that the code in their notebook isn't producing the correct results in the development environment. A shared screenshot reveals that while they're using a notebook versioned with Databricks Repos, they're using a personal branch that contains old logic. The desired branch nameddev-2.3.9 is not available from the branch selection dropdown.
Which approach will allow this developer to review the current logic for this notebook?

解説: (JPNTest メンバーにのみ表示されます)

弊社を連絡する

我々は12時間以内ですべてのお問い合わせを答えます。

オンラインサポート時間:( UTC+9 ) 9:00-24:00
月曜日から土曜日まで

サポート:現在連絡