D-DS-FN-23 Free Practice Questions: "EMC Dell Data Science Foundations"

Question 1

A data scientist is investigating a new database column that needs to be integrated into their model. The column contains 10,000 labels with 300 unique values.
Which data structure should be used when working in R?

（A）List

（B）Array

（C）Factor

（D）Data frame

Correct answer: C
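An R factor stores each distinct label once (the levels) plus one integer code per row. The sketch below mimics that idea in Python for illustration only; `make_factor` and the generated `label_NNN` values are hypothetical, and R's own codes are 1-based rather than 0-based.

```python
import random

def make_factor(labels):
    """Encode labels as (levels, codes), roughly what an R factor stores."""
    levels = sorted(set(labels))                 # each unique value stored once
    index = {level: i for i, level in enumerate(levels)}
    codes = [index[label] for label in labels]   # one small integer per row
    return levels, codes

random.seed(0)
# Hypothetical column: 10,000 labels drawn from 300 distinct values.
pool = [f"label_{i:03d}" for i in range(300)]
column = pool + [random.choice(pool) for _ in range(10_000 - 300)]

levels, codes = make_factor(column)
print(len(codes), len(levels))  # 10000 300
```

Storing 300 levels and 10,000 integers is more compact than 10,000 strings, and it tells modeling code that the column is categorical.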

Question 2

Which word or phrase completes the statement: "A theater actor is to 'artistic and expressive' as a data scientist is to ___"?

（A）Communicative and collaborative

（B）Introverted and technical

（C）Independent and intelligent

（D）Logical and steadfast

Correct answer: A

Question 3

In the data preparation phase of the data analytics lifecycle, what does the term "data conditioning" refer to?

（A）Cleaning the data, normalizing datasets, and performing transformations

（B）Deploying the model and monitoring its performance

（C）Building training and testing datasets

（D）Identifying relationships and correlations among variables

Correct answer: A
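As a rough illustration of the three activities named in the correct option, the sketch below cleans, normalizes, and transforms a single made-up numeric column. The values and the choice of z-scores and a log transform are assumptions for the example, not steps prescribed by the exam material.

```python
import math
import statistics

# Hypothetical column with missing values.
raw = [12.0, None, 15.0, 11.0, None, 18.0]

# Cleaning: drop missing entries.
cleaned = [x for x in raw if x is not None]

# Normalizing: rescale to zero mean and unit (population) standard deviation.
mean = statistics.mean(cleaned)
stdev = statistics.pstdev(cleaned)
normalized = [(x - mean) / stdev for x in cleaned]

# Transforming: move to a log scale.
logged = [math.log(x) for x in cleaned]

print(cleaned)
print([round(z, 2) for z in normalized])
```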


Question 4

Which SQL OLAP extension provides all possible grouping combinations?

（A）UNION ALL

（B）ROLLUP

（C）CUBE

（D）CROSS JOIN

Correct answer: C
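To see why CUBE is described as "all possible grouping combinations", the sketch below enumerates the grouping sets that CUBE and ROLLUP would each produce for a list of columns. The column names and helper functions are hypothetical; the code only lists the groupings and does not run any SQL.

```python
from itertools import combinations

def cube_grouping_sets(columns):
    """Every subset of the columns, from all columns down to the empty set."""
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets

def rollup_grouping_sets(columns):
    """Only the left-to-right prefixes of the columns, as ROLLUP does."""
    return [tuple(columns[:size]) for size in range(len(columns), -1, -1)]

cols = ["region", "product", "year"]   # hypothetical column names
for grouping in cube_grouping_sets(cols):
    print(grouping)

print(len(cube_grouping_sets(cols)))    # 8
print(len(rollup_grouping_sets(cols)))  # 4
```

For n columns, CUBE yields 2^n grouping sets while ROLLUP yields n + 1.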

Question 5

Refer to the exhibit.

You are using K-means clustering to classify customer behavior for a large retailer. You need to determine the optimum number of customer groups. You plot the within-sum-of-squares (wss) data as shown in the exhibit.
How many customer groups should you specify?

（A）3

（B）4

（C）2

（D）8

Correct answer: B

Question 6

A study was run to identify general dietary patterns among the residents of a small town. Twelve thousand people were surveyed and the data was subject to K-means clustering.
In one of the iterations, there were six clusters formed with 38, 1560, 1799, 2560, 2893, and 3150 respondents.
What should be the next step in identifying optimal clusters?

（A）Multiply each variable by its standard deviation

（B）Remove 38 respondents because the 5 clusters seem to be well distributed

（C）Determine the optimal number of clusters by plotting the Within Sum of Squares (WSS) values as a function of K

（D）Add more categorical variables to the dataset to maximize the Within Sum of Squares (WSS) value for K=6

Correct answer: C
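The WSS-versus-K approach can be sketched as follows. This is a minimal pure-Python k-means on made-up 2-D points arranged in three tight groups; the data, the deterministic farthest-point initialization, and the function names are all assumptions chosen to keep the example reproducible, not the procedure used in the study.

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_point_init(points, k):
    """Deterministic start: the first point, then repeatedly the point
    farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids

def kmeans_wss(points, k, iterations=50):
    """Run Lloyd's algorithm and return the within sum of squares."""
    centroids = farthest_point_init(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sum(sq_dist(p, centroids[i]) for i, c in enumerate(clusters) for p in c)

# Hypothetical data: three tight groups of five points each.
centers = [(0, 0), (10, 10), (20, 0)]
offsets = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

wss = {k: kmeans_wss(points, k) for k in range(1, 6)}
for k in sorted(wss):
    print(k, round(wss[k], 1))
```

On this data the WSS falls sharply up to K=3 and only marginally afterwards; that bend (the "elbow") is what the plot is read for.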

Question 7

Refer to the Exhibit.

You are creating an OLAP query that outputs summary rows of subtotals and grand totals in addition to regular rows that may contain NULL, as shown in the exhibit.
Which function can you use in your query to distinguish a subtotal row from a regular row?

（A）RANK

（B）ROLLUP

（C）GROUP_ID

（D）GROUPING

Correct answer: D
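The ambiguity that GROUPING resolves can be imitated in plain Python. In the sketch below, a hypothetical sales table has one row whose region is genuinely NULL (`None`), and the rolled-up grand-total row also has a NULL region; a 0/1 flag, analogous to what GROUPING returns, tells them apart. The table, column names, and `rollup_by_region` helper are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, amount). One row has a genuine NULL region.
rows = [("East", 100), ("East", 50), ("West", 70), (None, 30)]

def rollup_by_region(rows):
    """Return (region, grouping_flag, total) rows: per-region groups,
    followed by a grand-total row."""
    totals = defaultdict(int)
    for region, amount in rows:
        totals[region] += amount
    # Regular groups: the flag is 0 even when the region itself is NULL.
    result = [(region, 0, total) for region, total in totals.items()]
    # Grand total: region is NULL because it was rolled up, so the flag is 1.
    result.append((None, 1, sum(amount for _, amount in rows)))
    return result

for row in rollup_by_region(rows):
    print(row)
```

Two output rows have `None` as the region; only the flag shows which one is the summary row.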

Question 8

You are assigned the task of creating customer profiles for your company. In your database, you have 25 key input variables that come together to define 2,500 customers. You decide to run a K-means cluster analysis on the 25 input variables based on k=4 to build your profiles.
Your analysis resulted in four cluster populations:
Cluster A=1,000 customers
Cluster B=560 customers
Cluster C=925 customers
Cluster D=15 customers
What should be attempted first to more evenly distribute the customer population across clusters?

（A）Increase K from 4 to 5

（B）Remove the 15 customers in Cluster D from the population

（C）Remove some of the input variables from the analysis

（D）Reduce K from 4 to 3

Correct answer: D

Question 9

Refer to the exhibit.

The graph represents an ROC space with four classifiers labelled A through D.
Which point in the graph represents a perfect classification?

（A）R

（B）P

（C）Q

（D）S

Correct answer: D
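In ROC space a classifier is plotted at (false positive rate, true positive rate), and a perfect classifier sits at (0, 1), the top-left corner. The sketch below computes that point from confusion-matrix counts; the counts are made up, and since the exhibit is not reproduced here, the code does not say which lettered point that corner corresponds to.

```python
def roc_point(tp, fp, tn, fn):
    """Return (false positive rate, true positive rate) for one classifier."""
    fpr = fp / (fp + tn)
    tpr = tp / (tp + fn)
    return fpr, tpr

# Hypothetical counts: no errors at all.
print(roc_point(tp=50, fp=0, tn=50, fn=0))    # (0.0, 1.0)

# Hypothetical counts: no better than guessing, on the diagonal.
print(roc_point(tp=25, fp=25, tn=25, fn=25))  # (0.5, 0.5)
```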

Question 10

Refer to the exhibit.

You are using k-means clustering to discover groupings within a data set. You plot within-sum-of-squares (wss) of multiple cluster sizes.
Based on the exhibit, how many clusters should you use in your analysis?

（A）10

（B）4

（C）2

（D）8

Correct answer: B

Question 11

How is dimensionality defined in a "bag of words" document representation?

（A）Frequency of repeated words in the document

（B）Average number of words per sentence in the document

（C）Number of unique terms in the document

（D）Total number of words in the document

Correct answer: C
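A small sketch of the distinction between unique terms and total words, using a made-up sentence and a deliberately simple tokenizer (lowercase letters and apostrophes only, an assumption for this example):

```python
from collections import Counter
import re

def bag_of_words(text):
    """Map each distinct term to its count; word order is discarded."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

doc = "The cat sat on the mat. The mat was flat."
bag = bag_of_words(doc)

print(len(bag))           # 7 unique terms, so 7 dimensions
print(sum(bag.values()))  # 10 words in total
```

Each unique term becomes one coordinate of the document vector, which is why the count of distinct terms, not the document length, sets the dimensionality.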

Question 12

You have an automotive database containing numeric characteristics such as engine size, horsepower, and top speed.
Which technique could you use to group similar cars together?

（A）K-means clustering

（B）Naïve Bayes classifier

（C）Association rules

（D）Logistic regression

Correct answer: A

Question 13

Consider these itemsets:
(hat, scarf, coat)
(hat, scarf, coat, gloves)
(hat, scarf, gloves)
(hat, gloves)
(scarf, coat, gloves)
What is the confidence of the rule (hat, scarf) => gloves?

（A）60%

（B）66%

（C）50%

（D）40%

Correct answer: B
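The arithmetic can be checked directly: confidence(X => Y) is the number of itemsets containing both X and Y divided by the number containing X. The sketch below applies that to the five itemsets from the question; the helper names are hypothetical.

```python
transactions = [
    {"hat", "scarf", "coat"},
    {"hat", "scarf", "coat", "gloves"},
    {"hat", "scarf", "gloves"},
    {"hat", "gloves"},
    {"scarf", "coat", "gloves"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def confidence(lhs, rhs, transactions):
    """support(lhs and rhs together) / support(lhs)."""
    return support_count(lhs | rhs, transactions) / support_count(lhs, transactions)

conf = confidence({"hat", "scarf"}, {"gloves"}, transactions)
print(f"{conf:.1%}")  # 66.7%
```

Three itemsets contain both hat and scarf, and two of those also contain gloves, giving 2/3, which the answer option shows as 66%.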

Question 14

What provides the means for matching and manipulating text strings in SQL?

（A）TF-IDF

（B）Regular expressions

（C）Association rules

（D）PACF

Correct answer: B
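Regular-expression support differs between SQL dialects, so the following is only one possible illustration. It uses Python's built-in `sqlite3` module, where the `REGEXP` operator works once a user function named `regexp` is registered. The `contacts` table and its contents are hypothetical.

```python
import re
import sqlite3

def regexp(pattern, value):
    """Backs SQLite's REGEXP operator, which calls regexp(pattern, value)."""
    return value is not None and re.search(pattern, value) is not None

conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE contacts (phone TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO contacts VALUES (?)",
    [("555-1234",), ("5551234",), ("call me",)],
)

# Keep only values shaped like three digits, a hyphen, and four digits.
matches = [
    row[0]
    for row in conn.execute(
        "SELECT phone FROM contacts WHERE phone REGEXP ? ORDER BY phone",
        (r"^\d{3}-\d{4}$",),
    )
]
print(matches)  # ['555-1234']
```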

Question 15

Which process in text analysis can be used to reduce dimensionality?

（A）Sorting

（B）Parsing

（C）Digitizing

（D）Stemming

Correct answer: D
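Stemming reduces dimensionality by collapsing inflected forms onto one term, which shrinks the vocabulary. The sketch below shows the effect with a toy suffix-stripping rule; `naive_stem` is a hypothetical stand-in and is far cruder than a real stemmer such as the Porter algorithm.

```python
def naive_stem(word):
    """Strip a few common English suffixes. A toy rule set for illustration."""
    for suffix in ("ing", "ed", "s"):
        # Require at least three characters to remain after stripping.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

tokens = ["connect", "connects", "connected", "connecting",
          "walk", "walked", "walking"]

raw_vocab = set(tokens)
stemmed_vocab = {naive_stem(t) for t in tokens}

print(len(raw_vocab), len(stemmed_vocab))  # 7 2
print(sorted(stemmed_vocab))               # ['connect', 'walk']
```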

Question 16

A disk drive manufacturer has a defect rate of less than 1.0% with 98% confidence. A quality assurance team samples 1000 disk drives and finds 14 defective units.
Which action should the team recommend?

（A）The manufacturing process is functioning properly and no further action is required.

（B）The manufacturing process should be inspected for problems.

（C）A larger sample size should be taken to determine if the plant is functioning properly.

（D）A smaller sample size should be taken to determine if the plant is functioning properly.

Correct answer: B
