DP-200 Exam Prep Free


DP-200 Exam Prep Free – 50 Practice Questions to Get You Ready for Exam Day

Getting ready for the DP-200 certification? Our DP-200 Exam Prep Free resource includes 50 exam-style questions designed to help you practice effectively and feel confident on test day.

Effective DP-200 exam prep free is the key to success. With our free practice questions, you can:

  • Get familiar with exam format and question style
  • Identify which topics you’ve mastered—and which need more review
  • Boost your confidence and reduce exam anxiety

Below, you will find 50 realistic DP-200 Exam Prep Free questions that cover key exam topics. These questions are designed to reflect the structure and challenge level of the actual exam, making them perfect for your study routine.

Question 1

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution. Determine whether the solution meets the stated goals.
You develop a data ingestion process that will import data to an enterprise data warehouse in Azure Synapse Analytics. The data to be ingested resides in parquet files stored in an Azure Data Lake Gen 2 storage account.
You need to load the data from the Azure Data Lake Gen 2 storage account into the Data Warehouse.
Solution:
1. Create an external data source pointing to the Azure storage account
2. Create an external file format and external table using the external data source
3. Load the data using the INSERT...SELECT statement
Does the solution meet the goal?

A. Yes

B. No

 


Suggested Answer: B

Instead, load the data by using the CREATE TABLE AS SELECT (CTAS) statement.
References:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-load-from-azure-data-lake-store
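For illustration, here is a minimal T-SQL sketch of the load step the answer describes, assuming the external data source, file format, and external table from steps 1 and 2 already exist; the object names (ext.StageSales, dbo.StageSales) are hypothetical.
-- Load from the external table into the data warehouse with CTAS
-- (PolyBase reads the Parquet files referenced by the external table)
CREATE TABLE dbo.StageSales
WITH
(
    DISTRIBUTION = ROUND_ROBIN,
    CLUSTERED COLUMNSTORE INDEX
)
AS
SELECT *
FROM ext.StageSales;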

Question 2

DRAG DROP -
You have an Azure SQL database named DB1 in the East US 2 region.
You need to build a secondary geo-replicated copy of DB1 in the West US region on a new server.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Select and Place:
 Image

 


Suggested Answer:
Correct Answer Image

Step 1: From the Geo-replication settings of DB1, select West US
The following steps create a new secondary database in a geo-replication partnership.
1. In the Azure portal, browse to the database that you want to set up for geo-replication.
2. (Step 1) On the SQL database page, select geo-replication, and then select the region to create the secondary database.
3. (Step 2-3) Select or configure the server and pricing tier for the secondary database.
Reference Image
Step 2: Create a target server and select a pricing tier
Step 3: On the secondary server, create logins that match the SIDs on the primary server.
Incorrect Answers:
Not log shipping: Replication is used.
References:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-active-geo-replication-portal
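The same geo-replication relationship can also be created with T-SQL once the target server in West US exists; a minimal sketch, run in the master database of the primary server, with a hypothetical target server name (server-westus):
-- Create a readable geo-secondary of DB1 on the new West US server
ALTER DATABASE DB1
    ADD SECONDARY ON SERVER [server-westus]
    WITH (ALLOW_CONNECTIONS = ALL);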

Question 3

You are designing an enterprise data warehouse in Azure Synapse Analytics. You plan to load millions of rows of data into the data warehouse each day.
You must ensure that staging tables are optimized for data loading.
You need to design the staging tables.
What type of tables should you recommend?

A. Round-robin distributed table

B. Hash-distributed table

C. Replicated table

D. External table

 


Suggested Answer: A

 

Question 4

You plan to build a structured streaming solution in Azure Databricks. The solution will count new events in five-minute intervals and report only events that arrive during the interval. The output will be sent to a Delta Lake table.
Which output mode should you use?

A. complete

B. update

C. append

 


Suggested Answer: C

Append Mode: Only new rows appended in the result table since the last trigger are written to external storage. This is applicable only for the queries where existing rows in the Result Table are not expected to change.
Incorrect Answers:
A: Complete Mode: The entire updated result table is written to external storage. It is up to the storage connector to decide how to handle the writing of the entire table.
B: Update Mode: Only the rows that were updated in the result table since the last trigger are written to external storage. This is different from Complete Mode in that Update Mode outputs only the rows that have changed since the last trigger. If the query doesn’t contain aggregations, it is equivalent to Append mode.
Reference:
https://docs.databricks.com/getting-started/spark/streaming.html

Question 5

HOTSPOT -
You need to ensure that Azure Data Factory pipelines can be deployed. How should you configure authentication and authorization for deployments? To answer, select the appropriate options in the answer choices.
NOTE: Each correct selection is worth one point.
Hot Area:
 Image

 


Suggested Answer:
Correct Answer Image

The way you control access to resources using RBAC is to create role assignments. This is a key concept to understand; it is how permissions are enforced. A role assignment consists of three elements: security principal, role definition, and scope.
Scenario:
No credentials or secrets should be used during deployments
Phone-based poll data must only be uploaded by authorized users from authorized devices
Contractors must not have access to any polling data other than their own
Access to polling data must be set on a per-Active Directory user basis
References:
https://docs.microsoft.com/en-us/azure/role-based-access-control/overview

Question 6

HOTSPOT -
You need to build a solution to collect the telemetry data for Race Central.
What should you use? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
 Image

 


Suggested Answer:
Correct Answer Image

API: Table –
Azure Cosmos DB provides native support for wire protocol-compatible APIs for popular databases. These include MongoDB, Apache Cassandra, Gremlin, and
Azure Table storage.
Scenario: The telemetry data must migrate toward a solution that is native to Azure.
Consistency level: Strong –
Use the strongest consistency level, Strong, to minimize convergence time.
Scenario: The data must be written to the Azure datacenter closest to each race and must converge in the least amount of time.
Reference:
https://docs.microsoft.com/en-us/azure/cosmos-db/consistency-levels

Question 7

You need to implement complex stateful business logic within an Azure Stream Analytics service.
Which type of function should you create in the Stream Analytics topology?

A. JavaScript user-define functions (UDFs)

B. Azure Machine Learning

C. JavaScript user-defined aggregates (UDA)

 


Suggested Answer: C

Azure Stream Analytics supports user-defined aggregates (UDA) written in JavaScript, which enable you to implement complex stateful business logic. Within a UDA you have full control of the state data structure, state accumulation, state decumulation, and aggregate result computation.
References:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-javascript-user-defined-aggregates

Question 8

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:
✑ A workload for data engineers who will use Python and SQL
✑ A workload for jobs that will run notebooks that use Python, Scala, and SQL
✑ A workload that data scientists will use to perform ad hoc analysis in Scala and R
The enterprise architecture team at your company identifies the following standards for Databricks environments:
✑ The data engineers must share a cluster.
✑ The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
✑ All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.
You need to create the Databricks clusters for the workloads.
Solution: You create a Standard cluster for each data scientist, a High Concurrency cluster for the data engineers, and a High Concurrency cluster for the jobs.
Does this meet the goal?

A. Yes

B. No

 


Suggested Answer: A

We need a High Concurrency cluster for the data engineers and the jobs.
Note:
Standard clusters are recommended for a single user. Standard can run workloads developed in any language: Python, R, Scala, and SQL.
A high concurrency cluster is a managed cloud resource. The key benefits of high concurrency clusters are that they provide Apache Spark-native fine-grained sharing for maximum resource utilization and minimum query latencies.
References:
https://docs.azuredatabricks.net/clusters/configure.html

Question 9

HOTSPOT -
You are implementing Azure Stream Analytics windowing functions.
Which windowing function should you use for each requirement? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
 Image

 


Suggested Answer:
Correct Answer Image

Box 1: Tumbling –
Tumbling window functions are used to segment a data stream into distinct time segments and perform a function against them, such as the example below. The key differentiators of a Tumbling window are that they repeat, do not overlap, and an event cannot belong to more than one tumbling window.
Reference Image
Box 2: Hopping –
Hopping window functions hop forward in time by a fixed period. It may be easy to think of them as Tumbling windows that can overlap, so events can belong to more than one Hopping window result set. To make a Hopping window the same as a Tumbling window, specify the hop size to be the same as the window size.
Reference Image
Box 3: Sliding –
Sliding window functions, unlike Tumbling or Hopping windows, produce an output only when an event occurs. Every window will have at least one event and the window continuously moves forward by an ε (epsilon). Like hopping windows, events can belong to more than one sliding window.
Reference Image
Reference:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-window-functions
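As a point of reference, the three window types differ only in the GROUP BY clause of a Stream Analytics query; a minimal sketch, assuming a hypothetical input stream named Input and a simple count over 10-second windows:
-- Tumbling: fixed, non-overlapping windows
SELECT COUNT(*) AS Events FROM Input GROUP BY TumblingWindow(second, 10)
-- Hopping: 10-second windows that hop forward every 5 seconds (windows overlap)
SELECT COUNT(*) AS Events FROM Input GROUP BY HoppingWindow(second, 10, 5)
-- Sliding: output produced only when an event enters or exits the 10-second window
SELECT COUNT(*) AS Events FROM Input GROUP BY SlidingWindow(second, 10)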

Question 10

A company is designing a hybrid solution to synchronize data from an on-premises Microsoft SQL Server database to Azure SQL Database.
You must perform an assessment of databases to determine whether data will move without compatibility issues. You need to perform the assessment.
Which tool should you use?

A. SQL Server Migration Assistant (SSMA)

B. Microsoft Assessment and Planning Toolkit

C. SQL Vulnerability Assessment (VA)

D. Azure SQL Data Sync

E. Data Migration Assistant (DMA)

 


Suggested Answer: E

The Data Migration Assistant (DMA) helps you upgrade to a modern data platform by detecting compatibility issues that can impact database functionality in your new version of SQL Server or Azure SQL Database. DMA recommends performance and reliability improvements for your target environment and allows you to move your schema, data, and uncontained objects from your source server to your target server.
References:
https://docs.microsoft.com/en-us/sql/dma/dma-overview

Question 11

You have an enterprise data warehouse in Azure Synapse Analytics named DW1 on a server named Server1.
You need to verify whether the size of the transaction log file for each distribution of DW1 is smaller than 160 GB.
What should you do?

A. On the master database, execute a query against the sys.dm_pdw_nodes_os_performance_counters dynamic management view.

B. From Azure Monitor in the Azure portal, execute a query against the logs of DW1.

C. On DW1, execute a query against the sys.database_files dynamic management view.

D. Execute a query against the logs of DW1 by using the Get-AzOperationalInsightsSearchResult PowerShell cmdlet.

 


Suggested Answer: A

The following query returns the transaction log size on each distribution. If one of the log files is reaching 160 GB, you should consider scaling up your instance or limiting your transaction size.
-- Transaction log size
SELECT
    instance_name AS distribution_db,
    cntr_value * 1.0 / 1048576 AS log_file_size_used_GB,
    pdw_node_id
FROM sys.dm_pdw_nodes_os_performance_counters
WHERE instance_name LIKE 'Distribution_%'
    AND counter_name = 'Log File(s) Used Size (KB)'
Reference:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-manage-monitor

Question 12

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You are developing a solution that will use Azure Stream Analytics. The solution will accept an Azure Blob storage file named Customers. The file will contain both in-store and online customer details. The online customers will provide a mailing address.
You have a file in Blob storage named LocationIncomes that contains median incomes based on location. The file rarely changes.
You need to use an address to look up a median income based on location. You must output the data to Azure SQL Database for immediate use and to Azure
Data Lake Storage Gen2 for long-term retention.
Solution: You implement a Stream Analytics job that has two streaming inputs, one query, and two outputs.
Does this meet the goal?

A. Yes

B. No

 


Suggested Answer: B

We need one reference data input for LocationIncomes, which rarely changes.
Note: Stream Analytics also supports inputs known as reference data. Reference data is either completely static or changes slowly.
Reference:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-add-inputs#stream-and-reference-inputs
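To illustrate the intended design, a minimal Stream Analytics sketch with one streaming input (Customers), one reference data input (LocationIncomes), and two outputs; all input, output, and column names here are hypothetical:
WITH EnrichedCustomers AS
(
    SELECT c.CustomerId, c.MailingZip, l.MedianIncome
    FROM Customers c
    JOIN LocationIncomes l          -- reference data input, loaded in full and refreshed slowly
        ON c.MailingZip = l.Zip
)
SELECT * INTO SqlDatabaseOutput FROM EnrichedCustomers   -- immediate use
SELECT * INTO DataLakeGen2Output FROM EnrichedCustomers  -- long-term retention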

Question 13

HOTSPOT -
A company is deploying a service-based data environment. You are developing a solution to process this data.
The solution must meet the following requirements:
✑ Use an Azure HDInsight cluster for data ingestion from a relational database in a different cloud service
✑ Use an Azure Data Lake Storage account to store processed data
✑ Allow users to download processed data
You need to recommend technologies for the solution.
Which technologies should you use? To answer, select the appropriate options in the answer area.
Hot Area:
 Image

 


Suggested Answer:
Correct Answer Image

Box 1: Apache Sqoop –
Apache Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.
Azure HDInsight is a cloud distribution of the Hadoop components from the Hortonworks Data Platform (HDP).
Incorrect Answers:
DistCp (distributed copy) is a tool used for large inter/intra-cluster copying. It uses MapReduce to effect its distribution, error handling and recovery, and reporting.
It expands a list of files and directories into input to map tasks, each of which will copy a partition of the files specified in the source list. Its MapReduce pedigree has endowed it with some quirks in both its semantics and execution.
RevoScaleR is a collection of proprietary functions in Machine Learning Server used for practicing data science at scale. For data scientists, RevoScaleR gives you data-related functions for import, transformation and manipulation, summarization, visualization, and analysis.
Box 2: Apache Kafka –
Apache Kafka is a distributed streaming platform.
A streaming platform has three key capabilities:
Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system.
Store streams of records in a fault-tolerant durable way.
Process streams of records as they occur.
Kafka is generally used for two broad classes of applications:
Building real-time streaming data pipelines that reliably get data between systems or applications
Building real-time streaming applications that transform or react to the streams of data
Box 3: Ambari Hive View –
You can run Hive queries by using Apache Ambari Hive View. The Hive View allows you to author, optimize, and run Hive queries from your web browser.
References:
https://sqoop.apache.org/

https://kafka.apache.org/intro

https://docs.microsoft.com/en-us/azure/hdinsight/hadoop/apache-hadoop-use-hive-ambari-view

Question 14

You have a SQL pool in Azure Synapse.
You discover that some queries fail or take a long time to complete.
You need to monitor for transactions that have rolled back.
Which dynamic management view should you query?

A. sys.dm_pdw_nodes_tran_database_transactions

B. sys.dm_pdw_waits

C. sys.dm_pdw_request_steps

D. sys.dm_pdw_exec_sessions

 


Suggested Answer: A

You can use Dynamic Management Views (DMVs) to monitor your workload including investigating query execution in SQL pool.
If your queries are failing or taking a long time to proceed, you can check and monitor if you have any transactions rolling back.
Example:
-- Monitor rollback
SELECT
    SUM(CASE WHEN t.database_transaction_next_undo_lsn IS NOT NULL THEN 1 ELSE 0 END),
    t.pdw_node_id,
    nod.[type]
FROM sys.dm_pdw_nodes_tran_database_transactions t
JOIN sys.dm_pdw_nodes nod ON t.pdw_node_id = nod.pdw_node_id
GROUP BY t.pdw_node_id, nod.[type]
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-manage-monitor#monitor-transaction-log-rollback

Question 15

You have the Diagnostics settings of an Azure Storage account as shown in the following exhibit.
 Image
How long will the logging data be retained?

A. 7 days

B. 365 days

C. indefinitely

D. 90 days

 


Suggested Answer: A

Reference:
https://docs.microsoft.com/en-us/azure/storage/common/storage-analytics-metrics

Question 16

SIMULATION -
 Image
Use the following login credentials as needed:
Azure Username: xxxxx -
Azure Password: xxxxx -
The following information is for technical support purposes only:
Lab Instance: 10277521 -
You plan to create multiple pipelines in a new Azure Data Factory V2.
You need to create the data factory, and then create a scheduled trigger for the planned pipelines. The trigger must execute every two hours starting at 24:00:00.
To complete this task, sign in to the Azure portal.

 


Suggested Answer: See the explanation below.

Step 1: Create a new Azure Data Factory V2
1. Go to the Azure portal.
2. Select Create a resource on the left menu, select Analytics, and then select Data Factory.
Reference Image
4. On the New data factory page, enter a name.
5. For Subscription, select your Azure subscription in which you want to create the data factory.
6. For Resource Group, use one of the following steps:
✑ Select Use existing, and select an existing resource group from the list.
✑ Select Create new, and enter the name of a resource group.
7. For Version, select V2.
8. For Location, select the location for the data factory.
9. Select Create.
10. After the creation is complete, you see the Data Factory page.
Step 2: Create a schedule trigger for the Data Factory
1. Select the Data Factory you created, and switch to the Edit tab.
Reference Image
2. Click Trigger on the menu, and click New/Edit.
Reference Image
3. In the Add Triggers page, click Choose trigger…, and click New.
Reference Image
4. In the New Trigger page, do the following steps:
a. Confirm that Schedule is selected for Type.
b. Specify the start datetime of the trigger for Start Date (UTC) to: 24:00:00.
c. Specify Recurrence for the trigger. Select Every Hour, and enter 2 in the text box.
Reference Image
5. In the New Trigger window, check the Activated option, and click Next.
6. In the New Trigger page, review the warning message, and click Finish.
7. Click Publish to publish changes to Data Factory. Until you publish changes to Data Factory, the trigger does not start triggering the pipeline runs.
Reference Image
References:
https://docs.microsoft.com/en-us/azure/data-factory/quickstart-create-data-factory-portal
https://docs.microsoft.com/en-us/azure/data-factory/how-to-create-schedule-trigger

Question 17

HOTSPOT -
You need to implement an Azure Databricks cluster that automatically connects to Azure Data Lake Storage Gen2 by using Azure Active Directory (Azure AD) integration.
How should you configure the new cluster? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
 Image

 


Suggested Answer:
Correct Answer Image

Box 1: Premium –
Credential passthrough requires an Azure Databricks Premium Plan.
Incorrect Answers:
Support for Azure Data Lake Storage credential passthrough on standard clusters is in Public Preview.
Standard clusters with credential passthrough are supported on Databricks Runtime 5.5 and above and are limited to a single user.
Note: Azure Databricks supports three cluster modes: Standard, High Concurrency, and Single Node.
Box 2: Azure Data Lake Storage Gen1 Credential Passthrough
You can authenticate automatically to Azure Data Lake Storage Gen1 and Azure Data Lake Storage Gen2 from Azure Databricks clusters using the same Azure
Active Directory (Azure AD) identity that you use to log into Azure Databricks. When you enable your cluster for Azure Data Lake Storage credential passthrough, commands that you run on that cluster can read and write data in Azure Data Lake Storage without requiring you to configure service principal credentials for access to storage.
Reference:
https://docs.azuredatabricks.net/spark/latest/data-sources/azure/adls-passthrough.html

Question 18

You have an Azure SQL server named Server1 that hosts two development databases named DB1 and DB2.
You have an administrative workstation that has an IP address of 192.168.8.8. The development team at your company has IP addresses in the range of 192.168.8.1 to 192.168.8.5.
You need to set up firewall rules to meet the following requirements:
✑ Allow connections from your workstation to both databases.
✑ The development team must be able to connect to DB1 but must be prevented from connecting to DB2.
✑ Web services running in Azure must be able to connect to DB1 but must be prevented from connecting to DB2.
Which three actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

A. Create a firewall rule on DB1 that has a start IP address of 192.168.8.1 and an end IP address of 192.168.8.5.

B. Create a firewall rule on DB1 that has a start and end IP address of 0.0.0.0.

C. Create a firewall rule on Server1 that has a start IP address of 192.168.8.1 and an end IP address of 192.168.8.5.

D. Create a firewall rule on DB1 that has a start and end IP address of 192.168.8.8.

E. Create a firewall rule on Server1 that has a start and end IP address of 192.168.8.8.

 


Suggested Answer: ACE
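For context, server-level rules are created in the master database and apply to every database on the server, while database-level rules are created inside the individual database; a minimal T-SQL sketch using the documented stored procedures (rule names are hypothetical):
-- Server-level rule (run in master on Server1): the workstation can reach every database
EXECUTE sp_set_firewall_rule
    @name = N'AdminWorkstation',
    @start_ip_address = '192.168.8.8',
    @end_ip_address = '192.168.8.8';
-- Database-level rule (run while connected to DB1): the development team can reach DB1 only
EXECUTE sp_set_database_firewall_rule
    @name = N'DevTeam',
    @start_ip_address = '192.168.8.1',
    @end_ip_address = '192.168.8.5';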

 

Question 19

HOTSPOT -
You are processing streaming data from vehicles that pass through a toll booth.
You need to use Azure Stream Analytics to return the license plate, vehicle make, and hour the last vehicle passed during each 10-minute window.
How should you complete the query? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
 Image

 


Suggested Answer:
Correct Answer Image

Box 1: MAX –
The first step of the query finds the maximum time stamp in 10-minute windows, that is, the time stamp of the last event for that window. The second step joins the results of the first query with the original stream to find the events that match the last time stamps in each window.
Query:
WITH LastInWindow AS
(
    SELECT MAX(Time) AS LastEventTime
    FROM Input TIMESTAMP BY Time
    GROUP BY TumblingWindow(minute, 10)
)
SELECT
    Input.License_plate,
    Input.Make,
    Input.Time
FROM Input TIMESTAMP BY Time
INNER JOIN LastInWindow
    ON DATEDIFF(minute, Input, LastInWindow) BETWEEN 0 AND 10
    AND Input.Time = LastInWindow.LastEventTime
Box 2: TumblingWindow –
Tumbling windows are a series of fixed-sized, non-overlapping and contiguous time intervals.
Box 3: DATEDIFF –
DATEDIFF is a date-specific function that compares and returns the time difference between two DateTime fields, for more information, refer to date functions.
Reference:
https://docs.microsoft.com/en-us/stream-analytics-query/tumbling-window-azure-stream-analytics

Question 20

DRAG DROP -
You develop data engineering solutions for a company. You must migrate data from Microsoft Azure Blob storage to an Azure SQL Data Warehouse for further transformation. You need to implement the solution.
Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Select and Place:
 Image

 


Suggested Answer:
Correct Answer Image

Step 1: Provision an Azure SQL Data Warehouse instance.
Create a data warehouse in the Azure portal.
Step 2: Connect to the Azure SQL Data warehouse by using SQL Server Management Studio
Connect to the data warehouse with SSMS (SQL Server Management Studio)
Step 3: Build external tables by using the SQL Server Management Studio
Create external tables for data in Azure blob storage.
You are ready to begin the process of loading data into your new data warehouse. You use external tables to load data from the Azure storage blob.
Step 4: Run Transact-SQL statements to load data.
You can use the CREATE TABLE AS SELECT (CTAS) T-SQL statement to load the data from Azure Storage Blob into new tables in your data warehouse.
References:
https://github.com/MicrosoftDocs/azure-docs/blob/master/articles/sql-data-warehouse/load-data-from-azure-blob-storage-using-polybase.md
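The four steps map to T-SQL roughly as follows; a condensed sketch, assuming a database master key and database scoped credential already exist and using hypothetical names (AzureBlobSource, CsvFormat, ext.Events, dbo.Events):
-- Step 3: build external objects over the Blob storage data
CREATE EXTERNAL DATA SOURCE AzureBlobSource
WITH (TYPE = HADOOP,
      LOCATION = 'wasbs://data@mystorageaccount.blob.core.windows.net',
      CREDENTIAL = BlobStorageCredential);

CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (FORMAT_TYPE = DELIMITEDTEXT,
      FORMAT_OPTIONS (FIELD_TERMINATOR = ','));

CREATE EXTERNAL TABLE ext.Events (EventId INT, EventDate DATE, Payload NVARCHAR(4000))
WITH (LOCATION = '/events/', DATA_SOURCE = AzureBlobSource, FILE_FORMAT = CsvFormat);

-- Step 4: load into the data warehouse with CTAS
CREATE TABLE dbo.Events
WITH (DISTRIBUTION = ROUND_ROBIN, CLUSTERED COLUMNSTORE INDEX)
AS SELECT * FROM ext.Events;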

Question 21

You create an Azure Databricks cluster and specify an additional library to install.
When you attempt to load the library to a notebook, the library is not found.
You need to identify the cause of the issue.
What should you review?

A. workspace logs

B. notebook logs

C. global init scripts logs

D. cluster event logs

 


Suggested Answer: C

Cluster-scoped Init Scripts: Init scripts are shell scripts that run during the startup of each cluster node before the Spark driver or worker JVM starts. Databricks customers use init scripts for various purposes such as installing custom libraries, launching background processes, or applying enterprise security policies.
Logs for Cluster-scoped init scripts are now more consistent with Cluster Log Delivery and can be found in the same root folder as driver and executor logs for the cluster.
Reference:
https://databricks.com/blog/2018/08/30/introducing-cluster-scoped-init-scripts.html

Question 22

HOTSPOT -
You are building an Azure Stream Analytics job to identify how much time a user spends interacting with a feature on a webpage.
The job receives events based on user actions on the webpage. Each row of data represents an event. Each event has a type of either 'start' or 'end'.
You need to calculate the duration between start and end events.
How should you complete the query? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
 Image

 


Suggested Answer:
Correct Answer Image

Box 1: DATEDIFF –
DATEDIFF function returns the count (as a signed integer value) of the specified datepart boundaries crossed between the specified startdate and enddate.
Syntax: DATEDIFF ( datepart , startdate, enddate )
Box 2: LAST –
The LAST function can be used to retrieve the last event within a specific condition. In this example, the condition is an event of type Start, partitioning the search by PARTITION BY user and feature. This way, every user and feature is treated independently when searching for the Start event. LIMIT DURATION limits the search back in time to 1 hour between the End and Start events.
Example:
SELECT
    [user],
    feature,
    DATEDIFF(
        second,
        LAST(Time) OVER (PARTITION BY [user], feature LIMIT DURATION(hour, 1) WHEN Event = 'start'),
        Time) AS duration
FROM input TIMESTAMP BY Time
WHERE Event = 'end'
Reference:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-stream-analytics-query-patterns

Question 23

HOTSPOT -
You are implementing automatic tuning mode for Azure SQL databases.
Automatic tuning mode is configured as shown in the following table.
 Image
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Hot Area:
 Image

 


Suggested Answer:
Correct Answer Image

Automatic tuning options can be independently enabled or disabled per database, or they can be configured on SQL Database servers and applied on every database that inherits settings from the server. SQL Database servers can inherit Azure defaults for Automatic tuning settings. Azure defaults at this time are set to FORCE_LAST_GOOD_PLAN is enabled, CREATE_INDEX is enabled, and DROP_INDEX is disabled.
References:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-automatic-tuning

Question 24

HOTSPOT -
You need to mask tier 1 data. Which functions should you use? To answer, select the appropriate option in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
 Image

 


Suggested Answer:
Correct Answer Image

A: Default –
Full masking according to the data types of the designated fields.
For string data types, use XXXX or fewer Xs if the size of the field is less than 4 characters (char, nchar, varchar, nvarchar, text, ntext).
B: email –
C: Custom text –
Custom String: Masking method that exposes the first and last letters and adds a custom padding string in the middle (prefix, [padding], suffix).
Tier 1 Database must implement data masking using the following masking logic:
Reference Image
References:
https://docs.microsoft.com/en-us/sql/relational-databases/security/dynamic-data-masking

Question 25

You have an Azure Stream Analytics job that receives clickstream data from an Azure event hub.
You need to define a query in the Stream Analytics job. The query must meet the following requirements:
✑ Count the number of clicks within each 10-second window based on the country of a visitor.
✑ Ensure that each click is NOT counted more than once.
How should you define the query?

A. SELECT Country, Count(*) AS Count FROM ClickStream TIMESTAMP BY CreatedAt GROUP BY Country, TumblingWindow(second, 10)

B. SELECT Country, Count(*) AS Count FROM ClickStream TIMESTAMP BY CreatedAt GROUP BY Country, SessionWindow(second, 5, 10)

C. SELECT Country, Avg(*) AS Average FROM ClickStream TIMESTAMP BY CreatedAt GROUP BY Country, SlidingWindow(second, 10)

D. SELECT Country, Avg(*) AS Average FROM ClickStream TIMESTAMP BY CreatedAt GROUP BY Country, HoppingWindow(second, 10, 2)

 


Suggested Answer: A

Tumbling window functions are used to segment a data stream into distinct time segments and perform a function against them, such as the example below. The key differentiators of a Tumbling window are that they repeat, do not overlap, and an event cannot belong to more than one tumbling window.
Incorrect Answers:
B: Session windows group events that arrive at similar times, filtering out periods of time where there is no data.
C: Sliding windows, unlike Tumbling or Hopping windows, output events only for points in time when the content of the window actually changes. In other words, when an event enters or exits the window. Every window has at least one event, like in the case of Hopping windows, events can belong to more than one sliding window.
D: Hopping window functions hop forward in time by a fixed period. It may be easy to think of them as Tumbling windows that can overlap, so events can belong to more than one Hopping window result set. To make a Hopping window the same as a Tumbling window, specify the hop size to be the same as the window size.
Reference:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-window-functions

Question 26

HOTSPOT -
Which masking functions should you implement for each column to meet the data masking requirements? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
 Image

 


Suggested Answer:
Correct Answer Image

Box 1: Custom text/string: A masking method, which exposes the first and/or last characters and adds a custom padding string in the middle.
Only show the last four digits of the values in a column named SuspensionSprings.
Box 2: Default –
Default uses a zero value for numeric data types (bigint, bit, decimal, int, money, numeric, smallint, smallmoney, tinyint, float, real).
Scenario: Only show a zero value for the values in a column named ShockOilWeight.
Scenario:
The company identifies the following data masking requirements for the Race Central data that will be stored in SQL Database:
✑ Only show a zero value for the values in a column named ShockOilWeight.
✑ Only show the last four digits of the values in a column named SuspensionSprings.
Reference:
https://docs.microsoft.com/en-us/azure/azure-sql/database/dynamic-data-masking-overview
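A minimal T-SQL sketch of the two masks described above, assuming a hypothetical table named dbo.RaceCar that holds both columns:
-- Show only the last four characters of SuspensionSprings (custom text/partial mask)
ALTER TABLE dbo.RaceCar
    ALTER COLUMN SuspensionSprings ADD MASKED WITH (FUNCTION = 'partial(0,"XXXX",4)');
-- Show a zero for ShockOilWeight (default mask for numeric types)
ALTER TABLE dbo.RaceCar
    ALTER COLUMN ShockOilWeight ADD MASKED WITH (FUNCTION = 'default()');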

Question 27

SIMULATION -
 Image
Use the following login credentials as needed:
Azure Username: xxxxx -
Azure Password: xxxxx -
The following information is for technical support purposes only:
Lab Instance: 10277521 -
You plan to create large data sets on db2.
You need to ensure that missing indexes are created automatically by Azure in db2. The solution must apply ONLY to db2.
To complete this task, sign in to the Azure portal.

 


Suggested Answer: See the explanation below.

1. To enable automatic tuning on Azure SQL Database logical server, navigate to the server in Azure portal and then select Automatic tuning in the menu.
Reference Image
2. Select database db2
3. Click the Apply button
Reference:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-automatic-tuning-enable
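The same setting can also be applied with T-SQL scoped to just that database; a minimal sketch, run while connected to db2:
-- Enable automatic index creation on db2 only (other databases keep their existing settings)
ALTER DATABASE CURRENT SET AUTOMATIC_TUNING (CREATE_INDEX = ON);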

Question 28

HOTSPOT -
You have a SQL pool in Azure Synapse.
You plan to load data from Azure Blob storage to a staging table. Approximately 1 million rows of data will be loaded daily. The table will be truncated before each daily load.
You need to create the staging table. The solution must minimize how long it takes to load the data to the staging table.
How should you configure the table? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
 Image

 


Suggested Answer:
Correct Answer Image

Box 1: Hash –
Hash-distributed tables improve query performance on large fact tables. They can have very large numbers of rows and still achieve high performance.
Incorrect:
Round-robin tables are useful for improving loading speed.
Box 2: Clustered columnstore –
When creating partitions on clustered columnstore tables, it is important to consider how many rows belong to each partition. For optimal compression and performance of clustered columnstore tables, a minimum of 1 million rows per distribution and partition is needed.
Box 3: Date –
Table partitions enable you to divide your data into smaller groups of data. In most cases, table partitions are created on a date column.
Partition switching can be used to quickly remove or replace a section of a table.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribute
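Put together, the three selections above correspond to a table definition along these lines; a sketch with hypothetical table name, columns, and partition boundary values:
CREATE TABLE dbo.StageSales
(
    SaleId   BIGINT,
    SaleDate DATE,
    Amount   DECIMAL(18, 2)
)
WITH
(
    DISTRIBUTION = HASH(SaleId),           -- Box 1: hash distribution
    CLUSTERED COLUMNSTORE INDEX,           -- Box 2: clustered columnstore
    PARTITION (SaleDate RANGE RIGHT FOR VALUES
        ('2024-01-01', '2024-02-01', '2024-03-01'))  -- Box 3: partition on a date column
);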

Question 29

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have a container named Sales in an Azure Cosmos DB database. Sales has 120 GB of data. Each entry in Sales has the following structure.
 Image
The partition key is set to the OrderId attribute.
Users report that when they perform queries that retrieve data by ProductName, the queries take longer than expected to complete.
You need to reduce the amount of time it takes to execute the problematic queries.
Solution: You create a lookup collection that uses ProductName as a partition key and OrderId as a value.
Does this meet the goal?

A. Yes

B. No

 


Suggested Answer: A

One option is to have a lookup collection "ProductName" for the mapping of "ProductName" to "OrderId".
References:
https://azure.microsoft.com/sv-se/blog/azure-cosmos-db-partitioning-design-patterns-part-1/

Question 30

HOTSPOT -
You have the following Azure Stream Analytics query.
 Image
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Hot Area:
 Image

 


Suggested Answer:
Correct Answer Image

Box 1: No –
Note: You can now use a new extension of Azure Stream Analytics SQL to specify the number of partitions of a stream when reshuffling the data.
The outcome is a stream that has the same partition scheme. Please see below for an example:
WITH step1 AS (SELECT * FROM [input1] PARTITION BY DeviceID INTO 10), step2 AS (SELECT * FROM [input2] PARTITION BY DeviceID INTO 10)
SELECT * INTO [output] FROM step1 PARTITION BY DeviceID UNION step2 PARTITION BY DeviceID
Note: The new extension of Azure Stream Analytics SQL includes a keyword INTO that allows you to specify the number of partitions for a stream when performing reshuffling using a PARTITION BY statement.
Box 2: Yes –
When joining two streams of data explicitly repartitioned, these streams must have the same partition key and partition count.
Box 3: Yes –
Streaming Units (SUs) represents the computing resources that are allocated to execute a Stream Analytics job. The higher the number of SUs, the more CPU and memory resources are allocated for your job.
In general, the best practice is to start with 6 SUs for queries that don’t use PARTITION BY.
Here there are 10 partitions, so 6×10 = 60 SUs is good.
Note: Remember, Streaming Unit (SU) count, which is the unit of scale for Azure Stream Analytics, must be adjusted so the number of physical resources available to the job can fit the partitioned flow. In general, six SUs is a good number to assign to each partition. In case there are insufficient resources assigned to the job, the system will only apply the repartition if it benefits the job.
Reference:
https://azure.microsoft.com/en-in/blog/maximize-throughput-with-repartitioning-in-azure-stream-analytics/
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-streaming-unit-consumption

Question 31

You have to deploy resources on Azure HDInsight for a batch processing job. The batch processing must run daily and must scale to minimize costs. You must also be able to monitor cluster performance.
You need to decide on a tool that will monitor the clusters and provide suggestions on how to scale.
You decide on monitoring the cluster load by using the Ambari Web UI.
Would this fulfill the requirement?

A. Yes

B. No

 


Suggested Answer: A

Yes, this will give you a good idea on the load on the Azure HDInsight cluster.
The Microsoft documentation mentions the following:
Monitor cluster load –
Hadoop clusters can deliver the most optimal performance when the load on cluster is evenly distributed across all the nodes. This enables the processing tasks to run without being constrained by RAM, CPU, or disk resources on individual nodes.
To get a high-level look at the nodes of your cluster and their loading, sign in to the Ambari Web UI, then select the Hosts tab. Your hosts are listed by their fully qualified domain names. Each host’s operating status is shown by a colored health indicator:
Reference Image
Reference:
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-key-scenarios-to-monitor

Question 32

You have an enterprise data warehouse in Azure Synapse Analytics.
Using PolyBase, you create an external table named [Ext].[Items] to query Parquet files stored in Azure Data Lake Storage Gen2 without importing the data to the data warehouse.
The external table has three columns.
You discover that the Parquet files have a fourth column named ItemID.
Which command should you run to add the ItemID column to the external table?
 Image

A. Option A

B. Option B

C. Option C

D. Option D

 


Suggested Answer: A

Incorrect Answers:
B, D: Only these Data Definition Language (DDL) statements are allowed on external tables:
✑ CREATE TABLE and DROP TABLE
✑ CREATE STATISTICS and DROP STATISTICS
✑ CREATE VIEW and DROP VIEW
Reference:
https://docs.microsoft.com/en-us/sql/t-sql/statements/create-external-table-transact-sql
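Because ALTER TABLE cannot add a column to an external table, the change amounts to dropping and recreating the table with the extra column; a minimal sketch, with the existing column list, data source, and file format names assumed for illustration:
DROP EXTERNAL TABLE [Ext].[Items];

CREATE EXTERNAL TABLE [Ext].[Items]
(
    [ItemName]     NVARCHAR(50),
    [ItemCategory] NVARCHAR(50),
    [ItemPrice]    DECIMAL(18, 2),
    [ItemID]       INT              -- the newly discovered fourth column
)
WITH (LOCATION = '/items/', DATA_SOURCE = AzureDataLakeStore, FILE_FORMAT = ParquetFormat);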

Question 33

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
A company uses Azure Data Lake Gen 1 Storage to store big data related to consumer behavior.
You need to implement logging.
Solution: Use information stored in Azure Active Directory reports.
Does the solution meet the goal?

A. Yes

B. No

 


Suggested Answer: B

Instead configure Azure Data Lake Storage diagnostics to store logs and metrics in a storage account.
Note:
You can enable diagnostic logging for your Azure Data Lake Storage Gen1 accounts, blobs, files, queues and tables.
Diagnostic logs aren’t available for Data Lake Storage Gen2 accounts [as of August 2019].
References:
https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-diagnostic-logs
https://github.com/MicrosoftDocs/azure-docs/issues/34286

Question 34

You develop data engineering solutions for a company.
You must integrate the company's on-premises Microsoft SQL Server data with Microsoft Azure SQL Database. Data must be transformed incrementally.
You need to implement the data integration solution.
Which tool should you use to configure a pipeline to copy data?

A. Use the Copy Data tool with Blob storage linked service as the source

B. Use Azure PowerShell with SQL Server linked service as a source

C. Use Azure Data Factory UI with Blob storage linked service as a source

D. Use the .NET Data Factory API with Blob storage linked service as the source

 


Suggested Answer: C

The Integration Runtime is a customer managed data integration infrastructure used by Azure Data Factory to provide data integration capabilities across different network environments.
A linked service defines the information needed for Azure Data Factory to connect to a data resource. We have three resources in this scenario for which linked services are needed:
✑ On-premises SQL Server
✑ Azure Blob Storage
✑ Azure SQL database
Note: Azure Data Factory is a fully managed cloud-based data integration service that orchestrates and automates the movement and transformation of data. The key concept in the ADF model is pipeline. A pipeline is a logical grouping of Activities, each of which defines the actions to perform on the data contained in
Datasets. Linked services are used to define the information needed for Data Factory to connect to the data resources.
References:
https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/move-sql-azure-adf

Question 35

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
A company uses Azure Data Lake Gen 1 Storage to store big data related to consumer behavior.
You need to implement logging.
Solution: Configure Azure Data Lake Storage diagnostics to store logs and metrics in a storage account.
Does the solution meet the goal?

A. Yes

B. No

 


Suggested Answer: A

From the Azure Storage account that contains log data, open the Azure Storage account blade associated with Data Lake Storage Gen1 for logging, and then click Blobs. The Blob service blade lists two containers.
Reference Image
Note:
You can enable diagnostic logging for your Azure Data Lake Storage Gen1 accounts, blobs, files, queues and tables.
Diagnostic logs aren’t available for Data Lake Storage Gen2 accounts [as of August 2019].
Reference:
https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-diagnostic-logs
https://github.com/MicrosoftDocs/azure-docs/issues/34286

Question 36

HOTSPOT -
A company plans to use Platform-as-a-Service (PaaS) to create the new data pipeline process. The process must meet the following requirements:
Ingest:
✑ Access multiple data sources.
✑ Provide the ability to orchestrate workflow.
✑ Provide the capability to run SQL Server Integration Services packages.
Store:
✑ Optimize storage for big data workloads
✑ Provide encryption of data at rest.
✑ Operate with no size limits.
Prepare and Train:
✑ Provide a fully-managed and interactive workspace for exploration and visualization.
✑ Provide the ability to program in R, SQL, Python, Scala, and Java.
✑ Provide seamless user authentication with Azure Active Directory.
Model & Serve:
✑ Implement native columnar storage.
✑ Support for the SQL language.
✑ Provide support for structured streaming.
You need to build the data integration pipeline.
Which technologies should you use? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
 Image

 


Suggested Answer:
Correct Answer Image

Ingest: Azure Data Factory –
Azure Data Factory pipelines can execute SSIS packages.
In Azure, the following services and tools will meet the core requirements for pipeline orchestration, control flow, and data movement: Azure Data Factory, Oozie on HDInsight, and SQL Server Integration Services (SSIS).
Store: Data Lake Storage –
Data Lake Storage Gen1 provides unlimited storage.
Note: Data at rest includes information that resides in persistent storage on physical media, in any digital format. Microsoft Azure offers a variety of data storage solutions to meet different needs, including file, disk, blob, and table storage. Microsoft also provides encryption to protect Azure SQL Database, Azure Cosmos
DB, and Azure Data Lake.
Prepare and Train: Azure Databricks
Azure Databricks provides enterprise-grade Azure security, including Azure Active Directory integration.
With Azure Databricks, you can set up your Apache Spark environment in minutes, autoscale and collaborate on shared projects in an interactive workspace.
Azure Databricks supports Python, Scala, R, Java and SQL, as well as data science frameworks and libraries including TensorFlow, PyTorch and scikit-learn.
Model and Serve: Azure Synapse Analytics
Azure Synapse Analytics/ SQL Data Warehouse stores data into relational tables with columnar storage.
Azure SQL Data Warehouse connector now offers efficient and scalable structured streaming write support for SQL Data Warehouse. Access SQL Data
Warehouse from Azure Databricks using the SQL Data Warehouse connector.
Note: As of November 2019, Azure SQL Data Warehouse is now Azure Synapse Analytics.
References:
https://docs.microsoft.com/bs-latn-ba/azure/architecture/data-guide/technology-choices/pipeline-orchestration-data-movement
https://docs.microsoft.com/en-us/azure/azure-databricks/what-is-azure-databricks

Question 37

You need to develop a pipeline for processing data. The pipeline must meet the following requirements:
✑ Scale up and down resources for cost reduction
✑ Use an in-memory data processing engine to speed up ETL and machine learning operations.
✑ Use streaming capabilities
✑ Provide the ability to code in SQL, Python, Scala, and R
✑ Integrate workspace collaboration with Git
 Image
What should you use?

A. HDInsight Spark Cluster

B. Azure Stream Analytics

C. HDInsight Hadoop Cluster

D. Azure SQL Data Warehouse

E. HDInsight Kafka Cluster

F. HDInsight Storm Cluster

 


Suggested Answer: A

Apache Spark is an open-source, parallel-processing framework that supports in-memory processing to boost the performance of big-data analysis applications.
HDInsight is a managed Hadoop service. Use it to deploy and manage Hadoop clusters in Azure. For batch processing, you can use Spark, Hive, Hive LLAP,
MapReduce.
Languages: R, Python, Java, Scala, SQL
You can create an HDInsight Spark cluster using an Azure Resource Manager template. The template can be found in GitHub.
References:
https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing

Question 38

You have an activity in an Azure Data Factory pipeline. The activity calls a stored procedure in a data warehouse in Azure Synapse Analytics and runs daily.
You need to verify the duration of the activity when it ran last.
What should you use?

A. the sys.dm_pdw_wait_stats data management view in Azure Synapse Analytics

B. an Azure Resource Manager template

C. activity runs in Azure Monitor

D. Activity log in Azure Synapse Analytics

 


Suggested Answer: C

Monitor activity runs. To get a detailed view of the individual activity runs of a specific pipeline run, click on the pipeline name.
Example:
Reference Image
The list view shows activity runs that correspond to each pipeline run. Hover over the specific activity run to get run-specific information such as the JSON input,
JSON output, and detailed activity-specific monitoring experiences.
Reference Image
You can check the Duration.
Incorrect Answers:
A: sys.dm_pdw_wait_stats holds information related to the SQL Server OS state related to instances running on the different nodes.
Reference:
https://docs.microsoft.com/en-us/azure/data-factory/monitor-visually

Question 39

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:
✑ A workload for data engineers who will use Python and SQL
✑ A workload for jobs that will run notebooks that use Python, Scala, and SQL
✑ A workload that data scientists will use to perform ad hoc analysis in Scala and R
The enterprise architecture team at your company identifies the following standards for Databricks environments:
✑ The data engineers must share a cluster.
✑ The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
✑ All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.
You need to create the Databricks clusters for the workloads.
Solution: You create a Standard cluster for each data scientist, a Standard cluster for the data engineers, and a High Concurrency cluster for the jobs.
Does this meet the goal?

A. Yes

B. No

 


Suggested Answer: B

We need a High Concurrency cluster for the data engineers and the jobs.
Note:
Standard clusters are recommended for a single user. Standard can run workloads developed in any language: Python, R, Scala, and SQL.
A high concurrency cluster is a managed cloud resource. The key benefits of high concurrency clusters are that they provide Apache Spark-native fine-grained sharing for maximum resource utilization and minimum query latencies.
References:
https://docs.azuredatabricks.net/clusters/configure.html

Question 40

A company runs Microsoft SQL Server in an on-premises virtual machine (VM).
You must migrate the database to Azure SQL Database. You synchronize users from Active Directory to Azure Active Directory (Azure AD).
You need to configure Azure SQL Database to use an Azure AD user as administrator.
What should you configure?

A. For each Azure SQL Database, set the Access Control to administrator.

B. For each Azure SQL Database server, set the Active Directory to administrator.

C. For each Azure SQL Database, set the Active Directory administrator role.

D. For each Azure SQL Database server, set the Access Control to administrator.

 


Suggested Answer: C

There are two administrative accounts (Server admin and Active Directory admin) that act as administrators.
One Azure Active Directory account, either an individual or security group account, can also be configured as an administrator. It is optional to configure an Azure
AD administrator, but an Azure AD administrator must be configured if you want to use Azure AD accounts to connect to SQL Database.
References:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-manage-logins

Question 41

Your company manages a payroll application for its customers worldwide. The application uses an Azure SQL database named DB1. The database contains a table named Employee and an identity column named EmployeeId.
A customer requests the EmployeeId be treated as sensitive data.
Whenever a user queries EmployeeId, you need to return a random value between 1 and 10 instead of the EmployeeId value.
Which masking format should you use?

A. string

B. number

C. default

 


Suggested Answer: B

Reference:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-dynamic-data-masking-get-started-portal
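
If you prefer to script the rule rather than use the portal, a sketch along these lines should work with the Az.Sql module; the resource group and server names are placeholders, and dynamic data masking may need to be enabled on the database first.
# A minimal sketch (resource group and server names are placeholders).
# Mask dbo.Employee.EmployeeId with a random number between 1 and 10 for non-privileged users.
New-AzSqlDatabaseDataMaskingRule -ResourceGroupName "rg-payroll" -ServerName "sql-payroll" `
    -DatabaseName "DB1" -SchemaName "dbo" -TableName "Employee" -ColumnName "EmployeeId" `
    -MaskingFunction "Number" -NumberFrom 1 -NumberTo 10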

Question 42

SIMULATION -
Use the following login credentials as needed:
Azure Username: xxxxx -
Azure Password: xxxxx -
The following information is for technical support purposes only:
Lab Instance: 10543936 -
 Image
You plan to enable Azure Multi-Factor Authentication (MFA).
You need to ensure that User1-10543936@ExamUsers.com can manage any databases hosted on an Azure SQL server named SQL10543936 by signing in using his Azure Active Directory (Azure AD) user account.
To complete this task, sign in to the Azure portal.

 


Suggested Answer: See the explanation below.

Provision an Azure Active Directory administrator for your Azure SQL server.
Each Azure SQL server (which hosts a SQL Database or SQL Data Warehouse) starts with a single server administrator account that is the administrator of the entire server. A second administrator, which is an Azure AD account, must then be created. This principal is created as a contained database user in the master database.
1. In the Azure portal, in the upper-right corner, select your connection to drop down a list of possible Active Directories. Choose the correct Active Directory as the default Azure AD. This step links the subscription-associated Active Directory with Azure SQL server making sure that the same subscription is used for both
Azure AD and SQL Server. (The Azure SQL server can be hosting either Azure SQL Database or Azure SQL Data Warehouse.)
Reference Image
2. Search for and select the SQL server SQL10543936
Reference Image
3. In SQL Server page, select Active Directory admin.
4. In the Active Directory admin page, select Set admin.
Reference Image
5. In the Add admin page, search for the user User1-10543936@ExamUsers.com, select it, and then select Select. (The Active Directory admin page shows all members and groups of your Active Directory. Users or groups that are grayed out cannot be selected because they are not supported as Azure AD administrators.)
Reference Image
6. At the top of the Active Directory admin page, select SAVE.
Reference Image
Reference:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-aad-authentication-configure?
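
The same configuration can be applied with PowerShell instead of the portal. A minimal sketch, assuming the Az.Sql module; the resource group name is a placeholder, while the server and user names come from the task.
# A minimal sketch (the resource group name is a placeholder).
Set-AzSqlServerActiveDirectoryAdministrator -ResourceGroupName "rg-lab" `
    -ServerName "SQL10543936" -DisplayName "User1-10543936@ExamUsers.com"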

Question 43

You have an Azure data factory.
You need to examine the pipeline failures from the last 60 days.
What should you use?

A. the Activity log blade for the Data Factory resource

B. Azure Monitor

C. the Monitor & Manage app in Data Factory

D. the Resource health blade for the Data Factory resource

 


Suggested Answer: B

Data Factory stores pipeline-run data for only 45 days. Use Azure Monitor if you want to keep that data for a longer time. With Monitor, you can route diagnostic logs for analysis to multiple different targets.
Reference:
https://docs.microsoft.com/en-us/azure/data-factory/monitor-using-azure-monitor
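
To keep run data for 60 days, route the Data Factory diagnostic logs to a Log Analytics workspace and query the failures there. A rough sketch, assuming the older Set-AzDiagnosticSetting cmdlet from Az.Monitor (newer module versions replace it with New-AzDiagnosticSetting) and placeholder resource IDs:
# A rough sketch (resource IDs are placeholders; verify the cmdlet and parameters against your
# installed Az.Monitor version).
$adfId = "/subscriptions/<sub-id>/resourceGroups/rg-data/providers/Microsoft.DataFactory/factories/adf-prod"
$lawId = "/subscriptions/<sub-id>/resourceGroups/rg-monitor/providers/Microsoft.OperationalInsights/workspaces/law-prod"
Set-AzDiagnosticSetting -ResourceId $adfId -WorkspaceId $lawId -Enabled $true `
    -Category "PipelineRuns", "ActivityRuns"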

Question 44

DRAG DROP -
Your company uses Microsoft Azure SQL Database configured with Elastic pools. You use Elastic Database jobs to run queries across all databases in the pool.
You need to analyze, troubleshoot, and report on components responsible for running Elastic Database jobs.
You need to determine the component responsible for running job service tasks.
Which components should you use for each Elastic pool job services task? To answer, drag the appropriate component to the correct task. Each component may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:
 Image

 


Suggested Answer:
Correct Answer Image

Execution results and diagnostics: Azure Storage
Job launcher and tracker: Job Service
Job metadata and state: Control database
The Job database is used for defining jobs and for tracking the status and history of job executions. It also stores agent metadata, logs, results, and job definitions, and it contains many useful stored procedures and other database objects for creating, running, and managing jobs by using T-SQL.
References:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-job-automation-overview

Question 45

You plan to create a dimension table in Azure Synapse Analytics that will be less than 1 GB.
You need to create the table to meet the following requirements:
✑ Provide the fastest query time.
✑ Minimize data movement during queries.
Which type of table should you use?

A. hash distributed

B. heap

C. replicated

D. round-robin

 


Suggested Answer: C

A replicated table caches a full copy of the table on each compute node, so queries that join to it require no data movement and get the fastest query performance. Microsoft's design guidance recommends replicated tables for dimension tables that are smaller than 2 GB on disk, which covers this table.
Hash distribution is intended for large fact tables, and a round-robin table still requires data movement whenever it is joined, so neither meets the requirement to minimize data movement during queries.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/design-guidance-for-replicated-tables
https://blogs.msdn.microsoft.com/sqlcat/2015/08/11/choosing-hash-distributed-table-vs-round-robin-distributed-table-in-azure-sql-dw-service/
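
The table type is chosen in the CREATE TABLE statement's WITH clause. A minimal sketch that runs the DDL through Invoke-Sqlcmd (SqlServer module); the server, database, credentials, and column list are placeholders.
# A minimal sketch (server, database, credentials, and the table definition are placeholders).
# Creates a small dimension table as a replicated table in a dedicated SQL pool, so joins
# against it need no data movement.
$query = @"
CREATE TABLE dbo.DimCustomer
(
    CustomerKey  INT           NOT NULL,
    CustomerName NVARCHAR(100) NOT NULL
)
WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);
"@
Invoke-Sqlcmd -ServerInstance "synapse-ws.sql.azuresynapse.net" -Database "sqlpool01" `
    -Username "sqladminuser" -Password "<password>" -Query $query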

Question 46

You have an Azure Stream Analytics job.
You need to ensure that the job has enough streaming units provisioned.
You configure monitoring of the SU% Utilization metric.
Which two additional metrics should you monitor? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

A. Watermark Delay

B. Late Input Events

C. Out of order Events

D. Backlogged Input Events

E. Function Events

 


Suggested Answer: BD

B: Late Input Events: events that arrived later than the configured late arrival tolerance window.
Note: While comparing utilization over a period of time, use event rate metrics. InputEvents and OutputEvents metrics show how many events were read and processed.
D: In job diagram, there is a per partition backlog event metric for each input. If the backlog event metric keeps increasing, it’s also an indicator that the system resource is constrained (either because of output sink throttling, or high CPU).
Reference:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-scale-jobs
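
These metrics can also be read outside the portal. A rough sketch with Az.Monitor: the job resource ID is a placeholder, and the metric name strings below are assumptions about the underlying metric IDs (the portal shows friendly names), so list them with Get-AzMetricDefinition first.
# A rough sketch (the resource ID is a placeholder; the metric names are assumed, so confirm
# them with Get-AzMetricDefinition before depending on this).
$jobId = "/subscriptions/<sub-id>/resourceGroups/rg-stream/providers/Microsoft.StreamAnalytics/streamingjobs/asa-job"
Get-AzMetricDefinition -ResourceId $jobId                  # lists the exact metric names available
Get-AzMetric -ResourceId $jobId -TimeGrain "00:05:00" `
    -MetricName "ResourceUtilization", "LateInputEvents", "InputEventsSourcesBacklogged"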

Question 47

DRAG DROP -
You plan to create a new single database instance of Microsoft Azure SQL Database.
The database must only allow communication from the data engineer's workstation. You must connect directly to the instance by using Microsoft SQL Server
Management Studio.
You need to create and configure the Database. Which three Azure PowerShell cmdlets should you use to develop the solution? To answer, move the appropriate cmdlets from the list of cmdlets to the answer area and arrange them in the correct order.
Select and Place:
 Image

 


Suggested Answer:
Correct Answer Image

Step 1: New-AzureRmSqlServer
Creates the logical SQL server.
Step 2: New-AzureRmSqlServerFirewallRule
New-AzureRmSqlServerFirewallRule creates a firewall rule for a SQL Database server. It can be used to create a server firewall rule that allows access from a specified IP range (in this case, the data engineer's workstation).
Step 3: New-AzureRmSqlDatabase
Example: Create a database on a specified server
PS C:\> New-AzureRmSqlDatabase -ResourceGroupName "ResourceGroup01" -ServerName "Server01" -DatabaseName "Database01"
References:
https://docs.microsoft.com/en-us/azure/sql-database/scripts/sql-database-create-and-configure-database-powershell?toc=%2fpowershell%2fmodule%2ftoc.json
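
Put together, the three steps look roughly like the sketch below. All names, the location, the credentials, and the workstation IP are placeholders; with the current Az module the equivalent cmdlets are New-AzSqlServer, New-AzSqlServerFirewallRule, and New-AzSqlDatabase.
# A minimal sketch (all names and the IP address are placeholders).
$cred = Get-Credential                     # SQL administrator login for the new logical server
New-AzureRmSqlServer -ResourceGroupName "ResourceGroup01" -ServerName "server01" `
    -Location "East US 2" -SqlAdministratorCredentials $cred
# Restrict access to the data engineer's workstation by using a single-address range.
New-AzureRmSqlServerFirewallRule -ResourceGroupName "ResourceGroup01" -ServerName "server01" `
    -FirewallRuleName "EngineerWorkstation" -StartIpAddress "203.0.113.10" -EndIpAddress "203.0.113.10"
New-AzureRmSqlDatabase -ResourceGroupName "ResourceGroup01" -ServerName "server01" -DatabaseName "Database01"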

Question 48

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have an Azure subscription that contains an Azure Storage account.
You plan to implement changes to a data storage solution to meet regulatory and compliance standards.
Every day, Azure needs to identify and delete blobs that were NOT modified during the last 100 days.
Solution: You apply an Azure policy that tags the storage account.
Does this meet the goal?

A. Yes

B. No

 


Suggested Answer: B

Instead apply an Azure Blob storage lifecycle policy.
Reference:
https://docs.microsoft.com/en-us/azure/storage/blobs/storage-lifecycle-management-concepts?tabs=azure-portal
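
A lifecycle rule matching the 100-day requirement can be created with the Az.Storage management-policy cmdlets. A rough sketch with placeholder resource names; verify the parameter names against your module version.
# A rough sketch (resource group and storage account names are placeholders).
# Deletes base blobs that have not been modified for 100 days.
$action = Add-AzStorageAccountManagementPolicyAction -BaseBlobAction Delete -DaysAfterModificationGreaterThan 100
$filter = New-AzStorageAccountManagementPolicyFilter -BlobType blockBlob
$rule = New-AzStorageAccountManagementPolicyRule -Name "delete-stale-blobs" -Action $action -Filter $filter
Set-AzStorageAccountManagementPolicy -ResourceGroupName "rg-data" -StorageAccountName "stcompliance" -Rule $rule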

Question 49

A company has a SaaS solution that uses Azure SQL Database with elastic pools. The solution will have a dedicated database for each customer organization.
Customer organizations have peak usage at different periods during the year.
Which two factors affect your costs when sizing the Azure SQL Database elastic pools? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.

A. maximum data size

B. number of databases

C. eDTUs consumption

D. number of read operations

E. number of transactions

 


Suggested Answer: AC

A: With the vCore purchase model, in the General Purpose tier, you are charged for Premium blob storage that you provision for your database or elastic pool.
Storage can be configured between 5 GB and 4 TB with 1 GB increments. Storage is priced at GB/month.
C: In the DTU purchase model, elastic pools are available in basic, standard and premium service tiers. Each tier is distinguished primarily by its overall performance, which is measured in elastic Database Transaction Units (eDTUs).
References:
https://azure.microsoft.com/en-in/pricing/details/sql-database/elastic/
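
The two cost drivers appear directly as provisioning settings. A minimal sketch of creating a DTU-based pool with Az.Sql (names and sizes are placeholders); the pool's eDTU count (-Dtu) and maximum storage size (-StorageMb) drive the price, not the number of databases placed in the pool.
# A minimal sketch (names and sizes are placeholders).
New-AzSqlElasticPool -ResourceGroupName "rg-saas" -ServerName "sql-saas" -ElasticPoolName "pool-customers" `
    -Edition "Standard" -Dtu 400 -DatabaseDtuMin 0 -DatabaseDtuMax 100 -StorageMb 409600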

Question 50

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have an Azure subscription that contains an Azure Storage account.
You plan to implement changes to a data storage solution to meet regulatory and compliance standards.
Every day, Azure needs to identify and delete blobs that were NOT modified during the last 100 days.
Solution: You schedule an Azure Data Factory pipeline.
Does this meet the goal?

A. Yes

B. No

 


Suggested Answer: B

Instead, you can use the Delete Activity in Azure Data Factory to delete files or folders from on-premises or cloud storage stores, or apply an Azure Blob storage lifecycle policy.
Reference:
https://docs.microsoft.com/en-us/azure/data-factory/delete-activity
https://docs.microsoft.com/en-us/azure/storage/blobs/storage-lifecycle-management-concepts?tabs=azure-portal

Access Full DP-200 Exam Prep Free

Want to go beyond these 50 questions? Click here to unlock a full set of DP-200 exam prep free questions covering every domain tested on the exam.

We continuously update our content to ensure you have the most current and effective prep materials.

Good luck with your DP-200 certification journey!
