DP-900: Azure Data
Fundamentals (146 expert-curated
questions and answers)
What is batch data?
Batch data is any load of data that has a defined beginning and end; it
is not continuous.
What are examples of batch data?
Batch data includes CSV, TSV, JSON, XML, and Parquet files, blob files,
another database, or a cache for offline viewing.
Describe Streaming Data
Streaming data is continuous, with no defined start or stop. Examples
would be IoT data, logs, etc.
Other examples: sensor data, Event Hubs or IoT Hub, blob storage for
logs, Apache Kafka, Netflix, YouTube, course video.
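The difference between the two cards above can be sketched in a few lines of Python. This is only an illustration: `process_batch`, `sensor_stream`, and `process_stream` are hypothetical names, and the endless generator stands in for a real feed such as a sensor or event hub.

```python
import itertools
from typing import Iterable, Iterator


def process_batch(records: list[int]) -> int:
    """Batch: the whole dataset is available, so one pass gives a final total."""
    return sum(records)


def sensor_stream() -> Iterator[int]:
    """Hypothetical endless source standing in for a sensor feed."""
    reading = 0
    while True:
        reading += 1
        yield reading


def process_stream(stream: Iterable[int]) -> Iterator[int]:
    """Streaming: emit a running total per event; the input never ends."""
    total = 0
    for value in stream:
        total += value
        yield total


# A batch has a beginning and an end, so it produces one result.
batch_total = process_batch([1, 2, 3, 4])

# A stream has no end, so we can only look at a window of its results.
first_running_totals = list(itertools.islice(process_stream(sensor_stream()), 4))
```

The batch function returns once, while the streaming function keeps yielding results for as long as events arrive.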
Describe characteristics of relational data
Relational data is structured, has a schema, and is rigid. Databases
are composed of tables with rows and columns. Data integrity is based
on keys, data types, and relations.
Components include:
- tables
- views
- Primary keys: uniquely identify each row
- Foreign keys: child-parent relationship
Schema: the layout of the database, including table names, column
names, and their data types
Databases enforce integrity
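As a rough sketch of how keys enforce integrity, the example below uses Python's built-in `sqlite3` module rather than an Azure database. The `customers` and `orders` tables, the `customer_totals` view, and the `try_insert` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Schema: table names, column names, and data types.
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount REAL NOT NULL
    )
    """
)
# A view is a stored query over the tables.
conn.execute(
    """
    CREATE VIEW customer_totals AS
    SELECT c.name AS name, SUM(o.amount) AS total
    FROM customers c JOIN orders o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    """
)

conn.execute("INSERT INTO customers (customer_id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (order_id, customer_id, amount) VALUES (100, 1, 25.0)")


def try_insert(sql: str) -> bool:
    """Return True if the insert succeeds, False if the database rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Primary key: a second row with customer_id 1 is rejected.
duplicate_key_ok = try_insert(
    "INSERT INTO customers (customer_id, name) VALUES (1, 'Bob')"
)
# Foreign key: an order for a customer that does not exist is rejected.
orphan_order_ok = try_insert(
    "INSERT INTO orders (order_id, customer_id, amount) VALUES (101, 999, 5.0)"
)
totals = conn.execute("SELECT name, total FROM customer_totals").fetchall()
```

Both rejected inserts raise `sqlite3.IntegrityError`, which is the database enforcing integrity rather than the application.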
What are the 5 types of Analytics?
Descriptive (What Happened), Diagnostic (Why it Happened),
Predictive (What will Happen), Prescriptive (What should I do),
Cognitive (Machine learning predictions based on model)
What are Descriptive Analytics?
Descriptive Analytics describe what happened. For example, revenue
is down 10% year over year.
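A year-over-year figure like the one in this card is simple arithmetic. The sketch below assumes two made-up revenue totals; `year_over_year_change` is a hypothetical helper name, not part of any library.

```python
def year_over_year_change(previous: float, current: float) -> float:
    """Return the percentage change from the previous year to the current one."""
    if previous == 0:
        raise ValueError("previous-year value must be non-zero")
    return (current - previous) * 100 / previous


# Made-up figures: revenue fell from 200,000 to 180,000.
change = year_over_year_change(previous=200_000.0, current=180_000.0)
```

A negative result means the metric went down compared with the previous year.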
What are Diagnostic Analytics?
Describes why it happened. For example, revenue is down 10% year
over year due to the coronavirus, which resulted in fewer purchases
industry-wide.
What are Predictive Analytics?
Describes what will happen. For example, the next time we have a
pandemic, we can expect revenue to drop 10% based on history.
What are Prescriptive Analytics?
Describes what to do to fix a problem. For example, we need to drop
prices 10% to encourage customers to make bigger purchases, since
fewer people are outside.
What are Cognitive Analytics?
Machine learning (AI) recommendations based on a model.
What is the ELT process?
ELT is a form of data processing that stands for extract, load, then
transform.
Example of how this would be done:
Extract data and load it into a data lake, then perform transformations
with Databricks and move the results to a Synapse data warehouse.
Data is available before transformations are performed.
What is the ETL process?
ETL stands for extract, transform, load. The extract and transform
logic runs before the data is loaded and made available.
Data is not available before transformations are performed.
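The ordering difference between ETL and ELT can be sketched with plain Python lists standing in for the data lake and the warehouse. Everything here is hypothetical: the sample rows, the `transform` function, and the variable names are for illustration only and do not use Databricks or Synapse.

```python
# Hypothetical extracted source rows: untidy names and amounts stored as text.
raw_rows = [" alice ,10", "BOB,20"]


def transform(row: str) -> dict:
    """Clean one raw row into a typed record."""
    name, amount = row.split(",")
    return {"name": name.strip().title(), "amount": int(amount)}


# ETL: transform before loading, so the target only ever holds cleaned rows.
etl_warehouse = [transform(row) for row in raw_rows]

# ELT: load the raw rows first, so they are available immediately...
elt_lake = list(raw_rows)
# ...and transform them later, inside the target platform.
elt_warehouse = [transform(row) for row in elt_lake]
```

Both paths end with the same cleaned records. The difference is that in ELT the raw data in `elt_lake` can be read before any transformation has run.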
What are Microsoft Azure's 4 Relational Databases?
SQL Server in VM
SQL Managed Instance
Azure SQL Database
Azure Database for MySQL, PostgreSQL, or MariaDB
Why use SQL Server in a VM?
Guaranteed to be compatible with on-premises SQL Server.
No data size limit (can run databases above 4 TB)
Pay for the server and licensing, not per database (could be a
positive or a negative)
What are potential disadvantages to SQL Server in a VM?
You have to apply all updates yourself, and pick and install your SQL
Server version. You manage everything.
What is SQL Managed Instance?
(Isn't used much)
Close to 100% compatibility with on-premises SQL Server.
Fully managed service (Azure manages)
4-80 vCores
32 GB to 8 TB
What is Azure SQL DB?
Close to 100% compatibility with on-premises SQL Server
Many options for provisioned and serverless databases
Pay for performance or pay for hardware
2-80 vCores
5 GB to 4 TB
Starting at $5 per month
Uses the SQL Server engine underneath
Azure Relational Database Options
(IaaS) Infrastructure as a service
(PaaS) Platform as a service
(SaaS) Software as a service
How many ways can you classify data?