1. What is the purpose of the DISTINCT keyword in SQL?
A. To return only unique values in a result set
B. To order the result set by specific columns
C. To limit the number of records returned
D. To group records together
Answer: A) To return only unique values in a result set
Rationale: DISTINCT eliminates duplicate rows from the result set, returning only
unique values.
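As a minimal sketch of this behavior, the snippet below runs the same query with and without DISTINCT against an in-memory SQLite database. The `customers` table and its rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "Lisbon"), ("Ben", "Oslo"), ("Cara", "Lisbon"), ("Dev", "Oslo")],
)

# Without DISTINCT: one row per customer, so city names repeat.
all_cities = [
    row[0] for row in conn.execute("SELECT city FROM customers ORDER BY city")
]

# With DISTINCT: duplicate rows are removed from the result set.
unique_cities = [
    row[0]
    for row in conn.execute("SELECT DISTINCT city FROM customers ORDER BY city")
]

print(all_cities)     # ['Lisbon', 'Lisbon', 'Oslo', 'Oslo']
print(unique_cities)  # ['Lisbon', 'Oslo']
```

Note that DISTINCT applies to the whole selected row, so selecting both `name` and `city` here would return all four rows.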
2. Which of the following best describes a data lake?
A. A system used to store structured data only
B. A repository for storing large amounts of unstructured or semi-structured data
C. A tool used for creating data models
D. A method for querying real-time data
Answer: B) A repository for storing large amounts of unstructured or
semi-structured data
Rationale: A data lake stores vast amounts of raw data in its native format,
including unstructured and semi-structured data, allowing for more flexible data
processing and analysis.
3. Which SQL function is used to calculate the total number of rows in a table?
A. COUNT()
B. SUM()
C. AVG()
D. MAX()
Answer: A) COUNT()
Rationale: The COUNT() function is used to return the number of rows in a table
or a result set.
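The sketch below shows the common forms of COUNT() on a hypothetical `orders` table in an in-memory SQLite database. The table, column names, and rows are assumptions made for the example. COUNT(*) counts every row, while COUNT(column) skips rows where that column is NULL.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, coupon TEXT)")
conn.executemany(
    "INSERT INTO orders (coupon) VALUES (?)",
    [("SPRING",), (None,), ("SPRING",), (None,), ("VIP",)],
)

# COUNT(*) counts every row in the table.
total_rows = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# COUNT(column) counts only the rows where the column is not NULL.
with_coupon = conn.execute("SELECT COUNT(coupon) FROM orders").fetchone()[0]

# COUNT(DISTINCT column) counts the unique non-NULL values.
distinct_coupons = conn.execute(
    "SELECT COUNT(DISTINCT coupon) FROM orders"
).fetchone()[0]

print(total_rows, with_coupon, distinct_coupons)  # 5 3 2
```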
4. Which SQL clause is used to combine data from multiple tables based on a
related column?
A. WHERE
B. GROUP BY
C. JOIN
D. HAVING
Answer: C) JOIN
Rationale: JOIN is used to combine data from two or more tables based on a
related column.
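To make the idea of a related column concrete, here is a small sketch that joins two hypothetical tables, `customers` and `orders`, on the customer id. All table names, column names, and rows are illustrative assumptions. It uses an inner join, which is what a plain JOIN means in SQL.

```python
import sqlite3

# Hypothetical in-memory tables, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ana"), (2, "Ben"), (3, "Cara")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5)],
)

# orders.customer_id is the related column that points at customers.id.
joined = conn.execute(
    """
    SELECT c.name, o.total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
    """
).fetchall()

print(joined)  # [('Ana', 25.0), ('Ana', 40.0), ('Ben', 15.5)]
```

Cara has no matching rows in `orders`, so an inner join leaves her out. A LEFT JOIN from `customers` would keep her, with NULL for the order total.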
5. What is the main advantage of using a data warehouse?
A. It stores transactional data
B. It helps with data analysis and reporting
C. It normalizes data automatically
D. It is used for live transaction processing
Answer: B) It helps with data analysis and reporting
Rationale: Data warehouses are used for analyzing large sets of data, aggregating
information, and supporting business intelligence and reporting.
6. What is the purpose of the HAVING clause in SQL?
A. To filter records before grouping
B. To filter records after grouping
C. To limit the number of rows returned
D. To sort the result set