TEST 2026 FULL QUESTIONS AND SOLUTIONS
◉ Describe the difference between data processing and querying
Answer: In both cases, the user knows what they want. The
difference is that with querying they can only describe it, whereas
with data processing they actually have a way to compute it.
◉ Describe the difference between data exploration and navigation
Answer: In exploration, the user does not know what they want but
wants to get an idea about the data. In navigation, the user DOES
know what they want but does not know how to describe/locate it.
◉ What do these acronyms describe? INS, 3Vs, HMLE Answer: Data
challenges
◉ What does INS stand for and mean? Answer: INS is a list of data
challenges: Imprecision, Noise, Sparsity
◉ What does 3Vs stand for and mean? Answer: 3Vs is a list of data
challenges: Volume, Velocity, Variety
◉ What does HMLE stand for and mean? Answer: HMLE is a list of
data challenges: High-dimensional, Multi-modal, Inter-Linked,
Evolving
◉ What is a data schema? Answer: A set of constraints that...
1. describe the "properties" of data
2. describe the structure of data
3. enable validation & efficient storage of data
4. enable querying and retrieval of data
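As a rough illustration of point 3 (validation), a schema can be written down as a set of field constraints and checked against records. This is only a minimal sketch: the field names, the `SCHEMA` dictionary, and the `validate` helper are hypothetical and not taken from any particular database system.

```python
# Hypothetical schema: field name -> (expected type, required?)
SCHEMA = {
    "id": (int, True),
    "name": (str, True),
    "email": (str, False),
}

def validate(record, schema):
    """Return a list of constraint violations (empty if the record is valid)."""
    errors = []
    # Check every field the schema describes.
    for field, (expected_type, required) in schema.items():
        if field not in record:
            if required:
                errors.append(f"missing required field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"wrong type for field: {field}")
    # Reject fields the schema does not describe.
    for field in record:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors

ok = validate({"id": 1, "name": "Ada"}, SCHEMA)
bad = validate({"id": "1", "age": 30}, SCHEMA)
```

Here `ok` is an empty list, while `bad` reports a wrong type for `id`, a missing `name`, and an unexpected `age` field.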
◉ Advantages of a structured database? Answer:
1. Easier to query
2. Easier to optimize
3. Easier to explore
◉ Advantages of semi-structured database? Answer: Data
organization is flexible/malleable (easier to integrate and exchange).
◉ Describe the curse of dimensionality Answer: The more
dimensions the data has, the more data we need to discover reliable
patterns and to prevent overfitting.
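One way to see this, as a rough sketch: assume every dimension is split into 10 bins (an arbitrary choice for illustration) and the dataset has a fixed, hypothetical size. The number of grid cells grows exponentially with the number of dimensions, so the same data is spread ever more thinly.

```python
def num_cells(bins_per_axis, dims):
    """Number of grid cells when every axis is split into the same number of bins."""
    return bins_per_axis ** dims

def avg_points_per_cell(n_points, bins_per_axis, dims):
    """Average number of data points that fall into one grid cell."""
    return n_points / num_cells(bins_per_axis, dims)

N_POINTS = 1_000_000  # hypothetical dataset size

for dims in (1, 2, 3, 6, 9):
    print(dims, num_cells(10, dims), avg_points_per_cell(N_POINTS, 10, dims))
```

With these assumed numbers, 6 dimensions already leave one point per cell on average, and 9 dimensions leave most cells empty.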
◉ True or False: The distance between two points is equal to the
length of the distance vector. Answer: True
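A minimal check of this statement, assuming Euclidean distance (the card does not name a metric). The points `p` and `q` are made-up values; `math.dist` is the standard-library Euclidean distance (Python 3.8+).

```python
import math

def distance_vector(p, q):
    """Component-wise difference q - p."""
    return [b - a for a, b in zip(p, q)]

def vector_length(v):
    """Euclidean (L2) length of a vector."""
    return math.sqrt(sum(x * x for x in v))

p = (1.0, 2.0)  # hypothetical point
q = (4.0, 6.0)  # hypothetical point

d = distance_vector(p, q)   # [3.0, 4.0]
length = vector_length(d)   # 5.0
direct = math.dist(p, q)    # distance computed directly from the points
```

Both `length` and `direct` come out as 5.0 for these points.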
◉ Give an example of data transformation Answer: A gender column
that originally holds "M" or "F" in a single column is transformed
into two columns (one for M and one for F), each holding 0s and 1s
to indicate the value.
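This transformation is commonly called one-hot encoding. A minimal sketch in plain Python, where the `one_hot` helper and the sample column are made up for illustration:

```python
def one_hot(values, categories):
    """Return one 0/1 indicator column per category."""
    return {c: [1 if v == c else 0 for v in values] for c in categories}

gender = ["M", "F", "F", "M"]          # original single column
encoded = one_hot(gender, ["M", "F"])  # two indicator columns
# encoded == {"M": [1, 0, 0, 1], "F": [0, 1, 1, 0]}
```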
◉ Which aspects of data should be handled by a scalable data
exploration system? Answer:
1. The amount of data (volume)
2. The diversity of the data types (variety)
3. The speed at which new data is generated (velocity)
◉ Give an example of prefix search Answer: Find all strings that start
with "tab"
• "table"; "tabular"; "tablet";...
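A minimal sketch of prefix search over a sorted word list. It assumes the words are kept sorted, so all matches sit next to each other and binary search can find the first one. The word list and the `prefix_search` helper are hypothetical.

```python
from bisect import bisect_left

def prefix_search(sorted_words, prefix):
    """Return all words in a sorted list that start with prefix."""
    # In sorted order, every word with this prefix sits in one
    # contiguous run starting at the insertion point of the prefix.
    start = bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    return matches

words = sorted(["table", "stable", "tabular", "tablet", "cable", "tag"])
result = prefix_search(words, "tab")
# result == ["table", "tablet", "tabular"]
```

Note that "stable" and "cable" contain "tab" but do not start with it, so they are not returned.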
◉ Give an example of subsequence search Answer: Find all strings
that contain the subsequence "ark"
• "marketing"; "spark"; "quark";...
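In the examples above, "ark" appears as a contiguous run of characters, so this sketch assumes that "subsequence" here means a contiguous substring. The word list and the `contains_search` helper are made up for illustration.

```python
def contains_search(words, pattern):
    """Return all words that contain pattern as a contiguous substring."""
    return [w for w in words if pattern in w]

words = ["marketing", "spark", "quark", "table", "arc"]
result = contains_search(words, "ark")
# result == ["marketing", "spark", "quark"]
```

Unlike prefix search, the match may start anywhere in the string.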
◉ Give an example of subsequence match Answer: - Find the longest
matching subsequence between "plasticity" and "scholastic"
- Find the most frequently repeated 3-character subsequence in
• "abcbbbaabbaabcbbbaaabbc"
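Both tasks can be sketched in a few lines of Python, again assuming that "subsequence" means a contiguous run of characters. The helper names are hypothetical, and the dynamic-programming approach is just one common way to find a longest common substring.

```python
from collections import Counter

def longest_common_substring(a, b):
    """Longest contiguous run of characters shared by a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]

def most_frequent_ngrams(text, n=3):
    """Return (n-grams with the highest count, that count); ties are all kept."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    top = max(counts.values())
    return [g for g, c in counts.items() if c == top], top

match = longest_common_substring("plasticity", "scholastic")
# match == "lastic"
grams, count = most_frequent_ngrams("abcbbbaabbaabcbbbaaabbc")
# grams == ["bba", "baa", "aab"], count == 3
```

For the example string, three different 3-character runs tie for the highest count, so the sketch returns all of them rather than picking one.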