CSE 578 DATA VISUALIZATION EXAM
ACTUAL EXAMINATION 2026 QUESTIONS
WITH ANSWERS GRADED A+
⫸ Describe the difference between data processing and querying
Answer: In both instances, the user knows what they want. The
difference is that with querying, they can only describe it whereas in
data processing they actually have a way to compute it.
⫸ Describe the difference between data exploration and navigation
Answer: In exploration, the user does not know what they want but
wants to get an idea about the data. In navigation, the user DOES know
what they want but does not know how to describe/locate it.
⫸ What do these acronyms describe? INS, 3Vs, HMLE Answer: Data
challenges
⫸ What does INS stand for and mean? Answer: INS is a list of data
challenges: Imprecision, Noise, Sparsity
⫸ What does 3Vs stand for and mean? Answer: 3Vs is a list of data
challenges: Volume, Velocity, Variety
⫸ What does HMLE stand for and mean? Answer: HMLE is a list of
data challenges: High-dimensional, Multi-modal, Inter-Linked, Evolving
⫸ What is a data schema? Answer: A set of constraints that...
1. describe the "properties" of data
2. describe the structure of data
3. enable validation & efficient storage of data
4. enable querying and retrieval of data
⫸ Advantages of a structured database? Answer: 1. Easier to query
2. Easier to optimize
3. Easier to explore
⫸ Advantages of semi-structured database? Answer: Data organization
is flexible/malleable (easier to integrate and exchange).
⫸ Describe the curse of dimensionality Answer: The more dimensions
we have, the more data we need to discover patterns (prevent
overfitting).
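One way to make this concrete, as a rough sketch: assume each dimension is cut into a fixed number of bins and we want at least one data point per cell. The helper name cells_in_grid and the choice of 10 bins are illustrative assumptions, not part of the original answer.

```python
def cells_in_grid(bins_per_dim, dims):
    # Number of cells when each of `dims` axes is cut into `bins_per_dim` bins.
    # Covering every cell with at least one point needs at least this many points.
    return bins_per_dim ** dims

for d in (1, 2, 3, 10):
    print(d, cells_in_grid(10, d))
```

The count grows exponentially with the number of dimensions, which is why the same amount of data becomes sparse very quickly.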
⫸ True or False: The distance between two points is equal to the length
of the distance vector. Answer: True
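A minimal sketch of why this holds, assuming the Euclidean (L2) distance is meant: the distance vector is the component-wise difference of the two points, and its length is the distance. The function name distance is illustrative.

```python
import math

def distance(p, q):
    # Distance vector: component-wise difference between the two points
    diff = [b - a for a, b in zip(p, q)]
    # Euclidean length (L2 norm) of that vector
    return math.sqrt(sum(d * d for d in diff))

print(distance((1, 2), (4, 6)))  # distance vector is (3, 4), length 5.0
```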
⫸ Give an example of data transformation Answer: A single Gender
column holding "M" or "F" is transformed into two columns (one for M
and one for F) holding 0s and 1s that indicate which value applies to
each row.
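This kind of transformation is commonly called one-hot encoding. A small sketch in plain Python, where the function name one_hot and the sample rows are made up for illustration:

```python
def one_hot(values, categories):
    # For each value, emit one 0/1 column per category
    return [{c: int(v == c) for c in categories} for v in values]

rows = one_hot(["M", "F", "F"], ["M", "F"])
for row in rows:
    print(row)
```

Each input value becomes a row with exactly one column set to 1.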
⫸ Which aspects of data should be handled by a scalable data
exploratory system? Answer: 1. The amount of data
2. The diversity of the data types
3. The speed of new data generated
⫸ Give an example of prefix search Answer: Find all strings that start
with "tab"
• "table"; "tabular"; "tablet";...
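The example above can be sketched with a simple linear scan. The word list is illustrative, and a real system might use a trie or a sorted index instead.

```python
def prefix_search(words, prefix):
    # Keep only the strings that begin with the given prefix
    return [w for w in words if w.startswith(prefix)]

words = ["table", "tabular", "tablet", "stable", "tap"]
print(prefix_search(words, "tab"))  # ['table', 'tabular', 'tablet']
```

Note that "stable" contains "tab" but does not start with it, so it is not a prefix match.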
⫸ Give an example of subsequence search Answer: Find all strings that
contain the subsequence "ark"
• "marketing"; "spark"; "quark";...
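A minimal sketch of this search, assuming "subsequence" here means a contiguous run of characters, as in all three examples above. The word list is illustrative.

```python
def subsequence_search(words, pattern):
    # Keep the strings that contain the pattern as a contiguous run
    return [w for w in words if pattern in w]

words = ["marketing", "spark", "quark", "rank", "table"]
print(subsequence_search(words, "ark"))  # ['marketing', 'spark', 'quark']
```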
⫸ Give an example of subsequence match Answer: - Find the longest
matching subsequence between "plasticity" and "scholastic"
- Find the most frequently repeating 3 character subsequence
• "abcbbbaabbaabcbbbaaabbc"
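Both tasks can be sketched as follows, again assuming contiguous matches. The first function uses a standard dynamic-programming table for the longest common contiguous run; the second counts every 3-character window. Function names are illustrative.

```python
from collections import Counter

def longest_common_substring(a, b):
    # prev[j] = length of the common run ending at a[i-2] and b[j-1]
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]

def most_frequent_substrings(s, n=3):
    # Count every window of n characters and return all windows tied for the top count
    counts = Counter(s[i:i + n] for i in range(len(s) - n + 1))
    if not counts:
        return [], 0
    top = max(counts.values())
    return sorted(g for g, c in counts.items() if c == top), top

print(longest_common_substring("plasticity", "scholastic"))  # lastic
print(most_frequent_substrings("abcbbbaabbaabcbbbaaabbc"))
```

The second function returns every window tied for the highest count rather than picking one arbitrarily.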
⫸ What is the edit distance between two sequences? Answer: The
minimum number of edit operations (insert, remove, replace) needed to
convert one sequence to the other.
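A sketch of this definition using the usual dynamic-programming approach (often called Levenshtein distance), assuming each insert, remove, and replace costs 1. The function name edit_distance is illustrative.

```python
def edit_distance(a, b):
    # prev[j] = edit distance between the processed prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i removals
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # remove ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # replace ca with cb (free if equal)
            ))
        prev = curr
    return prev[-1]

print(edit_distance("kitten", "sitting"))  # 3
```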