GUIDE 2026 FULL QUESTIONS AND
SOLUTIONS GRADED A+
◍ What is regression analysis?
Answer:
- A statistical tool that describes the relationship between variables
(ex. between a dependent variable and one or more independent variables)
- Its main use is prediction
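As a minimal sketch of regression used for prediction, the snippet below fits a straight line by ordinary least squares. The data (hours studied vs. exam score) and the helper name `fit_line` are made up for illustration and are not part of the guide.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied (independent) vs. exam score (dependent).
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # prediction for an unseen value of 6 hours
print(slope, intercept, predicted)
```

The fitted line describes the relationship between the two variables, and plugging in a new x value gives the prediction.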
◍ What is pruning? What are the two types of pruning?
Answer: The action of trimming nodes to reduce the number of nodes and
the size of the tree.
• Pre-pruning: Stop growing the tree before the training set is optimally
classified
• Post-pruning: Wait until the training set is optimally classified, then
prune
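The sketch below illustrates pre-pruning only, under simplifying assumptions: a toy decision tree on one numeric feature, with a hypothetical `max_depth` parameter acting as the early-stopping rule. The function names and the data are invented for the example. Post-pruning is not shown.

```python
def majority(labels):
    """Most common label; ties go to the smallest label."""
    return max(sorted(set(labels)), key=labels.count)

def build_tree(points, max_depth=None, depth=0):
    """points is a list of (x, label). Returns a nested dict."""
    labels = [label for _, label in points]
    xs = sorted({x for x, _ in points})
    # Pre-pruning check: stop growing once the depth limit is reached.
    at_limit = max_depth is not None and depth >= max_depth
    if len(set(labels)) == 1 or len(xs) == 1 or at_limit:
        return {"leaf": majority(labels)}
    best_threshold, best_errors = None, None
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [label for x, label in points if x <= threshold]
        right = [label for x, label in points if x > threshold]
        # Errors if each side predicts its own majority label.
        errors = sum(len(s) - s.count(majority(s)) for s in (left, right))
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    left_pts = [p for p in points if p[0] <= best_threshold]
    right_pts = [p for p in points if p[0] > best_threshold]
    return {
        "threshold": best_threshold,
        "left": build_tree(left_pts, max_depth, depth + 1),
        "right": build_tree(right_pts, max_depth, depth + 1),
    }

def count_nodes(tree):
    if "leaf" in tree:
        return 1
    return 1 + count_nodes(tree["left"]) + count_nodes(tree["right"])

# Hypothetical training set: (feature value, class label).
data = [(1, 0), (2, 0), (3, 1), (4, 0), (5, 1), (6, 1)]
full_tree = build_tree(data)                 # grown until every leaf is pure
pre_pruned = build_tree(data, max_depth=1)   # stopped early
print(count_nodes(full_tree), count_nodes(pre_pruned))
```

The pre-pruned tree has fewer nodes but misclassifies one training point, which is the trade-off pruning accepts.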
◍ What is a histogram?
Answer:
- A chart that displays data as bars of differing heights, with each bar
grouping your values into a range
- The taller the bar, the more values fall in that range
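A rough sketch of how values are grouped into ranges: the helper `histogram` and the list of ages below are hypothetical, and the bars are drawn sideways as text.

```python
def histogram(values, bin_width):
    """Count how many values fall into each [start, start + bin_width) range."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical data set.
ages = [21, 23, 25, 31, 34, 35, 38, 42, 47, 58]
bins = histogram(ages, 10)

# One text bar per range; a longer bar means more values in that range.
for start, count in bins.items():
    print(f"{start}-{start + 9} | {'#' * count}")
```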
◍ What is meta-data? (Chapter 1)
Answer: It describes the structure of the primary database. A database system
contains not only the database itself but also a complete definition or
description of the database structure and constraints.
◍ What is persistent data? What was the early name for persistent data?
(Topic 1)
Answer: Data that stays in the database for a long period of time. The
earlier term for it was operational data.
◍ What is stochastic gradient descent?
Answer: A variant of gradient descent that processes one training sample in
every iteration. The parameters are updated after each iteration.
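A minimal sketch of the idea, assuming a one-variable linear model and squared error. The function name `sgd_fit`, the learning rate, the epoch count, and the data are illustrative choices, not taken from the guide.

```python
import random

def sgd_fit(xs, ys, lr=0.05, epochs=500, seed=0):
    """Fit y = w * x + b, updating w and b after every single sample."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(zip(xs, ys))
    for _ in range(epochs):
        rng.shuffle(data)              # visit the samples in random order
        for x, y in data:              # one training sample per iteration
            error = (w * x + b) - y
            w -= lr * error * x        # gradient of 0.5 * error**2 w.r.t. w
            b -= lr * error            # gradient of 0.5 * error**2 w.r.t. b
    return w, b

# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = sgd_fit(xs, ys)
print(w, b)
```

Batch gradient descent would instead average the gradient over all five samples before making a single update.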
◍ What are the two types of hierarchical clustering?
Answer:
• Agglomerative (Bottom-up): Initially, each point is considered as its own
cluster, and the closest clusters are merged step by step
• Divisive (Top-down): Start with one big cluster and split it recursively
until it cannot be split anymore
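The bottom-up variant can be sketched as follows, assuming one-dimensional points and single linkage (distance between the closest members of two clusters). Both the linkage choice and the stopping parameter `k` are assumptions made for this example.

```python
def agglomerative(points, k):
    """Merge the two closest clusters (single linkage) until k clusters remain."""
    clusters = [[p] for p in points]   # bottom-up: every point starts alone
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge cluster j into i
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Hypothetical 1-D data with three visible groups.
points = [1.0, 1.5, 2.0, 10.0, 10.5, 25.0]
result = agglomerative(points, 3)
print(result)
```

A divisive method would run in the opposite direction, starting from one cluster holding all six points.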
◍ When do we need to make changes to the conceptual/logical schema? (Topic 2)
Answer: When the logical structure of the database changes.
◍ True/False (Topic 1): In database systems the logical and physical
representation of data are separated.
Answer: True: In database systems the logical and physical representation of
data are separated.
◍ In what case does the perceptron learning algorithm not terminate?
Answer: It does not terminate if the learning set is not linearly separable.
If the vectors are not linearly separable, learning will never reach a point
where all vectors are classified properly.
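To see this behaviour, the sketch below runs a basic perceptron on AND (linearly separable) and XOR (not linearly separable). The epoch cap `max_epochs` is an assumption added so the script itself stops. Without it, the XOR run would loop forever.

```python
def train_perceptron(samples, max_epochs=1000):
    """Return (converged, epochs_used). Inputs are 2-D, targets are 0 or 1."""
    w = [0.0, 0.0]
    b = 0.0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for (x1, x2), target in samples:
            output = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = target - output
            if error != 0:             # misclassified: adjust the weights
                mistakes += 1
                w[0] += error * x1
                w[1] += error * x2
                b += error
        if mistakes == 0:              # every vector classified properly
            return True, epoch
    return False, max_epochs           # gave up: cap reached

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_converged, and_epochs = train_perceptron(AND)
xor_converged, xor_epochs = train_perceptron(XOR)
print(and_converged, xor_converged)
```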
◍ What is an attribute? (Chapter 2)
Answer: Represents some property of interest that further describes an
entity, such as the employee's name or salary. (Shown as an oval in an ER
diagram)
◍ What does an ER (entity-relationship) model consist of? (Topic 3)
Answer: Entities, Relationships and Attributes
◍ True/False, why? (Topic 2) Achieving logical data independence is just as
difficult as physical data independence.
Answer: False: Achieving logical data independence is MORE difficult than
physical data independence. This is because application programs heavily
rely on the logical structure of the data they access.
◍ What is Atomicity in transactions? (Topic 4)
Answer: Means that transactions are guaranteed either to execute in their
entirety or not to execute at all, even if the system fails halfway through the
process.
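A small sketch of all-or-nothing behaviour using Python's built-in `sqlite3` module with an in-memory database. The `accounts` table, the `transfer` helper, and the simulated failure are all hypothetical and only serve to illustrate the definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move money between accounts as a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
    except RuntimeError:
        pass  # the partial debit has already been rolled back

transfer(conn, "alice", "bob", 30, fail_midway=True)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
```

Because the failure happens after the debit but before the credit, the rollback leaves both balances exactly as they were.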
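◍ What is Multicollinearity?
Answer:
- A condition in which at least 2 independent variables are highly linearly
correlated
- It makes the regression coefficient estimates unstable and, when the
correlation is perfect, can make the numerical fitting procedure fail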
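One simple way to spot the condition is to check the correlation between pairs of independent variables. The sketch below uses made-up predictors and a hand-written `pearson` helper. The variance inflation factor shown is the two-predictor special case, 1 / (1 - r^2).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical independent variables.
height_cm = [150, 160, 170, 180, 190]
height_in = [59.1, 63.0, 66.9, 70.9, 74.8]  # same information, other unit
shoe_size = [40, 38, 44, 41, 43]

r_units = pearson(height_cm, height_in)     # close to 1: multicollinear pair
r_shoe = pearson(height_cm, shoe_size)      # noticeably weaker correlation

# Variance inflation factor for a model with exactly these two predictors.
vif_units = 1 / (1 - r_units ** 2)
print(r_units, r_shoe, vif_units)
```

A very large inflation factor signals that the coefficient estimates for the two height columns would be poorly determined if both were used together.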
◍ What is the margin in a hyperplane?
Answer: The margin, m, is the distance from the hyperplane to the closest
points in either class. The hyperplane that maximizes this margin is known as
the Maximal-Margin hyperplane.
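As a worked sketch in two dimensions, the distance from a point x to the hyperplane w·x + b = 0 is |w·x + b| / ||w||, and the margin is the smallest such distance over all training points. The hyperplane and the points below are hypothetical.

```python
from math import hypot

def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from point x to the line w.x + b = 0 (2-D)."""
    return abs(w[0] * x[0] + w[1] * x[1] + b) / hypot(w[0], w[1])

# Hypothetical separating hyperplane: the vertical line x1 = 3.
w, b = (1.0, 0.0), -3.0
positives = [(5.0, 1.0), (6.0, 2.0)]
negatives = [(1.0, 0.0), (0.0, 4.0)]

# The margin is set by the closest points from either class.
margin = min(distance_to_hyperplane(w, b, p) for p in positives + negatives)
print(margin)
```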
◍ What is the difference between a 'stored' vs. 'derived' attribute? (Topic 3)
Answer: A stored attribute was entered as data (ex. date of birth). A derived
attribute is derived from other data that was already entered in the
database (ex. 'age' can be found from date of birth with calculations).
Derived attributes are indicated by a dashed oval in the ER diagram.
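The distinction can be sketched in code: the date of birth is kept as a field, while the age is computed on demand. The `Employee` class and its `age` method are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Employee:
    name: str
    date_of_birth: date   # stored attribute: entered as data

    def age(self, today: date) -> int:
        """Derived attribute: calculated from the stored date of birth."""
        years = today.year - self.date_of_birth.year
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return years if had_birthday else years - 1

emp = Employee("Ada", date(1990, 6, 15))
print(emp.age(date(2026, 6, 14)), emp.age(date(2026, 6, 15)))
```

Storing only the date of birth avoids having to update an age column every year.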
◍ What are the 3 types of 'relationships' in an ER diagram? What do they
mean? (Topic 3)
Answer:
- one-to-one (1:1) (ex. 1 manager per 1 department)
- one-to-many (1:N or 1:M) (ex. 1 department can have many employees)
- many-to-many (M:N) (ex. a supplier can supply many projects, and a project
can receive parts from many suppliers)
◍ What is the superclass of an entity? (Topic 3)
Answer: It is the general entity type from which one or more subclasses
(more specialized entity types) are defined.
◍ How do we train the perceptron?
Answer: We must feed it multiple training samples and calculate the output
for each of them. After each sample, the weights w are adjusted in such a
way so as to minimize the output error (difference between the target and
the actual outputs).
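The training loop described above can be sketched as follows, assuming a step activation and the classic update w = w + learning_rate * error * x. The learning rate, the epoch count, and the OR-gate data are illustrative assumptions.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0

def train(samples, learning_rate=0.1, epochs=20):
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            # Output error: difference between target and actual output.
            error = target - predict(weights, bias, x)
            # Adjust the weights after each sample to reduce that error.
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias

# Hypothetical training set: the logical OR function.
OR_GATE = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train(OR_GATE)
outputs = [predict(weights, bias, x) for x, _ in OR_GATE]
print(outputs)
```

When a sample is already classified correctly the error is 0, so the weights are left unchanged for that sample.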
◍ What is a CDF?
Answer: (Cumulative Distribution Function)
- The probability of a random variable (discrete or continuous) being less
than or equal to a given/defined value, usually denoted by the equation:
F(x) = P(X <= x)
- A type of