Exam (worked answers)

DATA MINING QUESTIONS & ANSWERS|| 2026 LATEST UPDATE|| VERIFIED A+

Grade: A+
Uploaded on: 5 February 2026
Written in: 2025/2026


Institution: DATA MINING
Course: DATA MINING

Content preview

What is data mining? - ANSWER: The process of sorting through large data sets to
identify patterns and establish relationships, so that problems can be solved through
data analysis.

What are the steps involved in data mining when viewed as a process of knowledge
discovery? - ANSWER:
Data Cleaning
Data Integration
Data Selection
Data Transformation
Data Mining
Pattern Evaluation
Knowledge Presentation
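
The seven steps above can be sketched as one small pipeline. This is only an illustration under stated assumptions: the two purchase lists (`store_a`, `store_b`), the function name `discover`, and the `min_support` threshold are hypothetical and do not come from the original notes, and "mining" is reduced to counting how often each item occurs.

```python
from collections import Counter

# Hypothetical purchase records from two sources (illustrative data only).
store_a = [{"customer": "c1", "item": "milk"}, {"customer": "c2", "item": None}]
store_b = [
    {"customer": "c3", "item": "Milk "},
    {"customer": "c4", "item": "bread"},
    {"customer": "c5", "item": "milk"},
]


def discover(sources, min_support=0.5):
    # 1. Data cleaning: drop records with a missing item.
    cleaned = [[r for r in src if r["item"] is not None] for src in sources]
    # 2. Data integration: combine the sources into one collection.
    records = [r for src in cleaned for r in src]
    # 3. Data selection: keep only the attribute relevant to the task.
    items = [r["item"] for r in records]
    # 4. Data transformation: normalise spelling so equal items compare equal.
    items = [i.strip().lower() for i in items]
    # 5. Data mining: compute the relative frequency (support) of each item.
    support = {item: n / len(items) for item, n in Counter(items).items()}
    # 6. Pattern evaluation: keep only items that reach the threshold.
    return {item: s for item, s in support.items() if s >= min_support}


patterns = discover([store_a, store_b])
# 7. Knowledge presentation: report the surviving patterns.
for item, s in patterns.items():
    print(f"{item}: support {s:.2f}")
```

With this toy data only "milk" survives the evaluation step, because it accounts for three of the four cleaned records.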

What are the data mining functionalities? - ANSWER:
Characterization and discrimination
Mining of frequent patterns, associations, and correlations
Classification and regression
Clustering analysis
Outlier analysis

Data Characterization - ANSWER: A summary of the general characteristics or features
of a target class of data. The data corresponding to the user-specified class are typically
collected by a query. For example, to study the characteristics of software products whose
sales increased by 10% in the previous year, the data related to such products can
be collected by executing an SQL query on the sales database.
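
A sketch of how such a query might collect the target class, using Python's built-in `sqlite3` module. The table `sales`, its columns, the product names, and the reading of "increased by 10%" as "at least 10% more revenue than the year before" are all assumptions made for illustration.

```python
import sqlite3

# In-memory database with a hypothetical sales table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (product TEXT, category TEXT, year INTEGER, revenue REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("EditorPro", "software", 2023, 100.0),
        ("EditorPro", "software", 2024, 115.0),
        ("PhotoFix", "software", 2023, 200.0),
        ("PhotoFix", "software", 2024, 204.0),
        ("DeskLamp", "hardware", 2023, 50.0),
        ("DeskLamp", "hardware", 2024, 80.0),
    ],
)

# Select software products whose revenue grew by at least 10% year over year.
query = """
SELECT cur.product
FROM sales AS cur
JOIN sales AS prev
  ON prev.product = cur.product AND prev.year = cur.year - 1
WHERE cur.category = 'software'
  AND cur.year = 2024
  AND cur.revenue >= prev.revenue * 1.10
ORDER BY cur.product
"""
target_class = [row[0] for row in conn.execute(query)]
print(target_class)
```

The rows returned by the query form the target class; characterization would then summarize their attributes.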

Data discrimination - ANSWER: A comparison of the general features of the target class
with those of one or more contrasting classes.

Data mining methodology challenges - ANSWER:
Mining various and new kinds of knowledge
Mining knowledge in multidimensional space
Integrating new methods from multiple disciplines
Boosting the power of discovery in a networked environment
Handling uncertainty, noise, or incompleteness of data
Pattern evaluation and pattern- or constraint-guided mining

Explain one challenge of mining a huge amount of data in comparison with mining a
small amount of data. - ANSWER: Scalability. Algorithms must scale well, so that even
vast amounts of data can be handled efficiently and within an acceptable amount of
time.

What is an outlier? - ANSWER: A data object that does not comply with the general
behavior or model of the data.

Does an outlier always need to be discarded? - ANSWER: No. In most data mining tasks
outliers are discarded, but in special circumstances, such as fraud detection, the
outliers themselves are the useful result.
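
A short sketch of flagging outliers instead of silently dropping them, so that an application such as fraud detection can still inspect them. The 1.5 × IQR fence used here is a common rule of thumb and an assumption of this example, not something stated in the notes; the amounts are made up.

```python
import statistics

# Hypothetical transaction amounts; 95 is the unusual value.
amounts = [12, 14, 15, 15, 16, 18, 95]

# Quartiles with the "inclusive" method, then the usual 1.5 * IQR fences.
q1, _, q3 = statistics.quantiles(amounts, n=4, method="inclusive")
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Keep both groups: regular values for modelling, flagged values for review.
regular = [a for a in amounts if lower <= a <= upper]
flagged = [a for a in amounts if a < lower or a > upper]
print(regular, flagged)
```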

The mode is the only measure of central tendency that can be used for nominal
attributes. (T/F) - ANSWER: True. An example is hair color, with categories such as
black, brown, blond, and red: the only meaningful question is which category is the
most common.
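
For illustration, the mode of a nominal attribute can be found with Python's `statistics` module. The hair-color sample below is invented for this example.

```python
import statistics

# Hypothetical sample of a nominal attribute.
hair_colors = ["brown", "black", "brown", "blond", "red", "brown", "black"]

# The mode is the most frequent category; no ordering or arithmetic is needed.
most_common = statistics.mode(hair_colors)

# With a tie, multimode returns every most frequent category.
tied_modes = statistics.multimode(hair_colors + ["black"])
print(most_common, tied_modes)
```

A mean or median of these values is not defined, because the categories have no order and cannot be added.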

Nominal attribute - ANSWER: Its values are symbols or names of things (categorical).
The values can also be coded as numbers, but the numbers are not meant to be used
quantitatively. A nominal attribute has no median, but it has a mode.

Binary Attributes - ANSWER: A nominal attribute with only two categories or states,
0 or 1, where 0 typically means that the attribute is absent and 1 means that it is
present.

Ordinal Attributes - ANSWER: An attribute whose possible values have a meaningful
order or ranking among them, but the magnitude between successive values is not known.

Numeric Attributes - ANSWER: Quantitative; that is, a measurable quantity represented
in integer or real values. Can be interval-scaled or ratio-scaled.

Discrete Attribute - ANSWER: An attribute that has a finite or countably infinite set
of values.

Continuous Attributes - ANSWER: An attribute that is not discrete. Its values are real
numbers, typically represented as floating-point variables.

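The mean is in general affected by outliers (T/F) - ANSWER: True

Not all numerical data sets have a median. (T/F) - ANSWER: False. Every numeric data
set can be ordered, so a median always exists; for an even number of values it is
conventionally the average of the two middle values.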
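
A small check of both statements, using invented salary figures. Adding one extreme value moves the mean a lot and the median only slightly, and the even-sized data set still has a median.

```python
import statistics

# Hypothetical salaries (in thousands); 400 is the added outlier.
salaries = [30, 32, 35, 36, 40]
with_outlier = salaries + [400]

mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)

mean_after = statistics.mean(with_outlier)
# Six values: the median is the average of the two middle values, 35 and 36.
median_after = statistics.median(with_outlier)

print(mean_before, median_before)
print(mean_after, median_after)
```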

What are the differences between the measures of central tendency and the measures
of dispersion? - ANSWER: The measures of central tendency are the mean, median,
mode, and midrange. They measure the location of the middle or center of the data
distribution, that is, where most values fall. The dispersion measures are the range,
quartiles, interquartile range, five-number summary, boxplots, variance, and standard
deviation. They describe how the data are spread out and help to identify outliers.
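
The two groups of measures can be computed side by side. The sketch below assumes a made-up list of grades, the "inclusive" quartile method, and the population (not sample) variance; other conventions give slightly different numbers.

```python
import statistics

# Hypothetical exam grades.
grades = [55, 60, 60, 70, 75, 80, 90]

# Measures of central tendency: where the middle of the data lies.
mean = statistics.mean(grades)
median = statistics.median(grades)
mode = statistics.mode(grades)
midrange = (min(grades) + max(grades)) / 2

# Measures of dispersion: how far the data spread out.
value_range = max(grades) - min(grades)
q1, q2, q3 = statistics.quantiles(grades, n=4, method="inclusive")
iqr = q3 - q1
five_number_summary = (min(grades), q1, q2, q3, max(grades))
variance = statistics.pvariance(grades)
std_dev = statistics.pstdev(grades)

print(mean, median, mode, midrange)
print(value_range, iqr, five_number_summary, round(variance, 2), round(std_dev, 2))
```

The five-number summary is exactly what a boxplot draws: the whiskers reach toward the minimum and maximum, the box spans Q1 to Q3, and the line inside the box marks the median.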

How would you catalog a boxplot, as a measure of dispersion or as a data visualization
aid? Why? - ANSWER: As a data visualization aid. The boxplot shows visually how the
boundaries relate to each other: where the minimum and maximum values lie, and the
interquartile range, with a line marking the median. It does not give you a single
measure, but it lets you visualize the data set. For example, in a boxplot of the
grades in a class, a box that sits close to the minimum boundary shows that most
scores were low.

What do we understand by similarity measure? - ANSWER: It quantifies the similarity
between two objects. Typically, larger values indicate more similar objects, and zero
or negative values indicate dissimilar objects.

What is the importance of similarity measures? - ANSWER: They are important because
they help us see patterns in data and give us knowledge about our data. They are used
in clustering algorithms: similar data points are put into the same cluster, and
dissimilar points are placed into different clusters.

What do we understand by dissimilarity measure and what is its importance? -
ANSWER: It measures the difference between two objects: the greater the difference
between the two objects, the higher the value.

What is the importance of dissimilarity measures? - ANSWER: In some settings, a low
dissimilarity between two objects can signal a problem. For example, two nearly
identical exam submissions can indicate cheating.

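Discuss one of the distance measures that are commonly used for computing the
dissimilarity of objects described by numeric attributes. - ANSWER: For two objects
i and j described by p numeric attributes:
Euclidean distance: $d(i, j) = \sqrt{(x_{i1} - x_{j1})^2 + (x_{i2} - x_{j2})^2 + \cdots + (x_{ip} - x_{jp})^2}$
Manhattan distance: $d(i, j) = |x_{i1} - x_{j1}| + |x_{i2} - x_{j2}| + \cdots + |x_{ip} - x_{jp}|$
Minkowski distance: $d(i, j) = \sqrt[h]{|x_{i1} - x_{j1}|^h + |x_{i2} - x_{j2}|^h + \cdots + |x_{ip} - x_{jp}|^h}$, with $h \ge 1$
Supremum distance: $d(i, j) = \max_{f = 1, \ldots, p} |x_{if} - x_{jf}|$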
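
The four distances can be sketched in a few lines of Python. The function names and the two sample points are illustrative assumptions; Euclidean and Manhattan distance are written here as the Minkowski distance with h = 2 and h = 1.

```python
def minkowski(x, y, h):
    """Minkowski distance of order h between two equal-length numeric tuples."""
    return sum(abs(a - b) ** h for a, b in zip(x, y)) ** (1 / h)


def euclidean(x, y):
    # Minkowski distance with h = 2.
    return minkowski(x, y, 2)


def manhattan(x, y):
    # Minkowski distance with h = 1.
    return minkowski(x, y, 1)


def supremum(x, y):
    # Largest difference over all attributes (the limit of Minkowski as h grows).
    return max(abs(a - b) for a, b in zip(x, y))


# Two hypothetical objects described by two numeric attributes.
i, j = (1, 2), (3, 5)
print(euclidean(i, j), manhattan(i, j), supremum(i, j))
```

For these two points the attribute differences are 2 and 3, so the Manhattan distance is 5, the Euclidean distance is the square root of 13, and the supremum distance is 3.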

In many real-life databases, objects are described by a mixture of attribute types. How
can we compute the dissimilarity between objects of mixed attribute types? -
ANSWER: There are two main approaches. The first separates the attributes by type
and performs a data mining analysis for each type. This is acceptable only if the
separate analyses give consistent results; in real-life projects it is rarely viable,
because analyzing the attribute types separately will most likely generate different
results. The second approach is preferable: it processes all attribute types together
and performs a single analysis by combining the attributes into one dissimilarity
matrix.
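
One way the second approach might look in code is sketched below. It assumes a common textbook-style combination: each attribute contributes a dissimilarity between 0 and 1 (mismatch for nominal values, normalized rank difference for ordinal values, range-normalized difference for numeric values), attributes with a missing value are skipped, and the contributions are averaged. The schema, the objects, and the function name `mixed_dissimilarity` are hypothetical.

```python
def mixed_dissimilarity(a, b, schema):
    """Average per-attribute dissimilarity over attributes present in both objects."""
    total, used = 0.0, 0
    for name, spec in schema.items():
        x, y = a.get(name), b.get(name)
        if x is None or y is None:
            continue  # a missing value contributes nothing
        kind = spec["type"]
        if kind == "nominal":
            d = 0.0 if x == y else 1.0
        elif kind == "ordinal":
            levels = spec["levels"]
            d = abs(levels.index(x) - levels.index(y)) / (len(levels) - 1)
        elif kind == "numeric":
            d = abs(x - y) / (spec["max"] - spec["min"])
        else:
            raise ValueError(f"unknown attribute type: {kind}")
        total += d
        used += 1
    if used == 0:
        raise ValueError("no attribute is present in both objects")
    return total / used


# Hypothetical schema and objects of mixed attribute types.
schema = {
    "color": {"type": "nominal"},
    "size": {"type": "ordinal", "levels": ["small", "medium", "large"]},
    "price": {"type": "numeric", "min": 0, "max": 100},
}
objects = [
    {"color": "red", "size": "small", "price": 20},
    {"color": "blue", "size": "large", "price": 70},
    {"color": "red", "size": "medium", "price": None},
]

# Single dissimilarity matrix covering all attribute types at once.
matrix = [[mixed_dissimilarity(p, q, schema) for q in objects] for p in objects]
for row in matrix:
    print([round(v, 3) for v in row])
```

The resulting matrix is symmetric with zeros on the diagonal and can be passed to any method that works from pairwise dissimilarities, such as a clustering algorithm.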

What do we understand by data quality and what is its importance? - ANSWER: Data have
quality if they satisfy the requirements of the intended use. Data quality has many
factors, including accuracy, completeness, consistency, timeliness, believability, and
interpretability. It also depends on the intended use of the data: the same data may
look inconsistent to some users and merely hard to interpret to others.


Document information

Uploaded on: 5 February 2026
Number of pages: 28
Written in: 2025/2026
Type: Exam (worked answers)
Contains: Questions and answers


Seller: shantelleG, West Virginia University
