QUESTIONS & ANSWERS
Why Data Mining? - ANSWERExplosive data growth (in KB, MB, GB, TB, PB, EB, and
ZB)
What is data mining? - ANSWERKnowledge discovery from data (Extraction of
interesting patterns or knowledge from huge amounts of data.)
Benefits of data mining - ANSWERScalability and efficiency
The four views of data mining - ANSWERData, Application, Knowledge, Technique
What are the 5Vs of Data Mining? - ANSWERVolume, Variety, Velocity, Veracity, Value
Relational, transactional data (Data View) - ANSWERE.g., student records, bank
accounts, store purchases
Sequential, temporal, streaming data (Data View) - ANSWERE.g., gene sequences,
stock prices, sensor readings
Spatial, spatial-temporal data (Data View) - ANSWERE.g., land use, bird migration,
traffic conditions
Text, multimedia, Web data (Data View) - ANSWERE.g., news articles,
audio/video/image data, hypertext
Graph, network data (Data View) - ANSWERE.g., social network, power grid,
co-authorship
Market Analysis, target advertisement (Application View) - ANSWERE.g., customer
profiling, product recommendation
Healthcare, medical research (Application View) - ANSWERE.g., disease diagnosis,
patient care, drug discovery
Science and engineering (Application View) - ANSWERE.g., air pollution, marine life,
electric vehicles
Security (Application View) - ANSWERE.g., surveillance, intrusion/crime, fraud,
cyberattack
Government, nonprofit (Application View) - ANSWERE.g., urban planning, traffic
control, education
Frequent pattern, correlation (Knowledge View) - ANSWERE.g., songs listened to
together or in a certain sequence
Categorization (Knowledge View) - ANSWERE.g., similarity among users with certain
purchases, differences between two patient groups
Anomaly, outliers (Knowledge View) - ANSWERE.g., sensor errors, fraudulent activities,
extreme events
Changes over time (Knowledge View) - ANSWERE.g., emerging new patterns, shifts in
user interest
What are the five different techniques for data mining? - ANSWERFrequent pattern
analysis, classification/prediction, clustering, anomaly detection, trend and evolution
analysis
Frequent Pattern Analysis - ANSWERIncludes frequent itemset, frequent sequence,
frequent structure, association rules, correlation analysis
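
A minimal sketch of frequent itemset counting, assuming a brute-force enumeration of itemsets up to size 2 rather than any particular mining algorithm. The function name `frequent_itemsets`, the `min_support` threshold of 0.5, and the shopping baskets are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return itemsets whose support (fraction of transactions) meets min_support."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # de-duplicate and fix item order
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    total = len(transactions)
    return {s: c / total for s, c in counts.items() if c / total >= min_support}


# Hypothetical store purchases (illustrative data only).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_itemsets(transactions, min_support=0.5)

# Confidence of the association rule bread -> milk:
# support(bread, milk) / support(bread)
confidence = result[("bread", "milk")] / result[("bread",)]
```

Enumerating every combination is only practical for tiny examples; it is used here because it makes the definitions of support and confidence easy to see.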
Classification - ANSWERIncludes pre-defined classes, training data, and
distinguishable classes
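
A minimal sketch of classification with pre-defined classes and labeled training data, assuming a 1-nearest-neighbor rule as the classifier (one of many possible choices, not one named in these notes). The labels "low" and "high" and the training points are hypothetical.

```python
from math import dist


def predict_1nn(training, point):
    """Return the label of the training example closest to point."""
    _, label = min(training, key=lambda pair: dist(pair[0], point))
    return label


# Hypothetical training data: (features, pre-defined class label).
training = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((8.0, 8.0), "high"),
    ((9.0, 7.5), "high"),
]

label_a = predict_1nn(training, (2.0, 1.5))
label_b = predict_1nn(training, (8.5, 8.0))
```

The two classes are easy to tell apart here because their training points lie far from each other, which is what "distinguishable classes" refers to.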
Prediction - ANSWERIncludes prediction of numerical (continuous) values (e.g. weather,
stock price, traffic)
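
A minimal sketch of numerical prediction, assuming a straight-line least-squares fit as the model (an illustrative choice, not one named in these notes). The function `fit_line` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations (illustrative data only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

slope, intercept = fit_line(xs, ys)
predicted = slope * 5.0 + intercept  # continuous-valued prediction for x = 5
```

Unlike classification, the output is a continuous number rather than one of a fixed set of class labels.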
Clustering - ANSWERIncludes no pre-defined classes, intra-cluster similarity,
inter-cluster dissimilarity
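
A minimal sketch of what intra-cluster similarity and inter-cluster dissimilarity mean in practice, assuming Euclidean distance as the dissimilarity measure and two already-formed clusters. The helper names and the points are hypothetical.

```python
from itertools import combinations, product
from math import dist


def mean_intra_distance(cluster):
    """Average distance between pairs of points inside one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def mean_inter_distance(cluster_1, cluster_2):
    """Average distance between points drawn from two different clusters."""
    pairs = list(product(cluster_1, cluster_2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


# Hypothetical clusters (illustrative data only).
cluster_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
cluster_b = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

intra_a = mean_intra_distance(cluster_a)
inter_ab = mean_inter_distance(cluster_a, cluster_b)
```

A good clustering keeps the intra-cluster distance small and the inter-cluster distance large, as in this example.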
Anomaly Detection - ANSWERIncludes anomalies or outliers (e.g. error, noise, fraud,
extreme events)
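
A minimal sketch of anomaly detection, assuming a simple z-score rule: a value is flagged when it lies more than a chosen number of standard deviations from the mean. The threshold of 2.0 and the sensor readings are hypothetical, and other detection methods exist.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=2.0):
    """Return values whose absolute z-score exceeds threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:  # all values identical, so nothing stands out
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one suspicious spike.
readings = [10, 11, 9, 10, 12, 10, 50]
outliers = zscore_outliers(readings)
```

Because the spike itself inflates the mean and standard deviation, this rule can miss anomalies in small samples; it is shown only to make the idea concrete.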
Trend and Evolution Analysis - ANSWERIncludes changes over time, overall trend,
periodical patterns, anomalies (e.g. Google Trends)
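
A minimal sketch of exposing an overall trend, assuming a simple moving average as the smoothing method. The window size of 3 and the weekly counts are hypothetical.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical weekly search-interest counts (illustrative data only).
weekly_counts = [3, 5, 4, 6, 8, 7, 9]
smoothed = moving_average(weekly_counts, window=3)
```

The raw series moves up and down from week to week, while the smoothed series rises steadily, which makes the overall upward trend easier to see.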
What Steps Form The Data Mining Pipeline? - ANSWERData Understanding, Data
Preprocessing, Data Warehousing, Data Modeling, Pattern Evaluation
Data Understanding - ANSWERAnswering questions like: What types of data are there?
What do they look like? Includes statistics and visualization, and observing
similarity vs. dissimilarity.
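
A minimal sketch of the data understanding step, assuming basic summary statistics and cosine similarity as the tools. The variable names and values are hypothetical.

```python
from math import sqrt
from statistics import mean, median, stdev


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical attribute values from student records.
ages = [21, 22, 22, 23, 37]
summary = {"mean": mean(ages), "median": median(ages), "stdev": stdev(ages)}

# Vectors pointing the same way are maximally similar; orthogonal ones are not.
similar = cosine_similarity((1, 2, 3), (2, 4, 6))
dissimilar = cosine_similarity((1, 0), (0, 1))
```

The gap between the mean and the median hints that one value (37) sits far from the rest, which is the kind of observation this step is meant to surface before modeling.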